Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Ingo Molnar:
"The main changes in this cycle were KASLR improvements for rare
environments with special boot options, by Baoquan He. Also misc
smaller changes/cleanups"
* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/debug: Extend the lower bound of crash kernel low reservations
x86/boot: Remove unused copy_*_gs() functions
x86/KASLR: Use the right memcpy() implementation
Documentation/kernel-parameters.txt: Update 'memmap=' boot option description
x86/KASLR: Handle the memory limit specified by the 'memmap=' and 'mem=' boot options
x86/KASLR: Parse all 'memmap=' boot option entries
|
|
KASLR uses hack to detect whether we booted via startup_32() or
startup_64(): it checks what is loaded into cr3 and compares it to
_pgtables. _pgtables is the array of page tables where early code
allocates page table from.
KASLR expects cr3 to point to _pgtables if we booted via startup_32(), but
that's not true if we booted with 5-level paging enabled. In this case top
level page table is allocated separately and only the first p4d page table
is allocated from the array.
Let's modify the check to cover both 4- and 5-level paging cases.
The patch also renames 'level4p' to 'top_level_pgt' as it now can hold
page table for 4th or 5th level, depending on configuration.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Acked-by: Kees Cook <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Kernel text KASLR is separated into physical address and virtual
address randomization. And for virtual address randomization, we
only randomiza to get an offset between 16M and KERNEL_IMAGE_SIZE.
So the initial value of 'virt_addr' should be LOAD_PHYSICAL_ADDR,
but not the original kernel loading address 'output'.
The bug will cause kernel boot failure if kernel is loaded at a different
position than the address, 16M, which is decided at compiled time.
Kexec/kdump is such practical case.
To fix it, just assign LOAD_PHYSICAL_ADDR to virt_addr as initial
value.
Tested-by: Dave Young <[email protected]>
Signed-off-by: Baoquan He <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Fixes: 8391c73 ("x86/KASLR: Randomize virtual address separately")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
randomization
For kernel text KASLR, the virtual address is confined to area of 1G,
[0xffffffff80000000, 0xffffffffc0000000). For the implemenataion of
virtual address randomization, we only randomize to get an offset
between 16M and 1G, then add this offset to the starting address,
0xffffffff80000000. Here 16M is the offset which is decided at linking
stage. So the amount of the local variable 'virt_addr' which respresents
the offset plus the kernel output size can not exceed KERNEL_IMAGE_SIZE.
Add a debug check for the offset. If out of bounds, print error
message and hang there.
Suggested-by: Ingo Molnar <[email protected]>
Signed-off-by: Baoquan He <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The Sparse static analyzer emits this warning:
symbol 'strchr' was not declared. Should it be static?
This patch adds the appropriate extern declaration to string.h
to fix the warning.
Signed-off-by: Tommy Nguyen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/20170623143601.GA20743@NoChina
Signed-off-by: Ingo Molnar <[email protected]>
|
|
We need to cover two basic cases: when bootloader left us in 32-bit mode
and when bootloader enabled long mode.
The patch implements unified codepath to enabled 5-level paging for both
cases. It means case when we start in 32-bit mode, we first enable long
mode with 4-level and then switch over to 5-level paging.
Switching from 4-level to 5-level paging is not trivial. We cannot do it
directly. Setting LA57 in long mode would trigger #GP. So we need to
switch off long mode first and the then re-enable with 5-level paging.
NOTE: The need of switching off long mode means we are in trouble if
bootloader put us above 4G boundary. If bootloader wants to boot 5-level
paging kernel, it has to put kernel below 4G or enable 5-level paging on
it's own, so we could avoid the step.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
We would need to switch temporarily to compatibility mode during booting
with 5-level paging enabled. It would require 32-bit code segment
descriptor.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Reviewed-by: Matt Fleming <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Define __KERNEL_CS GDT entry as long mode (.L=1, .D=0) on 64-bit
configurations.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Reviewed-by: Matt Fleming <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
This is preparation for following patches without changing semantics of the
code.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Reviewed-by: Matt Fleming <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The kernel has several code paths that read CR3. Most of them assume that
CR3 contains the PGD's physical address, whereas some of them awkwardly
use PHYSICAL_PAGE_MASK to mask off low bits.
Add explicit mask macros for CR3 and convert all of the CR3 readers.
This will keep them from breaking when PCID is enabled.
Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: xen-devel <[email protected]>
Link: http://lkml.kernel.org/r/883f8fb121f4616c1c1427ad87350bb2f5ffeca1.1497288170.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
|
|
copy_from_gs() and copy_to_gs() are unused in the boot code. They have
actually never been used -- they were always commented out since their
addition in 2007:
5be865661516 ("String-handling functions for the new x86 setup code.")
So remove them -- they can be restored from history if needed.
Signed-off-by: Jiri Slaby <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The decompressor has its own implementation of the string functions,
but has to include the right header to get those, while implicitly
including linux/string.h may result in a link error:
arch/x86/boot/compressed/kaslr.o: In function `choose_random_location':
kaslr.c:(.text+0xf51): undefined reference to `_mmx_memcpy'
This has appeared now as KASLR started using memcpy(), via:
d52e7d5a952c ("x86/KASLR: Parse all 'memmap=' boot option entries")
Other files in the decompressor already do the same thing.
Signed-off-by: Arnd Bergmann <[email protected]>
Acked-by: Baoquan He <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Garnier <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
boot options
The 'mem=' boot option limits the max address a system can use - any memory
region above the limit will be removed.
Furthermore, the 'memmap=nn[KMG]' variant (with no offset specified) has the same
behaviour as 'mem='.
KASLR needs to consider this when choosing the random position for
decompressing the kernel. Do it.
Tested-by: Masayoshi Mizuma <[email protected]>
Signed-off-by: Baoquan He <[email protected]>
Acked-by: Kees Cook <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
In commit:
f28442497b5c ("x86/boot: Fix KASLR and memmap= collision")
... the memmap= option is parsed so that KASLR can avoid those reserved
regions. It uses cmdline_find_option() to get the value if memmap=
is specified, however the problem is that cmdline_find_option() can only
find the last entry if multiple memmap entries are provided. This
is not correct.
Address this by checking each command line token for a "memmap=" match
and parse each instance instead of using cmdline_find_option().
Signed-off-by: Baoquan He <[email protected]>
Acked-by: Kees Cook <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The boot code Makefile contains a straight 'readelf' invocation. This
causes build warnings in cross compile environments, when there is no
unprefixed readelf accessible via $PATH.
Add the missing $(CROSS_COMPILE) prefix.
[ tglx: Rewrote changelog ]
Fixes: 98f78525371b ("x86/boot: Refuse to build with data relocations")
Signed-off-by: Rob Landley <[email protected]>
Acked-by: Kees Cook <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Paul Bolle <[email protected]>
Cc: "H.J. Lu" <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
Kernel identity mappings on x86-64 kernels are created in two
ways: by the early x86 boot code, or by kernel_ident_mapping_init().
Native kernels (which is the dominant usecase) use the former,
but the kexec and the hibernation code uses kernel_ident_mapping_init().
There's a subtle difference between these two ways of how identity
mappings are created, the current kernel_ident_mapping_init() code
creates identity mappings always using 2MB page(PMD level) - while
the native kernel boot path also utilizes gbpages where available.
This difference is suboptimal both for performance and for memory
usage: kernel_ident_mapping_init() needs to allocate pages for the
page tables when creating the new identity mappings.
This patch adds 1GB page(PUD level) support to kernel_ident_mapping_init()
to address these concerns.
The primary advantage would be better TLB coverage/performance,
because we'd utilize 1GB TLBs instead of 2MB ones.
It is also useful for machines with large number of memory to
save paging structure allocations(around 4MB/TB using 2MB page)
when setting identity mappings for all the memory, after using
1GB page it will consume only 8KB/TB.
( Note that this change alone does not activate gbpages in kexec,
we are doing that in a separate patch. )
Signed-off-by: Xunlei Pang <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The compressed boot function error() is used to halt execution, but it
wasn't marked with "noreturn". This fixes that in preparation for
supporting kernel FORTIFY_SOURCE, which uses the noreturn annotation
on panic, and calls error(). GCC would warn about a noreturn function
calling a non-noreturn function:
arch/x86/boot/compressed/misc.c: In function ‘fortify_panic’:
arch/x86/boot/compressed/misc.c:416:1: warning: ‘noreturn’ function does return
}
^
Signed-off-by: Kees Cook <[email protected]>
Cc: Daniel Micay <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Link: http://lkml.kernel.org/r/20170506045116.GA2879@beast
Signed-off-by: Ingo Molnar <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm updates from Ingo Molnar:
"The main x86 MM changes in this cycle were:
- continued native kernel PCID support preparation patches to the TLB
flushing code (Andy Lutomirski)
- various fixes related to 32-bit compat syscall returning address
over 4Gb in applications, launched from 64-bit binaries - motivated
by C/R frameworks such as Virtuozzo. (Dmitry Safonov)
- continued Intel 5-level paging enablement: in particular the
conversion of x86 GUP to the generic GUP code. (Kirill A. Shutemov)
- x86/mpx ABI corner case fixes/enhancements (Joerg Roedel)
- ... plus misc updates, fixes and cleanups"
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (62 commits)
mm, zone_device: Replace {get, put}_zone_device_page() with a single reference to fix pmem crash
x86/mm: Fix flush_tlb_page() on Xen
x86/mm: Make flush_tlb_mm_range() more predictable
x86/mm: Remove flush_tlb() and flush_tlb_current_task()
x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()
x86/mm/64: Fix crash in remove_pagetable()
Revert "x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation"
x86/boot/e820: Remove a redundant self assignment
x86/mm: Fix dump pagetables for 4 levels of page tables
x86/mpx, selftests: Only check bounds-vs-shadow when we keep shadow
x86/mpx: Correctly report do_mpx_bt_fault() failures to user-space
Revert "x86/mm/numa: Remove numa_nodemask_from_meminfo()"
x86/espfix: Add support for 5-level paging
x86/kasan: Extend KASAN to support 5-level paging
x86/mm: Add basic defines/helpers for CONFIG_X86_5LEVEL=y
x86/paravirt: Add 5-level support to the paravirt code
x86/mm: Define virtual memory map for 5-level paging
x86/asm: Remove __VIRTUAL_MASK_SHIFT==47 assert
x86/boot: Detect 5-level paging support
x86/mm/numa: Remove numa_nodemask_from_meminfo()
...
|
|
Dave found that a kdump kernel with KASLR enabled will reset to the BIOS
immediately if physical randomization failed to find a new position for
the kernel. A kernel with the 'nokaslr' option works in this case.
The reason is that KASLR will install a new page table for the identity
mapping, while it missed building it for the original kernel location
if KASLR physical randomization fails.
This only happens in the kexec/kdump kernel, because the identity mapping
has been built for kexec/kdump in the 1st kernel for the whole memory by
calling init_pgtable(). Here if physical randomizaiton fails, it won't build
the identity mapping for the original area of the kernel but change to a
new page table '_pgtable'. Then the kernel will triple fault immediately
caused by no identity mappings.
The normal kernel won't see this bug, because it comes here via startup_32()
and CR3 will be set to _pgtable already. In startup_32() the identity
mapping is built for the 0~4G area. In KASLR we just append to the existing
area instead of entirely overwriting it for on-demand identity mapping
building. So the identity mapping for the original area of kernel is still
there.
To fix it we just switch to the new identity mapping page table when physical
KASLR succeeds. Otherwise we keep the old page table unchanged just like
"nokaslr" does.
Signed-off-by: Baoquan He <[email protected]>
Signed-off-by: Dave Young <[email protected]>
Acked-by: Kees Cook <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Garnier <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Yinghai Lu <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The minimum size for a new stack (512 bytes) setup for arch/x86/boot components
when the bootloader does not setup/provide a stack for the early boot components
is not "enough".
The setup code executing as part of early kernel startup code, uses the stack
beyond 512 bytes and accidentally overwrites and corrupts part of the BSS
section. This is exposed mostly in the early video setup code, where
it was corrupting BSS variables like force_x, force_y, which in-turn affected
kernel parameters such as screen_info (screen_info.orig_video_cols) and
later caused an exception/panic in console_init().
Most recent boot loaders setup the stack for early boot components, so this
stack overwriting into BSS section issue has not been exposed.
Signed-off-by: Ashish Kalra <[email protected]>
Cc: <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
There's a conflict between ongoing level-5 paging support and
the E820 rewrite. Since the E820 rewrite is essentially ready,
merge it into x86/mm to reduce tree conflicts.
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The E820 rework in WIP.x86/boot has gone through a couple of weeks
of exposure in -tip, merge it in a wider fashion.
Signed-off-by: Ingo Molnar <[email protected]>
|
|
In this initial implementation we force-require 5-level paging support
from the hardware, when compiled with CONFIG_X86_5LEVEL=y. (The kernel
will panic during boot on CPUs that don't support 5-level paging.)
We will implement boot-time switch between 4- and 5-level paging later.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Sparse complains about missing forward declarations:
arch/x86/boot/compressed/error.c:8:6:
warning: symbol 'warn' was not declared. Should it be static?
arch/x86/boot/compressed/error.c:15:6:
warning: symbol 'error' was not declared. Should it be static?
Include the missing header file.
Signed-off-by: Zhengyi Shen <[email protected]>
Acked-by: Kess Cook <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
Include declarations for various symbols defined in the error.h header file
to fix the following Sparse warnings:
arch/x86/boot/compressed/error.c:8:6:
warning: symbol 'warn' was not declared. Should it be static?
arch/x86/boot/compressed/error.c:15:6:
warning: symbol 'error' was not declared. Should it be static?
Signed-off-by: Zhengyi Shen <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Fixed/enhanced the changelog. ]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
updates
Conflicts:
arch/x86/xen/setup.c
Signed-off-by: Ingo Molnar <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Ingo Molnar:
"Misc updates:
- fix e820 error handling
- convert page table setup code from assembly to C
- fix kexec environment bug
- ... plus small cleanups"
* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/kconfig: Remove misleading note regarding hibernation and KASLR
x86/boot: Fix KASLR and memmap= collision
x86/e820/32: Fix e820_search_gap() error handling on x86-32
x86/boot/32: Convert the 32-bit pgtable setup code from assembly to C
x86/e820: Make e820_search_gap() static and remove unused variables
|
|
Get the firmware's secure-boot status in the kernel boot wrapper and stash
it somewhere that the main kernel image can find.
The efi_get_secureboot() function is extracted from the ARM stub and (a)
generalised so that it can be called from x86 and (b) made to use
efi_call_runtime() so that it can be run in mixed-mode.
For x86, it is stored in boot_params and can be overridden by the boot
loader or kexec. This allows secure-boot mode to be passed on to a new
kernel.
Suggested-by: Lukas Wunner <[email protected]>
Signed-off-by: David Howells <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
[ Small readability edits. ]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Provide the ability to perform mixed-mode runtime service calls for x86 in
the same way the following commit provided the ability to invoke for boot
services:
0a637ee61247bd ("x86/efi: Allow invocation of arbitrary boot services")
Suggested-by: Lukas Wunner <[email protected]>
Signed-off-by: David Howells <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Matt Fleming <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Eliminate the separate 32-bit and 64x- bit code paths by way of the shiny
new efi_call_proto() macro.
No functional change intended.
Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Matt Fleming <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
There's one ARM, one x86_32 and one x86_64 version which can be folded
into a single shared version by masking their differences with the shiny
new efi_call_proto() macro.
No functional change intended.
Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Matt Fleming <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Linus pointed out that relying on the compiler to pack structures with
enums is fragile not just for the kernel, but for external tooling as
well which might rely on our UAPI headers.
So separate the two from each other: introduce 'struct boot_e820_entry',
which is the boot protocol entry format.
This actually simplifies the code, as e820__update_table() is now never
called directly with boot protocol table entries - we can rely on
append_e820_table() and do a e820__update_table() call afterwards.
( This will allow further simplifications of __e820__update_table(),
but that will be done in a separate patch. )
This change also has the side effect of not modifying the bootparams structure
anymore - which might be useful for debugging. In theory we could even constify
the boot_params structure - at least from the E820 code's point of view.
Remove the uapi/asm/e820/types.h file, as it's not used anymore - all
kernel side E820 types are defined in asm/e820/types.h.
Reported-by: Linus Torvalds <[email protected]>
Cc: Alex Thorlton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Huang, Ying <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Paul Jackson <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
So there's a number of constants that start with "E820" but which
are not types - these create a confusing mixture when seen together
with 'enum e820_type' values:
E820MAP
E820NR
E820_X_MAX
E820MAX
To better differentiate the 'enum e820_type' values prefix them
with E820_TYPE_.
No change in functionality.
Cc: Alex Thorlton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Huang, Ying <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul Jackson <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
No change in functionality.
Cc: Alex Thorlton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Huang, Ying <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul Jackson <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
In line with the rename to 'struct e820_array', harmonize the naming of common e820
table variable names as well:
e820 => e820_array
e820_saved => e820_array_saved
e820_map => e820_array
initial_e820 => e820_array_init
This makes the variable names more consistent and easier to grep for.
No change in functionality.
Cc: Alex Thorlton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Huang, Ying <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul Jackson <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
'struct e820_array'
The 'e820entry' and 'e820map' names have various annoyances:
- the missing underscore departs from the usual kernel style
and makes the code look weird,
- in the past I kept confusing the 'map' with the 'entry', because
a 'map' is ambiguous in that regard,
- it's not really clear from the 'e820map' that this is a regular
C array.
Rename them to 'struct e820_entry' and 'struct e820_array' accordingly.
( Leave the legacy UAPI header alone but do the rename in the bootparam.h
and e820/types.h file - outside tools relying on these defines should
either adjust their code, or should use the legacy header, or should
create their private copies for the definitions. )
No change in functionality.
Cc: Alex Thorlton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Huang, Ying <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul Jackson <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
There's an assembly guard in asm/e820/types.h, and only
a single .S file includes this header: arch/x86/boot/header.S,
but it does not actually make use of any of the E820 defines.
Remove the inclusion and remove the assembly guard as well.
No change in functionality.
Cc: Alex Thorlton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Huang, Ying <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul Jackson <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
A commonly used lowlevel x86 header, asm/pgtable.h, includes asm/e820/api.h
spuriously, without making direct use of it.
Removing it is not simple: over the years various .c code learned to rely
on this indirect inclusion.
Remove the unnecessary include - this should speed up the kernel build a bit,
as a large header is not included anymore in totally unrelated code.
Cc: Alex Thorlton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Huang, Ying <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul Jackson <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
In line with asm/e820/types.h, move the e820 API declarations to
asm/e820/api.h and update all usage sites.
This is just a mechanical, obviously correct move & replace patch,
there will be subsequent changes to clean up the code and to make
better use of the new header organization.
Cc: Alex Thorlton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Huang, Ying <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Paul Jackson <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Wei Yang <[email protected]>
Cc: Yinghai Lu <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Signed-off-by: Ingo Molnar <[email protected]>
|
|
CONFIG_RANDOMIZE_BASE=y relocates the kernel to a random base address.
However it does not take into account the memmap= parameter passed in from
the kernel command line. This results in the kernel sometimes being put in
the middle of memmap.
Teach KASLR to not insert the kernel in memmap defined regions. We support
up to 4 memmap regions: any additional regions will cause KASLR to disable.
The mem_avoid set has been augmented to add up to 4 unusable regions of
memmaps provided by the user to exclude those regions from the set of valid
address range to insert the uncompressed kernel image.
The nn@ss ranges will be skipped by the mem_avoid set since it indicates
that memory is useable.
Signed-off-by: Dave Jiang <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Acked-by: Kees Cook <[email protected]>
Acked-by: Baoquan He <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/148417664156.131935.2248592164852799738.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Add the missing declarations of basic string functions to string.h to allow
a clean build.
Fixes: 5be865661516 ("String-handling functions for the new x86 setup code.")
Signed-off-by: Nicholas Mc Guire <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
This reverts commit ed68d7e9b9cfb64f3045ffbcb108df03c09a0f98.
The patch wasn't quite correct -- there are non-Intel (and hence
non-486) CPUs that we support that don't have CPUID. Since we no
longer require CPUID for sync_core(), just revert the patch.
I think the relevant CPUs are Geode and Elan, but I'm not sure.
In principle, we should try to do better at identifying CPUID-less
CPUs in early boot, but that's more complicated.
Reported-by: One Thousand Gnomes <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Matthew Whitehead <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Henrique de Moraes Holschuh <[email protected]>
Cc: Andrew Cooper <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: xen-devel <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: http://lkml.kernel.org/r/82acde18a108b8e353180dd6febcc2876df33f24.1481307769.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial updates from Jiri Kosina.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
NTB: correct ntb_spad_count comment typo
misc: ibmasm: fix typo in error message
Remove references to dead make variable LINUX_INCLUDE
Remove last traces of ikconfig.h
treewide: Fix printk() message errors
Documentation/device-mapper: s/getsize/getsz/
|
|
Commit 4fd06960f120 ("Use the new x86 setup code for i386") introduced a
reference to the make variable LINUX_INCLUDE. That reference got moved
around a bit and copied twice and now there are three references to it.
There has never been a definition of that variable. (Presumably that is
because it started out as a mistyped reference to LINUXINCLUDE.) So this
reference has always been an empty string. Let's remove it before it
spreads any further.
Signed-off-by: Paul Bolle <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Ingo Molnar:
"Misc cleanups/simplifications by Borislav Petkov, Paul Bolle and Wei
Yang"
* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot/64: Optimize fixmap page fixup
x86/boot: Simplify the GDTR calculation assembly code a bit
x86/boot/build: Remove always empty $(USERINCLUDE)
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Ingo Molnar:
"The main changes in this development cycle were:
- Implement EFI dev path parser and other changes to fully support
thunderbolt devices on Apple Macbooks (Lukas Wunner)
- Add RNG seeding via the EFI stub, on ARM/arm64 (Ard Biesheuvel)
- Expose EFI framebuffer configuration to user-space, to improve
tooling (Peter Jones)
- Misc fixes and cleanups (Ivan Hu, Wei Yongjun, Yisheng Xie, Dan
Carpenter, Roy Franz)"
* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/libstub: Make efi_random_alloc() allocate below 4 GB on 32-bit
thunderbolt: Compile on x86 only
thunderbolt, efi: Fix Kconfig dependencies harder
thunderbolt, efi: Fix Kconfig dependencies
thunderbolt: Use Device ROM retrieved from EFI
x86/efi: Retrieve and assign Apple device properties
efi: Allow bitness-agnostic protocol calls
efi: Add device path parser
efi/arm*/libstub: Invoke EFI_RNG_PROTOCOL to seed the UEFI RNG table
efi/libstub: Add random.c to ARM build
efi: Add support for seeding the RNG from a UEFI config table
MAINTAINERS: Add ARM and arm64 EFI specific files to EFI subsystem
efi/libstub: Fix allocation size calculations
efi/efivar_ssdt_load: Don't return success on allocation failure
efifb: Show framebuffer layout as device attributes
efi/efi_test: Use memdup_user() as a cleanup
efi/efi_test: Fix uninitialized variable 'rv'
efi/efi_test: Fix uninitialized variable 'datasize'
efi/arm*: Fix efi_init() error handling
efi: Remove unused include of <linux/version.h>
|
|
Since the bootloader may load the compressed x86 kernel at any address,
it should always be built as PIE, not just when CONFIG_RELOCATABLE=y.
Otherwise, linker in binutils 2.27 will optimize GOT load into the
absolute address when building the compressed x86 kernel as a non-PIE
executable.
Signed-off-by: H.J. Lu <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
[ Small wording changes. ]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Linux will have all kinds of sporadic problems on systems that don't
have the CPUID instruction unless CONFIG_M486=y. In particular,
sync_core() will explode.
I believe that these kernels had a better chance of working before
commit 05fb3c199bb0 ("x86/boot: Initialize FPU and X86_FEATURE_ALWAYS
even if we don't have CPUID"). That commit inadvertently fixed a
serious bug: we used to fail to detect the FPU if CPUID wasn't
present. Because we also used to forget to set X86_FEATURE_ALWAYS, we
end up with no cpu feature bits set at all. This meant that
alternative patching didn't do anything and, if paravirt was disabled,
we could plausibly finish the entire boot process without calling
sync_core().
Rather than trying to work around these issues, just have the kernel
fail loudly if it's running on a CPUID-less 486, doesn't have CPUID,
and doesn't have CONFIG_M486 set.
Reported-by: Matthew Whitehead <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/70eac6639f23df8be5fe03fa1984aedd5d40077a.1479598603.git.luto@kernel.org
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Apple's EFI drivers supply device properties which are needed to support
Macs optimally. They contain vital information which cannot be obtained
any other way (e.g. Thunderbolt Device ROM). They're also used to convey
the current device state so that OS drivers can pick up where EFI
drivers left (e.g. GPU mode setting).
There's an EFI driver dubbed "AAPL,PathProperties" which implements a
per-device key/value store. Other EFI drivers populate it using a custom
protocol. The macOS bootloader /System/Library/CoreServices/boot.efi
retrieves the properties with the same protocol. The kernel extension
AppleACPIPlatform.kext subsequently merges them into the I/O Kit
registry (see ioreg(8)) where they can be queried by other kernel
extensions and user space.
This commit extends the efistub to retrieve the device properties before
ExitBootServices is called. It assigns them to devices in an fs_initcall
so that they can be queried with the API in <linux/property.h>.
Note that the device properties will only be available if the kernel is
booted with the efistub. Distros should adjust their installers to
always use the efistub on Macs. grub with the "linux" directive will not
work unless the functionality of this commit is duplicated in grub.
(The "linuxefi" directive should work but is not included upstream as of
this writing.)
The custom protocol has GUID 91BD12FE-F6C3-44FB-A5B7-5122AB303AE0 and
looks like this:
typedef struct {
unsigned long version; /* 0x10000 */
efi_status_t (*get) (
IN struct apple_properties_protocol *this,
IN struct efi_dev_path *device,
IN efi_char16_t *property_name,
OUT void *buffer,
IN OUT u32 *buffer_len);
/* EFI_SUCCESS, EFI_NOT_FOUND, EFI_BUFFER_TOO_SMALL */
efi_status_t (*set) (
IN struct apple_properties_protocol *this,
IN struct efi_dev_path *device,
IN efi_char16_t *property_name,
IN void *property_value,
IN u32 property_value_len);
/* allocates copies of property name and value */
/* EFI_SUCCESS, EFI_OUT_OF_RESOURCES */
efi_status_t (*del) (
IN struct apple_properties_protocol *this,
IN struct efi_dev_path *device,
IN efi_char16_t *property_name);
/* EFI_SUCCESS, EFI_NOT_FOUND */
efi_status_t (*get_all) (
IN struct apple_properties_protocol *this,
OUT void *buffer,
IN OUT u32 *buffer_len);
/* EFI_SUCCESS, EFI_BUFFER_TOO_SMALL */
} apple_properties_protocol;
Thanks to Pedro Vilaça for this blog post which was helpful in reverse
engineering Apple's EFI drivers and bootloader:
https://reverse.put.as/2016/06/25/apple-efi-firmware-passwords-and-the-scbo-myth/
If someone at Apple is reading this, please note there's a memory leak
in your implementation of the del() function as the property struct is
freed but the name and value allocations are not.
Neither the macOS bootloader nor Apple's EFI drivers check the protocol
version, but we do to avoid breakage if it's ever changed. It's been the
same since at least OS X 10.6 (2009).
The get_all() function conveniently fills a buffer with all properties
in marshalled form which can be passed to the kernel as a setup_data
payload. The number of device properties is dynamic and can change
between a first invocation of get_all() (to determine the buffer size)
and a second invocation (to retrieve the actual buffer), hence the
peculiar loop which does not finish until the buffer size settles.
The macOS bootloader does the same.
The setup_data payload is later on unmarshalled in an fs_initcall. The
idea is that most buses instantiate devices in "subsys" initcall level
and drivers are usually bound to these devices in "device" initcall
level, so we assign the properties in-between, i.e. in "fs" initcall
level.
This assumes that devices to which properties pertain are instantiated
from a "subsys" initcall or earlier. That should always be the case
since on macOS, AppleACPIPlatformExpert::matchEFIDevicePath() only
supports ACPI and PCI nodes and we've fully scanned those buses during
"subsys" initcall level.
The second assumption is that properties are only needed from a "device"
initcall or later. Seems reasonable to me, but should this ever not work
out, an alternative approach would be to store the property sets e.g. in
a btree early during boot. Then whenever device_add() is called, an EFI
Device Path would have to be constructed for the newly added device,
and looked up in the btree. That way, the property set could be assigned
to the device immediately on instantiation. And this would also work for
devices instantiated in a deferred fashion. It seems like this approach
would be more complicated and require more code. That doesn't seem
justified without a specific use case.
For comparison, the strategy on macOS is to assign properties to objects
in the ACPI namespace (AppleACPIPlatformExpert::mergeEFIProperties()).
That approach is definitely wrong as it fails for devices not present in
the namespace: The NHI EFI driver supplies properties for attached
Thunderbolt devices, yet on Macs with Thunderbolt 1 only one device
level behind the host controller is described in the namespace.
Consequently macOS cannot assign properties for chained devices. With
Thunderbolt 2 they started to describe three device levels behind host
controllers in the namespace but this grossly inflates the SSDT and
still fails if the user daisy-chained more than three devices.
We copy the property names and values from the setup_data payload to
swappable virtual memory and afterwards make the payload available to
the page allocator. This is just for the sake of good housekeeping, it
wouldn't occupy a meaningful amount of physical memory (4444 bytes on my
machine). Only the payload is freed, not the setup_data header since
otherwise we'd break the list linkage and we cannot safely update the
predecessor's ->next link because there's no locking for the list.
The payload is currently not passed on to kexec'ed kernels, same for PCI
ROMs retrieved by setup_efi_pci(). This can be added later if there is
demand by amending setup_efi_state(). The payload can then no longer be
made available to the page allocator of course.
Tested-by: Lukas Wunner <[email protected]> [MacBookPro9,1]
Tested-by: Pierre Moreau <[email protected]> [MacBookPro11,3]
Signed-off-by: Lukas Wunner <[email protected]>
Signed-off-by: Matt Fleming <[email protected]>
Cc: Andreas Noever <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Pedro Vilaça <[email protected]>
Cc: Peter Jones <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|