Age | Commit message (Collapse) | Author | Files | Lines |
|
In order to change the {JMP,CALL}_NOSPEC macros to call out-of-line
versions of the retpoline magic, we need to remove the '%' from the
argument, such that we can paste it onto symbol names.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Increase legibility by adding whitespace to the efi_config_table_type_t
arrays that describe which EFI config tables we look for when going over
the firmware provided list. While at it, replace the 'name' char pointer
with a char array, which is more space efficient on relocatable 64-bit
kernels, as it avoids a 8 byte pointer and the associated relocation
data (24 bytes when using RELA format)
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
Commit
d9e3d2c4f10320 ("efi/x86: Don't map the entire kernel text RW for mixed mode")
updated the code that creates the 1:1 memory mapping to use read-only
attributes for the 1:1 alias of the kernel's text and rodata sections, to
protect it from inadvertent modification. However, it failed to take into
account that the unused gap between text and rodata is given to the page
allocator for general use.
If the vmap'ed stack happens to be allocated from this region, any by-ref
output arguments passed to EFI runtime services that are allocated on the
stack (such as the 'datasize' argument taken by GetVariable() when invoked
from efivar_entry_size()) will be referenced via a read-only mapping,
resulting in a page fault if the EFI code tries to write to it:
BUG: unable to handle page fault for address: 00000000386aae88
#PF: supervisor write access in kernel mode
#PF: error_code(0x0003) - permissions violation
PGD fd61063 P4D fd61063 PUD fd62063 PMD 386000e1
Oops: 0003 [#1] SMP PTI
CPU: 2 PID: 255 Comm: systemd-sysv-ge Not tainted 5.6.0-rc4-default+ #22
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0008:0x3eaeed95
Code: ... <89> 03 be 05 00 00 80 a1 74 63 b1 3e 83 c0 48 e8 44 d2 ff ff eb 05
RSP: 0018:000000000fd73fa0 EFLAGS: 00010002
RAX: 0000000000000001 RBX: 00000000386aae88 RCX: 000000003e9f1120
RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000001
RBP: 000000000fd73fd8 R08: 00000000386aae88 R09: 0000000000000000
R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000000
R13: ffffc0f040220000 R14: 0000000000000000 R15: 0000000000000000
FS: 00007f21160ac940(0000) GS:ffff9cf23d500000(0000) knlGS:0000000000000000
CS: 0008 DS: 0018 ES: 0018 CR0: 0000000080050033
CR2: 00000000386aae88 CR3: 000000000fd6c004 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
Modules linked in:
CR2: 00000000386aae88
---[ end trace a8bfbd202e712834 ]---
Let's fix this by remapping text and rodata individually, and leave the
gaps mapped read-write.
Fixes: d9e3d2c4f10320 ("efi/x86: Don't map the entire kernel text RW for mixed mode")
Reported-by: Jiri Slaby <[email protected]>
Tested-by: Jiri Slaby <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
efi_thunk_set_variable() treated the NULL "data" pointer as an invalid
parameter, and this broke the deletion of variables in mixed mode.
This commit fixes the check of data so that the userspace program can
delete a variable in mixed mode.
Fixes: 8319e9d5ad98ffcc ("efi/x86: Handle by-ref arguments covering multiple pages in mixed mode")
Signed-off-by: Gary Lin <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"The main changes in this cycle were:
Kernel side changes:
- A couple of x86/cpu cleanups and changes were grandfathered in due
to patch dependencies. These clean up the set of CPU model/family
matching macros with a consistent namespace and C99 initializer
style.
- A bunch of updates to various low level PMU drivers:
* AMD Family 19h L3 uncore PMU
* Intel Tiger Lake uncore support
* misc fixes to LBR TOS sampling
- optprobe fixes
- perf/cgroup: optimize cgroup event sched-in processing
- misc cleanups and fixes
Tooling side changes are to:
- perf {annotate,expr,record,report,stat,test}
- perl scripting
- libapi, libperf and libtraceevent
- vendor events on Intel and S390, ARM cs-etm
- Intel PT updates
- Documentation changes and updates to core facilities
- misc cleanups, fixes and other enhancements"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (89 commits)
cpufreq/intel_pstate: Fix wrong macro conversion
x86/cpu: Cleanup the now unused CPU match macros
hwrng: via_rng: Convert to new X86 CPU match macros
crypto: Convert to new CPU match macros
ASoC: Intel: Convert to new X86 CPU match macros
powercap/intel_rapl: Convert to new X86 CPU match macros
PCI: intel-mid: Convert to new X86 CPU match macros
mmc: sdhci-acpi: Convert to new X86 CPU match macros
intel_idle: Convert to new X86 CPU match macros
extcon: axp288: Convert to new X86 CPU match macros
thermal: Convert to new X86 CPU match macros
hwmon: Convert to new X86 CPU match macros
platform/x86: Convert to new CPU match macros
EDAC: Convert to new X86 CPU match macros
cpufreq: Convert to new X86 CPU match macros
ACPI: Convert to new X86 CPU match macros
x86/platform: Convert to new CPU match macros
x86/kernel: Convert to new CPU match macros
x86/kvm: Convert to new CPU match macros
x86/perf/events: Convert to new CPU match macros
...
|
|
Conflicts:
arch/x86/events/intel/uncore.c
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The new macro set has a consistent namespace and uses C99 initializers
instead of the grufty C89 ones.
Get rid the of the local macro wrappers for consistency.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Commit:
59f2a619a2db8611 ("efi: Add 'runtime' pointer to struct efi")
modified the assembler routine called by efi_set_virtual_address_map(),
to grab the 'runtime' EFI service pointer while running with paging
disabled (which is tricky to do in C code)
After the change, register %ebx is not restored correctly, resulting
in all kinds of weird behavior, so fix that.
Reported-by: Guenter Roeck <[email protected]>
Tested-by: Guenter Roeck <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
|
|
Signed-off-by: Ingo Molnar <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi into efi/core
More EFI updates for v5.7
- Incorporate a stable branch with the EFI pieces of Hans's work on
loading device firmware from EFI boot service memory regions
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Just like with PCI options ROMs, which we save in the setup_efi_pci*
functions from arch/x86/boot/compressed/eboot.c, the EFI code / ROM itself
sometimes may contain data which is useful/necessary for peripheral drivers
to have access to.
Specifically the EFI code may contain an embedded copy of firmware which
needs to be (re)loaded into the peripheral. Normally such firmware would be
part of linux-firmware, but in some cases this is not feasible, for 2
reasons:
1) The firmware is customized for a specific use-case of the chipset / use
with a specific hardware model, so we cannot have a single firmware file
for the chipset. E.g. touchscreen controller firmwares are compiled
specifically for the hardware model they are used with, as they are
calibrated for a specific model digitizer.
2) Despite repeated attempts we have failed to get permission to
redistribute the firmware. This is especially a problem with customized
firmwares, these get created by the chip vendor for a specific ODM and the
copyright may partially belong with the ODM, so the chip vendor cannot
give a blanket permission to distribute these.
This commit adds support for finding peripheral firmware embedded in the
EFI code and makes the found firmware available through the new
efi_get_embedded_fw() function.
Support for loading these firmwares through the standard firmware loading
mechanism is added in a follow-up commit in this patch-series.
Note we check the EFI_BOOT_SERVICES_CODE for embedded firmware near the end
of start_kernel(), just before calling rest_init(), this is on purpose
because the typical EFI_BOOT_SERVICES_CODE memory-segment is too large for
early_memremap(), so the check must be done after mm_init(). This relies
on EFI_BOOT_SERVICES_CODE not being free-ed until efi_free_boot_services()
is called, which means that this will only work on x86 for now.
Reported-by: Dave Olsthoorn <[email protected]>
Suggested-by: Peter Jones <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
Sometimes it is useful to be able to dump the efi boot-services code and
data. This commit adds these as debugfs-blobs to /sys/kernel/debug/efi,
but only if efi=debug is passed on the kernel-commandline as this requires
not freeing those memory-regions, which costs 20+ MB of RAM.
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Acked-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
When booting with SME active, EFI tables must be mapped unencrypted since
they were built by UEFI in unencrypted memory. Update the list of tables
to be checked during early_memremap() processing to account for the EFI
RNG seed table.
Signed-off-by: Tom Lendacky <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Heinrich Schuchardt <[email protected]>
Link: https://lore.kernel.org/r/b64385fc13e5d7ad4b459216524f138e7879234f.1582662842.git.thomas.lendacky@amd.com
Link: https://lore.kernel.org/r/[email protected]
|
|
When booting with SME active, EFI tables must be mapped unencrypted since
they were built by UEFI in unencrypted memory. Update the list of tables
to be checked during early_memremap() processing to account for the EFI
TPM tables.
This fixes a bug where an EFI TPM log table has been created by UEFI, but
it lives in memory that has been marked as usable rather than reserved.
Signed-off-by: Tom Lendacky <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Heinrich Schuchardt <[email protected]>
Cc: <[email protected]> # v5.4+
Link: https://lore.kernel.org/r/4144cd813f113c20cdfa511cf59500a64e6015be.1582662842.git.thomas.lendacky@amd.com
Link: https://lore.kernel.org/r/[email protected]
|
|
The mixed mode runtime wrappers are fragile when it comes to how the
memory referred to by its pointer arguments are laid out in memory, due
to the fact that it translates these addresses to physical addresses that
the runtime services can dereference when running in 1:1 mode. Since
vmalloc'ed pages (including the vmap'ed stack) are not contiguous in the
physical address space, this scheme only works if the referenced memory
objects do not cross page boundaries.
Currently, the mixed mode runtime service wrappers require that all by-ref
arguments that live in the vmalloc space have a size that is a power of 2,
and are aligned to that same value. While this is a sensible way to
construct an object that is guaranteed not to cross a page boundary, it is
overly strict when it comes to checking whether a given object violates
this requirement, as we can simply take the physical address of the first
and the last byte, and verify that they point into the same physical page.
When this check fails, we emit a WARN(), but then simply proceed with the
call, which could cause data corruption if the next physical page belongs
to a mapping that is entirely unrelated.
Given that with vmap'ed stacks, this condition is much more likely to
trigger, let's relax the condition a bit, but fail the runtime service
call if it does trigger.
Fixes: f6697df36bdf0bf7 ("x86/efi: Prevent mixed mode boot corruption with CONFIG_VMAP_STACK=y")
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Mixed mode calls at runtime are rather tricky with vmap'ed stacks,
as we can no longer assume that data passed in by the callers of the
EFI runtime wrapper routines is contiguous in physical memory.
We need to fix this, but before we do, let's drop the implementations
of routines that we know are never used on x86, i.e., the RTC related
ones. Given that UEFI rev2.8 permits any runtime service to return
EFI_UNSUPPORTED at runtime, let's return that instead.
As get_next_high_mono_count() is never used at all, even on other
architectures, let's make that return EFI_UNSUPPORTED too.
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Hans reports that his mixed mode systems running v5.6-rc1 kernels hit
the WARN_ON() in virt_to_phys_or_null_size(), caused by the fact that
efi_guid_t objects on the vmap'ed stack happen to be misaligned with
respect to their sizes. As a quick (i.e., backportable) fix, copy GUID
pointer arguments to the local stack into a buffer that is naturally
aligned to its size, so that it is guaranteed to cover only one
physical page.
Note that on x86, we cannot rely on the stack pointer being aligned
the way the compiler expects, so we need to allocate an 8-byte aligned
buffer of sufficient size, and copy the GUID into that buffer at an
offset that is aligned to 16 bytes.
Fixes: f6697df36bdf0bf7 ("x86/efi: Prevent mixed mode boot corruption with CONFIG_VMAP_STACK=y")
Reported-by: Hans de Goede <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Hans de Goede <[email protected]>
Cc: [email protected]
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The systab member in struct efi has outlived its usefulness, now that
we have better ways to access the only piece of information we are
interested in after init, which is the EFI runtime services table
address. So instead of instantiating a doctored copy at early boot
with lots of mangled values, and switching the pointer when switching
into virtual mode, let's grab the values we need directly, and get
rid of the systab pointer entirely.
Tested-by: Tony Luck <[email protected]> # arch/ia64
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
Instead of going through the EFI system table each time, just copy the
runtime services table pointer into struct efi directly. This is the
last use of the system table pointer in struct efi, allowing us to
drop it in a future patch, along with a fair amount of quirky handling
of the translated address.
Note that usually, the runtime services pointer changes value during
the call to SetVirtualAddressMap(), so grab the updated value as soon
as that call returns. (Mixed mode uses a 1:1 mapping, and kexec boot
enters with the updated address in the system table, so in those cases,
we don't need to do anything here)
Tested-by: Tony Luck <[email protected]> # arch/ia64
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
efi.runtime_version is always set to the same value on both
existing code paths, so just set it earlier from a shared one.
Tested-by: Tony Luck <[email protected]> # arch/ia64
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
There is some code that exposes physical addresses of certain parts of
the EFI firmware implementation via sysfs nodes. These nodes are only
used on x86, and are of dubious value to begin with, so let's move
their handling into the x86 arch code.
Tested-by: Tony Luck <[email protected]> # arch/ia64
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
Since commit 33b85447fa61946b ("efi/x86: Drop two near identical versions
of efi_runtime_init()"), we no longer map the EFI runtime services table
before calling SetVirtualAddressMap(), which means we don't need the 1:1
mapped physical address of this table, and so there is no point in passing
the address via EFI setup data on kexec boot.
Note that the kexec tools will still look for this address in sysfs, so
we still need to provide it.
Tested-by: Tony Luck <[email protected]> # arch/ia64
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
config_parse_tables() is a jumble of pointer arithmetic, due to the
fact that on x86, we may be dealing with firmware whose native word
size differs from the kernel's.
This is not a concern on other architectures, and doesn't quite
justify the state of the code, so let's clean it up by adding a
non-x86 code path, constifying statically allocated tables and
replacing preprocessor conditionals with IS_ENABLED() checks.
Tested-by: Tony Luck <[email protected]> # arch/ia64
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
The efi_config_init() routine is no longer shared with ia64 so let's
move it into the x86 arch code before making further x86 specific
changes to it.
Tested-by: Tony Luck <[email protected]> # arch/ia64
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
We have three different versions of the code that checks the EFI system
table revision and copies the firmware vendor string, and they are
mostly equivalent, with the exception of the use of early_memremap_ro
vs. __va() and the lowest major revision to warn about. Let's move this
into common code and factor out the commonalities.
Tested-by: Tony Luck <[email protected]> # arch/ia64
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
The memory attributes table is only used at init time by the core EFI
code, so there is no need to carry its address in struct efi that is
shared with the world. So move it out, and make it __ro_after_init as
well, considering that the value is set during early boot.
Tested-by: Tony Luck <[email protected]> # arch/ia64
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
The UGA table is x86 specific (its handling was introduced when the
EFI support code was modified to accommodate IA32), so there is no
need to handle it in generic code.
The EFI properties table is not strictly x86 specific, but it was
deprecated almost immediately after having been introduced, due to
implementation difficulties. Only x86 takes it into account today,
and this is not going to change, so make this table x86 only as well.
Tested-by: Tony Luck <[email protected]> # arch/ia64
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
The HCDP and MPS tables are Itanium specific EFI config tables, so
move their handling to ia64 arch code.
Tested-by: Tony Luck <[email protected]> # arch/ia64
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
Some plumbing exists to handle a UEFI configuration table of type
BOOT_INFO but since we never match it to a GUID anywhere, we never
actually register such a table, or access it, for that matter. So
simply drop all mentions of it.
Tested-by: Tony Luck <[email protected]> # arch/ia64
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
When possible, IS_ENABLED() conditionals are preferred over #ifdefs,
given that the latter hide the code from the compiler entirely, which
reduces build test coverage when the option is not enabled.
So replace an instance in the x86 efi startup code.
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
Reindent the efi_memory_map_data initializer so that all the = signs
are aligned vertically, making the resulting code much easier to read.
Suggested-by: Ingo Molnar <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fix from Thomas Gleixner:
"A single fix for a EFI boot regression on X86 which was caused by the
recent rework of the EFI memory map parsing. On systems with invalid
memmap entries the cleanup function uses an value which cannot be
relied on in this stage. Use the actual EFI memmap entry instead"
* tag 'efi-urgent-2020-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/x86: Fix boot regression on systems with invalid memmap entries
|
|
To enable x86 to use the generic walk_page_range() function, the callers
of ptdump_walk_pgd_level() need to pass an mm_struct rather than the raw
pgd_t pointer. Luckily since commit 7e904a91bf60 ("efi: Use efi_mm in x86
as well as ARM") we now have an mm_struct for EFI on x86.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Steven Price <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Alexandre Ghiti <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Hogan <[email protected]>
Cc: James Morse <[email protected]>
Cc: Jerome Glisse <[email protected]>
Cc: "Liang, Kan" <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Russell King <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Zong Li <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
In efi_clean_memmap(), we do a pass over the EFI memory map to remove
bogus entries that may be returned on certain systems.
This recent commit:
1db91035d01aa8bf ("efi: Add tracking for dynamically allocated memmaps")
refactored this code to pass the input to efi_memmap_install() via a
temporary struct on the stack, which is populated using an initializer
which inadvertently defines the value of its size field in terms of its
desc_size field, which value cannot be relied upon yet in the initializer
itself.
Fix this by using efi.memmap.desc_size instead, which is where we get
the value for desc_size from in the first place.
Reported-by: Jörg Otte <[email protected]>
Tested-by: Jörg Otte <[email protected]>
Tested-by: Dan Williams <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Ingo Molnar:
"The main changes in this cycle were:
- Cleanup of the GOP [graphics output] handling code in the EFI stub
- Complete refactoring of the mixed mode handling in the x86 EFI stub
- Overhaul of the x86 EFI boot/runtime code
- Increase robustness for mixed mode code
- Add the ability to disable DMA at the root port level in the EFI
stub
- Get rid of RWX mappings in the EFI memory map and page tables,
where possible
- Move the support code for the old EFI memory mapping style into its
only user, the SGI UV1+ support code.
- plus misc fixes, updates, smaller cleanups.
... and due to interactions with the RWX changes, another round of PAT
cleanups make a guest appearance via the EFI tree - with no side
effects intended"
* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (75 commits)
efi/x86: Disable instrumentation in the EFI runtime handling code
efi/libstub/x86: Fix EFI server boot failure
efi/x86: Disallow efi=old_map in mixed mode
x86/boot/compressed: Relax sed symbol type regex for LLVM ld.lld
efi/x86: avoid KASAN false positives when accessing the 1: 1 mapping
efi: Fix handling of multiple efi_fake_mem= entries
efi: Fix efi_memmap_alloc() leaks
efi: Add tracking for dynamically allocated memmaps
efi: Add a flags parameter to efi_memory_map
efi: Fix comment for efi_mem_type() wrt absent physical addresses
efi/arm: Defer probe of PCIe backed efifb on DT systems
efi/x86: Limit EFI old memory map to SGI UV machines
efi/x86: Avoid RWX mappings for all of DRAM
efi/x86: Don't map the entire kernel text RW for mixed mode
x86/mm: Fix NX bit clearing issue in kernel_map_pages_in_pgd
efi/libstub/x86: Fix unused-variable warning
efi/libstub/x86: Use mandatory 16-byte stack alignment in mixed mode
efi/libstub/x86: Use const attribute for efi_is_64bit()
efi: Allow disabling PCI busmastering on bridges during boot
efi/x86: Allow translating 64-bit arguments for mixed mode calls
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull header cleanup from Ingo Molnar:
"This is a treewide cleanup, mostly (but not exclusively) with x86
impact, which breaks implicit dependencies on the asm/realtime.h
header and finally removes it from asm/acpi.h"
* 'core-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/ACPI/sleep: Move acpi_get_wakeup_address() into sleep.c, remove <asm/realmode.h> from <asm/acpi.h>
ACPI/sleep: Convert acpi_wakeup_address into a function
x86/ACPI/sleep: Remove an unnecessary include of asm/realmode.h
ASoC: Intel: Skylake: Explicitly include linux/io.h for virt_to_phys()
vmw_balloon: Explicitly include linux/io.h for virt_to_phys()
virt: vbox: Explicitly include linux/io.h to pick up various defs
efi/capsule-loader: Explicitly include linux/io.h for page_to_phys()
perf/x86/intel: Explicitly include asm/io.h to use virt_to_phys()
x86/kprobes: Explicitly include vmalloc.h for set_vm_flush_reset_perms()
x86/ftrace: Explicitly include vmalloc.h for set_vm_flush_reset_perms()
x86/boot: Explicitly include realmode.h to handle RM reservations
x86/efi: Explicitly include realmode.h to handle RM trampoline quirk
x86/platform/intel/quark: Explicitly include linux/io.h for virt_to_phys()
x86/setup: Enhance the comments
x86/setup: Clean up the header portion of setup.c
|
|
We already disable KASAN instrumentation specifically for the
EFI routines that are known to dereference memory addresses that
KASAN does not know about, avoiding false positive KASAN splats.
However, as it turns out, having GCOV or KASAN instrumentation enabled
interferes with the compiler's ability to optimize away function calls
that are guarded by IS_ENABLED() checks that should have resulted in
those references to have been const-propagated out of existence. But
with instrumenation enabled, we may get build errors like:
ld: arch/x86/platform/efi/efi_64.o: in function `efi_thunk_set_virtual_address_map':
ld: arch/x86/platform/efi/efi_64.o: in function `efi_set_virtual_address_map':
in builds where CONFIG_EFI=y but CONFIG_EFI_MIXED or CONFIG_X86_UV are not
defined, even though the invocations are conditional on IS_ENABLED() checks
against the respective Kconfig symbols.
So let's disable instrumentation entirely for this subdirectory, which
isn't that useful here to begin with.
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: [email protected]
|
|
When installing the EFI virtual address map during early boot, we
access the EFI system table to retrieve the 1:1 mapped address of
the SetVirtualAddressMap() EFI runtime service. This memory is not
known to KASAN, so on KASAN enabled builds, this may result in a
splat like
==================================================================
BUG: KASAN: user-memory-access in efi_set_virtual_address_map+0x141/0x354
Read of size 4 at addr 000000003fbeef38 by task swapper/0/0
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.5.0-rc5+ #758
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Call Trace:
dump_stack+0x8b/0xbb
? efi_set_virtual_address_map+0x141/0x354
? efi_set_virtual_address_map+0x141/0x354
__kasan_report+0x176/0x192
? efi_set_virtual_address_map+0x141/0x354
kasan_report+0xe/0x20
efi_set_virtual_address_map+0x141/0x354
? efi_thunk_runtime_setup+0x148/0x148
? __inc_numa_state+0x19/0x90
? memcpy+0x34/0x50
efi_enter_virtual_mode+0x5fd/0x67d
start_kernel+0x5cd/0x682
? mem_encrypt_init+0x6/0x6
? x86_family+0x5/0x20
? load_ucode_bsp+0x46/0x154
secondary_startup_64+0xa4/0xb0
==================================================================
Since this code runs only a single time during early boot, let's annotate
it as __no_sanitize_address so KASAN disregards it entirely.
Fixes: 698294704573 ("efi/x86: Split SetVirtualAddresMap() wrappers into ...")
Reported-by: Qian Cai <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
In preparation for fixing efi_memmap_alloc() leaks, add support for
recording whether the memmap was dynamically allocated from slab,
memblock, or is the original physical memmap provided by the platform.
Given this tracking is established in efi_memmap_alloc() and needs to be
carried to efi_memmap_install(), use 'struct efi_memory_map_data' to
convey the flags.
Some small cleanups result from this reorganization, specifically the
removal of local variables for 'phys' and 'size' that are already
tracked in @data.
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
We carry a quirk in the x86 EFI code to switch back to an older
method of mapping the EFI runtime services memory regions, because
it was deemed risky at the time to implement a new method without
providing a fallback to the old method in case problems arose.
Such problems did arise, but they appear to be limited to SGI UV1
machines, and so these are the only ones for which the fallback gets
enabled automatically (via a DMI quirk). The fallback can be enabled
manually as well, by passing efi=old_map, but there is very little
evidence that suggests that this is something that is being relied
upon in the field.
Given that UV1 support is not enabled by default by the distros
(Ubuntu, Fedora), there is no point in carrying this fallback code
all the time if there are no other users. So let's move it into the
UV support code, and document that efi=old_map now requires this
support code to be enabled.
Note that efi=old_map has been used in the past on other SGI UV
machines to work around kernel regressions in production, so we
keep the option to enable it by hand, but only if the kernel was
built with UV support.
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The EFI code creates RWX mappings for all memory regions that are
occupied after the stub completes, and in the mixed mode case, it
even creates RWX mappings for all of the remaining DRAM as well.
Let's try to avoid this, by setting the NX bit for all memory
regions except the ones that are marked as EFI runtime services
code [which means text+rodata+data in practice, so we cannot mark
them read-only right away]. For cases of buggy firmware where boot
services code is called during SetVirtualAddressMap(), map those
regions with exec permissions as well - they will be unmapped in
efi_free_boot_services().
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The mixed mode thunking routine requires a part of it to be
mapped 1:1, and for this reason, we currently map the entire
kernel .text read/write in the EFI page tables, which is bad.
In fact, the kernel_map_pages_in_pgd() invocation that installs
this mapping is entirely redundant, since all of DRAM is already
1:1 mapped read/write in the EFI page tables when we reach this
point, which means that .rodata is mapped read-write as well.
So let's remap both .text and .rodata read-only in the EFI
page tables.
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
On x86 we need to thunk through assembler stubs to call the EFI services
for mixed mode, and for runtime services in 64-bit mode. The assembler
stubs have limits on how many arguments it handles. Introduce a few
macros to check that we do not try to pass too many arguments to the
stubs.
Signed-off-by: Arvind Sankar <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Matthew Garrett <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Remove some code that is guaranteed to be unreachable, given
that we have already bailed by this time if EFI_OLD_MEMMAP is
set.
Signed-off-by: Ard Biesheuvel <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Arvind Sankar <[email protected]>
Cc: Matthew Garrett <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The logic in __efi_enter_virtual_mode() does a number of steps in
sequence, all of which may fail in one way or the other. In most
cases, we simply print an error and disable EFI runtime services
support, but in some cases, we BUG() or panic() and bring down the
system when encountering conditions that we could easily handle in
the same way.
While at it, replace a pointless page-to-virt-phys conversion with
one that goes straight from struct page to physical.
Signed-off-by: Ard Biesheuvel <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Arvind Sankar <[email protected]>
Cc: Matthew Garrett <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Clean up the efi_systab_init() routine which maps the EFI system
table and copies the relevant pieces of data out of it.
The current routine is very difficult to read, so let's clean that
up. Also, switch to a R/O mapping of the system table since that is
all we need.
Finally, use a plain u64 variable to record the physical address of
the system table instead of pointlessly stashing it in a struct efi
that is never used for anything else.
Signed-off-by: Ard Biesheuvel <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Arvind Sankar <[email protected]>
Cc: Matthew Garrett <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The routines efi_runtime_init32() and efi_runtime_init64() are
almost indistinguishable, and the only relevant difference is
the offset in the runtime struct from where to obtain the physical
address of the SetVirtualAddressMap() routine.
However, this address is only used once, when installing the virtual
address map that the OS will use to invoke EFI runtime services, and
at the time of the call, we will necessarily be running with a 1:1
mapping, and so there is no need to do the map/unmap dance here to
retrieve the address. In fact, in the preceding changes to these users,
we stopped using the address recorded here entirely.
So let's just get rid of all this code since it no longer serves a
purpose. While at it, tweak the logic so that we handle unsupported
and disable EFI runtime services in the same way, and unmap the EFI
memory map in both cases.
Signed-off-by: Ard Biesheuvel <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Arvind Sankar <[email protected]>
Cc: Matthew Garrett <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Calling 32-bit EFI runtime services from a 64-bit OS involves
switching back to the flat mapping with a stack carved out of
memory that is 32-bit addressable.
There is no need to actually execute the 64-bit part of this
routine from the flat mapping as well, as long as the entry
and return address fit in 32 bits. There is also no need to
preserve part of the calling context in global variables: we
can simply push the old stack pointer value to the new stack,
and keep the return address from the code32 section in EBX.
While at it, move the conditional check whether to invoke
the mixed mode version of SetVirtualAddressMap() into the
64-bit implementation of the wrapper routine.
Signed-off-by: Ard Biesheuvel <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Arvind Sankar <[email protected]>
Cc: Matthew Garrett <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The efi_call() wrapper used to invoke EFI runtime services serves
a number of purposes:
- realign the stack to 16 bytes
- preserve FP and CR0 register state
- translate from SysV to MS calling convention.
Preserving CR0.TS is no longer necessary in Linux, and preserving the
FP register state is also redundant in most cases, since efi_call() is
almost always used from within the scope of a pair of kernel_fpu_begin()/
kernel_fpu_end() calls, with the exception of the early call to
SetVirtualAddressMap() and the SGI UV support code.
So let's add a pair of kernel_fpu_begin()/_end() calls there as well,
and remove the unnecessary code from the assembly implementation of
efi_call(), and only keep the pieces that deal with the stack
alignment and the ABI translation.
Signed-off-by: Ard Biesheuvel <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Arvind Sankar <[email protected]>
Cc: Matthew Garrett <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The variadic efi_call_phys() wrapper that exists on i386 was
originally created to call into any EFI firmware runtime service,
but in practice, we only use it once, to call SetVirtualAddressMap()
during early boot.
The flexibility provided by the variadic nature also makes it
type unsafe, and makes the assembler code more complicated than
needed, since it has to deal with an unknown number of arguments
living on the stack.
So clean this up, by renaming the helper to efi_call_svam(), and
dropping the unneeded complexity. Let's also drop the reference
to the efi_phys struct and grab the address from the EFI system
table directly.
Signed-off-by: Ard Biesheuvel <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Arvind Sankar <[email protected]>
Cc: Matthew Garrett <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|