Age | Commit message (Collapse) | Author | Files | Lines |
|
Track whether all the cores in the machine are vulnerable to Spectre-v2,
and whether all the vulnerable cores have been mitigated. We then expose
this information to userspace via sysfs.
Signed-off-by: Jeremy Linton <[email protected]>
Reviewed-by: Andre Przywara <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Tested-by: Stefan Wahren <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
Ensure we are always able to detect whether or not the CPU is affected
by Spectre-v2, so that we can later advertise this to userspace.
Signed-off-by: Jeremy Linton <[email protected]>
Reviewed-by: Andre Przywara <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Tested-by: Stefan Wahren <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
The SMCCC ARCH_WORKAROUND_1 service can indicate that although the
firmware knows about the Spectre-v2 mitigation, this particular
CPU is not vulnerable, and it is thus not necessary to call
the firmware on this CPU.
Let's use this information to our benefit.
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Jeremy Linton <[email protected]>
Reviewed-by: Andre Przywara <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Tested-by: Stefan Wahren <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
We currently have a list of CPUs affected by Spectre-v2, for which
we check that the firmware implements ARCH_WORKAROUND_1. It turns
out that not all firmwares do implement the required mitigation,
and that we fail to let the user know about it.
Instead, let's slightly revamp our checks, and rely on a whitelist
of cores that are known to be non-vulnerable, and let the user know
the status of the mitigation in the kernel log.
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Jeremy Linton <[email protected]>
Reviewed-by: Andre Przywara <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Tested-by: Stefan Wahren <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
We implement page table isolation as a mitigation for meltdown.
Report this to userspace via sysfs.
Signed-off-by: Jeremy Linton <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Reviewed-by: Andre Przywara <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Tested-by: Stefan Wahren <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
spectre-v1 has been mitigated and the mitigation is always active.
Report this to userspace via sysfs
Signed-off-by: Mian Yousaf Kaukab <[email protected]>
Signed-off-by: Jeremy Linton <[email protected]>
Reviewed-by: Andre Przywara <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Tested-by: Stefan Wahren <[email protected]>
Acked-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
There are various reasons, such as benchmarking, to disable spectrev2
mitigation on a machine. Provide a command-line option to do so.
Signed-off-by: Jeremy Linton <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Reviewed-by: Andre Przywara <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Tested-by: Stefan Wahren <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: [email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
With VHE different exception levels are used between the host (EL2) and
guest (EL1) with a shared exception level for userpace (EL0). We can take
advantage of this and use the PMU's exception level filtering to avoid
enabling/disabling counters in the world-switch code. Instead we just
modify the counter type to include or exclude EL0 at vcpu_{load,put} time.
We also ensure that trapped PMU system register writes do not re-enable
EL0 when reconfiguring the backing perf events.
This approach completely avoids blackout windows seen with !VHE.
Suggested-by: Christoffer Dall <[email protected]>
Signed-off-by: Andrew Murray <[email protected]>
Acked-by: Will Deacon <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
|
|
Add support for the :G and :H attributes in perf by handling the
exclude_host/exclude_guest event attributes.
We notify KVM of counters that we wish to be enabled or disabled on
guest entry/exit and thus defer from starting or stopping events based
on their event attributes.
With !VHE we switch the counters between host/guest at EL2. We are able
to eliminate counters counting host events on the boundaries of guest
entry/exit when using :G by filtering out EL2 for exclude_host. When
using !exclude_hv there is a small blackout window at the guest
entry/exit where host events are not captured.
Signed-off-by: Andrew Murray <[email protected]>
Acked-by: Will Deacon <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
|
|
The virt/arm core allocates a kvm_cpu_context_t percpu, at present this is
a typedef to kvm_cpu_context and is used to store host cpu context. The
kvm_cpu_context structure is also used elsewhere to hold vcpu context.
In order to use the percpu to hold additional future host information we
encapsulate kvm_cpu_context in a new structure and rename the typedef and
percpu to match.
Signed-off-by: Andrew Murray <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
|
|
The armv8pmu_enable_event_counter function issues an isb instruction
after enabling a pair of counters - this doesn't provide any value
and is inconsistent with the armv8pmu_disable_event_counter.
In any case armv8pmu_enable_event_counter is always called with the
PMU stopped. Starting the PMU with armv8pmu_start results in an isb
instruction being issued prior to writing to PMCR_EL0.
Let's remove the unnecessary isb instruction.
Signed-off-by: Andrew Murray <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
|
|
When pointer authentication is supported, a guest may wish to use it.
This patch adds the necessary KVM infrastructure for this to work, with
a semi-lazy context switch of the pointer auth state.
Pointer authentication feature is only enabled when VHE is built
in the kernel and present in the CPU implementation so only VHE code
paths are modified.
When we schedule a vcpu, we disable guest usage of pointer
authentication instructions and accesses to the keys. While these are
disabled, we avoid context-switching the keys. When we trap the guest
trying to use pointer authentication functionality, we change to eagerly
context-switching the keys, and enable the feature. The next time the
vcpu is scheduled out/in, we start again. However the host key save is
optimized and implemented inside ptrauth instruction/register access
trap.
Pointer authentication consists of address authentication and generic
authentication, and CPUs in a system might have varied support for
either. Where support for either feature is not uniform, it is hidden
from guests via ID register emulation, as a result of the cpufeature
framework in the host.
Unfortunately, address authentication and generic authentication cannot
be trapped separately, as the architecture provides a single EL2 trap
covering both. If we wish to expose one without the other, we cannot
prevent a (badly-written) guest from intermittently using a feature
which is not uniformly supported (when scheduled on a physical CPU which
supports the relevant feature). Hence, this patch expects both type of
authentication to be present in a cpu.
This switch of key is done from guest enter/exit assembly as preparation
for the upcoming in-kernel pointer authentication support. Hence, these
key switching routines are not implemented in C code as they may cause
pointer authentication key signing error in some situations.
Signed-off-by: Mark Rutland <[email protected]>
[Only VHE, key switch in full assembly, vcpu_has_ptrauth checks
, save host key in ptrauth exception trap]
Signed-off-by: Amit Daniel Kachhap <[email protected]>
Reviewed-by: Julien Thierry <[email protected]>
Cc: Christoffer Dall <[email protected]>
Cc: [email protected]
[maz: various fixups]
Signed-off-by: Marc Zyngier <[email protected]>
|
|
This patch provides support for reporting the presence of SVE2 and
its optional features to userspace.
This will also enable visibility of SVE2 for guests, when KVM
support for SVE-enabled guests is available.
Signed-off-by: Dave Martin <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
When kuser helpers are enabled the kernel maps the relative code at
a fixed address (0xffff0000). Making configurable the option to disable
them means that the kernel can remove this mapping and any access to
this memory area results in a sigfault.
Add a KUSER_HELPERS config option that can be used to disable the
mapping when it is turned off.
This option can be turned off if and only if the applications are
designed specifically for the platform and they do not make use of the
kuser helpers code.
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Vincenzo Frascino <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
[will: Use IS_ENABLED() instead of #ifdef]
Signed-off-by: Will Deacon <[email protected]>
|
|
aarch32_alloc_vdso_pages() needs to be refactored to make it
easier to disable kuser helpers.
Divide the function in aarch32_alloc_kuser_vdso_page() and
aarch32_alloc_sigreturn_vdso_page().
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Vincenzo Frascino <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
[will: Inlined sigpage allocation to simplify error paths]
Signed-off-by: Will Deacon <[email protected]>
|
|
To make it possible to disable kuser helpers in aarch32 we need to
divide the kuser and the sigreturn functionalities.
Split the current version of kuser32 in kuser32 (for kuser helpers)
and sigreturn32 (for sigreturn helpers).
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Vincenzo Frascino <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
For AArch32 tasks, we install a special "[vectors]" page that contains
the sigreturn trampolines and kuser helpers, which is mapped at a fixed
address specified by the kuser helpers ABI.
Having the sigreturn trampolines in the same page as the kuser helpers
makes it impossible to disable the kuser helpers independently.
Follow the Arm implementation, by moving the signal trampolines out of
the "[vectors]" page and into their own "[sigpage]".
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Vincenzo Frascino <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
[will: tweaked comments and fixed sparse warning]
Signed-off-by: Will Deacon <[email protected]>
|
|
Another bodge for the ftrace PLT code: plt_entries_equal() now takes
the place relative nature of the ADRP/ADD based PLT entries into
account, which means that a struct trampoline instance on the stack
is no longer equal to the same set of opcodes in the module struct,
given that they don't point to the same place in memory anymore.
Work around this by using memcmp() in the ftrace PLT handling code.
Acked-by: Will Deacon <[email protected]>
Tested-by: dann frazier <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
|
|
Currently the meanings of sve_vq_map and the ancillary helpers
__bit_to_vq() and __vq_to_bit() are not clearly explained.
This patch makes the explanatory comment clearer, and removes the
duplicate comment from fpsimd.h.
The WARN_ON() currently present in __bit_to_vq() confuses the
intended use of this helper. Since these are low-level helpers not
intended for general-purpose use anyway, it is better not to make
guesses about how these functions will be used: rather, this patch
removes the WARN_ON() and relies on callers to use the helpers
sensibly.
Suggested-by: Andrew Jones <[email protected]>
Signed-off-by: Dave Martin <[email protected]>
Reviewed-by: Andrew Jones <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
|
|
clock_getres() in the vDSO library has to preserve the same behaviour
of posix_get_hrtimer_res().
In particular, posix_get_hrtimer_res() does:
sec = 0;
ns = hrtimer_resolution;
where 'hrtimer_resolution' depends on whether or not high resolution
timers are enabled, which is a runtime decision.
The vDSO incorrectly returns the constant CLOCK_REALTIME_RES. Fix this
by exposing 'hrtimer_resolution' in the vDSO datapage and returning that
instead.
Reviewed-by: Catalin Marinas <[email protected]>
Signed-off-by: Vincenzo Frascino <[email protected]>
[will: Use WRITE_ONCE(), move adr off COARSE path, renumber labels, use 'w' reg]
Signed-off-by: Will Deacon <[email protected]>
|
|
Advertise ARM64_HAS_DCPODP when both DC CVAP and DC CVADP are supported.
Even though we don't use this feature now, we provide it for consistency
with DCPOP and anticipate it being used in the future.
Signed-off-by: Andrew Murray <[email protected]>
Reviewed-by: Dave Martin <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
ARMv8.5 builds upon the ARMv8.2 DC CVAP instruction by introducing a DC
CVADP instruction which cleans the data cache to the point of deep
persistence. Let's expose this support via the arm64 ELF hwcaps.
Signed-off-by: Andrew Murray <[email protected]>
Reviewed-by: Dave Martin <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
The ARMv8.5 DC CVADP instruction may be trapped to EL1 via
SCTLR_EL1.UCI therefore let's provide a handler for it.
Just like the CVAP instruction we use a 'sys' instruction instead of
the 'dc' alias to avoid build issues with older toolchains.
Signed-off-by: Andrew Murray <[email protected]>
Reviewed-by: Mark Rutland <[email protected]>
Reviewed-by: Dave Martin <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
The introduction of AT_HWCAP2 introduced accessors which ensure that
hwcap features are set and tested appropriately.
Let's now mandate access to elf_hwcap via these accessors by making
elf_hwcap static within cpufeature.c.
Signed-off-by: Andrew Murray <[email protected]>
Reviewed-by: Dave Martin <[email protected]>
Reviewed-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
As we will exhaust the first 32 bits of AT_HWCAP let's start
exposing AT_HWCAP2 to userspace to give us up to 64 caps.
Whilst it's possible to use the remaining 32 bits of AT_HWCAP, we
prefer to expand into AT_HWCAP2 in order to provide a consistent
view to userspace between ILP32 and LP64. However internal to the
kernel we prefer to continue to use the full space of elf_hwcap.
To reduce complexity and allow for future expansion, we now
represent hwcaps in the kernel as ordinals and use a
KERNEL_HWCAP_ prefix. This allows us to support automatic feature
based module loading for all our hwcaps.
We introduce cpu_set_feature to set hwcaps which complements the
existing cpu_have_feature helper. These helpers allow us to clean
up existing direct uses of elf_hwcap and reduce any future effort
required to move beyond 64 caps.
For convenience we also introduce cpu_{have,set}_named_feature which
makes use of the cpu_feature macro to allow providing a hwcap name
without a {KERNEL_}HWCAP_ prefix.
Signed-off-by: Andrew Murray <[email protected]>
[will: use const_ilog2() and tweak documentation]
Signed-off-by: Will Deacon <[email protected]>
|
|
Terminating the last trace entry with ULONG_MAX is a completely pointless
exercise and none of the consumers can rely on it because it's
inconsistently implemented across architectures. In fact quite some of the
callers remove the entry and adjust stack_trace.nr_entries afterwards.
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
|
|
We use $(LD) to link vmlinux, modules, decompressors, etc.
VDSO is the only exceptional case where $(CC) is used as the linker
driver, but I do not know why we need to do so. VDSO uses a special
linker script, and does not link standard libraries at all.
I changed the Makefile to use $(LD) rather than $(CC). I tested this,
and VDSO worked for me.
Users will be able to use their favorite linker (e.g. lld instead of
of bfd) by passing LD= from the command line.
My plan is to rewrite all VDSO Makefiles to use $(LD), then delete
cc-ldoption.
Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
The functions armv8pmu_read_counter() and armv8pmu_write_counter()
are `static inline` while they are only referenced when assigned
to a function pointer field in a `struct arm_pmu` instance.
The inline keyword is thus counter intuitive and shouldn't be used.
Acked-by: Mark Rutland <[email protected]>
Signed-off-by: Raphael Gault <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
Some firmwares may reboot CPUs with OS Double Lock set. Make sure that
it is unlocked, in order to use debug exceptions.
Cc: <[email protected]>
Signed-off-by: Jean-Philippe Brucker <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
brk_handler() now looks pretty strange and can be refactored to drop its
funny 'handler_found' local variable altogether.
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
kprobes and uprobes reserve some BRK immediates for installing their
probes. Define these along with the other reservations in brk-imm.h
and rename the ESR definitions to be consistent with the others that we
already have.
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
Now that the debug hook dispatching code takes the triggering exception
level into account, there's no need for the hooks themselves to poke
around with user_mode(regs).
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
Kprobes bypasses our debug hook registration code so that it doesn't
get tangled up with recursive debug exceptions from things like lockdep:
http://lists.infradead.org/pipermail/linux-arm-kernel/2015-February/324385.html
However, since then, (a) the hook list has become RCU protected and (b)
the kprobes hooks were found not to filter out exceptions from userspace
correctly. On top of that, the step handler is invoked directly from
single_step_handler(), which *does* use the debug hook list, so it's
clearly not the end of the world.
For now, have kprobes use the debug hook registration API like everybody
else. We can revisit this in the future if this is found to limit
coverage significantly.
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
Mixing kernel and user debug hooks together is highly error-prone as it
relies on all of the hooks to figure out whether the exception came from
kernel or user, and then to act accordingly.
Make our debug hook code a little more robust by maintaining separate
hook lists for user and kernel, with separate registration functions
to force callers to be explicit about the exception levels that they
care about.
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
The comment next to the definition of our 'break_hook' list head is
at best wrong but mainly just meaningless. Rip it out.
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
Since the 'addr' parameter contains an UNKNOWN value for non-watchpoint
debug exceptions, rename it to 'unused' for those hooks so we don't get
tempted to use it in the future.
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
In preparation for arm64 supporting ftrace built on other compiler
options, let's have the arm64 Makefiles remove the $(CC_FLAGS_FTRACE)
flags, whatever these may be, rather than assuming '-pg'.
There should be no functional change as a result of this patch.
Reviewed-by: Mark Rutland <[email protected]>
Signed-off-by: Torsten Duwe <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
Calling dump_backtrace() with a pt_regs argument corresponding to
userspace doesn't make any sense and our unwinder will simply print
"Call trace:" before unwinding the stack looking for user frames.
Rather than go through this song and dance, just return early if we're
passed a user register state.
Cc: <[email protected]>
Fixes: 1149aad10b1e ("arm64: Add dump_backtrace() in show_regs")
Reported-by: Kefeng Wang <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
The ftrace trampoline code (which deals with modules loaded out of
BL range of the core kernel) uses plt_entries_equal() to check whether
the per-module trampoline equals a zero buffer, to decide whether the
trampoline has already been initialized.
This triggers a BUG() in the opcode manipulation code, since we end
up checking the ADRP offset of a 0x0 opcode, which is not an ADRP
instruction.
So instead, add a helper to check whether a PLT is initialized, and
call that from the frace code.
Cc: <[email protected]> # v5.0
Fixes: bdb85cd1d206 ("arm64/module: switch to ADRP/ADD sequences for PLT entries")
Acked-by: Mark Rutland <[email protected]>
Signed-off-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
Following assembly code is not trivial; make it slightly easier to read by
replacing some of the magic numbers with the defines which are already
present in sysreg.h.
Reviewed-by: Dave Martin <[email protected]>
Signed-off-by: Alexandru Elisei <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
Parsing entries in an ACPI table had assumed a generic header
structure. There is no standard ACPI header, though, so less common
layouts with different field sizes required custom parsers to go through
their subtable entry list.
Create the infrastructure for adding different table types so parsing
the entries array may be more reused for all ACPI system tables and
the common code doesn't need to be duplicated.
Reviewed-by: Rafael J. Wysocki <[email protected]>
Acked-by: Jonathan Cameron <[email protected]>
Tested-by: Jonathan Cameron <[email protected]>
Signed-off-by: Keith Busch <[email protected]>
Tested-by: Brice Goglin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
When doing unwind_frame() in the context of pseudo nmi (need enable
CONFIG_ARM64_PSEUDO_NMI), reaching the bottom of the stack (fp == 0,
pc != 0), function on_sdei_stack() will return true while the sdei acpi
table is not inited in fact. This will cause a "NULL pointer dereference"
oops when going on.
Reviewed-by: Julien Thierry <[email protected]>
Signed-off-by: Wei Li <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
|
|
- $(call if_changed,...) must have FORCE as a prerequisite
- vdso.lds is a generated file, so it should be prefixed with
$(obj)/ instead of $(src)/.
- cmd_vdsosym is a one-liner rule, so the assignment with '='
is simpler.
Signed-off-by: Masahiro Yamada <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
The call to of_get_next_child returns a node pointer with refcount
incremented thus it must be explicitly decremented after the last
usage.
Detected by coccinelle with the following warnings:
./arch/arm64/kernel/cpu_ops.c:102:1-7: ERROR: missing of_node_put;
acquired a node pointer with refcount incremented on line 69, but
without a corresponding object release within this function.
Signed-off-by: Wen Yang <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
Since commit ad67b74d2469d9b8 ("printk: hash addresses printed with %p"),
two obfuscated kernel pointer are printed at every boot:
vdso: 2 pages (1 code @ (____ptrval____), 1 data @ (____ptrval____))
Remove the the print completely, as it's useless without the addresses.
Fixes: ad67b74d2469d9b8 ("printk: hash addresses printed with %p")
Acked-by: Mark Rutland <[email protected]>
Signed-off-by: Matteo Croce <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
|
|
KVM will need to interrogate the set of SVE vector lengths
available on the system.
This patch exposes the relevant bits to the kernel, along with a
sve_vq_available() helper to check whether a particular vector
length is supported.
__vq_to_bit() and __bit_to_vq() are not intended for use outside
these functions: now that these are exposed outside fpsimd.c, they
are prefixed with __ in order to provide an extra hint that they
are not intended for general-purpose use.
Signed-off-by: Dave Martin <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Tested-by: zhang.lei <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
|
|
The current FPSIMD/SVE context handling support for non-task (i.e.,
KVM vcpu) contexts does not take SVE into account. This means that
only task contexts can safely use SVE at present.
In preparation for enabling KVM guests to use SVE, it is necessary
to keep track of SVE state for non-task contexts too.
This patch adds the necessary support, removing assumptions from
the context switch code about the location of the SVE context
storage.
When binding a vcpu context, its vector length is arbitrarily
specified as SVE_VL_MIN for now. In any case, because TIF_SVE is
presently cleared at vcpu context bind time, the specified vector
length will not be used for anything yet. In later patches TIF_SVE
will be set here as appropriate, and the appropriate maximum vector
length for the vcpu will be passed when binding.
Signed-off-by: Dave Martin <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Reviewed-by: Julien Grall <[email protected]>
Tested-by: zhang.lei <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
|
|
Due to the way the effective SVE vector length is controlled and
trapped at different exception levels, certain mismatches in the
sets of vector lengths supported by different physical CPUs in the
system may prevent straightforward virtualisation of SVE at parity
with the host.
This patch analyses the extent to which SVE can be virtualised
safely without interfering with migration of vcpus between physical
CPUs, and rejects late secondary CPUs that would erode the
situation further.
It is left up to KVM to decide what to do with this information.
Signed-off-by: Dave Martin <[email protected]>
Reviewed-by: Julien Thierry <[email protected]>
Tested-by: zhang.lei <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
|
|
The roles of sve_init_vq_map(), sve_update_vq_map() and
sve_verify_vq_map() are highly non-obvious to anyone who has not dug
through cpufeatures.c in detail.
Since the way these functions interact with each other is more
important here than a full understanding of the cpufeatures code, this
patch adds comments to make the functions' roles clearer.
No functional change.
Signed-off-by: Dave Martin <[email protected]>
Reviewed-by: Julien Thierry <[email protected]>
Reviewed-by: Julien Grall <[email protected]>
Tested-by: zhang.lei <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
|
|
This patch updates fpsimd_flush_task_state() to mirror the new
semantics of fpsimd_flush_cpu_state() introduced by commit
d8ad71fa38a9 ("arm64: fpsimd: Fix TIF_FOREIGN_FPSTATE after
invalidating cpu regs"). Both functions now implicitly set
TIF_FOREIGN_FPSTATE to indicate that the task's FPSIMD state is not
loaded into the cpu.
As a side-effect, fpsimd_flush_task_state() now sets
TIF_FOREIGN_FPSTATE even for non-running tasks. In the case of
non-running tasks this is not useful but also harmless, because the
flag is live only while the corresponding task is running. This
function is not called from fast paths, so special-casing this for
the task == current case is not really worth it.
Compiler barriers previously present in restore_sve_fpsimd_context()
are pulled into fpsimd_flush_task_state() so that it can be safely
called with preemption enabled if necessary.
Explicit calls to set TIF_FOREIGN_FPSTATE that accompany
fpsimd_flush_task_state() calls and are now redundant are removed
as appropriate.
fpsimd_flush_task_state() is used to get exclusive access to the
representation of the task's state via task_struct, for the purpose
of replacing the state. Thus, the call to this function should
happen before manipulating fpsimd_state or sve_state etc. in
task_struct. Anomalous cases are reordered appropriately in order
to make the code more consistent, although there should be no
functional difference since these cases are protected by
local_bh_disable() anyway.
Signed-off-by: Dave Martin <[email protected]>
Reviewed-by: Alex Bennée <[email protected]>
Reviewed-by: Julien Grall <[email protected]>
Tested-by: zhang.lei <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
|