aboutsummaryrefslogtreecommitdiff
path: root/arch/x86/kernel/fpu
AgeCommit message (Collapse)AuthorFilesLines
2021-06-23x86/fpu: Use pkru_write_default() in copy_init_fpstate_to_fpregs()Thomas Gleixner1-2/+1
There is no point in using copy_init_pkru_to_fpregs() which in turn calls write_pkru(). write_pkru() tries to fiddle with the task's xstate buffer for nothing because the XRSTOR[S](init_fpstate) just cleared the xfeature flag in the xstate header which makes get_xsave_addr() fail. It's a useless exercise anyway because the reinitialization activates the FPU so before the task's xstate buffer can be used again a XRSTOR[S] must happen which in turn dumps the PKRU value. Get rid of the now unused copy_init_pkru_to_fpregs(). Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/cpu: Sanitize X86_FEATURE_OSPKEThomas Gleixner2-2/+2
X86_FEATURE_OSPKE is enabled first on the boot CPU and the feature flag is set. Secondary CPUs have to enable CR4.PKE as well and set their per CPU feature flag. That's ineffective because all call sites have checks for boot_cpu_data. Make it smarter and force the feature flag when PKU is enabled on the boot cpu which allows then to use cpu_feature_enabled(X86_FEATURE_OSPKE) all over the place. That either compiles the code out when PKEY support is disabled in Kconfig or uses a static_cpu_has() for the feature check which makes a significant difference in hotpaths, e.g. context switch. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Rename and sanitize fpu__save/copy()Thomas Gleixner2-10/+9
Both function names are a misnomer. fpu__save() is actually about synchronizing the hardware register state into the task's memory state so that either coredump or a math exception handler can inspect the state at the time where the problem happens. The function guarantees to preserve the register state, while "save" is a common terminology for saving the current state so it can be modified and restored later. This is clearly not the case here. Rename it to fpu_sync_fpstate(). fpu__copy() is used to clone the current task's FPU state when duplicating task_struct. While the register state is a copy the rest of the FPU state is not. Name it accordingly and remove the really pointless @src argument along with the warning which comes along with it. Nothing can ever copy the FPU state of a non-current task. It's clearly just a consequence of arch_dup_task_struct(), but it makes no sense to proliferate that further. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu/xstate: Sanitize handling of independent featuresThomas Gleixner1-50/+48
The copy functions for the independent features are horribly named and the supervisor and independent part is just overengineered. The point is that the supplied mask has either to be a subset of the independent features or a subset of the task->fpu.xstate managed features. Rewrite it so it checks for invalid overlaps of these areas in the caller supplied feature mask. Rename it so it follows the new naming convention for these operations. Mop up the function documentation. This allows to use that function for other purposes as well. Suggested-by: Peter Zijlstra <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Tested-by: Kan Liang <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Rename "dynamic" XSTATEs to "independent"Andy Lutomirski1-31/+31
The salient feature of "dynamic" XSTATEs is that they are not part of the main task XSTATE buffer. The fact that they are dynamically allocated is irrelevant and will become quite confusing when user math XSTATEs start being dynamically allocated. Rename them to "independent" because they are independent of the main XSTATE code. This is just a search-and-replace with some whitespace updates to keep things aligned. Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lore.kernel.org/r/1eecb0e4f3e07828ebe5d737ec77dc3b708fad2d.1623388344.git.luto@kernel.org Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Rename initstate copy functionsThomas Gleixner1-3/+3
Again this not a copy. It's restoring register state from kernel memory. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Get rid of the FNSAVE optimizationThomas Gleixner1-31/+23
The FNSAVE support requires conditionals in quite some call paths because FNSAVE reinitializes the FPU hardware. If the save has to preserve the FPU register state then the caller has to conditionally restore it from memory when FNSAVE is in use. This also requires a conditional in context switch because the restore avoidance optimization cannot work with FNSAVE. As this only affects 20+ years old CPUs there is really no reason to keep this optimization effective for FNSAVE. It's about time to not optimize for antiques anymore. Just unconditionally FRSTOR the save content to the registers and clean up the conditionals all over the place. Suggested-by: Dave Hansen <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Rename copy_fpregs_to_fpstate() to save_fpregs_to_fpstate()Thomas Gleixner1-5/+5
A copy is guaranteed to leave the source intact, which is not the case when FNSAVE is used as that reinitilizes the registers. Save does not make such guarantees and it matches what this is about, i.e. to save the state for a later restore. Rename it to save_fpregs_to_fpstate(). Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Deduplicate copy_uabi_from_user/kernel_to_xstate()Thomas Gleixner1-90/+47
copy_uabi_from_user_to_xstate() and copy_uabi_from_kernel_to_xstate() are almost identical except for the copy function. Unify them. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Rename xstate copy functions which are related to UABIThomas Gleixner3-4/+5
Rename them to reflect that these functions deal with user space format XSAVE buffers. copy_kernel_to_xstate() -> copy_uabi_from_kernel_to_xstate() copy_user_to_xstate() -> copy_sigframe_from_user_to_xstate() Again a clear statement that these functions deal with user space ABI. Suggested-by: Andy Lutomirski <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Rename fregs-related copy functionsThomas Gleixner2-4/+4
The function names for fnsave/fnrstor operations are horribly named and a permanent source of confusion. Rename: copy_kernel_to_fregs() to frstor() copy_fregs_to_user() to fnsave_to_user_sigframe() copy_user_to_fregs() to frstor_from_user_sigframe() so it's clear what these are doing. All these functions are really low level wrappers around the equally named instructions, so mapping to the documentation is just natural. No functional change. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Rename fxregs-related copy functionsThomas Gleixner2-8/+8
The function names for fxsave/fxrstor operations are horribly named and a permanent source of confusion. Rename: copy_fxregs_to_kernel() to fxsave() copy_kernel_to_fxregs() to fxrstor() copy_fxregs_to_user() to fxsave_to_user_sigframe() copy_user_to_fxregs() to fxrstor_from_user_sigframe() so it's clear what these are doing. All these functions are really low level wrappers around the equally named instructions, so mapping to the documentation is just natural. While at it, replace the static_cpu_has(X86_FEATURE_FXSR) with use_fxsr() to be consistent with the rest of the code. No functional change. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Rename copy_user_to_xregs() and copy_xregs_to_user()Thomas Gleixner1-2/+2
The function names for xsave[s]/xrstor[s] operations are horribly named and a permanent source of confusion. Rename: copy_xregs_to_user() to xsave_to_user_sigframe() copy_user_to_xregs() to xrstor_from_user_sigframe() so it's entirely clear what this is about. This is also a clear indicator of the potentially different storage format because this is user ABI and cannot use compacted format. No functional change. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Rename copy_xregs_to_kernel() and copy_kernel_to_xregs()Thomas Gleixner3-15/+15
The function names for xsave[s]/xrstor[s] operations are horribly named and a permanent source of confusion. Rename: copy_xregs_to_kernel() to os_xsave() copy_kernel_to_xregs() to os_xrstor() These are truly low level wrappers around the actual instructions XSAVE[OPT]/XRSTOR and XSAVES/XRSTORS with the twist that the selection based on the available CPU features happens with an alternative to avoid conditionals all over the place and to provide the best performance for hot paths. The os_ prefix tells that this is the OS selected mechanism. No functional change. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Get rid of copy_supervisor_to_kernel()Thomas Gleixner2-60/+8
If the fast path of restoring the FPU state on sigreturn fails or is not taken and the current task's FPU is active then the FPU has to be deactivated for the slow path to allow a safe update of the tasks FPU memory state. With supervisor states enabled, this requires to save the supervisor state in the memory state first. Supervisor states require XSAVES so saving only the supervisor state requires to reshuffle the memory buffer because XSAVES uses the compacted format and therefore stores the supervisor states at the beginning of the memory state. That's just an overengineered optimization. Get rid of it and save the full state for this case. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Andy Lutomirski <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Cleanup arch_set_user_pkey_access()Thomas Gleixner1-5/+6
The function does a sanity check with a WARN_ON_ONCE() but happily proceeds when the pkey argument is out of range. Clean it up. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Get rid of using_compacted_format()Thomas Gleixner1-18/+4
This function is pointlessly global and a complete misnomer because it's usage is related to both supervisor state checks and compacted format checks. Remove it and just make the conditions check the XSAVES feature. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Move fpu__write_begin() to regsetThomas Gleixner2-27/+22
The only usecase for fpu__write_begin is the set() callback of regset, so the function is pointlessly global. Move it to the regset code and rename it to fpu_force_restore() which is exactly decribing what the function does. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu/regset: Move fpu__read_begin() into regsetThomas Gleixner2-23/+19
The function can only be used from the regset get() callbacks safely. So there is no reason to have it globally exposed. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Remove fpstate_sanitize_xstate()Thomas Gleixner1-79/+0
No more users. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Use copy_xstate_to_uabi_buf() in fpregs_get()Thomas Gleixner1-10/+20
Use the new functionality of copy_xstate_to_uabi_buf() to retrieve the FX state when XSAVE* is in use. This avoids to overwrite the FPU state buffer with fpstate_sanitize_xstate() which is error prone and duplicated code. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Use copy_xstate_to_uabi_buf() in xfpregs_get()Thomas Gleixner1-3/+8
Use the new functionality of copy_xstate_to_uabi_buf() to retrieve the FX state when XSAVE* is in use. This avoids overwriting the FPU state buffer with fpstate_sanitize_xstate() which is error prone and duplicated code. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Make copy_xstate_to_kernel() usable for [x]fpregs_get()Thomas Gleixner2-12/+32
When xsave with init state optimization is used then a component's state in the task's xsave buffer can be stale when the corresponding feature bit is not set. fpregs_get() and xfpregs_get() invoke fpstate_sanitize_xstate() to update the task's xsave buffer before retrieving the FX or FP state. That's just duplicated code as copy_xstate_to_kernel() already handles this correctly. Add a copy mode argument to the function which allows to restrict the state copy to the FP and SSE features. Also rename the function to copy_xstate_to_uabi_buf() so the name reflects what it is doing. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Clean up fpregs_set()Andy Lutomirski1-15/+16
fpregs_set() has unnecessary complexity to support short or nonzero-offset writes and to handle the case in which a copy from userspace overwrites some of the target buffer and then fails. Support for partial writes is useless -- just require that the write has offset 0 and the correct size, and copy into a temporary kernel buffer to avoid clobbering the state if the user access fails. Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Fail ptrace() requests that try to set invalid MXCSR valuesAndy Lutomirski1-2/+3
There is no benefit from accepting and silently changing an invalid MXCSR value supplied via ptrace(). Instead, return -EINVAL on invalid input. Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Rewrite xfpregs_set()Andy Lutomirski1-14/+23
xfpregs_set() was incomprehensible. Almost all of the complexity was due to trying to support nonsensically sized writes or -EFAULT errors that would have partially or completely overwritten the destination before failing. Nonsensically sized input would only have been possible using PTRACE_SETREGSET on REGSET_XFP. Fortunately, it appears (based on Debian code search results) that no one uses that API at all, let alone with the wrong sized buffer. Failed user access can be handled more cleanly by first copying to kernel memory. Just rewrite it to require sensible input. Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Simplify PTRACE_GETREGS codeDave Hansen2-24/+6
ptrace() has interfaces that let a ptracer inspect a ptracee's register state. This includes XSAVE state. The ptrace() ABI includes a hardware-format XSAVE buffer for both the SETREGS and GETREGS interfaces. In the old days, the kernel buffer and the ptrace() ABI buffer were the same boring non-compacted format. But, since the advent of supervisor states and the compacted format, the kernel buffer has diverged from the format presented in the ABI. This leads to two paths in the kernel: 1. Effectively a verbatim copy_to_user() which just copies the kernel buffer out to userspace. This is used when the kernel buffer is kept in the non-compacted form which means that it shares a format with the ptrace ABI. 2. A one-state-at-a-time path: copy_xstate_to_kernel(). This is theoretically slower since it does a bunch of piecemeal copies. Remove the verbatim copy case. Speed probably does not matter in this path, and the vast majority of new hardware will use the one-state-at-a-time path anyway. This ensures greater testing for the "slow" path. This also makes enabling PKRU in this interface easier since a single path can be patched instead of two. Signed-off-by: Dave Hansen <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Andy Lutomirski <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Reject invalid MXCSR values in copy_kernel_to_xstate()Thomas Gleixner1-3/+16
Instead of masking out reserved bits, check them and reject the provided state as invalid if not zero. Suggested-by: Andy Lutomirski <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Sanitize xstateregs_set()Thomas Gleixner2-31/+25
xstateregs_set() operates on a stopped task and tries to copy the provided buffer into the task's fpu.state.xsave buffer. Any error while copying or invalid state detected after copying results in wiping the target task's FPU state completely including supervisor states. That's just wrong. The caller supplied invalid data or has a problem with unmapped memory, so there is absolutely no justification to corrupt the target state. Fix this with the following modifications: 1) If data has to be copied from userspace, allocate a buffer and copy from user first. 2) Use copy_kernel_to_xstate() unconditionally so that header checking works correctly. 3) Return on error without corrupting the target state. This prevents corrupting states and lets the caller deal with the problem it caused in the first place. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Limit xstate copy size in xstateregs_set()Thomas Gleixner1-1/+1
If the count argument is larger than the xstate size, this will happily copy beyond the end of xstate. Fixes: 91c3dba7dbc1 ("x86/fpu/xstate: Fix PTRACE frames for XSAVES") Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Andy Lutomirski <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Move inlines where they belongThomas Gleixner1-0/+15
They are only used in fpstate_init() and there is no point to have them in a header just to make reading the code harder. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Remove unused get_xsave_field_ptr()Thomas Gleixner1-30/+0
Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Andy Lutomirski <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Get rid of fpu__get_supported_xfeatures_mask()Thomas Gleixner2-12/+3
This function is really not doing what the comment advertises: "Find supported xfeatures based on cpu features and command-line input. This must be called after fpu__init_parse_early_param() is called and xfeatures_mask is enumerated." fpu__init_parse_early_param() does not exist anymore and the function just returns a constant. Remove it and fix the caller and get rid of further references to fpu__init_parse_early_param(). Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Make xfeatures_mask_all __ro_after_initThomas Gleixner1-13/+15
Nothing has to modify this after init. But of course there is code which unconditionally masks xfeatures_mask_all on CPU hotplug. This goes unnoticed during boot hotplug because at that point the variable is still RW mapped. This is broken in several ways: 1) Masking this in post init CPU hotplug means that any modification of this state goes unnoticed until actual hotplug happens. 2) If that ever happens then these bogus feature bits are already populated all over the place and the system is in inconsistent state vs. the compacted XSTATE offsets. If at all then this has to panic the machine because the inconsistency cannot be undone anymore. Make this a one-time paranoia check in xstate init code and disable xsave when this happens. Reported-by: Kan Liang <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Mark various FPU state variables __ro_after_initThomas Gleixner2-7/+11
Nothing modifies these after booting. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Reviewed-by: Andy Lutomirski <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23x86/fpu: Fix copy_xstate_to_kernel() gap handlingThomas Gleixner1-44/+61
The gap handling in copy_xstate_to_kernel() is wrong when XSAVES is in use. Using init_fpstate for copying the init state of features which are not set in the xstate header is only correct for the legacy area, but not for the extended features area because when XSAVES is in use then init_fpstate is in compacted form which means the xstate offsets which are used to copy from init_fpstate are not valid. Fortunately, this is not a real problem today because all extended features in use have an all-zeros init state, but it is wrong nevertheless and with a potentially dynamically sized init_fpstate this would result in an access outside of the init_fpstate. Fix this by keeping track of the last copied state in the target buffer and explicitly zero it when there is a feature or alignment gap. Use the compacted offset when accessing the extended feature space in init_fpstate. As this is not a functional issue on older kernels this is intentionally not tagged for stable. Fixes: b8be15d58806 ("x86/fpu/xstate: Re-enable XSAVES") Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-23Merge x86/urgent into x86/fpuBorislav Petkov2-97/+81
Pick up dependent changes which either went mainline (x86/urgent is based on -rc7 and that contains them) as urgent fixes and the current x86/urgent branch which contains two more urgent fixes, so that the bigger FPU rework can base off ontop. Signed-off-by: Borislav Petkov <[email protected]>
2021-06-22x86/fpu: Make init_fpstate correct with optimized XSAVEThomas Gleixner1-3/+38
The XSAVE init code initializes all enabled and supported components with XRSTOR(S) to init state. Then it XSAVEs the state of the components back into init_fpstate which is used in several places to fill in the init state of components. This works correctly with XSAVE, but not with XSAVEOPT and XSAVES because those use the init optimization and skip writing state of components which are in init state. So init_fpstate.xsave still contains all zeroes after this operation. There are two ways to solve that: 1) Use XSAVE unconditionally, but that requires to reshuffle the buffer when XSAVES is enabled because XSAVES uses compacted format. 2) Save the components which are known to have a non-zero init state by other means. Looking deeper, #2 is the right thing to do because all components the kernel supports have all-zeroes init state except the legacy features (FP, SSE). Those cannot be hard coded because the states are not identical on all CPUs, but they can be saved with FXSAVE which avoids all conditionals. Use FXSAVE to save the legacy FP/SSE components in init_fpstate along with a BUILD_BUG_ON() which reminds developers to validate that a newly added component has all zeroes init state. As a bonus remove the now unused copy_xregs_to_kernel_booting() crutch. The XSAVE and reshuffle method can still be implemented in the unlikely case that components are added which have a non-zero init state and no other means to save them. For now, FXSAVE is just simple and good enough. [ bp: Fix a typo or two in the text. ] Fixes: 6bad06b76892 ("x86, xsave: Use xsaveopt in context-switch path when supported") Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2021-06-22x86/fpu: Preserve supervisor states in sanitize_restored_user_xstate()Thomas Gleixner1-18/+8
sanitize_restored_user_xstate() preserves the supervisor states only when the fx_only argument is zero, which allows unprivileged user space to put supervisor states back into init state. Preserve them unconditionally. [ bp: Fix a typo or two in the text. ] Fixes: 5d6b6a6f9b5c ("x86/fpu/xstate: Update sanitize_restored_xstate() for supervisor xstates") Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2021-06-10x86/fpu: Reset state for all signal restore failuresThomas Gleixner1-11/+15
If access_ok() or fpregs_soft_set() fails in __fpu__restore_sig() then the function just returns but does not clear the FPU state as it does for all other fatal failures. Clear the FPU state for these failures as well. Fixes: 72a671ced66d ("x86, fpu: Unify signal handling code paths for x86 and x86_64 kernels") Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2021-06-09x86/fpu: Add address range checks to copy_user_to_xstate()Andy Lutomirski1-3/+3
copy_user_to_xstate() uses __copy_from_user(), which provides a negligible speedup. Fortunately, both call sites are at least almost correct. __fpu__restore_sig() checks access_ok() with xstate_sigframe_size() length and ptrace regset access uses fpu_user_xstate_size. These should be valid upper bounds on the length, so, at worst, this would cause spurious failures and not accesses to kernel memory. Nonetheless, this is far more fragile than necessary and none of these callers are in a hotpath. Use copy_from_user() instead. Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Acked-by: Dave Hansen <[email protected]> Acked-by: Rik van Riel <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-06-09x86/fpu: Invalidate FPU state after a failed XRSTOR from a user bufferAndy Lutomirski1-0/+19
Both Intel and AMD consider it to be architecturally valid for XRSTOR to fail with #PF but nonetheless change the register state. The actual conditions under which this might occur are unclear [1], but it seems plausible that this might be triggered if one sibling thread unmaps a page and invalidates the shared TLB while another sibling thread is executing XRSTOR on the page in question. __fpu__restore_sig() can execute XRSTOR while the hardware registers are preserved on behalf of a different victim task (using the fpu_fpregs_owner_ctx mechanism), and, in theory, XRSTOR could fail but modify the registers. If this happens, then there is a window in which __fpu__restore_sig() could schedule out and the victim task could schedule back in without reloading its own FPU registers. This would result in part of the FPU state that __fpu__restore_sig() was attempting to load leaking into the victim task's user-visible state. Invalidate preserved FPU registers on XRSTOR failure to prevent this situation from corrupting any state. [1] Frequent readers of the errata lists might imagine "complex microarchitectural conditions". Fixes: 1d731e731c4c ("x86/fpu: Add a fastpath to __fpu__restore_sig()") Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Acked-by: Dave Hansen <[email protected]> Acked-by: Rik van Riel <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2021-06-09x86/fpu: Prevent state corruption in __fpu__restore_sig()Thomas Gleixner1-8/+1
The non-compacted slowpath uses __copy_from_user() and copies the entire user buffer into the kernel buffer, verbatim. This means that the kernel buffer may now contain entirely invalid state on which XRSTOR will #GP. validate_user_xstate_header() can detect some of that corruption, but that leaves the onus on callers to clear the buffer. Prior to XSAVES support, it was possible just to reinitialize the buffer, completely, but with supervisor states that is not longer possible as the buffer clearing code split got it backwards. Fixing that is possible but not corrupting the state in the first place is more robust. Avoid corruption of the kernel XSAVE buffer by using copy_user_to_xstate() which validates the XSAVE header contents before copying the actual states to the kernel. copy_user_to_xstate() was previously only called for compacted-format kernel buffers, but it works for both compacted and non-compacted forms. Using it for the non-compacted form is slower because of multiple __copy_from_user() operations, but that cost is less important than robust code in an already slow path. [ Changelog polished by Dave Hansen ] Fixes: b860eb8dce59 ("x86/fpu/xstate: Define new functions for clearing fpregs and xstates") Reported-by: [email protected] Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Borislav Petkov <[email protected]> Acked-by: Dave Hansen <[email protected]> Acked-by: Rik van Riel <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2021-06-03x86/cpufeatures: Force disable X86_FEATURE_ENQCMD and remove update_pasid()Thomas Gleixner1-57/+0
While digesting the XSAVE-related horrors which got introduced with the supervisor/user split, the recent addition of ENQCMD-related functionality got on the radar and turned out to be similarly broken. update_pasid(), which is only required when X86_FEATURE_ENQCMD is available, is invoked from two places: 1) From switch_to() for the incoming task 2) Via a SMP function call from the IOMMU/SMV code #1 is half-ways correct as it hacks around the brokenness of get_xsave_addr() by enforcing the state to be 'present', but all the conditionals in that code are completely pointless for that. Also the invocation is just useless overhead because at that point it's guaranteed that TIF_NEED_FPU_LOAD is set on the incoming task and all of this can be handled at return to user space. #2 is broken beyond repair. The comment in the code claims that it is safe to invoke this in an IPI, but that's just wishful thinking. FPU state of a running task is protected by fregs_lock() which is nothing else than a local_bh_disable(). As BH-disabled regions run usually with interrupts enabled the IPI can hit a code section which modifies FPU state and there is absolutely no guarantee that any of the assumptions which are made for the IPI case is true. Also the IPI is sent to all CPUs in mm_cpumask(mm), but the IPI is invoked with a NULL pointer argument, so it can hit a completely unrelated task and unconditionally force an update for nothing. Worse, it can hit a kernel thread which operates on a user space address space and set a random PASID for it. The offending commit does not cleanly revert, but it's sufficient to force disable X86_FEATURE_ENQCMD and to remove the broken update_pasid() code to make this dysfunctional all over the place. Anything more complex would require more surgery and none of the related functions outside of the x86 core code are blatantly wrong, so removing those would be overkill. As nothing enables the PASID bit in the IA32_XSS MSR yet, which is required to make this actually work, this cannot result in a regression except for related out of tree train-wrecks, but they are broken already today. Fixes: 20f0afd1fb3d ("x86/mmu: Allocate/free a PASID") Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2021-05-19x86/signal: Introduce helpers to get the maximum signal frame sizeChang S. Bae1-0/+19
Signal frames do not have a fixed format and can vary in size when a number of things change: supported XSAVE features, 32 vs. 64-bit apps, etc. Add support for a runtime method for userspace to dynamically discover how large a signal stack needs to be. Introduce a new variable, max_frame_size, and helper functions for the calculation to be used in a new user interface. Set max_frame_size to a system-wide worst-case value, instead of storing multiple app-specific values. Signed-off-by: Chang S. Bae <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Len Brown <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Acked-by: H.J. Lu <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-03-18x86: Fix various typos in commentsIngo Molnar1-1/+1
Fix ~144 single-word typos in arch/x86/ code comments. Doing this in a single commit should reduce the churn. Signed-off-by: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: [email protected]
2021-01-29x86/fpu/xstate: Use sizeof() instead of a constantYejune Deng1-2/+2
Use sizeof() instead of a constant in fpstate_sanitize_xstate(). Remove use of the address of the 0th array element of ->st_space and ->xmm_space which is equivalent to the array address itself: No code changed: # arch/x86/kernel/fpu/xstate.o: text data bss dec hex filename 9694 899 4 10597 2965 xstate.o.before 9694 899 4 10597 2965 xstate.o.after md5: 5a43fc70bad8e2a1784f67f01b71aabb xstate.o.before.asm 5a43fc70bad8e2a1784f67f01b71aabb xstate.o.after.asm [ bp: Massage commit message. ] Signed-off-by: Yejune Deng <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-01-21x86/fpu: Add kernel_fpu_begin_mask() to selectively initialize stateAndy Lutomirski1-4/+5
Currently, requesting kernel FPU access doesn't distinguish which parts of the extended ("FPU") state are needed. This is nice for simplicity, but there are a few cases in which it's suboptimal: - The vast majority of in-kernel FPU users want XMM/YMM/ZMM state but do not use legacy 387 state. These users want MXCSR initialized but don't care about the FPU control word. Skipping FNINIT would save time. (Empirically, FNINIT is several times slower than LDMXCSR.) - Code that wants MMX doesn't want or need MXCSR initialized. _mmx_memcpy(), for example, can run before CR4.OSFXSR gets set, and initializing MXCSR will fail because LDMXCSR generates an #UD when the aforementioned CR4 bit is not set. - Any future in-kernel users of XFD (eXtended Feature Disable)-capable dynamic states will need special handling. Add a more specific API that allows callers to specify exactly what they want. Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Tested-by: Krzysztof Piotr Olędzki <[email protected]> Link: https://lkml.kernel.org/r/aff1cac8b8fc7ee900cf73e8f2369966621b053f.1611205691.git.luto@kernel.org
2020-10-12Merge tag 'x86_fpu_for_v5.10' of ↵Linus Torvalds1-41/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fpu updates from Borislav Petkov: - Allow clearcpuid= to accept multiple bits (Arvind Sankar) - Move clearcpuid= parameter handling earlier in the boot, away from the FPU init code and to a generic location (Mike Hommey) * tag 'x86_fpu_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu: Handle FPU-related and clearcpuid command line arguments earlier x86/fpu: Allow multiple bits in clearcpuid= parameter
2020-09-22x86/fpu: Handle FPU-related and clearcpuid command line arguments earlierMike Hommey1-55/+0
FPU initialization handles them currently. However, in the case of clearcpuid=, some other early initialization code may check for features before the FPU initialization code is called. Handling the argument earlier allows the command line to influence those early initializations. Signed-off-by: Mike Hommey <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]