path: root/arch/s390
Age | Commit message | Author | Files | Lines
2022-05-09 | s390/pgtable: cleanup description of swp pte layout | David Hildenbrand | 1 | -9/+8
Bit 52 and bit 55 don't have to be zero: they only trigger a translation-specification exception if the PTE is marked as valid, which is not the case for swap ptes. Document which bits are used for what, and which ones are unused. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: David Hildenbrand <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Don Dutile <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jan Kara <[email protected]> Cc: Jann Horn <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: John Hubbard <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Liang Zhang <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Oded Gabbay <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pedro Demarchi Gomes <[email protected]> Cc: Peter Xu <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-09 | s390/pai: add support for cryptography counters | Thomas Richter | 9 | -7/+784
The PMU device driver perf_pai_crypto supports Processor Activity Instrumentation (PAI), available with IBM z16. It:
- maps a full page to lowcore address 0x1500.
- uses CR0 bit 13 to turn PAI crypto counting on and off.
- creates a sample with raw data on each context switch out when, at context switch time, some mapped counters are nonzero.
This device driver only supports CPU wide context; no task context is allowed. Support for counting:
- one or more counters can be specified using perf stat -e pai_crypto/xxx/ where xxx stands for the counter event name. Multiple invocations of this command are possible. The counter names are listed in the /sys/devices/pai_crypto/events directory.
- one special counter can be specified using perf stat -e pai_crypto/CRYPTO_ALL/ which returns the sum of all incremented crypto counters.
- the event pai_crypto/CRYPTO_ALL/ is reserved for sampling. Multiple invocations are not possible. The event collects data at context switch out and saves it in the ring buffer.
Add the qpaci assembly instruction to query supported memory mapped crypto counters. It returns the number of counters (no holes allowed in that range). The PAI crypto counter events are system wide and cannot be executed in parallel. Therefore some restrictions documented in function paicrypt_busy apply. In particular, the event CRYPTO_ALL for sampling must run exclusively; only counting events can run in parallel. PAI crypto counter events cannot be created while a CPU hot plug add is processed. This means a CPU hot plug add does not get the necessary PAI event to record PAI cryptography counter increments on the newly added CPU. CPU hot plug remove removes the event and terminates the counting of PAI counters immediately. Co-developed-by: Sven Schnelle <[email protected]> Signed-off-by: Sven Schnelle <[email protected]> Reviewed-by: Juergen Christ <[email protected]> Signed-off-by: Thomas Richter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]>
2022-05-09 | entry: Rename arch_check_user_regs() to arch_enter_from_user_mode() | Sven Schnelle | 1 | -2/+2
When entering the kernel from userspace, arch_check_user_regs() is used to verify that struct pt_regs contains valid values. Note that the NMI codepath doesn't call this function. s390 needs a place in the generic entry code to modify a cpu data structure when switching from userspace to kernel mode. As arch_check_user_regs() is exactly this, rename it to arch_enter_from_user_mode(). Signed-off-by: Sven Schnelle <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Andy Lutomirski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]>
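For context, the generic entry code typically provides such hooks as overridable static inlines; the snippet below is a simplified sketch of that pattern (an assumption about its shape, not a copy of the actual header):

```c
#include <linux/compiler.h>
#include <linux/ptrace.h>

/*
 * Sketch: the architecture may override this hook in its asm/entry-common.h;
 * the default does nothing. After the rename, s390 can use it to update
 * per-CPU state on every entry from user mode, not just to check pt_regs.
 */
#ifndef arch_enter_from_user_mode
static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs) { }
#endif
```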
2022-05-07 | fork: Generalize PF_IO_WORKER handling | Eric W. Biederman | 1 | -4/+3
Add fn and fn_arg members into struct kernel_clone_args and test for them in copy_thread (instead of testing for PF_KTHREAD | PF_IO_WORKER). This allows any task that wants to be a user space task that only runs in kernel mode to use this functionality. The code on x86 is an exception and still retains a PF_KTHREAD test because x86, unlike everything else, handles kthreads slightly differently than user space tasks that start with a function. The functions that create tasks that start with a function have been updated to set ".fn" and ".fn_arg" instead of ".stack" and ".stack_size". These functions are fork_idle(), create_io_thread(), kernel_thread(), and user_mode_thread(). Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: "Eric W. Biederman" <[email protected]>
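A simplified sketch of the resulting copy_thread() pattern (illustrative only, not the actual s390 implementation):

```c
#include <linux/ptrace.h>
#include <linux/sched.h>
#include <linux/sched/task.h>
#include <linux/sched/task_stack.h>
#include <linux/string.h>

/* Sketch: branch on args->fn instead of PF_KTHREAD | PF_IO_WORKER. */
int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
{
	struct pt_regs *childregs = task_pt_regs(p);

	if (unlikely(args->fn)) {
		/* Kernel thread, or user task that starts in a kernel function. */
		memset(childregs, 0, sizeof(*childregs));
		/* arch-specific: arrange for args->fn(args->fn_arg) to be called */
		return 0;
	}

	/* Regular fork/clone of a user space task. */
	*childregs = *current_pt_regs();
	return 0;
}
```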
2022-05-07 | fork: Pass struct kernel_clone_args into copy_thread | Eric W. Biederman | 1 | -2/+5
With io_uring we have started supporting tasks that are for most purposes user space tasks that exclusively run code in kernel mode. The kernel task that exec's init and tasks that exec user mode helpers are also user mode tasks that just run kernel code until they call kernel execve. Pass kernel_clone_args into copy_thread so these oddball tasks can be supported more cleanly and easily. v2: Fix spelling of kenrel_clone_args on h8300 Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: "Eric W. Biederman" <[email protected]>
2022-05-06 | s390/compat: cleanup compat_linux.h header file | Heiko Carstens | 1 | -52/+28
Remove various declarations from former s390 specific compat system calls which have been removed with commit fef747bab3c0 ("s390: use generic UID16 implementation"). While at it clean up the whole small header file. Signed-off-by: Heiko Carstens <[email protected]>
2022-05-06 | s390/entry: remove broken and not needed code | Heiko Carstens | 1 | -4/+1
LLVM's integrated assembler reports the following error when compiling entry.S: <instantiation>:38:5: error: unknown token in expression tm %r8,0x0001 # coming from user space? The correct instruction would have been tmhh instead of tm. The current code is doing nothing, since (with gas) it gets translated to a tm instruction which reads from real address 8, which always contains zero, and therefore the conditional code is never executed. Note that due to the missing displacement gas translates "%r8" into "8(%r0)". Also code inspection reveals that this conditional code is not needed. Therefore remove it. Reviewed-by: Sven Schnelle <[email protected]> Reviewed-by: Alexander Gordeev <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-05-06 | s390/boot: convert parmarea to C | Heiko Carstens | 4 | -21/+12
Convert parmarea to C, which makes it much easier to initialize it. No need to keep offsets in assembler code in sync with struct parmarea anymore. Reviewed-by: Vasily Gorbik <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-05-06 | s390/boot: convert initial lowcore to C | Heiko Carstens | 5 | -56/+103
Convert initial lowcore to C and use proper defines and structures to initialize it. This should make the z/VM ipl procedure a bit less magic. Acked-by: Peter Oberparleiter <[email protected]> Reviewed-by: Vasily Gorbik <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-05-06 | s390/ptrace: move short psw definitions to ptrace header file | Heiko Carstens | 4 | -32/+32
The short psw definitions are contained in compat header files, however short psws are not compat specific. Therefore move the definitions to ptrace header file. This also gets rid of a compat header include in kvm code. Acked-by: Vasily Gorbik <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-05-06 | s390/head: initialize all new psws | Heiko Carstens | 1 | -7/+15
Initialize all new psws with disabled wait psws, except for the restart new psw. This way every unexpected exception, svc, machine check, or interrupt is handled properly. Reviewed-by: Vasily Gorbik <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-05-06 | s390/boot: change initial program check handler to disabled wait psw | Heiko Carstens | 1 | -1/+1
The program check handler of the kernel image points to startup_pgm_check_handler. However an early program check which happens while loading the kernel image will jump to potentially random code, since the code of the program check handler is not yet loaded; leading to a program check loop. Therefore initialize it to a disabled wait psw and let the startup code set the proper psw when everything is in memory. Reviewed-by: Vasily Gorbik <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-05-06 | s390/head: adjust iplstart entry point | Heiko Carstens | 1 | -5/+8
Move iplstart entry point to 0x200 again, instead of the middle of the ipl code. This way even the comment describing the ccw program is correct again. Acked-by: Vasily Gorbik <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-05-06 | s390/boot: get rid of startup archive | Heiko Carstens | 11 | -101/+74
The final kernel image is created by linking decompressor object files with a startup archive. The startup archive file however does not only contain optional code and data which can be discarded if not referenced. It also contains mandatory object data like head.o which must never be discarded, even if not referenced. Move the decompressor code and linker script to the boot directory and get rid of the startup archive so everything is kept during link time. Acked-by: Vasily Gorbik <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-05-06 | s390/vx: remove comments from macros which break LLVM's IAS | Heiko Carstens | 1 | -3/+3
LLVM's integrated assembler does not like comments within macros: <instantiation>:3:19: error: too many positional arguments GR_NUM b2, 1 /* Base register */ ^ Remove them, since they are obvious anyway. Signed-off-by: Heiko Carstens <[email protected]>
2022-05-06 | s390/extable: prefer local labels in .set directives | Heiko Carstens | 1 | -6/+6
Use local labels in .set directives to avoid potential compile errors with LTO + clang. See commit 334865b2915c ("x86/extable: Prefer local labels in .set directives") for further details. Since s390 doesn't support LTO currently this doesn't fix a real bug for now, but helps to avoid problems as soon as required pieces have been added to llvm. Signed-off-by: Heiko Carstens <[email protected]>
2022-05-06 | s390/nospec: prefer local labels in .set directives | Heiko Carstens | 1 | -6/+6
Use local labels in .set directives to avoid potential compile errors with LTO + clang. See commit 334865b2915c ("x86/extable: Prefer local labels in .set directives") for further details. Since s390 doesn't support LTO currently this doesn't fix a real bug for now, but helps to avoid problems as soon as required pieces have been added to llvm. Signed-off-by: Heiko Carstens <[email protected]>
2022-05-06 | s390/hypfs: fix typos in comments | Julia Lawall | 1 | -1/+1
Various spelling mistakes in comments. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]>
2022-05-06 | s390/crypto: fix typos in comments | Julia Lawall | 2 | -2/+2
Various spelling mistakes in comments. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]>
2022-05-03 | KVM: s390: vsie/gmap: reduce gmap_rmap overhead | Christian Borntraeger | 1 | -0/+7
There are cases that trigger a 2nd shadow event for the same vmaddr/raddr combination (prefix changes, reboots, some known races). This will increase memory usage and result in long latencies when cleaning up, e.g. on shutdown. To avoid cases with a list that has hundreds of identical raddrs we check existing entries at insert time. As this measurably reduces the list length, this will be faster than traversing the list at shutdown time. In the long run several places will be optimized to create fewer entries and a shrinker might be necessary. Fixes: 4be130a08420 ("s390/mm: add shadow gmap support") Signed-off-by: Christian Borntraeger <[email protected]> Acked-by: David Hildenbrand <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]>
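A hedged sketch of the insert-time check described above; the helper name and list handling are simplified and do not mirror the exact gmap code, which performs this check around the point where the rmap is inserted:

```c
#include <asm/gmap.h>

/*
 * Sketch: skip inserting a new rmap entry for (vmaddr, raddr) if an entry
 * with the same raddr is already chained at this vmaddr.
 */
static bool gmap_rmap_present(struct gmap_rmap *head, unsigned long raddr)
{
	struct gmap_rmap *rmap;

	for (rmap = head; rmap; rmap = rmap->next)
		if (rmap->raddr == raddr)
			return true;	/* duplicate, nothing to add */
	return false;
}
```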
2022-05-02 | KVM: s390: Fix lockdep issue in vm memop | Janis Schoetterl-Glausch | 1 | -1/+10
Issuing a memop on a protected vm does not make sense, neither is the memory readable/writable, nor does it make sense to check storage keys. This is why the ioctl will return -EINVAL when it detects the vm to be protected. However, in order to ensure that the vm cannot become protected during the memop, the kvm->lock would need to be taken for the duration of the ioctl. This is also required because kvm_s390_pv_is_protected asserts that the lock must be held. Instead, don't try to prevent this. If user space enables secure execution concurrently with a memop it must accept the possibility of the memop failing. Still check if the vm is currently protected, but without locking and consider it a heuristic. Fixes: ef11c9463ae0 ("KVM: s390: Add vm IOCTL for key checked guest absolute memory access") Signed-off-by: Janis Schoetterl-Glausch <[email protected]> Reviewed-by: Janosch Frank <[email protected]> Reviewed-by: Claudio Imbrenda <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Borntraeger <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-04-29 | vmcore: convert copy_oldmem_page() to take an iov_iter | Matthew Wilcox (Oracle) | 1 | -5/+8
Patch series "Convert vmcore to use an iov_iter", v5. For some reason several people have been sending bad patches to fix compiler warnings in vmcore recently. Here's how it should be done. Compile-tested only on x86. As noted in the first patch, s390 should take this conversion a bit further, but I'm not inclined to do that work myself. This patch (of 3): Instead of passing in a 'buf' and 'userbuf' argument, pass in an iov_iter. s390 needs more work to pass the iov_iter down further, or refactor, but I'd be more comfortable if someone who can test on s390 did that work. It's more convenient to convert the whole of read_from_oldmem() to take an iov_iter at the same time, so rename it to read_from_oldmem_iter() and add a temporary read_from_oldmem() wrapper that creates an iov_iter. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Baoquan He <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Cc: Heiko Carstens <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-04-27 | s390/irq: utilize RCU instead of irq_lock_sparse() in show_msi_interrupt() | Pingfan Liu | 1 | -2/+2
As demonstrated by commit 74bdf7815dfb ("genirq: Speedup show_interrupts()"), irq_desc can be accessed safely in an RCU read section. Hence resort to the RCU read lock here to get rid of irq_lock_sparse(). Signed-off-by: Pingfan Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]>
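Roughly, the resulting pattern looks like the sketch below (an illustration of the approach, not the actual show_msi_interrupt() code):

```c
#include <linux/irqdesc.h>
#include <linux/rcupdate.h>
#include <linux/seq_file.h>

/* Sketch: RCU read-side section instead of irq_lock_sparse() around the lookup. */
static void show_one_msi(struct seq_file *p, unsigned int irq)
{
	struct irq_desc *desc;

	rcu_read_lock();
	desc = irq_to_desc(irq);
	if (desc)
		seq_printf(p, "%u: %s\n", irq, desc->name ? desc->name : "-");
	rcu_read_unlock();
}
```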
2022-04-27 | s390: disable -Warray-bounds | Sven Schnelle | 1 | -0/+10
gcc-12 shows a lot of array bound warnings on s390. This is caused by the S390_lowcore macro which uses a hardcoded address of 0. Wrapping that with absolute_pointer() works, but gcc no longer knows that a 12 bit displacement is sufficient to access lowcore. So it emits instructions like 'lghi %r1,0; l %rx,xxx(%r1)' instead of a single load/store instruction. As s390 stores variables often read/written in lowcore, this is considered problematic. Therefore disable -Warray-bounds on s390 for gcc-12 for the time being, until there is a better solution. Signed-off-by: Sven Schnelle <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Heiko Carstens <[email protected]>
2022-04-26 | asm-generic: compat: Cleanup duplicate definitions | Guo Ren | 1 | -67/+12
There are 7 64bit architectures that support Linux COMPAT mode to run 32bit applications. A lot of definitions are duplicate:
- COMPAT_USER_HZ
- COMPAT_RLIM_INFINITY
- COMPAT_OFF_T_MAX
- __compat_uid_t, __compat_uid_t
- compat_dev_t
- compat_ipc_pid_t
- struct compat_flock
- struct compat_flock64
- struct compat_statfs
- struct compat_ipc64_perm, compat_semid64_ds, compat_msqid64_ds, compat_shmid64_ds
Cleanup duplicate definitions and merge them into asm-generic. Signed-off-by: Guo Ren <[email protected]> Signed-off-by: Guo Ren <[email protected]> Reviewed-by: Arnd Bergmann <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Tested-by: Heiko Stuebner <[email protected]> Acked-by: Helge Deller <[email protected]> # parisc Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2022-04-26 | fs: stat: compat: Add __ARCH_WANT_COMPAT_STAT | Guo Ren | 1 | -0/+1
RISC-V doesn't need compat_stat, so use __ARCH_WANT_COMPAT_STAT to exclude the unnecessary SYSCALL functions. Signed-off-by: Guo Ren <[email protected]> Signed-off-by: Guo Ren <[email protected]> Reviewed-by: Arnd Bergmann <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Tested-by: Heiko Stuebner <[email protected]> Acked-by: Helge Deller <[email protected]> # parisc Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2022-04-26 | arch: Add SYSVIPC_COMPAT for all architectures | Guo Ren | 1 | -3/+0
The existing per-arch definitions are pretty much historic cruft. Move SYSVIPC_COMPAT into init/Kconfig. Signed-off-by: Guo Ren <[email protected]> Signed-off-by: Guo Ren <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Tested-by: Heiko Stuebner <[email protected]> Acked-by: Helge Deller <[email protected]> # parisc Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2022-04-26 | compat: consolidate the compat_flock{,64} definition | Christoph Hellwig | 1 | -16/+0
Provide a single common definition for the compat_flock and compat_flock64 structures using the same tricks as for the native variants. Another extra define is added for the packing required on x86. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Guo Ren <[email protected]> Reviewed-by: Arnd Bergmann <[email protected]> Tested-by: Heiko Stuebner <[email protected]> Acked-by: Helge Deller <[email protected]> # parisc Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2022-04-26 | uapi: always define F_GETLK64/F_SETLK64/F_SETLKW64 in fcntl.h | Christoph Hellwig | 1 | -4/+0
The F_GETLK64/F_SETLK64/F_SETLKW64 fcntl opcodes are only implemented for the 32-bit syscall APIs, but are also needed for compat handling on 64-bit kernels. Consolidate them in unistd.h instead of defining the internal compat definitions in compat.h, which is rather error prone (e.g. parisc gets the values wrong currently). Note that before this change they were never visible to userspace due to the fact that CONFIG_64BIT is only set for kernel builds. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Guo Ren <[email protected]> Reviewed-by: Arnd Bergmann <[email protected]> Tested-by: Heiko Stuebner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]>
2022-04-25 | s390: add KCSAN instrumentation to barriers and spinlocks | Ilya Leoshkevich | 2 | -8/+9
test_barrier fails on s390 because of the missing KCSAN instrumentation for several synchronization primitives. Add it to barriers by defining __mb(), __rmb(), __wmb(), __dma_rmb() and __dma_wmb(), and letting the common code in asm-generic/barrier.h do the rest. Spinlocks require instrumentation only on the unlock path; notify KCSAN that the CPU cannot move memory accesses outside of the spin lock. In reality it also cannot move stores inside of it, but this is not important and can be omitted. Reported-by: Tobias Huschle <[email protected]> Signed-off-by: Ilya Leoshkevich <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
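For illustration, a minimal sketch of the wrapper pattern in asm-generic/barrier.h that this change plugs into (simplified; the real header covers more primitives):

```c
/*
 * Sketch: the architecture provides __mb() and friends, and the generic
 * header layers the KCSAN annotations from <linux/kcsan-checks.h> on top.
 */
#ifndef mb
#define mb()	do { kcsan_mb(); __mb(); } while (0)
#endif

#ifndef rmb
#define rmb()	do { kcsan_rmb(); __rmb(); } while (0)
#endif

#ifndef wmb
#define wmb()	do { kcsan_wmb(); __wmb(); } while (0)
#endif
```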
2022-04-25 | s390/pci: add error record for CC 2 retries | Niklas Schnelle | 2 | -18/+63
Currently it is not detectable from within Linux when PCI instructions are retried because of a busy condition. Detecting such conditions and especially how long they lasted can however be quite useful in problem determination. This patch enables this by adding an s390dbf error log when a CC 2 is first encountered as well as after the retried instruction. Despite being unlikely it may be possible that these added debug messages drown out important other messages so allow setting the debug level in zpci_err_insn*() and set their level to 1 so they can be filtered out if need be. Reviewed-by: Matthew Rosato <[email protected]> Reviewed-by: Pierre Morel <[email protected]> Signed-off-by: Niklas Schnelle <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-04-25 | s390/pci: add PCI access type and length to error records | Niklas Schnelle | 1 | -15/+39
Currently when a PCI instruction returns a non-zero condition code it can be very hard to tell from the s390dbf logs what kind of instruction was executed. In case of PCI memory I/O (MIO) instructions it is even impossible to tell if we attempted a load, store or block store or how large the access was because only the address is logged. Improve this by adding an indicator byte for the instruction type to the error record and also store the length of the access for MIO instructions where this can not be deduced from the request. We use the following indicator values:
- 'l': PCI load
- 's': PCI store
- 'b': PCI store block
- 'L': PCI load (MIO)
- 'S': PCI store (MIO)
- 'B': PCI store block (MIO)
- 'M': MPCIFC
- 'R': RPCIT
Reviewed-by: Matthew Rosato <[email protected]> Reviewed-by: Pierre Morel <[email protected]> Signed-off-by: Niklas Schnelle <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
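For illustration only, a record along these lines could carry the new information; the struct name, fields, and layout below are assumptions for readability, not the actual s390 PCI debug record:

```c
#include <linux/types.h>

/* Illustrative error record with the indicator byte and access length. */
struct zpci_err_insn_record {
	u8  insn;	/* 'l','s','b','L','S','B','M','R' as listed above */
	u8  cc;		/* condition code returned by the instruction */
	u64 req;	/* request or function handle */
	u64 offset;	/* accessed address/offset */
	u64 len;	/* access length for MIO load/store/store block */
} __packed;
```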
2022-04-25 | s390/pci: don't log availability events as errors | Niklas Schnelle | 1 | -3/+0
Availability events are logged in s390dbf in s390dbf/pci_error/hex_ascii even though they don't indicate an error condition. They have also become redundant as commit 6526a597a2e85 ("s390/pci: add simpler s390dbf traces for events") added an s390dbf/pci_msg/sprintf log entry for availability events which contains all non reserved fields of struct zpci_ccdf_avail. On the other hand the availability entries in the error log make it easy to miss actual errors and may even overwrite error entries if the message buffer wraps. Thus simply remove the availability events from the error log thereby establishing the rule that any content in s390dbf/pci_error indicates some kind of error. Reviewed-by: Matthew Rosato <[email protected]> Reviewed-by: Pierre Morel <[email protected]> Signed-off-by: Niklas Schnelle <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-04-25 | s390/pci: make better use of zpci_dbg() levels | Niklas Schnelle | 3 | -3/+3
While the zpci_dbg() macro offers a level parameter this is currently largely unused. The only instance with higher importance than 3 is the UID checking change debug message which is not actually more important as the UID uniqueness guarantee is already exposed in sysfs so this should rather be 3 as well. On the other hand the "add ..." message which shows what devices are visible at the lowest level is essential during problem determination. By setting its level to 1, lowering the debug level can act as a filter to only show the available functions. On the error side the default level is set to 6 while all existing messages are printed at level 0. This is inconsistent and means there is no room for having messages be invisible on the default level so instead set the default level to 3 like for errors matching the default for debug messages. Reviewed-by: Matthew Rosato <[email protected]> Reviewed-by: Pierre Morel <[email protected]> Signed-off-by: Niklas Schnelle <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-04-25 | s390/vdso: add vdso randomization | Sven Schnelle | 1 | -1/+32
Randomize the address of vdso if randomize_va_space is enabled. Note that this keeps the vdso address on the same PMD as the stack to avoid allocating an extra page table just for vdso. Signed-off-by: Sven Schnelle <[email protected]> Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-04-25 | s390/vdso: map vdso above stack | Sven Schnelle | 2 | -4/+5
In the current code vdso is mapped below the stack. This is problematic when programs mapped to the top of the address space are allocating a lot of memory, because the heap will clash with the vdso. To avoid this map the vdso above the stack and move STACK_TOP so that it all fits into three level paging. Signed-off-by: Sven Schnelle <[email protected]> Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-04-25 | s390/vdso: move vdso mapping to its own function | Sven Schnelle | 2 | -5/+20
This is a preparation patch for adding vdso randomization to s390. It adds a function vdso_size(), which will be used later in calculating the STACK_TOP value. It also moves the vdso mapping into a new function vdso_map(), to keep the code similar to other architectures. Signed-off-by: Sven Schnelle <[email protected]> Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-04-25 | s390/mmap: increase stack/mmap gap to 128MB | Sven Schnelle | 1 | -2/+2
This basically reverts commit 9e78a13bfb16 ("[S390] reduce miminum gap between stack and mmap_base"). 32MB is not enough space between stack and mmap for some programs. Given that compat tasks aren't common these days, let's revert back to 128MB. Signed-off-by: Sven Schnelle <[email protected]> Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-04-25 | s390/zcrypt: code cleanup | Harald Freudenberger | 2 | -9/+9
This patch tries to fix as much as possible of the checkpatch.pl --strict findings:
CHECK: Logical continuations should be on the previous line
CHECK: No space is necessary after a cast
CHECK: Alignment should match open parenthesis
CHECK: 'useable' may be misspelled - perhaps 'usable'?
WARNING: Possible repeated word: 'is'
CHECK: spaces preferred around that '*' (ctx:VxV)
CHECK: Comparison to NULL could be written "!msg"
CHECK: Prefer kzalloc(sizeof(*zc)...) over kzalloc(sizeof(struct...)...)
CHECK: Unnecessary parentheses around resp_type->work
CHECK: Avoid CamelCase: <xcRB>
There is no functional change coming with this patch, only code cleanup, renaming, whitespaces, indenting, ... but no semantic change in any way. Also the API (zcrypt and pkey header file) is semantically unchanged. Signed-off-by: Harald Freudenberger <[email protected]> Reviewed-by: Jürgen Christ <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-04-25 | s390/zcrypt: cleanup CPRB struct definitions | Harald Freudenberger | 1 | -14/+12
This patch does a little cleanup on the CPRBX struct in zcrypt.h and the redundant CPRB struct definition in zcrypt_msgtype6.c. Especially some of the misleading fields from the CPRBX struct have been removed. There is no semantic change coming with this patch. The field names changed in the XCRB struct are only related to reserved fields which should never been used. Signed-off-by: Harald Freudenberger <[email protected]> Reviewed-by: Jürgen Christ <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-04-25 | s390/cio: simplify the calculation of variables | Haowen Bai | 1 | -15/+68
Fix the following coccicheck warning: ./arch/s390/include/asm/scsw.h:695:47-49: WARNING !A || A && B is equivalent to !A || B Apply a more readable version just to get rid of the warning. Signed-off-by: Haowen Bai <[email protected]> Reviewed-by: Peter Oberparleiter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: Alexander Gordeev <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Sven Schnelle <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
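Spelled out as plain C, the equivalence behind the warning: if A is false both forms are true regardless of B, and if A is true both reduce to B.

```c
#include <stdbool.h>

/* The equivalence the coccicheck warning points at. */
static bool before(bool a, bool b)
{
	return !a || (a && b);
}

static bool after(bool a, bool b)
{
	return !a || b;		/* same truth table, easier to read */
}
```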
2022-04-25 | s390/smp: sort out physical vs virtual CPU0 lowcore pointer | Alexander Gordeev | 1 | -1/+1
The SPX instruction called from set_prefix() expects the physical address of the lowcore to be installed, but instead the virtual address is passed. Note: this does not fix a bug currently, since virtual and physical addresses are identical. Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Alexander Gordeev <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
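A minimal sketch of the resulting call, assuming the usual __pa() conversion; the helper below is illustrative, not the exact arch code:

```c
#include <asm/lowcore.h>
#include <asm/page.h>
#include <asm/smp.h>

/* Sketch: install CPU0's lowcore prefix with an explicit physical address. */
static void set_cpu0_prefix(void)
{
	struct lowcore *lc = lowcore_ptr[0];	/* virtual address */

	/* SPX expects a physical address; today this equals the virtual one. */
	set_prefix(__pa(lc));
}
```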
2022-04-25 | s390/kexec: set end-of-ipl flag in last diag308 call | Alexander Egorenkov | 3 | -3/+16
If the facility IPL-complete-control is present then the last diag308 call made by kexec shall set the end-of-ipl flag in the subcode register to signal the hypervisor that this is the last diag308 call made by Linux. Only the diag308 calls made during a regular kexec need to set the end-of-ipl flag, in all other cases the hypervisor will ignore it. Signed-off-by: Alexander Egorenkov <[email protected]> Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-04-25 | s390/sclp: add detection of IPL-complete-control facility | Alexander Egorenkov | 1 | -0/+1
The presence of the IPL-complete-control facility can be derived from the hypervisor's SCLP info response. Signed-off-by: Alexander Egorenkov <[email protected]> Reviewed-by: Christian Borntraeger <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-04-22 | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm | Linus Torvalds | 3 | -8/+8
Pull kvm fixes from Paolo Bonzini:
"The main and larger change here is a workaround for AMD's lack of cache coherency for encrypted-memory guests. I have another patch pending, but it's waiting for review from the architecture maintainers.
RISC-V:
- Remove 's' & 'u' as valid ISA extension
- Do not allow disabling the base extensions 'i'/'m'/'a'/'c'
x86:
- Fix NMI watchdog in guests on AMD
- Fix for SEV cache incoherency issues
- Don't re-acquire SRCU lock in complete_emulated_io()
- Avoid NULL pointer deref if VM creation fails
- Fix race conditions between APICv disabling and vCPU creation
- Bugfixes for disabling of APICv
- Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume
selftests:
- Do not use bitfields larger than 32-bits, they differ between GCC and clang"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: selftests: introduce and use more page size-related constants
  kvm: selftests: do not use bitfields larger than 32-bits for PTEs
  KVM: SEV: add cache flush to solve SEV cache incoherency issues
  KVM: SVM: Flush when freeing encrypted pages even on SME_COHERENT CPUs
  KVM: SVM: Simplify and harden helper to flush SEV guest page(s)
  KVM: selftests: Silence compiler warning in the kvm_page_table_test
  KVM: x86/pmu: Update AMD PMC sample period to fix guest NMI-watchdog
  x86/kvm: Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume
  KVM: SPDX style and spelling fixes
  KVM: x86: Skip KVM_GUESTDBG_BLOCKIRQ APICv update if APICv is disabled
  KVM: x86: Pend KVM_REQ_APICV_UPDATE during vCPU creation to fix a race
  KVM: nVMX: Defer APICv updates while L2 is active until L1 is active
  KVM: x86: Tag APICv DISABLE inhibit, not ABSENT, if APICv is disabled
  KVM: Initialize debugfs_dentry when a VM is created to avoid NULL deref
  KVM: Add helpers to wrap vcpu->srcu_idx and yell if it's abused
  KVM: RISC-V: Use kvm_vcpu.srcu_idx, drop RISC-V's unnecessary copy
  KVM: x86: Don't re-acquire SRCU lock in complete_emulated_io()
  RISC-V: KVM: Restrict the extensions that can be disabled
  RISC-V: KVM: Remove 's' & 'u' as valid ISA extension
2022-04-21 | KVM: Add helpers to wrap vcpu->srcu_idx and yell if it's abused | Sean Christopherson | 3 | -8/+8
Add wrappers to acquire/release KVM's SRCU lock when stashing the index in vcpu->srcu_idx, along with rudimentary detection of illegal usage, e.g. re-acquiring SRCU and thus overwriting vcpu->srcu_idx. Because the SRCU index is (currently) either 0 or 1, illegal nesting bugs can go unnoticed for quite some time and only cause problems when the nested lock happens to get a different index. Wrap the WARNs in PROVE_RCU=y, and make them ONCE, otherwise KVM will likely yell so loudly that it will bring the kernel to its knees. Signed-off-by: Sean Christopherson <[email protected]> Tested-by: Fabiano Rosas <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
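The wrappers follow roughly the shape below (a sketch with simplified field names; see include/linux/kvm_host.h for the real helpers):

```c
/* Types come from <linux/kvm_host.h>; this is only a shape sketch. */
static inline void kvm_vcpu_srcu_read_lock(struct kvm_vcpu *vcpu)
{
#ifdef CONFIG_PROVE_RCU
	WARN_ONCE(vcpu->srcu_depth++, "KVM: illegal nested vCPU SRCU lock");
#endif
	vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
}

static inline void kvm_vcpu_srcu_read_unlock(struct kvm_vcpu *vcpu)
{
	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
#ifdef CONFIG_PROVE_RCU
	WARN_ONCE(--vcpu->srcu_depth, "KVM: unbalanced vCPU SRCU unlock");
#endif
}
```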
2022-04-19 | vmalloc: replace VM_NO_HUGE_VMAP with VM_ALLOW_HUGE_VMAP | Song Liu | 1 | -6/+1
Huge page backed vmalloc memory could benefit performance in many cases. However, some users of vmalloc may not be ready to handle huge pages for various reasons: hardware constraints, potential page splits, etc. VM_NO_HUGE_VMAP was introduced to allow vmalloc users to opt out of huge pages. However, it is not easy to track down all the users that require the opt-out, as the allocations are passed different stacks and may cause issues in different layers. To address this issue, replace VM_NO_HUGE_VMAP with an opt-in flag, VM_ALLOW_HUGE_VMAP, so that users that benefit from huge pages could ask specifically. Also, remove vmalloc_no_huge() and add the opt-in helper vmalloc_huge(). Fixes: fac54e2bfb5b ("x86/Kconfig: Select HAVE_ARCH_HUGE_VMALLOC with HAVE_ARCH_HUGE_VMAP") Link: https://lore.kernel.org/netdev/[email protected]/ Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Song Liu <[email protected]> Reviewed-by: Rik van Riel <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
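A usage sketch of the opt-in helper; the function name and allocation below are placeholders:

```c
#include <linux/gfp.h>
#include <linux/vmalloc.h>

/* Sketch: explicitly opt in to huge-page backed vmalloc for a big allocation. */
static void *alloc_big_table(unsigned long size)
{
	/* Caller must be able to cope with huge mappings behind this pointer. */
	return vmalloc_huge(size, GFP_KERNEL);
}
```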
2022-04-18 | swiotlb: make the swiotlb_init interface more useful | Christoph Hellwig | 1 | -2/+1
Pass a boolean flag to indicate if swiotlb needs to be enabled based on the addressing needs, and replace the verbose argument with a set of flags, including one to force enable bounce buffering. Note that this patch removes the possibility to force xen-swiotlb use with the swiotlb=force parameter on the command line on x86 (arm and arm64 never supported that), but this interface will be restored shortly. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Konrad Rzeszutek Wilk <[email protected]> Tested-by: Boris Ostrovsky <[email protected]>
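A usage sketch of the new interface; the surrounding function is a placeholder and the flag name is an assumption based on the swiotlb header:

```c
#include <linux/init.h>
#include <linux/swiotlb.h>

/* Sketch: architecture code deciding whether bounce buffering is needed. */
void __init example_mem_init(bool needs_bounce_buffers)
{
	/*
	 * First argument: addressing limits require swiotlb. SWIOTLB_VERBOSE
	 * prints the usual banner; SWIOTLB_FORCE would bounce every mapping.
	 */
	swiotlb_init(needs_bounce_buffers, SWIOTLB_VERBOSE);
}
```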
2022-04-12 | s390: enable CONFIG_HARDENED_USERCOPY in debug_defconfig | Sven Schnelle | 1 | -0/+1
Signed-off-by: Sven Schnelle <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2022-04-12 | s390: current_stack_pointer shouldn't be a function | Sven Schnelle | 4 | -10/+4
s390 defines current_stack_pointer as a function while all other architectures use 'register unsigned long asm("<stackptr reg>")'. This makes code like the following from check_stack_object() fail: if (IS_ENABLED(CONFIG_STACK_GROWSUP)) { if ((void *)current_stack_pointer < obj + len) return BAD_STACK; } else { if (obj < (void *)current_stack_pointer) return BAD_STACK; } because this would compare the address of current_stack_pointer() and not the stackpointer value. Reported-by: Karsten Graul <[email protected]> Fixes: 2792d84e6da5 ("usercopy: Check valid lifetime via stack depth") Cc: Kees Cook <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Alexander Gordeev <[email protected]> Signed-off-by: Sven Schnelle <[email protected]> Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
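For reference, the register-variable form used by the other architectures would look like this on s390, where r15 holds the stack pointer (a sketch of the approach, not necessarily the final header contents):

```c
/* Sketch: expose the stack pointer as a value, not a function. */
register unsigned long current_stack_pointer asm("r15");
```

With this form, casts like (void *)current_stack_pointer compare the actual stack pointer value rather than a function address.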