aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-11-11Revert "ext4: fix superblock checksum calculation race"Theodore Ts'o1-11/+0
This reverts commit acaa532687cdc3a03757defafece9c27aa667546 which can result in a ext4_superblock_csum_set() trying to sleep while a spinlock is being held. For more discussion of this issue, please see: https://lore.kernel.org/r/[email protected] Reported-by: [email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2020-11-11ext4: handle dax mount option collisionHarshad Shirwadkar1-3/+3
Mount options dax=inode and dax=never collided with fast_commit and journal checksum. Redefine the mount flags to remove the collision. Reported-by: Murphy Zhou <[email protected]> Fixes: 9cb20f94afcd2 ("fs/ext4: Make DAX mount option a tri-state") Signed-off-by: Harshad Shirwadkar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2020-11-10Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscryptLinus Torvalds1-3/+1
Pull fscrypt fix from Eric Biggers: "Fix a regression where a new WARN_ON() was reachable when using FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32 on ext4, causing xfstest generic/602 to sometimes fail on ext4" * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: fscrypt: remove reachable WARN in fscrypt_setup_iv_ino_lblk_32_key()
2020-11-10Merge branch 'turbostat' of ↵Linus Torvalds5-140/+509
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull turbostat updates from Len Brown: "Update update to version 20.09.30, one kernel side fix" * 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: tools/power turbostat: update version number powercap: restrict energy meter to root access tools/power turbostat: harden against cpu hotplug tools/power turbostat: adjust for temperature offset tools/power turbostat: Build with _FILE_OFFSET_BITS=64 tools/power turbostat: Support AMD Family 19h tools/power turbostat: Remove empty columns for Jacobsville tools/power turbostat: Add a new GFXAMHz column that exposes gt_act_freq_mhz. tools/power x86_energy_perf_policy: Input/output error in a VM tools/power turbostat: Skip pc8, pc9, pc10 columns, if they are disabled tools/power turbostat: Support additional CPU model numbers tools/power turbostat: Fix output formatting for ACPI CST enumeration tools/power turbostat: Replace HTTP links with HTTPS ones: TURBOSTAT UTILITY tools/power turbostat: Use sched_getcpu() instead of hardcoded cpu 0 tools/power turbostat: Enable accumulate RAPL display tools/power turbostat: Introduce functions to accumulate RAPL consumption tools/power turbostat: Make the energy variable to be 64 bit tools/power turbostat: Always print idle in the system configuration header tools/power turbostat: Print /dev/cpu_dma_latency
2020-11-10tools/power turbostat: update version numberLen Brown1-1/+1
goodbye summer... Signed-off-by: Len Brown <[email protected]>
2020-11-10powercap: restrict energy meter to root accessLen Brown1-2/+2
Remove non-privileged user access to power data contained in /sys/class/powercap/intel-rapl*/*/energy_uj Non-privileged users currently have read access to power data and can use this data to form a security attack. Some privileged drivers/applications need read access to this data, but don't expose it to non-privileged users. For example, thermald uses this data to ensure that power management works correctly. Thus removing non-privileged access is preferred over completely disabling this power reporting capability with CONFIG_INTEL_RAPL=n. Fixes: 95677a9a3847 ("PowerCap: Fix mode for energy counter") Signed-off-by: Len Brown <[email protected]> Cc: [email protected]
2020-11-09Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds32-393/+2383
Pull kvm fixes from Paolo Bonzini: "ARM: - fix compilation error when PMD and PUD are folded - fix regression in reads-as-zero behaviour of ID_AA64ZFR0_EL1 - add aarch64 get-reg-list test x86: - fix semantic conflict between two series merged for 5.10 - fix (and test) enforcement of paravirtual cpuid features selftests: - various cleanups to memory management selftests - new selftests testcase for performance of dirty logging" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (30 commits) KVM: selftests: allow two iterations of dirty_log_perf_test KVM: selftests: Introduce the dirty log perf test KVM: selftests: Make the number of vcpus global KVM: selftests: Make the per vcpu memory size global KVM: selftests: Drop pointless vm_create wrapper KVM: selftests: Add wrfract to common guest code KVM: selftests: Simplify demand_paging_test with timespec_diff_now KVM: selftests: Remove address rounding in guest code KVM: selftests: Factor code out of demand_paging_test KVM: selftests: Use a single binary for dirty/clear log test KVM: selftests: Always clear dirty bitmap after iteration KVM: selftests: Add blessed SVE registers to get-reg-list KVM: selftests: Add aarch64 get-reg-list test selftests: kvm: test enforcement of paravirtual cpuid features selftests: kvm: Add exception handling to selftests selftests: kvm: Clear uc so UCALL_NONE is being properly reported selftests: kvm: Fix the segment descriptor layout to match the actual layout KVM: x86: handle MSR_IA32_DEBUGCTLMSR with report_ignored_msrs kvm: x86: request masterclock update any time guest uses different msr kvm: x86: ensure pv_cpuid.features is initialized when enabling cap ...
2020-11-09Merge tag 'nfsd-5.10-1' of git://linux-nfs.org/~bfields/linuxLinus Torvalds5-14/+13
Pull nfsd fixes from Bruce Fields: "This is mainly server-to-server copy and fallout from Chuck's 5.10 rpc refactoring" * tag 'nfsd-5.10-1' of git://linux-nfs.org/~bfields/linux: net/sunrpc: fix useless comparison in proc_do_xprt() net/sunrpc: return 0 on attempt to write to "transports" NFSD: fix missing refcount in nfsd4_copy by nfsd4_do_async_copy NFSD: Fix use-after-free warning when doing inter-server copy NFSD: MKNOD should return NFSERR_BADTYPE instead of NFSERR_INVAL SUNRPC: Fix general protection fault in trace_rpc_xdr_overflow() NFSD: NFSv3 PATHCONF Reply is improperly formed
2020-11-09Merge tag 'ext4_for_linus_cleanups' of ↵Linus Torvalds24-271/+342
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes and cleanups from Ted Ts'o: "More fixes and cleanups for the new fast_commit features, but also a few other miscellaneous bug fixes and a cleanup for the MAINTAINERS file" * tag 'ext4_for_linus_cleanups' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (28 commits) jbd2: fix up sparse warnings in checkpoint code ext4: fix sparse warnings in fast_commit code ext4: cleanup fast commit mount options jbd2: don't start fast commit on aborted journal ext4: make s_mount_flags modifications atomic ext4: issue fsdev cache flush before starting fast commit ext4: disable fast commit with data journalling ext4: fix inode dirty check in case of fast commits ext4: remove unnecessary fast commit calls from ext4_file_mmap ext4: mark buf dirty before submitting fast commit buffer ext4: fix code documentatioon ext4: dedpulicate the code to wait on inode that's being committed jbd2: don't read journal->j_commit_sequence without taking a lock jbd2: don't touch buffer state until it is filled jbd2: add todo for a fast commit performance optimization jbd2: don't pass tid to jbd2_fc_end_commit_fallback() jbd2: don't use state lock during commit path jbd2: drop jbd2_fc_init documentation ext4: clean up the JBD2 API that initializes fast commits jbd2: rename j_maxlen to j_total_len and add jbd2_journal_max_txn_bufs ...
2020-11-09Merge tag 'erofs-for-5.10-rc4-fixes' of ↵Linus Torvalds2-12/+16
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs fixes from Gao Xiang: "A week ago, Vladimir reported an issue that the kernel log would become polluted if the page allocation debug option is enabled. I also found this when I cleaned up magical page->mapping and originally planned to submit these all for 5.11 but it seems the impact can be noticed so submit the fix in advance. In addition, nl6720 also reported that atime is empty although EROFS has the only one on-disk timestamp as a practical consideration for now but it's better to derive it as what we did for the other timestamps. Summary: - fix setting up pcluster improperly for temporary pages - derive atime instead of leaving it empty" * tag 'erofs-for-5.10-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: fix setting up pcluster for temporary pages erofs: derive atime instead of leaving it empty
2020-11-09KVM: selftests: allow two iterations of dirty_log_perf_testPaolo Bonzini1-1/+1
Even though one iteration is not enough for the dirty log performance test (due to the cost of building page tables, zeroing memory etc.) two is okay and it is the default. Without this patch, "./dirty_log_perf_test" without any further arguments fails. Cc: Ben Gardon <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08Linux 5.10-rc3Linus Torvalds1-1/+1
2020-11-08net/sunrpc: fix useless comparison in proc_do_xprt()Dan Carpenter1-3/+4
In the original code, the "if (*lenp < 0)" check didn't work because "*lenp" is unsigned. Fortunately, the memory_read_from_buffer() call will never fail in this context so it doesn't affect runtime. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2020-11-08Merge tag 'driver-core-5.10-rc3' of ↵Linus Torvalds5-7/+30
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core documentation fixes from Greg KH: "Some small Documentation fixes that were fallout from the larger documentation update we did in 5.10-rc2. Nothing major here at all, but all of these have been in linux-next and resolve build warnings when building the documentation files" * tag 'driver-core-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: Documentation: remove mic/index from misc-devices/index.rst scripts: get_api.pl: Add sub-titles to ABI output scripts: get_abi.pl: Don't let ABI files to create subtitles docs: leds: index.rst: add a missing file docs: ABI: sysfs-class-net: fix a typo docs: ABI: sysfs-driver-dma-ioatdma: what starts with /sys
2020-11-08Merge tag 'tty-5.10-rc3' of ↵Linus Torvalds5-25/+11
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial fixes from Greg KH: "Here are a small number of small tty and serial fixes for some reported problems for the tty core, vt code, and some serial drivers. They include fixes for: - a buggy and obsolete vt font ioctl removal - 8250_mtk serial baudrate runtime warnings - imx serial earlycon build configuration fix - txx9 serial driver error path cleanup issues - tty core fix in release_tty that can be triggered by trying to bind an invalid serial port name to a speakup console device Almost all of these have been in linux-next without any problems, the only one that hasn't, just deletes code :)" * tag 'tty-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: vt: Disable KD_FONT_OP_COPY tty: fix crash in release_tty if tty->port is not set serial: txx9: add missing platform_driver_unregister() on error in serial_txx9_init tty: serial: imx: enable earlycon by default if IMX_SERIAL_CONSOLE is enabled serial: 8250_mtk: Fix uart_get_baud_rate warning
2020-11-08Merge tag 'usb-5.10-rc3' of ↵Linus Torvalds11-6/+38
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some small USB fixes and new device ids: - USB gadget fixes for some reported issues - Fixes for the ever-troublesome apple fastcharge driver, hopefully we finally have it right. - More USB core quirks for odd devices - USB serial driver fixes for some long-standing issues that were recently found - some new USB serial driver device ids All have been in linux-next with no reported issues" * tag 'usb-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: apple-mfi-fastcharge: fix reference leak in apple_mfi_fc_set_property usb: mtu3: fix panic in mtu3_gadget_stop() USB: serial: option: add Telit FN980 composition 0x1055 USB: serial: option: add LE910Cx compositions 0x1203, 0x1230, 0x1231 USB: serial: cyberjack: fix write-URB completion race USB: Add NO_LPM quirk for Kingston flash drive USB: serial: option: add Quectel EC200T module support usb: raw-gadget: fix memory leak in gadget_setup usb: dwc2: Avoid leaving the error_debugfs label unused usb: dwc3: ep0: Fix delay status handling usb: gadget: fsl: fix null pointer checking usb: gadget: goku_udc: fix potential crashes in probe usb: dwc3: pci: add support for the Intel Alder Lake-S
2020-11-08fork: fix copy_process(CLONE_PARENT) race with the exiting ->real_parentEddy Wu1-5/+5
current->group_leader->exit_signal may change during copy_process() if current->real_parent exits. Move the assignment inside tasklist_lock to avoid the race. Signed-off-by: Eddy Wu <[email protected]> Acked-by: Oleg Nesterov <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-11-08vt: Disable KD_FONT_OP_COPYDaniel Vetter1-22/+2
It's buggy: On Fri, Nov 06, 2020 at 10:30:08PM +0800, Minh Yuan wrote: > We recently discovered a slab-out-of-bounds read in fbcon in the latest > kernel ( v5.10-rc2 for now ). The root cause of this vulnerability is that > "fbcon_do_set_font" did not handle "vc->vc_font.data" and > "vc->vc_font.height" correctly, and the patch > <https://lkml.org/lkml/2020/9/27/223> for VT_RESIZEX can't handle this > issue. > > Specifically, we use KD_FONT_OP_SET to set a small font.data for tty6, and > use KD_FONT_OP_SET again to set a large font.height for tty1. After that, > we use KD_FONT_OP_COPY to assign tty6's vc_font.data to tty1's vc_font.data > in "fbcon_do_set_font", while tty1 retains the original larger > height. Obviously, this will cause an out-of-bounds read, because we can > access a smaller vc_font.data with a larger vc_font.height. Further there was only one user ever. - Android's loadfont, busybox and console-tools only ever use OP_GET and OP_SET - fbset documentation only mentions the kernel cmdline font: option, not anything else. - systemd used OP_COPY before release 232 published in Nov 2016 Now unfortunately the crucial report seems to have gone down with gmane, and the commit message doesn't say much. But the pull request hints at OP_COPY being broken https://github.com/systemd/systemd/pull/3651 So in other words, this never worked, and the only project which foolishly every tried to use it, realized that rather quickly too. Instead of trying to fix security issues here on dead code by adding missing checks, fix the entire thing by removing the functionality. Note that systemd code using the OP_COPY function ignored the return value, so it doesn't matter what we're doing here really - just in case a lone server somewhere happens to be extremely unlucky and running an affected old version of systemd. The relevant code from font_copy_to_all_vcs() in systemd was: /* copy font from active VT, where the font was uploaded to */ cfo.op = KD_FONT_OP_COPY; cfo.height = vcs.v_active-1; /* tty1 == index 0 */ (void) ioctl(vcfd, KDFONTOP, &cfo); Note this just disables the ioctl, garbage collecting the now unused callbacks is left for -next. v2: Tetsuo found the old mail, which allowed me to find it on another archive. Add the link too. Acked-by: Peilin Ye <[email protected]> Reported-by: Minh Yuan <[email protected]> References: https://lists.freedesktop.org/archives/systemd-devel/2016-June/036935.html References: https://github.com/systemd/systemd/pull/3651 Cc: Greg KH <[email protected]> Cc: Peilin Ye <[email protected]> Cc: Tetsuo Handa <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2020-11-08Merge tag 'xfs-5.10-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds8-33/+38
Pull xfs fixes from Darrick Wong: - Fix an uninitialized struct problem - Fix an iomap problem zeroing unwritten EOF blocks - Fix some clumsy error handling when writeback fails on filesystems with blocksize < pagesize - Fix a retry loop not resetting loop variables properly - Fix scrub flagging rtinherit inodes on a non-rt fs, since the kernel actually does permit that combination - Fix excessive page cache flushing when unsharing part of a file * tag 'xfs-5.10-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: only flush the unshared range in xfs_reflink_unshare xfs: fix scrub flagging rtinherit even if there is no rt device xfs: fix missing CoW blocks writeback conversion retry iomap: clean up writeback state logic on writepage error iomap: support partial page discard on writeback block mapping failure xfs: flush new eof page on truncate to avoid post-eof corruption xfs: set xefi_discard when creating a deferred agfl free log intent item
2020-11-08Merge branch 'hch' (patches from Christoph)Linus Torvalds6-17/+39
Merge procfs splice read fixes from Christoph Hellwig: "Greg reported a problem due to the fact that Android tests use procfs files to test splice, which stopped working with the changes for set_fs() removal. This series adds read_iter support for seq_file, and uses those for various proc files using seq_file to restore splice read support" [ Side note: Christoph initially had a scripted "move everything over" patch, which looks fine, but I personally would prefer us to actively discourage splice() on random files. So this does just the minimal basic core set of proc file op conversions. For completeness, and in case people care, that script was sed -i -e 's/\.proc_read\(\s*=\s*\)seq_read/\.proc_read_iter\1seq_read_iter/g' but I'll wait and see if somebody has a strong argument for using splice on random small /proc files before I'd run it on the whole kernel. - Linus ] * emailed patches from Christoph Hellwig <[email protected]>: proc "seq files": switch to ->read_iter proc "single files": switch to ->read_iter proc/stat: switch to ->read_iter proc/cpuinfo: switch to ->read_iter proc: wire up generic_file_splice_read for iter ops seq_file: add seq_read_iter
2020-11-08Merge tag 'x86-urgent-2020-11-08' of ↵Linus Torvalds5-32/+54
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A set of x86 fixes: - Use SYM_FUNC_START_WEAK in the mem* ASM functions instead of a combination of .weak and SYM_FUNC_START_LOCAL which makes LLVMs integrated assembler upset - Correct the mitigation selection logic which prevented the related prctl to work correctly - Make the UV5 hubless system work correctly by fixing up the malformed table entries and adding the missing ones" * tag 'x86-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform/uv: Recognize UV5 hubless system identifier x86/platform/uv: Remove spaces from OEM IDs x86/platform/uv: Fix missing OEM_TABLE_ID x86/speculation: Allow IBPB to be conditionally enabled on CPUs with always-on STIBP x86/lib: Change .weak to SYM_FUNC_START_WEAK for arch/x86/lib/mem*_64.S
2020-11-08Merge tag 'perf-urgent-2020-11-08' of ↵Linus Torvalds1-7/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Thomas Gleixner: "A single fix for the perf core plugging a memory leak in the address filter parser" * tag 'perf-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix a memory leak in perf_event_parse_addr_filter()
2020-11-08Merge tag 'locking-urgent-2020-11-08' of ↵Linus Torvalds1-2/+14
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull futex fix from Thomas Gleixner: "A single fix for the futex code where an intermediate state in the underlying RT mutex was not handled correctly and triggering a BUG() instead of treating it as another variant of retry condition" * tag 'locking-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Handle transient "ownerless" rtmutex state correctly
2020-11-08Merge tag 'irq-urgent-2020-11-08' of ↵Linus Torvalds9-18/+107
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "A set of fixes for interrupt chip drivers: - Fix the fallout of the IPI as interrupt conversion in Kconfig and the BCM2836 interrupt chip driver - Fixes for interrupt affinity setting and the handling of hierarchical irq domains in the SiFive PLIC driver - Make the unmapped event handling in the TI SCI driver work correctly - A few minor fixes and cleanups in various chip drivers and Kconfig" * tag 'irq-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: dt-bindings: irqchip: ti, sci-inta: Fix diagram indentation for unmapped events irqchip/ti-sci-inta: Add support for unmapped event handling dt-bindings: irqchip: ti, sci-inta: Update for unmapped event handling irqchip/renesas-intc-irqpin: Merge irlm_bit and needs_irlm irqchip/sifive-plic: Fix chip_data access within a hierarchy irqchip/sifive-plic: Fix broken irq_set_affinity() callback irqchip/stm32-exti: Add all LP timer exti direct events support irqchip/bcm2836: Fix missing __init annotation irqchip/mips: Drop selection of IRQ_DOMAIN_HIERARCHY irqchip/mst: Make mst_intc_of_init static irqchip/mst: MST_IRQ should depend on ARCH_MEDIATEK or ARCH_MSTARV7 genirq: Let GENERIC_IRQ_IPI select IRQ_DOMAIN_HIERARCHY
2020-11-08Merge tag 'core-urgent-2020-11-08' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull entry code fix from Thomas Gleixner: "A single fix for the generic entry code to correct the wrong assumption that the lockdep interrupt state needs not to be established before calling the RCU check" * tag 'core-urgent-2020-11-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: entry: Fix the incorrect ordering of lockdep and RCU check
2020-11-08Merge tag 'powerpc-5.10-3' of ↵Linus Torvalds10-104/+44
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - fix miscompilation with GCC 4.9 by using asm_goto_volatile for put_user() - fix for an RCU splat at boot caused by a recent lockdep change - fix for a possible deadlock in our EEH debugfs code - several fixes for handling of _PAGE_ACCESSED on 32-bit platforms - build fix when CONFIG_NUMA=n Thanks to Andreas Schwab, Christophe Leroy, Oliver O'Halloran, Qian Cai, and Scott Cheloha. * tag 'powerpc-5.10-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/numa: Fix build when CONFIG_NUMA=n powerpc/8xx: Manage _PAGE_ACCESSED through APG bits in L1 entry powerpc/8xx: Always fault when _PAGE_ACCESSED is not set powerpc/40x: Always fault when _PAGE_ACCESSED is not set powerpc/603: Always fault when _PAGE_ACCESSED is not set powerpc: Use asm_goto_volatile for put_user() powerpc/smp: Call rcu_cpu_starting() earlier powerpc/eeh_cache: Fix a possible debugfs deadlock
2020-11-08KVM: selftests: Introduce the dirty log perf testBen Gardon6-8/+396
The dirty log perf test will time verious dirty logging operations (enabling dirty logging, dirtying memory, getting the dirty log, clearing the dirty log, and disabling dirty logging) in order to quantify dirty logging performance. This test can be used to inform future performance improvements to KVM's dirty logging infrastructure. This series was tested by running the following invocations on an Intel Skylake machine: dirty_log_perf_test -b 20m -i 100 -v 64 dirty_log_perf_test -b 20g -i 5 -v 4 dirty_log_perf_test -b 4g -i 5 -v 32 demand_paging_test -b 20m -v 64 demand_paging_test -b 20g -v 4 demand_paging_test -b 4g -v 32 All behaved as expected. Signed-off-by: Ben Gardon <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08KVM: selftests: Make the number of vcpus globalAndrew Jones2-20/+20
We also check the input number of vcpus against the maximum supported. Signed-off-by: Andrew Jones <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08KVM: selftests: Make the per vcpu memory size globalAndrew Jones2-12/+11
Rename vcpu_memory_bytes to something with "percpu" in it in order to be less ambiguous. Also make it global to simplify things. Signed-off-by: Andrew Jones <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08KVM: selftests: Drop pointless vm_create wrapperAndrew Jones4-9/+3
Signed-off-by: Andrew Jones <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08KVM: selftests: Add wrfract to common guest codeBen Gardon2-1/+7
Wrfract will be used by the dirty logging perf test introduced later in this series to dirty memory sparsely. This series was tested by running the following invocations on an Intel Skylake machine: dirty_log_perf_test -b 20m -i 100 -v 64 dirty_log_perf_test -b 20g -i 5 -v 4 dirty_log_perf_test -b 4g -i 5 -v 32 demand_paging_test -b 20m -v 64 demand_paging_test -b 20g -v 4 demand_paging_test -b 4g -v 32 All behaved as expected. Signed-off-by: Ben Gardon <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08KVM: selftests: Simplify demand_paging_test with timespec_diff_nowBen Gardon3-15/+27
Add a helper function to get the current time and return the time since a given start time. Use that function to simplify the timekeeping in the demand paging test. This series was tested by running the following invocations on an Intel Skylake machine: dirty_log_perf_test -b 20m -i 100 -v 64 dirty_log_perf_test -b 20g -i 5 -v 4 dirty_log_perf_test -b 4g -i 5 -v 32 demand_paging_test -b 20m -v 64 demand_paging_test -b 20g -v 4 demand_paging_test -b 4g -v 32 All behaved as expected. Signed-off-by: Ben Gardon <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08KVM: selftests: Remove address rounding in guest codeBen Gardon1-1/+0
Rounding the address the guest writes to a host page boundary will only have an effect if the host page size is larger than the guest page size, but in that case the guest write would still go to the same host page. There's no reason to round the address down, so remove the rounding to simplify the demand paging test. This series was tested by running the following invocations on an Intel Skylake machine: dirty_log_perf_test -b 20m -i 100 -v 64 dirty_log_perf_test -b 20g -i 5 -v 4 dirty_log_perf_test -b 4g -i 5 -v 32 demand_paging_test -b 20m -v 64 demand_paging_test -b 20g -v 4 demand_paging_test -b 4g -v 32 All behaved as expected. Signed-off-by: Ben Gardon <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08KVM: selftests: Factor code out of demand_paging_testBen Gardon2-181/+210
Much of the code in demand_paging_test can be reused by other, similar multi-vCPU-memory-touching-perfromance-tests. Factor that common code out for reuse. No functional change expected. This series was tested by running the following invocations on an Intel Skylake machine: dirty_log_perf_test -b 20m -i 100 -v 64 dirty_log_perf_test -b 20g -i 5 -v 4 dirty_log_perf_test -b 4g -i 5 -v 32 demand_paging_test -b 20m -v 64 demand_paging_test -b 20g -v 4 demand_paging_test -b 4g -v 32 All behaved as expected. Signed-off-by: Ben Gardon <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08KVM: selftests: Use a single binary for dirty/clear log testPeter Xu3-39/+156
Remove the clear_dirty_log test, instead merge it into the existing dirty_log_test. It should be cleaner to use this single binary to do both tests, also it's a preparation for the upcoming dirty ring test. The default behavior will run all the modes in sequence. Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Peter Xu <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08KVM: selftests: Always clear dirty bitmap after iterationPeter Xu1-1/+1
We used not to clear the dirty bitmap before because KVM_GET_DIRTY_LOG would overwrite it the next time it copies the dirty log onto it. In the upcoming dirty ring tests we'll start to fetch dirty pages from a ring buffer, so no one is going to clear the dirty bitmap for us. Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Peter Xu <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08KVM: selftests: Add blessed SVE registers to get-reg-listAndrew Jones4-42/+217
Add support for the SVE registers to get-reg-list and create a new test, get-reg-list-sve, which tests them when running on a machine with SVE support. Signed-off-by: Andrew Jones <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08KVM: selftests: Add aarch64 get-reg-list testAndrew Jones5-0/+703
Check for KVM_GET_REG_LIST regressions. The blessed list was created by running on v4.15 with the --core-reg-fixup option. The following script was also used in order to annotate system registers with their names when possible. When new system registers are added the names can just be added manually using the same grep. while read reg; do if [[ ! $reg =~ ARM64_SYS_REG ]]; then printf "\t$reg\n" continue fi encoding=$(echo "$reg" | sed "s/ARM64_SYS_REG(//;s/),//") if ! name=$(grep "$encoding" ../../../../arch/arm64/include/asm/sysreg.h); then printf "\t$reg\n" continue fi name=$(echo "$name" | sed "s/.*SYS_//;s/[\t ]*sys_reg($encoding)$//") printf "\t$reg\t/* $name */\n" done < <(aarch64/get-reg-list --core-reg-fixup --list) Signed-off-by: Andrew Jones <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08selftests: kvm: test enforcement of paravirtual cpuid featuresOliver Upton7-0/+308
Add a set of tests that ensure the guest cannot access paravirtual msrs and hypercalls that have been disabled in the KVM_CPUID_FEATURES leaf. Expect a #GP in the case of msr accesses and -KVM_ENOSYS from hypercalls. Cc: Jim Mattson <[email protected]> Signed-off-by: Oliver Upton <[email protected]> Reviewed-by: Peter Shier <[email protected]> Reviewed-by: Aaron Lewis <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08selftests: kvm: Add exception handling to selftestsAaron Lewis9-9/+244
Add the infrastructure needed to enable exception handling in selftests. This allows any of the exception and interrupt vectors to be overridden in the guest. Signed-off-by: Aaron Lewis <[email protected]> Reviewed-by: Alexander Graf <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08selftests: kvm: Clear uc so UCALL_NONE is being properly reportedAaron Lewis3-0/+9
Ensure the out value 'uc' in get_ucall() is properly reporting UCALL_NONE if the call fails. The return value will be correctly reported, however, the out parameter 'uc' will not be. Clear the struct to ensure the correct value is being reported in the out parameter. Signed-off-by: Aaron Lewis <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Reviewed-by: Alexander Graf <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08selftests: kvm: Fix the segment descriptor layout to match the actual layoutAaron Lewis2-2/+3
Fix the layout of 'struct desc64' to match the layout described in the SDM Vol 3, Chapter 3 "Protected-Mode Memory Management", section 3.4.5 "Segment Descriptors", Figure 3-8 "Segment Descriptor". The test added later in this series relies on this and crashes if this layout is not correct. Signed-off-by: Aaron Lewis <[email protected]> Reviewed-by: Alexander Graf <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08KVM: x86: handle MSR_IA32_DEBUGCTLMSR with report_ignored_msrsPankaj Gupta1-3/+3
Windows2016 guest tries to enable LBR by setting the corresponding bits in MSR_IA32_DEBUGCTLMSR. KVM does not emulate MSR_IA32_DEBUGCTLMSR and spams the host kernel logs with error messages like: kvm [...]: vcpu1, guest rIP: 0xfffff800a8b687d3 kvm_set_msr_common: MSR_IA32_DEBUGCTLMSR 0x1, nop" This patch fixes this by enabling error logging only with 'report_ignored_msrs=1'. Signed-off-by: Pankaj Gupta <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08kvm: x86: request masterclock update any time guest uses different msrOliver Upton1-1/+1
Commit 5b9bb0ebbcdc ("kvm: x86: encapsulate wrmsr(MSR_KVM_SYSTEM_TIME) emulation in helper fn", 2020-10-21) subtly changed the behavior of guest writes to MSR_KVM_SYSTEM_TIME(_NEW). Restore the previous behavior; update the masterclock any time the guest uses a different msr than before. Fixes: 5b9bb0ebbcdc ("kvm: x86: encapsulate wrmsr(MSR_KVM_SYSTEM_TIME) emulation in helper fn", 2020-10-21) Signed-off-by: Oliver Upton <[email protected]> Reviewed-by: Peter Shier <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08kvm: x86: ensure pv_cpuid.features is initialized when enabling capOliver Upton3-7/+19
Make the paravirtual cpuid enforcement mechanism idempotent to ioctl() ordering by updating pv_cpuid.features whenever userspace requests the capability. Extract this update out of kvm_update_cpuid_runtime() into a new helper function and move its other call site into kvm_vcpu_after_set_cpuid() where it more likely belongs. Fixes: 66570e966dd9 ("kvm: x86: only provide PV features if enabled in guest's CPUID") Signed-off-by: Oliver Upton <[email protected]> Reviewed-by: Peter Shier <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08kvm: x86: reads of restricted pv msrs should also result in #GPOliver Upton1-0/+34
commit 66570e966dd9 ("kvm: x86: only provide PV features if enabled in guest's CPUID") only protects against disallowed guest writes to KVM paravirtual msrs, leaving msr reads unchecked. Fix this by enforcing KVM_CPUID_FEATURES for msr reads as well. Fixes: 66570e966dd9 ("kvm: x86: only provide PV features if enabled in guest's CPUID") Signed-off-by: Oliver Upton <[email protected]> Reviewed-by: Peter Shier <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08KVM: x86: use positive error values for msr emulation that causes #GPMaxim Levitsky2-14/+22
Recent introduction of the userspace msr filtering added code that uses negative error codes for cases that result in either #GP delivery to the guest, or handled by the userspace msr filtering. This breaks an assumption that a negative error code returned from the msr emulation code is a semi-fatal error which should be returned to userspace via KVM_RUN ioctl and usually kill the guest. Fix this by reusing the already existing KVM_MSR_RET_INVALID error code, and by adding a new KVM_MSR_RET_FILTERED error code for the userspace filtered msrs. Fixes: 291f35fb2c1d1 ("KVM: x86: report negative values from wrmsr emulation to userspace") Reported-by: Qian Cai <[email protected]> Signed-off-by: Maxim Levitsky <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08KVM: Documentation: Update entry for KVM_CAP_ENFORCE_PV_CPUIDPeter Xu1-2/+1
Should be squashed into 66570e966dd9cb4f. Signed-off-by: Peter Xu <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08KVM: Documentation: Update entry for KVM_X86_SET_MSR_FILTERPeter Xu1-1/+1
It should be an accident when rebase, since we've already have section 8.25 (which is KVM_CAP_S390_DIAG318). Fix the number. Signed-off-by: Peter Xu <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-11-08KVM: x86/mmu: fix counting of rmap entries in pte_list_addLi RongQing1-5/+7
Fix an off-by-one style bug in pte_list_add() where it failed to account the last full set of SPTEs, i.e. when desc->sptes is full and desc->more is NULL. Merge the two "PTE_LIST_EXT-1" checks as part of the fix to avoid an extra comparison. Signed-off-by: Li RongQing <[email protected]> Reviewed-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>