aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-02-26libceph, ceph: avoid memory leak when specifying same option several timesChengguang Xu2-0/+9
When parsing string option, in order to avoid memory leak we need to carefully free it first in case of specifying same option several times. Signed-off-by: Chengguang Xu <[email protected]> Reviewed-by: Ilya Dryomov <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2018-02-26ceph: flush dirty caps of unlinked inode ASAPZhi Zhang3-24/+32
Client should release unlinked inode from its cache ASAP. But client can't release inode with dirty caps. Link: http://tracker.ceph.com/issues/22886 Signed-off-by: Zhi Zhang <[email protected]> Reviewed-by: "Yan, Zheng" <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2018-02-26ALSA: hda - Fix pincfg at resume on Lenovo T470 dockTakashi Iwai1-1/+2
We've added a quirk to enable the recent Lenovo dock support, where it overwrites the pin configs of NID 0x17 and 19, not only updating the pin config cache. It works right after the boot, but the problem is that the pin configs are occasionally cleared when the machine goes to PM. Meanwhile the quirk writes the pin configs only at the pre-probe, so this won't be applied any longer. For addressing that issue, this patch moves the code to overwrite the pin configs into HDA_FIXUP_ACT_INIT section so that it's always applied at both probe and resume time. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=195161 Fixes: 61fcf8ece9b6 ("ALSA: hda/realtek - Enable Thinkpad Dock device for ALC298 platform") Cc: <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2018-02-26clocksource/drivers/fsl_ftm_timer: Fix error return checkingColin Ian King1-1/+1
The error checks on freq for a negative error return always fails because freq is unsigned and can never be negative. Fix this by making freq a signed long. Detected with Coccinelle: drivers/clocksource/fsl_ftm_timer.c:287:5-9: WARNING: Unsigned expression compared with zero: freq <= 0 drivers/clocksource/fsl_ftm_timer.c:291:5-9: WARNING: Unsigned expression compared with zero: freq <= 0 Fixes: 2529c3a33079 ("clocksource: Add Freescale FlexTimer Module (FTM) timer support") Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Daniel Lezcano <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-02-26drm/sun4i: Protect the TCON pixel clocksMaxime Ripard1-2/+2
Both TCON clocks are very sensitive to clock changes, since any change might lead to improper timings. Make sure our rate is never changed. Tested-by: Giulio Benetti <[email protected]> Reviewed-by: Chen-Yu Tsai <[email protected]> Signed-off-by: Maxime Ripard <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/d5224d2e81ecf73dc09f234e580ada52c00eaee3.1519204731.git-series.maxime.ripard@bootlin.com
2018-02-26drm/sun4i: Enable the output on the pins (tcon0)Ondrej Jirman1-0/+3
I noticed that with 4.16-rc1 LVDS output on A83T based TBS A711 tablet doesn't work (there's output but it's garbled). I compared some older patches for LVDS support with the mainlined ones and this change is missing from mainline Linux. I don't know what the register does exactly and the harcoded register value doesn't inspire much confidence that it will work in a general case, so I'm sending this RFC. This patch fixes the issue on A83T. Signed-off-by: Ondrej Jirman <[email protected]> Signed-off-by: Maxime Ripard <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2018-02-26nvme-pci: Fix nvme queue cleanup if IRQ setup failsJianchao Wang1-1/+4
This patch fixes nvme queue cleanup if requesting an IRQ handler for the queue's vector fails. It does this by resetting the cq_vector to the uninitialized value of -1 so it is ignored for a controller reset. Signed-off-by: Jianchao Wang <[email protected]> [changelog updates, removed misc whitespace changes] Signed-off-by: Keith Busch <[email protected]>
2018-02-25Linux 4.16-rc3Linus Torvalds1-1/+1
2018-02-25Merge tag 'xtensa-20180225' of git://github.com/jcmvbkbc/linux-xtensaLinus Torvalds2-17/+93
Pull Xtensa fixes from Max Filippov: "Two fixes for reserved memory/DMA buffers allocation in high memory on xtensa architecture - fix memory accounting when reserved memory is in high memory region - fix DMA allocation from high memory" * tag 'xtensa-20180225' of git://github.com/jcmvbkbc/linux-xtensa: xtensa: support DMA buffers in high memory xtensa: fix high memory/reserved memory collision
2018-02-25Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds6-22/+48
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A small set of fixes: - UAPI data type correction for hyperv - correct the cpu cores field in /proc/cpuinfo on CPU hotplug - return proper error code in the resctrl file system failure path to avoid silent subsequent failures - correct a subtle accounting issue in the new vector allocation code which went unnoticed for a while and caused suspend/resume failures" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/topology: Update the 'cpu cores' field in /proc/cpuinfo correctly across CPU hotplug operations x86/topology: Fix function name in documentation x86/intel_rdt: Fix incorrect returned value when creating rdgroup sub-directory in resctrl file system x86/apic/vector: Handle vector release on CPU unplug correctly genirq/matrix: Handle CPU offlining proper x86/headers/UAPI: Use __u64 instead of u64 in <uapi/asm/hyperv.h>
2018-02-25Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Thomas Gleixner: "A single commit which shuts up a bogus GCC-8 warning" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/oprofile: Fix bogus GCC-8 warning in nmi_setup()
2018-02-25Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds3-18/+31
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "Three patches to fix memory ordering issues on ALPHA and a comment to clarify the usage scope of a mutex internal function" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/xchg/alpha: Fix xchg() and cmpxchg() memory ordering bugs locking/xchg/alpha: Clean up barrier usage by using smp_mb() in place of __ASM__MB locking/xchg/alpha: Add unconditional memory barrier to cmpxchg() locking/mutex: Add comment to __mutex_owner() to deter usage
2018-02-25Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds17-19/+19
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull cleanup patchlet from Thomas Gleixner: "A single commit removing a bunch of bogus double semicolons all over the tree" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: treewide/trivial: Remove ';;$' typo noise
2018-02-25Merge tag 'nfs-for-4.16-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds3-11/+11
Pull NFS client bugfixes from Trond Myklebust: - fix a broken cast in nfs4_callback_recallany() - fix an Oops during NFSv4 migration events - make struct nlmclnt_fl_close_lock_ops static * tag 'nfs-for-4.16-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFS: make struct nlmclnt_fl_close_lock_ops static nfs: system crashes after NFS4ERR_MOVED recovery NFSv4: Fix broken cast in nfs4_callback_recallany()
2018-02-25md/raid1: fix NULL pointer dereferenceYufen Yu1-0/+11
In handle_write_finished(), if r1_bio->bios[m] != NULL, it thinks the corresponding conf->mirrors[m].rdev is also not NULL. But, it is not always true. Even if some io hold replacement rdev(i.e. rdev->nr_pending.count > 0), raid1_remove_disk() can also set the rdev as NULL. That means, bios[m] != NULL, but mirrors[m].rdev is NULL, resulting in NULL pointer dereference in handle_write_finished and sync_request_write. This patch can fix BUGs as follows: BUG: unable to handle kernel NULL pointer dereference at 0000000000000140 IP: [<ffffffff815bbbbd>] raid1d+0x2bd/0xfc0 PGD 12ab52067 PUD 12f587067 PMD 0 Oops: 0000 [#1] SMP CPU: 1 PID: 2008 Comm: md3_raid1 Not tainted 4.1.44+ #130 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014 Call Trace: ? schedule+0x37/0x90 ? prepare_to_wait_event+0x83/0xf0 md_thread+0x144/0x150 ? wake_atomic_t_function+0x70/0x70 ? md_start_sync+0xf0/0xf0 kthread+0xd8/0xf0 ? kthread_worker_fn+0x160/0x160 ret_from_fork+0x42/0x70 ? kthread_worker_fn+0x160/0x160 BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8 IP: sync_request_write+0x9e/0x980 PGD 800000007c518067 P4D 800000007c518067 PUD 8002b067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 24 PID: 2549 Comm: md3_raid1 Not tainted 4.15.0+ #118 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014 Call Trace: ? sched_clock+0x5/0x10 ? sched_clock_cpu+0xc/0xb0 ? flush_pending_writes+0x3a/0xd0 ? pick_next_task_fair+0x4d5/0x5f0 ? __switch_to+0xa2/0x430 raid1d+0x65a/0x870 ? find_pers+0x70/0x70 ? find_pers+0x70/0x70 ? md_thread+0x11c/0x160 md_thread+0x11c/0x160 ? finish_wait+0x80/0x80 kthread+0x111/0x130 ? kthread_create_worker_on_cpu+0x70/0x70 ? do_syscall_64+0x6f/0x190 ? SyS_exit_group+0x10/0x10 ret_from_fork+0x35/0x40 Reviewed-by: NeilBrown <[email protected]> Signed-off-by: Yufen Yu <[email protected]> Signed-off-by: Shaohua Li <[email protected]>
2018-02-25md: fix a potential deadlock of raid5/raid10 reshapeBingJing Chang3-14/+15
There is a potential deadlock if mount/umount happens when raid5_finish_reshape() tries to grow the size of emulated disk. How the deadlock happens? 1) The raid5 resync thread finished reshape (expanding array). 2) The mount or umount thread holds VFS sb->s_umount lock and tries to write through critical data into raid5 emulated block device. So it waits for raid5 kernel thread handling stripes in order to finish it I/Os. 3) In the routine of raid5 kernel thread, md_check_recovery() will be called first in order to reap the raid5 resync thread. That is, raid5_finish_reshape() will be called. In this function, it will try to update conf and call VFS revalidate_disk() to grow the raid5 emulated block device. It will try to acquire VFS sb->s_umount lock. The raid5 kernel thread cannot continue, so no one can handle mount/ umount I/Os (stripes). Once the write-through I/Os cannot be finished, mount/umount will not release sb->s_umount lock. The deadlock happens. The raid5 kernel thread is an emulated block device. It is responible to handle I/Os (stripes) from upper layers. The emulated block device should not request any I/Os on itself. That is, it should not call VFS layer functions. (If it did, it will try to acquire VFS locks to guarantee the I/Os sequence.) So we have the resync thread to send resync I/O requests and to wait for the results. For solving this potential deadlock, we can put the size growth of the emulated block device as the final step of reshape thread. 2017/12/29: Thanks to Guoqing Jiang <[email protected]>, we confirmed that there is the same deadlock issue in raid10. It's reproducible and can be fixed by this patch. For raid10.c, we can remove the similar code to prevent deadlock as well since they has been called before. Reported-by: Alex Wu <[email protected]> Reviewed-by: Alex Wu <[email protected]> Reviewed-by: Chung-Chiang Cheng <[email protected]> Signed-off-by: BingJing Chang <[email protected]> Signed-off-by: Shaohua Li <[email protected]>
2018-02-25md-cluster: choose correct label when clustered layout is not supportedLidong Zhong1-1/+1
r10conf is already successfully allocated before checking the layout Signed-off-by: Lidong Zhong <[email protected]> Reviewed-by: Guoqing Jiang <[email protected]> Signed-off-by: Shaohua Li <[email protected]>
2018-02-25platform/x86: intel-vbtn: Only activate tablet mode switch on 2-in-1'sMario Limonciello1-17/+29
Some laptops such as the XPS 9360 support the intel-vbtn INT33D6 interface but don't initialize the bit that intel-vbtn uses to represent switching tablet mode. By running this only on real 2-in-1's it shouldn't cause false positives. Fixes: 30323fb6d5 ("Support tablet mode switch") Reported-by: Jeremy Cline <[email protected]> Signed-off-by: Mario Limonciello <[email protected]> Tested-by: Jeremy Cline <[email protected]> Tested-by: Darren Hart (VMware) <[email protected]> Signed-off-by: Darren Hart (VMware) <[email protected]>
2018-02-25radix tree test suite: Fix buildMatthew Wilcox4-2/+16
- Add an empty linux/compiler_types.h (now being included by kconfig.h) - Add __GFP_ZERO - Add kzalloc - Test __GFP_DIRECT_RECLAIM instead of __GFP_NOWARN Signed-off-by: Matthew Wilcox <[email protected]>
2018-02-24Merge tag 'powerpc-4.16-4' of ↵Linus Torvalds8-9/+20
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Add handling for a missing instruction in our 32-bit BPF JIT so that it can be used for seccomp filtering. - Add a missing NULL pointer check before a function call in new EEH code. - Fix an error path in the new ocxl driver to correctly return EFAULT. - The support for the new ibm,drc-info device tree property turns out to need several fixes, so for now we just stop advertising to firmware that we support it until the bugs can be ironed out. - One fix for the new drmem code which was incorrectly modifying the device tree in place. - Finally two fixes for the RFI flush support, so that firmware can advertise to us that it should be disabled entirely so as not to affect performance. Thanks to: Bharata B Rao, Frederic Barrat, Juan J. Alvarez, Mark Lord, Michael Bringmann. * tag 'powerpc-4.16-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/powernv: Support firmware disable of RFI flush powerpc/pseries: Support firmware disable of RFI flush powerpc/mm/drmem: Fix unexpected flag value in ibm,dynamic-memory-v2 powerpc/bpf/jit: Fix 32-bit JIT for seccomp_data access powerpc/pseries: Revert support for ibm,drc-info devtree property powerpc/pseries: Fix duplicate firmware feature for DRC_INFO ocxl: Fix potential bad errno on irq allocation powerpc/eeh: Fix crashes in eeh_report_resume()
2018-02-24block: kyber: fix domain token leak during requeueMing Lei1-0/+1
When requeuing request, the domain token should have been freed before re-inserting the request to io scheduler. Otherwise, the assigned domain token will be leaked, and IO hang can be caused. Cc: Paolo Valente <[email protected]> Cc: Omar Sandoval <[email protected]> Cc: [email protected] Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-02-24blk-mq: don't call io sched's .requeue_request when requeueing rq to ->dispatchMing Lei1-1/+3
__blk_mq_requeue_request() covers two cases: - one is that the requeued request is added to hctx->dispatch, such as blk_mq_dispatch_rq_list() - another case is that the request is requeued to io scheduler, such as blk_mq_requeue_request(). We should call io sched's .requeue_request callback only for the 2nd case. Cc: Paolo Valente <[email protected]> Cc: Omar Sandoval <[email protected]> Fixes: bd166ef183c2 ("blk-mq-sched: add framework for MQ capable IO schedulers") Cc: [email protected] Reviewed-by: Bart Van Assche <[email protected]> Acked-by: Paolo Valente <[email protected]> Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-02-24ALSA: usb-audio: Add a quirck for B&W PX headphonesErik Veijola1-0/+47
The capture interface doesn't work and the playback interface only supports 48 kHz sampling rate even though it advertises more rates. Signed-off-by: Erik Veijola <[email protected]> Cc: <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2018-02-24ALSA: hda: Add a power_save blacklistHans de Goede1-2/+36
On some boards setting power_save to a non 0 value leads to clicking / popping sounds when ever we enter/leave powersaving mode. Ideally we would figure out how to avoid these sounds, but that is not always feasible. This commit adds a blacklist for devices where powersaving is known to cause problems and disables it on these devices. Note I tried to put this blacklist in userspace first: https://github.com/systemd/systemd/pull/8128 But the systemd maintainers rightfully pointed out that it would be impossible to then later remove entries once we actually find a way to make power-saving work on listed boards without issues. Having this list in the kernel will allow removal of the blacklist entry in the same commit which fixes the clicks / plops. The blacklist only applies to the default power_save module-option value, if a user explicitly sets the module-option then the blacklist is not used. [ added an ifdef CONFIG_PM for the build error -- tiwai] BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1525104 BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=198611 Cc: [email protected] Signed-off-by: Hans de Goede <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2018-02-24ARM: dts: imx6dl: Include correct dtsi file for Engicam i.CoreM6 ↵Shyam Saini1-1/+1
DualLite/Solo RQS This patch fixes the wrongly included dtsi file which was breaking mainline support for Engicam i.CoreM6 DualLite/Solo RQS. As per the board name, the correct file should be imx6dl.dtsi instead of imx6q.dtsi Reported-by: Michael Trimarchi <[email protected]> Suggested-by: Jagan Teki <[email protected]> Signed-off-by: Shyam Saini <[email protected]> Reviewed-by: Fabio Estevam <[email protected]> Fixes: 7a9caba55a61 ("ARM: dts: imx6dl: Add Engicam i.CoreM6 DualLite/Solo RQS initial support") Signed-off-by: Shawn Guo <[email protected]>
2018-02-24KVM: SVM: Fix SEV LAUNCH_SECRET commandBrijesh Singh1-3/+7
The SEV LAUNCH_SECRET command fails with error code 'invalid param' because we missed filling the guest and header system physical address while issuing the command. Fixes: 9f5b5b950aa9 (KVM: SVM: Add support for SEV LAUNCH_SECRET command) Cc: Paolo Bonzini <[email protected]> Cc: Radim Krčmář <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: [email protected] Cc: Joerg Roedel <[email protected]> Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24KVM: SVM: install RSM interceptBrijesh Singh1-1/+10
RSM instruction is used by the SMM handler to return from SMM mode. Currently, rsm causes a #UD - which results in instruction fetch, decode, and emulate. By installing the RSM intercept we can avoid the instruction fetch since we know that #VMEXIT was due to rsm. The patch is required for the SEV guest, because in case of SEV guest memory is encrypted with guest-specific key and hypervisor will not able to fetch the instruction bytes from the guest memory. Cc: Paolo Bonzini <[email protected]> Cc: Radim Krčmář <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Tom Lendacky <[email protected]> Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24KVM: SVM: no need to call access_ok() in LAUNCH_MEASURE commandBrijesh Singh1-9/+7
Using the access_ok() to validate the input before issuing the SEV command does not buy us anything in this case. If userland is giving us a garbage pointer then copy_to_user() will catch it when we try to return the measurement. Suggested-by: Al Viro <[email protected]> Fixes: 0d0736f76347 (KVM: SVM: Add support for KVM_SEV_LAUNCH_MEASURE ...) Cc: Paolo Bonzini <[email protected]> Cc: Radim Krčmář <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: [email protected] Cc: Joerg Roedel <[email protected]> Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24include: psp-sev: Capitalize invalid length enumBrijesh Singh1-1/+1
Commit 1d57b17c60ff ("crypto: ccp: Define SEV userspace ioctl and command id") added the invalid length enum but we missed capitalizing it. Fixes: 1d57b17c60ff (crypto: ccp: Define SEV userspace ioctl ...) Cc: Paolo Bonzini <[email protected]> Cc: Radim Krčmář <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Tom Lendacky <[email protected]> CC: Gary R Hook <[email protected]> Cc: [email protected] Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24crypto: ccp: Fix sparse, use plain integer as NULL pointerBrijesh Singh1-4/+4
Fix sparse warning: Using plain integer as NULL pointer. Replaces assignment of 0 to pointer with NULL assignment. Cc: Paolo Bonzini <[email protected]> Cc: Radim Krčmář <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Gary Hook <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Brijesh Singh <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24KVM: X86: Avoid traversing all the cpus for pv tlb flush when steal time is ↵Wanpeng Li1-2/+4
disabled Avoid traversing all the cpus for pv tlb flush when steal time is disabled since pv tlb flush depends on the field in steal time for shared data. Cc: Paolo Bonzini <[email protected]> Cc: Radim KrÄmář <[email protected]> Signed-off-by: Wanpeng Li <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24x86/kvm: Make parse_no_xxx __init for kvmDou Liyang1-3/+3
The early_param() is only called during kernel initialization, So Linux marks the functions of it with __init macro to save memory. But it forgot to mark the parse_no_kvmapf/stealacc/kvmclock_vsyscall, So, Make them __init as well. Cc: Paolo Bonzini <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: [email protected] Signed-off-by: Dou Liyang <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24KVM: x86: fix backward migration with async_PFRadim Krčmář5-6/+13
Guests on new hypersiors might set KVM_ASYNC_PF_DELIVERY_AS_PF_VMEXIT bit when enabling async_PF, but this bit is reserved on old hypervisors, which results in a failure upon migration. To avoid breaking different cases, we are checking for CPUID feature bit before enabling the feature and nothing else. Fixes: 52a5c155cf79 ("KVM: async_pf: Let guest support delivery of async_pf from guest mode") Cc: <[email protected]> Reviewed-by: Wanpeng Li <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Signed-off-by: Radim Krčmář <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24kvm: fix warning for non-x86 buildsSebastian Ott2-3/+3
Fix the following sparse warning by moving the prototype of kvm_arch_mmu_notifier_invalidate_range() to linux/kvm_host.h . CHECK arch/s390/kvm/../../../virt/kvm/kvm_main.c arch/s390/kvm/../../../virt/kvm/kvm_main.c:138:13: warning: symbol 'kvm_arch_mmu_notifier_invalidate_range' was not declared. Should it be static? Signed-off-by: Sebastian Ott <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24kvm: fix warning for CONFIG_HAVE_KVM_EVENTFD buildsSebastian Ott1-1/+2
Move the kvm_arch_irq_routing_update() prototype outside of ifdef CONFIG_HAVE_KVM_EVENTFD guards to fix the following sparse warning: arch/s390/kvm/../../../virt/kvm/irqchip.c:171:28: warning: symbol 'kvm_arch_irq_routing_update' was not declared. Should it be static? Signed-off-by: Sebastian Ott <[email protected]> Acked-by: Christian Borntraeger <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24tools/kvm_stat: print 'Total' line for multiple events onlyStefan Raspl1-1/+1
The 'Total' line looks a bit weird when we have a single event only. This can happen e.g. due to filters. Therefore suppress when there's only a single event in the output. Signed-off-by: Stefan Raspl <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24tools/kvm_stat: group child events indented after parentStefan Raspl1-30/+59
We keep the current logic that sorts all events (parent and child), but re-shuffle the events afterwards, grouping the children after the respective parent. Note that the percentage column for child events gives the percentage of the parent's total. Since we rework the logic anyway, we modify the total average calculation to use the raw numbers instead of the (rounded) averages. Note that this can result in differing numbers (between total average and the sum of the individual averages) due to rounding errors. Signed-off-by: Stefan Raspl <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24tools/kvm_stat: separate drilldown and fields filteringStefan Raspl1-43/+100
Drilldown (i.e. toggle display of child trace events) was implemented by overriding the fields filter. This resulted in inconsistencies: E.g. when drilldown was not active, adding a filter that also matches child trace events would not only filter fields according to the filter, but also add in the child trace events matching the filter. E.g. on x86, setting 'kvm_userspace_exit' as the fields filter after startup would result in display of kvm_userspace_exit(DCR), although that wasn't previously present - not exactly what one would expect from a filter. This patch addresses the issue by keeping drilldown and fields filter separate. While at it, we also fix a PEP8 issue by adding a blank line at one place (since we're in the area...). We implement this by adding a framework that also allows to define a taxonomy among the debugfs events to identify child trace events. I.e. drilldown using 'x' can now also work with debugfs. A respective parent- child relationship is only known for S390 at the moment, but could be added adjusting other platforms' ARCH.dbg_is_child() methods accordingly. Signed-off-by: Stefan Raspl <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24tools/kvm_stat: eliminate extra guest/pid selection dialogStefan Raspl2-75/+39
We can do with a single dialog that takes both, pids and guest names. Note that we keep both interactive commands, 'p' and 'g' for now, to avoid confusion among users used to a specific key. While at it, we improve on some minor glitches regarding curses usage, e.g. cursor still visible when not supposed to be. Signed-off-by: Stefan Raspl <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24tools/kvm_stat: mark private methods as suchStefan Raspl1-66/+66
Helps quite a bit reading the code when it's obvious when a method is intended for internal use only. Signed-off-by: Stefan Raspl <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24tools/kvm_stat: fix debugfs handlingStefan Raspl1-14/+26
Te checks for debugfs assumed that debugfs is always mounted at /sys/kernel/debug - which is likely, but not guaranteed. This is addressed by checking /proc/mounts for the actual location. Furthermore, when debugfs was mounted, but the kvm module not loaded, a misleading error pointing towards debugfs not present was given. To reproduce, (a) run kvm_stat with debugfs mounted at a place different from /sys/kernel/debug (b) run kvm_stat with debugfs mounted but kvm module not loaded Signed-off-by: Stefan Raspl <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24tools/kvm_stat: print error on invalid regexStefan Raspl1-0/+3
Entering an invalid regular expression did not produce any indication of an error so far. To reproduce, press 'f' and enter 'foo(' (with an unescaped bracket). Signed-off-by: Stefan Raspl <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24tools/kvm_stat: fix crash when filtering out all non-child trace eventsStefan Raspl1-0/+6
When we apply a filter that will only leave child trace events, we receive a ZeroDivisionError when calculating the percentages. In that case, provide percentages based on child events only. To reproduce, run 'kvm_stat -f .*[\(].*'. Signed-off-by: Stefan Raspl <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24tools/kvm_stat: avoid 'is' for equality checksMarc Hartmayer1-2/+2
Use '==' for equality checks and 'is' when comparing identities. An example where '==' and 'is' behave differently: >>> a = 4242 >>> a == 4242 True >>> a is 4242 False Signed-off-by: Marc Hartmayer <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24tools/kvm_stat: use a more pythonic way to iterate over dictionariesMarc Hartmayer1-8/+8
If it's clear that the values of a dictionary will be used then use the '.items()' method. Signed-off-by: Marc Hartmayer <[email protected]> Tested-by: Stefan Raspl <[email protected]> [Include fix for logging mode by Stefan Raspl] Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24tools/kvm_stat: use a namedtuple for storing the valuesMarc Hartmayer1-12/+15
Use a namedtuple for storing the values as it allows to access the fields of a tuple via names. This makes the overall code much easier to read and to understand. Access by index is still possible as before. Signed-off-by: Marc Hartmayer <[email protected]> Tested-by: Stefan Raspl <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24tools/kvm_stat: simplify the sortkey functionMarc Hartmayer1-15/+8
The 'sortkey' function references a value in its enclosing scope (closure). This is not common practice for a sort key function so let's replace it. Additionally, the function 'sorted' has already a parameter for reversing the result therefore the inversion of the values is unneeded. The check for stats[x][1] is also superfluous as it's ensured that this value is initialized with 0. Signed-off-by: Marc Hartmayer <[email protected]> Tested-by: Stefan Raspl <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24KVM: X86: Fix SMRAM accessing even if VM is shutdownWanpeng Li1-1/+1
Reported by syzkaller: WARNING: CPU: 6 PID: 2434 at arch/x86/kvm/vmx.c:6660 handle_ept_misconfig+0x54/0x1e0 [kvm_intel] CPU: 6 PID: 2434 Comm: repro_test Not tainted 4.15.0+ #4 RIP: 0010:handle_ept_misconfig+0x54/0x1e0 [kvm_intel] Call Trace: vmx_handle_exit+0xbd/0xe20 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xdaf/0x1d50 [kvm] kvm_vcpu_ioctl+0x3e9/0x720 [kvm] do_vfs_ioctl+0xa4/0x6a0 SyS_ioctl+0x79/0x90 entry_SYSCALL_64_fastpath+0x25/0x9c The testcase creates a first thread to issue KVM_SMI ioctl, and then creates a second thread to mmap and operate on the same vCPU. This triggers a race condition when running the testcase with multiple threads. Sometimes one thread exits with a triple fault while another thread mmaps and operates on the same vCPU. Because CS=0x3000/IP=0x8000 is not mapped, accessing the SMI handler results in an EPT misconfig. This patch fixes it by returning RET_PF_EMULATE in kvm_handle_bad_page(), which will go on to cause an emulation failure and an exit with KVM_EXIT_INTERNAL_ERROR. Reported-by: syzbot+c1d9517cab094dae65e446c0c5b4de6c40f4dc58@syzkaller.appspotmail.com Cc: Paolo Bonzini <[email protected]> Cc: Radim Krčmář <[email protected]> Cc: [email protected] Signed-off-by: Wanpeng Li <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24KVM: nVMX: Don't halt vcpu when L1 is injecting events to L2Chao Gao1-1/+6
Although L2 is in halt state, it will be in the active state after VM entry if the VM entry is vectoring according to SDM 26.6.2 Activity State. Halting the vcpu here means the event won't be injected to L2 and this decision isn't reported to L1. Thus L0 drops an event that should be injected to L2. Cc: Liran Alon <[email protected]> Reviewed-by: Liran Alon <[email protected]> Signed-off-by: Chao Gao <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2018-02-24KVM: mmu: Fix overlap between public and private memslotsWanpeng Li1-2/+1
Reported by syzkaller: pte_list_remove: ffff9714eb1f8078 0->BUG ------------[ cut here ]------------ kernel BUG at arch/x86/kvm/mmu.c:1157! invalid opcode: 0000 [#1] SMP RIP: 0010:pte_list_remove+0x11b/0x120 [kvm] Call Trace: drop_spte+0x83/0xb0 [kvm] mmu_page_zap_pte+0xcc/0xe0 [kvm] kvm_mmu_prepare_zap_page+0x81/0x4a0 [kvm] kvm_mmu_invalidate_zap_all_pages+0x159/0x220 [kvm] kvm_arch_flush_shadow_all+0xe/0x10 [kvm] kvm_mmu_notifier_release+0x6c/0xa0 [kvm] ? kvm_mmu_notifier_release+0x5/0xa0 [kvm] __mmu_notifier_release+0x79/0x110 ? __mmu_notifier_release+0x5/0x110 exit_mmap+0x15a/0x170 ? do_exit+0x281/0xcb0 mmput+0x66/0x160 do_exit+0x2c9/0xcb0 ? __context_tracking_exit.part.5+0x4a/0x150 do_group_exit+0x50/0xd0 SyS_exit_group+0x14/0x20 do_syscall_64+0x73/0x1f0 entry_SYSCALL64_slow_path+0x25/0x25 The reason is that when creates new memslot, there is no guarantee for new memslot not overlap with private memslots. This can be triggered by the following program: #include <fcntl.h> #include <pthread.h> #include <setjmp.h> #include <signal.h> #include <stddef.h> #include <stdint.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <sys/ioctl.h> #include <sys/stat.h> #include <sys/syscall.h> #include <sys/types.h> #include <unistd.h> #include <linux/kvm.h> long r[16]; int main() { void *p = valloc(0x4000); r[2] = open("/dev/kvm", 0); r[3] = ioctl(r[2], KVM_CREATE_VM, 0x0ul); uint64_t addr = 0xf000; ioctl(r[3], KVM_SET_IDENTITY_MAP_ADDR, &addr); r[6] = ioctl(r[3], KVM_CREATE_VCPU, 0x0ul); ioctl(r[3], KVM_SET_TSS_ADDR, 0x0ul); ioctl(r[6], KVM_RUN, 0); ioctl(r[6], KVM_RUN, 0); struct kvm_userspace_memory_region mr = { .slot = 0, .flags = KVM_MEM_LOG_DIRTY_PAGES, .guest_phys_addr = 0xf000, .memory_size = 0x4000, .userspace_addr = (uintptr_t) p }; ioctl(r[3], KVM_SET_USER_MEMORY_REGION, &mr); return 0; } This patch fixes the bug by not adding a new memslot even if it overlaps with private memslots. Reported-by: Dmitry Vyukov <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Radim Krčmář <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Eric Biggers <[email protected]> Cc: [email protected] Signed-off-by: Wanpeng Li <[email protected]> --- virt/kvm/kvm_main.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)