aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-08-30tools/kvm_stat: don't reset stats when setting PID filter for debugfsStefan Raspl1-1/+0
When setting a PID filter in debugfs, we unnecessarily reset the statistics, although there is no reason to do so. This behavior was merely introduced with commit 9f114a03c6854f "tools/kvm_stat: add interactive command 'r'", most likely to mimic the behavior of the tracepoints provider in this respect. However, there are plenty of differences between the two providers, so there is no reason not to take advantage of the possibility to filter by PID without resetting the statistics. Signed-off-by: Stefan Raspl <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30tools/kvm_stat: fix updates for dead guestsStefan Raspl1-1/+10
With pid filtering active, when a guest is removed e.g. via virsh shutdown, successive updates produce garbage. Therefore, we add code to detect this case and prevent further body updates. Note that when displaying the help dialog via 'h' in this case, once we exit we're stuck with the 'Collecting data...' message till we remove the filter. Signed-off-by: Stefan Raspl <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30tools/kvm_stat: fix handling of invalid paths in debugfs providerStefan Raspl1-0/+8
When filtering by guest, kvm_stat displays garbage when the guest is destroyed - see sample output below. We add code to remove the invalid paths from the providers, so at least no more garbage is displayed. Here's a sample output to illustrate: kvm statistics - pid 13986 (foo) Event Total %Total CurAvg/s diagnose_258 -2 0.0 0 deliver_program_interruption -3 0.0 0 diagnose_308 -4 0.0 0 halt_poll_invalid -91 0.0 -6 deliver_service_signal -244 0.0 -16 halt_successful_poll -250 0.1 -17 exit_pei -285 0.1 -19 exit_external_request -312 0.1 -21 diagnose_9c -328 0.1 -22 userspace_handled -713 0.1 -47 halt_attempted_poll -939 0.2 -62 deliver_emergency_signal -3126 0.6 -208 halt_wakeup -7199 1.5 -481 exit_wait_state -7379 1.5 -493 diagnose_500 -56499 11.5 -3757 exit_null -85491 17.4 -5685 diagnose_44 -133300 27.1 -8874 exit_instruction -195898 39.8 -13037 Total -492063 Signed-off-by: Stefan Raspl <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30tools/kvm_stat: fix python3 issuesStefan Raspl1-3/+3
Python3 returns a float for a regular division - switch to a division operator that returns an integer. Furthermore, filters return a generator object instead of the actual list - wrap result in yet another list, which makes it still work in both, Python2 and 3. Signed-off-by: Stefan Raspl <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30vfs: add the fadvise() file operationAmir Goldstein3-33/+53
This is going to be used by overlayfs and possibly useful for other filesystems. Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2018-08-30Documentation/filesystems: update documentation of file_operationsAmir Goldstein1-2/+16
...to kernel 4.18. Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2018-08-30ovl: fix GPF in swapfile_activate of file from overlayfs over xfsAmir Goldstein2-3/+6
Since overlayfs implements stacked file operations, the underlying filesystems are not supposed to be exposed to the overlayfs file, whose f_inode is an overlayfs inode. Assigning an overlayfs file to swap_file results in an attempt of xfs code to dereference an xfs_inode struct from an ovl_inode pointer: CPU: 0 PID: 2462 Comm: swapon Not tainted 4.18.0-xfstests-12721-g33e17876ea4e #3402 RIP: 0010:xfs_find_bdev_for_inode+0x23/0x2f Call Trace: xfs_iomap_swapfile_activate+0x1f/0x43 __se_sys_swapon+0xb1a/0xee9 Fix this by not assigning the real inode mapping to f_mapping, which will cause swapon() to return an error (-EINVAL). Although it makes sense not to allow setting swpafile on an overlayfs file, some users may depend on it, so we may need to fix this up in the future. Keeping f_mapping pointing to overlay inode mapping will cause O_DIRECT open to fail. Fix this by installing ovl_aops with noop_direct_IO in overlay inode mapping. Keeping f_mapping pointing to overlay inode mapping will cause other a_ops related operations to fail (e.g. readahead()). Those will be fixed by follow up patches. Suggested-by: Miklos Szeredi <[email protected]> Fixes: f7c72396d0de ("ovl: add O_DIRECT support") Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2018-08-30ovl: respect FIEMAP_FLAG_SYNC flagAmir Goldstein1-0/+4
Stacked overlayfs fiemap operation broke xfstests that test delayed allocation (with "_test_generic_punch -d"), because ovl_fiemap() failed to write dirty pages when requested. Fixes: 9e142c4102db ("ovl: add ovl_fiemap()") Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2018-08-30KVM: x86: Unexport x86_emulate_instruction()Sean Christopherson3-15/+18
Allowing x86_emulate_instruction() to be called directly has led to subtle bugs being introduced, e.g. not setting EMULTYPE_NO_REEXECUTE in the emulation type. While most of the blame lies on re-execute being opt-out, exporting x86_emulate_instruction() also exposes its cr2 parameter, which may have contributed to commit d391f1207067 ("x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested") using x86_emulate_instruction() instead of emulate_instruction() because "hey, I have a cr2!", which in turn introduced its EMULTYPE_NO_REEXECUTE bug. Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30KVM: x86: Rename emulate_instruction() to kvm_emulate_instruction()Sean Christopherson4-17/+17
Lack of the kvm_ prefix gives the impression that it's a VMX or SVM specific function, and there's no conflict that prevents adding the kvm_ prefix. Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30KVM: x86: Do not re-{try,execute} after failed emulation in L2Sean Christopherson2-2/+11
Commit a6f177efaa58 ("KVM: Reenter guest after emulation failure if due to access to non-mmio address") added reexecute_instruction() to handle the scenario where two (or more) vCPUS race to write a shadowed page, i.e. reexecute_instruction() is intended to return true if and only if the instruction being emulated was accessing a shadowed page. As L0 is only explicitly shadowing L1 tables, an emulation failure of a nested VM instruction cannot be due to a race to write a shadowed page and so should never be re-executed. This fixes an issue where an "MMIO" emulation failure[1] in L2 is all but guaranteed to result in an infinite loop when TDP is enabled. Because "cr2" is actually an L2 GPA when TDP is enabled, calling kvm_mmu_gva_to_gpa_write() to translate cr2 in the non-direct mapped case (L2 is never direct mapped) will almost always yield UNMAPPED_GVA and cause reexecute_instruction() to immediately return true. The !mmio_info_in_cache() check in kvm_mmu_page_fault() doesn't catch this case because mmio_info_in_cache() returns false for a nested MMU (the MMIO caching currently handles L1 only, e.g. to cache nested guests' GPAs we'd have to manually flush the cache when switching between VMs and when L1 updated its page tables controlling the nested guest). Way back when, commit 68be0803456b ("KVM: x86: never re-execute instruction with enabled tdp") changed reexecute_instruction() to always return false when using TDP under the assumption that KVM would only get into the emulator for MMIO. Commit 95b3cf69bdf8 ("KVM: x86: let reexecute_instruction work for tdp") effectively reverted that behavior in order to handle the scenario where emulation failed due to an access from L1 to the shadow page tables for L2, but it didn't account for the case where emulation failed in L2 with TDP enabled. All of the above logic also applies to retry_instruction(), added by commit 1cb3f3ae5a38 ("KVM: x86: retry non-page-table writing instructions"). An indefinite loop in retry_instruction() should be impossible as it protects against retrying the same instruction over and over, but it's still correct to not retry an L2 instruction in the first place. Fix the immediate issue by adding a check for a nested guest when determining whether or not to allow retry in kvm_mmu_page_fault(). In addition to fixing the immediate bug, add WARN_ON_ONCE in the retry functions since they are not designed to handle nested cases, i.e. they need to be modified even if there is some scenario in the future where we want to allow retrying a nested guest. [1] This issue was encountered after commit 3a2936dedd20 ("kvm: mmu: Don't expose private memslots to L2") changed the page fault path to return KVM_PFN_NOSLOT when translating an L2 access to a prive memslot. Returning KVM_PFN_NOSLOT is semantically correct when we want to hide a memslot from L2, i.e. there effectively is no defined memory region for L2, but it has the unfortunate side effect of making KVM think the GFN is a MMIO page, thus triggering emulation. The failure occurred with in-development code that deliberately exposed a private memslot to L2, which L2 accessed with an instruction that is not emulated by KVM. Fixes: 95b3cf69bdf8 ("KVM: x86: let reexecute_instruction work for tdp") Fixes: 1cb3f3ae5a38 ("KVM: x86: retry non-page-table writing instructions") Signed-off-by: Sean Christopherson <[email protected]> Cc: Jim Mattson <[email protected]> Cc: Krish Sadhukhan <[email protected]> Cc: Xiao Guangrong <[email protected]> Cc: [email protected] Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30KVM: x86: Default to not allowing emulation retry in kvm_mmu_page_faultSean Christopherson1-6/+12
Effectively force kvm_mmu_page_fault() to opt-in to allowing retry to make it more obvious when and why it allows emulation to be retried. Previously this approach was less convenient due to retry and re-execute behavior being controlled by separate flags that were also inverted in their implementations (opt-in versus opt-out). Suggested-by: Paolo Bonzini <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Cc: [email protected] Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30KVM: x86: Merge EMULTYPE_RETRY and EMULTYPE_ALLOW_REEXECUTESean Christopherson3-7/+6
retry_instruction() and reexecute_instruction() are a package deal, i.e. there is no scenario where one is allowed and the other is not. Merge their controlling emulation type flags to enforce this in code. Name the combined flag EMULTYPE_ALLOW_RETRY to make it abundantly clear that we are allowing re{try,execute} to occur, as opposed to explicitly requesting retry of a previously failed instruction. Signed-off-by: Sean Christopherson <[email protected]> Cc: [email protected] Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30KVM: x86: Invert emulation re-execute behavior to make it opt-inSean Christopherson3-7/+5
Re-execution of an instruction after emulation decode failure is intended to be used only when emulating shadow page accesses. Invert the flag to make allowing re-execution opt-in since that behavior is by far in the minority. Signed-off-by: Sean Christopherson <[email protected]> Cc: [email protected] Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30KVM: x86: SVM: Set EMULTYPE_NO_REEXECUTE for RSM emulationSean Christopherson2-2/+9
Re-execution after an emulation decode failure is only intended to handle a case where two or vCPUs race to write a shadowed page, i.e. we should never re-execute an instruction as part of RSM emulation. Add a new helper, kvm_emulate_instruction_from_buffer(), to support emulating from a pre-defined buffer. This eliminates the last direct call to x86_emulate_instruction() outside of kvm_mmu_page_fault(), which means x86_emulate_instruction() can be unexported in a future patch. Fixes: 7607b7174405 ("KVM: SVM: install RSM intercept") Cc: Brijesh Singh <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Cc: [email protected] Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30KVM: VMX: Do not allow reexecute_instruction() when skipping MMIO instrSean Christopherson1-2/+2
Re-execution after an emulation decode failure is only intended to handle a case where two or vCPUs race to write a shadowed page, i.e. we should never re-execute an instruction as part of MMIO emulation. As handle_ept_misconfig() is only used for MMIO emulation, it should pass EMULTYPE_NO_REEXECUTE when using the emulator to skip an instr in the fast-MMIO case where VM_EXIT_INSTRUCTION_LEN is invalid. And because the cr2 value passed to x86_emulate_instruction() is only destined for use when retrying or reexecuting, we can simply call emulate_instruction(). Fixes: d391f1207067 ("x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested") Cc: Vitaly Kuznetsov <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Cc: [email protected] Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30KVM: SVM: remove unused variable dst_vaddr_endColin Ian King1-2/+1
Variable dst_vaddr_end is being assigned but is never used hence it is redundant and can be removed. Cleans up clang warning: variable 'dst_vaddr_end' set but not used [-Wunused-but-set-variable] Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30KVM: nVMX: avoid redundant double assignment of nested_run_pendingVitaly Kuznetsov1-3/+0
nested_run_pending is set 20 lines above and check_vmentry_prereqs()/ check_vmentry_postreqs() don't seem to be resetting it (the later, however, checks it). Signed-off-by: Vitaly Kuznetsov <[email protected]> Reviewed-by: Paolo Bonzini <[email protected]> Reviewed-by: Jim Mattson <[email protected]> Reviewed-by: Eduardo Valentin <[email protected]> Reviewed-by: Krish Sadhukhan <[email protected]> Signed-off-by: Radim Krčmář <[email protected]>
2018-08-30ALSA: hda - Fix cancel_work_sync() stall from jackpoll workTakashi Iwai1-1/+2
On AMD/ATI controllers, the HD-audio controller driver allows a bus reset upon the error recovery, and its procedure includes the cancellation of pending jack polling work as found in snd_hda_bus_codec_reset(). This works usually fine, but it becomes a problem when the reset happens from the jack poll work itself; then calling cancel_work_sync() from the work being processed tries to wait the finish endlessly. As a workaround, this patch adds the check of current_work() and applies the cancel_work_sync() only when it's not from the jackpoll_work. This doesn't fix the root cause of the reported error below, but at least, it eases the unexpected stall of the whole system. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200937 Cc: <[email protected]> Cc: Lukas Wunner <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2018-08-30arm/arm64: smccc-1.1: Handle function result as parametersMarc Zyngier1-10/+20
If someone has the silly idea to write something along those lines: extern u64 foo(void); void bar(struct arm_smccc_res *res) { arm_smccc_1_1_smc(0xbad, foo(), res); } they are in for a surprise, as this gets compiled as: 0000000000000588 <bar>: 588: a9be7bfd stp x29, x30, [sp, #-32]! 58c: 910003fd mov x29, sp 590: f9000bf3 str x19, [sp, #16] 594: aa0003f3 mov x19, x0 598: aa1e03e0 mov x0, x30 59c: 94000000 bl 0 <_mcount> 5a0: 94000000 bl 0 <foo> 5a4: aa0003e1 mov x1, x0 5a8: d4000003 smc #0x0 5ac: b4000073 cbz x19, 5b8 <bar+0x30> 5b0: a9000660 stp x0, x1, [x19] 5b4: a9010e62 stp x2, x3, [x19, #16] 5b8: f9400bf3 ldr x19, [sp, #16] 5bc: a8c27bfd ldp x29, x30, [sp], #32 5c0: d65f03c0 ret 5c4: d503201f nop The call to foo "overwrites" the x0 register for the return value, and we end up calling the wrong secure service. A solution is to evaluate all the parameters before assigning anything to specific registers, leading to the expected result: 0000000000000588 <bar>: 588: a9be7bfd stp x29, x30, [sp, #-32]! 58c: 910003fd mov x29, sp 590: f9000bf3 str x19, [sp, #16] 594: aa0003f3 mov x19, x0 598: aa1e03e0 mov x0, x30 59c: 94000000 bl 0 <_mcount> 5a0: 94000000 bl 0 <foo> 5a4: aa0003e1 mov x1, x0 5a8: d28175a0 mov x0, #0xbad 5ac: d4000003 smc #0x0 5b0: b4000073 cbz x19, 5bc <bar+0x34> 5b4: a9000660 stp x0, x1, [x19] 5b8: a9010e62 stp x2, x3, [x19, #16] 5bc: f9400bf3 ldr x19, [sp, #16] 5c0: a8c27bfd ldp x29, x30, [sp], #32 5c4: d65f03c0 ret Reported-by: Julien Grall <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2018-08-30scsi: qedi: Add the CRC size within iSCSI NVM imageNilesh Javali2-14/+21
The QED driver commit, 1ac4329a1cff ("qed: Add configuration information to register dump and debug data"), removes the CRC length validation causing nvm_get_image failure while loading qedi driver: [qed_mcp_get_nvm_image:2700(host_10-0)]Image [0] is too big - 00006008 bytes where only 00006004 are available [qedi_get_boot_info:2253]:10: Could not get NVM image. ret = -12 Hence add and adjust the CRC size to iSCSI NVM image to read boot info at qedi load time. Signed-off-by: Nilesh Javali <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-08-30scsi: iscsi: target: Fix conn_ops double freeMike Christie3-75/+77
If iscsi_login_init_conn fails it can free conn_ops. __iscsi_target_login_thread will then call iscsi_target_login_sess_out which will also free it. This fixes the problem by organizing conn allocation/setup into parts that are needed through the life of the conn and parts that are only needed for the login. The free functions then release what was allocated in the alloc functions. With this patch we have: iscsit_alloc_conn/iscsit_free_conn - allocs/frees the conn we need for the entire life of the conn. iscsi_login_init_conn/iscsi_target_nego_release - allocs/frees the parts of the conn that are only needed during login. Signed-off-by: Mike Christie <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-08-30scsi: iscsi: target: Set conn->sess to NULL when iscsi_login_set_conn_values ↵Vincent Pelletier1-5/+3
fails Fixes a use-after-free reported by KASAN when later iscsi_target_login_sess_out gets called and it tries to access conn->sess->se_sess: Disabling lock debugging due to kernel taint iSCSI Login timeout on Network Portal [::]:3260 iSCSI Login negotiation failed. ================================================================== BUG: KASAN: use-after-free in iscsi_target_login_sess_out.cold.12+0x58/0xff [iscsi_target_mod] Read of size 8 at addr ffff880109d070c8 by task iscsi_np/980 CPU: 1 PID: 980 Comm: iscsi_np Tainted: G O 4.17.8kasan.sess.connops+ #4 Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 05/19/2014 Call Trace: dump_stack+0x71/0xac print_address_description+0x65/0x22e ? iscsi_target_login_sess_out.cold.12+0x58/0xff [iscsi_target_mod] kasan_report.cold.6+0x241/0x2fd iscsi_target_login_sess_out.cold.12+0x58/0xff [iscsi_target_mod] iscsi_target_login_thread+0x1086/0x1710 [iscsi_target_mod] ? __sched_text_start+0x8/0x8 ? iscsi_target_login_sess_out+0x250/0x250 [iscsi_target_mod] ? __kthread_parkme+0xcc/0x100 ? parse_args.cold.14+0xd3/0xd3 ? iscsi_target_login_sess_out+0x250/0x250 [iscsi_target_mod] kthread+0x1a0/0x1c0 ? kthread_bind+0x30/0x30 ret_from_fork+0x35/0x40 Allocated by task 980: kasan_kmalloc+0xbf/0xe0 kmem_cache_alloc_trace+0x112/0x210 iscsi_target_login_thread+0x816/0x1710 [iscsi_target_mod] kthread+0x1a0/0x1c0 ret_from_fork+0x35/0x40 Freed by task 980: __kasan_slab_free+0x125/0x170 kfree+0x90/0x1d0 iscsi_target_login_thread+0x1577/0x1710 [iscsi_target_mod] kthread+0x1a0/0x1c0 ret_from_fork+0x35/0x40 The buggy address belongs to the object at ffff880109d06f00 which belongs to the cache kmalloc-512 of size 512 The buggy address is located 456 bytes inside of 512-byte region [ffff880109d06f00, ffff880109d07100) The buggy address belongs to the page: page:ffffea0004274180 count:1 mapcount:0 mapping:0000000000000000 index:0x0 compound_mapcount: 0 flags: 0x17fffc000008100(slab|head) raw: 017fffc000008100 0000000000000000 0000000000000000 00000001000c000c raw: dead000000000100 dead000000000200 ffff88011b002e00 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff880109d06f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff880109d07000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff880109d07080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff880109d07100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff880109d07180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ================================================================== Signed-off-by: Vincent Pelletier <[email protected]> [rebased against idr/ida changes and to handle ret review comments from Matthew] Signed-off-by: Mike Christie <[email protected]> Cc: Matthew Wilcox <[email protected]> Reviewed-by: Matthew Wilcox <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-08-30x86/asm: Use CC_SET()/CC_OUT() in __gen_sigismember()Uros Bizjak1-3/+4
Replace open-coded set instructions with CC_SET()/CC_OUT(). Signed-off-by: Uros Bizjak <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-08-30x86/alternatives: Lockdep-enforce text_mutex in text_poke*()Jiri Kosina1-4/+5
text_poke() and text_poke_bp() must be called with text_mutex held. Put proper lockdep anotation in place instead of just mentioning the requirement in a comment. Reported-by: Peter Zijlstra <[email protected]> Signed-off-by: Jiri Kosina <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Cc: Andy Lutomirski <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-08-30objtool: Remove workaround for unreachable warnings from old GCCMasahiro Yamada1-2/+0
Commit cafa0010cd51 ("Raise the minimum required gcc version to 4.6") bumped the minimum GCC version to 4.6 for all architectures. This effectively reverts commit da541b20021c ("objtool: Skip unreachable warnings for GCC 4.4 and older"), which was a workaround for GCC 4.4 or older. Signed-off-by: Masahiro Yamada <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Michal Marek <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-08-30notifier: Remove notifier header file wherever not usedMukesh Ojha7-7/+0
The conversion of the hotplug notifiers to a state machine left the notifier.h includes around in some places. Remove them. Signed-off-by: Mukesh Ojha <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-08-30watchdog: Mark watchdog touch functions as notraceVincent Whitchurch3-4/+4
Some architectures need to use stop_machine() to patch functions for ftrace, and the assumption is that the stopped CPUs do not make function calls to traceable functions when they are in the stopped state. Commit ce4f06dcbb5d ("stop_machine: Touch_nmi_watchdog() after MULTI_STOP_PREPARE") added calls to the watchdog touch functions from the stopped CPUs and those functions lack notrace annotations. This leads to crashes when enabling/disabling ftrace on ARM kernels built with the Thumb-2 instruction set. Fix it by adding the necessary notrace annotations. Fixes: ce4f06dcbb5d ("stop_machine: Touch_nmi_watchdog() after MULTI_STOP_PREPARE") Signed-off-by: Vincent Whitchurch <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-08-30x86/entry/64: Wipe KASAN stack shadow before rewind_stack_do_exit()Jann Horn1-0/+4
Reset the KASAN shadow state of the task stack before rewinding RSP. Without this, a kernel oops will leave parts of the stack poisoned, and code running under do_exit() can trip over such poisoned regions and cause nonsensical false-positive KASAN reports about stack-out-of-bounds bugs. This does not wipe the exception stacks; if an oops happens on an exception stack, it might result in random KASAN false-positives from other tasks afterwards. This is probably relatively uninteresting, since if the kernel oopses on an exception stack, there are most likely bigger things to worry about. It'd be more interesting if vmapped stacks and KASAN were compatible, since then handle_stack_overflow() would oops from exception stack context. Fixes: 2deb4be28077 ("x86/dumpstack: When OOPSing, rewind the stack before do_exit()") Signed-off-by: Jann Horn <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Andrey Ryabinin <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Kees Cook <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-08-30x86/irqflags: Mark native_restore_fl extern inlineNick Desaulniers1-1/+2
This should have been marked extern inline in order to pick up the out of line definition in arch/x86/kernel/irqflags.S. Fixes: 208cbb325589 ("x86/irqflags: Provide a declaration for native_save_fl") Reported-by: Ben Hutchings <[email protected]> Signed-off-by: Nick Desaulniers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-08-30x86/build: Remove jump label quirk for GCC older than 4.5.2Masahiro Yamada2-16/+0
Commit cafa0010cd51 ("Raise the minimum required gcc version to 4.6") bumped the minimum GCC version to 4.6 for all architectures. Remove the workaround code. It was the only user of cc-if-fullversion. Remove the macro as well. Signed-off-by: Masahiro Yamada <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Michal Marek <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-08-30mac80211: always account for A-MSDU header changesJohannes Berg1-5/+7
In the error path of changing the SKB headroom of the second A-MSDU subframe, we would not account for the already-changed length of the first frame that just got converted to be in A-MSDU format and thus is a bit longer now. Fix this by doing the necessary accounting. It would be possible to reorder the operations, but that would make the code more complex (to calculate the necessary pad), and the headroom expansion should not fail frequently enough to make that worthwhile. Fixes: 6e0456b54545 ("mac80211: add A-MSDU tx support") Signed-off-by: Johannes Berg <[email protected]> Acked-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2018-08-30HID: hid-saitek: Add device ID for RAT 7 ContagionHarry Mallon2-0/+3
Signed-off-by: Harry Mallon <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2018-08-30mac80211: do not convert to A-MSDU if frag/subframe limitedLorenzo Bianconi1-3/+3
Do not start to aggregate packets in a A-MSDU frame (converting the first subframe to A-MSDU, adding the header) if max_tx_fragments or max_amsdu_subframes limits are already exceeded by it. In particular, this happens when drivers set the limit to 1 to avoid A-MSDUs at all. Signed-off-by: Lorenzo Bianconi <[email protected]> [reword commit message to be more precise] Signed-off-by: Johannes Berg <[email protected]>
2018-08-30cfg80211: nl80211_update_ft_ies() to validate NL80211_ATTR_IEArunk Khandavalli1-0/+1
nl80211_update_ft_ies() tried to validate NL80211_ATTR_IE with is_valid_ie_attr() before dereferencing it, but that helper function returns true in case of NULL pointer (i.e., attribute not included). This can result to dereferencing a NULL pointer. Fix that by explicitly checking that NL80211_ATTR_IE is included. Fixes: 355199e02b83 ("cfg80211: Extend support for IEEE 802.11r Fast BSS Transition") Signed-off-by: Arunk Khandavalli <[email protected]> Signed-off-by: Jouni Malinen <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2018-08-29Merge branch 'net_sched-reject-unknown-tcfa_action-values'David S. Miller2-5/+59
Paolo Abeni says: ==================== net_sched: reject unknown tcfa_action values As agreed some time ago, this changeset reject unknown tcfa_action values, instead of changing such values under the hood. A tdc test is included to verify the new behavior. v1 -> v2: - helper is now static and renamed according to act_* convention - updated extack message, according to the new behavior ==================== Reviewed-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-08-29tc-testing: add test-cases for numeric and invalid control actionPaolo Abeni1-0/+48
Only the police action allows us to specify an arbitrary numeric value for the control action. This change introduces an explicit test case for the above feature and then leverage it for testing the kernel behavior for invalid control actions (reject). Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-08-29net_sched: reject unknown tcfa_action valuesPaolo Abeni1-5/+11
After the commit 802bfb19152c ("net/sched: user-space can't set unknown tcfa_action values"), unknown tcfa_action values are converted to TC_ACT_UNSPEC, but the common agreement is instead rejecting such configurations. This change also introduces a helper to simplify the destruction of a single action, avoiding code duplication. v1 -> v2: - helper is now static and renamed according to act_* convention - updated extack message, according to the new behavior Fixes: 802bfb19152c ("net/sched: user-space can't set unknown tcfa_action values") Signed-off-by: Paolo Abeni <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-08-29net: mvpp2: initialize port of_node pointerBaruch Siach1-0/+1
Without a valid of_node in struct device we can't find the mvpp2 port device by its DT node. Specifically, this breaks of_find_net_device_by_node(). For example, the Armada 8040 based Clearfog GT-8K uses Marvell 88E6141 switch connected to the &cp1_eth2 port: &cp1_mdio { ... switch0: switch0@4 { compatible = "marvell,mv88e6085"; ... ports { ... port@5 { reg = <5>; label = "cpu"; ethernet = <&cp1_eth2>; }; }; }; }; Without this patch, dsa_register_switch() returns -EPROBE_DEFER because of_find_net_device_by_node() can't find the device_node of the &cp1_eth2 device. Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: Baruch Siach <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-08-30drm/i915/gvt: Fix drm_format_mod value for vGPU planeZhenyu Wang3-11/+29
Physical plane's tiling mode value is given directly as drm_format_mod for plane query, which is not correct fourcc code. Fix it by using correct intel tiling fourcc mod definition. Current qemu seems also doesn't correctly utilize drm_format_mod for plane object setting. Anyway this is required to fix the usage. v3: use DRM_FORMAT_MOD_LINEAR, fix comment v2: Fix missed old 'tiled' use for stride calculation Fixes: e546e281d33d ("drm/i915/gvt: Dmabuf support for GVT-g") Cc: Tina Zhang <[email protected]> Cc: Gerd Hoffmann <[email protected]> Cc: Colin Xu <[email protected]> Reviewed-by: Colin Xu <[email protected]> Signed-off-by: Zhenyu Wang <[email protected]>
2018-08-30drm/i915/gvt: move intel_runtime_pm_get out of spin_lock in stop_scheduleHang Yuan2-2/+3
pm_runtime_get_sync in intel_runtime_pm_get might sleep if i915 device is not active. When stop vgpu schedule, the device may be inactive. So need to move runtime_pm_get out of spin_lock/unlock. Fixes: b24881e0b0b6("drm/i915/gvt: Add runtime_pm_get/put into gvt_switch_mmio Cc: <[email protected]> Signed-off-by: Hang Yuan <[email protected]> Signed-off-by: Xiong Zhang <[email protected]> Signed-off-by: Zhenyu Wang <[email protected]>
2018-08-30drm/i915/gvt: Handle GEN9_WM_CHICKEN3 with F_CMD_ACCESS.Colin Xu1-1/+3
Recent patch introduce strict check on scanning cmd: Commit 8d458ea0ec33 ("drm/i915/gvt: return error on cmd access") Before 8d458ea0ec33, if cmd_reg_handler() checks that a cmd access a mmio that not marked as F_CMD_ACCESS, it simply returns 0 and log an error. Now it will return -EBADRQC which will cause the workload fail to submit. On BXT, i915 applies WaClearHIZ_WM_CHICKEN3 which will program GEN9_WM_CHICKEN3 by LRI when init wa ctx. If it has no F_CMD_ACCESS flag, vgpu will fail to start. Also add F_MODE_MASK since it's mode mask reg. v2: Refresh commit message to elaborate issue symptom in detail. v3: Make SKL_PLUS share same handling since GEN9_WM_CHICKEN3 should be F_CMD_ACCESS from HW aspect. (yan, zhenyu) Signed-off-by: Colin Xu <[email protected]> Acked-by: Zhao Yan <[email protected]> Signed-off-by: Zhenyu Wang <[email protected]>
2018-08-30drm/i915/gvt: Make correct handling to vreg BXT_PHY_CTL_FAMILYColin Xu1-3/+9
Guest kernel will write to BXT_PHY_CTL_FAMILY to reset DDI PHY and pull BXT_PHY_CTL to check PHY status. Previous handling will set/reset BXT_PHY_CTL of all PHYs at same time on receiving vreg write to some BXT_PHY_CTL_FAMILY. If some BXT_PHY_CTL is already enabled, following reset to another BXT_PHY_CTL_FAMILY will clear the enabled BXT_PHY_CTL, which result in guest kernel print: ----------------------------------- [drm:intel_ddi_get_hw_state [i915]] *ERROR* Port B enabled but PHY powered down? (PHY_CTL 00000000) ----------------------------------- The correct handling should operate BXT_PHY_CTL_FAMILY and BXT_PHY_CTL on the same DDI. v2: Use correct reg define. The naming looks confusing, however current i915_reg.h bind DPIO_PHY0 to _PHY_CTL_FAMILY_DDI and bind DPIO_PHY1 to _PHY_CTL_FAMILY_EDP, pairing to _BXT_PHY_CTL_DDI_A and _BXT_PHY_CTL_DDI_B respectively. v3: v2 incorrectly map _PHY_CTL_FAMILY_EDP to _BXT_PHY_CTL_DDI_A. BXT_PHY_CTL() looks up DDI using PORTx but not PHYx. Based on DPIO_PHY to DDI mapping, make correct vreg handle to BXT_PHY_CTL on receiving vreg write to BXT_PHY_CTL_FAMILY. (He, Min) Current mapping according to bxt_power_wells: dpio-common-a: >>> DPIO_PHY1 >>> BXT_DPIO_CMN_A_POWER_DOMAINS >>> POWER_DOMAIN_PORT_DDI_A_LANES >>> PORT_A dpio-common-bc: >>> DPIO_PHY0 >>> BXT_DPIO_CMN_BC_POWER_DOMAINS >>> POWER_DOMAIN_PORT_DDI_B_LANES | POWER_DOMAIN_PORT_DDI_C_LANES >>> PORT_B or PORT_C Signed-off-by: Colin Xu <[email protected]> Reviewed-by: He, Min <[email protected]> Signed-off-by: Zhenyu Wang <[email protected]>
2018-08-30drm/i915/gvt: emulate gen9 dbuf ctl register accessXiaolin Zhang1-2/+15
there is below call track at boot time when booting guest with kabylake vgpu with specifal configuration and this try to fix it. [drm:gen9_dbuf_enable [i915]] *ERROR* DBuf power enable timeout ------------[ cut here ]------------ WARNING: gen9_dc_off_power_well_enable+0x224/0x230 [i915] Unexpected DBuf power power state (0x8000000a) Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014 Call Trace: [<ffffffff99d24408>] dump_stack+0x19/0x1b [<ffffffff996926d8>] __warn+0xd8/0x100 [<ffffffff9969275f>] warn_slowpath_fmt+0x5f/0x80 [<ffffffffc07bbae4>] gen9_dc_off_power_well_enable+0x224/0x230 [i915] [<ffffffffc07ba9d2>] intel_power_well_enable+0x42/0x50 [i915] [<ffffffffc07baa6a>] __intel_display_power_get_domain+0x8a/0xb0 [i915] [<ffffffffc07bdb93>] intel_display_power_get+0x33/0x50 [i915] [<ffffffffc07bdf95>] intel_display_set_init_power+0x45/0x50 [i915] [<ffffffffc07be003>] intel_power_domains_init_hw+0x63/0x8a0 [i915] [<ffffffffc07995c3>] i915_driver_load+0xae3/0x1760 [i915] [<ffffffff99bd6580>] ? nvmem_register+0x500/0x500 [<ffffffffc07a476c>] i915_pci_probe+0x2c/0x50 [i915] [<ffffffff9999cfea>] local_pci_probe+0x4a/0xb0 [<ffffffff9999e729>] pci_device_probe+0x109/0x160 [<ffffffff99a79aa5>] driver_probe_device+0xc5/0x3e0 [<ffffffff99a79ea3>] __driver_attach+0x93/0xa0 [<ffffffff99a79e10>] ? __device_attach+0x50/0x50 [<ffffffff99a77645>] bus_for_each_dev+0x75/0xc0 [<ffffffff99a7941e>] driver_attach+0x1e/0x20 [<ffffffff99a78ec0>] bus_add_driver+0x200/0x2d0 [<ffffffff99a7a534>] driver_register+0x64/0xf0 [<ffffffff9999df65>] __pci_register_driver+0xa5/0xc0 [<ffffffffc0929000>] ? 0xffffffffc0928fff [<ffffffffc0929059>] i915_init+0x59/0x5c [i915] [<ffffffff9960210a>] do_one_initcall+0xba/0x240 [<ffffffff9971108c>] load_module+0x272c/0x2bc0 [<ffffffff9997b990>] ? ddebug_proc_write+0xf0/0xf0 [<ffffffff997115e5>] SyS_init_module+0xc5/0x110 [<ffffffff99d36795>] system_call_fastpath+0x1c/0x21 Signed-off-by: Xiaolin Zhang <[email protected]> Signed-off-by: Zhenyu Wang <[email protected]>
2018-08-29net: bcmgenet: use MAC link status for fixed phyDoug Berger2-2/+11
When using the fixed PHY with GENET (e.g. MOCA) the PHY link status can be determined from the internal link status captured by the MAC. This allows the PHY state machine to use the correct link state with the fixed PHY even if MAC link event interrupts are missed when the net device is opened. Fixes: 8d88c6ebb34c ("net: bcmgenet: enable MoCA link state change detection") Signed-off-by: Doug Berger <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-08-29net: stmmac: build the dwmac-socfpga platform driver for Stratix10Dinh Nguyen1-1/+1
The Stratix10 SoC is an AARCH64 based platform that shares the same ethernet controller that is on other SoCFPGA platforms. Build the platform driver. Signed-off-by: Dinh Nguyen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-08-29Merge branch 'ipv6-fix-error-path-of-inet6_init'David S. Miller2-5/+9
Sabrina Dubroca says: ==================== ipv6: fix error path of inet6_init() The error path of inet6_init() can trigger multiple kernel panics, mostly due to wrong ordering of cleanups. This series fixes those issues. ==================== Reviewed-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-08-29net: rtnl: return early from rtnl_unregister_all when protocol isn't registeredSabrina Dubroca1-0/+4
rtnl_unregister_all(PF_INET6) gets called from inet6_init in cases when no handler has been registered for PF_INET6 yet, for example if ip6_mr_init() fails. Abort and avoid a NULL pointer deref in that case. Example of panic (triggered by faking a failure of register_pernet_subsys): general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI [...] RIP: 0010:rtnl_unregister_all+0x17e/0x2a0 [...] Call Trace: ? rtnetlink_net_init+0x250/0x250 ? sock_unregister+0x103/0x160 ? kernel_getsockopt+0x200/0x200 inet6_init+0x197/0x20d Fixes: e2fddf5e96df ("[IPV6]: Make af_inet6 to check ip6_route_init return value.") Signed-off-by: Sabrina Dubroca <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-08-29ipv6: fix cleanup ordering for pingv6 registrationSabrina Dubroca1-2/+2
Commit 6d0bfe226116 ("net: ipv6: Add IPv6 support to the ping socket.") contains an error in the cleanup path of inet6_init(): when proto_register(&pingv6_prot, 1) fails, we try to unregister &pingv6_prot. When rawv6_init() fails, we skip unregistering &pingv6_prot. Example of panic (triggered by faking a failure of proto_register(&pingv6_prot, 1)): general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI [...] RIP: 0010:__list_del_entry_valid+0x79/0x160 [...] Call Trace: proto_unregister+0xbb/0x550 ? trace_preempt_on+0x6f0/0x6f0 ? sock_no_shutdown+0x10/0x10 inet6_init+0x153/0x1b8 Fixes: 6d0bfe226116 ("net: ipv6: Add IPv6 support to the ping socket.") Signed-off-by: Sabrina Dubroca <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-08-29ipv6: fix cleanup ordering for ip6_mr failureSabrina Dubroca1-3/+3
Commit 15e668070a64 ("ipv6: reorder icmpv6_init() and ip6_mr_init()") moved the cleanup label for ipmr_fail, but should have changed the contents of the cleanup labels as well. Now we can end up cleaning up icmpv6 even though it hasn't been initialized (jump to icmp_fail or ipmr_fail). Simply undo things in the reverse order of their initialization. Example of panic (triggered by faking a failure of icmpv6_init): kasan: GPF could be caused by NULL-ptr deref or user memory access general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI [...] RIP: 0010:__list_del_entry_valid+0x79/0x160 [...] Call Trace: ? lock_release+0x8a0/0x8a0 unregister_pernet_operations+0xd4/0x560 ? ops_free_list+0x480/0x480 ? down_write+0x91/0x130 ? unregister_pernet_subsys+0x15/0x30 ? down_read+0x1b0/0x1b0 ? up_read+0x110/0x110 ? kmem_cache_create_usercopy+0x1b4/0x240 unregister_pernet_subsys+0x1d/0x30 icmpv6_cleanup+0x1d/0x30 inet6_init+0x1b5/0x23f Fixes: 15e668070a64 ("ipv6: reorder icmpv6_init() and ip6_mr_init()") Signed-off-by: Sabrina Dubroca <[email protected]> Signed-off-by: David S. Miller <[email protected]>