aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-07-27brcmfmac: fix memleak due to calling brcmf_sdiod_sgtable_alloc() twiceArend Van Spriel1-5/+0
Due to a bugfix in wireless tree and the commit mentioned below a merge was needed which went haywire. So the submitted change resulted in the function brcmf_sdiod_sgtable_alloc() being called twice during the probe thus leaking the memory of the first call. Cc: [email protected] # 4.6.x Fixes: 4d7928959832 ("brcmfmac: switch to new platform data") Reported-by: Stefan Wahren <[email protected]> Tested-by: Stefan Wahren <[email protected]> Reviewed-by: Hante Meuleman <[email protected]> Signed-off-by: Arend van Spriel <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
2017-07-27brcmfmac: Don't grow SKB by negative sizeDaniel Stone1-1/+1
The commit to rework the headroom check in start_xmit() now calls pxskb_expand_head() unconditionally if the header is CoW. Unfortunately, it does so with the delta between the extant headroom and the header length, which may be negative if there is already sufficient headroom. pskb_expand_head() does allow for size being 0, in which case it just copies, so clamp the header delta to zero. Opening Chrome (and all my tabs) on a PCIE device was enough to reliably hit this. Fixes: 270a6c1f65fe ("brcmfmac: rework headroom check in .start_xmit()") Signed-off-by: Daniel Stone <[email protected]> Cc: Arend Van Spriel <[email protected]> Cc: James Hughes <[email protected]> Cc: Hante Meuleman <[email protected]> Cc: Pieter-Paul Giesberts <[email protected]> Cc: Franky Lin <[email protected]> Tested-by: Hans de Goede <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
2017-07-27drm/i915: Fix cursor updates on some platformsVille Syrjälä1-1/+11
Turns out that just writing CURPOS isn't sufficient to move the cursor on some platforms. My 830 works just fine, but eg. 945 and PNV don't. On those platforms we need to arm even the CURPOS update with a CURBASE write. Even worse, a write to any of the cursor register apart from CURBASE will cancel an already pending cursor update. So if we have armed a CURCNTR/CURBASE update, a subsequent CURPOS write prior to vblank would cancel that armed update. Thus we're left with a cursor that doesn't appear to move, or even change shape. Fix the problem by always performing the CURBASE write after a CURPOS write. Bspec is somewhat unclear which platforms actually require this CURBASE write and which don't. So to keep it simple and to make sure we really fix the problem across all supported devices, let's just perform the CURBASE write unconditionally. Cc: Paul Menzel <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101790 Fixes: 75343a44c901 ("drm/i915: Drop useless posting reads from cursor commit") Signed-off-by: Ville Syrjälä <[email protected]> Tested-by: Paul Menzel <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 8753d2bc5e49daad301ce65f5dada57ed924fad6) Signed-off-by: Daniel Vetter <[email protected]>
2017-07-27drm/i915: Fix user ptr check size in eb_relocate_vma()Imre Deak1-1/+1
Fix the sizeof(ptr) vs. sizeof(*ptr) typo. Fixes: 2889caa92321 ("drm/i915: Eliminate lots of iterations over the execobjects array") Cc: Chris Wilson <[email protected]> Cc: Joonas Lahtinen <[email protected]> Signed-off-by: Imre Deak <[email protected]> Reviewed-by: Chris Wilson <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit edd9003f7f9dddd28fdd768e6e7569d996c769cb) Signed-off-by: Daniel Vetter <[email protected]>
2017-07-27sctp: fix the check for _sctp_walk_params and _sctp_walk_errorsXin Long1-2/+2
Commit b1f5bfc27a19 ("sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}()") tried to fix the issue that it may overstep the chunk end for _sctp_walk_{params, errors} with 'chunk_end > offset(length) + sizeof(length)'. But it introduced a side effect: When processing INIT, it verifies the chunks with 'param.v == chunk_end' after iterating all params by sctp_walk_params(). With the check 'chunk_end > offset(length) + sizeof(length)', it would return when the last param is not yet accessed. Because the last param usually is fwdtsn supported param whose size is 4 and 'chunk_end == offset(length) + sizeof(length)' This is a badly issue even causing sctp couldn't process 4-shakes. Client would always get abort when connecting to server, due to the failure of INIT chunk verification on server. The patch is to use 'chunk_end <= offset(length) + sizeof(length)' instead of 'chunk_end < offset(length) + sizeof(length)' for both _sctp_walk_params and _sctp_walk_errors. Fixes: b1f5bfc27a19 ("sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}()") Signed-off-by: Xin Long <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-27dccp: fix a memleak for dccp_feat_init err processXin Long1-2/+5
In dccp_feat_init, when ccid_get_builtin_ccids failsto alloc memory for rx.val, it should free tx.val before returning an error. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-27dccp: fix a memleak that dccp_ipv4 doesn't put reqsk properlyXin Long1-0/+1
The patch "dccp: fix a memleak that dccp_ipv6 doesn't put reqsk properly" fixed reqsk refcnt leak for dccp_ipv6. The same issue exists on dccp_ipv4. This patch is to fix it for dccp_ipv4. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-27dccp: fix a memleak that dccp_ipv6 doesn't put reqsk properlyXin Long1-0/+1
In dccp_v6_conn_request, after reqsk gets alloced and hashed into ehash table, reqsk's refcnt is set 3. one is for req->rsk_timer, one is for hlist, and the other one is for current using. The problem is when dccp_v6_conn_request returns and finishes using reqsk, it doesn't put reqsk. This will cause reqsk refcnt leaks and reqsk obj never gets freed. Jianlin found this issue when running dccp_memleak.c in a loop, the system memory would run out. dccp_memleak.c: int s1 = socket(PF_INET6, 6, IPPROTO_IP); bind(s1, &sa1, 0x20); listen(s1, 0x9); int s2 = socket(PF_INET6, 6, IPPROTO_IP); connect(s2, &sa1, 0x20); close(s1); close(s2); This patch is to put the reqsk before dccp_v6_conn_request returns, just as what tcp_conn_request does. Reported-by: Jianlin Shi <[email protected]> Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-27powerpc/mm/hash: Free the subpage_prot_table correctlyAneesh Kumar K.V1-1/+1
Fixes: dad6f37c2602e ("powerpc: subpage_protect: Increase the array size to take care of 64TB") Signed-off-by: Aneesh Kumar K.V <[email protected]> Tested-by: Ram Pai <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-07-27drm: exynos: mark pm functions as __maybe_unusedArnd Bergmann1-4/+2
The rework of the exynos DRM clock handling introduced warnings for configurations that have CONFIG_PM disabled: drivers/gpu/drm/exynos/exynos_hdmi.c:736:13: error: 'hdmi_clk_disable_gates' defined but not used [-Werror=unused-function] static void hdmi_clk_disable_gates(struct hdmi_context *hdata) ^~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/exynos/exynos_hdmi.c:717:12: error: 'hdmi_clk_enable_gates' defined but not used [-Werror=unused-function] static int hdmi_clk_enable_gates(struct hdmi_context *hdata) The problem is that the PM functions themselves are inside of an #ifdef, but some functions they call are not. This patch removes the #ifdef and instead marks the PM functions as __maybe_unused, which is a more reliable way to get it right. Link: https://patchwork.kernel.org/patch/8436281/ Fixes: 9be7e9898444 ("drm/exynos/hdmi: clock code re-factoring") Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Inki Dae <[email protected]>
2017-07-27drm/exynos: select CEC_CORE if CEC_NOTIFIERHans Verkuil1-0/+1
If the s5p-cec driver is a module and the drm exynos driver is built-in, then the CEC core will be a module also, causing the CEC notifier to fail (will be compiled as empty functions). To prevent this select CEC_CORE if CEC_NOTIFIER is set to ensure the CEC core is also built into the kernel. Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Inki Dae <[email protected]>
2017-07-27drm/exynos/hdmi: fix disable sequenceAndrzej Hajda1-2/+0
The "Fixes" patch was incorrectly merged, as a result PHY is prematurely powered off and for example Odroid-U3 cannot disable TV power domain when HDMI cable is unplugged. Signed-off-by: Andrzej Hajda <[email protected]> Reported-by: Marek Szyprowski <[email protected]> Fixes: 625e63e2 ("drm/exynos/hdmi: fix pipeline disable order") Tested-by: Marek Szyprowski <[email protected]> Signed-off-by: Inki Dae <[email protected]>
2017-07-27drm/exynos: mic: add a bridge at probeInki Dae1-9/+15
This patch moves drm_bridge_add call into probe. It doesn't need to call drm_bridge_add call every time bind callback is called. Changelog v2 - moved drm_bridge_remove call into remove callback. - corrected description. Suggested-by: Andrzej Hajda <[email protected]> Reviewed-by: Andrzej Hajda <[email protected]> Reviewed-by: Hoegeun Kwon <[email protected]> Signed-off-by: Inki Dae <[email protected]>
2017-07-27drm/exynos/dsi: Remove error handling for bridge_node DT parsingHoegeun Kwon1-2/+0
Remove the error handling of bridge_node because the bridge_node is optional. For example, In case of Exynos SoC, a bridge device such as mDNIe and MIC could be placed between Display Controller and MIPI DSI device but the bridge device is optional. Signed-off-by: Hoegeun Kwon <[email protected]> Signed-off-by: Inki Dae <[email protected]>
2017-07-27drm/exynos: dsi: do not try to find bridgeInki Dae1-3/+5
It doesn't need to try to find a bridge if bridge node doesn't exist. Reviewed-by: Shuah Khan <[email protected]> Tested-by: Shuah Khan <[email protected]> Signed-off-by: Inki Dae <[email protected]>
2017-07-27drm: exynos: hdmi: make of_device_ids const.Arvind Yadav1-1/+1
of_device_ids are not supposed to change at runtime. All functions working with of_device_ids provided by <linux/of.h> work with const of_device_ids. So mark the non-const structs as const. File size before: text data bss dec hex filename 12294 1192 0 13486 34ae drivers/gpu/drm/exynos/exynos_hdmi.o File size after constify hdmi_match_types. text data bss dec hex filename 13318 176 0 13494 34b6 drivers/gpu/drm/exynos/exynos_hdmi.o Signed-off-by: Arvind Yadav <[email protected]> Reviewed-by: Andrzej Hajda <[email protected]> Signed-off-by: Inki Dae <[email protected]>
2017-07-27drm: exynos: constify mixer_match_types and *_mxr_drv_data.Arvind Yadav1-5/+5
File size before: text data bss dec hex filename 9983 1424 0 11407 2c8f drivers/gpu/drm/exynos/exynos_mixer.o File size after constify: text data bss dec hex filename 11231 176 0 11407 2c8f drivers/gpu/drm/exynos/exynos_mixer.o Signed-off-by: Arvind Yadav <[email protected]> Reviewed-by: Andrzej Hajda <[email protected]> Signed-off-by: Inki Dae <[email protected]>
2017-07-27exynos_drm: Clean up duplicated assignment in exynos_drm_driverGabriel Krisman Bertazi1-1/+0
num_ioctls is already assigned when declaring the exynos_drm_driver structure. No need to duplicate it here. Signed-off-by: Gabriel Krisman Bertazi <[email protected]> Reviewed-by: Andrzej Hajda <[email protected]> Signed-off-by: Inki Dae <[email protected]>
2017-07-26bpf: don't zero out the info struct in bpf_obj_get_info_by_fd()Jakub Kicinski2-3/+6
The buffer passed to bpf_obj_get_info_by_fd() should be initialized to zeros. Kernel will enforce that to guarantee we can safely extend info structures in the future. Making the bpf_obj_get_info_by_fd() call in libbpf perform the zeroing is problematic, however, since some members of the info structures may need to be initialized by the callers (for instance pointers to buffers to which kernel is to dump translated and jited images). Remove the zeroing and fix up the in-tree callers before any kernel has been released with this code. As Daniel points out this seems to be the intended operation anyway, since commit 95b9afd3987f ("bpf: Test for bpf ID") is itself setting the buffer pointers before calling bpf_obj_get_info_by_fd(). Signed-off-by: Jakub Kicinski <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-26netpoll: Fix device name check in netpoll_setup()Matthias Kaehlcke1-1/+1
Apparently netpoll_setup() assumes that netpoll.dev_name is a pointer when checking if the device name is set: if (np->dev_name) { ... However the field is a character array, therefore the condition always yields true. Check instead whether the first byte of the array has a non-zero value. Signed-off-by: Matthias Kaehlcke <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-27Merge branch 'drm-fixes-4.13' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie3-23/+23
into drm-fixes Three misc amd fixes. * 'drm-fixes-4.13' of git://people.freedesktop.org/~agd5f/linux: drm/amd/powerplay: fix AVFS voltage offset for Vega10 drm/amdgpu/gfx9: simplify and fix GRBM index selection drm/amdgpu: Fix blocking in RCU critical section(v2)
2017-07-26NFS: Use raw NFS access mask in nfs4_opendata_access()Anna Schumaker1-4/+8
Commit bd8b2441742b ("NFS: Store the raw NFS access mask in the inode's access cache") changed how the access results are stored after an access() call. An NFS v4 OPEN might have access bits returned with the opendata, so we should use the NFS4_ACCESS values when determining the return value in nfs4_opendata_access(). Fixes: bd8b2441742b ("NFS: Store the raw NFS access mask in the inode's access cache") Reported-by: Eryu Guan <[email protected]> Signed-off-by: Anna Schumaker <[email protected]> Tested-by: Takashi Iwai <[email protected]>
2017-07-26bonding: commit link status change after proposeWANG Cong1-0/+2
Commit de77ecd4ef02 ("bonding: improve link-status update in mii-monitoring") moves link status commitment into bond_mii_monitor(), but it still relies on the return value of bond_miimon_inspect() as the hint. We need to return non-zero as long as we propose a link status change. Fixes: de77ecd4ef02 ("bonding: improve link-status update in mii-monitoring") Reported-by: Benjamin Gilbert <[email protected]> Tested-by: Benjamin Gilbert <[email protected]> Cc: Mahesh Bandewar <[email protected]> Signed-off-by: Cong Wang <[email protected]> Acked-by: Mahesh Bandewar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-26dm, dax: Make sure dm_dax_flush() is called if device supports itVivek Goyal3-0/+42
Currently dm_dax_flush() is not being called, even if underlying dax device supports write cache, because DAXDEV_WRITE_CACHE is not being propagated up to the DM dax device. If the underlying dax device supports write cache, set DAXDEV_WRITE_CACHE on the DM dax device. This will cause dm_dax_flush() to be called. Fixes: abebfbe2f7 ("dm: add ->flush() dax operation support") Signed-off-by: Vivek Goyal <[email protected]> Acked-by: Dan Williams <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2017-07-26dm verity fec: fix GFP flags used with mempool_alloc()NeilBrown1-16/+5
mempool_alloc() cannot fail for GFP_NOIO allocation, so there is no point testing for failure. One place the code tested for failure was passing "0" as the GFP flags. This is most unusual and is probably meant to be GFP_NOIO, so that is changed. Also, allocation from ->extra_pool and ->prealloc_pool are repeated before releasing the previous allocation. This can deadlock if the code is servicing a write under high memory pressure. To avoid deadlocks, change these to use GFP_NOWAIT and leave the error handling in place. Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2017-07-26dm zoned: use GFP_NOIO in I/O pathDamien Le Moal3-9/+9
Use GFP_NOIO for memory allocations in the I/O path. Other memory allocations in the initialization path can use GFP_KERNEL. Reported-by: Mikulas Patocka <[email protected]> Signed-off-by: Damien Le Moal <[email protected]> Reviewed-by: Mikulas Patocka <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2017-07-26Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds3-22/+15
Pull virtio fixes and cleanups from Michael Tsirkin: "Fixes some minor issues all over the codebase" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio-net: fix module unloading virtio-balloon: coding format cleanup virtio-balloon: deflate via a page list virtio_blk: Use sysfs_match_string() helper
2017-07-26Merge branch 'kvm-ppc-fixes' of ↵Paolo Bonzini2-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master Two commits which fix host crashes. Signed-off-by: Paolo BOnzini <[email protected]>
2017-07-26KVM: LAPIC: Fix reentrancy issues with preempt notifiersWanpeng Li1-3/+14
Preempt can occur in the preemption timer expiration handler: CPU0 CPU1 preemption timer vmexit handle_preemption_timer(vCPU0) kvm_lapic_expired_hv_timer hv_timer_is_use == true sched_out sched_in kvm_arch_vcpu_load kvm_lapic_restart_hv_timer restart_apic_timer start_hv_timer already-expired timer or sw timer triggerd in the window start_sw_timer cancel_hv_timer /* back in kvm_lapic_expired_hv_timer */ cancel_hv_timer WARN_ON(!apic->lapic_timer.hv_timer_in_use); ==> Oops This can be reproduced if CONFIG_PREEMPT is enabled. ------------[ cut here ]------------ WARNING: CPU: 4 PID: 2972 at /home/kernel/linux/arch/x86/kvm//lapic.c:1563 kvm_lapic_expired_hv_timer+0x9e/0xb0 [kvm] CPU: 4 PID: 2972 Comm: qemu-system-x86 Tainted: G OE 4.13.0-rc2+ #16 RIP: 0010:kvm_lapic_expired_hv_timer+0x9e/0xb0 [kvm] Call Trace: handle_preemption_timer+0xe/0x20 [kvm_intel] vmx_handle_exit+0xb8/0xd70 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xdd1/0x1be0 [kvm] ? kvm_arch_vcpu_load+0x47/0x230 [kvm] ? kvm_arch_vcpu_load+0x62/0x230 [kvm] kvm_vcpu_ioctl+0x340/0x700 [kvm] ? kvm_vcpu_ioctl+0x340/0x700 [kvm] ? __fget+0xfc/0x210 do_vfs_ioctl+0xa4/0x6a0 ? __fget+0x11d/0x210 SyS_ioctl+0x79/0x90 do_syscall_64+0x81/0x220 entry_SYSCALL64_slow_path+0x25/0x25 ------------[ cut here ]------------ WARNING: CPU: 4 PID: 2972 at /home/kernel/linux/arch/x86/kvm//lapic.c:1498 cancel_hv_timer.isra.40+0x4f/0x60 [kvm] CPU: 4 PID: 2972 Comm: qemu-system-x86 Tainted: G W OE 4.13.0-rc2+ #16 RIP: 0010:cancel_hv_timer.isra.40+0x4f/0x60 [kvm] Call Trace: kvm_lapic_expired_hv_timer+0x3e/0xb0 [kvm] handle_preemption_timer+0xe/0x20 [kvm_intel] vmx_handle_exit+0xb8/0xd70 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xdd1/0x1be0 [kvm] ? kvm_arch_vcpu_load+0x47/0x230 [kvm] ? kvm_arch_vcpu_load+0x62/0x230 [kvm] kvm_vcpu_ioctl+0x340/0x700 [kvm] ? kvm_vcpu_ioctl+0x340/0x700 [kvm] ? __fget+0xfc/0x210 do_vfs_ioctl+0xa4/0x6a0 ? __fget+0x11d/0x210 SyS_ioctl+0x79/0x90 do_syscall_64+0x81/0x220 entry_SYSCALL64_slow_path+0x25/0x25 This patch fixes it by making the caller of cancel_hv_timer, start_hv_timer and start_sw_timer be in preemption-disabled regions, which trivially avoid any reentrancy issue with preempt notifier. Cc: Paolo Bonzini <[email protected]> Cc: Radim Krčmář <[email protected]> Signed-off-by: Wanpeng Li <[email protected]> [Add more WARNs. - Paolo] Signed-off-by: Paolo Bonzini <[email protected]>
2017-07-26tools/kvm_stat: add '-f help' to get the available event listLin Ma1-2/+14
Signed-off-by: Lin Ma <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2017-07-26tools/kvm_stat: use variables instead of hard paths in help outputLin Ma1-3/+3
Using variables instead of hard paths makes the requirements information more accurate. Signed-off-by: Lin Ma <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2017-07-26Merge tag 'kvm-s390-master-4.13-1' of ↵Paolo Bonzini1-2/+6
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: fixup missing srcu lock We need to hold the srcu lock when accessing memory slots during migration Signed-off-by: Paolo Bonzini <[email protected]>
2017-07-26KVM: nVMX: Fix loss of L2's NMI blocking stateWanpeng Li1-0/+2
Run kvm-unit-tests/eventinj.flat in L1 w/ ept=0 on both L0 and L1: Before NMI IRET test Sending NMI to self NMI isr running stack 0x461000 Sending nested NMI to self After nested NMI to self Nested NMI isr running rip=40038e After iret After NMI to self FAIL: NMI Commit 4c4a6f790ee862 (KVM: nVMX: track NMI blocking state separately for each VMCS) tracks NMI blocking state separately for vmcs01 and vmcs02. However it is not enough: - The L2 (kvm-unit-tests/eventinj.flat) generates NMI that will fault on IRET, so the L2 can generate #PF which can be intercepted by L0. - L0 walks L1's guest page table and sees the mapping is invalid, it resumes the L1 guest and injects the #PF into L1. At this point the vmcs02 has nmi_known_unmasked=true. - L1 sets set bit 3 (blocking by NMI) in the interruptibility-state field of vmcs12 (and fixes the shadow page table) before resuming L2 guest. - L1 executes VMRESUME to resume L2, causing a vmexit to L0 - during VMRESUME emulation, prepare_vmcs02 sets bit 3 in the interruptibility-state field of vmcs02, but nmi_known_unmasked is still true. - L2 immediately exits to L0 with another page fault, because L0 still has not updated the NGVA->HPA page tables. However, nmi_known_unmasked is true so vmx_recover_nmi_blocking does not do anything. The fix is to update nmi_known_unmasked when preparing vmcs02 from vmcs12. Cc: Paolo Bonzini <[email protected]> Cc: Radim Krčmář <[email protected]> Signed-off-by: Wanpeng Li <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2017-07-26KVM: nVMX: Fix posted intr delivery when vcpu is in guest modeWincy Van1-11/+11
The PI vector for L0 and L1 must be different. If dest vcpu0 is in guest mode while vcpu1 is delivering a non-nested PI to vcpu0, there wont't be any vmexit so that the non-nested interrupt will be delayed. Signed-off-by: Wincy Van <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2017-07-26x86: irq: Define a global vector for nested posted interruptsWincy Van7-1/+29
We are using the same vector for nested/non-nested posted interrupts delivery, this may cause interrupts latency in L1 since we can't kick the L2 vcpu out of vmx-nonroot mode. This patch introduces a new vector which is only for nested posted interrupts to solve the problems above. Signed-off-by: Wincy Van <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2017-07-26KVM: x86: do mask out upper bits of PAE CR3Paolo Bonzini1-2/+2
This reverts the change of commit f85c758dbee54cc3612a6e873ef7cecdb66ebee5, as the behavior it modified was intended. The VM is running in 32-bit PAE mode, and Table 4-7 of the Intel manual says: Table 4-7. Use of CR3 with PAE Paging Bit Position(s) Contents 4:0 Ignored 31:5 Physical address of the 32-Byte aligned page-directory-pointer table used for linear-address translation 63:32 Ignored (these bits exist only on processors supporting the Intel-64 architecture) To placate the static checker, write the mask explicitly as an unsigned long constant instead of using a 32-bit unsigned constant. Cc: Dan Carpenter <[email protected]> Fixes: f85c758dbee54cc3612a6e873ef7cecdb66ebee5 Signed-off-by: Paolo Bonzini <[email protected]>
2017-07-26KVM: make pid available for uevents without debugfsClaudio Imbrenda2-23/+13
Simplify and improve the code so that the PID is always available in the uevent even when debugfs is not available. This adds a userspace_pid field to struct kvm, as per Radim's suggestion, so that the PID can be retrieved on destruction too. Acked-by: Janosch Frank <[email protected]> Fixes: 286de8f6ac9202 ("KVM: trigger uevents when creating or destroying a VM") Signed-off-by: Claudio Imbrenda <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2017-07-26udp: unbreak build lacking CONFIG_XFRMPaolo Abeni1-1/+1
We must use pre-processor conditional block or suitable accessors to manipulate skb->sp elsewhere builds lacking the CONFIG_XFRM will break. Fixes: dce4551cb2ad ("udp: preserve head state for IP_CMSG_PASSSEC") Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-26nvme: validate admin queue before unquiesceScott Bauer1-1/+2
With a misbehaving controller it's possible we'll never enter the live state and create an admin queue. When we fail out of reset work it's possible we failed out early enough without setting up the admin queue. We tear down queues after a failed reset, but needed to do some more sanitization. Fixes 443bd90f2cca: "nvme: host: unquiesce queue in nvme_kill_queues()" [ 189.650995] nvme nvme1: pci function 0000:0b:00.0 [ 317.680055] nvme nvme0: Device not ready; aborting reset [ 317.680183] nvme nvme0: Removing after probe failure status: -19 [ 317.681258] kasan: GPF could be caused by NULL-ptr deref or user memory access [ 317.681397] general protection fault: 0000 [#1] SMP KASAN [ 317.682984] CPU: 3 PID: 477 Comm: kworker/3:2 Not tainted 4.13.0-rc1+ #5 [ 317.683112] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016 [ 317.683284] Workqueue: events nvme_remove_dead_ctrl_work [nvme] [ 317.683398] task: ffff8803b0990000 task.stack: ffff8803c2ef0000 [ 317.683516] RIP: 0010:blk_mq_unquiesce_queue+0x2b/0xa0 [ 317.683614] RSP: 0018:ffff8803c2ef7d40 EFLAGS: 00010282 [ 317.683716] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 1ffff1006fbdcde3 [ 317.683847] RDX: 0000000000000038 RSI: 1ffff1006f5a9245 RDI: 0000000000000000 [ 317.683978] RBP: ffff8803c2ef7d58 R08: 1ffff1007bcdc974 R09: 0000000000000000 [ 317.684108] R10: 1ffff1007bcdc975 R11: 0000000000000000 R12: 00000000000001c0 [ 317.684239] R13: ffff88037ad49228 R14: ffff88037ad492d0 R15: ffff88037ad492e0 [ 317.684371] FS: 0000000000000000(0000) GS:ffff8803de6c0000(0000) knlGS:0000000000000000 [ 317.684519] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 317.684627] CR2: 0000002d1860c000 CR3: 000000045b40d000 CR4: 00000000003406e0 [ 317.684758] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 317.684888] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 317.685018] Call Trace: [ 317.685084] nvme_kill_queues+0x4d/0x170 [nvme_core] [ 317.685185] nvme_remove_dead_ctrl_work+0x3a/0x90 [nvme] [ 317.685289] process_one_work+0x771/0x1170 [ 317.685372] worker_thread+0xde/0x11e0 [ 317.685452] ? pci_mmcfg_check_reserved+0x110/0x110 [ 317.685550] kthread+0x2d3/0x3d0 [ 317.685617] ? process_one_work+0x1170/0x1170 [ 317.685704] ? kthread_create_on_node+0xc0/0xc0 [ 317.685785] ret_from_fork+0x25/0x30 [ 317.685798] Code: 0f 1f 44 00 00 55 48 b8 00 00 00 00 00 fc ff df 48 89 e5 41 54 4c 8d a7 c0 01 00 00 53 48 89 fb 4c 89 e2 48 c1 ea 03 48 83 ec 08 <80> 3c 02 00 75 50 48 8b bb c0 01 00 00 e8 33 8a f9 00 0f ba b3 [ 317.685872] RIP: blk_mq_unquiesce_queue+0x2b/0xa0 RSP: ffff8803c2ef7d40 [ 317.685908] ---[ end trace a3f8704150b1e8b4 ]--- Signed-off-by: Scott Bauer <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2017-07-26xfs: fix multi-AG deadlock in xfs_bunmapiChristoph Hellwig1-0/+12
Just like in the allocator we must avoid touching multiple AGs out of order when freeing blocks, as freeing still locks the AGF and can cause the same AB-BA deadlocks as in the allocation path. Signed-off-by: Christoph Hellwig <[email protected]> Reported-by: Nikolay Borisov <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2017-07-26arm64: sysreg: Fix unprotected macro argmuent in write_sysregDave Martin1-2/+2
write_sysreg() may misparse the value argument because it is used without parentheses to protect it. This patch adds the ( ) in order to avoid any surprises. Acked-by: Mark Rutland <[email protected]> Signed-off-by: Dave Martin <[email protected]> [will: same change to write_sysreg_s] Signed-off-by: Will Deacon <[email protected]>
2017-07-26perf: qcom_l2: fix column exclusion checkNeil Leeder1-0/+2
The check for column exclusion did not verify that the event being checked was an L2 event, and not a software event. Software events should not be checked for column exclusion. This resulted in a group with both software and L2 events sometimes incorrectly rejecting the L2 event for column exclusion and not counting it. Add a check for PMU type before applying column exclusion logic. Fixes: 21bdbb7102edeaeb ("perf: add qcom l2 cache perf events driver") Acked-by: Mark Rutland <[email protected]> Signed-off-by: Neil Leeder <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2017-07-26powerpc/Makefile: Fix ld version check with 64-bit LE-only toolchainMichael Ellerman1-12/+13
In commit efe0160cfd40 ("powerpc/64: Linker on-demand sfpr functions for modules"), we added an ld version check early in the powerpc top-level Makefile. Because the Makefile runs before the kernel config is setup, the checks for CONFIG_CPU_LITTLE_ENDIAN etc. all take the default case. So we end up configuring ld for 32-bit big endian. That would be OK, except that for historical (or perhaps no) reason, we use 'override LD' to add the endian flags to the LD variable itself, rather than the normal approach of adding them to LDFLAGS. The end result is that when we check the ld version we run it as: $(CROSS_COMPILE)ld -EB -m elf32ppc --version This often works, unless you are using a 64-bit only and/or little endian only, toolchain. In which case you see something like: $ make defconfig powerpc64le-linux-ld: unrecognised emulation mode: elf32ppc Supported emulations: elf64lppc elf32lppc elf32lppclinux elf32lppcsim /bin/sh: 1: [: -ge: unexpected operator The proper fix is to stop using 'override LD', but that will require a fair bit of testing. Instead we can fix it for now just by reordering the Makefile to do the version check earlier. Fixes: efe0160cfd40 ("powerpc/64: Linker on-demand sfpr functions for modules") Signed-off-by: Michael Ellerman <[email protected]>
2017-07-26powerpc/pseries: Fix of_node_put() underflow during reconfig removeLaurent Vivier1-1/+0
As for commit 68baf692c435 ("powerpc/pseries: Fix of_node_put() underflow during DLPAR remove"), the call to of_node_put() must be removed from pSeries_reconfig_remove_node(). dlpar_detach_node() and pSeries_reconfig_remove_node() both call of_detach_node(), and thus the node should not be released in both cases. Fixes: 0829f6d1f69e ("of: device_node kobject lifecycle fixes") Cc: [email protected] # v3.15+ Signed-off-by: Laurent Vivier <[email protected]> Reviewed-by: David Gibson <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-07-26powerpc/mm/radix: Workaround prefetch issue with KVMBenjamin Herrenschmidt6-22/+154
There's a somewhat architectural issue with Radix MMU and KVM. When coming out of a guest with AIL (Alternate Interrupt Location, ie, MMU enabled), we start executing hypervisor code with the PID register still containing whatever the guest has been using. The problem is that the CPU can (and will) then start prefetching or speculatively load from whatever host context has that same PID (if any), thus bringing translations for that context into the TLB, which Linux doesn't know about. This can cause stale translations and subsequent crashes. Fixing this in a way that is neither racy nor a huge performance impact is difficult. We could just make the host invalidations always use broadcast forms but that would hurt single threaded programs for example. We chose to fix it instead by partitioning the PID space between guest and host. This is possible because today Linux only use 19 out of the 20 bits of PID space, so existing guests will work if we make the host use the top half of the 20 bits space. We additionally add support for a property to indicate to Linux the size of the PID register which will be useful if we eventually have processors with a larger PID space available. There is still an issue with malicious guests purposefully setting the PID register to a value in the hosts PID range. Hopefully future HW can prevent that, but in the meantime, we handle it with a pair of kludges: - On the way out of a guest, before we clear the current VCPU in the PACA, we check the PID and if it's outside of the permitted range we flush the TLB for that PID. - When context switching, if the mm is "new" on that CPU (the corresponding bit was set for the first time in the mm cpumask), we check if any sibling thread is in KVM (has a non-NULL VCPU pointer in the PACA). If that is the case, we also flush the PID for that CPU (core). This second part is needed to handle the case where a process is migrated (or starts a new pthread) on a sibling thread of the CPU coming out of KVM, as there's a window where stale translations can exist before we detect it and flush them out. A future optimization could be added by keeping track of whether the PID has ever been used and avoid doing that for completely fresh PIDs. We could similarily mark PIDs that have been the subject of a global invalidation as "fresh". But for now this will do. Signed-off-by: Benjamin Herrenschmidt <[email protected]> [mpe: Rework the asm to build with CONFIG_PPC_RADIX_MMU=n, drop unneeded include of kvm_book3s_asm.h] Signed-off-by: Michael Ellerman <[email protected]>
2017-07-25net: ethernet: nb8800: Handle all 4 RGMII modes identicallyMarc Gonzalez1-5/+4
Before commit bf8f6952a233 ("Add blurb about RGMII") it was unclear whose responsibility it was to insert the required clock skew, and in hindsight, some PHY drivers got it wrong. The solution forward is to introduce a new property, explicitly requiring skew from the node to which it is attached. In the interim, this driver will handle all 4 RGMII modes identically (no skew). Fixes: 52dfc8301248 ("net: ethernet: add driver for Aurora VLSI NB8800 Ethernet controller") Signed-off-by: Marc Gonzalez <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-25Revert "netvsc: optimize calculation of number of slots"stephen hemminger1-10/+33
The logic for computing page buffer scatter does not take into account the impact of compound pages. Therefore the optimization to compute number of slots was incorrect and could cause stack corruption a skb was sent with lots of fragments from huge pages. This reverts commit 60b86665af0dfbeebda8aae43f0ba451cd2dcfe5. Fixes: 60b86665af0d ("netvsc: optimize calculation of number of slots") Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-25ftgmac100: return error in ftgmac100_alloc_rx_bufJoel Stanley1-2/+2
The error paths set err, but it's not returned. I wondered if we should fix all of the callers to check the returned value, but Ben explains why the code is this way: > Most call sites ignore it on purpose. There's nothing we can do if > we fail to get a buffer at interrupt time, so we point the buffer to > the scratch page so the HW doesn't DMA into lalaland and lose the > packet. > > The one call site that tests and can fail is the one used when brining > the interface up. If we fail to allocate at that point, we fail the > ifup. But as you noticed, I do have a bug not returning the error. Acked-by: Benjamin Herrenschmidt <[email protected]> Signed-off-by: Joel Stanley <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-25ipv6: Don't increase IPSTATS_MIB_FRAGFAILS twice in ip6_fragment()Stefano Brivio1-4/+0
RFC 2465 defines ipv6IfStatsOutFragFails as: "The number of IPv6 datagrams that have been discarded because they needed to be fragmented at this output interface but could not be." The existing implementation, instead, would increase the counter twice in case we fail to allocate room for single fragments: once for the fragment, once for the datagram. This didn't look intentional though. In one of the two affected affected failure paths, the double increase was simply a result of a new 'goto fail' statement, introduced to avoid a skb leak. The other path appears to be affected since at least 2.6.12-rc2. Reported-by: Sabrina Dubroca <[email protected]> Fixes: 1d325d217c7f ("ipv6: ip6_fragment: fix headroom tests and skb leak") Signed-off-by: Stefano Brivio <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-07-25Merge tag 'scsi-fixes' of ↵Linus Torvalds3-4/+7
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three small fixes. The transfer size fixes are actually correcting some performance drops on the hpsa and smartpqi cards. The cards actually have an internal cache for request speed up but bypass it for transfers > 1MB. Since 4.3 the efficiency of our merges has rendered the cache mostly unused, so limit transfers to under 1MB to recover the cache boost" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: sg: fix static checker warning in sg_is_valid_dxfer scsi: smartpqi: limit transfer length to 1MB scsi: hpsa: limit transfer length to 1MB