Age | Commit message | Author | Files | Lines
2019-11-20 | KVM: x86: Unexport kvm_vcpu_reload_apic_access_page() | Liran Alon | 1 | -1/+0
The function is only used in kvm.ko module. Reviewed-by: Mark Kanda <[email protected]> Signed-off-by: Liran Alon <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2019-11-20 | KVM: nVMX: add CR4_LA57 bit to nested CR4_FIXED1 | Chenyi Qiang | 1 | -0/+1
When the L1 guest uses 5-level paging, VM-entry to L2 fails due to invalid host state. Fix this by adding the CR4_LA57 bit to the nested CR4_FIXED1 MSR. Signed-off-by: Chenyi Qiang <[email protected]> Reviewed-by: Xiaoyao Li <[email protected]> Reviewed-by: Liran Alon <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2019-11-20 | KVM: nVMX: Use semi-colon instead of comma for exit-handlers initialization | Liran Alon | 1 | -13/+13
Reviewed-by: Mark Kanda <[email protected]> Signed-off-by: Liran Alon <[email protected]> Reviewed-by: Jim Mattson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2019-11-20 | KVM: x86: Zero the IOAPIC scan request dest vCPUs bitmap | Nitesh Narayan Lal | 1 | -0/+1
Not zeroing the bitmap used for identifying the destination vCPUs for an IOAPIC scan request in fixed delivery mode could lead to waking up unwanted vCPUs. This patch zeroes the vCPU bitmap before passing it to kvm_bitmap_or_dest_vcpus(), which is responsible for setting the bitmap with the bits corresponding to the destination vCPUs. Fixes: 7ee30bc132c6("KVM: x86: deliver KVM IOAPIC scan request to target vCPUs") Signed-off-by: Nitesh Narayan Lal <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
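The effect of the missing zeroing can be sketched in a few lines of plain C. This is only an illustrative model: `or_dest_vcpus()` and `vcpus_to_wake()` are hypothetical stand-ins (not the KVM functions), and a single 64-bit word stands in for the vCPU bitmap.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a helper that ORs destination bits into a bitmap. */
static void or_dest_vcpus(uint64_t *bitmap, uint64_t dest_mask)
{
    *bitmap |= dest_mask;
}

/*
 * Returns the set of vCPUs that would be woken for a request aimed at
 * dest_mask. 'stale' models whatever the bitmap contained beforehand.
 */
static uint64_t vcpus_to_wake(uint64_t stale, uint64_t dest_mask, int zero_first)
{
    uint64_t bitmap = stale;

    if (zero_first)
        bitmap = 0;             /* the fix: start from an empty bitmap */
    or_dest_vcpus(&bitmap, dest_mask);
    return bitmap;
}

/* Small self-check showing both behaviours. */
static void bitmap_selfcheck(void)
{
    /* Without zeroing, stale bit 3 survives and an unwanted vCPU is woken. */
    assert(vcpus_to_wake(0x8, 0x3, 0) == 0xB);
    /* With zeroing, only the destination vCPUs remain. */
    assert(vcpus_to_wake(0x8, 0x3, 1) == 0x3);
}
```

Because the helper only ORs bits in, it can never clear anything, so the caller has to provide a clean bitmap.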
2019-11-20 | s390/smp: fix physical to logical CPU map for SMT | Heiko Carstens | 1 | -26/+54
If an SMT capable system is not IPL'ed from the first CPU, the setup of the physical to logical CPU mapping is broken: the IPL core gets CPU number 0, but then the next core gets CPU number 1. The correct behavior is that all SMT threads of CPU 0 get the subsequent logical CPU numbers. This is important since a lot of code (e.g. the CPU topology code) assumes that CPU maps are set up like this. If the mapping is broken, the system will not IPL due to broken topology masks:
[ 1.716341] BUG: arch topology broken
[ 1.716342] the SMT domain not a subset of the MC domain
[ 1.716343] BUG: arch topology broken
[ 1.716344] the MC domain not a subset of the BOOK domain
This scenario usually cannot happen, since LPARs are always IPL'ed from CPU 0 and re-IPL is also initiated from CPU 0. However, older kernels did initiate re-IPL on an arbitrary CPU, so a re-IPL from an old kernel into a new kernel may lead to a crash. Fix this by setting up the physical to logical CPU mapping correctly.
Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-20 | s390/early: move access registers setup in C code | Vasily Gorbik | 2 | -8/+11
Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-20 | s390/head64: remove unnecessary vdso_per_cpu_data setup | Vasily Gorbik | 1 | -2/+0
vdso_per_cpu_data lowcore value is only needed for fully functional exception handlers, which are activated in setup_lowcore_dat_off. The same function does init vdso_per_cpu_data via vdso_alloc_boot_cpu. Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-20 | s390/early: move control registers setup in C code | Vasily Gorbik | 3 | -6/+13
Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-20 | s390/kasan: support memcpy_real with TRACE_IRQFLAGS | Vasily Gorbik | 1 | -4/+8
Currently if the kernel is built with CONFIG_TRACE_IRQFLAGS and KASAN and used as crash kernel it crashes itself due to trace_hardirqs_off/trace_hardirqs_on being called with DAT off. This happens because trace_hardirqs_off/trace_hardirqs_on are instrumented and kasan code tries to perform access to shadow memory to validate memory accesses. Kasan shadow memory is populated with vmemmap, so all accesses require DAT on. memcpy_real could be called with DAT on or off (with kasan enabled DAT is set even before early code is executed). Make sure that trace_hardirqs_off/trace_hardirqs_on are called with DAT on and only actual __memcpy_real is called with DAT off. Also annotate __memcpy_real and _memcpy_real with __no_sanitize_address to avoid further problems due to switching DAT off. Reviewed-by: Philipp Rudo <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-20 | s390/crypto: Fix unsigned variable compared with zero | YueHaibing | 1 | -2/+5
The return type of s390_crypto_shash_parmsize() is int, so its result should not be stored in an unsigned variable that is then compared with zero: such a comparison can never detect a negative error value. Reported-by: Hulk Robot <[email protected]> Fixes: 3c2eb6b76cab ("s390/crypto: Support for SHA3 via CPACF (MSA6)") Signed-off-by: YueHaibing <[email protected]> Signed-off-by: Joerg Schmidbauer <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
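The pitfall can be illustrated with a minimal userspace sketch. `parmsize_for()` is a hypothetical stand-in for a helper that returns either a size or a negative errno; the value 200 and the function codes are assumptions made only for this example.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper: returns a parameter size, or a negative errno. */
static int parmsize_for(int func)
{
    return func == 1 ? 200 : -EINVAL;
}

/*
 * Buggy shape: the negative errno is converted to a large unsigned value,
 * so a later "< 0" test on the stored value could never be true.
 */
static unsigned int stored_unsigned(int func)
{
    return (unsigned int)parmsize_for(func);
}

/* Fixed shape: keep the int, test its sign, and only then use it as a size. */
static int checked_parmsize(int func, unsigned int *out)
{
    int ret = parmsize_for(func);

    if (ret < 0)
        return ret;
    *out = (unsigned int)ret;
    return 0;
}

static void parmsize_selfcheck(void)
{
    unsigned int n = 0;

    /* The error is silently turned into a huge "size". */
    assert(stored_unsigned(2) == (unsigned int)-EINVAL);
    assert(stored_unsigned(2) > 200u);
    /* With a signed intermediate, the error is caught. */
    assert(checked_parmsize(2, &n) == -EINVAL);
    assert(checked_parmsize(1, &n) == 0 && n == 200u);
}
```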
2019-11-20 | fork: fix pidfd_poll()'s return type | Luc Van Oostenryck | 1 | -3/+3
pidfd_poll() is defined as returning 'unsigned int' but the .poll method is declared as returning '__poll_t', a bitwise type. Fix this by using the proper return type and using the EPOLL constants instead of the POLL ones, as required for __poll_t. Fixes: b53b0b9d9a61 ("pidfd: add polling support") Cc: Joel Fernandes (Google) <[email protected]> Cc: [email protected] # 5.3 Signed-off-by: Luc Van Oostenryck <[email protected]> Reviewed-by: Christian Brauner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Brauner <[email protected]>
2019-11-20 | PM: QoS: Invalidate frequency QoS requests after removal | Rafael J. Wysocki | 1 | -1/+7
Switching cpufreq drivers (or switching operation modes of the intel_pstate driver from "active" to "passive" and vice versa) does not work on some x86 systems with ACPI after commit 3000ce3c52f8 ("cpufreq: Use per-policy frequency QoS"), because the ACPI _PPC and thermal code uses the same frequency QoS request object for a given CPU every time a cpufreq driver is registered and freq_qos_remove_request() does not invalidate the request after removing it from its QoS list, so freq_qos_add_request() complains and fails when that request is passed to it again. Fix the issue by modifying freq_qos_remove_request() to clear the qos and type fields of the frequency request pointed to by its argument after removing it from its QoS list so as to invalidate it. Fixes: 3000ce3c52f8 ("cpufreq: Use per-policy frequency QoS") Reported-and-tested-by: Doug Smythies <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Viresh Kumar <[email protected]>
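The add/remove/add sequence described above can be modelled roughly as follows. The structure and function names are hypothetical simplifications (they are not the kernel's `freq_qos_*` API), and the `-EINVAL` return is this sketch's choice for "request still looks active".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct qos_list {
    int nr_requests;
};

struct qos_request {
    struct qos_list *qos;   /* non-NULL while the request counts as active */
    int type;               /* 0 means "no type" in this sketch */
};

/* Refuses a request object that still looks active. */
static int request_add(struct qos_list *qos, struct qos_request *req, int type)
{
    if (req->qos != NULL)
        return -EINVAL;
    req->qos = qos;
    req->type = type;
    qos->nr_requests++;
    return 0;
}

/* 'invalidate' selects between the old (0) and fixed (1) removal behaviour. */
static void request_remove(struct qos_request *req, int invalidate)
{
    if (req->qos == NULL)
        return;
    req->qos->nr_requests--;
    if (invalidate) {
        req->qos = NULL;    /* the fix: clear the fields after removal */
        req->type = 0;
    }
}

static void qos_selfcheck(void)
{
    struct qos_list list = { 0 };
    struct qos_request stale = { NULL, 0 };
    struct qos_request clean = { NULL, 0 };

    assert(request_add(&list, &stale, 1) == 0);
    request_remove(&stale, 0);
    assert(request_add(&list, &stale, 1) == -EINVAL); /* reuse fails */

    assert(request_add(&list, &clean, 1) == 0);
    request_remove(&clean, 1);
    assert(request_add(&list, &clean, 1) == 0);       /* reuse works */
}
```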
2019-11-20 | virtio_balloon: fix shrinker count | Wei Wang | 1 | -1/+1
Instead of multiplying by page order, virtio balloon divided by page order. The result is that it can return 0 if there are a bit less than MAX_ORDER - 1 pages in use, and then shrinker scan won't be called. Cc: [email protected] Fixes: 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker") Signed-off-by: Wei Wang <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: David Hildenbrand <[email protected]>
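A small sketch of the arithmetic, assuming the count is derived from a number of free-page blocks and that each block has order 10 (1024 pages). Both the order value and the function names are assumptions made for illustration.

```c
#include <assert.h>

#define FREE_PAGE_ORDER 10  /* assumed block order: 1 << 10 pages per block */

/* Buggy direction: scaling down returns 0 for any small block count. */
static unsigned long count_buggy(unsigned long blocks)
{
    return blocks >> FREE_PAGE_ORDER;
}

/* Fixed direction: blocks scaled up to a page count. */
static unsigned long count_fixed(unsigned long blocks)
{
    return blocks << FREE_PAGE_ORDER;
}

static void count_selfcheck(void)
{
    assert(count_buggy(3) == 0);     /* a zero count, so no scan is triggered */
    assert(count_fixed(3) == 3072);  /* 3 blocks of 1024 pages */
}
```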
2019-11-20 | virtio_balloon: fix shrinker scan number of pages | Michael S. Tsirkin | 1 | -6/+12
virtio_balloon_shrinker_scan should return number of system pages freed, but because it's calling functions that deal with balloon pages, it gets confused and sometimes returns the number of balloon pages. It does not matter practically as the exact number isn't used, but it seems better to be consistent in case someone starts using this API. Further, if we ever tried to iteratively leak pages as virtio_balloon_shrinker_scan tries to do, we'd run into issues - this is because freed_pages was accumulating total freed pages, but was also subtracted on each iteration from pages_to_free, which can result in either leaking less memory than we were supposed to free, or more if pages_to_free underruns. On a system with 4K pages we are lucky that we are never asked to leak more than 128 pages while we can leak up to 256 at a time, but it looks like a real issue for systems with page size != 4K. Fixes: 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker") Reported-by: Khazhismel Kumykov <[email protected]> Reviewed-by: Wei Wang <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
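The accounting problem with the running total can be shown with a deliberately simplified model. This is not the driver code: `leak_chunk()` is a hypothetical helper that frees at most `CHUNK` pages per call, and signed counters are used so that the under-freeing case is easy to follow.

```c
#include <assert.h>

#define CHUNK 3  /* assumed per-call limit, chosen only for this example */

/* Hypothetical helper: frees up to CHUNK pages and reports how many. */
static long leak_chunk(long want)
{
    return want < CHUNK ? want : CHUNK;
}

/* Buggy accounting: the running total is subtracted on every iteration. */
static long scan_buggy(long pages_to_free)
{
    long freed = 0;

    while (pages_to_free > 0) {
        freed += leak_chunk(pages_to_free);
        pages_to_free -= freed;
    }
    return freed;
}

/* Fixed accounting: only the pages freed in this iteration are subtracted. */
static long scan_fixed(long pages_to_free)
{
    long freed = 0;

    while (pages_to_free > 0) {
        long n = leak_chunk(pages_to_free);

        freed += n;
        pages_to_free -= n;
    }
    return freed;
}

static void scan_selfcheck(void)
{
    assert(scan_buggy(10) == 7);   /* stops early: 3, 3, then 1 */
    assert(scan_fixed(10) == 10);  /* 3 + 3 + 3 + 1 */
}
```

When the request fits in a single chunk, both variants agree, which matches the observation that the problem stays hidden as long as only one iteration is needed.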
2019-11-19 | mdio_bus: Fix init if CONFIG_RESET_CONTROLLER=n | Geert Uytterhoeven | 1 | -1/+1
Commit 1d4639567d97 ("mdio_bus: Fix PTR_ERR applied after initialization to constant") accidentally changed a check from -ENOTSUPP to -ENOSYS, causing failures if reset controller support is not enabled. E.g. on r7s72100/rskrza1: sh-eth e8203000.ethernet: MDIO init failed: -524 sh-eth: probe of e8203000.ethernet failed with error -524 Seen on r8a7740/armadillo, r7s72100/rskrza1, and r7s9210/rza2mevb. Fixes: 1d4639567d97 ("mdio_bus: Fix PTR_ERR applied after initialization to constant") Signed-off-by: Geert Uytterhoeven <[email protected]> Cc: YueHaibing <[email protected]> Cc: David S. Miller <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: David S. Miller <[email protected]>
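The -524 in the log is the kernel-internal ENOTSUPP code, which is distinct from ENOSYS. The following userspace sketch shows why checking for the wrong constant turns an optional, absent reset control into a probe failure. `reset_get()` and `init_with_check()` are hypothetical stand-ins, and `KERNEL_ENOTSUPP` is defined locally because userspace `<errno.h>` does not provide that code.

```c
#include <assert.h>
#include <errno.h>

#define KERNEL_ENOTSUPP 524  /* kernel-internal value, defined here for the sketch */

/* Hypothetical lookup: the error depends on whether reset support is built in. */
static int reset_get(int reset_support_enabled)
{
    return reset_support_enabled ? -ENOENT : -KERNEL_ENOTSUPP;
}

/*
 * Returns 0 if a missing optional reset is tolerated, else the error.
 * 'tolerated_code' is the second error code the check accepts.
 */
static int init_with_check(int err, int tolerated_code)
{
    if (err == -ENOENT || err == -tolerated_code)
        return 0;
    return err;
}

static void mdio_selfcheck(void)
{
    /* Checking for ENOSYS lets -524 through as a hard failure. */
    assert(init_with_check(reset_get(0), ENOSYS) == -524);
    /* Checking for ENOTSUPP tolerates the missing reset controller. */
    assert(init_with_check(reset_get(0), KERNEL_ENOTSUPP) == 0);
    /* With reset support enabled, -ENOENT is handled either way. */
    assert(init_with_check(reset_get(1), ENOSYS) == 0);
}
```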
2019-11-19 | Revert "mdio_bus: fix mdio_register_device when RESET_CONTROLLER is disabled" | David S. Miller | 1 | -2/+1
This reverts commit 075e238d12c21c8bde700d21fb48be7a3aa80194. Going to go with Geert's fix instead, which also has a correct Fixes tag. Signed-off-by: David S. Miller <[email protected]>
2019-11-19 | net: hns3: fix a wrong reset interrupt status mask | Huazhong Tan | 1 | -1/+1
According to hardware user manual, bits5~7 in register HCLGE_MISC_VECTOR_INT_STS means reset interrupts status, but HCLGE_RESET_INT_M is defined as bits0~2 now. So it will make hclge_reset_err_handle() read the wrong reset interrupt status. This patch fixes this wrong bit mask. Fixes: 2336f19d7892 ("net: hns3: check reset interrupt status when reset fails") Signed-off-by: Huazhong Tan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
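A quick sketch of how the two masks behave on the same register value. The `BITS()` macro and the mask names are local to this example (they are not the driver's definitions), and the register value 0x40 is an arbitrary sample with only bit 6 set.

```c
#include <assert.h>
#include <stdint.h>

/* Contiguous mask covering bits l..h inclusive (for small ranges). */
#define BITS(h, l) ((uint32_t)(((1u << ((h) - (l) + 1)) - 1u) << (l)))

#define RESET_INT_MASK_OLD BITS(2, 0)  /* bits 0..2: the wrong range */
#define RESET_INT_MASK_NEW BITS(7, 5)  /* bits 5..7: the documented range */

static uint32_t reset_status(uint32_t reg, uint32_t mask)
{
    return reg & mask;
}

static void mask_selfcheck(void)
{
    assert(RESET_INT_MASK_OLD == 0x07);
    assert(RESET_INT_MASK_NEW == 0xE0);
    /* A reset interrupt flagged in bit 6 is invisible through the old mask. */
    assert(reset_status(0x40, RESET_INT_MASK_OLD) == 0);
    assert(reset_status(0x40, RESET_INT_MASK_NEW) == 0x40);
}
```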
2019-11-19 | net: fec: fix clock count mis-match | Chuhong Yuan | 1 | -4/+11
pm_runtime_put_autosuspend in probe will call runtime suspend to disable clks automatically if CONFIG_PM is defined. (If CONFIG_PM is not defined, its implementation will be empty, then runtime suspend will not be called.) Therefore, we can call pm_runtime_get_sync to runtime resume it first to enable clks, which matches the runtime suspend. (Only when CONFIG_PM is defined, otherwise pm_runtime_get_sync will also be empty, then runtime resume will not be called.) Then it is fine to disable clks without causing clock count mis-match. Fixes: c43eab3eddb4 ("net: fec: add missed clk_disable_unprepare in remove") Signed-off-by: Chuhong Yuan <[email protected]> Acked-by: Fugang Duan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-19 | net/sched: act_pedit: fix WARN() in the traffic path | Davide Caratti | 1 | -7/+5
when configuring act_pedit rules, the number of keys is validated only on addition of a new entry. This is not sufficient to avoid hitting a WARN() in the traffic path: for example, it is possible to replace a valid entry with a new one having 0 extended keys, thus causing splats in dmesg like: pedit BUG: index 42 WARNING: CPU: 2 PID: 4054 at net/sched/act_pedit.c:410 tcf_pedit_act+0xc84/0x1200 [act_pedit] [...] RIP: 0010:tcf_pedit_act+0xc84/0x1200 [act_pedit] Code: 89 fa 48 c1 ea 03 0f b6 04 02 84 c0 74 08 3c 03 0f 8e ac 00 00 00 48 8b 44 24 10 48 c7 c7 a0 c4 e4 c0 8b 70 18 e8 1c 30 95 ea <0f> 0b e9 a0 fa ff ff e8 00 03 f5 ea e9 14 f4 ff ff 48 89 58 40 e9 RSP: 0018:ffff888077c9f320 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffac2983a2 RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff888053927bec RBP: dffffc0000000000 R08: ffffed100a726209 R09: ffffed100a726209 R10: 0000000000000001 R11: ffffed100a726208 R12: ffff88804beea780 R13: ffff888079a77400 R14: ffff88804beea780 R15: ffff888027ab2000 FS: 00007fdeec9bd740(0000) GS:ffff888053900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ffdb3dfd000 CR3: 000000004adb4006 CR4: 00000000001606e0 Call Trace: tcf_action_exec+0x105/0x3f0 tcf_classify+0xf2/0x410 __dev_queue_xmit+0xcbf/0x2ae0 ip_finish_output2+0x711/0x1fb0 ip_output+0x1bf/0x4b0 ip_send_skb+0x37/0xa0 raw_sendmsg+0x180c/0x2430 sock_sendmsg+0xdb/0x110 __sys_sendto+0x257/0x2b0 __x64_sys_sendto+0xdd/0x1b0 do_syscall_64+0xa5/0x4e0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7fdeeb72e993 Code: 48 8b 0d e0 74 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 0d d6 2c 00 00 75 13 49 89 ca b8 2c 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 34 c3 48 83 ec 08 e8 4b cc 00 00 48 89 04 24 RSP: 002b:00007ffdb3de8a18 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 000055c81972b700 RCX: 00007fdeeb72e993 RDX: 0000000000000040 RSI: 000055c81972b700 RDI: 0000000000000003 
RBP: 00007ffdb3dea130 R08: 000055c819728510 R09: 0000000000000010 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040 R13: 000055c81972b6c0 R14: 000055c81972969c R15: 0000000000000080 Fix this moving the check on 'nkeys' earlier in tcf_pedit_init(), so that attempts to install rules having 0 keys are always rejected with -EINVAL. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Davide Caratti <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-19 | net: phylink: fix link mode modification in PHY mode | Russell King | 1 | -9/+16
Modifying the link settings via phylink_ethtool_ksettings_set() and phylink_ethtool_set_pauseparam() didn't always work as intended for PHY based setups, as calling phylink_mac_config() would result in the unresolved configuration being committed to the MAC, rather than the configuration with the speed and duplex setting. This would work fine if the update caused the link to renegotiate, but if no settings have changed, phylib won't trigger a renegotiation cycle, and the MAC will be left incorrectly configured. Avoid calling phylink_mac_config() unless we are using an inband mode in phylink_ethtool_ksettings_set(), and use phy_set_asym_pause() as introduced in 4.20 to set the PHY settings in phylink_ethtool_set_pauseparam(). Signed-off-by: Russell King <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-19 | net: phylink: update documentation on create and destroy | Russell King | 1 | -0/+4
Update the documentation on phylink's create and destroy functions to explicitly state that the rtnl lock must not be held while calling these. Signed-off-by: Russell King <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-19 | r8169: disable TSO on a single version of RTL8168c to fix performance | Corinna Vinschen | 1 | -2/+5
During performance testing, I found that one of my r8169 NICs, an 8168c model, suffered a major performance loss. Running netperf's TCP_STREAM test didn't return the expected throughput of > 900 Mb/s, but rather only about 22 Mb/s. Strangely enough, the TCP_MAERTS and UDP_STREAM tests all returned throughput > 900 Mb/s, as did TCP_STREAM with the other r8169 NICs I can test (one each of 8169s, 8168e, 8168f). Bisecting turned up commit 93681cd7d94f83903cb3f0f95433d10c28a7e9a5, "r8169: enable HW csum and TSO" as the culprit. I added my 8168c version, RTL_GIGA_MAC_VER_22, to the code special-casing the 8168evl as per the patch below. This fixed the performance problem for me. Fixes: 93681cd7d94f ("r8169: enable HW csum and TSO") Signed-off-by: Corinna Vinschen <[email protected]> Reviewed-by: Heiner Kallweit <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-19 | MAINTAINERS: forcedeth: Change Zhu Yanjun's email address | Zhu Yanjun | 1 | -1/+1
I prefer to use my personal email address for kernel related work. Signed-off-by: Zhu Yanjun <[email protected]> Acked-by: Rain River <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-19 | taprio: don't reject same mqprio settings | Ivan Khoronzhuk | 1 | -2/+26
The taprio qdisc allows the mqprio setting to be set, but only once. If mqprio settings are provided again, an error is returned, because changing the traffic class mapping in-flight is not allowed, and that is expected. But if the configuration is exactly the same, there is no need to return an error. This allows the same command to be issued several times, changing only the base time, for instance, or only the sched maps, while leaving the mqprio setting unmodified. This better matches the message "Changing the traffic mapping of a running schedule is not supported", so reject mqprio only if it has really changed. Also corrected TC_BITMASK + 1 for consistency, as proposed. Fixes: a3d43c0d56f1 ("taprio: Add support adding an admin schedule") Reviewed-by: Vladimir Oltean <[email protected]> Tested-by: Vladimir Oltean <[email protected]> Acked-by: Vinicius Costa Gomes <[email protected]> Signed-off-by: Ivan Khoronzhuk <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-19 | net/tls: enable sk_msg redirect to tls socket egress | Willem de Bruijn | 3 | -0/+14
Bring back tls_sw_sendpage_locked. sk_msg redirection into a socket with TLS_TX takes the following path: tcp_bpf_sendmsg_redir tcp_bpf_push_locked tcp_bpf_push kernel_sendpage_locked sock->ops->sendpage_locked Also update the flags test in tls_sw_sendpage_locked to allow flag MSG_NO_SHARED_FRAGS. bpf_tcp_sendmsg sets this. Link: https://lore.kernel.org/netdev/CA+FuTSdaAawmZ2N8nfDDKu3XLpXBbMtcCT0q4FntDD2gn8ASUw@mail.gmail.com/T/#t Link: https://github.com/wdebruij/kerneltools/commits/icept.2 Fixes: 0608c69c9a80 ("bpf: sk_msg, sock{map|hash} redirect through ULP") Fixes: f3de19af0f5b ("Revert \"net/tls: remove unused function tls_sw_sendpage_locked\"") Signed-off-by: Willem de Bruijn <[email protected]> Acked-by: John Fastabend <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-19 | afs: Fix missing timeout reset | David Howells | 1 | -0/+1
In afs_wait_for_call_to_complete(), rather than immediately aborting an operation if a signal occurs, the code attempts to wait for it to complete, using a schedule timeout of 2*RTT (or min 2 jiffies) and a check that we're still receiving relevant packets from the server before we consider aborting the call. We may even ping the server to check on the status of the call. However, there's a missing timeout reset in the event that we do actually get a packet to process, such that if we then get a couple of short stalls, we then time out when progress is actually being made. Fix this by resetting the timeout any time we get something to process. If it's the failure of the call then the call state will get changed and we'll exit the loop shortly thereafter. A symptom of this is data fetches and stores failing with EINTR when they really shouldn't. Fixes: bc5e3a546d55 ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals") Signed-off-by: David Howells <[email protected]> Reviewed-by: Marc Dionne <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-11-19 | gve: fix dma sync bug where not all pages synced | Adi Suresh | 1 | -4/+5
The previous commit had a bug where the last page in the memory range could not be synced. This change fixes the behavior so that all the required pages are synced. Fixes: 9cfeeb576d49 ("gve: Fixes DMA synchronization") Signed-off-by: Adi Suresh <[email protected]> Reviewed-by: Catherine Sullivan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-19 | drm/i915: make pool objects read-only | Matthew Auld | 1 | -0/+2
For our current users we don't expect pool objects to be writable from the gpu. Signed-off-by: Matthew Auld <[email protected]> Cc: Chris Wilson <[email protected]> Fixes: 4f7af1948abc ("drm/i915: Support ro ppgtt mapped cmdparser shadow buffers") Reviewed-by: Chris Wilson <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit d18580b08b92ec4105eb0ede2d676e8b1f5a66c3) Signed-off-by: Rodrigo Vivi <[email protected]>
2019-11-19 | mdio_bus: Fix init if CONFIG_RESET_CONTROLLER=n | Geert Uytterhoeven | 1 | -1/+1
Commit 1d4639567d97 ("mdio_bus: Fix PTR_ERR applied after initialization to constant") accidentally changed a check from -ENOTSUPP to -ENOSYS, causing failures if reset controller support is not enabled. E.g. on r7s72100/rskrza1: sh-eth e8203000.ethernet: MDIO init failed: -524 sh-eth: probe of e8203000.ethernet failed with error -524 Seen on r8a7740/armadillo, r7s72100/rskrza1, and r7s9210/rza2mevb. Fixes: 1d4639567d97 ("mdio_bus: Fix PTR_ERR applied after initialization to constant") Signed-off-by: Geert Uytterhoeven <[email protected]> Cc: YueHaibing <[email protected]> Cc: David S. Miller <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-11-19 | nbd:fix memory leak in nbd_get_socket() | Sun Ke | 1 | -0/+1
Before returning NULL, put the sock first. Cc: [email protected] Fixes: cf1b2326b734 ("nbd: verify socket is supported during setup") Reviewed-by: Josef Bacik <[email protected]> Reviewed-by: Mike Christie <[email protected]> Signed-off-by: Sun Ke <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-11-19 | virtio_console: allocate inbufs in add_port() only if it is needed | Laurent Vivier | 1 | -15/+13
When we hot unplug a virtserialport and then try to hot plug again, it fails: (qemu) chardev-add socket,id=serial0,path=/tmp/serial0,server,nowait (qemu) device_add virtserialport,bus=virtio-serial0.0,nr=2,\ chardev=serial0,id=serial0,name=serial0 (qemu) device_del serial0 (qemu) device_add virtserialport,bus=virtio-serial0.0,nr=2,\ chardev=serial0,id=serial0,name=serial0 kernel error: virtio-ports vport2p2: Error allocating inbufs qemu error: virtio-serial-bus: Guest failure in adding port 2 for device \ virtio-serial0.0 This happens because buffers for the in_vq are allocated when the port is added but are not released when the port is unplugged. They are only released when virtconsole is removed (see a7a69ec0d8e4) To avoid the problem and to be symmetric, we could allocate all the buffers in init_vqs() as they are released in remove_vqs(), but it sounds like a waste of memory. Rather than that, this patch changes add_port() logic to ignore ENOSPC error in fill_queue(), which means queue has already been filled. Fixes: a7a69ec0d8e4 ("virtio_console: free buffers after reset") Cc: [email protected] Cc: [email protected] Signed-off-by: Laurent Vivier <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2019-11-19 | virtio_ring: fix return code on DMA mapping fails | Halil Pasic | 1 | -2/+2
Commit 780bc7903a32 ("virtio_ring: Support DMA APIs") makes virtqueue_add() return -EIO when we fail to map our I/O buffers. This is a very realistic scenario for guests with encrypted memory, as swiotlb may run out of space, depending on its size and the I/O load. The virtio-blk driver interprets -EIO from virtqueue_add() as an I/O error, despite the fact that a full swiotlb is, in the absence of bugs, a recoverable condition. Let us change the return code to -ENOMEM, and make the block layer recover from these failures when virtio-blk encounters the condition described above. Cc: [email protected] Fixes: 780bc7903a32 ("virtio_ring: Support DMA APIs") Signed-off-by: Halil Pasic <[email protected]> Tested-by: Michael Mueller <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
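Why the choice of error code matters to a caller can be sketched as below. The enum and `classify_add_error()` are hypothetical and only model the distinction drawn in the message (a hard I/O error versus a condition worth retrying); they are not the virtio-blk implementation.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical outcome categories for a queueing attempt. */
enum queue_result {
    QUEUE_OK,
    QUEUE_RETRY_LATER,  /* temporary resource shortage, request can be requeued */
    QUEUE_IO_ERROR      /* reported to the upper layer as a failed I/O */
};

/* Maps the return code of a (hypothetical) buffer-add call to an outcome. */
static enum queue_result classify_add_error(int err)
{
    switch (err) {
    case 0:
        return QUEUE_OK;
    case -ENOMEM:
        return QUEUE_RETRY_LATER;
    default:
        return QUEUE_IO_ERROR;
    }
}

static void classify_selfcheck(void)
{
    /* A mapping failure reported as -EIO looks like a real I/O error. */
    assert(classify_add_error(-EIO) == QUEUE_IO_ERROR);
    /* Reported as -ENOMEM, the same failure becomes recoverable. */
    assert(classify_add_error(-ENOMEM) == QUEUE_RETRY_LATER);
}
```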
2019-11-18 | mdio_bus: fix mdio_register_device when RESET_CONTROLLER is disabled | Marek Behún | 1 | -1/+2
When CONFIG_RESET_CONTROLLER is disabled, the devm_reset_control_get_exclusive function returns -ENOTSUPP. This is not handled in subsequent check and then the mdio device fails to probe. When CONFIG_RESET_CONTROLLER is enabled, its code checks in OF for reset device, and since it is not present, returns -ENOENT. -ENOENT is handled. Add -ENOTSUPP also. This happened to me when upgrading kernel on Turris Omnia. You either have to enable CONFIG_RESET_CONTROLLER or use this patch. Signed-off-by: Marek Behún <[email protected]> Fixes: 71dd6c0dff51b ("net: phy: add support for reset-controller") Cc: Dmitry Torokhov <[email protected]> Cc: Andrew Lunn <[email protected]> Cc: Andy Shevchenko <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-18 | net/ipv4: fix sysctl max for fib_multipath_hash_policy | Marcelo Ricardo Leitner | 1 | -1/+1
Commit eec4844fae7c ("proc/sysctl: add shared variables for range check") did: - .extra2 = &two, + .extra2 = SYSCTL_ONE, here, which doesn't seem to be intentional, given the changelog. This patch restores it to the previous, as the value of 2 still makes sense (used in fib_multipath_hash()). Fixes: eec4844fae7c ("proc/sysctl: add shared variables for range check") Cc: Matteo Croce <[email protected]> Signed-off-by: Marcelo Ricardo Leitner <[email protected]> Acked-by: Matteo Croce <[email protected]> Signed-off-by: David S. Miller <[email protected]>
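The consequence of the lowered upper bound can be shown with a generic range-checked setter. `set_policy()` is a hypothetical helper written for this sketch; it only mimics the idea of a min/max-checked sysctl write and is not the kernel's handler.

```c
#include <assert.h>
#include <errno.h>

/* Stores val into *policy only if it lies within [min, max]. */
static int set_policy(int *policy, int val, int min, int max)
{
    if (val < min || val > max)
        return -EINVAL;
    *policy = val;
    return 0;
}

static void policy_selfcheck(void)
{
    int policy = 0;

    /* With the maximum accidentally set to 1, the value 2 is rejected. */
    assert(set_policy(&policy, 2, 0, 1) == -EINVAL);
    assert(policy == 0);
    /* With the maximum restored to 2, it is accepted again. */
    assert(set_policy(&policy, 2, 0, 2) == 0);
    assert(policy == 2);
}
```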
2019-11-18 | phy: mdio-sun4i: add missed regulator_disable in remove | Chuhong Yuan | 1 | -0/+3
The driver forgets to disable the regulator in remove like what is done in probe failure. Add the missed call to fix it. Signed-off-by: Chuhong Yuan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-18 | net/mlx4_en: Fix wrong limitation for number of TX rings | Tariq Toukan | 2 | -4/+13
XDP_TX rings should not be limited by max_num_tx_rings_p_up. To make sure total number of TX rings never exceed MAX_TX_RINGS, add similar check in mlx4_en_alloc_tx_queue_per_tc(), where a new value is assigned for num_up. Fixes: 7e1dc5e926d5 ("net/mlx4_en: Limit the number of TX rings") Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-18 | net: sched: ensure opts_len <= IP_TUNNEL_OPTS_MAX in act_tunnel_key | Xin Long | 1 | -0/+4
info->options_len is 'u8' type, and when opts_len with a value > IP_TUNNEL_OPTS_MAX, 'info->options_len = opts_len' will cast int to u8 and set a wrong value to info->options_len. Kernel crashed in my test when doing: # opts="0102:80:00800022" # for i in {1..99}; do opts="$opts,0102:80:00800022"; done # ip link add name geneve0 type geneve dstport 0 external # tc qdisc add dev eth0 ingress # tc filter add dev eth0 protocol ip parent ffff: \ flower indev eth0 ip_proto udp action tunnel_key \ set src_ip 10.0.99.192 dst_ip 10.0.99.193 \ dst_port 6081 id 11 geneve_opts $opts \ action mirred egress redirect dev geneve0 So we should do the similar check as cls_flower does, return error when opts_len > IP_TUNNEL_OPTS_MAX in tunnel_key_copy_opts(). Fixes: 0ed5269f9e41 ("net/sched: add tunnel option support to act_tunnel_key") Signed-off-by: Xin Long <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
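The truncation can be reproduced in isolation. In this sketch `OPTS_MAX` is assumed to be 255 (the largest value an 8-bit length field can hold), the length 800 is an arbitrary oversized example, and the `-EINVAL` return is the sketch's own choice of error code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define OPTS_MAX 255  /* assumed limit: maximum of an 8-bit length field */

/* Unchecked store: an int length is silently reduced modulo 256. */
static uint8_t store_unchecked(int opts_len)
{
    return (uint8_t)opts_len;
}

/* Checked store: oversized lengths are rejected before the narrowing cast. */
static int store_checked(int opts_len, uint8_t *out)
{
    if (opts_len > OPTS_MAX)
        return -EINVAL;
    *out = (uint8_t)opts_len;
    return 0;
}

static void opts_selfcheck(void)
{
    uint8_t len = 0;

    /* 800 bytes of options are recorded as 32 (800 - 3 * 256). */
    assert(store_unchecked(800) == 32);
    /* The check refuses the oversized value and leaves 'len' untouched. */
    assert(store_checked(800, &len) == -EINVAL);
    assert(len == 0);
    assert(store_checked(255, &len) == 0 && len == 255);
}
```

A recorded length that is smaller than the data actually copied is what makes the unchecked variant dangerous.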
2019-11-18 | mlxsw: spectrum_router: Fix determining underlay for a GRE tunnel | Petr Machata | 1 | -18/+1
The helper mlxsw_sp_ipip_dev_ul_tb_id() determines the underlay VRF of a GRE tunnel. For a tunnel without a bound device, it uses the same VRF that the tunnel is in. However, in Linux, a GRE tunnel without a bound device uses the main VRF as the underlay. Fix the function accordingly. mlxsw further assumed that moving a tunnel to a different VRF could cause a conflict in the local tunnel endpoint address, which cannot be offloaded. However, the only way that an underlay could be changed by moving the tunnel device itself is if the tunnel device does not have a bound device. But in that case the underlay is always the main VRF, so there is no opportunity to introduce a conflict by moving such a device. Thus this check constitutes dead code and can be removed, which we do. Fixes: 6ddb7426a7d4 ("mlxsw: spectrum_router: Introduce loopback RIFs") Signed-off-by: Petr Machata <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-18 | net: atm: Reduce the severity of logging in unlink_clip_vcc | Aditya Pakki | 1 | -3/+3
In case of errors in unlink_clip_vcc, the logging level is set to pr_crit, but failures in clip_setentry are handled by pr_err(). This patch makes the severity consistent across invocations. Signed-off-by: Aditya Pakki <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-18 | btrfs: drop bdev argument from submit_extent_page | David Sterba | 1 | -8/+3
After previous patches removing bdev being passed around to set it to bio, it has become unused in submit_extent_page. So it now has "only" 13 parameters. Signed-off-by: David Sterba <[email protected]>
2019-11-18 | btrfs: remove extent_map::bdev | David Sterba | 8 | -31/+2
We can now remove the bdev from extent_map. Previous patches made sure that bio_set_dev is correctly in all places and that we don't need to grab it from latest_bdev or pass it around inside the extent map. Signed-off-by: David Sterba <[email protected]>
2019-11-18 | btrfs: drop bio_set_dev where not needed | David Sterba | 2 | -11/+0
bio_set_dev sets a bdev to a bio, and it does not only set a pointer but also changes some state bits if a different bdev was set before. This is one thing that's not needed. Another thing is that setting a bdev at bio allocation time is too early and actually does not work with plain redundancy profiles, where each time we submit a bio to a device, the bdev is set correctly. In many places the bio bdev is set to latest_bdev, which seems to serve as a stub pointer "just to put something to bio". But we don't have to do that. Where do we know which bdev to set:
* for regular IO: submit_stripe_bio that's called by btrfs_map_bio
* repair IO: repair_io_failure, read or write from specific device
* super block write (using buffer_heads but uses raw bdev) and barriers
* scrub: this does not use all regular IO paths as it needs to reach all copies, verify and fixup eventually, and for that all bdev management is independent
* raid56: rbio_add_io_page, for the RMW write
* integrity-checker: does its own low-level block tracking
Signed-off-by: David Sterba <[email protected]>
2019-11-18 | btrfs: get bdev directly from fs_devices in submit_extent_page | David Sterba | 1 | -1/+4
This is preparatory patch to remove @bdev parameter from submit_extent_page. It can't be removed completely, because the cgroups need it for wbc when initializing the bio wbc_init_bio bio_associate_blkg_from_css dereference bdev->bi_disk->queue The bdev pointer is the same as latest_bdev, thus no functional change. We can retrieve it from fs_devices that's reachable through several dereferences. The local variable shadows the parameter, but that's only temporary. Signed-off-by: David Sterba <[email protected]>
2019-11-18 | drm/i915: Protect request peeking with RCU | Chris Wilson | 1 | -11/+39
Since the execlists_active() is no longer protected by the engine->active.lock, we need to protect the request pointer with RCU to prevent it being freed as we evaluate whether or not we need to preempt. Fixes: df403069029d ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock") Signed-off-by: Chris Wilson <[email protected]> Cc: Mika Kuoppala <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Reviewed-by: Mika Kuoppala <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 7d148635253328dda7cfe55d57e3c828e9564427) Signed-off-by: Joonas Lahtinen <[email protected]> (cherry picked from commit 8eb4704b124cbd44f189709959137d77063ecfa1) (cherry picked from commit 7e27238e149ce4f00d9cd801fe3aa0ea55e986a2) Signed-off-by: Rodrigo Vivi <[email protected]>
2019-11-18 | btrfs: record all roots for rename exchange on a subvol | Josef Bacik | 1 | -0/+3
Testing with the new fsstress support for subvolumes uncovered a pretty bad problem with rename exchange on subvolumes. We're modifying two different subvolumes, but we only start the transaction on one of them, so the other one is not added to the dirty root list. This is caught by btrfs_cow_block() with a warning because the root has not been updated, however if we do not modify this root again we'll end up pointing at an invalid root because the root item is never updated. Fix this by making sure we add the destination root to the trans list, the same as we do with normal renames. This fixes the corruption. Fixes: cdd1fedf8261 ("btrfs: add support for RENAME_EXCHANGE and RENAME_WHITEOUT") CC: [email protected] # 4.9+ Reviewed-by: Filipe Manana <[email protected]> Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: David Sterba <[email protected]>
2019-11-18drm/i915/userptr: Try to acquire the page lock around set_page_dirty()Chris Wilson1-1/+21
set_page_dirty says: For pages with a mapping this should be done under the page lock for the benefit of asynchronous memory errors who prefer a consistent dirty state. This rule can be broken in some special cases, but should be better not to. Under those rules, it is only safe for us to use the plain set_page_dirty calls for shmemfs/anonymous memory. Userptr may be used with real mappings and so needs to use the locked version (set_page_dirty_lock). However, following a try_to_unmap() we may want to remove the userptr and so call put_pages(). However, try_to_unmap() acquires the page lock and so we must avoid recursively locking the pages ourselves -- which means that we cannot safely acquire the lock around set_page_dirty(). Since we can't be sure of the lock, we have to risk skipping dirtying the page, or else risk calling set_page_dirty() without a lock and so risk fs corruption. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203317 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112012 Fixes: 5cc9ed4b9a7a ("drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl") References: cb6d7c7dc7ff ("drm/i915/userptr: Acquire the page lock around set_page_dirty()") References: 505a8ec7e11a ("Revert "drm/i915/userptr: Acquire the page lock around set_page_dirty()"") References: 6dcc693bc57f ("ext4: warn when page is dirtied without buffers") Signed-off-by: Chris Wilson <[email protected]> Cc: Lionel Landwerlin <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: [email protected] Reviewed-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 0d4bbe3d407f79438dc4f87943db21f7134cfc65) Signed-off-by: Joonas Lahtinen <[email protected]> (cherry picked from commit cee7fb437edcdb2f9f8affa959e274997f5dca4d) Signed-off-by: Rodrigo Vivi <[email protected]>
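The trade-off described above (take the page lock if it is free, otherwise skip dirtying) can be sketched in userspace. This is only an illustrative model, not the i915 code: the `Page` class and `try_set_page_dirty` helper are hypothetical names, and a non-reentrant `threading.Lock` stands in for the kernel page lock.

```python
import threading


class Page:
    """Hypothetical stand-in for a page with a lock and a dirty flag."""

    def __init__(self):
        self.lock = threading.Lock()  # non-reentrant, like the page lock
        self.dirty = False


def try_set_page_dirty(page):
    """Dirty the page only if its lock can be taken without blocking.

    Returns True if the page was dirtied, False if the lock was already
    held (for example by a caller further up the stack) and dirtying
    was skipped to avoid recursive locking.
    """
    if not page.lock.acquire(blocking=False):
        return False  # lock already held: skip rather than deadlock
    try:
        page.dirty = True
    finally:
        page.lock.release()
    return True
```

When the lock is already held, the helper returns without touching the dirty flag, which mirrors the risk the message accepts: a skipped dirty is preferred over an unlocked one.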
2019-11-18drm/i915/pmu: "Frequency" is reported as accumulated cyclesChris Wilson1-2/+2
We report "frequencies" (actual-frequency, requested-frequency) as the number of accumulated cycles so that the average frequency over that period may be determined by the user. This means the units we report to the user are Mcycles (or just M), not MHz. Signed-off-by: Chris Wilson <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: [email protected] Reviewed-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit e88866ef02851c88fe95a4bb97820b94b4d46f36) Signed-off-by: Joonas Lahtinen <[email protected]> (cherry picked from commit a7d87b70d6da96c6772e50728c8b4e78e4cbfd55) Signed-off-by: Rodrigo Vivi <[email protected]>
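As a rough illustration of how a user might turn the accumulated counter into an average frequency, the sketch below divides the change in Mcycles by the elapsed time between two samples. The sample layout (counter value paired with a nanosecond timestamp) and the function name are assumptions made for this example, not the PMU interface itself.

```python
def average_mhz(sample_start, sample_end):
    """Average frequency in MHz between two counter samples.

    Each sample is a hypothetical (accumulated_mcycles, timestamp_ns)
    pair. Mcycles divided by seconds gives MHz.
    """
    cycles0, t0 = sample_start
    cycles1, t1 = sample_end
    elapsed_s = (t1 - t0) / 1e9
    if elapsed_s <= 0:
        raise ValueError("samples must be taken at increasing timestamps")
    return (cycles1 - cycles0) / elapsed_s
```

For example, a counter that advances by 600 Mcycles over 2 seconds corresponds to an average of 300 MHz over that period.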
2019-11-18drm/i915: Preload LUTs if the hw isn't currently using themVille Syrjälä4-0/+69
The LUTs are single buffered so in order to program them without tearing we'd have to do it during vblank (actually to be 100% effective it has to happen between start of vblank and frame start). We have no proper mechanism for that at the moment so we just defer loading them after the vblank waits have happened. That is not quite sufficient (especially when committing multiple pipes whose vblanks don't line up) so the LUT load will often leak into the following frame causing tearing. However in case the hardware wasn't previously using the LUT we can preload it before setting the enable bit (which is double buffered so won't tear). Let's determine if we can do such preloading and make it happen. Slight variation between the hardware requires some platform specifics in the checks. Hans is seeing ugly colored flash on VLV/CHV machines (GPD win and Asus T100HA) when the gamma LUT gets loaded for the first time as the BIOS has left some junk in the LUT memory. v2: Deal with uapi vs. hw crtc state split s/GCM/CGM/ typo fix Cc: Hans de Goede <[email protected]> Fixes: 051a6d8d3ca0 ("drm/i915: Move LUT programming to happen after vblank waits") Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Tested-by: Hans de Goede <[email protected]> Reviewed-by: Hans de Goede <[email protected]> (cherry picked from commit 0ccc42a2fd5107a7f58e62c8b35b61de9a70ce82) Signed-off-by: Joonas Lahtinen <[email protected]> (cherry picked from commit f77021372e2880237278e0ee57faadc077a8256a) Signed-off-by: Rodrigo Vivi <[email protected]>
2019-11-18drm/i915: Don't oops in dumb_create ioctl if we have no crtcsVille Syrjälä1-0/+3
Make sure we have a crtc before probing its primary plane's max stride. Initially I thought we can't get this far without crtcs, but looks like we can via the dumb_create ioctl. Not sure if we shouldn't disable dumb buffer support entirely when we have no crtcs, but that would require some amount of work as the only thing currently being checked is dev->driver->dumb_create which we'd have to convert to some device specific dynamic thing. Cc: [email protected] Reported-by: Mika Kuoppala <[email protected]> Fixes: aa5ca8b7421c ("drm/i915: Align dumb buffer stride to 4k to allow for gtt remapping") Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Chris Wilson <[email protected]> (cherry picked from commit baea9ffe64200033499a4955f431e315bb807899) Signed-off-by: Joonas Lahtinen <[email protected]> (cherry picked from commit aeec766133f99d45aad60d650de50fb382104d95) Signed-off-by: Rodrigo Vivi <[email protected]>
2019-11-18Btrfs: fix block group remaining RO forever after error during device replaceFilipe Manana3-45/+2
When doing a device replace, while at scrub.c:scrub_enumerate_chunks(), we set the block group to RO mode and then wait for any ongoing writes into extents of the block group to complete. While doing that wait we overwrite the value of the variable 'ret' and can break out of the loop if an error happens without turning the block group back into RW mode. So what happens is the following: 1) btrfs_inc_block_group_ro() returns 0, meaning it set the block group to RO mode (its ->ro field set to 1 or incremented to some value > 1); 2) Then btrfs_wait_ordered_roots() returns a value > 0; 3) Then if either joining or committing the transaction fails, we break out of the loop without calling btrfs_dec_block_group_ro(), leaving the block group in RO mode forever. To fix this, just remove the code that waits for ongoing writes to extents of the block group, since it's not needed because in the initial setup phase of a device replace operation, before starting to find all chunks and their extents, we set the target device for replace while holding fs_info->dev_replace->rwsem, which ensures that after releasing that semaphore, any writes into the source device are made to the target device as well (__btrfs_map_block() guarantees that). So while at scrub_enumerate_chunks() we only need to worry about finding and copying extents (from the source device to the target device) that were written before we started the device replace operation. Fixes: f0e9b7d6401959 ("Btrfs: fix race setting block group readonly during device replace") Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]>
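The three-step failure described above can be reproduced in a toy model. The sketch below is not the btrfs code: `BlockGroup`, `enumerate_chunks_buggy`, `enumerate_chunks_fixed` and the callback parameters are hypothetical names chosen for illustration, and the loop bodies are reduced to the RO accounting only.

```python
class BlockGroup:
    """Hypothetical stand-in for a block group."""

    def __init__(self):
        self.ro = 0  # read-only count; 0 means read-write


def enumerate_chunks_buggy(groups, wait_ordered, commit_transaction):
    """Model of the old loop: an error after the wait skips the RO undo."""
    ret = 0
    for bg in groups:
        bg.ro += 1                       # step 1: group set to RO
        ret = wait_ordered()             # step 2: overwrites ret, may be > 0
        if ret > 0:
            ret = commit_transaction()   # step 3: join/commit may fail
            if ret:
                break                    # bug: bg.ro is never decremented
        bg.ro -= 1                       # normal path: back to RW
    return ret


def enumerate_chunks_fixed(groups):
    """Model of the fix: with the wait removed, no early exit sits
    between setting the group RO and restoring it."""
    for bg in groups:
        bg.ro += 1
        # ... copy extents written before the replace started ...
        bg.ro -= 1
    return 0
```

Calling `enumerate_chunks_buggy` with a wait callback that returns a positive value and a commit callback that returns an error leaves the group with a non-zero `ro` count, which is the "RO forever" state from the commit message.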