aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-11-24drm/amd/amdgpu: fix null pointer in runtime pmKenneth Feng1-2/+2
fix the null pointer issue when runtime pm is triggered. Signed-off-by: Kenneth Feng <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2020-11-24drm/i915/gt: Defer enabling the breadcrumb interrupt to after submissionChris Wilson1-39/+70
Move the register slow register write and readback from out of the critical path for execlists submission and delay it until the following worker, shaving off around 200us. Note that the same signal_irq_work() is allowed to run concurrently on each CPU (but it will only be queued once, once running though it can be requeued and reexecuted) so we have to remember to lock the global interactions as we cannot rely on the signal_irq_work() itself providing the serialisation (in constrast to a tasklet). By pushing the arm/disarm into the central signaling worker we can close the race for disarming the interrupt (and dropping its associated GT wakeref) on parking the engine. If we loose the race, that GT wakeref may be held indefinitely, preventing the machine from sleeping while the GPU is ostensibly idle. v2: Move the self-arming parking of the signal_irq_work to a flush of the irq-work from intel_breadcrumbs_park(). Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2271 Fixes: e23005604b2f ("drm/i915/gt: Hold context/request reference while breadcrumbs are active") Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 9d5612ca165a58aacc160465532e7998b9aab270) Signed-off-by: Rodrigo Vivi <[email protected]>
2020-11-24drm/i915/gvt: correct a false comment of flag F_UNALIGNYan Zhao1-1/+1
Correct falsely removed comment of flag F_UNALIGN. Fixes: a6c5817a38cf ("drm/i915/gvt: remove flag F_CMD_ACCESSED") Reviewed-by: Zhenyu Wang <[email protected]> Signed-off-by: Yan Zhao <[email protected]> Signed-off-by: Zhenyu Wang <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 6594094f819e0020e926e137e47e2edb97ba500b) Signed-off-by: Rodrigo Vivi <[email protected]>
2020-11-24drm/i915/perf: workaround register corruption in OATAILPTRLionel Landwerlin2-2/+9
After having written the entire OA buffer with reports, the HW will write again at the beginning of the OA buffer. It'll indicate it by setting the WRAP bits in the OASTATUS register. When a wrap happens and that at the end of the read vfunc we write the OASTATUS register back to clear the REPORT_LOST bit, we sometimes see that the OATAILPTR register is reset to a previous position on Gen8/9 (apparently not the case on Gen11+). This leads the next call to the read vfunc to process reports we've already read. Because we've marked those as read by clearing the reason & timestamp dwords, they're discarded and a "Skipping spurious, invalid OA report" message is emitted. The workaround to avoid this OATAILPTR value reset seems to be to set the wrap bits when writing back OASTATUS. This change has no impact on userspace, it only avoids a bunch of DRM_NOTE("Skipping spurious, invalid OA report\n") messages. Signed-off-by: Lionel Landwerlin <[email protected]> Fixes: 19f81df2859eb1 ("drm/i915/perf: Add OA unit support for Gen 8+") Reviewed-by: Umesh Nerlige Ramappa <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 059a0beb486344a577ff476acce75e69eab704be) Signed-off-by: Rodrigo Vivi <[email protected]>
2020-11-24intel_idle: Fix intel_idle() vs tracingPeter Zijlstra1-17/+20
cpuidle->enter() callbacks should not call into tracing because RCU has already been disabled. Instead of doing the broadcast thing itself, simply advertise to the cpuidle core that those states stop the timer. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-11-24sched/idle: Fix arch_cpu_idle() vs tracingPeter Zijlstra23-38/+64
We call arch_cpu_idle() with RCU disabled, but then use local_irq_{en,dis}able(), which invokes tracing, which relies on RCU. Switch all arch_cpu_idle() implementations to use raw_local_irq_{en,dis}able() and carefully manage the lockdep,rcu,tracing state like we do in entry. (XXX: we really should change arch_cpu_idle() to not return with interrupts enabled) Reported-by: Sven Schnelle <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Mark Rutland <[email protected]> Tested-by: Mark Rutland <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-11-24io_uring: fix ITER_BVEC checkPavel Begunkov1-1/+1
iov_iter::type is a bitmask that also keeps direction etc., so it shouldn't be directly compared against ITER_*. Use proper helper. Fixes: ff6165b2d7f6 ("io_uring: retain iov_iter state over io_read/io_write calls") Reported-by: David Howells <[email protected]> Signed-off-by: Pavel Begunkov <[email protected]> Cc: <[email protected]> # 5.9 Signed-off-by: Jens Axboe <[email protected]>
2020-11-24io_uring: fix shift-out-of-bounds when round up cq sizeJoseph Qi1-2/+4
Abaci Fuzz reported a shift-out-of-bounds BUG in io_uring_create(): [ 59.598207] UBSAN: shift-out-of-bounds in ./include/linux/log2.h:57:13 [ 59.599665] shift exponent 64 is too large for 64-bit type 'long unsigned int' [ 59.601230] CPU: 0 PID: 963 Comm: a.out Not tainted 5.10.0-rc4+ #3 [ 59.602502] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 [ 59.603673] Call Trace: [ 59.604286] dump_stack+0x107/0x163 [ 59.605237] ubsan_epilogue+0xb/0x5a [ 59.606094] __ubsan_handle_shift_out_of_bounds.cold+0xb2/0x20e [ 59.607335] ? lock_downgrade+0x6c0/0x6c0 [ 59.608182] ? rcu_read_lock_sched_held+0xaf/0xe0 [ 59.609166] io_uring_create.cold+0x99/0x149 [ 59.610114] io_uring_setup+0xd6/0x140 [ 59.610975] ? io_uring_create+0x2510/0x2510 [ 59.611945] ? lockdep_hardirqs_on_prepare+0x286/0x400 [ 59.613007] ? syscall_enter_from_user_mode+0x27/0x80 [ 59.614038] ? trace_hardirqs_on+0x5b/0x180 [ 59.615056] do_syscall_64+0x2d/0x40 [ 59.615940] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 59.617007] RIP: 0033:0x7f2bb8a0b239 This is caused by roundup_pow_of_two() if the input entries larger enough, e.g. 2^32-1. For sq_entries, it will check first and we allow at most IORING_MAX_ENTRIES, so it is okay. But for cq_entries, we do round up first, that may overflow and truncate it to 0, which is not the expected behavior. So check the cq size first and then do round up. Fixes: 88ec3211e463 ("io_uring: round-up cq size before comparing with rounded sq size") Reported-by: Abaci Fuzz <[email protected]> Signed-off-by: Joseph Qi <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-11-24spi: imx: fix the unbalanced spi runtime pm managementClark Wang1-0/+1
If set active without increase the usage count of pm, the dont use autosuspend function will call the suspend callback to close the two clocks of spi because the usage count is reduced to -1. This will cause the warning dump below when the defer-probe occurs. [ 129.379701] ecspi2_root_clk already disabled [ 129.384005] WARNING: CPU: 1 PID: 33 at drivers/clk/clk.c:952 clk_core_disable+0xa4/0xb0 So add the get noresume function before set active. Fixes: 43b6bf406cd0 spi: imx: fix runtime pm support for !CONFIG_PM Signed-off-by: Clark Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2020-11-24firmware: xilinx: Use hash-table for api feature checkAmit Sunil Dhamne2-18/+49
Currently array of fix length PM_API_MAX is used to cache the pm_api version (valid or invalid). However ATF based PM APIs values are much higher then PM_API_MAX. So to include ATF based PM APIs also, use hash-table to store the pm_api version status. Signed-off-by: Amit Sunil Dhamne <[email protected]> Reported-by: Arnd Bergmann <[email protected]> Signed-off-by: Ravi Patel <[email protected]> Signed-off-by: Rajan Vaja <[email protected]> Reviewed-by: Arnd Bergmann <[email protected]> Tested-by: Michal Simek <[email protected]> Fixes: f3217d6f2f7a ("firmware: xilinx: fix out-of-bounds access") Cc: stable <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michal Simek <[email protected]>
2020-11-24firmware: xilinx: Fix SD DLL node reset issueManish Narani1-1/+1
Fix the SD DLL node reset issue where incorrect node is being referenced instead of SD DLL node. Fixes: 426c8d85df7a ("firmware: xilinx: Use APIs instead of IOCTLs") Signed-off-by: Manish Narani <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michal Simek <[email protected]>
2020-11-24x86/resctrl: Add necessary kernfs_put() calls to prevent refcount leakXiaochen Shen1-7/+25
On resource group creation via a mkdir an extra kernfs_node reference is obtained by kernfs_get() to ensure that the rdtgroup structure remains accessible for the rdtgroup_kn_unlock() calls where it is removed on deletion. Currently the extra kernfs_node reference count is only dropped by kernfs_put() in rdtgroup_kn_unlock() while the rdtgroup structure is removed in a few other locations that lack the matching reference drop. In call paths of rmdir and umount, when a control group is removed, kernfs_remove() is called to remove the whole kernfs nodes tree of the control group (including the kernfs nodes trees of all child monitoring groups), and then rdtgroup structure is freed by kfree(). The rdtgroup structures of all child monitoring groups under the control group are freed by kfree() in free_all_child_rdtgrp(). Before calling kfree() to free the rdtgroup structures, the kernfs node of the control group itself as well as the kernfs nodes of all child monitoring groups still take the extra references which will never be dropped to 0 and the kernfs nodes will never be freed. It leads to reference count leak and kernfs_node_cache memory leak. For example, reference count leak is observed in these two cases: (1) mount -t resctrl resctrl /sys/fs/resctrl mkdir /sys/fs/resctrl/c1 mkdir /sys/fs/resctrl/c1/mon_groups/m1 umount /sys/fs/resctrl (2) mkdir /sys/fs/resctrl/c1 mkdir /sys/fs/resctrl/c1/mon_groups/m1 rmdir /sys/fs/resctrl/c1 The same reference count leak issue also exists in the error exit paths of mkdir in mkdir_rdt_prepare() and rdtgroup_mkdir_ctrl_mon(). Fix this issue by following changes to make sure the extra kernfs_node reference on rdtgroup is dropped before freeing the rdtgroup structure. (1) Introduce rdtgroup removal helper rdtgroup_remove() to wrap up kernfs_put() and kfree(). (2) Call rdtgroup_remove() in rdtgroup removal path where the rdtgroup structure is about to be freed by kfree(). (3) Call rdtgroup_remove() or kernfs_put() as appropriate in the error exit paths of mkdir where an extra reference is taken by kernfs_get(). Fixes: f3cbeacaa06e ("x86/intel_rdt/cqm: Add rmdir support") Fixes: e02737d5b826 ("x86/intel_rdt: Add tasks files") Fixes: 60cf5e101fd4 ("x86/intel_rdt: Add mkdir to resctrl file system") Reported-by: Willem de Bruijn <[email protected]> Signed-off-by: Xiaochen Shen <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2020-11-24x86/resctrl: Remove superfluous kernfs_get() calls to prevent refcount leakXiaochen Shen1-33/+2
Willem reported growing of kernfs_node_cache entries in slabtop when repeatedly creating and removing resctrl subdirectories as well as when repeatedly mounting and unmounting the resctrl filesystem. On resource group (control as well as monitoring) creation via a mkdir an extra kernfs_node reference is obtained to ensure that the rdtgroup structure remains accessible for the rdtgroup_kn_unlock() calls where it is removed on deletion. The kernfs_node reference count is dropped by kernfs_put() in rdtgroup_kn_unlock(). With the above explaining the need for one kernfs_get()/kernfs_put() pair in resctrl there are more places where a kernfs_node reference is obtained without a corresponding release. The excessive amount of reference count on kernfs nodes will never be dropped to 0 and the kernfs nodes will never be freed in the call paths of rmdir and umount. It leads to reference count leak and kernfs_node_cache memory leak. Remove the superfluous kernfs_get() calls and expand the existing comments surrounding the remaining kernfs_get()/kernfs_put() pair that remains in use. Superfluous kernfs_get() calls are removed from two areas: (1) In call paths of mount and mkdir, when kernfs nodes for "info", "mon_groups" and "mon_data" directories and sub-directories are created, the reference count of newly created kernfs node is set to 1. But after kernfs_create_dir() returns, superfluous kernfs_get() are called to take an additional reference. (2) kernfs_get() calls in rmdir call paths. Fixes: 17eafd076291 ("x86/intel_rdt: Split resource group removal in two") Fixes: 4af4a88e0c92 ("x86/intel_rdt/cqm: Add mount,umount support") Fixes: f3cbeacaa06e ("x86/intel_rdt/cqm: Add rmdir support") Fixes: d89b7379015f ("x86/intel_rdt/cqm: Add mon_data") Fixes: c7d9aac61311 ("x86/intel_rdt/cqm: Add mkdir support for RDT monitoring") Fixes: 5dc1d5c6bac2 ("x86/intel_rdt: Simplify info and base file lists") Fixes: 60cf5e101fd4 ("x86/intel_rdt: Add mkdir to resctrl file system") Fixes: 4e978d06dedb ("x86/intel_rdt: Add "info" files to resctrl file system") Reported-by: Willem de Bruijn <[email protected]> Signed-off-by: Xiaochen Shen <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Reinette Chatre <[email protected]> Tested-by: Willem de Bruijn <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]
2020-11-23net/packet: fix packet receive on L3 devices without visible hard headerEyal Birger2-9/+14
In the patchset merged by commit b9fcf0a0d826 ("Merge branch 'support-AF_PACKET-for-layer-3-devices'") L3 devices which did not have header_ops were given one for the purpose of protocol parsing on af_packet transmit path. That change made af_packet receive path regard these devices as having a visible L3 header and therefore aligned incoming skb->data to point to the skb's mac_header. Some devices, such as ipip, xfrmi, and others, do not reset their mac_header prior to ingress and therefore their incoming packets became malformed. Ideally these devices would reset their mac headers, or af_packet would be able to rely on dev->hard_header_len being 0 for such cases, but it seems this is not the case. Fix by changing af_packet RX ll visibility criteria to include the existence of a '.create()' header operation, which is used when creating a device hard header - via dev_hard_header() - by upper layers, and does not exist in these L3 devices. As this predicate may be useful in other situations, add it as a common dev_has_header() helper in netdevice.h. Fixes: b9fcf0a0d826 ("Merge branch 'support-AF_PACKET-for-layer-3-devices'") Signed-off-by: Eyal Birger <[email protected]> Acked-by: Jason A. Donenfeld <[email protected]> Acked-by: Willem de Bruijn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-23soc: fsl: dpio: Get the cpumask through cpumask_of(cpu)Hao Si1-4/+1
The local variable 'cpumask_t mask' is in the stack memory, and its address is assigned to 'desc->affinity' in 'irq_set_affinity_hint()'. But the memory area where this variable is located is at risk of being modified. During LTP testing, the following error was generated: Unable to handle kernel paging request at virtual address ffff000012e9b790 Mem abort info: ESR = 0x96000007 Exception class = DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000007 CM = 0, WnR = 0 swapper pgtable: 4k pages, 48-bit VAs, pgdp = 0000000075ac5e07 [ffff000012e9b790] pgd=00000027dbffe003, pud=00000027dbffd003, pmd=00000027b6d61003, pte=0000000000000000 Internal error: Oops: 96000007 [#1] PREEMPT SMP Modules linked in: xt_conntrack Process read_all (pid: 20171, stack limit = 0x0000000044ea4095) CPU: 14 PID: 20171 Comm: read_all Tainted: G B W Hardware name: NXP Layerscape LX2160ARDB (DT) pstate: 80000085 (Nzcv daIf -PAN -UAO) pc : irq_affinity_hint_proc_show+0x54/0xb0 lr : irq_affinity_hint_proc_show+0x4c/0xb0 sp : ffff00001138bc10 x29: ffff00001138bc10 x28: 0000ffffd131d1e0 x27: 00000000007000c0 x26: ffff8025b9480dc0 x25: ffff8025b9480da8 x24: 00000000000003ff x23: ffff8027334f8300 x22: ffff80272e97d000 x21: ffff80272e97d0b0 x20: ffff8025b9480d80 x19: ffff000009a49000 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000040 x11: 0000000000000000 x10: ffff802735b79b88 x9 : 0000000000000000 x8 : 0000000000000000 x7 : ffff000009a49848 x6 : 0000000000000003 x5 : 0000000000000000 x4 : ffff000008157d6c x3 : ffff00001138bc10 x2 : ffff000012e9b790 x1 : 0000000000000000 x0 : 0000000000000000 Call trace: irq_affinity_hint_proc_show+0x54/0xb0 seq_read+0x1b0/0x440 proc_reg_read+0x80/0xd8 __vfs_read+0x60/0x178 vfs_read+0x94/0x150 ksys_read+0x74/0xf0 __arm64_sys_read+0x24/0x30 el0_svc_common.constprop.0+0xd8/0x1a0 el0_svc_handler+0x34/0x88 el0_svc+0x10/0x14 Code: f9001bbf 943e0732 f94066c2 b4000062 (f9400041) ---[ end trace b495bdcb0b3b732b ]--- Kernel panic - not syncing: Fatal exception SMP: stopping secondary CPUs SMP: failed to stop secondary CPUs 0,2-4,6,8,11,13-15 Kernel Offset: disabled CPU features: 0x0,21006008 Memory Limit: none ---[ end Kernel panic - not syncing: Fatal exception ]--- Fix it by using 'cpumask_of(cpu)' to get the cpumask. Signed-off-by: Hao Si <[email protected]> Signed-off-by: Lin Chen <[email protected]> Signed-off-by: Yi Wang <[email protected]> Signed-off-by: Li Yang <[email protected]>
2020-11-23i40e: Fix removing driver while bare-metal VFs pass trafficSylwester Dziedziuch3-18/+31
Prevent VFs from resetting when PF driver is being unloaded: - introduce new pf state: __I40E_VF_RESETS_DISABLED; - check if pf state has __I40E_VF_RESETS_DISABLED state set, if so, disable any further VFLR event notifications; - when i40e_remove (rmmod i40e) is called, disable any resets on the VFs; Previously if there were bare-metal VFs passing traffic and PF driver was removed, there was a possibility of VFs triggering a Tx timeout right before iavf_remove. This was causing iavf_close to not be called because there is a check in the beginning of iavf_remove that bails out early if adapter->state < IAVF_DOWN_PENDING. This makes it so some resources do not get cleaned up. Fixes: 6a9ddb36eeb8 ("i40e: disable IOV before freeing resources") Signed-off-by: Slawomir Laba <[email protected]> Signed-off-by: Brett Creeley <[email protected]> Signed-off-by: Sylwester Dziedziuch <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-23vsock/virtio: discard packets only when socket is really closedStefano Garzarella1-3/+5
Starting from commit 8692cefc433f ("virtio_vsock: Fix race condition in virtio_transport_recv_pkt"), we discard packets in virtio_transport_recv_pkt() if the socket has been released. When the socket is connected, we schedule a delayed work to wait the RST packet from the other peer, also if SHUTDOWN_MASK is set in sk->sk_shutdown. This is done to complete the virtio-vsock shutdown algorithm, releasing the port assigned to the socket definitively only when the other peer has consumed all the packets. If we discard the RST packet received, the socket will be closed only when the VSOCK_CLOSE_TIMEOUT is reached. Sergio discovered the issue while running ab(1) HTTP benchmark using libkrun [1] and observing a latency increase with that commit. To avoid this issue, we discard packet only if the socket is really closed (SOCK_DONE flag is set). We also set SOCK_DONE in virtio_transport_release() when we don't need to wait any packets from the other peer (we didn't schedule the delayed work). In this case we remove the socket from the vsock lists, releasing the port assigned. [1] https://github.com/containers/libkrun Fixes: 8692cefc433f ("virtio_vsock: Fix race condition in virtio_transport_recv_pkt") Cc: [email protected] Reported-by: Sergio Lopez <[email protected]> Tested-by: Sergio Lopez <[email protected]> Signed-off-by: Stefano Garzarella <[email protected]> Acked-by: Jia He <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-23tcp: fix race condition when creating child sockets from syncookiesRicardo Dias7-16/+91
When the TCP stack is in SYN flood mode, the server child socket is created from the SYN cookie received in a TCP packet with the ACK flag set. The child socket is created when the server receives the first TCP packet with a valid SYN cookie from the client. Usually, this packet corresponds to the final step of the TCP 3-way handshake, the ACK packet. But is also possible to receive a valid SYN cookie from the first TCP data packet sent by the client, and thus create a child socket from that SYN cookie. Since a client socket is ready to send data as soon as it receives the SYN+ACK packet from the server, the client can send the ACK packet (sent by the TCP stack code), and the first data packet (sent by the userspace program) almost at the same time, and thus the server will equally receive the two TCP packets with valid SYN cookies almost at the same instant. When such event happens, the TCP stack code has a race condition that occurs between the momement a lookup is done to the established connections hashtable to check for the existence of a connection for the same client, and the moment that the child socket is added to the established connections hashtable. As a consequence, this race condition can lead to a situation where we add two child sockets to the established connections hashtable and deliver two sockets to the userspace program to the same client. This patch fixes the race condition by checking if an existing child socket exists for the same client when we are adding the second child socket to the established connections socket. If an existing child socket exists, we drop the packet and discard the second child socket to the same client. Signed-off-by: Ricardo Dias <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-23Merge tag 'hyperv-fixes-signed' of ↵Linus Torvalds1-1/+6
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull Hyper-V fix from Wei Liu: "One patch from Dexuan to fix VRAM cache type in Hyper-V framebuffer driver" * tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: video: hyperv_fb: Fix the cache type when mapping the VRAM
2020-11-23Merge tag 'wireless-drivers-2020-11-23' of ↵Jakub Kicinski10-57/+164
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for v5.10 First set of fixes for v5.10. One fix for iwlwifi kernel panic, others less notable. rtw88 * fix a bogus test found by clang iwlwifi * fix long memory reads causing soft lockup warnings * fix kernel panic during Channel Switch Announcement (CSA) * other smaller fixes MAINTAINERS * email address updates * tag 'wireless-drivers-2020-11-23' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers: iwlwifi: mvm: fix kernel panic in case of assert during CSA iwlwifi: pcie: set LTR to avoid completion timeout iwlwifi: mvm: write queue_sync_state only for sync iwlwifi: mvm: properly cancel a session protection for P2P iwlwifi: mvm: use the HOT_SPOT_CMD to cancel an AUX ROC iwlwifi: sta: set max HE max A-MPDU according to HE capa MAINTAINERS: update maintainers list for Cypress MAINTAINERS: update Yan-Hsuan's email address iwlwifi: pcie: limit memory read spin time rtw88: fix fw_fifo_addr check ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-23IB/mthca: fix return value of error branch in mthca_init_cq()Xiongfeng Wang1-4/+6
We return 'err' in the error branch, but this variable may be set as zero by the above code. Fix it by setting 'err' as a negative value before we goto the error label. Fixes: 74c2174e7be5 ("IB uverbs: add mthca user CQ support") Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Link: https://lore.kernel.org/r/[email protected] Reported-by: Hulk Robot <[email protected]> Signed-off-by: Xiongfeng Wang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-11-23btrfs: fix lockdep splat when enabling and disabling qgroupsFilipe Manana2-9/+53
When running test case btrfs/017 from fstests, lockdep reported the following splat: [ 1297.067385] ====================================================== [ 1297.067708] WARNING: possible circular locking dependency detected [ 1297.068022] 5.10.0-rc4-btrfs-next-73 #1 Not tainted [ 1297.068322] ------------------------------------------------------ [ 1297.068629] btrfs/189080 is trying to acquire lock: [ 1297.068929] ffff9f2725731690 (sb_internal#2){.+.+}-{0:0}, at: btrfs_quota_enable+0xaf/0xa70 [btrfs] [ 1297.069274] but task is already holding lock: [ 1297.069868] ffff9f2702b61a08 (&fs_info->qgroup_ioctl_lock){+.+.}-{3:3}, at: btrfs_quota_enable+0x3b/0xa70 [btrfs] [ 1297.070219] which lock already depends on the new lock. [ 1297.071131] the existing dependency chain (in reverse order) is: [ 1297.071721] -> #1 (&fs_info->qgroup_ioctl_lock){+.+.}-{3:3}: [ 1297.072375] lock_acquire+0xd8/0x490 [ 1297.072710] __mutex_lock+0xa3/0xb30 [ 1297.073061] btrfs_qgroup_inherit+0x59/0x6a0 [btrfs] [ 1297.073421] create_subvol+0x194/0x990 [btrfs] [ 1297.073780] btrfs_mksubvol+0x3fb/0x4a0 [btrfs] [ 1297.074133] __btrfs_ioctl_snap_create+0x119/0x1a0 [btrfs] [ 1297.074498] btrfs_ioctl_snap_create+0x58/0x80 [btrfs] [ 1297.074872] btrfs_ioctl+0x1a90/0x36f0 [btrfs] [ 1297.075245] __x64_sys_ioctl+0x83/0xb0 [ 1297.075617] do_syscall_64+0x33/0x80 [ 1297.075993] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 1297.076380] -> #0 (sb_internal#2){.+.+}-{0:0}: [ 1297.077166] check_prev_add+0x91/0xc60 [ 1297.077572] __lock_acquire+0x1740/0x3110 [ 1297.077984] lock_acquire+0xd8/0x490 [ 1297.078411] start_transaction+0x3c5/0x760 [btrfs] [ 1297.078853] btrfs_quota_enable+0xaf/0xa70 [btrfs] [ 1297.079323] btrfs_ioctl+0x2c60/0x36f0 [btrfs] [ 1297.079789] __x64_sys_ioctl+0x83/0xb0 [ 1297.080232] do_syscall_64+0x33/0x80 [ 1297.080680] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 1297.081139] other info that might help us debug this: [ 1297.082536] Possible unsafe locking scenario: [ 1297.083510] CPU0 CPU1 [ 1297.084005] ---- ---- [ 1297.084500] lock(&fs_info->qgroup_ioctl_lock); [ 1297.084994] lock(sb_internal#2); [ 1297.085485] lock(&fs_info->qgroup_ioctl_lock); [ 1297.085974] lock(sb_internal#2); [ 1297.086454] *** DEADLOCK *** [ 1297.087880] 3 locks held by btrfs/189080: [ 1297.088324] #0: ffff9f2725731470 (sb_writers#14){.+.+}-{0:0}, at: btrfs_ioctl+0xa73/0x36f0 [btrfs] [ 1297.088799] #1: ffff9f2702b60cc0 (&fs_info->subvol_sem){++++}-{3:3}, at: btrfs_ioctl+0x1f4d/0x36f0 [btrfs] [ 1297.089284] #2: ffff9f2702b61a08 (&fs_info->qgroup_ioctl_lock){+.+.}-{3:3}, at: btrfs_quota_enable+0x3b/0xa70 [btrfs] [ 1297.089771] stack backtrace: [ 1297.090662] CPU: 5 PID: 189080 Comm: btrfs Not tainted 5.10.0-rc4-btrfs-next-73 #1 [ 1297.091132] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 1297.092123] Call Trace: [ 1297.092629] dump_stack+0x8d/0xb5 [ 1297.093115] check_noncircular+0xff/0x110 [ 1297.093596] check_prev_add+0x91/0xc60 [ 1297.094076] ? kvm_clock_read+0x14/0x30 [ 1297.094553] ? kvm_sched_clock_read+0x5/0x10 [ 1297.095029] __lock_acquire+0x1740/0x3110 [ 1297.095510] lock_acquire+0xd8/0x490 [ 1297.095993] ? btrfs_quota_enable+0xaf/0xa70 [btrfs] [ 1297.096476] start_transaction+0x3c5/0x760 [btrfs] [ 1297.096962] ? btrfs_quota_enable+0xaf/0xa70 [btrfs] [ 1297.097451] btrfs_quota_enable+0xaf/0xa70 [btrfs] [ 1297.097941] ? btrfs_ioctl+0x1f4d/0x36f0 [btrfs] [ 1297.098429] btrfs_ioctl+0x2c60/0x36f0 [btrfs] [ 1297.098904] ? do_user_addr_fault+0x20c/0x430 [ 1297.099382] ? kvm_clock_read+0x14/0x30 [ 1297.099854] ? kvm_sched_clock_read+0x5/0x10 [ 1297.100328] ? sched_clock+0x5/0x10 [ 1297.100801] ? sched_clock_cpu+0x12/0x180 [ 1297.101272] ? __x64_sys_ioctl+0x83/0xb0 [ 1297.101739] __x64_sys_ioctl+0x83/0xb0 [ 1297.102207] do_syscall_64+0x33/0x80 [ 1297.102673] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 1297.103148] RIP: 0033:0x7f773ff65d87 This is because during the quota enable ioctl we lock first the mutex qgroup_ioctl_lock and then start a transaction, and starting a transaction acquires a fs freeze semaphore (at the VFS level). However, every other code path, except for the quota disable ioctl path, we do the opposite: we start a transaction and then lock the mutex. So fix this by making the quota enable and disable paths to start the transaction without having the mutex locked, and then, after starting the transaction, lock the mutex and check if some other task already enabled or disabled the quotas, bailing with success if that was the case. Signed-off-by: Filipe Manana <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2020-11-23btrfs: do nofs allocations when adding and removing qgroup relationsFilipe Manana1-0/+9
When adding or removing a qgroup relation we are doing a GFP_KERNEL allocation which is not safe because we are holding a transaction handle open and that can make us deadlock if the allocator needs to recurse into the filesystem. So just surround those calls with a nofs context. Signed-off-by: Filipe Manana <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2020-11-23btrfs: fix lockdep splat when reading qgroup config on mountFilipe Manana1-1/+1
Lockdep reported the following splat when running test btrfs/190 from fstests: [ 9482.126098] ====================================================== [ 9482.126184] WARNING: possible circular locking dependency detected [ 9482.126281] 5.10.0-rc4-btrfs-next-73 #1 Not tainted [ 9482.126365] ------------------------------------------------------ [ 9482.126456] mount/24187 is trying to acquire lock: [ 9482.126534] ffffa0c869a7dac0 (&fs_info->qgroup_rescan_lock){+.+.}-{3:3}, at: qgroup_rescan_init+0x43/0xf0 [btrfs] [ 9482.126647] but task is already holding lock: [ 9482.126777] ffffa0c892ebd3a0 (btrfs-quota-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x27/0x120 [btrfs] [ 9482.126886] which lock already depends on the new lock. [ 9482.127078] the existing dependency chain (in reverse order) is: [ 9482.127213] -> #1 (btrfs-quota-00){++++}-{3:3}: [ 9482.127366] lock_acquire+0xd8/0x490 [ 9482.127436] down_read_nested+0x45/0x220 [ 9482.127528] __btrfs_tree_read_lock+0x27/0x120 [btrfs] [ 9482.127613] btrfs_read_lock_root_node+0x41/0x130 [btrfs] [ 9482.127702] btrfs_search_slot+0x514/0xc30 [btrfs] [ 9482.127788] update_qgroup_status_item+0x72/0x140 [btrfs] [ 9482.127877] btrfs_qgroup_rescan_worker+0xde/0x680 [btrfs] [ 9482.127964] btrfs_work_helper+0xf1/0x600 [btrfs] [ 9482.128039] process_one_work+0x24e/0x5e0 [ 9482.128110] worker_thread+0x50/0x3b0 [ 9482.128181] kthread+0x153/0x170 [ 9482.128256] ret_from_fork+0x22/0x30 [ 9482.128327] -> #0 (&fs_info->qgroup_rescan_lock){+.+.}-{3:3}: [ 9482.128464] check_prev_add+0x91/0xc60 [ 9482.128551] __lock_acquire+0x1740/0x3110 [ 9482.128623] lock_acquire+0xd8/0x490 [ 9482.130029] __mutex_lock+0xa3/0xb30 [ 9482.130590] qgroup_rescan_init+0x43/0xf0 [btrfs] [ 9482.131577] btrfs_read_qgroup_config+0x43a/0x550 [btrfs] [ 9482.132175] open_ctree+0x1228/0x18a0 [btrfs] [ 9482.132756] btrfs_mount_root.cold+0x13/0xed [btrfs] [ 9482.133325] legacy_get_tree+0x30/0x60 [ 9482.133866] vfs_get_tree+0x28/0xe0 [ 9482.134392] fc_mount+0xe/0x40 [ 9482.134908] vfs_kern_mount.part.0+0x71/0x90 [ 9482.135428] btrfs_mount+0x13b/0x3e0 [btrfs] [ 9482.135942] legacy_get_tree+0x30/0x60 [ 9482.136444] vfs_get_tree+0x28/0xe0 [ 9482.136949] path_mount+0x2d7/0xa70 [ 9482.137438] do_mount+0x75/0x90 [ 9482.137923] __x64_sys_mount+0x8e/0xd0 [ 9482.138400] do_syscall_64+0x33/0x80 [ 9482.138873] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 9482.139346] other info that might help us debug this: [ 9482.140735] Possible unsafe locking scenario: [ 9482.141594] CPU0 CPU1 [ 9482.142011] ---- ---- [ 9482.142411] lock(btrfs-quota-00); [ 9482.142806] lock(&fs_info->qgroup_rescan_lock); [ 9482.143216] lock(btrfs-quota-00); [ 9482.143629] lock(&fs_info->qgroup_rescan_lock); [ 9482.144056] *** DEADLOCK *** [ 9482.145242] 2 locks held by mount/24187: [ 9482.145637] #0: ffffa0c8411c40e8 (&type->s_umount_key#44/1){+.+.}-{3:3}, at: alloc_super+0xb9/0x400 [ 9482.146061] #1: ffffa0c892ebd3a0 (btrfs-quota-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x27/0x120 [btrfs] [ 9482.146509] stack backtrace: [ 9482.147350] CPU: 1 PID: 24187 Comm: mount Not tainted 5.10.0-rc4-btrfs-next-73 #1 [ 9482.147788] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 9482.148709] Call Trace: [ 9482.149169] dump_stack+0x8d/0xb5 [ 9482.149628] check_noncircular+0xff/0x110 [ 9482.150090] check_prev_add+0x91/0xc60 [ 9482.150561] ? kvm_clock_read+0x14/0x30 [ 9482.151017] ? kvm_sched_clock_read+0x5/0x10 [ 9482.151470] __lock_acquire+0x1740/0x3110 [ 9482.151941] ? __btrfs_tree_read_lock+0x27/0x120 [btrfs] [ 9482.152402] lock_acquire+0xd8/0x490 [ 9482.152887] ? qgroup_rescan_init+0x43/0xf0 [btrfs] [ 9482.153354] __mutex_lock+0xa3/0xb30 [ 9482.153826] ? qgroup_rescan_init+0x43/0xf0 [btrfs] [ 9482.154301] ? qgroup_rescan_init+0x43/0xf0 [btrfs] [ 9482.154768] ? qgroup_rescan_init+0x43/0xf0 [btrfs] [ 9482.155226] qgroup_rescan_init+0x43/0xf0 [btrfs] [ 9482.155690] btrfs_read_qgroup_config+0x43a/0x550 [btrfs] [ 9482.156160] open_ctree+0x1228/0x18a0 [btrfs] [ 9482.156643] btrfs_mount_root.cold+0x13/0xed [btrfs] [ 9482.157108] ? rcu_read_lock_sched_held+0x5d/0x90 [ 9482.157567] ? kfree+0x31f/0x3e0 [ 9482.158030] legacy_get_tree+0x30/0x60 [ 9482.158489] vfs_get_tree+0x28/0xe0 [ 9482.158947] fc_mount+0xe/0x40 [ 9482.159403] vfs_kern_mount.part.0+0x71/0x90 [ 9482.159875] btrfs_mount+0x13b/0x3e0 [btrfs] [ 9482.160335] ? rcu_read_lock_sched_held+0x5d/0x90 [ 9482.160805] ? kfree+0x31f/0x3e0 [ 9482.161260] ? legacy_get_tree+0x30/0x60 [ 9482.161714] legacy_get_tree+0x30/0x60 [ 9482.162166] vfs_get_tree+0x28/0xe0 [ 9482.162616] path_mount+0x2d7/0xa70 [ 9482.163070] do_mount+0x75/0x90 [ 9482.163525] __x64_sys_mount+0x8e/0xd0 [ 9482.163986] do_syscall_64+0x33/0x80 [ 9482.164437] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 9482.164902] RIP: 0033:0x7f51e907caaa This happens because at btrfs_read_qgroup_config() we can call qgroup_rescan_init() while holding a read lock on a quota btree leaf, acquired by the previous call to btrfs_search_slot_for_read(), and qgroup_rescan_init() acquires the mutex qgroup_rescan_lock. A qgroup rescan worker does the opposite: it acquires the mutex qgroup_rescan_lock, at btrfs_qgroup_rescan_worker(), and then tries to update the qgroup status item in the quota btree through the call to update_qgroup_status_item(). This inversion of locking order between the qgroup_rescan_lock mutex and quota btree locks causes the splat. Fix this simply by releasing and freeing the path before calling qgroup_rescan_init() at btrfs_read_qgroup_config(). CC: [email protected] # 4.4+ Signed-off-by: Filipe Manana <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2020-11-23btrfs: tree-checker: add missing returns after data_ref alignment checksDavid Sterba1-0/+2
There are sectorsize alignment checks that are reported but then check_extent_data_ref continues. This was not intended, wrong alignment is not a minor problem and we should return with error. CC: [email protected] # 5.4+ Fixes: 0785a9aacf9d ("btrfs: tree-checker: Add EXTENT_DATA_REF check") Reviewed-by: Qu Wenruo <[email protected]> Signed-off-by: David Sterba <[email protected]>
2020-11-23btrfs: don't access possibly stale fs_info data for printing duplicate deviceJohannes Thumshirn1-1/+7
Syzbot reported a possible use-after-free when printing a duplicate device warning device_list_add(). At this point it can happen that a btrfs_device::fs_info is not correctly setup yet, so we're accessing stale data, when printing the warning message using the btrfs_printk() wrappers. ================================================================== BUG: KASAN: use-after-free in btrfs_printk+0x3eb/0x435 fs/btrfs/super.c:245 Read of size 8 at addr ffff8880878e06a8 by task syz-executor225/7068 CPU: 1 PID: 7068 Comm: syz-executor225 Not tainted 5.9.0-rc5-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1d6/0x29e lib/dump_stack.c:118 print_address_description+0x66/0x620 mm/kasan/report.c:383 __kasan_report mm/kasan/report.c:513 [inline] kasan_report+0x132/0x1d0 mm/kasan/report.c:530 btrfs_printk+0x3eb/0x435 fs/btrfs/super.c:245 device_list_add+0x1a88/0x1d60 fs/btrfs/volumes.c:943 btrfs_scan_one_device+0x196/0x490 fs/btrfs/volumes.c:1359 btrfs_mount_root+0x48f/0xb60 fs/btrfs/super.c:1634 legacy_get_tree+0xea/0x180 fs/fs_context.c:592 vfs_get_tree+0x88/0x270 fs/super.c:1547 fc_mount fs/namespace.c:978 [inline] vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008 btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732 legacy_get_tree+0xea/0x180 fs/fs_context.c:592 vfs_get_tree+0x88/0x270 fs/super.c:1547 do_new_mount fs/namespace.c:2875 [inline] path_mount+0x179d/0x29e0 fs/namespace.c:3192 do_mount fs/namespace.c:3205 [inline] __do_sys_mount fs/namespace.c:3413 [inline] __se_sys_mount+0x126/0x180 fs/namespace.c:3390 do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x44840a RSP: 002b:00007ffedfffd608 EFLAGS: 00000293 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 00007ffedfffd670 RCX: 000000000044840a RDX: 0000000020000000 RSI: 0000000020000100 RDI: 00007ffedfffd630 RBP: 00007ffedfffd630 R08: 00007ffedfffd670 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000001a R13: 0000000000000004 R14: 0000000000000003 R15: 0000000000000003 Allocated by task 6945: kasan_save_stack mm/kasan/common.c:48 [inline] kasan_set_track mm/kasan/common.c:56 [inline] __kasan_kmalloc+0x100/0x130 mm/kasan/common.c:461 kmalloc_node include/linux/slab.h:577 [inline] kvmalloc_node+0x81/0x110 mm/util.c:574 kvmalloc include/linux/mm.h:757 [inline] kvzalloc include/linux/mm.h:765 [inline] btrfs_mount_root+0xd0/0xb60 fs/btrfs/super.c:1613 legacy_get_tree+0xea/0x180 fs/fs_context.c:592 vfs_get_tree+0x88/0x270 fs/super.c:1547 fc_mount fs/namespace.c:978 [inline] vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008 btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732 legacy_get_tree+0xea/0x180 fs/fs_context.c:592 vfs_get_tree+0x88/0x270 fs/super.c:1547 do_new_mount fs/namespace.c:2875 [inline] path_mount+0x179d/0x29e0 fs/namespace.c:3192 do_mount fs/namespace.c:3205 [inline] __do_sys_mount fs/namespace.c:3413 [inline] __se_sys_mount+0x126/0x180 fs/namespace.c:3390 do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 6945: kasan_save_stack mm/kasan/common.c:48 [inline] kasan_set_track+0x3d/0x70 mm/kasan/common.c:56 kasan_set_free_info+0x17/0x30 mm/kasan/generic.c:355 __kasan_slab_free+0xdd/0x110 mm/kasan/common.c:422 __cache_free mm/slab.c:3418 [inline] kfree+0x113/0x200 mm/slab.c:3756 deactivate_locked_super+0xa7/0xf0 fs/super.c:335 btrfs_mount_root+0x72b/0xb60 fs/btrfs/super.c:1678 legacy_get_tree+0xea/0x180 fs/fs_context.c:592 vfs_get_tree+0x88/0x270 fs/super.c:1547 fc_mount fs/namespace.c:978 [inline] vfs_kern_mount+0xc9/0x160 fs/namespace.c:1008 btrfs_mount+0x33c/0xae0 fs/btrfs/super.c:1732 legacy_get_tree+0xea/0x180 fs/fs_context.c:592 vfs_get_tree+0x88/0x270 fs/super.c:1547 do_new_mount fs/namespace.c:2875 [inline] path_mount+0x179d/0x29e0 fs/namespace.c:3192 do_mount fs/namespace.c:3205 [inline] __do_sys_mount fs/namespace.c:3413 [inline] __se_sys_mount+0x126/0x180 fs/namespace.c:3390 do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The buggy address belongs to the object at ffff8880878e0000 which belongs to the cache kmalloc-16k of size 16384 The buggy address is located 1704 bytes inside of 16384-byte region [ffff8880878e0000, ffff8880878e4000) The buggy address belongs to the page: page:0000000060704f30 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x878e0 head:0000000060704f30 order:3 compound_mapcount:0 compound_pincount:0 flags: 0xfffe0000010200(slab|head) raw: 00fffe0000010200 ffffea00028e9a08 ffffea00021e3608 ffff8880aa440b00 raw: 0000000000000000 ffff8880878e0000 0000000100000001 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8880878e0580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880878e0600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb >ffff8880878e0680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff8880878e0700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880878e0780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== The syzkaller reproducer for this use-after-free crafts a filesystem image and loop mounts it twice in a loop. The mount will fail as the crafted image has an invalid chunk tree. When this happens btrfs_mount_root() will call deactivate_locked_super(), which then cleans up fs_info and fs_info::sb. If a second thread now adds the same block-device to the filesystem, it will get detected as a duplicate device and device_list_add() will reject the duplicate and print a warning. But as the fs_info pointer passed in is non-NULL this will result in a use-after-free. Instead of printing possibly uninitialized or already freed memory in btrfs_printk(), explicitly pass in a NULL fs_info so the printing of the device name will be skipped altogether. There was a slightly different approach discussed in https://lore.kernel.org/linux-btrfs/[email protected]/t/#u Link: https://lore.kernel.org/linux-btrfs/[email protected] Reported-by: [email protected] CC: [email protected] # 4.19+ Reviewed-by: Nikolay Borisov <[email protected]> Reviewed-by: Anand Jain <[email protected]> Signed-off-by: Johannes Thumshirn <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2020-11-23Merge tag 'misc-habanalabs-fixes-2020-11-23' of ↵Greg Kroah-Hartman1-0/+2
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux into char-misc-linus Oded writes: This tag contains the following habanalabs driver fix for 5.10-rc6: - Add missing statements and break; in case switch of ECC handling. Without this fix, the handling of that interrupt will be erroneous. * tag 'misc-habanalabs-fixes-2020-11-23' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux: habanalabs/gaudi: fix missing code in ECC handling
2020-11-23habanalabs/gaudi: fix missing code in ECC handlingOded Gabbay1-0/+2
There is missing statement and missing "break;" in the ECC handling code in gaudi.c This will cause a wrong behavior upon certain ECC interrupts. Signed-off-by: Oded Gabbay <[email protected]>
2020-11-23drm/vc4: kms: Don't disable the muxing of an active CRTCMaxime Ripard2-36/+48
The current HVS muxing code will consider the CRTCs in a given state to setup their muxing in the HVS, and disable the other CRTCs muxes. However, it's valid to only update a single CRTC with a state, and in this situation we would mux out a CRTC that was enabled but left untouched by the new state. Fix this by setting a flag on the CRTC state when the muxing has been changed, and only change the muxing configuration when that flag is there. Fixes: 87ebcd42fb7b ("drm/vc4: crtc: Assign output to channel automatically") Signed-off-by: Maxime Ripard <[email protected]> Tested-by: Hoegeun Kwon <[email protected]> Reviewed-by: Hoegeun Kwon <[email protected]> Reviewed-by: Thomas Zimmermann <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-11-23drm/vc4: kms: Store the unassigned channel list in the stateMaxime Ripard2-25/+102
If a CRTC is enabled but not active, and that we're then doing a page flip on another CRTC, drm_atomic_get_crtc_state will bring the first CRTC state into the global state, and will make us wait for its vblank as well, even though that might never occur. Instead of creating the list of the free channels each time atomic_check is called, and calling drm_atomic_get_crtc_state to retrieve the allocated channels, let's create a private state object in the main atomic state, and use it to store the available channels. Since vc4 has a semaphore (with a value of 1, so a lock) in its commit implementation to serialize all the commits, even the nonblocking ones, we are free from the use-after-free race if two subsequent commits are not ran in their submission order. Fixes: 87ebcd42fb7b ("drm/vc4: crtc: Assign output to channel automatically") Signed-off-by: Maxime Ripard <[email protected]> Tested-by: Hoegeun Kwon <[email protected]> Reviewed-by: Hoegeun Kwon <[email protected]> Reviewed-by: Thomas Zimmermann <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-11-23Merge tag 'icc-5.10-rc6' of ↵Greg Kroah-Hartman4-9/+20
git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc into char-misc-linus Georgi writes: interconnect fixes for v5.10 This contains a few driver fixes and one core fix: - Fix an excessive of_node_put() in the core. - Fix boot regression and integer overflow on msm8974 platforms. - Fix a minor issue on qcs404 and msm8916 platforms. Signed-off-by: Georgi Djakov <[email protected]> * tag 'icc-5.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc: interconnect: fix memory trashing in of_count_icc_providers() interconnect: qcom: qcs404: Remove GPU and display RPM IDs interconnect: qcom: msm8916: Remove rpm-ids from non-RPM nodes interconnect: qcom: msm8974: Don't boost the NoC rate during boot interconnect: qcom: msm8974: Prevent integer overflow in rate
2020-11-23Merge tag 'v5.10-rockchip-dtsfixes1' of ↵Arnd Bergmann4-4/+6
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into arm/fixes Fixed ordering for MMC devices on rk3399, due to a mmc change jumbling all ordering, a fix to make the Odroig Go Advance actually power down and using the correct clock name on the NanoPi R2S. * tag 'v5.10-rockchip-dtsfixes1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: arm64: dts: rockchip: Reorder LED triggers from mmc devices on rk3399-roc-pc. arm64: dts: rockchip: Assign a fixed index to mmc devices on rk3399 boards. arm64: dts: rockchip: Remove system-power-controller from pmic on Odroid Go Advance arm64: dts: rockchip: fix NanoPi R2S GMAC clock name Link: https://lore.kernel.org/r/11641389.O9o76ZdvQC@phil Signed-off-by: Arnd Bergmann <[email protected]>
2020-11-23arm64: pgtable: Ensure dirty bit is preserved across pte_wrprotect()Will Deacon1-13/+14
With hardware dirty bit management, calling pte_wrprotect() on a writable, dirty PTE will lose the dirty state and return a read-only, clean entry. Move the logic from ptep_set_wrprotect() into pte_wrprotect() to ensure that the dirty bit is preserved for writable entries, as this is required for soft-dirty bit management if we enable it in the future. Cc: <[email protected]> Fixes: 2f4b829c625e ("arm64: Add support for hardware updates of the access and dirty pte bits") Reviewed-by: Catalin Marinas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-11-23arm64: pgtable: Fix pte_accessible()Will Deacon1-3/+4
pte_accessible() is used by ptep_clear_flush() to figure out whether TLB invalidation is necessary when unmapping pages for reclaim. Although our implementation is correct according to the architecture, returning true only for valid, young ptes in the absence of racing page-table modifications, this is in fact flawed due to lazy invalidation of old ptes in ptep_clear_flush_young() where we elide the expensive DSB instruction for completing the TLB invalidation. Rather than penalise the aging path, adjust pte_accessible() to return true for any valid pte, even if the access flag is cleared. Cc: <[email protected]> Fixes: 76c714be0e5e ("arm64: pgtable: implement pte_accessible()") Reported-by: Yu Zhao <[email protected]> Acked-by: Yu Zhao <[email protected]> Reviewed-by: Minchan Kim <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-11-23iommu: Check return of __iommu_attach_device()Shameer Kolothum1-4/+6
Currently iommu_create_device_direct_mappings() is called without checking the return of __iommu_attach_device(). This may result in failures in iommu driver if dev attach returns error. Fixes: ce574c27ae27 ("iommu: Move iommu_group_create_direct_mappings() out of iommu_group_add_device()") Signed-off-by: Shameer Kolothum <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-11-23arm-smmu-qcom: Ensure the qcom_scm driver has finished probingJohn Stultz1-0/+4
Robin Murphy pointed out that if the arm-smmu driver probes before the qcom_scm driver, we may call qcom_scm_qsmmu500_wait_safe_toggle() before the __scm is initialized. Now, getting this to happen is a bit contrived, as in my efforts it required enabling asynchronous probing for both drivers, moving the firmware dts node to the end of the dtsi file, as well as forcing a long delay in the qcom_scm_probe function. With those tweaks we ran into the following crash: [ 2.631040] arm-smmu 15000000.iommu: Stage-1: 48-bit VA -> 48-bit IPA [ 2.633372] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 ... [ 2.633402] [0000000000000000] user address but active_mm is swapper [ 2.633409] Internal error: Oops: 96000005 [#1] PREEMPT SMP [ 2.633415] Modules linked in: [ 2.633427] CPU: 5 PID: 117 Comm: kworker/u16:2 Tainted: G W 5.10.0-rc1-mainline-00025-g272a618fc36-dirty #3971 [ 2.633430] Hardware name: Thundercomm Dragonboard 845c (DT) [ 2.633448] Workqueue: events_unbound async_run_entry_fn [ 2.633456] pstate: 80c00005 (Nzcv daif +PAN +UAO -TCO BTYPE=--) [ 2.633465] pc : qcom_scm_qsmmu500_wait_safe_toggle+0x78/0xb0 [ 2.633473] lr : qcom_smmu500_reset+0x58/0x78 [ 2.633476] sp : ffffffc0105a3b60 ... [ 2.633567] Call trace: [ 2.633572] qcom_scm_qsmmu500_wait_safe_toggle+0x78/0xb0 [ 2.633576] qcom_smmu500_reset+0x58/0x78 [ 2.633581] arm_smmu_device_reset+0x194/0x270 [ 2.633585] arm_smmu_device_probe+0xc94/0xeb8 [ 2.633592] platform_drv_probe+0x58/0xa8 [ 2.633597] really_probe+0xec/0x398 [ 2.633601] driver_probe_device+0x5c/0xb8 [ 2.633606] __driver_attach_async_helper+0x64/0x88 [ 2.633610] async_run_entry_fn+0x4c/0x118 [ 2.633617] process_one_work+0x20c/0x4b0 [ 2.633621] worker_thread+0x48/0x460 [ 2.633628] kthread+0x14c/0x158 [ 2.633634] ret_from_fork+0x10/0x18 [ 2.633642] Code: a9034fa0 d0007f73 29107fa0 91342273 (f9400020) To avoid this, this patch adds a check on qcom_scm_is_available() in the qcom_smmu_impl_init() function, returning -EPROBE_DEFER if its not ready. This allows the driver to try to probe again later after qcom_scm has finished probing. Reported-by: Robin Murphy <[email protected]> Signed-off-by: John Stultz <[email protected]> Reviewed-by: Robin Murphy <[email protected]> Cc: Robin Murphy <[email protected]> Cc: Will Deacon <[email protected]> Cc: Andy Gross <[email protected]> Cc: Maulik Shah <[email protected]> Cc: Bjorn Andersson <[email protected]> Cc: Saravana Kannan <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Lina Iyer <[email protected]> Cc: [email protected] Cc: linux-arm-msm <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-11-23spi: spi-nxp-fspi: fix fspi panic by unexpected interruptsRan Wang1-0/+7
Given the case that bootloader(such as UEFI)'s FSPI driver might not handle all interrupts before loading kernel, those legacy interrupts would assert immidiately once kernel's FSPI driver enable them. Further, if it was FSPI_INTR_IPCMDDONE, the irq handler nxp_fspi_irq_handler() would call complete(&f->c) to notify others. However, f->c might not be initialized yet at that time, then cause kernel panic. Of cause, we should fix this issue within bootloader. But it would be better to have this pacth to make dirver more robust (by clearing all interrupt status bits before enabling interrupts). Suggested-by: Han Xu <[email protected]> Signed-off-by: Ran Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2020-11-23iommu/amd: Enforce 4k mapping for certain IOMMU data structuresSuravee Suthikulpanit1-5/+22
AMD IOMMU requires 4k-aligned pages for the event log, the PPR log, and the completion wait write-back regions. However, when allocating the pages, they could be part of large mapping (e.g. 2M) page. This causes #PF due to the SNP RMP hardware enforces the check based on the page level for these data structures. So, fix by calling set_memory_4k() on the allocated pages. Fixes: c69d89aff393 ("iommu/amd: Use 4K page for completion wait write-back semaphore") Signed-off-by: Suravee Suthikulpanit <[email protected]> Cc: Brijesh Singh <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-11-23Merge branch 'cpufreq/arm/fixes' of ↵Rafael J. Wysocki1-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Pull SCMI cpufreq driver fix for 5.10-rc6 from Viresh Kumar: "This fixes a build issues with SCMI cpufreq driver in the !CONFIG_COMMON_CLK case." * 'cpufreq/arm/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: cpufreq: scmi: Fix build for !CONFIG_COMMON_CLK
2020-11-23ACPI/IORT: Fix doc warnings in iort.cShiju Jose1-3/+5
Fix following warnings caused by mismatch between function parameters and function comments. drivers/acpi/arm64/iort.c:55: warning: Function parameter or member 'iort_node' not described in 'iort_set_fwnode' drivers/acpi/arm64/iort.c:55: warning: Excess function parameter 'node' description in 'iort_set_fwnode' drivers/acpi/arm64/iort.c:682: warning: Function parameter or member 'id' not described in 'iort_get_device_domain' drivers/acpi/arm64/iort.c:682: warning: Function parameter or member 'bus_token' not described in 'iort_get_device_domain' drivers/acpi/arm64/iort.c:682: warning: Excess function parameter 'req_id' description in 'iort_get_device_domain' drivers/acpi/arm64/iort.c:1142: warning: Function parameter or member 'dma_size' not described in 'iort_dma_setup' drivers/acpi/arm64/iort.c:1142: warning: Excess function parameter 'size' description in 'iort_dma_setup' drivers/acpi/arm64/iort.c:1534: warning: Function parameter or member 'ops' not described in 'iort_add_platform_device' Signed-off-by: Shiju Jose <[email protected]> Acked-by: Hanjun Guo <[email protected]> Acked-by: Lorenzo Pieralisi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-11-23arm64/fpsimd: add <asm/insn.h> to <asm/kprobes.h> to fix fpsimd buildRandy Dunlap1-0/+2
Adding <asm/exception.h> brought in <asm/kprobes.h> which uses <asm/probes.h>, which uses 'pstate_check_t' so the latter needs to #include <asm/insn.h> for this typedef. Fixes this build error: In file included from arch/arm64/include/asm/kprobes.h:24, from arch/arm64/include/asm/exception.h:11, from arch/arm64/kernel/fpsimd.c:35: arch/arm64/include/asm/probes.h:16:2: error: unknown type name 'pstate_check_t' 16 | pstate_check_t *pstate_cc; Fixes: c6b90d5cf637 ("arm64/fpsimd: Fix missing-prototypes in fpsimd.c") Reported-by: kernel test robot <[email protected]> Signed-off-by: Randy Dunlap <[email protected]> Cc: Will Deacon <[email protected]> Cc: Tian Tao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2020-11-23s390: fix fpu restore in entry.SSven Schnelle2-5/+7
We need to disable interrupts in load_fpu_regs(). Otherwise an interrupt might come in after the registers are loaded, but before CIF_FPU is cleared in load_fpu_regs(). When the interrupt returns, CIF_FPU will be cleared and the registers will never be restored. The entry.S code usually saves the interrupt state in __SF_EMPTY on the stack when disabling/restoring interrupts. sie64a however saves the pointer to the sie control block in __SF_SIE_CONTROL, which references the same location. This is non-obvious to the reader. To avoid thrashing the sie control block pointer in load_fpu_regs(), move the __SIE_* offsets eight bytes after __SF_EMPTY on the stack. Cc: <[email protected]> # 5.8 Fixes: 0b0ed657fe00 ("s390: remove critical section cleanup from entry.S") Reported-by: Pierre Morel <[email protected]> Signed-off-by: Sven Schnelle <[email protected]> Acked-by: Christian Borntraeger <[email protected]> Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Heiko Carstens <[email protected]>
2020-11-23powerpc/64s: Fix allnoconfig build since uaccess flushStephen Rothwell1-0/+2
Using DECLARE_STATIC_KEY_FALSE needs linux/jump_table.h. Otherwise the build fails with eg: arch/powerpc/include/asm/book3s/64/kup-radix.h:66:1: warning: data definition has no type or storage class 66 | DECLARE_STATIC_KEY_FALSE(uaccess_flush_key); Fixes: 9a32a7e78bd0 ("powerpc/64s: flush L1D after user accesses") Signed-off-by: Stephen Rothwell <[email protected]> [mpe: Massage change log] Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-11-23Merge tag 'powerpc-cve-2020-4788' into fixesMichael Ellerman23-147/+693
From Daniel's cover letter: IBM Power9 processors can speculatively operate on data in the L1 cache before it has been completely validated, via a way-prediction mechanism. It is not possible for an attacker to determine the contents of impermissible memory using this method, since these systems implement a combination of hardware and software security measures to prevent scenarios where protected data could be leaked. However these measures don't address the scenario where an attacker induces the operating system to speculatively execute instructions using data that the attacker controls. This can be used for example to speculatively bypass "kernel user access prevention" techniques, as discovered by Anthony Steinhauser of Google's Safeside Project. This is not an attack by itself, but there is a possibility it could be used in conjunction with side-channels or other weaknesses in the privileged code to construct an attack. This issue can be mitigated by flushing the L1 cache between privilege boundaries of concern. This patch series flushes the L1 cache on kernel entry (patch 2) and after the kernel performs any user accesses (patch 3). It also adds a self-test and performs some related cleanups.
2020-11-23cpufreq: scmi: Fix build for !CONFIG_COMMON_CLKSudeep Holla1-1/+3
Commit 8410e7f3b31e ("cpufreq: scmi: Fix OPP addition failure with a dummy clock provider") registers a dummy clock provider using devm_of_clk_add_hw_provider. These *_hw_provider functions are defined only when CONFIG_COMMON_CLK=y. One possible fix is to add the Kconfig dependency, but since we plan to move away from the clock dependency for scmi cpufreq, it is preferrable to avoid that. Let us just conditionally compile out the offending call to devm_of_clk_add_hw_provider. It also uses the variable 'dev' outside of the #ifdef block to avoid build warning. Fixes: 8410e7f3b31e ("cpufreq: scmi: Fix OPP addition failure with a dummy clock provider") Cc: Rafael J. Wysocki <[email protected]> Cc: Viresh Kumar <[email protected]> Signed-off-by: Sudeep Holla <[email protected]> Signed-off-by: Viresh Kumar <[email protected]>
2020-11-23drm/exynos: depend on COMMON_CLK to fix compile testsKrzysztof Kozlowski1-1/+2
The Exynos DRM uses Common Clock Framework thus it cannot be built on platforms without it (e.g. compile test on MIPS with RALINK and SOC_RT305X): /usr/bin/mips-linux-gnu-ld: drivers/gpu/drm/exynos/exynos_mixer.o: in function `mixer_bind': exynos_mixer.c:(.text+0x958): undefined reference to `clk_set_parent' Reported-by: kernel test robot <[email protected]> Signed-off-by: Krzysztof Kozlowski <[email protected]> Signed-off-by: Inki Dae <[email protected]>
2020-11-22Linux 5.10-rc5Linus Torvalds1-1/+1
2020-11-22Merge branch 'for-linus' of ↵Linus Torvalds12-17/+223
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - Various functionality / regression fixes for Logitech devices from Hans de Goede - Fix for (recently added) GPIO support in mcp2221 driver from Lars Povlsen - Power management handling fix/quirk in i2c-hid driver for certain BIOSes that have strange aproach to power-cycle from Hans de Goede - a few device ID additions and device-specific quirks * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: logitech-dj: Fix Dinovo Mini when paired with a MX5x00 receiver HID: logitech-dj: Fix an error in mse_bluetooth_descriptor HID: Add Logitech Dinovo Edge battery quirk HID: logitech-hidpp: Add HIDPP_CONSUMER_VENDOR_KEYS quirk for the Dinovo Edge HID: logitech-dj: Handle quad/bluetooth keyboards with a builtin trackpad HID: add HID_QUIRK_INCREMENT_USAGE_ON_DUPLICATE for Gamevice devices HID: mcp2221: Fix GPIO output handling HID: hid-sensor-hub: Fix issue with devices with no report ID HID: i2c-hid: Put ACPI enumerated devices in D3 on shutdown HID: add support for Sega Saturn HID: cypress: Support Varmilo Keyboards' media hotkeys HID: ite: Replace ABS_MISC 120/121 events with touchpad on/off keypresses HID: logitech-hidpp: Add PID for MX Anywhere 2 HID: uclogic: Add ID for Trust Flex Design Tablet
2020-11-22Merge tag 'sched-urgent-2020-11-22' of ↵Linus Torvalds4-57/+95
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Thomas Gleixner: "A couple of scheduler fixes: - Make the conditional update of the overutilized state work correctly by caching the relevant flags state before overwriting them and checking them afterwards. - Fix a data race in the wakeup path which caused loadavg on ARM64 platforms to become a random number generator. - Fix the ordering of the iowaiter accounting operations so it can't be decremented before it is incremented. - Fix a bug in the deadline scheduler vs. priority inheritance when a non-deadline task A has inherited the parameters of a deadline task B and then blocks on a non-deadline task C. The second inheritance step used the static deadline parameters of task A, which are usually 0, instead of further propagating task B's parameters. The zero initialized parameters trigger a bug in the deadline scheduler" * tag 'sched-urgent-2020-11-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/deadline: Fix priority inheritance with multiple scheduling classes sched: Fix rq->nr_iowait ordering sched: Fix data-race in wakeup sched/fair: Fix overutilized update in enqueue_task_fair()
2020-11-22Merge tag 'perf-urgent-2020-11-22' of ↵Linus Torvalds4-24/+12
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Thomas Gleixner: "A single fix for the x86 perf sysfs interfaces which used kobject attributes instead of device attributes and therefore making clang's control flow integrity checker upset" * tag 'perf-urgent-2020-11-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86: fix sysfs type mismatches