Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.linux-nfs.org/projects/anna/linux-nfs
NFSoRDMA Client Fixes for Linux 5.7
Bugfixes:
- Restore wake-up-all to rpcrdma_cm_event_handler()
- Otherwise the client won't respond to server disconnect requests
- Fix tracepoint use-after-free race
- Fix usage of xdr_stream_encode_item_{present, absent}
- These functions return a size on success, and not 0
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Currently, if the client sends BIND_CONN_TO_SESSION with
NFS4_CDFC4_FORE_OR_BOTH but only gets NFS4_CDFS4_FORE back it ignores
that it wasn't able to enable a backchannel.
To make sure, the client sends BIND_CONN_TO_SESSION as the first
operation on the connections (ie., no other session compounds haven't
been sent before), and if the client's request to bind the backchannel
is not satisfied, then reset the connection and retry.
Cc: [email protected]
Signed-off-by: Olga Kornievskaia <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
The rpciod workqueue is on the write-out path for freeing dirty memory,
so it is important that it never block waiting for memory to be
allocated - this can lead to a deadlock.
rpc_execute() - which is often called by an rpciod work item - calls
rcp_task_release_client() which can lead to rpc_free_client().
rpc_free_client() makes two calls which could potentially block wating
for memory allocation.
rpc_clnt_debugfs_unregister() calls into debugfs and will block while
any of the debugfs files are being accessed. In particular it can block
while any of the 'open' methods are being called and all of these use
malloc for one thing or another. So this can deadlock if the memory
allocation waits for NFS to complete some writes via rpciod.
rpc_clnt_remove_pipedir() can take the inode_lock() and while it isn't
obvious that memory allocations can happen while the lock it held, it is
safer to assume they might and to not let rpciod call
rpc_clnt_remove_pipedir().
So this patch moves these two calls (together with the final kfree() and
rpciod_down()) into a work-item to be run from the system work-queue.
rpciod can continue its important work, and the final stages of the free
can happen whenever they happen.
I have seen this deadlock on a 4.12 based kernel where debugfs used
synchronize_srcu() when removing objects. synchronize_srcu() requires a
workqueue and there were no free workther threads and none could be
allocated. While debugsfs no longer uses SRCU, I believe the deadlock
is still possible.
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
This fixes GPU hangs due to cache coherency issues.
Bump the driver version. Split out from the original patch.
Signed-off-by: Marek Olšák <[email protected]>
Reviewed-by: Christian König <[email protected]>
Tested-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This fixes GPU hangs due to cache coherency issues.
v2: Split the version bump to a separate patch
Signed-off-by: Marek Olšák <[email protected]>
Reviewed-by: Christian König <[email protected]>
Tested-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
DCC_INDEPENDENT_128B is needed for displayble DCC on gfx10.
SCANOUT is not needed by the kernel, but Mesa uses it.
Signed-off-by: Marek Olšák <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
hwmgr->pm_en is initialized at hwmgr_hw_init.
during amdgpu_device_init, there is amdgpu_asic_reset that calls to
soc15_asic_reset (for V320 usecase, Vega10 asic), in which:
1) soc15_asic_reset_method calls to pp_get_asic_baco_capability (pm_en)
2) soc15_asic_baco_reset calls to pp_set_asic_baco_state (pm_en)
pm_en is used in the above two cases while it has not yet been initialized
So avoid using pm_en in the above two functions for V320 passthrough.
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Tiecheng Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit c5207876232649ca5e5ddd6f966d2da75ffded8f.
The commit being reverted changed the wrong place, it should have
changed in func get_asic_baco_capability.
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Tiecheng Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
If we do
% echo 1 > /sys/module/dmatest/parameters/run
[ 115.851124] dmatest: Could not start test, no channels configured
% echo dma8chan7 > /sys/module/dmatest/parameters/channel
[ 127.563872] dmatest: Added 1 threads using dma8chan7
% cat /sys/module/dmatest/parameters/wait
... !!! HANG !!! ...
The culprit is the commit 6138f967bccc
("dmaengine: dmatest: Use fixed point div to calculate iops")
which makes threads not to run, but pending and being kicked off by writing
to the 'run' node. However, it forgot to consider 'wait' routine to avoid
above mentioned case.
In order to fix this, check for really running threads, i.e. with pending
and done flags unset.
It's pity the culprit commit hadn't updated documentation and tested all
possible scenarios.
Fixes: 6138f967bccc ("dmaengine: dmatest: Use fixed point div to calculate iops")
Cc: Seraj Alijan <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vinod Koul <[email protected]>
|
|
Pull s390 fix from Christian Borntraeger:
"Fix a race between page table upgrade and uaccess on s390.
This fixes CVE-2020-11884 which allows for a local kernel crash or
code execution"
* tag 'cve-2020-11884' from emailed bundle:
s390/mm: fix page table upgrade vs 2ndary address mode accesses
|
|
A race exists between build_pcms() and build_controls() phases of codec
setup. Build_pcms() sets up notifier for jack events. If a monitor event
is received before build_controls() is run, the initial jack state is
lost and never reported via mixer controls.
The problem can be hit at least with SOF as the controller driver. SOF
calls snd_hda_codec_build_controls() in its workqueue-based probe and
this can be delayed enough to hit the race condition.
Fix the issue by invalidating the per-pin ELD information when
build_controls() is called. The existing call to hdmi_present_sense()
will update the ELD contents. This ensures initial monitor state is
correctly reflected via mixer controls.
BugLink: https://github.com/thesofproject/linux/issues/1687
Signed-off-by: Kai Vehmanen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
|
|
This reverts commit a900aeac253729411cf33c6cb598c152e9e4137f because
regressions were showing up.
Suggested-by: Thierry Reding <[email protected]>
Link: https://lore.kernel.org/dmaengine/[email protected]/
Signed-off-by: Wolfram Sang <[email protected]>
|
|
This reverts commit 8814044fe0fa182abc9ff818d3da562de98bc9a7 because
regressions were showing up.
Suggested-by: Thierry Reding <[email protected]>
Link: https://lore.kernel.org/dmaengine/[email protected]/
Signed-off-by: Wolfram Sang <[email protected]>
|
|
When slave status is I2C_SLAVE_RX_END, generate I2C_SLAVE_STOP
event to i2c_client.
Fixes: c245d94ed106 ("i2c: iproc: Add multi byte read-write support for slave mode")
Signed-off-by: Rayagonda Kokatanur <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
|
|
Cc: [email protected]
Fixes: 8002db6336dd ("qxl: convert qxl driver to proper use for reservations")
Signed-off-by: Vasily Averin <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>
|
|
ret should be changed to release allocated struct qxl_release
Cc: [email protected]
Fixes: 8002db6336dd ("qxl: convert qxl driver to proper use for reservations")
Signed-off-by: Vasily Averin <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>
|
|
This can happen if userspace doesn't issue any 3D ioctls before
closing the DRM fd.
Fixes: 72b48ae800da ("drm/virtio: enqueue virtio_gpu_create_context after the first 3D ioctl")
Signed-off-by: Gurchetan Singh <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Gerd Hoffmann <[email protected]>
|
|
Fix to return negative error code -ENOMEM from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: 60abd3181db2 ("selinux: convert cond_list to array")
Signed-off-by: Wei Yongjun <[email protected]>
Reviewed-by: Ondrej Mosnacek <[email protected]>
Signed-off-by: Paul Moore <[email protected]>
|
|
tcp_bpf_recvmsg() invokes sk_psock_get(), which returns a reference of
the specified sk_psock object to "psock" with increased refcnt.
When tcp_bpf_recvmsg() returns, local variable "psock" becomes invalid,
so the refcount should be decreased to keep refcount balanced.
The reference counting issue happens in several exception handling paths
of tcp_bpf_recvmsg(). When those error scenarios occur such as "flags"
includes MSG_ERRQUEUE, the function forgets to decrease the refcnt
increased by sk_psock_get(), causing a refcnt leak.
Fix this issue by calling sk_psock_put() or pulling up the error queue
read handling when those error scenarios occur.
Fixes: e7a5f1f1cd000 ("bpf/sockmap: Read psock ingress_msg before sk_receive_queue")
Signed-off-by: Xiyu Yang <[email protected]>
Signed-off-by: Xin Tan <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Jakub Sitnicki <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- regression fixes:
- transaction leak when deleting unused block group
- log cleanup after transaction abort
- fix block group leak when removing fails
- transaction leak if relocation recovery fails
- fix SPDX header
* tag 'for-5.7-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix transaction leak in btrfs_recover_relocation
btrfs: fix block group leak when removing fails
btrfs: drop logs when we've aborted a transaction
btrfs: fix memory leak of transaction when deleting unused block group
btrfs: discard: Use the correct style for SPDX License Identifier
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull Hyper-V fixes from Wei Liu:
- Two patches from Dexuan fixing suspension bugs
- Three cleanup patches from Andy and Michael
* tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
hyper-v: Remove internal types from UAPI header
hyper-v: Use UUID API for exporting the GUID
x86/hyperv: Suspend/resume the VP assist page for hibernation
Drivers: hv: Move AEOI determination to architecture dependent code
Drivers: hv: vmbus: Fix Suspend-to-Idle for Generation-2 VM
|
|
Simon Wunderlich says:
====================
Here are some batman-adv bugfixes:
- fix random number generation in network coding, by George Spelvin
- fix reference counter leaks, by Xiyu Yang (3 patches)
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
A call to 'dma_alloc_coherent()' is hidden in 'sonic_alloc_descriptors()',
called from 'sonic_probe1()'.
This is correctly freed in the remove function, but not in the error
handling path of the probe function.
Fix it and add the missing 'dma_free_coherent()' call.
While at it, rename a label in order to be slightly more informative.
Fixes: efcce839360f ("[PATCH] macsonic/jazzsonic network drivers update")
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Commit 3c1bcc8614db ("net: ethernet: Convert phydev advertize and
supported from u32 to link mode") updated ethernet drivers to use a
linkmode bitmap. It mistakenly dropped a bitwise negation in the
tc35815 ethernet driver on a bitmask to set the supported/advertising
flags.
Found by Anthony via code inspection, not tested as I do not have the
required hardware.
Fixes: 3c1bcc8614db ("net: ethernet: Convert phydev advertize and supported from u32 to link mode")
Signed-off-by: Anthony Felice <[email protected]>
Reviewed-by: Akshay Bhat <[email protected]>
Reviewed-by: Heiner Kallweit <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
syzbot managed to set up sfq so that q->scaled_quantum was zero,
triggering an infinite loop in sfq_dequeue()
More generally, we must only accept quantum between 1 and 2^18 - 7,
meaning scaled_quantum must be in [1, 0x7FFF] range.
Otherwise, we also could have a loop in sfq_dequeue()
if scaled_quantum happens to be 0x8000, since slot->allot
could indefinitely switch between 0 and 0x8000.
Fixes: eeaeb068f139 ("sch_sfq: allow big packets and be fair")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: [email protected]
Cc: Jason A. Donenfeld <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Michael Chan says:
====================
bnxt_en: Bug fixes.
A collection of 5 miscellaneous bug fixes covering VF anti-spoof setup
issues, devlink MSIX max value, AER, context memory allocation error
path, and VLAN acceleration logic.
Please queue for -stable. Thanks.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
The current logic in bnxt_fix_features() will inadvertently turn on both
CTAG and STAG VLAN offload if the user tries to disable both. Fix it
by checking that the user is trying to enable CTAG or STAG before
enabling both. The logic is supposed to enable or disable both CTAG and
STAG together.
Fixes: 5a9f6b238e59 ("bnxt_en: Enable and disable RX CTAG and RX STAG VLAN acceleration together.")
Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
bnxt_alloc_ctx_pg_tbls() should return error when the memory size of the
context memory to set up is zero. By returning success (0), the caller
may proceed normally and may crash later when it tries to set up the
memory.
Fixes: 08fe9d181606 ("bnxt_en: Add Level 2 context memory paging support.")
Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Improve the slot reset sequence by disabling the device to prevent bad
DMAs if slot reset fails. Return the proper result instead of always
PCI_ERS_RESULT_RECOVERED to the caller.
Fixes: 6316ea6db93d ("bnxt_en: Enable AER support.")
Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Broadcom adapters support only maximum of 512 CQs per PF. If user sets
MSIx vectors more than supported CQs, firmware is setting incorrect value
for msix_vec_per_pf_max parameter. Fix it by reducing the BNXT_MSIX_VEC_MAX
value to 512, even though the maximum # of MSIx vectors supported by adapter
are 1280.
Fixes: f399e8497826 ("bnxt_en: Use msix_vec_per_pf_max and msix_vec_per_pf_min devlink params.")
Signed-off-by: Vasundhara Volam <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Fix the logic that sets the enable/disable flag for the source MAC
filter according to firmware spec 1.7.1.
In the original firmware spec. before 1.7.1, the VF spoof check flags
were not latched after making the HWRM_FUNC_CFG call, so there was a
need to keep the func_flags so that subsequent calls would perserve
the VF spoof check setting. A change was made in the 1.7.1 spec
so that the flags became latched. So we now set or clear the anti-
spoof setting directly without retrieving the old settings in the
stored vf->func_flags which are no longer valid. We also remove the
unneeded vf->func_flags.
Fixes: 8eb992e876a8 ("bnxt_en: Update firmware interface spec to 1.7.6.2.")
Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Read the temperature sensor register from the correct location for the
88E2110 PHY. There is no enable/disable bit on 2110, so make
mv3310_hwmon_config() run on 88X3310 only.
Fixes: 62d01535474b61 ("net: phy: marvell10g: add support for the 88x2110 PHY")
Cc: Maxime Chevallier <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: Baruch Siach <[email protected]>
Reviewed-by: Russell King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
If choke_init() could not allocate q->tab, we would crash later
in choke_reset().
BUG: KASAN: null-ptr-deref in memset include/linux/string.h:366 [inline]
BUG: KASAN: null-ptr-deref in choke_reset+0x208/0x340 net/sched/sch_choke.c:326
Write of size 8 at addr 0000000000000000 by task syz-executor822/7022
CPU: 1 PID: 7022 Comm: syz-executor822 Not tainted 5.7.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x188/0x20d lib/dump_stack.c:118
__kasan_report.cold+0x5/0x4d mm/kasan/report.c:515
kasan_report+0x33/0x50 mm/kasan/common.c:625
check_memory_region_inline mm/kasan/generic.c:187 [inline]
check_memory_region+0x141/0x190 mm/kasan/generic.c:193
memset+0x20/0x40 mm/kasan/common.c:85
memset include/linux/string.h:366 [inline]
choke_reset+0x208/0x340 net/sched/sch_choke.c:326
qdisc_reset+0x6b/0x520 net/sched/sch_generic.c:910
dev_deactivate_queue.constprop.0+0x13c/0x240 net/sched/sch_generic.c:1138
netdev_for_each_tx_queue include/linux/netdevice.h:2197 [inline]
dev_deactivate_many+0xe2/0xba0 net/sched/sch_generic.c:1195
dev_deactivate+0xf8/0x1c0 net/sched/sch_generic.c:1233
qdisc_graft+0xd25/0x1120 net/sched/sch_api.c:1051
tc_modify_qdisc+0xbab/0x1a00 net/sched/sch_api.c:1670
rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5454
netlink_rcv_skb+0x15a/0x410 net/netlink/af_netlink.c:2469
netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329
netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918
sock_sendmsg_nosec net/socket.c:652 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:672
____sys_sendmsg+0x6bf/0x7e0 net/socket.c:2362
___sys_sendmsg+0x100/0x170 net/socket.c:2416
__sys_sendmsg+0xec/0x1b0 net/socket.c:2449
do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
Fixes: 77e62da6e60c ("sch_choke: drop all packets in queue during reset")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: syzbot <[email protected]>
Cc: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
My intent was to not let users set a zero drop_batch_size,
it seems I once again messed with min()/max().
Fixes: 9d18562a2278 ("fq_codel: add batch ability to fq_codel_drop()")
Signed-off-by: Eric Dumazet <[email protected]>
Acked-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
tls_data_ready() invokes sk_psock_get(), which returns a reference of
the specified sk_psock object to "psock" with increased refcnt.
When tls_data_ready() returns, local variable "psock" becomes invalid,
so the refcount should be decreased to keep refcount balanced.
The reference counting issue happens in one exception handling path of
tls_data_ready(). When "psock->ingress_msg" is empty but "psock" is not
NULL, the function forgets to decrease the refcnt increased by
sk_psock_get(), causing a refcnt leak.
Fix this issue by calling sk_psock_put() on all paths when "psock" is
not NULL.
Signed-off-by: Xiyu Yang <[email protected]>
Signed-off-by: Xin Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
x25_connect() invokes x25_get_neigh(), which returns a reference of the
specified x25_neigh object to "x25->neighbour" with increased refcnt.
When x25 connect success and returns, the reference still be hold by
"x25->neighbour", so the refcount should be decreased in
x25_disconnect() to keep refcount balanced.
The reference counting issue happens in x25_disconnect(), which forgets
to decrease the refcnt increased by x25_get_neigh() in x25_connect(),
causing a refcnt leak.
Fix this issue by calling x25_neigh_put() before x25_disconnect()
returns.
Signed-off-by: Xiyu Yang <[email protected]>
Signed-off-by: Xin Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
bpf_exec_tx_verdict() invokes sk_psock_get(), which returns a reference
of the specified sk_psock object to "psock" with increased refcnt.
When bpf_exec_tx_verdict() returns, local variable "psock" becomes
invalid, so the refcount should be decreased to keep refcount balanced.
The reference counting issue happens in one exception handling path of
bpf_exec_tx_verdict(). When "policy" equals to NULL but "psock" is not
NULL, the function forgets to decrease the refcnt increased by
sk_psock_get(), causing a refcnt leak.
Fix this issue by calling sk_psock_put() on this error path before
bpf_exec_tx_verdict() returns.
Signed-off-by: Xiyu Yang <[email protected]>
Signed-off-by: Xin Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The Aquantia AQC100 controller enables a SFP+ port, so the driver should
configure the media type as '_TYPE_FIBRE' instead of '_TYPE_TP'.
Signed-off-by: Richard Clark <[email protected]>
Cc: Igor Russkikh <[email protected]>
Cc: "David S. Miller" <[email protected]>
Acked-by: Igor Russkikh <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Stefano Garzarella says:
====================
vsock/virtio: fixes about packet delivery to monitoring devices
During the review of v1, Stefan pointed out an issue introduced by
that patch, where replies can appear in the packet capture before
the transmitted packet.
While fixing my patch, reverting it and adding a new flag in
'struct virtio_vsock_pkt' (patch 2/2), I found that we already had
that issue in vhost-vsock, so I fixed it (patch 1/2).
v1 -> v2:
- reverted the v1 patch, to avoid that replies can appear in the
packet capture before the transmitted packet [Stefan]
- added patch to fix packet delivering to monitoring devices in
vhost-vsock
- added patch to check if the packet is already delivered to
monitoring devices
v1: https://patchwork.ozlabs.org/project/netdev/patch/[email protected]/
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
In virtio_transport.c, if the virtqueue is full, the transmitting
packet is queued up and it will be sent in the next iteration.
This causes the same packet to be delivered multiple times to
monitoring devices.
We want to continue to deliver packets to monitoring devices before
it is put in the virtqueue, to avoid that replies can appear in the
packet capture before the transmitted packet.
This patch fixes the issue, adding a new flag (tap_delivered) in
struct virtio_vsock_pkt, to check if the packet is already delivered
to monitoring devices.
In vhost/vsock.c, we are splitting packets, so we must set
'tap_delivered' to false when we queue up the same virtio_vsock_pkt
to handle the remaining bytes.
Signed-off-by: Stefano Garzarella <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
We want to deliver packets to monitoring devices before it is
put in the virtqueue, to avoid that replies can appear in the
packet capture before the transmitted packet.
Signed-off-by: Stefano Garzarella <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
drm_dp_mst_wait_tx_reply() returns > 1 if time elapsed in
wait_event_timeout() before check_txmsg_state(mgr, txmsg) evaluated to
true. However, we make the mistake of returning this time from
drm_dp_send_dpcd_write() on success instead of returning the number of
bytes written - causing spontaneous failures during link probing:
[drm:drm_dp_send_link_address [drm_kms_helper]] *ERROR* GUID check on
10:01 failed: 3975
Yikes! So, fix this by returning the number of bytes written on success
instead.
Signed-off-by: Lyude Paul <[email protected]>
Fixes: cb897542c6d2 ("drm/dp_mst: Fix W=1 warnings")
Cc: Benjamin Gaignard <[email protected]>
Cc: Sean Paul <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Reviewed-by: Sean Paul <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The hwsp_cacheline pointer from i915_request is very, very flimsy. The
i915_request.timeline (and the hwsp_cacheline) are lost upon retiring
(after an RCU grace). Therefore we need to confirm that once we have the
right pointer for the cacheline, it is not in the process of being
retired and disposed of before we attempt to acquire a reference to the
cacheline.
<3>[ 547.208237] BUG: KASAN: use-after-free in active_debug_hint+0x6a/0x70 [i915]
<3>[ 547.208366] Read of size 8 at addr ffff88822a0d2710 by task gem_exec_parall/2536
<4>[ 547.208547] CPU: 3 PID: 2536 Comm: gem_exec_parall Tainted: G U 5.7.0-rc2-ged7a286b5d02d-kasan_117+ #1
<4>[ 547.208556] Hardware name: Dell Inc. XPS 13 9350/, BIOS 1.4.12 11/30/2016
<4>[ 547.208564] Call Trace:
<4>[ 547.208579] dump_stack+0x96/0xdb
<4>[ 547.208707] ? active_debug_hint+0x6a/0x70 [i915]
<4>[ 547.208719] print_address_description.constprop.6+0x16/0x310
<4>[ 547.208841] ? active_debug_hint+0x6a/0x70 [i915]
<4>[ 547.208963] ? active_debug_hint+0x6a/0x70 [i915]
<4>[ 547.208975] __kasan_report+0x137/0x190
<4>[ 547.209106] ? active_debug_hint+0x6a/0x70 [i915]
<4>[ 547.209127] kasan_report+0x32/0x50
<4>[ 547.209257] ? i915_gemfs_fini+0x40/0x40 [i915]
<4>[ 547.209376] active_debug_hint+0x6a/0x70 [i915]
<4>[ 547.209389] debug_print_object+0xa7/0x220
<4>[ 547.209405] ? lockdep_hardirqs_on+0x348/0x5f0
<4>[ 547.209426] debug_object_assert_init+0x297/0x430
<4>[ 547.209449] ? debug_object_free+0x360/0x360
<4>[ 547.209472] ? lock_acquire+0x1ac/0x8a0
<4>[ 547.209592] ? intel_timeline_read_hwsp+0x4f/0x840 [i915]
<4>[ 547.209737] ? i915_active_acquire_if_busy+0x66/0x120 [i915]
<4>[ 547.209861] i915_active_acquire_if_busy+0x66/0x120 [i915]
<4>[ 547.209990] ? __live_alloc.isra.15+0xc0/0xc0 [i915]
<4>[ 547.210005] ? rcu_read_lock_sched_held+0xd0/0xd0
<4>[ 547.210017] ? print_usage_bug+0x580/0x580
<4>[ 547.210153] intel_timeline_read_hwsp+0xbc/0x840 [i915]
<4>[ 547.210284] __emit_semaphore_wait+0xd5/0x480 [i915]
<4>[ 547.210415] ? i915_fence_get_timeline_name+0x110/0x110 [i915]
<4>[ 547.210428] ? lockdep_hardirqs_on+0x348/0x5f0
<4>[ 547.210442] ? _raw_spin_unlock_irq+0x2a/0x40
<4>[ 547.210567] ? __await_execution.constprop.51+0x2e0/0x570 [i915]
<4>[ 547.210706] i915_request_await_dma_fence+0x8f7/0xc70 [i915]
Fixes: 85bedbf191e8 ("drm/i915/gt: Eliminate the trylock for reading a timeline's hwsp")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: <[email protected]> # v5.6+
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 2759e395358b2b909577928894f856ab75bea41a)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
While the ggtt vma are protected by their object lifetime, the list
continues until it hits a non-ggtt vma, and that vma is not protected
and may be freed as we inspect it. Hence, we require the obj->vma.lock
to protect the list as we iterate.
An example of forgetting to hold the obj->vma.lock is
[1642834.464973] general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] SMP PTI
[1642834.464977] CPU: 3 PID: 1954 Comm: Xorg Not tainted 5.6.0-300.fc32.x86_64 #1
[1642834.464979] Hardware name: LENOVO 20ARS25701/20ARS25701, BIOS GJET94WW (2.44 ) 09/14/2017
[1642834.465021] RIP: 0010:i915_gem_object_set_tiling+0x2c0/0x3e0 [i915]
[1642834.465024] Code: 8b 84 24 18 01 00 00 f6 c4 80 74 59 49 8b 94 24 a0 00 00 00 49 8b 84 24 e0 00 00 00 49 8b 74 24 10 48 8b 92 30 01 00 00 89 c7 <80> ba 0a 06 00 00 03 0f 87 86 00 00 00 ba 00 00 08 00 b9 00 00 10
[1642834.465025] RSP: 0018:ffffa98780c77d60 EFLAGS: 00010282
[1642834.465028] RAX: ffff8d232bfb2578 RBX: 0000000000000002 RCX: ffff8d25873a0000
[1642834.465029] RDX: dead000000000122 RSI: fffff0af8ac6e408 RDI: 000000002bfb2578
[1642834.465030] RBP: ffff8d25873a0000 R08: ffff8d252bfb5638 R09: 0000000000000000
[1642834.465031] R10: 0000000000000000 R11: ffff8d252bfb5640 R12: ffffa987801cb8f8
[1642834.465032] R13: 0000000000001000 R14: ffff8d233e972e50 R15: ffff8d233e972d00
[1642834.465034] FS: 00007f6a3d327f00(0000) GS:ffff8d25926c0000(0000) knlGS:0000000000000000
[1642834.465036] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1642834.465037] CR2: 00007f6a2064d000 CR3: 00000002fb57c001 CR4: 00000000001606e0
[1642834.465038] Call Trace:
[1642834.465083] i915_gem_set_tiling_ioctl+0x122/0x230 [i915]
[1642834.465121] ? i915_gem_object_set_tiling+0x3e0/0x3e0 [i915]
[1642834.465151] drm_ioctl_kernel+0x86/0xd0 [drm]
[1642834.465156] ? avc_has_perm+0x3b/0x160
[1642834.465178] drm_ioctl+0x206/0x390 [drm]
[1642834.465216] ? i915_gem_object_set_tiling+0x3e0/0x3e0 [i915]
[1642834.465221] ? selinux_file_ioctl+0x122/0x1c0
[1642834.465226] ? __do_munmap+0x24b/0x4d0
[1642834.465231] ksys_ioctl+0x82/0xc0
[1642834.465235] __x64_sys_ioctl+0x16/0x20
[1642834.465238] do_syscall_64+0x5b/0xf0
[1642834.465243] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[1642834.465245] RIP: 0033:0x7f6a3d7b047b
[1642834.465247] Code: 0f 1e fa 48 8b 05 1d aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed a9 0c 00 f7 d8 64 89 01 48
[1642834.465249] RSP: 002b:00007ffe71adba28 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[1642834.465251] RAX: ffffffffffffffda RBX: 000055f99048fa40 RCX: 00007f6a3d7b047b
[1642834.465253] RDX: 00007ffe71adba30 RSI: 00000000c0106461 RDI: 000000000000000e
[1642834.465254] RBP: 0000000000000002 R08: 000055f98f3f1798 R09: 0000000000000002
[1642834.465255] R10: 0000000000001000 R11: 0000000000000246 R12: 0000000000000080
[1642834.465257] R13: 000055f98f3f1690 R14: 00000000c0106461 R15: 00007ffe71adba30
Now to take the spinlock during the list iteration, we need to break it
down into two phases. In the first phase under the lock, we cannot sleep
and so must defer the actual work to a second list, protected by the
ggtt->mutex.
We also need to hold the spinlock during creation of a new vma to
serialise with updates of the tiling on the object.
Reported-by: Dave Airlie <[email protected]>
Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: <[email protected]> # v5.5+
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit cb593e5d2b6d3ad489669914d9fd1c64c7a4a6af)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
igt_ppgtt_pin_update() invokes i915_gem_context_get_vm_rcu(), which
returns a reference of the i915_address_space object to "vm" with
increased refcount.
When igt_ppgtt_pin_update() returns, "vm" becomes invalid, so the
refcount should be decreased to keep refcount balanced.
The reference counting issue happens in two exception handling paths of
igt_ppgtt_pin_update(). When i915_gem_object_create_internal() returns
IS_ERR, the refcnt increased by i915_gem_context_get_vm_rcu() is not
decreased, causing a refcnt leak.
Fix this issue by jumping to "out_vm" label when
i915_gem_object_create_internal() returns IS_ERR.
Fixes: a4e7ccdac38e ("drm/i915: Move context management under GEM")
Signed-off-by: Xiyu Yang <[email protected]>
Signed-off-by: Xin Tan <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit e07c7606a00c4361bad72ff4e72ed0dfbefa23b0)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Clay reports that OP_STATX fails for a test case with a valid fd
and empty path:
-- Test 0: statx:fd 3: SUCCEED, file mode 100755
-- Test 1: statx:path ./uring_statx: SUCCEED, file mode 100755
-- Test 2: io_uring_statx:fd 3: FAIL, errno 9: Bad file descriptor
-- Test 3: io_uring_statx:path ./uring_statx: SUCCEED, file mode 100755
This is due to statx not grabbing the process file table, hence we can't
lookup the fd in async context. If the fd is valid, ensure that we grab
the file table so we can grab the file from async context.
Cc: [email protected] # v5.6
Reported-by: Clay Harris <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Under some circumstances, i.e. when test is still running and about to
time out and user runs, for example,
grep -H . /sys/module/dmatest/parameters/*
the iterations parameter is not respected and test is going on and on until
user gives
echo 0 > /sys/module/dmatest/parameters/run
This is not what expected.
The history of this bug is interesting. I though that the commit
2d88ce76eb98 ("dmatest: add a 'wait' parameter")
is a culprit, but looking closer to the code I think it simple revealed the
broken logic from the day one, i.e. in the commit
0a2ff57d6fba ("dmaengine: dmatest: add a maximum number of test iterations")
which adds iterations parameter.
So, to the point, the conditional of checking the thread to be stopped being
first part of conjunction logic prevents to check iterations. Thus, we have to
always check both conditions to be able to stop after given iterations.
Since it wasn't visible before second commit appeared, I add a respective
Fixes tag.
Fixes: 2d88ce76eb98 ("dmatest: add a 'wait' parameter")
Cc: Dan Williams <[email protected]>
Cc: Nicolas Ferre <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
Acked-by: Nicolas Ferre <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vinod Koul <[email protected]>
|
|
DMA synchronization hook checks whether interrupt is raised by testing
corresponding bit in a hardware status register, and thus, clock should
be enabled in this case, otherwise CPU may hang if synchronization is
invoked while Runtime PM is in suspended state. This patch resumes the RPM
during of the DMA synchronization process in order to avoid potential
problems. It is a minor clean up of a previous commit, no real problem is
fixed by this patch because currently RPM is always in a resumed state
while DMA is synchronized, although this may change in the future.
Fixes: 6697255f239f ("dmaengine: tegra-apb: Improve DMA synchronization")
Signed-off-by: Dmitry Osipenko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Vinod Koul <[email protected]>
|
|
When the channel register code was changed to allow hotplug operations,
dynamic indexing wasn't taken into account. When channels are randomly
plugged and unplugged out of order, the serial indexing breaks. Convert
channel indexing to using IDA tracking in order to allow dynamic
assignment. The previous code does not cause any regression bug for
existing channel allocation besides idxd driver since the hotplug usage
case is only used by idxd at this point.
With this change, the chan->idr_ref is also not needed any longer. We can
have a device with no channels registered due to hot plug. The channel
device release code no longer should attempt to free the dma device id on
the last channel release.
Fixes: e81274cd6b52 ("dmaengine: add support to dynamic register/unregister of channels")
Reported-by: Yixin Zhang <[email protected]>
Signed-off-by: Dave Jiang <[email protected]>
Tested-by: Yixin Zhang <[email protected]>
Link: https://lore.kernel.org/r/158679961260.7674.8485924270472851852.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <[email protected]>
|
|
fs_info::journal_info
[BUG]
One run of btrfs/063 triggered the following lockdep warning:
============================================
WARNING: possible recursive locking detected
5.6.0-rc7-custom+ #48 Not tainted
--------------------------------------------
kworker/u24:0/7 is trying to acquire lock:
ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs]
but task is already holding lock:
ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs]
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(sb_internal#2);
lock(sb_internal#2);
*** DEADLOCK ***
May be due to missing lock nesting notation
4 locks held by kworker/u24:0/7:
#0: ffff88817b495948 ((wq_completion)btrfs-endio-write){+.+.}, at: process_one_work+0x557/0xb80
#1: ffff888189ea7db8 ((work_completion)(&work->normal_work)){+.+.}, at: process_one_work+0x557/0xb80
#2: ffff88817d3a46e0 (sb_internal#2){.+.+}, at: start_transaction+0x66c/0x890 [btrfs]
#3: ffff888174ca4da8 (&fs_info->reloc_mutex){+.+.}, at: btrfs_record_root_in_trans+0x83/0xd0 [btrfs]
stack backtrace:
CPU: 0 PID: 7 Comm: kworker/u24:0 Not tainted 5.6.0-rc7-custom+ #48
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Workqueue: btrfs-endio-write btrfs_work_helper [btrfs]
Call Trace:
dump_stack+0xc2/0x11a
__lock_acquire.cold+0xce/0x214
lock_acquire+0xe6/0x210
__sb_start_write+0x14e/0x290
start_transaction+0x66c/0x890 [btrfs]
btrfs_join_transaction+0x1d/0x20 [btrfs]
find_free_extent+0x1504/0x1a50 [btrfs]
btrfs_reserve_extent+0xd5/0x1f0 [btrfs]
btrfs_alloc_tree_block+0x1ac/0x570 [btrfs]
btrfs_copy_root+0x213/0x580 [btrfs]
create_reloc_root+0x3bd/0x470 [btrfs]
btrfs_init_reloc_root+0x2d2/0x310 [btrfs]
record_root_in_trans+0x191/0x1d0 [btrfs]
btrfs_record_root_in_trans+0x90/0xd0 [btrfs]
start_transaction+0x16e/0x890 [btrfs]
btrfs_join_transaction+0x1d/0x20 [btrfs]
btrfs_finish_ordered_io+0x55d/0xcd0 [btrfs]
finish_ordered_fn+0x15/0x20 [btrfs]
btrfs_work_helper+0x116/0x9a0 [btrfs]
process_one_work+0x632/0xb80
worker_thread+0x80/0x690
kthread+0x1a3/0x1f0
ret_from_fork+0x27/0x50
It's pretty hard to reproduce, only one hit so far.
[CAUSE]
This is because we're calling btrfs_join_transaction() without re-using
the current running one:
btrfs_finish_ordered_io()
|- btrfs_join_transaction() <<< Call #1
|- btrfs_record_root_in_trans()
|- btrfs_reserve_extent()
|- btrfs_join_transaction() <<< Call #2
Normally such btrfs_join_transaction() call should re-use the existing
one, without trying to re-start a transaction.
But the problem is, in btrfs_join_transaction() call #1, we call
btrfs_record_root_in_trans() before initializing current::journal_info.
And in btrfs_join_transaction() call #2, we're relying on
current::journal_info to avoid such deadlock.
[FIX]
Call btrfs_record_root_in_trans() after we have initialized
current::journal_info.
CC: [email protected] # 4.4+
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|