Age | Commit message (Collapse) | Author | Files | Lines |
|
The series that introduced dual port RoCE mode assumed that we don't have
a dual port HCA that use the mlx5 driver, this is not the case for
Connect-IB HCAs. This reasoning led to assigning 1 as the native port
index which causes issue when the second port is used.
For example query_pkey() when called on the second port will return values
of the first port. Make sure that we assign the right port index as the
native port index.
Fixes: 32f69e4be269 ("{net, IB}/mlx5: Manage port association for multiport RoCE")
Reviewed-by: Daniel Jurgens <[email protected]>
Signed-off-by: Mark Bloch <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
The commit cited below added a gid_type field (RoCEv1 or RoCEv2)
to GID properties.
When adding GIDs, this gid_type field was copied over to the
hardware gid table. However, when deleting GIDs, the gid_type field
was not copied over to the hardware gid table.
As a result, when running RoCEv2, all RoCEv2 gids in the
hardware gid table were set to type RoCEv1 when any gid was deleted.
This problem would persist until the next gid was added (which would again
restore the gid_type field for all the gids in the hardware gid table).
Fix this by copying over the gid_type field to the hardware gid table
when deleting gids, so that the gid_type of all remaining gids is
preserved when a gid is deleted.
Fixes: b699a859d17b ("IB/mlx4: Add gid_type to GID properties")
Reviewed-by: Moni Shoua <[email protected]>
Signed-off-by: Jack Morgenstein <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
When using IPv4 addresses in RoCEv2, the GID format for the mapped
IPv4 address should be: ::ffff:<4-byte IPv4 address>.
In the cited commit, IPv4 mapped IPV6 addresses had the 3 upper dwords
zeroed out by memset, which resulted in deleting the ffff field.
However, since procedure ipv6_addr_v4mapped() already verifies that the
gid has format ::ffff:<ipv4 address>, no change is needed for the gid,
and the memset can simply be removed.
Fixes: 7e57b85c444c ("IB/mlx4: Add support for setting RoCEv2 gids in hardware")
Reviewed-by: Moni Shoua <[email protected]>
Signed-off-by: Jack Morgenstein <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
If the read-only flag is true on a SCSI disk, re-reading the partition
table sets the flag back to false.
To observe this bug, you can run:
1. blockdev --setro /dev/sda
2. blockdev --rereadpt /dev/sda
3. blockdev --getro /dev/sda
This commit reads the disk's old state and combines it with the device
disk-reported state rather than unconditionally marking it as RW.
Reported-by: Li Ning <[email protected]>
Signed-off-by: Jeremy Cline <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
Because of the shifting around of code in qla2x00_probe_one recently,
failures during adapter initialization can lead to problems, i.e. NULL
pointer crashes and doubly freed data structures which cause eventual
panics.
This V2 version makes the relevant memory free routines idempotent, so
repeat calls won't cause any harm. I also removed the problematic
probe_init_failed exit point as it is not needed.
Fixes: d64d6c5671db ("scsi: qla2xxx: Fix NULL pointer crash due to probe failure")
Signed-off-by: Bill Kuzeja <[email protected]>
Acked-by: Himanshu Madhani <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
Rework sd_zbc_check_zone_size() to avoid a memory leak due to an early
return if sd_zbc_report_zones() fails.
Reported-by: David.butterfield <[email protected]>
Signed-off-by: Damien Le Moal <[email protected]>
Cc: [email protected]
Reviewed-by: Bart Van Assche <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
The firmware event workqueue should not be marked as WQ_MEM_RECLAIM
as it's doesn't need to make forward progress under memory pressure.
In the current state it will result in a deadlock if the device had been
forcefully removed.
Cc: Sreekanth Reddy <[email protected]>
Cc: Suganath Prabu Subramani <[email protected]>
Acked-by: Sreekanth Reddy <[email protected]>
Signed-off-by: Hannes Reinecke <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
iWARP does not support RDMA WRITE or SEND with immediate data.
Driver should check this before submitting to FW and return an
immediate error
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: Ariel Elior <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Race in qedr_poll_cq, lastest_cqe wasn't protected by lock,
leading to a case where two context's accessing poll_cq at
the same time lead to one of them having a pointer to an old
latest_cqe and reading an invalid cqe element
Signed-off-by: Amit Radzi <[email protected]>
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: Ariel Elior <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Fix iWARP connect and listen to use the mapped port for
ipv4 and ipv6. Without this fixed, running on a server
that has iwpmd enabled will not use the correct port
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: Ariel Elior <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
The wrong parameter was passed to dst_neigh_lookup
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: Ariel Elior <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
In practice this is really only meaningful in the context of the DM
multipath target (which uses dm_table_set_type() to set the type of
device DM should create via its "queue_mode" option).
So this change allows a DM multipath device with "queue_mode bio" to be
upgraded from DM_TYPE_BIO_BASED to DM_TYPE_NVME_BIO_BASED -- iff the
underlying device(s) are NVMe.
DM_TYPE_NVME_BIO_BASED is just a DM core implementation detail that
allows for NVMe-specific optimizations (e.g. use direct_make_request
instead of generic_make_request). If in the future there is no benefit
or need to distinguish NVMe vs not: then it will be removed.
Signed-off-by: Mike Snitzer <[email protected]>
|
|
This eliminates the "queue_mode" configuration's "nvme" mode. There
wasn't anything NVMe-specific about that mode. It was named "nvme"
because it was a short name for the mode. But the entire point of the
mode was to optimize the multipath target for underlying devices that
are _not_ SCSI-based. Devices that aren't SCSI have no need for the
various SCSI device handler (scsi_dh) specific code in DM multipath.
But rather than narrowly define this scsi_dh vs not branching in terms
of "nvme": invert the logic so that we're just checking whether a
multipath device is layered on SCSI devices with scsi_dh attached.
This allows any future storage technology to avoid scsi_dh specific code
in the multipath target too.
Fixes: 848b8aefd4 ("dm mpath: optimize NVMe bio-based support")
Suggested-by: Mikulas Patocka <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
|
|
The strncmp function should compare 4 bytes.
Fixes: 22c11858e8002 ("dm: introduce DM_TYPE_NVME_BIO_BASED")
Signed-off-by: Mikulas Patocka <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
|
|
Upstream commit 4102d9de6d375 ("dm raid: fix rs_get_progress()
synchronization state/ratio") in combination with commit 7c29744ecce
("dm raid: simplify rs_get_progress()") introduced a regression by
incorrectly reporting a sync_ratio of 0 for degraded raid sets. This
caused lvm2 to fail to repair raid legs automatically.
Fix by identifying the degraded state by checking the MD_RECOVERY_INTR
flag and returning mddev->recovery_cp in case it is set.
MD sets recovery = [ MD_RECOVERY_RECOVER MD_RECOVERY_INTR
MD_RECOVERY_NEEDED ] when a RAID member fails. It then shuts down any
sync thread that is running and leaves us with all MD_RECOVERY_* flags
cleared. The bug occurs if a status is requested in the short time it
takes to shut down any sync thread and clear the flags, because we were
keying in on the MD_RECOVERY_NEEDED - understanding it to be the initial
phase of a “recover” sync thread. However, this is an incorrect
interpretation if MD_RECOVERY_INTR is also set.
This also explains why the bug only happened when automatic repair was
enabled and not a normal ‘manual’ method. It is impossible to react
quick enough to hit the problematic window without it being automated.
Fix passes automatic repair tests.
Fixes: 7c29744ecce ("dm raid: simplify rs_get_progress()")
Signed-off-by: Jonathan Brassow <[email protected]>
Signed-off-by: Heinz Mauelshagen <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
|
|
Otherwise an underlying device's teardown (e.g. SCSI) may race with the
DM ioctl or persistent reservation and result in dereferencing driver
memory that gets freed when the underlying device's final blkdev_put()
occurs.
bdgrab() only increases the refcount for the block_device's inode to
ensure the block_device struct itself will not be freed, but does not
guarantee the block_device will remain associated with the gendisk or
its storage.
Cc: [email protected] # 4.8+
Reported-by: David Jeffery <[email protected]>
Suggested-by: David Jeffery <[email protected]>
Reviewed-by: Ben Marzinski <[email protected]>
Signed-off-by: Mike Snitzer <[email protected]>
|
|
gcc-6.3 and earlier show a new warning after a seemingly unrelated
change to the arm64 PAGE_KERNEL definition:
In file included from drivers/md/dm-bufio.c:14:0:
drivers/md/dm-bufio.c: In function 'alloc_buffer':
include/linux/sched/mm.h:182:56: warning: 'noio_flag' may be used uninitialized in this function [-Wmaybe-uninitialized]
current->flags = (current->flags & ~PF_MEMALLOC_NOIO) | flags;
^
The same warning happened earlier on linux-3.18 for MIPS and I did a
workaround for that, but now it's come back.
gcc-7 and newer are apparently smart enough to figure this out, and
other architectures don't show it, so the best I could come up with is
to rework the caller slightly in a way that makes it obvious enough to
all arm64 compilers what is happening here.
Fixes: 41acec624087 ("arm64: kpti: Make use of nG dependent on arm64_kernel_unmapped_at_el0()")
Link: https://patchwork.kernel.org/patch/9692829/
Cc: [email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
[snitzer: moved declarations inside conditional, altered vmalloc return]
Signed-off-by: Mike Snitzer <[email protected]>
|
|
Fix bugs in signaling the Hyper-V host when freeing space in the
host->guest ring buffer:
1. The interrupt_mask must not be used to determine whether to signal
on the host->guest ring buffer
2. The ring buffer write_index must be read (via hv_get_bytes_to_write)
*after* pending_send_sz is read in order to avoid a race condition
3. Comparisons with pending_send_sz must treat the "equals" case as
not-enough-space
4. Don't signal if the pending_send_sz feature is not present. Older
versions of Hyper-V that don't implement this feature will poll.
Fixes: 03bad714a161 ("vmbus: more host signalling avoidance")
Cc: Stable <[email protected]> # 4.14 and above
Signed-off-by: Michael Kelley <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Commit 57e6f0d7b804 ("typec: tcpm: Only request matching pdos") is causing
a regression, before this commit e.g. the GPD win and GPD pocket devices
were charging at 9V 3A with a PD charger, now they are instead slowly
discharging at 5V 0.4A, as this commit causes the ports max_snk_mv/ma/mw
settings to be completely ignored.
Arguably the way to fix this would be to add a PDO_VAR() describing the
voltage range to the snk_caps of boards which can handle any voltage in
their range, but the "typec: tcpm: Only request matching pdos" commit
looks at the type of PDO advertised by the source/charger and if that
is fixed (as it typically is) only compairs against PDO_FIXED entries
in the snk_caps so supporting a range of voltage would require adding a
PDO_FIXED entry for *every possible* voltage to snk_caps.
AFAICT there is no reason why a fixed source_cap cannot be matched against
a variable snk_cap, so at a minimum the commit should be rewritten to
support that.
For now lets revert the "typec: tcpm: Only request matching pdos" commit,
fixing the regression.
Fixes: 57e6f0d7b804 ("typec: tcpm: Only request matching pdos")
Cc: Badhri Jagan Sridharan <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
Acked-by: Heikki Krogerus <[email protected]>
Acked-by: Adam Thomson <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Without pm_runtime_{get,put}_sync calls in place, reading
vbus status via /sys causes the following error:
Unhandled fault: external abort on non-linefetch (0x1028) at 0xfa0ab060
pgd = b333e822
[fa0ab060] *pgd=48011452(bad)
[<c05261b0>] (musb_default_readb) from [<c0525bd0>] (musb_vbus_show+0x58/0xe4)
[<c0525bd0>] (musb_vbus_show) from [<c04c0148>] (dev_attr_show+0x20/0x44)
[<c04c0148>] (dev_attr_show) from [<c0259f74>] (sysfs_kf_seq_show+0x80/0xdc)
[<c0259f74>] (sysfs_kf_seq_show) from [<c0210bac>] (seq_read+0x250/0x448)
[<c0210bac>] (seq_read) from [<c01edb40>] (__vfs_read+0x1c/0x118)
[<c01edb40>] (__vfs_read) from [<c01edccc>] (vfs_read+0x90/0x144)
[<c01edccc>] (vfs_read) from [<c01ee1d0>] (SyS_read+0x3c/0x74)
[<c01ee1d0>] (SyS_read) from [<c0106fe0>] (ret_fast_syscall+0x0/0x54)
Solution was suggested by Tony Lindgren <[email protected]>.
Signed-off-by: Merlijn Wajer <[email protected]>
Acked-by: Tony Lindgren <[email protected]>
Signed-off-by: Bin Liu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Corsair Strafe RGB keyboard does not respond to usb control messages
sometimes and hence generates timeouts.
Commit de3af5bf259d ("usb: quirks: add delay init quirk for Corsair
Strafe RGB keyboard") tried to fix those timeouts by adding
USB_QUIRK_DELAY_INIT.
Unfortunately, even with this quirk timeouts of usb_control_msg()
can still be seen, but with a lower frequency (approx. 1 out of 15):
[ 29.103520] usb 1-8: string descriptor 0 read error: -110
[ 34.363097] usb 1-8: can't set config #1, error -110
Adding further delays to different locations where usb control
messages are issued just moves the timeouts to other locations,
e.g.:
[ 35.400533] usbhid 1-8:1.0: can't add hid device: -110
[ 35.401014] usbhid: probe of 1-8:1.0 failed with error -110
The only way to reliably avoid those issues is having a pause after
each usb control message. In approx. 200 boot cycles no more timeouts
were seen.
Addionaly, keep USB_QUIRK_DELAY_INIT as it turned out to be necessary
to have the delay in hub_port_connect() after hub_port_init().
The overall boot time seems not to be influenced by these additional
delays, even on fast machines and lightweight distributions.
Fixes: de3af5bf259d ("usb: quirks: add delay init quirk for Corsair Strafe RGB keyboard")
Cc: [email protected]
Signed-off-by: Danilo Krummrich <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
A typo broke the comparison.
Fixes: cbeef22fd611 ("usb: uas: unconditionally bring back host after reset")
Signed-off-by: Oliver Neukum <[email protected]>
CC: [email protected]
Acked-by: Hans de Goede <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Correct spelling of National Semi-conductor (no hyphen)
in drivers/net/ethernet/.
Signed-off-by: Randy Dunlap <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Our statistics strings are allocated at initialization without being
bound to a specific size, yet, we would copy ETH_GSTRING_LEN bytes using
memcpy() which would create out of bounds accesses, this was flagged by
KASAN. Replace this with strlcpy() to make sure we are bound the source
buffer size and we also always NUL-terminate strings.
Fixes: 820ee17b8d3b ("net: phy: broadcom: Add support code for reading PHY counters")
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Our statistics strings are allocated at initialization without being
bound to a specific size, yet, we would copy ETH_GSTRING_LEN bytes using
memcpy() which would create out of bounds accesses, this was flagged by
KASAN. Replace this with strlcpy() to make sure we are bound the source
buffer size and we also always NUL-terminate strings.
Fixes: 2b2427d06426 ("phy: micrel: Add ethtool statistics counters")
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Our statistics strings are allocated at initialization without being
bound to a specific size, yet, we would copy ETH_GSTRING_LEN bytes using
memcpy() which would create out of bounds accesses, this was flagged by
KASAN. Replace this with strlcpy() to make sure we are bound the source
buffer size and we also always NUL-terminate strings.
Fixes: d2fa47d9dd5c ("phy: marvell: Add ethtool statistics counters")
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Our statistics strings are allocated at initialization without being
bound to a specific size, yet, we would copy ETH_GSTRING_LEN bytes using
memcpy() which would create out of bounds accesses, this was flagged by
KASAN. Replace this with strlcpy() to make sure we are bound the source
buffer size and we also always NUL-terminate strings.
Fixes: 967dd82ffc52 ("net: dsa: b53: Add support for Broadcom RoboSwitch")
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Make sure that the CRTC code will call the enable/disable_vblank hooks.
Otherwise, since the refcounting will be off, we might end up in a
situation where the vblank management functions are called while the CRTC
is off.
Reviewed-by: Chen-Yu Tsai <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
In the case where mode_valid callback of our RGB connector was called
before mode_set was being called, the range of dividers would not be set,
resulting in a division by zero later on in the clk_round_rate logic.
Set the range of dividers before calling clk_round_rate to fix this.
Reviewed-by: Chen-Yu Tsai <[email protected]>
Tested-by: Giulio Benetti <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The current logic to deal with old DT missing the LVDS properties doesn't
take into account whether the LVDS output is supported in the first place,
resulting in spurious error messages on SoCs where it doesn't even matter.
Introduce a new TCON flag to list if LVDS is supported at all to prevent
this from happening.
Reviewed-by: Chen-Yu Tsai <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
ashmem_mutex may create a chain of dependencies like:
CPU0 CPU1
mmap syscall ioctl syscall
-> mmap_sem (acquired) -> ashmem_ioctl
-> ashmem_mmap -> ashmem_mutex (acquired)
-> ashmem_mutex (try to acquire) -> copy_from_user
-> mmap_sem (try to acquire)
There is a lock odering problem between mmap_sem and ashmem_mutex causing
a lockdep splat[1] during a syzcaller test. This patch fixes the problem
by move copy_from_user out of ashmem_mutex.
[1] https://www.spinics.net/lists/kernel/msg2733200.html
Fixes: ce8a3a9e76d0 (staging: android: ashmem: Fix a race condition in pin ioctls)
Reported-by: [email protected]
Signed-off-by: Yisheng Xie <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
A rounding error was causing comedi_nsamples_left to
return the wrong value when nsamples was not a multiple
of the scan length.
Cc: <[email protected]> # v4.4+
Signed-off-by: Frank Mori Hess <[email protected]>
Reviewed-by: Ian Abbott <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
After staring hard at sequences like
[ 28.199013] systemd-1 2..s. 26062228us : execlists_submission_tasklet: rcs0 cs-irq head=0 [0?], tail=1 [1?]
[ 28.199095] systemd-1 2..s. 26062229us : execlists_submission_tasklet: rcs0 csb[1]: status=0x00000018:0x00000000, active=0x1
[ 28.199177] systemd-1 2..s. 26062230us : execlists_submission_tasklet: rcs0 out[0]: ctx=0.1, seqno=3, prio=-1024
[ 28.199258] systemd-1 2..s. 26062231us : execlists_submission_tasklet: rcs0 completed ctx=0
[ 28.199340] gem_eio-829 1..s1 26066853us : execlists_submission_tasklet: rcs0 in[0]: ctx=1.1, seqno=1, prio=0
[ 28.199421] <idle>-0 2..s. 26066863us : execlists_submission_tasklet: rcs0 cs-irq head=1 [1?], tail=2 [2?]
[ 28.199503] <idle>-0 2..s. 26066865us : execlists_submission_tasklet: rcs0 csb[2]: status=0x00000001:0x00000000, active=0x1
[ 28.199585] gem_eio-829 1..s1 26067077us : execlists_submission_tasklet: rcs0 in[1]: ctx=3.1, seqno=2, prio=0
[ 28.199667] gem_eio-829 1..s1 26067078us : execlists_submission_tasklet: rcs0 in[0]: ctx=1.2, seqno=1, prio=0
[ 28.199749] <idle>-0 2..s. 26067084us : execlists_submission_tasklet: rcs0 cs-irq head=2 [2?], tail=3 [3?]
[ 28.199830] <idle>-0 2..s. 26067085us : execlists_submission_tasklet: rcs0 csb[3]: status=0x00008002:0x00000001, active=0x1
[ 28.199912] <idle>-0 2..s. 26067086us : execlists_submission_tasklet: rcs0 out[0]: ctx=1.2, seqno=1, prio=0
[ 28.199994] gem_eio-829 2..s. 28246084us : execlists_submission_tasklet: rcs0 cs-irq head=3 [3?], tail=4 [4?]
[ 28.200096] gem_eio-829 2..s. 28246088us : execlists_submission_tasklet: rcs0 csb[4]: status=0x00000014:0x00000001, active=0x5
[ 28.200178] gem_eio-829 2..s. 28246089us : execlists_submission_tasklet: rcs0 out[0]: ctx=0.0, seqno=0, prio=0
[ 28.200260] gem_eio-829 2..s. 28246127us : execlists_submission_tasklet: execlists_submission_tasklet:886 GEM_BUG_ON(buf[2 * head + 1] != port->context_id)
the conclusion is that the only place where the ports are reset to zero,
is from engine->cancel_requests called during i915_gem_set_wedged().
The race is horrible as it results from calling set-wedged on active HW
(the GPU reset failed) and as such we need to be careful as the HW state
changes beneath us. Fortunately, it's the same scary conditions as
affect normal reset, so we can reuse the same machinery to disable state
tracking as we clobber it.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104945
Signed-off-by: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Cc: Michel Thierry <[email protected]>
Fixes: af7a8ffad9c5 ("drm/i915: Use rcu instead of stop_machine in set_wedged")
Reviewed-by: Mika Kuoppala <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 963ddd63c314e9b5d9cd999873d473a93aed5380)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
We're seeing on CI that some contexts don't have the programmed OA
period timer that directs the OA unit on how often to write reports.
The issue is that we're not holding the drm lock from when we edit the
context images down to when we set the exclusive_stream variable. This
leaves a window for the deferred context allocation to call
i915_oa_init_reg_state() that will not program the expected OA timer
value, because we haven't set the exclusive_stream yet.
v2: Drop need_lock from gen8_configure_all_contexts() (Matt)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Matthew Auld <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Fixes: 701f8231a2f ("drm/i915/perf: prune OA configs")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102254
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103715
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103755
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Cc: Jani Nikula <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: [email protected]
Cc: <[email protected]> # v4.14+
(cherry picked from commit 41d3fdcd15d5ecf29cc73e8b79c2327ebb54b960)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
This can happen e.g. during disk cloning.
This is an incomplete fix: it does not catch duplicate UUIDs earlier
when things are still unattached. It does not unregister the device.
Further changes to cope better with this are planned but conflict with
Coly's ongoing improvements to handling device errors. In the meantime,
one can manually stop the device after this has happened.
Attempts to attach a duplicate device result in:
[ 136.372404] loop: module loaded
[ 136.424461] bcache: register_bdev() registered backing device loop0
[ 136.424464] bcache: bch_cached_dev_attach() Tried to attach loop0 but duplicate UUID already attached
My test procedure is:
dd if=/dev/sdb1 of=imgfile bs=1024 count=262144
losetup -f imgfile
Signed-off-by: Michael Lyle <[email protected]>
Reviewed-by: Tang Junhui <[email protected]>
Cc: <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Kernel crashed when register a duplicate cache device, the call trace is
bellow:
[ 417.643790] CPU: 1 PID: 16886 Comm: bcache-register Tainted: G
W OE 4.15.5-amd64-preempt-sysrq-20171018 #2
[ 417.643861] Hardware name: LENOVO 20ERCTO1WW/20ERCTO1WW, BIOS
N1DET41W (1.15 ) 12/31/2015
[ 417.643870] RIP: 0010:bdevname+0x13/0x1e
[ 417.643876] RSP: 0018:ffffa3aa9138fd38 EFLAGS: 00010282
[ 417.643884] RAX: 0000000000000000 RBX: ffff8c8f2f2f8000 RCX: ffffd6701f8
c7edf
[ 417.643890] RDX: ffffa3aa9138fd88 RSI: ffffa3aa9138fd88 RDI: 00000000000
00000
[ 417.643895] RBP: ffffa3aa9138fde0 R08: ffffa3aa9138fae8 R09: 00000000000
1850e
[ 417.643901] R10: ffff8c8eed34b271 R11: ffff8c8eed34b250 R12: 00000000000
00000
[ 417.643906] R13: ffffd6701f78f940 R14: ffff8c8f38f80000 R15: ffff8c8ea7d
90000
[ 417.643913] FS: 00007fde7e66f500(0000) GS:ffff8c8f61440000(0000) knlGS:
0000000000000000
[ 417.643919] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 417.643925] CR2: 0000000000000314 CR3: 00000007e6fa0001 CR4: 00000000003
606e0
[ 417.643931] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 00000000000
00000
[ 417.643938] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 00000000000
00400
[ 417.643946] Call Trace:
[ 417.643978] register_bcache+0x1117/0x1270 [bcache]
[ 417.643994] ? slab_pre_alloc_hook+0x15/0x3c
[ 417.644001] ? slab_post_alloc_hook.isra.44+0xa/0x1a
[ 417.644013] ? kernfs_fop_write+0xf6/0x138
[ 417.644020] kernfs_fop_write+0xf6/0x138
[ 417.644031] __vfs_write+0x31/0xcc
[ 417.644043] ? current_kernel_time64+0x10/0x36
[ 417.644115] ? __audit_syscall_entry+0xbf/0xe3
[ 417.644124] vfs_write+0xa5/0xe2
[ 417.644133] SyS_write+0x5c/0x9f
[ 417.644144] do_syscall_64+0x72/0x81
[ 417.644161] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 417.644169] RIP: 0033:0x7fde7e1c1974
[ 417.644175] RSP: 002b:00007fff13009a38 EFLAGS: 00000246 ORIG_RAX: 0000000
000000001
[ 417.644183] RAX: ffffffffffffffda RBX: 0000000001658280 RCX: 00007fde7e1c
1974
[ 417.644188] RDX: 000000000000000a RSI: 0000000001658280 RDI: 000000000000
0001
[ 417.644193] RBP: 000000000000000a R08: 0000000000000003 R09: 000000000000
0077
[ 417.644198] R10: 000000000000089e R11: 0000000000000246 R12: 000000000000
0001
[ 417.644203] R13: 000000000000000a R14: 7fffffffffffffff R15: 000000000000
0000
[ 417.644213] Code: c7 c2 83 6f ee 98 be 20 00 00 00 48 89 df e8 6c 27 3b 0
0 48 89 d8 5b c3 0f 1f 44 00 00 48 8b 47 70 48 89 f2 48 8b bf 80 00 00 00 <8
b> b0 14 03 00 00 e9 73 ff ff ff 0f 1f 44 00 00 48 8b 47 40 39
[ 417.644302] RIP: bdevname+0x13/0x1e RSP: ffffa3aa9138fd38
[ 417.644306] CR2: 0000000000000314
When registering duplicate cache device in register_cache(), after failure
on calling register_cache_set(), bch_cache_release() will be called, then
bdev will be freed, so bdevname(bdev, name) caused kernel crash.
Since bch_cache_release() will free bdev, so in this patch we make sure
bdev being freed if register_cache() fail, and do not free bdev again in
register_bcache() when register_cache() fail.
Signed-off-by: Tang Junhui <[email protected]>
Reported-by: Marc MERLIN <[email protected]>
Tested-by: Michael Lyle <[email protected]>
Reviewed-by: Michael Lyle <[email protected]>
Cc: <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Descriptor rings were not initialized at zero when allocated
When area contained garbage data, it caused skb_over_panic in
e1000_clean_rx_irq (if data had E1000_RXD_STAT_DD bit set)
This patch makes use of dma_zalloc_coherent to make sure the
ring is memset at 0 to prevent the area from containing garbage.
Following is the signature of the panic:
[email protected]: skbuff: skb_over_panic: text:80407b20 len:64010 put:64010 head:ab46d800 data:ab46d842 tail:0xab47d24c end:0xab46df40 dev:eth0
[email protected]: BUG: failure at net/core/skbuff.c:105/skb_panic()!
[email protected]: Kernel panic - not syncing: BUG!
[email protected]:
[email protected]: Process swapper/0 (pid: 0, threadinfo=81728000, task=8173cc00 ,cpu: 0)
[email protected]: SP = <815a1c0c>
[email protected]: Stack: 00000001
[email protected]: b2d89800 815e33ac
[email protected]: ea73c040 00000001
[email protected]: 60040003 0000fa0a
[email protected]: 00000002
[email protected]:
[email protected]: 804540c0 815a1c70
[email protected]: b2744000 602ac070
[email protected]: 815a1c44 b2d89800
[email protected]: 8173cc00 815a1c08
[email protected]:
[email protected]: 00000006
[email protected]: 815a1b50 00000000
[email protected]: 80079434 00000001
[email protected]: ab46df40 b2744000
[email protected]: b2d89800
[email protected]:
[email protected]: 0000fa0a 8045745c
[email protected]: 815a1c88 0000fa0a
[email protected]: 80407b20 b2789f80
[email protected]: 00000005 80407b20
[email protected]:
[email protected]:
[email protected]: Call Trace:
[email protected]: [<804540bc>] skb_panic+0xa4/0xa8
[email protected]: [<80079430>] console_unlock+0x2f8/0x6d0
[email protected]: [<80457458>] skb_put+0xa0/0xc0
[email protected]: [<80407b1c>] e1000_clean_rx_irq+0x2dc/0x3e8
[email protected]: [<80407b1c>] e1000_clean_rx_irq+0x2dc/0x3e8
[email protected]: [<804079c8>] e1000_clean_rx_irq+0x188/0x3e8
[email protected]: [<80407b1c>] e1000_clean_rx_irq+0x2dc/0x3e8
[email protected]: [<80468b48>] __dev_kfree_skb_any+0x88/0xa8
[email protected]: [<804101ac>] e1000e_poll+0x94/0x288
[email protected]: [<8046e9d4>] net_rx_action+0x19c/0x4e8
[email protected]: ...
[email protected]: Maximum depth to print reached. Use kstack=<maximum_depth_to_print> To specify a custom value (where 0 means to display the full backtrace)
[email protected]: ---[ end Kernel panic - not syncing: BUG!
Signed-off-by: Pierre-Yves Kerbrat <[email protected]>
Signed-off-by: Marius Gligor <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Reviewed-by: Alexander Duyck <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
Clean fake sink flag after detecting link on downstream port.
Fixing display light-up after "hot-unplug&plug again" downstream
of an active dongle.
Signed-off-by: Roman Li <[email protected]>
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Rather than querying it every time we need it.
Also fixes a crash in VM pass through if there is no
root bridge because the cached value fetch already checks
this properly.
v2: fix includes
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=105244
Acked-by: Christian König <[email protected]>
Reviewed-by: Rex Zhu<[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
The read/write pointers on sdma4 devices increment
beyond the ring size and should be masked. Tested
on my Ryzen 2400G.
Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Pull networking fixes from David Miller:
1) Use an appropriate TSQ pacing shift in mac80211, from Toke
Høiland-Jørgensen.
2) Just like ipv4's ip_route_me_harder(), we have to use skb_to_full_sk
in ip6_route_me_harder, from Eric Dumazet.
3) Fix several shutdown races and similar other problems in l2tp, from
James Chapman.
4) Handle missing XDP flush properly in tuntap, for real this time.
From Jason Wang.
5) Out-of-bounds access in powerpc ebpf tailcalls, from Daniel
Borkmann.
6) Fix phy_resume() locking, from Andrew Lunn.
7) IFLA_MTU values are ignored on newlink for some tunnel types, fix
from Xin Long.
8) Revert F-RTO middle box workarounds, they only handle one dimension
of the problem. From Yuchung Cheng.
9) Fix socket refcounting in RDS, from Ka-Cheong Poon.
10) Don't allow ppp unit registration to an unregistered channel, from
Guillaume Nault.
11) Various hv_netvsc fixes from Stephen Hemminger.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (98 commits)
hv_netvsc: propagate rx filters to VF
hv_netvsc: filter multicast/broadcast
hv_netvsc: defer queue selection to VF
hv_netvsc: use napi_schedule_irqoff
hv_netvsc: fix race in napi poll when rescheduling
hv_netvsc: cancel subchannel setup before halting device
hv_netvsc: fix error unwind handling if vmbus_open fails
hv_netvsc: only wake transmit queue if link is up
hv_netvsc: avoid retry on send during shutdown
virtio-net: re enable XDP_REDIRECT for mergeable buffer
ppp: prevent unregistered channels from connecting to PPP units
tc-testing: skbmod: fix match value of ethertype
mlxsw: spectrum_switchdev: Check success of FDB add operation
net: make skb_gso_*_seglen functions private
net: xfrm: use skb_gso_validate_network_len() to check gso sizes
net: sched: tbf: handle GSO_BY_FRAGS case in enqueue
net: rename skb_gso_validate_mtu -> skb_gso_validate_network_len
rds: Incorrect reference counting in TCP socket creation
net: ethtool: don't ignore return from driver get_fecparam method
vrf: check forwarding on the original netdevice when generating ICMP dest unreachable
...
|
|
When autoneg is off, the .check_for_link callback functions clear the
get_link_status flag and systematically return a "pseudo-error". This means
that the link is not detected as up until the next execution of the
e1000_watchdog_task() 2 seconds later.
Fixes: 19110cfbb34d ("e1000e: Separate signaling for link check/link up")
Signed-off-by: Benjamin Poirier <[email protected]>
Acked-by: Sasha Neftin <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
The 82574 specification update errata 12 states that interrupts may be
missed if ICR is read while INT_ASSERTED is not set. Avoid that problem by
setting all bits related to events that can trigger the Other interrupt in
IMS.
The Other interrupt is raised for such events regardless of whether or not
they are set in IMS. However, only when they are set is the INT_ASSERTED
bit also set in ICR.
By doing this, we ensure that INT_ASSERTED is always set when we read ICR
in e1000_msix_other() and steer clear of the errata. This also ensures that
ICR will automatically be cleared on read, therefore we no longer need to
clear bits explicitly.
Signed-off-by: Benjamin Poirier <[email protected]>
Acked-by: Alexander Duyck <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
Restores the ICS write for Rx/Tx queue interrupts which was present before
commit 16ecba59bc33 ("e1000e: Do not read ICR in Other interrupt", v4.5-rc1)
but was not restored in commit 4aea7a5c5e94
("e1000e: Avoid receiver overrun interrupt bursts", v4.15-rc1).
This re-raises the queue interrupts in case the txq or rxq bits were set in
ICR and the Other interrupt handler read and cleared ICR before the queue
interrupt was raised.
Fixes: 4aea7a5c5e94 ("e1000e: Avoid receiver overrun interrupt bursts")
Signed-off-by: Benjamin Poirier <[email protected]>
Acked-by: Alexander Duyck <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
This partially reverts commit 4aea7a5c5e940c1723add439f4088844cd26196d.
We keep the fix for the first part of the problem (1) described in the log
of that commit, that is to read ICR in the other interrupt handler. We
remove the fix for the second part of the problem (2), Other interrupt
throttling.
Bursts of "Other" interrupts may once again occur during rxo (receive
overflow) traffic conditions. This is deemed acceptable in the interest of
avoiding unforeseen fallout from changes that are not strictly necessary.
As discussed, the e1000e driver should be in "maintenance mode".
Link: https://www.spinics.net/lists/netdev/msg480675.html
Signed-off-by: Benjamin Poirier <[email protected]>
Acked-by: Alexander Duyck <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
It was reported that emulated e1000e devices in vmware esxi 6.5 Build
7526125 do not link up after commit 4aea7a5c5e94 ("e1000e: Avoid receiver
overrun interrupt bursts", v4.15-rc1). Some tracing shows that after
e1000e_trigger_lsc() is called, ICR reads out as 0x0 in e1000_msix_other()
on emulated e1000e devices. In comparison, on real e1000e 82574 hardware,
icr=0x80000004 (_INT_ASSERTED | _LSC) in the same situation.
Some experimentation showed that this flaw in vmware e1000e emulation can
be worked around by not setting Other in EIAC. This is how it was before
16ecba59bc33 ("e1000e: Do not read ICR in Other interrupt", v4.5-rc1).
Fixes: 4aea7a5c5e94 ("e1000e: Avoid receiver overrun interrupt bursts")
Signed-off-by: Benjamin Poirier <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
|
|
drivers/media/Kconfig: typo: replace `with` with `which`
Signed-off-by: Michael Ira Krufky <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|
|
If a start bit is detected, then reset the receive buffer counter to 0.
This ensures that no stale data is in the buffer if a message is
broken off midstream due to e.g. a Low Drive condition and then
retransmitted.
The only Rx interrupts we need to listen to are RX_REGISTER_FULL (i.e.
a valid byte was received) and RX_START_BIT_DETECTED (i.e. a new
message starts and we need to reset the counter).
Signed-off-by: Hans Verkuil <[email protected]>
Cc: <[email protected]> # for v4.15 and up
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|
|
This patch fixes compatible for STM32F7 USB OTG HS and consistently rename
dw2_set_params function.
The v2 former patch [1] had been acked by Paul Young, but v1 was merged.
[1] https://patchwork.kernel.org/patch/9925573/
Fixes: d8fae8b93682 ("usb: dwc2: add support for STM32F7xx USB OTG HS")
Signed-off-by: Amelie Delaunay <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
|
|
When I debug a kernel crash issue in funcitonfs, found ffs_data.ref
overflowed, While functionfs is unmounting, ffs_data is put twice.
Commit 43938613c6fd ("drivers, usb: convert ffs_data.ref from atomic_t to
refcount_t") can avoid refcount overflow, but that is risk some situations.
So no need put ffs data in ffs_fs_kill_sb, already put in ffs_data_closed.
The issue can be reproduced in Mediatek mt6763 SoC, ffs for ADB device.
KASAN enabled configuration reports use-after-free errro.
BUG: KASAN: use-after-free in refcount_dec_and_test+0x14/0xe0 at addr ffffffc0579386a0
Read of size 4 by task umount/4650
====================================================
BUG kmalloc-512 (Tainted: P W O ): kasan: bad access detected
-----------------------------------------------------------------------------
INFO: Allocated in ffs_fs_mount+0x194/0x844 age=22856 cpu=2 pid=566
alloc_debug_processing+0x1ac/0x1e8
___slab_alloc.constprop.63+0x640/0x648
__slab_alloc.isra.57.constprop.62+0x24/0x34
kmem_cache_alloc_trace+0x1a8/0x2bc
ffs_fs_mount+0x194/0x844
mount_fs+0x6c/0x1d0
vfs_kern_mount+0x50/0x1b4
do_mount+0x258/0x1034
INFO: Freed in ffs_data_put+0x25c/0x320 age=0 cpu=3 pid=4650
free_debug_processing+0x22c/0x434
__slab_free+0x2d8/0x3a0
kfree+0x254/0x264
ffs_data_put+0x25c/0x320
ffs_data_closed+0x124/0x15c
ffs_fs_kill_sb+0xb8/0x110
deactivate_locked_super+0x6c/0x98
deactivate_super+0xb0/0xbc
INFO: Object 0xffffffc057938600 @offset=1536 fp=0x (null)
......
Call trace:
[<ffffff900808cf5c>] dump_backtrace+0x0/0x250
[<ffffff900808d3a0>] show_stack+0x14/0x1c
[<ffffff90084a8c04>] dump_stack+0xa0/0xc8
[<ffffff900826c2b4>] print_trailer+0x158/0x260
[<ffffff900826d9d8>] object_err+0x3c/0x40
[<ffffff90082745f0>] kasan_report_error+0x2a8/0x754
[<ffffff9008274f84>] kasan_report+0x5c/0x60
[<ffffff9008273208>] __asan_load4+0x70/0x88
[<ffffff90084cd81c>] refcount_dec_and_test+0x14/0xe0
[<ffffff9008d98f9c>] ffs_data_put+0x80/0x320
[<ffffff9008d9d904>] ffs_fs_kill_sb+0xc8/0x110
[<ffffff90082852a0>] deactivate_locked_super+0x6c/0x98
[<ffffff900828537c>] deactivate_super+0xb0/0xbc
[<ffffff90082af0c0>] cleanup_mnt+0x64/0xec
[<ffffff90082af1b0>] __cleanup_mnt+0x10/0x18
[<ffffff90080d9e68>] task_work_run+0xcc/0x124
[<ffffff900808c8c0>] do_notify_resume+0x60/0x70
[<ffffff90080866e4>] work_pending+0x10/0x14
Cc: [email protected]
Signed-off-by: Xinyong <[email protected]>
Signed-off-by: Felipe Balbi <[email protected]>
|