Age | Commit message (Collapse) | Author | Files | Lines |
|
Antoine Tenart says:
====================
gro: various fixes related to UDP tunnels
We found issues when a UDP tunnel endpoint is in a different netns than
where UDP GRO happens. This kind of setup is actually quite diverse,
from having one leg of the tunnel on a remove host, to having a tunnel
between netns (eg. being bridged in another one or on the host). In our
case that UDP tunnel was geneve.
UDP tunnel packets should not be GROed at the UDP level. The fundamental
issue here is such packet can't be detected in a foolproof way: we can't
know by looking at a packet alone and the current logic of looking up
UDP sockets is fragile (socket could be in another netns, packet could
be modified in between, etc). Because there is no way to make the GRO
code to correctly handle those packets in all cases, this series aims at
two things: making the net stack to correctly behave (as in, no crash
and no invalid packet) when such thing happens, and in some cases to
prevent this "early GRO" from happening.
First three patches fix issues when an "UDP tunneled" packet is being
GROed too early by rx-udp-gro-forwarding or rx-gro-list.
Last patch is preventing locally generated UDP tunnel packets from being
GROed. This turns out to be more complex than this patch alone as it
relies on skb->encapsulation which is currently untrusty in some cases
(see iptunnel_handle_offloads); but that should fix things in practice
and is acceptable for a fix. Future work is required to improve things
(prevent all locally generated UDP tunnel packets from being GROed),
such as fixing the misuse of skb->encapsulation in drivers; but that
would be net-next material.
Thanks!
Antoine
Since v3:
- Fixed the udpgro_fwd selftest in patch 5 (Jakub Kicinski feedback).
- Improved commit message on patch 3 (Willem de Bruijn feeback).
Since v2:
- Fixed a build issue with IPv6=m in patch 1 (Jakub Kicinski
feedback).
- Fixed typo in patch 1 (Nikolay Aleksandrov feedback).
- Added Reviewed-by tag on patch 2 (Willem de Bruijn feeback).
- Added back conversion to CHECKSUM_UNNECESSARY but only from non
CHECKSUM_PARTIAL in patch 3 (Paolo Abeni & Willem de Bruijn
feeback).
- Reworded patch 3 commit msg.
Since v1:
- Fixed a build issue with IPv6 disabled in patch 1.
- Reworked commit log in patch 2 (Willem de Bruijn feedback).
- Added Reviewed-by tags on patches 1 & 4 (Willem de Bruijn feeback).
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
UDP tunnel packets can't be GRO in-between their endpoints as this
causes different issues. The UDP GRO fwd vxlan tests were relying on
this and their expectations have to be fixed.
We keep both vxlan tests and expected no GRO from happening. The vxlan
UDP GRO bench test was removed as it's not providing any valuable
information now.
Fixes: a062260a9d5f ("selftests: net: add UDP GRO forwarding self-tests")
Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
GRO has a fundamental issue with UDP tunnel packets as it can't detect
those in a foolproof way and GRO could happen before they reach the
tunnel endpoint. Previous commits have fixed issues when UDP tunnel
packets come from a remote host, but if those packets are issued locally
they could run into checksum issues.
If the inner packet has a partial checksum the information will be lost
in the GRO logic, either in udp4/6_gro_complete or in
udp_gro_complete_segment and packets will have an invalid checksum when
leaving the host.
Prevent local UDP tunnel packets from ever being GROed at the outer UDP
level.
Due to skb->encapsulation being wrongly used in some drivers this is
actually only preventing UDP tunnel packets with a partial checksum to
be GROed (see iptunnel_handle_offloads) but those were also the packets
triggering issues so in practice this should be sufficient.
Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
Fixes: 36707061d6ba ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets")
Suggested-by: Paolo Abeni <[email protected]>
Signed-off-by: Antoine Tenart <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
UDP GRO validates checksums and in udp4/6_gro_complete fraglist packets
are converted to CHECKSUM_UNNECESSARY to avoid later checks. However
this is an issue for CHECKSUM_PARTIAL packets as they can be looped in
an egress path and then their partial checksums are not fixed.
Different issues can be observed, from invalid checksum on packets to
traces like:
gen01: hw csum failure
skb len=3008 headroom=160 headlen=1376 tailroom=0
mac=(106,14) net=(120,40) trans=160
shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
csum(0xffff232e ip_summed=2 complete_sw=0 valid=0 level=0)
hash(0x77e3d716 sw=1 l4=1) proto=0x86dd pkttype=0 iif=12
...
Fix this by only converting CHECKSUM_NONE packets to
CHECKSUM_UNNECESSARY by reusing __skb_incr_checksum_unnecessary. All
other checksum types are kept as-is, including CHECKSUM_COMPLETE as
fraglist packets being segmented back would have their skb->csum valid.
Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
Signed-off-by: Antoine Tenart <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
If packets are GROed with fraglist they might be segmented later on and
continue their journey in the stack. In skb_segment_list those skbs can
be reused as-is. This is an issue as their destructor was removed in
skb_gro_receive_list but not the reference to their socket, and then
they can't be orphaned. Fix this by also removing the reference to the
socket.
For example this could be observed,
kernel BUG at include/linux/skbuff.h:3131! (skb_orphan)
RIP: 0010:ip6_rcv_core+0x11bc/0x19a0
Call Trace:
ipv6_list_rcv+0x250/0x3f0
__netif_receive_skb_list_core+0x49d/0x8f0
netif_receive_skb_list_internal+0x634/0xd40
napi_complete_done+0x1d2/0x7d0
gro_cell_poll+0x118/0x1f0
A similar construction is found in skb_gro_receive, apply the same
change there.
Fixes: 5e10da5385d2 ("skbuff: allow 'slow_gro' for skb carring sock reference")
Signed-off-by: Antoine Tenart <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When rx-udp-gro-forwarding is enabled UDP packets might be GROed when
being forwarded. If such packets might land in a tunnel this can cause
various issues and udp_gro_receive makes sure this isn't the case by
looking for a matching socket. This is performed in
udp4/6_gro_lookup_skb but only in the current netns. This is an issue
with tunneled packets when the endpoint is in another netns. In such
cases the packets will be GROed at the UDP level, which leads to various
issues later on. The same thing can happen with rx-gro-list.
We saw this with geneve packets being GROed at the UDP level. In such
case gso_size is set; later the packet goes through the geneve rx path,
the geneve header is pulled, the offset are adjusted and frag_list skbs
are not adjusted with regard to geneve. When those skbs hit
skb_fragment, it will misbehave. Different outcomes are possible
depending on what the GROed skbs look like; from corrupted packets to
kernel crashes.
One example is a BUG_ON[1] triggered in skb_segment while processing the
frag_list. Because gso_size is wrong (geneve header was pulled)
skb_segment thinks there is "geneve header size" of data in frag_list,
although it's in fact the next packet. The BUG_ON itself has nothing to
do with the issue. This is only one of the potential issues.
Looking up for a matching socket in udp_gro_receive is fragile: the
lookup could be extended to all netns (not speaking about performances)
but nothing prevents those packets from being modified in between and we
could still not find a matching socket. It's OK to keep the current
logic there as it should cover most cases but we also need to make sure
we handle tunnel packets being GROed too early.
This is done by extending the checks in udp_unexpected_gso: GSO packets
lacking the SKB_GSO_UDP_TUNNEL/_CSUM bits and landing in a tunnel must
be segmented.
[1] kernel BUG at net/core/skbuff.c:4408!
RIP: 0010:skb_segment+0xd2a/0xf70
__udp_gso_segment+0xaa/0x560
Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
Fixes: 36707061d6ba ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets")
Signed-off-by: Antoine Tenart <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Up till now only single character ('A' or 'B') was used to provide
information of HSR slave network device status.
As it is also possible and valid, that Interlink network device may
be supported as well, the description must be more verbose. As a result
the full string description is now used.
Signed-off-by: Lukasz Majewski <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Include the header that defines u32.
This fixes build of 6.6.23 and 6.1.83 kernels for Alpine Linux, which
uses musl libc. I assume that GNU libc indirecly pulls in linux/types.h.
Fixes: 9707ac4fe2f5 ("tools/resolve_btfids: Refactor set sorting with types from btf_ids.h")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218647
Cc: [email protected]
Signed-off-by: Natanael Copa <[email protected]>
Tested-by: Greg Thelen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2024-03-27 (e1000e)
This series contains updates to e1000e driver only.
Vitaly adds retry mechanism for some PHY operations to workaround MDI
error and moves SMBus configuration to avoid possible PHY loss.
* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
e1000e: move force SMBUS from enable ulp function to avoid PHY loss issue
e1000e: Workaround for sporadic MDI error on Meteor Lake systems
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
BPF link for some program types is passed as a "context" which can be
used by those BPF programs to look up additional information. E.g., for
multi-kprobes and multi-uprobes, link is used to fetch BPF cookie values.
Because of this runtime dependency, when bpf_link refcnt drops to zero
there could still be active BPF programs running accessing link data.
This patch adds generic support to defer bpf_link dealloc callback to
after RCU GP, if requested. This is done by exposing two different
deallocation callbacks, one synchronous and one deferred. If deferred
one is provided, bpf_link_free() will schedule dealloc_deferred()
callback to happen after RCU GP.
BPF is using two flavors of RCU: "classic" non-sleepable one and RCU
tasks trace one. The latter is used when sleepable BPF programs are
used. bpf_link_free() accommodates that by checking underlying BPF
program's sleepable flag, and goes either through normal RCU GP only for
non-sleepable, or through RCU tasks trace GP *and* then normal RCU GP
(taking into account rcu_trace_implies_rcu_gp() optimization), if BPF
program is sleepable.
We use this for multi-kprobe and multi-uprobe links, which dereference
link during program run. We also preventively switch raw_tp link to use
deferred dealloc callback, as upcoming changes in bpf-next tree expose
raw_tp link data (specifically, cookie value) to BPF program at runtime
as well.
Fixes: 0dcac2725406 ("bpf: Add multi kprobe link")
Fixes: 89ae89f53d20 ("bpf: Add multi uprobe link")
Reported-by: [email protected]
Reported-by: [email protected]
Reported-by: [email protected]
Signed-off-by: Andrii Nakryiko <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
There is no need to delay putting either path or task to deallocation
step. It can be done right after bpf_uprobe_unregister. Between release
and dealloc, there could be still some running BPF programs, but they
don't access either task or path, only data in link->uprobes, so it is
safe to do.
On the other hand, doing path_put() in dealloc callback makes this
dealloc sleepable because path_put() itself might sleep. Which is
problematic due to the need to call uprobe's dealloc through call_rcu(),
which is what is done in the next bug fix patch. So solve the problem by
releasing these resources early.
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Notice that skb_mark_for_recycle() is introduced later than fixes tag in
commit 6a5bcd84e886 ("page_pool: Allow drivers to hint on SKB recycling").
It is believed that fixes tag were missing a call to page_pool_release_page()
between v5.9 to v5.14, after which is should have used skb_mark_for_recycle().
Since v6.6 the call page_pool_release_page() were removed (in
commit 535b9c61bdef ("net: page_pool: hide page_pool_release_page()")
and remaining callers converted (in commit 6bfef2ec0172 ("Merge branch
'net-page_pool-remove-page_pool_release_page'")).
This leak became visible in v6.8 via commit dba1b8a7ab68 ("mm/page_pool: catch
page_pool memory leaks").
Cc: [email protected]
Fixes: 6c5aa6fc4def ("xen networking: add basic XDP support for xen-netfront")
Reported-by: Leonidas Spyropoulos <[email protected]>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218654
Reported-by: Arthur Borsboom <[email protected]>
Signed-off-by: Jesper Dangaard Brouer <[email protected]>
Link: https://lore.kernel.org/r/171154167446.2671062.9127105384591237363.stgit@firesoul
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Emails to Jeff Sipek bounce:
Your message to [email protected] couldn't be delivered.
Recipient is not authorized to accept external mail
Status code: 550 5.7.1_ETR
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Provide devlink documentation for three eswitch attributes:
mode, inline-mode, and encap-mode.
Signed-off-by: William Tu <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC core:
- Fix regression for the mmc ioctl
MMC host:
- sdhci-of-dwcmshc: Fixup PM support in ->remove_new()
- sdhci-omap: Re-tune when device became runtime suspended"
* tag 'mmc-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
sdhci-of-dwcmshc: disable PM runtime in dwcmshc_remove()
mmc: sdhci-omap: re-tuning is needed after a pm transition to support emmc HS200 mode
mmc: core: Avoid negative index with array access
mmc: core: Initialize mmc_blk_ioc_data
|
|
This reverts commit 748dc0b65ec2b4b7b3dbd7befcc4a54fdcac7988.
Partial zone append completions cannot be supported as there is no
guarantees that the fragmented data will be written sequentially in the
same manner as with a full command. Commit 748dc0b65ec2 ("block: fix
partial zone append completion handling in req_bio_endio()") changed
req_bio_endio() to always advance a partially failed BIO by its full
length, but this can lead to incorrect accounting. So revert this
change and let low level device drivers handle this case by always
failing completely zone append operations. With this revert, users will
still see an IO error for a partially completed zone append BIO.
Fixes: 748dc0b65ec2 ("block: fix partial zone append completion handling in req_bio_endio()")
Cc: [email protected]
Signed-off-by: Damien Le Moal <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of device-specific small fixes: a series of fixes for
TAS2781 HD-audio codec, ASoC SOF, Cirrus CS35L56 and a couple of
legacy drivers"
* tag 'sound-6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/tas2781: remove useless dev_dbg from playback_hook
ALSA: hda/tas2781: add debug statements to kcontrols
ALSA: hda/tas2781: add locks to kcontrols
ALSA: hda/tas2781: remove digital gain kcontrol
ALSA: aoa: avoid false-positive format truncation warning
ALSA: sh: aica: reorder cleanup operations to avoid UAF bugs
ALSA: hda: cs35l56: Set the init_done flag before component_add()
ALSA: hda: cs35l56: Raise device name message log level
ASoC: SOF: ipc4-topology: support NHLT device type
ALSA: hda: intel-nhlt: add intel_nhlt_ssp_device_type() function
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
"ARM SMMU fixes:
- Fix swabbing of the STE fields in the unlikely event of running on
a big-endian machine
- Fix setting of STE.SHCFG on hardware that doesn't implement support
for attribute overrides
IOMMU core:
- PASID validation fix in device attach path"
* tag 'iommu-fixes-v6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu: Validate the PASID in iommu_attach_device_pasid()
iommu/arm-smmu-v3: Fix access for STE.SHCFG
iommu/arm-smmu-v3: Add cpu_to_le64() around STRTAB_STE_0_V
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fixes from Chuck Lever:
- Address three recently introduced regressions
* tag 'nfsd-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
NFSD: CREATE_SESSION must never cache NFS4ERR_DELAY replies
SUNRPC: Revert 561141dd494382217bace4d1a51d08168420eace
nfsd: Fix error cleanup path in nfsd_rename()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bpf, WiFi and netfilter.
Current release - regressions:
- ipv6: fix address dump when IPv6 is disabled on an interface
Current release - new code bugs:
- bpf: temporarily disable atomic operations in BPF arena
- nexthop: fix uninitialized variable in nla_put_nh_group_stats()
Previous releases - regressions:
- bpf: protect against int overflow for stack access size
- hsr: fix the promiscuous mode in offload mode
- wifi: don't always use FW dump trig
- tls: adjust recv return with async crypto and failed copy to
userspace
- tcp: properly terminate timers for kernel sockets
- ice: fix memory corruption bug with suspend and rebuild
- at803x: fix kernel panic with at8031_probe
- qeth: handle deferred cc1
Previous releases - always broken:
- bpf: fix bug in BPF_LDX_MEMSX
- netfilter: reject table flag and netdev basechain updates
- inet_defrag: prevent sk release while still in use
- wifi: pick the version of SESSION_PROTECTION_NOTIF
- wwan: t7xx: split 64bit accesses to fix alignment issues
- mlxbf_gige: call request_irq() after NAPI initialized
- hns3: fix kernel crash when devlink reload during pf
initialization"
* tag 'net-6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits)
inet: inet_defrag: prevent sk release while still in use
Octeontx2-af: fix pause frame configuration in GMP mode
net: lan743x: Add set RFE read fifo threshold for PCI1x1x chips
net: bcmasp: Remove phy_{suspend/resume}
net: bcmasp: Bring up unimac after PHY link up
net: phy: qcom: at803x: fix kernel panic with at8031_probe
netfilter: arptables: Select NETFILTER_FAMILY_ARP when building arp_tables.c
netfilter: nf_tables: skip netdev hook unregistration if table is dormant
netfilter: nf_tables: reject table flag and netdev basechain updates
netfilter: nf_tables: reject destroy command to remove basechain hooks
bpf: update BPF LSM designated reviewer list
bpf: Protect against int overflow for stack access size
bpf: Check bloom filter map value size
bpf: fix warning for crash_kexec
selftests: netdevsim: set test timeout to 10 minutes
net: wan: framer: Add missing static inline qualifiers
mlxbf_gige: call request_irq() after NAPI initialized
tls: get psock ref after taking rxlock to avoid leak
selftests: tls: add test with a partially invalid iov
tls: adjust recv return with async crypto and failed copy to userspace
...
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:
bridge:
- select DRM_KMS_HELPER
dma-buf:
- fix NULL-pointer deref
dp:
- fix div-by-zero in DP MST unplug code
fbdev:
- select FB_IOMEM_FOPS for SBus
nouveau:
- dmem: handle kcalloc() allocation failures
qxl:
- remove unused variables
rockchip:
- vop2: remove support for AR30 and AB30 formats
sched:
- fix NULL-pointer deref
vmwgfx:
- debugfs: create ttm_resource_manager entry only if needed
Signed-off-by: Dave Airlie <[email protected]>
From: Thomas Zimmermann <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
This is required, as CONFIG_DAMON_DEBUGFS is enabled, and --alltests UML
builds will fail due to the missing config option otherwise.
Fixes: f4cba4bf6777 ("mm/damon: rename CONFIG_DAMON_DBGFS to DAMON_DBGFS_DEPRECATED")
Signed-off-by: David Gow <[email protected]>
Reviewed-by: Rae Moar <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
intel_bios_encoder_supports_dp_dual_mode()
If we have no VBT, or the VBT didn't declare the encoder
in question, we won't have the 'devdata' for the encoder.
Instead of oopsing just bail early.
We won't be able to tell whether the port is DP++ or not,
but so be it.
Cc: [email protected]
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10464
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Jani Nikula <[email protected]>
(cherry picked from commit 26410896206342c8a80d2b027923e9ee7d33b733)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Calling i915_gem_object_get_dma_address() from the vblank
evade critical section triggers might_sleep().
While we know that we've already pinned the framebuffer
and thus i915_gem_object_get_dma_address() will in fact
not sleep in this case, it seems reasonable to keep the
unconditional might_sleep() for maximum coverage.
So let's instead pre-populate the dma address during
fb pinning, which all happens before we enter the
vblank evade critical section.
We can use u32 for the dma address as this class of
hardware doesn't support >32bit addresses.
Cc: [email protected]
Fixes: 0225a90981c8 ("drm/i915: Make cursor plane registers unlocked")
Reported-by: Borislav Petkov <[email protected]>
Closes: https://lore.kernel.org/intel-gfx/20240227100342.GAZd2zfmYcPS_SndtO@fat_crate.local/
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Tested-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Chaitanya Kumar Borah <[email protected]>
(cherry picked from commit c1289a5c3594cf04caa94ebf0edeb50c62009f1f)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Originally, with strict in order execution, we could complete execution
only when the queue was empty. Preempt-to-busy allows replacement of an
active request that may complete before the preemption is processed by
HW. If that happens, the request is retired from the queue, but the
queue_priority_hint remains set, preventing direct submission until
after the next CS interrupt is processed.
This preempt-to-busy race can be triggered by the heartbeat, which will
also act as the power-management barrier and upon completion allow us to
idle the HW. We may process the completion of the heartbeat, and begin
parking the engine before the CS event that restores the
queue_priority_hint, causing us to fail the assertion that it is MIN.
<3>[ 166.210729] __engine_park:283 GEM_BUG_ON(engine->sched_engine->queue_priority_hint != (-((int)(~0U >> 1)) - 1))
<0>[ 166.210781] Dumping ftrace buffer:
<0>[ 166.210795] ---------------------------------
...
<0>[ 167.302811] drm_fdin-1097 2..s1. 165741070us : trace_ports: 0000:00:02.0 rcs0: promote { ccid:20 1217:2 prio 0 }
<0>[ 167.302861] drm_fdin-1097 2d.s2. 165741072us : execlists_submission_tasklet: 0000:00:02.0 rcs0: preempting last=1217:2, prio=0, hint=2147483646
<0>[ 167.302928] drm_fdin-1097 2d.s2. 165741072us : __i915_request_unsubmit: 0000:00:02.0 rcs0: fence 1217:2, current 0
<0>[ 167.302992] drm_fdin-1097 2d.s2. 165741073us : __i915_request_submit: 0000:00:02.0 rcs0: fence 3:4660, current 4659
<0>[ 167.303044] drm_fdin-1097 2d.s1. 165741076us : execlists_submission_tasklet: 0000:00:02.0 rcs0: context:3 schedule-in, ccid:40
<0>[ 167.303095] drm_fdin-1097 2d.s1. 165741077us : trace_ports: 0000:00:02.0 rcs0: submit { ccid:40 3:4660* prio 2147483646 }
<0>[ 167.303159] kworker/-89 11..... 165741139us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence c90:2, current 2
<0>[ 167.303208] kworker/-89 11..... 165741148us : __intel_context_do_unpin: 0000:00:02.0 rcs0: context:c90 unpin
<0>[ 167.303272] kworker/-89 11..... 165741159us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence 1217:2, current 2
<0>[ 167.303321] kworker/-89 11..... 165741166us : __intel_context_do_unpin: 0000:00:02.0 rcs0: context:1217 unpin
<0>[ 167.303384] kworker/-89 11..... 165741170us : i915_request_retire.part.0: 0000:00:02.0 rcs0: fence 3:4660, current 4660
<0>[ 167.303434] kworker/-89 11d..1. 165741172us : __intel_context_retire: 0000:00:02.0 rcs0: context:1216 retire runtime: { total:56028ns, avg:56028ns }
<0>[ 167.303484] kworker/-89 11..... 165741198us : __engine_park: 0000:00:02.0 rcs0: parked
<0>[ 167.303534] <idle>-0 5d.H3. 165741207us : execlists_irq_handler: 0000:00:02.0 rcs0: semaphore yield: 00000040
<0>[ 167.303583] kworker/-89 11..... 165741397us : __intel_context_retire: 0000:00:02.0 rcs0: context:1217 retire runtime: { total:325575ns, avg:0ns }
<0>[ 167.303756] kworker/-89 11..... 165741777us : __intel_context_retire: 0000:00:02.0 rcs0: context:c90 retire runtime: { total:0ns, avg:0ns }
<0>[ 167.303806] kworker/-89 11..... 165742017us : __engine_park: __engine_park:283 GEM_BUG_ON(engine->sched_engine->queue_priority_hint != (-((int)(~0U >> 1)) - 1))
<0>[ 167.303811] ---------------------------------
<4>[ 167.304722] ------------[ cut here ]------------
<2>[ 167.304725] kernel BUG at drivers/gpu/drm/i915/gt/intel_engine_pm.c:283!
<4>[ 167.304731] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
<4>[ 167.304734] CPU: 11 PID: 89 Comm: kworker/11:1 Tainted: G W 6.8.0-rc2-CI_DRM_14193-gc655e0fd2804+ #1
<4>[ 167.304736] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022
<4>[ 167.304738] Workqueue: i915-unordered retire_work_handler [i915]
<4>[ 167.304839] RIP: 0010:__engine_park+0x3fd/0x680 [i915]
<4>[ 167.304937] Code: 00 48 c7 c2 b0 e5 86 a0 48 8d 3d 00 00 00 00 e8 79 48 d4 e0 bf 01 00 00 00 e8 ef 0a d4 e0 31 f6 bf 09 00 00 00 e8 03 49 c0 e0 <0f> 0b 0f 0b be 01 00 00 00 e8 f5 61 fd ff 31 c0 e9 34 fd ff ff 48
<4>[ 167.304940] RSP: 0018:ffffc9000059fce0 EFLAGS: 00010246
<4>[ 167.304942] RAX: 0000000000000200 RBX: 0000000000000000 RCX: 0000000000000006
<4>[ 167.304944] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009
<4>[ 167.304946] RBP: ffff8881330ca1b0 R08: 0000000000000001 R09: 0000000000000001
<4>[ 167.304947] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8881330ca000
<4>[ 167.304948] R13: ffff888110f02aa0 R14: ffff88812d1d0205 R15: ffff88811277d4f0
<4>[ 167.304950] FS: 0000000000000000(0000) GS:ffff88844f780000(0000) knlGS:0000000000000000
<4>[ 167.304952] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 167.304953] CR2: 00007fc362200c40 CR3: 000000013306e003 CR4: 0000000000770ef0
<4>[ 167.304955] PKRU: 55555554
<4>[ 167.304957] Call Trace:
<4>[ 167.304958] <TASK>
<4>[ 167.305573] ____intel_wakeref_put_last+0x1d/0x80 [i915]
<4>[ 167.305685] i915_request_retire.part.0+0x34f/0x600 [i915]
<4>[ 167.305800] retire_requests+0x51/0x80 [i915]
<4>[ 167.305892] intel_gt_retire_requests_timeout+0x27f/0x700 [i915]
<4>[ 167.305985] process_scheduled_works+0x2db/0x530
<4>[ 167.305990] worker_thread+0x18c/0x350
<4>[ 167.305993] kthread+0xfe/0x130
<4>[ 167.305997] ret_from_fork+0x2c/0x50
<4>[ 167.306001] ret_from_fork_asm+0x1b/0x30
<4>[ 167.306004] </TASK>
It is necessary for the queue_priority_hint to be lower than the next
request submission upon waking up, as we rely on the hint to decide when
to kick the tasklet to submit that first request.
Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/10154
Signed-off-by: Chris Wilson <[email protected]>
Signed-off-by: Janusz Krzysztofik <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Cc: <[email protected]> # v5.4+
Reviewed-by: Rodrigo Vivi <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Signed-off-by: Andi Shyti <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 98850e96cf811dc2d0a7d0af491caff9f5d49c1e)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Object debugging tools were sporadically reporting illegal attempts to
free a still active i915 VMA object when parking a GT believed to be idle.
[161.359441] ODEBUG: free active (active state 0) object: ffff88811643b958 object type: i915_active hint: __i915_vma_active+0x0/0x50 [i915]
[161.360082] WARNING: CPU: 5 PID: 276 at lib/debugobjects.c:514 debug_print_object+0x80/0xb0
...
[161.360304] CPU: 5 PID: 276 Comm: kworker/5:2 Not tainted 6.5.0-rc1-CI_DRM_13375-g003f860e5577+ #1
[161.360314] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022
[161.360322] Workqueue: i915-unordered __intel_wakeref_put_work [i915]
[161.360592] RIP: 0010:debug_print_object+0x80/0xb0
...
[161.361347] debug_object_free+0xeb/0x110
[161.361362] i915_active_fini+0x14/0x130 [i915]
[161.361866] release_references+0xfe/0x1f0 [i915]
[161.362543] i915_vma_parked+0x1db/0x380 [i915]
[161.363129] __gt_park+0x121/0x230 [i915]
[161.363515] ____intel_wakeref_put_last+0x1f/0x70 [i915]
That has been tracked down to be happening when another thread is
deactivating the VMA inside __active_retire() helper, after the VMA's
active counter has been already decremented to 0, but before deactivation
of the VMA's object is reported to the object debugging tool.
We could prevent from that race by serializing i915_active_fini() with
__active_retire() via ref->tree_lock, but that wouldn't stop the VMA from
being used, e.g. from __i915_vma_retire() called at the end of
__active_retire(), after that VMA has been already freed by a concurrent
i915_vma_destroy() on return from the i915_active_fini(). Then, we should
rather fix the issue at the VMA level, not in i915_active.
Since __i915_vma_parked() is called from __gt_park() on last put of the
GT's wakeref, the issue could be addressed by holding the GT wakeref long
enough for __active_retire() to complete before that wakeref is released
and the GT parked.
I believe the issue was introduced by commit d93939730347 ("drm/i915:
Remove the vma refcount") which moved a call to i915_active_fini() from
a dropped i915_vma_release(), called on last put of the removed VMA kref,
to i915_vma_parked() processing path called on last put of a GT wakeref.
However, its visibility to the object debugging tool was suppressed by a
bug in i915_active that was fixed two weeks later with commit e92eb246feb9
("drm/i915/active: Fix missing debug object activation").
A VMA associated with a request doesn't acquire a GT wakeref by itself.
Instead, it depends on a wakeref held directly by the request's active
intel_context for a GT associated with its VM, and indirectly on that
intel_context's engine wakeref if the engine belongs to the same GT as the
VMA's VM. Those wakerefs are released asynchronously to VMA deactivation.
Fix the issue by getting a wakeref for the VMA's GT when activating it,
and putting that wakeref only after the VMA is deactivated. However,
exclude global GTT from that processing path, otherwise the GPU never goes
idle. Since __i915_vma_retire() may be called from atomic contexts, use
async variant of wakeref put. Also, to avoid circular locking dependency,
take care of acquiring the wakeref before VM mutex when both are needed.
v7: Add inline comments with justifications for:
- using untracked variants of intel_gt_pm_get/put() (Nirmoy),
- using async variant of _put(),
- not getting the wakeref in case of a global GTT,
- always getting the first wakeref outside vm->mutex.
v6: Since __i915_vma_active/retire() callbacks are not serialized, storing
a wakeref tracking handle inside struct i915_vma is not safe, and
there is no other good place for that. Use untracked variants of
intel_gt_pm_get/put_async().
v5: Replace "tile" with "GT" across commit description (Rodrigo),
- avoid mentioning multi-GT case in commit description (Rodrigo),
- explain why we need to take a temporary wakeref unconditionally inside
i915_vma_pin_ww() (Rodrigo).
v4: Refresh on top of commit 5e4e06e4087e ("drm/i915: Track gt pm
wakerefs") (Andi),
- for more easy backporting, split out removal of former insufficient
workarounds and move them to separate patches (Nirmoy).
- clean up commit message and description a bit.
v3: Identify root cause more precisely, and a commit to blame,
- identify and drop former workarounds,
- update commit message and description.
v2: Get the wakeref before VM mutex to avoid circular locking dependency,
- drop questionable Fixes: tag.
Fixes: d93939730347 ("drm/i915: Remove the vma refcount")
Closes: https://gitlab.freedesktop.org/drm/intel/issues/8875
Signed-off-by: Janusz Krzysztofik <[email protected]>
Cc: Thomas Hellström <[email protected]>
Cc: Nirmoy Das <[email protected]>
Cc: Andi Shyti <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: [email protected] # v5.19+
Reviewed-by: Nirmoy Das <[email protected]>
Signed-off-by: Andi Shyti <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit f3c71b2ded5c4367144a810ef25f998fd1d6c381)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
It is misleading, if the intention was to also print something
in case it succeed it should have a different string.
Cc: Alan Previn <[email protected]>
Signed-off-by: José Roberto de Souza <[email protected]>
Fixes: 698e19da2914 ("drm/i915: Skip pxp init if gt is wedged")
Reviewed-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit d437099ab21cd4c6ce5d578b765df642d759c929)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Since commit 0c65dc062611 ("drm/i915/jsl: s/JSL/JASPERLAKE for
platform/subplatform defines"), boot freezes on a Jasper Lake tablet
(Librem 11), usually with graphical corruption on the eDP display,
but sometimes just a black screen. This commit was included in 6.6 and
later.
That commit was intended to refactor EHL and JSL macros, but the change
to ehl_combo_pll_div_frac_wa_needed() started matching JSL incorrectly
when it was only intended to match EHL.
It replaced:
return ((IS_PLATFORM(i915, INTEL_ELKHARTLAKE) &&
IS_JSL_EHL_DISPLAY_STEP(i915, STEP_B0, STEP_FOREVER)) ||
with:
return (((IS_ELKHARTLAKE(i915) || IS_JASPERLAKE(i915)) &&
IS_DISPLAY_STEP(i915, STEP_B0, STEP_FOREVER)) ||
Remove IS_JASPERLAKE() to fix the regression.
Signed-off-by: Jonathon Hall <[email protected]>
Cc: [email protected]
Fixes: 0c65dc062611 ("drm/i915/jsl: s/JSL/JASPERLAKE for platform/subplatform defines")
Reviewed-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Jani Nikula <[email protected]>
(cherry picked from commit 1ef48859317b2a77672dea8682df133abf9c44ed)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
In i915 hwmon sysfs getter path we now take a hwmon_lock, then acquire an
rpm wakeref. That results in lock inversion:
<4> [197.079335] ======================================================
<4> [197.085473] WARNING: possible circular locking dependency detected
<4> [197.091611] 6.8.0-rc7-Patchwork_129026v7-gc4dc92fb1152+ #1 Not tainted
<4> [197.098096] ------------------------------------------------------
<4> [197.104231] prometheus-node/839 is trying to acquire lock:
<4> [197.109680] ffffffff82764d80 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc+0x9a/0x350
<4> [197.116939]
but task is already holding lock:
<4> [197.122730] ffff88811b772a40 (&hwmon->hwmon_lock){+.+.}-{3:3}, at: hwm_energy+0x4b/0x100 [i915]
<4> [197.131543]
which lock already depends on the new lock.
...
<4> [197.507922] Chain exists of:
fs_reclaim --> >->reset.mutex --> &hwmon->hwmon_lock
<4> [197.518528] Possible unsafe locking scenario:
<4> [197.524411] CPU0 CPU1
<4> [197.528916] ---- ----
<4> [197.533418] lock(&hwmon->hwmon_lock);
<4> [197.537237] lock(>->reset.mutex);
<4> [197.543376] lock(&hwmon->hwmon_lock);
<4> [197.549682] lock(fs_reclaim);
...
<4> [197.632548] Call Trace:
<4> [197.634990] <TASK>
<4> [197.637088] dump_stack_lvl+0x64/0xb0
<4> [197.640738] check_noncircular+0x15e/0x180
<4> [197.652968] check_prev_add+0xe9/0xce0
<4> [197.656705] __lock_acquire+0x179f/0x2300
<4> [197.660694] lock_acquire+0xd8/0x2d0
<4> [197.673009] fs_reclaim_acquire+0xa1/0xd0
<4> [197.680478] __kmalloc+0x9a/0x350
<4> [197.689063] acpi_ns_internalize_name.part.0+0x4a/0xb0
<4> [197.694170] acpi_ns_get_node_unlocked+0x60/0xf0
<4> [197.720608] acpi_ns_get_node+0x3b/0x60
<4> [197.724428] acpi_get_handle+0x57/0xb0
<4> [197.728164] acpi_has_method+0x20/0x50
<4> [197.731896] acpi_pci_set_power_state+0x43/0x120
<4> [197.736485] pci_power_up+0x24/0x1c0
<4> [197.740047] pci_pm_default_resume_early+0x9/0x30
<4> [197.744725] pci_pm_runtime_resume+0x2d/0x90
<4> [197.753911] __rpm_callback+0x3c/0x110
<4> [197.762586] rpm_callback+0x58/0x70
<4> [197.766064] rpm_resume+0x51e/0x730
<4> [197.769542] rpm_resume+0x267/0x730
<4> [197.773020] rpm_resume+0x267/0x730
<4> [197.776498] rpm_resume+0x267/0x730
<4> [197.779974] __pm_runtime_resume+0x49/0x90
<4> [197.784055] __intel_runtime_pm_get+0x19/0xa0 [i915]
<4> [197.789070] hwm_energy+0x55/0x100 [i915]
<4> [197.793183] hwm_read+0x9a/0x310 [i915]
<4> [197.797124] hwmon_attr_show+0x36/0x120
<4> [197.800946] dev_attr_show+0x15/0x60
<4> [197.804509] sysfs_kf_seq_show+0xb5/0x100
Acquire the wakeref before the lock and hold it as long as the lock is
also held. Follow that pattern across the whole source file where similar
lock inversion can happen.
v2: Keep hardware read under the lock so the whole operation of updating
energy from hardware is still atomic (Guenter),
- instead, acquire the rpm wakeref before the lock and hold it as long
as the lock is held,
- use the same aproach for other similar places across the i915_hwmon.c
source file (Rodrigo).
Fixes: 1b44019a93e2 ("drm/i915/guc: Disable PL1 power limit when loading GuC firmware")
Signed-off-by: Janusz Krzysztofik <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: <[email protected]> # v6.5+
Reviewed-by: Ashutosh Dixit <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Signed-off-by: Andi Shyti <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 71b218771426ea84c0e0148a2b7ac52c1f76e792)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Looks like the undelayed vblank gets signalled exactly when
the active period ends. That is a problem for DSB+VRR when
we are already in vblank and expect DSB to start executing
as soon as we send the push. Instead of starting, the DSB
just keeps on waiting for the undelayed vblank which won't
signal until the end of the next frame's active period,
which is far too late.
The end result is that DSB won't have even started
executing by the time the flips/etc. have completed.
We then wait for an extra 1ms, after which we terminate
the DSB and report a timeout:
[drm] *ERROR* [CRTC:80:pipe A] DSB 0 timed out waiting for idle (current head=0xfedf4000, head=0x0, tail=0x1080)
To fix this let's configure DSB to use the so called VRR
"safe window" instead of the undelayed vblank to trigger
the DSB vblank logic, when VRR is enabled.
Cc: [email protected]
Fixes: 34d8311f4a1c ("drm/i915/dsb: Re-instate DSB for LUT updates")
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9927
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Animesh Manna <[email protected]>
(cherry picked from commit 41429d9b68367596eb3d6d5961e6295c284622a7)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Looks like TRANS_CHICKEN bit 31 means something totally different
depending on the platform:
TGL: generate VRR "safe window" for DSB
ADL/DG2: make TRANS_SET_CONTEXT_LATENCY effective with VRR
So far we've only set this on ADL/DG2, but when using DSB+VRR
we also need to set it on TGL.
And a quick test on MTL says it doesn't need this bit for either
of those purposes, even though it's still documented as valid
in bspec.
Cc: [email protected]
Fixes: 34d8311f4a1c ("drm/i915/dsb: Re-instate DSB for LUT updates")
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9927
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Animesh Manna <[email protected]>
(cherry picked from commit 810e4519a1b34b5a0ff0eab32e5b184f533c5ee9)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Remove duplicate checks for debugfs entry "DRRS capable:".
Fixes: 20af10845864 ("drm/i915/display/debugfs: New entry "DRRS capable" to i915_drrs_status")
Cc: Jani Nikula <[email protected]>
Cc: Ankit Nautiyal <[email protected]>
Cc: Mitul Golani <[email protected]>
Signed-off-by: Bhanuprakash Modem <[email protected]>
Reviewed-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Jani Nikula <[email protected]>
(cherry picked from commit 3d81fceb60f20fe2ceed2198636ee6dc9ef46775)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Rename cpu_transcoder_has_drrs() to intel_cpu_transcoder_has_drrs() and
move it to intel_drrs.[ch].
V2:
- Move helpers to intel_drrs.[ch] (Jani)
- Fix commit message (Jani)
Cc: Jani Nikula <[email protected]>
Cc: Ankit Nautiyal <[email protected]>
Cc: Mitul Golani <[email protected]>
Signed-off-by: Bhanuprakash Modem <[email protected]>
Reviewed-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Jani Nikula <[email protected]>
(cherry picked from commit 2d04f8158548103c082190c8dbf6a19097e2423e)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Applying WA 14018575942 only on Compute engine has impact on
some apps like chrome. Updating this WA to apply on Render
engine as well as it is helping with performance on Chrome.
Note: There is no concern from media team thus not applying
WA on media engines. We will revisit if any issues reported
from media team.
V2(Matt):
- Use correct WA number
Fixes: 668f37e1ee11 ("drm/i915/mtl: Update workaround 14018778641")
Signed-off-by: Tejas Upadhyay <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Signed-off-by: Andi Shyti <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 71271280175aa0ed6673e40cce7c01296bcd05f6)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Reinstate commit 88b065943cb5 ("drm/i915/dsi: Do display on
sequence later on icl+"), for the most part. Turns out some
machines (eg. Chuwi Minibook X) really do need that updated order.
It is also the order the Windows driver uses.
However we can't just undo the revert since that would again
break Lenovo 82TQ. After staring at the VBT sequences for both
machines I've concluded that the Lenovo 82TQ sequences look
somewhat broken:
- INIT_OTP is not present at all
- what should be in INIT_OTP is found in DISPLAY_ON
- what should be in DISPLAY_ON is found in BACKLIGHT_ON
(along with the actual backlight stuff)
The Chuwi Minibook X on the other hand has a full complement
of sequences in its VBT.
So let's try to deal with the broken sequences in the
Lenovo 82TQ VBT by simply swapping the (non-existent)
INIT_OTP sequence with the DISPLAY_ON sequence. Thus we
execute DISPLAY_ON when intending to execute INIT_OTP,
and execute nothing at all when intending to execute
DISPLAY_ON. That should be 100% equivalent to the
revert, for such broken VBTs.
Cc: [email protected]
Fixes: 6992eb815d08 ("Revert "drm/i915/dsi: Do display on sequence later on icl+"")
References: https://gitlab.freedesktop.org/drm/intel/-/issues/10071
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10334
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Acked-by: Jani Nikula <[email protected]>
(cherry picked from commit 94ae4612ea336bfc3c12b3fc68467c6711a4f39b)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
AuxCCS framebuffers don't work on Xe driver hence disable them
from plane capabilities until they are fixed. FlatCCS framebuffers
work and they are left enabled. CCS is left untouched for i915
driver.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/933
Signed-off-by: Juha-Pekka Heikkila <[email protected]>
Reviewed-by: José Roberto de Souza <[email protected]>
Tested-by: José Roberto de Souza <[email protected]>
Acked-by: Jani Nikula <[email protected]>
Fixes: 44e694958b95 ("drm/xe/display: Implement display support")
Signed-off-by: José Roberto de Souza <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit b7232a730fbf043f54fb46fbf4a6e92936770e79)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Looks like I misplaced a few hunks when I moved the audio
enable/disable out from the encoder enable/disable hooks.
So we are now doing a double audio enable/disable on SDVO
and g4x+ DP. Probably harmless as doing it twice shouldn't
really change anything, but let's do it just once, as intended.
Fixes: cff742cc6851 ("drm/i915: Hoist the encoder->audio_{enable,disable}() calls higher up")
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Jani Nikula <[email protected]>
(cherry picked from commit 315bd0a0825776d6c66d474bf572db64fa019ad8)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Commit
8117961d98fb2 ("x86/efi: Disregard setup header of loaded image")
dropped the memcopy of the image's setup header into the boot_params
struct provided to the core kernel, on the basis that EFI boot does not
need it and should rely only on a single protocol to interface with the
boot chain. It is also a prerequisite for being able to increase the
section alignment to 4k, which is needed to enable memory protections
when running in the boot services.
So only the setup_header fields that matter to the core kernel are
populated explicitly, and everything else is ignored. One thing was
overlooked, though: the initrd_addr_max field in the setup_header is not
used by the core kernel, but it is used by the EFI stub itself when it
loads the initrd, where its default value of INT_MAX is used as the soft
limit for memory allocation.
This means that, in the old situation, the initrd was virtually always
loaded in the lower 2G of memory, but now, due to initrd_addr_max being
0x0, the initrd may end up anywhere in memory. This should not be an
issue principle, as most systems can deal with this fine. However, it
does appear to tickle some problems in older UEFI implementations, where
the memory ends up being corrupted, resulting in errors when unpacking
the initramfs.
So set the initrd_addr_max field to INT_MAX like it was before.
Fixes: 8117961d98fb2 ("x86/efi: Disregard setup header of loaded image")
Reported-by: Radek Podgorny <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
Avoid a type mismatch warning in max() by switching to max_t() and
providing the type explicitly.
Fixes: 3cb4a4827596abc82e ("efi/libstub: fix efi_random_alloc() ...")
Signed-off-by: Ard Biesheuvel <[email protected]>
|
|
Add standalone includes for BUG_ON and BUILD_BUG_ON to avoid build failure
after linux-next include refactoring.
Signed-off-by: Joonas Lahtinen <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 4df6ac223cad36e7384ed00fe6efc114279f0df6)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
ip_local_out() and other functions can pass skb->sk as function argument.
If the skb is a fragment and reassembly happens before such function call
returns, the sk must not be released.
This affects skb fragments reassembled via netfilter or similar
modules, e.g. openvswitch or ct_act.c, when run as part of tx pipeline.
Eric Dumazet made an initial analysis of this bug. Quoting Eric:
Calling ip_defrag() in output path is also implying skb_orphan(),
which is buggy because output path relies on sk not disappearing.
A relevant old patch about the issue was :
8282f27449bf ("inet: frag: Always orphan skbs inside ip_defrag()")
[..]
net/ipv4/ip_output.c depends on skb->sk being set, and probably to an
inet socket, not an arbitrary one.
If we orphan the packet in ipvlan, then downstream things like FQ
packet scheduler will not work properly.
We need to change ip_defrag() to only use skb_orphan() when really
needed, ie whenever frag_list is going to be used.
Eric suggested to stash sk in fragment queue and made an initial patch.
However there is a problem with this:
If skb is refragmented again right after, ip_do_fragment() will copy
head->sk to the new fragments, and sets up destructor to sock_wfree.
IOW, we have no choice but to fix up sk_wmem accouting to reflect the
fully reassembled skb, else wmem will underflow.
This change moves the orphan down into the core, to last possible moment.
As ip_defrag_offset is aliased with sk_buff->sk member, we must move the
offset into the FRAG_CB, else skb->sk gets clobbered.
This allows to delay the orphaning long enough to learn if the skb has
to be queued or if the skb is completing the reasm queue.
In the former case, things work as before, skb is orphaned. This is
safe because skb gets queued/stolen and won't continue past reasm engine.
In the latter case, we will steal the skb->sk reference, reattach it to
the head skb, and fix up wmem accouting when inet_frag inflates truesize.
Fixes: 7026b1ddb6b8 ("netfilter: Pass socket pointer down through okfn().")
Diagnosed-by: Eric Dumazet <[email protected]>
Reported-by: xingwei lee <[email protected]>
Reported-by: yue sun <[email protected]>
Reported-by: [email protected]
Signed-off-by: Florian Westphal <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
The Octeontx2 MAC block (CGX) has separate data paths (SMU and GMP) for
different speeds, allowing for efficient data transfer.
The previous patch which added pause frame configuration has a bug due
to which pause frame feature is not working in GMP mode.
This patch fixes the issue by configurating appropriate registers.
Fixes: f7e086e754fe ("octeontx2-af: Pause frame configuration at cgx")
Signed-off-by: Hariprasad Kelam <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
PCI11x1x Rev B0 devices might drop packets when receiving back to back frames
at 2.5G link speed. Change the B0 Rev device's Receive filtering Engine FIFO
threshold parameter from its hardware default of 4 to 3 dwords to prevent the
problem. Rev C0 and later hardware already defaults to 3 dwords.
Fixes: bb4f6bffe33c ("net: lan743x: Add PCI11010 / PCI11414 device IDs")
Signed-off-by: Raju Lakkaraju <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Clang 14 in an (essentially) defconfig loongarch64 build for next-20240327
reports [1]:
drivers/gpu/drm/qxl/qxl_ioctl.c:148:14: error: variable 'num_relocs'
set but not used [-Werror,-Wunused-but-set-variable]
The variable was originally used in the `out_free_bos` label, but commit
74d9a6335dce ("drm/qxl: Simplify cleaning qxl processing command")
removed the use that happened in that label.
Thus remove the unused variable.
Fixes: 74d9a6335dce ("drm/qxl: Simplify cleaning qxl processing command")
Closes: https://lore.kernel.org/lkml/CANiq72kqqQfUxLkHJYqeBAhpc6YcX7bfR96gmmbF=j8hEOykqw@mail.gmail.com/ [1]
Signed-off-by: Miguel Ojeda <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Maxime Ripard <[email protected]>
|
|
Clang 14 in an (essentially) defconfig loongarch64 build for next-20240326
reports [1]:
drivers/gpu/drm/qxl/qxl_cmd.c:424:6: error: variable 'count' set
but not used [-Werror,-Wunused-but-set-variable]
The variable is already unused in the version that got into the tree.
Thus remove the unused variable.
Fixes: f64122c1f6ad ("drm: add new QXL driver. (v1.4)")
Closes: https://lore.kernel.org/lkml/CANiq72mjc5t4n25SQvYSrOEhxxpXYPZ4pPzneSJHEnc3qApu2Q@mail.gmail.com/ [1]
Closes: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Miguel Ojeda <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Maxime Ripard <[email protected]>
|
|
Justin Chen says:
====================
net: bcmasp: phy managements fixes
Fix two issues.
- The unimac may be put in a bad state if PHY RX clk doesn't exist
during reset. Work around this by bringing the unimac out of reset
during phy up.
- Remove redundant phy_{suspend/resume}
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
phy_{suspend/resume} is redundant. It gets called from phy_{stop/start}.
Fixes: 490cb412007d ("net: bcmasp: Add support for ASP2.0 Ethernet controller")
Signed-off-by: Justin Chen <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
The unimac requires the PHY RX clk during reset or it may be put
into a bad state. Bring up the unimac after link up to ensure the
PHY RX clk exists.
Fixes: 490cb412007d ("net: bcmasp: Add support for ASP2.0 Ethernet controller")
Signed-off-by: Justin Chen <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
On reworking and splitting the at803x driver, in splitting function of
at803x PHYs it was added a NULL dereference bug where priv is referenced
before it's actually allocated and then is tried to write to for the
is_1000basex and is_fiber variables in the case of at8031, writing on
the wrong address.
Fix this by correctly setting priv local variable only after
at803x_probe is called and actually allocates priv in the phydev struct.
Reported-by: William Wortel <[email protected]>
Cc: <[email protected]>
Fixes: 25d2ba94005f ("net: phy: at803x: move specific at8031 probe mode check to dedicated probe")
Signed-off-by: Christian Marangi <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
Patch #1 reject destroy chain command to delete device hooks in netdev
family, hence, only delchain commands are allowed.
Patch #2 reject table flag update interference with netdev basechain
hook updates, this can leave hooks in inconsistent
registration/unregistration state.
Patch #3 do not unregister netdev basechain hooks if table is dormant.
Otherwise, splat with double unregistration is possible.
Patch #4 fixes Kconfig to allow to restore IP_NF_ARPTABLES,
from Kuniyuki Iwashima.
There are a more fixes still in progress on my side that need more work.
* tag 'nf-24-03-28' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: arptables: Select NETFILTER_FAMILY_ARP when building arp_tables.c
netfilter: nf_tables: skip netdev hook unregistration if table is dormant
netfilter: nf_tables: reject table flag and netdev basechain updates
netfilter: nf_tables: reject destroy command to remove basechain hooks
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|