Age | Commit message (Collapse) | Author | Files | Lines |
|
Joerg is recovering from an injury, so temporarily add myself to the
IOMMU MAINTAINERS entry so that I'm more likely to get CC'd on patches
while I help to look after the tree for him.
Suggested-by: Joerg Roedel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
In order to make accurate predictions across CPUs and for all performance
states, Energy Aware Scheduling (EAS) needs frequency-invariant load
tracking signals.
EAS task placement aims to minimize energy consumption, and does so in
part by limiting the search space to only CPUs with the highest spare
capacity (CPU capacity - CPU utilization) in their performance domain.
Those candidates are the placement choices that will keep frequency at
its lowest possible and therefore save the most energy.
But without frequency invariance, a CPU's utilization is relative to the
CPU's current performance level, and not relative to its maximum
performance level, which determines its capacity. As a result, it will
fail to correctly indicate any potential spare capacity obtained by an
increase in a CPU's performance level. Therefore, a non-invariant
utilization signal would render the EAS task placement logic invalid.
Now that we properly report support for the Frequency Invariance Engine
(FIE) through arch_scale_freq_invariant() for arm and arm64 systems,
while also ensuring a re-evaluation of the EAS use conditions for
possible invariance status change, we can assert this is the case when
initializing EAS. Warn and bail out otherwise.
Suggested-by: Quentin Perret <[email protected]>
Signed-off-by: Ionela Voinescu <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Task scheduler behavior depends on frequency invariance (FI) support and
the resulting invariant load tracking signals. For example, in order to
make accurate predictions across CPUs for all performance states, Energy
Aware Scheduling (EAS) needs frequency-invariant load tracking signals
and therefore it has a direct dependency on FI. This dependency is known,
but EAS enablement is not yet conditioned on the presence of FI during
the built of the scheduling domain hierarchy.
Before this is done, the following must be considered: while
arch_scale_freq_invariant() will see changes in FI support and could
be used to condition the use of EAS, it could return different values
during system initialisation.
For arm64, such a scenario will happen for a system that does not support
cpufreq driven FI, but does support counter-driven FI. For such a system,
arch_scale_freq_invariant() will return false if called before counter
based FI initialisation, but change its status to true after it.
If EAS becomes explicitly dependent on FI this would affect the task
scheduler behavior which builds its scheduling domain hierarchy well
before the late counter-based FI init. During that process, EAS would be
disabled due to its dependency on FI.
Two points of future early calls to arch_scale_freq_invariant() which
would determine EAS enablement are:
- (1) drivers/base/arch_topology.c:126 <<update_topology_flags_workfn>>
rebuild_sched_domains();
This will happen after CPU capacity initialisation.
- (2) kernel/sched/cpufreq_schedutil.c:917 <<rebuild_sd_workfn>>
rebuild_sched_domains_energy();
-->rebuild_sched_domains();
This will happen during sched_cpufreq_governor_change() for the
schedutil cpufreq governor.
Therefore, before enforcing the presence of FI support for the use of EAS,
ensure the following: if there is a change in FI support status after
counter init, use the existing rebuild_sched_domains_energy() function to
trigger a rebuild of the scheduling and performance domains that in turn
will determine the enablement of EAS.
Signed-off-by: Ionela Voinescu <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Catalin Marinas <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Add the rebuild_sched_domains_energy() function to wrap the functionality
that rebuilds the scheduling domains if any of the Energy Aware Scheduling
(EAS) initialisation conditions change. This functionality is used when
schedutil is added or removed or when EAS is enabled or disabled
through the sched_energy_aware sysctl.
Therefore, create a single function that is used in both these cases and
that can be later reused.
Signed-off-by: Ionela Voinescu <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Quentin Perret <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
In case the user wants to stop controlling a uclamp constraint value
for a task, use the magic value -1 in sched_util_{min,max} with the
appropriate sched_flags (SCHED_FLAG_UTIL_CLAMP_{MIN,MAX}) to indicate
the reset.
The advantage over the 'additional flag' approach (i.e. introducing
SCHED_FLAG_UTIL_CLAMP_RESET) is that no additional flag has to be
exported via uapi. This avoids the need to document how this new flag
has be used in conjunction with the existing uclamp related flags.
The following subtle issue is fixed as well. When a uclamp constraint
value is set on a !user_defined uclamp_se it is currently first reset
and then set.
Fix this by AND'ing !user_defined with !SCHED_FLAG_UTIL_CLAMP which
stands for the 'sched class change' case.
The related condition 'if (uc_se->user_defined)' moved from
__setscheduler_uclamp() into uclamp_reset().
Signed-off-by: Dietmar Eggemann <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Yun Hsiang <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Signed-off-by: Tal Zussman <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/20201113005156.GA8408@charmander
|
|
sched_debug
This document seems to be out of date for many, many years. Even it has
misspelled from the first day.
ARCH_HASH_SCHED_TUNE should be ARCH_HAS_SCHED_TUNE
ARCH_HASH_SCHED_DOMAIN should be ARCH_HAS_SCHED_DOMAIN
Since v2.6.14, kernel completely deleted the relevant code and even
arch_init_sched_domains() was deleted.
Right now, kernel is asking architectures to call set_sched_topology() to
override the default sched domains.
On the other hand, to print the schedule debug information, users need to
set sched_debug cmdline or enable it by sysfs entry. So this patch also
adds the description for sched_debug.
Signed-off-by: Barry Song <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Valentin Schneider <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
NUMA topologies where the shortest path between some two nodes requires
three or more hops (i.e. diameter > 2) end up being misrepresented in the
scheduler topology structures.
This is currently detected when booting a kernel with CONFIG_SCHED_DEBUG=y
+ sched_debug on the cmdline, although this will only yield a warning about
sched_group spans not matching sched_domain spans:
ERROR: groups don't span domain->span
Add an explicit warning for that case, triggered regardless of
CONFIG_SCHED_DEBUG, and decorate it with an appropriate comment.
The topology described in the comment can be booted up on QEMU by appending
the following to your usual QEMU incantation:
-smp cores=4 \
-numa node,cpus=0,nodeid=0 -numa node,cpus=1,nodeid=1, \
-numa node,cpus=2,nodeid=2, -numa node,cpus=3,nodeid=3, \
-numa dist,src=0,dst=1,val=20, -numa dist,src=0,dst=2,val=30, \
-numa dist,src=0,dst=3,val=40, -numa dist,src=1,dst=2,val=20, \
-numa dist,src=1,dst=3,val=30, -numa dist,src=2,dst=3,val=20
A somewhat more realistic topology (6-node mesh) with the same affliction
can be conjured with:
-smp cores=6 \
-numa node,cpus=0,nodeid=0 -numa node,cpus=1,nodeid=1, \
-numa node,cpus=2,nodeid=2, -numa node,cpus=3,nodeid=3, \
-numa node,cpus=4,nodeid=4, -numa node,cpus=5,nodeid=5, \
-numa dist,src=0,dst=1,val=20, -numa dist,src=0,dst=2,val=30, \
-numa dist,src=0,dst=3,val=40, -numa dist,src=0,dst=4,val=30, \
-numa dist,src=0,dst=5,val=20, \
-numa dist,src=1,dst=2,val=20, -numa dist,src=1,dst=3,val=30, \
-numa dist,src=1,dst=4,val=20, -numa dist,src=1,dst=5,val=30, \
-numa dist,src=2,dst=3,val=20, -numa dist,src=2,dst=4,val=30, \
-numa dist,src=2,dst=5,val=40, \
-numa dist,src=3,dst=4,val=20, -numa dist,src=3,dst=5,val=30, \
-numa dist,src=4,dst=5,val=20
Signed-off-by: Valentin Schneider <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
|
|
One of our machines keeled over trying to rebuild the scheduler domains.
Mainline produces the same splat:
BUG: unable to handle page fault for address: 0000607f820054db
CPU: 2 PID: 149 Comm: kworker/1:1 Not tainted 5.10.0-rc1-master+ #6
Workqueue: events cpuset_hotplug_workfn
RIP: build_sched_domains
Call Trace:
partition_sched_domains_locked
rebuild_sched_domains_locked
cpuset_hotplug_workfn
It happens with cgroup2 and exclusive cpusets only. This reproducer
triggers it on an 8-cpu vm and works most effectively with no
preexisting child cgroups:
cd $UNIFIED_ROOT
mkdir cg1
echo 4-7 > cg1/cpuset.cpus
echo root > cg1/cpuset.cpus.partition
# with smt/control reading 'on',
echo off > /sys/devices/system/cpu/smt/control
RIP maps to
sd->shared = *per_cpu_ptr(sdd->sds, sd_id);
from sd_init(). sd_id is calculated earlier in the same function:
cpumask_and(sched_domain_span(sd), cpu_map, tl->mask(cpu));
sd_id = cpumask_first(sched_domain_span(sd));
tl->mask(cpu), which reads cpu_sibling_map on x86, returns an empty mask
and so cpumask_first() returns >= nr_cpu_ids, which leads to the bogus
value from per_cpu_ptr() above.
The problem is a race between cpuset_hotplug_workfn() and a later
offline of CPU N. cpuset_hotplug_workfn() updates the effective masks
when N is still online, the offline clears N from cpu_sibling_map, and
then the worker uses the stale effective masks that still have N to
generate the scheduling domains, leading the worker to read
N's empty cpu_sibling_map in sd_init().
rebuild_sched_domains_locked() prevented the race during the cgroup2
cpuset series up until the Fixes commit changed its check. Make the
check more robust so that it can detect an offline CPU in any exclusive
cpuset's effective mask, not just the top one.
Fixes: 0ccea8feb980 ("cpuset: Make generate_sched_domains() work with partition")
Signed-off-by: Daniel Jordan <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
|
|
Oleksandr reported hitting the WARN in the 'task_rq(p) != rq' branch
of migration_cpu_stop(). Valentin noted that using cpu_of(rq) in that
case is just plain wrong to begin with, since per the earlier branch
that isn't the actual CPU of the task.
Replace both instances of is_cpu_allowed() by a direct p->cpus_mask
test using task_cpu().
Reported-by: Oleksandr Natalenko <[email protected]>
Debugged-by: Valentin Schneider <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
|
|
Qian reported that some fuzzer issuing sched_setaffinity() ends up stuck on
a wait_for_completion(). The problematic pattern seems to be:
affine_move_task()
// task_running() case
stop_one_cpu();
wait_for_completion(&pending->done);
Combined with, on the stopper side:
migration_cpu_stop()
// Task moved between unlocks and scheduling the stopper
task_rq(p) != rq &&
// task_running() case
dest_cpu >= 0
=> no complete_all()
This can happen with both PREEMPT and !PREEMPT, although !PREEMPT should
be more likely to see this given the targeted task has a much bigger window
to block and be woken up elsewhere before the stopper runs.
Make migration_cpu_stop() always look at pending affinity requests; signal
their completion if the stopper hits a rq mismatch but the task is
still within its allowed mask. When Migrate-Disable isn't involved, this
matches the previous set_cpus_allowed_ptr() vs migration_cpu_stop()
behaviour.
Fixes: 6d337eab041d ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Reported-by: Qian Cai <[email protected]>
Signed-off-by: Valentin Schneider <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
|
|
Fix the compile error below (CONFIG_PCI_ATS not set):
drivers/iommu/intel/dmar.c: In function ‘vf_inherit_msi_domain’:
drivers/iommu/intel/dmar.c:338:59: error: ‘struct pci_dev’ has no member named ‘physfn’; did you mean ‘is_physfn’?
338 | dev_set_msi_domain(&pdev->dev, dev_get_msi_domain(&pdev->physfn->dev));
| ^~~~~~
| is_physfn
Fixes: ff828729be44 ("iommu/vt-d: Cure VF irqdomain hickup")
Reported-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/linux-iommu/CAMuHMdXA7wfJovmfSH2nbAhN0cPyCiFHodTvg4a8Hm9rx5Dj-w@mail.gmail.com/
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into for-next/iommu/fixes
Pull in x86 fixes from Thomas, as they include a change to the Intel DMAR
code on which we depend:
* tag 'x86-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
iommu/vt-d: Cure VF irqdomain hickup
x86/platform/uv: Fix copied UV5 output archtype
x86/platform/uv: Drop last traces of uv_flush_tlb_others
|
|
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mripard/linux into drm-misc-fixes
Fix for drm/sun4i shared with arm-soc
This patch is a preliminary fix that will conflict with subsequent work merged
through arm-soc.
Signed-off-by: Maxime Ripard <[email protected]>
# gpg: Signature made Wed 18 Nov 2020 09:51:53 AM CET
# gpg: using EDDSA key 5C1337A45ECA9AEB89060E9EE3EF0D6F671851C5
# gpg: Good signature from "Maxime Ripard <[email protected]>" [unknown]
# gpg: aka "Maxime Ripard <[email protected]>" [unknown]
# gpg: aka "Maxime Ripard (Work Address) <[email protected]>" [unknown]
# gpg: aka "Maxime Ripard (Work Address) <[email protected]>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: BE56 75C3 7E81 8C8B 5764 241C 254B CFC5 6BF6 CE8D
# Subkey fingerprint: 5C13 37A4 5ECA 9AEB 8906 0E9E E3EF 0D6F 6718 51C5
From: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
drm-intel-fixes
gvt-fixes-2020-11-17
- Temporarily disable VFIO edid on BXT/APL (Colin)
- Fix emulated DPCD for version 1.2 (Tina)
- Fix error return when failing to take module reference (Xiongfeng)
Signed-off-by: Rodrigo Vivi <[email protected]>
From: Zhenyu Wang <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Since we allocate some breadcrumbs for the virtual engine, and the
virtual engine has a custom destructor, we also need to free the
breadcrumbs after use.
Fixes: b3786b29379c ("drm/i915/gt: Distinguish the virtual breadcrumbs from the irq breadcrumbs")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 45e50f48b7907e650cfbbc7879abfe3a0c419c73)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
EDID can declare the maximum supported bpc up to 16,
and apparently there are displays that do so. Currently
we assume 12 bpc is tha max. Fix the assumption and
toss in a MISSING_CASE() for any other value we don't
expect to see.
This fixes modesets with a display with EDID max bpc > 12.
Previously any modeset would just silently fail on platforms
that didn't otherwise limit this via the max_bpc property.
In particular we don't add the max_bpc property to HDMI
ports on gmch platforms, and thus we would see the raw
max_bpc coming from the EDID.
I suppose we could already adjust this to also allow 16bpc,
but seeing as no current platform supports that there is
little point.
Cc: [email protected]
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2632
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: José Roberto de Souza <[email protected]>
(cherry picked from commit 2ca5a7b85b0c2b97ef08afbd7799b022e29f192e)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2020-11-18
Jimmy Assarsson provides two patches for the kvaser_pciefd and kvaser_usb
drivers, where the can_bittiming_const are fixed.
The next patch is by me and fixes an erroneous flexcan_transceiver_enable()
during bus-off recovery in the flexcan driver.
Jarkko Nikula's patch for the m_can driver fixes the IRQ handler to only
process the interrupts if the device is not suspended.
* tag 'linux-can-fixes-for-5.10-20201118' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: m_can: process interrupt only when not runtime suspended
can: flexcan: flexcan_chip_start(): fix erroneous flexcan_transceiver_enable() during bus-off recovery
can: kvaser_usb: kvaser_usb_hydra: Fix KCAN bittiming limits
can: kvaser_pciefd: Fix KCAN bittiming limits
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Slave function read the following capabilities from the wrong offset:
1. log_mc_entry_sz
2. fs_log_entry_sz
3. log_mc_hash_sz
Fix that by adjusting these capabilities offset to match firmware
layout.
Due to the wrong offset read, the following issues might occur:
1+2. Negative value reported at max_mcast_qp_attach.
3. Driver to init FW with multicast hash size of zero.
Fixes: a40ded604365 ("net/mlx4_core: Add masking for a few queries on HCA caps")
Signed-off-by: Aya Levin <[email protected]>
Reviewed-by: Moshe Shemesh <[email protected]>
Reviewed-by: Eran Ben Elisha <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5 fixes 2020-11-17
This series introduces some fixes to mlx5 driver.
* tag 'mlx5-fixes-2020-11-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: fix error return code in mlx5e_tc_nic_init()
net/mlx5: E-Switch, Fail mlx5_esw_modify_vport_rate if qos disabled
net/mlx5: Disable QoS when min_rates on all VFs are zero
net/mlx5: Clear bw_share upon VF disable
net/mlx5: Add handling of port type in rule deletion
net/mlx5e: Fix check if netdev is bond slave
net/mlx5e: Fix IPsec packet drop by mlx5e_tc_update_skb
net/mlx5e: Set IPsec WAs only in IP's non checksum partial case.
net/mlx5e: Fix refcount leak on kTLS RX resync
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The `skb' is mapped for DMA in ns_send() but does not unmap DMA in case
push_scqe() fails to submit the `skb'. The memory of the `skb' is
released so only the DMA mapping is leaking.
Unmap the DMA mapping in case push_scqe() failed.
Fixes: 864a3ff635fa7 ("atm: [nicstar] remove virt_to_bus() and support 64-bit platforms")
Cc: Chas Williams <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The ethernet driver may allocate skb (and skb->data) via napi_alloc_skb().
This ends up to page_frag_alloc() to allocate skb->data from
page_frag_cache->va.
During the memory pressure, page_frag_cache->va may be allocated as
pfmemalloc page. As a result, the skb->pfmemalloc is always true as
skb->data is from page_frag_cache->va. The skb will be dropped if the
sock (receiver) does not have SOCK_MEMALLOC. This is expected behaviour
under memory pressure.
However, once kernel is not under memory pressure any longer (suppose large
amount of memory pages are just reclaimed), the page_frag_alloc() may still
re-use the prior pfmemalloc page_frag_cache->va to allocate skb->data. As a
result, the skb->pfmemalloc is always true unless page_frag_cache->va is
re-allocated, even if the kernel is not under memory pressure any longer.
Here is how kernel runs into issue.
1. The kernel is under memory pressure and allocation of
PAGE_FRAG_CACHE_MAX_ORDER in __page_frag_cache_refill() will fail. Instead,
the pfmemalloc page is allocated for page_frag_cache->va.
2: All skb->data from page_frag_cache->va (pfmemalloc) will have
skb->pfmemalloc=true. The skb will always be dropped by sock without
SOCK_MEMALLOC. This is an expected behaviour.
3. Suppose a large amount of pages are reclaimed and kernel is not under
memory pressure any longer. We expect skb->pfmemalloc drop will not happen.
4. Unfortunately, page_frag_alloc() does not proactively re-allocate
page_frag_alloc->va and will always re-use the prior pfmemalloc page. The
skb->pfmemalloc is always true even kernel is not under memory pressure any
longer.
Fix this by freeing and re-allocating the page instead of recycling it.
References: https://lore.kernel.org/lkml/[email protected]/
References: https://lore.kernel.org/linux-mm/[email protected]/
Suggested-by: Matthew Wilcox (Oracle) <[email protected]>
Cc: Aruna Ramakrishna <[email protected]>
Cc: Bert Barbe <[email protected]>
Cc: Rama Nichanamatlu <[email protected]>
Cc: Venkat Venkatsubra <[email protected]>
Cc: Manjunath Patil <[email protected]>
Cc: Joe Jin <[email protected]>
Cc: SRINIVAS <[email protected]>
Fixes: 79930f5892e1 ("net: do not deplete pfmemalloc reserve")
Signed-off-by: Dongli Zhang <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
We recently improved our display atomic commit and tail sequence to
avoid some issues related to concurrency. One of the major changes
consisted of moving the interrupt disable and the stream release from
our atomic commit to our atomic tail (commit 6d90a208cfff
("drm/amd/display: Move disable interrupt into commit tail")) .
However, the new code introduced inside our commit tail function was
inserted right after the function
drm_atomic_helper_update_legacy_modeset_state(), which has routines for
updating internal data structs related to timestamps. As a result, in
certain conditions, the display module can reach a situation where we
update our constants and, after that, clean it. This situation generates
the following warning:
amdgpu 0000:03:00.0: drm_WARN_ON_ONCE(drm_drv_uses_atomic_modeset(dev))
WARNING: CPU: 6 PID: 1269 at drivers/gpu/drm/drm_vblank.c:722
drm_crtc_vblank_helper_get_vblank_timestamp_internal+0x32b/0x340 [drm]
...
RIP:
0010:drm_crtc_vblank_helper_get_vblank_timestamp_internal+0x32b/0x340
[drm]
...
Call Trace:
? dc_stream_get_vblank_counter+0x57/0x60 [amdgpu]
drm_crtc_vblank_helper_get_vblank_timestamp+0x1c/0x20 [drm]
drm_get_last_vbltimestamp+0xad/0xc0 [drm]
drm_reset_vblank_timestamp+0x63/0xd0 [drm]
drm_crtc_vblank_on+0x85/0x150 [drm]
amdgpu_dm_atomic_commit_tail+0xaf1/0x2330 [amdgpu]
commit_tail+0x99/0x130 [drm_kms_helper]
drm_atomic_helper_commit+0x123/0x150 [drm_kms_helper]
amdgpu_dm_atomic_commit+0x11/0x20 [amdgpu]
drm_atomic_commit+0x4a/0x50 [drm]
drm_atomic_helper_set_config+0x7c/0xc0 [drm_kms_helper]
drm_mode_setcrtc+0x20b/0x7e0 [drm]
? tomoyo_path_number_perm+0x6f/0x200
? drm_mode_getcrtc+0x190/0x190 [drm]
drm_ioctl_kernel+0xae/0xf0 [drm]
drm_ioctl+0x245/0x400 [drm]
? drm_mode_getcrtc+0x190/0x190 [drm]
amdgpu_drm_ioctl+0x4e/0x80 [amdgpu]
__x64_sys_ioctl+0x91/0xc0
do_syscall_64+0x38/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xa9
...
For fixing this issue we rely upon a refactor introduced on
drm_atomic_helper_update_legacy_modeset_state ("Remove the timestamping
constant update from drm_atomic_helper_update_legacy_modeset_state()")
which decouples constant values update from
drm_atomic_helper_update_legacy_modeset_state to a new helper.
Basically, this commit uses this new helper and place it right after our
release module to avoid a situation where our CRTC struct gets wrong
values.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1373
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1349
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Rodrigo Siqueira <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 fix from Andreas Gruenbacher:
"Fix gfs2 freeze/thaw"
* tag 'gfs2-v5.10-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: Fix regression in freeze_go_sync
|
|
Pull nfsd fix from Bruce Fields:
"Just one quick fix for a tracing oops"
* tag 'nfsd-5.10-2' of git://linux-nfs.org/~bfields/linux:
SUNRPC: Fix oops in the rpc_xdr_buf event class
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kunit fixes from Shuah Khan:
"Several fixes to Kunit documentation and tools, and to not pollute
the source directory.
Also remove the incorrect kunit .gitattributes file"
* tag 'linux-kselftest-kunit-fixes-5.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kunit: fix display of failed expectations for strings
kunit: tool: fix extra trailing \n in raw + parsed test output
kunit: tool: print out stderr from make (like build warnings)
KUnit: Docs: usage: wording fixes
KUnit: Docs: style: fix some Kconfig example issues
KUnit: Docs: fix a wording typo
kunit: Do not pollute source directory with generated files (test.log)
kunit: Do not pollute source directory with generated files (.kunitconfig)
kunit: tool: fix pre-existing python type annotation errors
kunit: Fix kunit.py parse subcommand (use null build_dir)
kunit: tool: unmark test_data as binary blobs
|
|
When the switch is hardware reset, it reads the contents of the
EEPROM. This can contain instructions for programming values into
registers and to perform waits between such programming. Reading the
EEPROM can take longer than the 100ms mv88e6xxx_hardware_reset() waits
after deasserting the reset GPIO. So poll the EEPROM done bit to
ensure it is complete.
Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: Ruslan Sushko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Ido Schimmel says:
====================
mlxsw: Couple of fixes
Patch #1 fixes firmware flashing when CONFIG_MLXSW_CORE=y and
CONFIG_MLXFW=m.
Patch #2 prevents EMAD transactions from needlessly failing when the
system is under heavy load by using exponential backoff.
Please consider patch #2 for stable.
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The driver sends Ethernet Management Datagram (EMAD) packets to the
device for configuration purposes and waits for up to 200ms for a reply.
A request is retried up to 5 times.
When the system is under heavy load, replies are not always processed in
time and EMAD transactions fail.
Make the process more robust to such delays by using exponential
backoff. First wait for up to 200ms, then retransmit and wait for up to
400ms and so on.
Fixes: caf7297e7ab5 ("mlxsw: core: Introduce support for asynchronous EMAD register access")
Reported-by: Denis Yulevich <[email protected]>
Tested-by: Denis Yulevich <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The commit cited below moved firmware flashing functionality from
mlxsw_spectrum to mlxsw_core, but did not adjust the Kconfig
dependencies. This makes it possible to have mlxsw_core as built-in and
mlxfw as a module. The mlxfw code is therefore not reachable from
mlxsw_core and firmware flashing fails:
# devlink dev flash pci/0000:01:00.0 file mellanox/mlxsw_spectrum-13.2008.1310.mfa2
devlink answers: Operation not supported
Fix by having mlxsw_core select mlxfw.
Fixes: b79cb787ac70 ("mlxsw: Move fw flashing code into core.c")
Signed-off-by: Ido Schimmel <[email protected]>
Reported-by: Vadim Pasternak <[email protected]>
Tested-by: Vadim Pasternak <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
DSA network devices rely on having their DSA management interface up and
running otherwise their ndo_open() will return -ENETDOWN. Without doing
this it would not be possible to use DSA devices as netconsole when
configured on the command line. These devices also do not utilize the
upper/lower linking so the check about the netpoll device having upper
is not going to be a problem.
The solution adopted here is identical to the one done for
net/ipv4/ipconfig.c with 728c02089a0e ("net: ipv4: handle DSA enabled
master network devices"), with the network namespace scope being
restricted to that of the process configuring netpoll.
Fixes: 04ff53f96a93 ("net: dsa: Add netconsole support")
Tested-by: Vladimir Oltean <[email protected]>
Signed-off-by: Florian Fainelli <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: a6a5325239c2 ("atl1e: Atheros L1E Gigabit Ethernet driver")
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Zhang Changzhong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: 43250ddd75a3 ("atl1c: Atheros L1C Gigabit Ethernet driver")
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Zhang Changzhong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Zhang Changzhong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
LTE module MR400 embedded in TL-MR6400 v4 requires DTR to be set.
Signed-off-by: Filip Moc <[email protected]>
Acked-by: Bjørn Mork <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
At the start of driver initialization, we do not know what bias
setting the bootloader has configured the system for and we only know
for certain the very first time we do a transition.
However, since the initial value of the comparison index is -EINVAL,
this negative value results in an array out of bound access on the
very first transition.
Since we don't know what the setting is, we just set the bias
configuration as there is nothing to compare against. This prevents
the array out of bound access.
NOTE: Even though we could use a more relaxed check of "< 0" the only
valid values(ignoring cosmic ray induced bitflips) are -EINVAL, 0+.
Fixes: 40b1936efebd ("regulator: Introduce TI Adaptive Body Bias(ABB) on-chip LDO driver")
Link: https://lore.kernel.org/linux-mm/CA+G9fYuk4imvhyCN7D7T6PMDH6oNp6HDCRiTUKMQ6QXXjBa4ag@mail.gmail.com/
Reported-by: Naresh Kamboju <[email protected]>
Reviewed-by: Arnd Bergmann <[email protected]>
Signed-off-by: Nishanth Menon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
|
|
In kabylake_set_bias_level(), enabling mclk may fail if the clock has
already been enabled by the firmware. Attempts to disable that clock
later will fail with a warning backtrace.
mclk already disabled
WARNING: CPU: 2 PID: 108 at drivers/clk/clk.c:952 clk_core_disable+0x1b6/0x1cf
...
Call Trace:
clk_disable+0x2d/0x3a
kabylake_set_bias_level+0x72/0xfd [snd_soc_kbl_rt5663_rt5514_max98927]
snd_soc_card_set_bias_level+0x2b/0x6f
snd_soc_dapm_set_bias_level+0xe1/0x209
dapm_pre_sequence_async+0x63/0x96
async_run_entry_fn+0x3d/0xd1
process_one_work+0x2a9/0x526
...
Only disable the clock if it has been enabled.
Fixes: 15747a802075 ("ASoC: eve: implement set_bias_level function for rt5514")
Cc: Brent Lu <[email protected]>
Cc: Curtis Malainey <[email protected]>
Signed-off-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mark Brown <[email protected]>
|
|
In xfs_initialize_perag(), if kmem_zalloc(), xfs_buf_hash_init(), or
radix_tree_preload() failed, the returned value 'error' is not set
accordingly.
Reported-as-fixing: 8b26c5825e02 ("xfs: handle ENOMEM correctly during initialisation of perag structures")
Fixes: 9b2471797942 ("xfs: cache unlinked pointers in an rhashtable")
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Yu Kuai <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
|
|
The aim of the inode btree record iterator function is to call a
callback on every record in the btree. To avoid having to tear down and
recreate the inode btree cursor around every callback, it caches a
certain number of records in a memory buffer. After each batch of
callback invocations, we have to perform a btree lookup to find the
next record after where we left off.
However, if the keys of the inode btree are corrupt, the lookup might
put us in the wrong part of the inode btree, causing the walk function
to loop forever. Therefore, we add extra cursor tracking to make sure
that we never go backwards neither when performing the lookup nor when
jumping to the next inobt record. This also fixes an off by one error
where upon resume the lookup should have been for the inode /after/ the
point at which we stopped.
Found by fuzzing xfs/460 with keys[2].startino = ones causing bulkstat
and quotacheck to hang.
Fixes: a211432c27ff ("xfs: create simplified inode walk function")
Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Chandan Babu R <[email protected]>
|
|
Currently, commit e9e2eae89ddb dropped a (int) decoration from
XFS_LITINO(mp), and since sizeof() expression is also involved,
the result of XFS_LITINO(mp) is simply as the size_t type
(commonly unsigned long).
Considering the expression in xfs_attr_shortform_bytesfit():
offset = (XFS_LITINO(mp) - bytes) >> 3;
let "bytes" be (int)340, and
"XFS_LITINO(mp)" be (unsigned long)336.
on 64-bit platform, the expression is
offset = ((unsigned long)336 - (int)340) >> 3 =
(int)(0xfffffffffffffffcUL >> 3) = -1
but on 32-bit platform, the expression is
offset = ((unsigned long)336 - (int)340) >> 3 =
(int)(0xfffffffcUL >> 3) = 0x1fffffff
instead.
so offset becomes a large positive number on 32-bit platform, and
cause xfs_attr_shortform_bytesfit() returns maxforkoff rather than 0.
Therefore, one result is
"ASSERT(new_size <= XFS_IFORK_SIZE(ip, whichfork));"
assertion failure in xfs_idata_realloc(), which was also the root
cause of the original bugreport from Dennis, see:
https://bugzilla.redhat.com/show_bug.cgi?id=1894177
And it can also be manually triggered with the following commands:
$ touch a;
$ setfattr -n user.0 -v "`seq 0 80`" a;
$ setfattr -n user.1 -v "`seq 0 80`" a
on 32-bit platform.
Fix the case in xfs_attr_shortform_bytesfit() by bailing out
"XFS_LITINO(mp) < bytes" in advance suggested by Eric and a misleading
comment together with this bugfix suggested by Darrick. It seems the
other users of XFS_LITINO(mp) are not impacted.
Fixes: e9e2eae89ddb ("xfs: only check the superblock version for dinode size calculation")
Cc: <[email protected]> # 5.7+
Reported-and-tested-by: Dennis Gilmore <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Gao Xiang <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
|
|
Teach the directory scrubber to check all the bestfree entries,
including the null ones. We want to be able to detect the case where
the entry is null but there actually /is/ a directory data block.
Found by fuzzing lbests[0] = ones in xfs/391.
Fixes: df481968f33b ("xfs: scrub directory freespace")
Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Chandan Babu R <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
|
|
We always know the correct state of the rmap record flags (attr, bmbt,
unwritten) so check them by direct comparison.
Fixes: d852657ccfc0 ("xfs: cross-reference reverse-mapping btree")
Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Chandan Babu R <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
|
|
The comment and logic in xchk_btree_check_minrecs for dealing with
inode-rooted btrees isn't quite correct. While the direct children of
the inode root are allowed to have fewer records than what would
normally be allowed for a regular ondisk btree block, this is only true
if there is only one child block and the number of records don't fit in
the inode root.
Fixes: 08a3a692ef58 ("xfs: btree scrub should check minrecs")
Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Chandan Babu R <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
|
|
Avoid processing bogus interrupt statuses when the HW is runtime suspended and
the M_CAN_IR register read may get all bits 1's. Handler can be called if the
interrupt request is shared with other peripherals or at the end of free_irq().
Therefore check the runtime suspended status before processing.
Fixes: cdf8259d6573 ("can: m_can: Add PM Support")
Signed-off-by: Jarkko Nikula <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>
|
|
Patch 541656d3a513 ("gfs2: freeze should work on read-only mounts") changed
the check for glock state in function freeze_go_sync() from "gl->gl_state
== LM_ST_SHARED" to "gl->gl_req == LM_ST_EXCLUSIVE". That's wrong and it
regressed gfs2's freeze/thaw mechanism because it caused only the freezing
node (which requests the glock in EX) to queue freeze work.
All nodes go through this go_sync code path during the freeze to drop their
SHared hold on the freeze glock, allowing the freezing node to acquire it
in EXclusive mode. But all the nodes must freeze access to the file system
locally, so they ALL must queue freeze work. The freeze_work calls
freeze_func, which makes a request to reacquire the freeze glock in SH,
effectively blocking until the thaw from the EX holder. Once thawed, the
freezing node drops its EX hold on the freeze glock, then the (blocked)
freeze_func reacquires the freeze glock in SH again (on all nodes, including
the freezer) so all nodes go back to a thawed state.
This patch changes the check back to gl_state == LM_ST_SHARED like it was
prior to 541656d3a513.
Fixes: 541656d3a513 ("gfs2: freeze should work on read-only mounts")
Cc: [email protected] # v5.8+
Signed-off-by: Bob Peterson <[email protected]>
Signed-off-by: Andreas Gruenbacher <[email protected]>
|
|
flexcan_transceiver_enable() during bus-off recovery
If the CAN controller goes into bus off, the do_set_mode() callback with
CAN_MODE_START can be used to recover the controller, which then calls
flexcan_chip_start(). If configured, this is done automatically by the
framework or manually by the user.
In flexcan_chip_start() there is an explicit call to
flexcan_transceiver_enable(), which does a regulator_enable() on the
transceiver regulator. This results in a net usage counter increase, as there
is no corresponding flexcan_transceiver_disable() in the bus off code path.
This further leads to the transceiver stuck enabled, even if the CAN interface
is shut down.
To fix this problem the
flexcan_transceiver_enable()/flexcan_transceiver_disable() are moved out of
flexcan_chip_start()/flexcan_chip_stop() into flexcan_open()/flexcan_close().
Fixes: e955cead0311 ("CAN: Add Flexcan CAN controller driver")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Kleine-Budde <[email protected]>
|
|
Don't recycle a refnode until we're done with all requests of nodes
ejected before.
Signed-off-by: Pavel Begunkov <[email protected]>
Cc: [email protected] # v5.7+
Signed-off-by: Jens Axboe <[email protected]>
|
|
An active ref_node always can be found in ctx->files_data, it's much
safer to get it this way instead of poking into files_data->ref_list.
Signed-off-by: Pavel Begunkov <[email protected]>
Cc: [email protected] # v5.7+
Signed-off-by: Jens Axboe <[email protected]>
|
|
"intel_iommu=off" command line is used to disable iommu but iommu is force
enabled in a tboot system for security reason.
However for better performance on high speed network device, a new option
"intel_iommu=tboot_noforce" is introduced to disable the force on.
By default kernel should panic if iommu init fail in tboot for security
reason, but it's unnecessory if we use "intel_iommu=tboot_noforce,off".
Fix the code setting force_on and move intel_iommu_tboot_noforce
from tboot code to intel iommu code.
Fixes: 7304e8f28bb2 ("iommu/vt-d: Correctly disable Intel IOMMU force on")
Signed-off-by: Zhenzhong Duan <[email protected]>
Tested-by: Lukasz Hawrylko <[email protected]>
Acked-by: Lu Baolu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
The error codes were not set on some of these error paths.
Also the error handling was more confusing than it needed to be so I
cleaned it up and shuffled it around a bit.
Fixes: d2fb0a043838 ("dmaengine: break out channel registration")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Peter Ujfalusi <[email protected]>
Link: https://lore.kernel.org/r/20201113101631.GE168908@mwanda
Signed-off-by: Vinod Koul <[email protected]>
|