Age | Commit message (Collapse) | Author | Files | Lines |
|
for the v6.8 merge window
This fix didn't make it upstream in time, pick it up
for the v6.8 merge window.
Signed-off-by: Ingo Molnar <[email protected]>
|
|
When a CPU is taken offline, the contribution of its cfs_rqs to task_groups'
load may remain and will negatively impact the calculation of the share of
the online CPUs.
To fix this bug, clear the contribution of an offlining CPU to task groups'
load and skip its contribution while it is inactive.
Here's the reproducer of the anomaly, by Imran Khan:
"So far I have encountered only one rather lengthy way of reproducing this issue,
which is as follows:
1. Take a KVM guest (booted with 4 CPUs and can be scaled up to 124 CPUs) and
create 2 custom cgroups: /sys/fs/cgroup/cpu/test_group_1 and /sys/fs/cgroup/
cpu/test_group_2
2. Assign a CPU intensive workload to each of these cgroups and start the
workload.
For my tests I am using following app:
int main(int argc, char *argv[])
{
unsigned long count, i, val;
if (argc != 2) {
printf("usage: ./a.out <number of random nums to generate> \n");
return 0;
}
count = strtoul(argv[1], NULL, 10);
printf("Generating %lu random numbers \n", count);
for (i = 0; i < count; i++) {
val = rand();
val = val % 2;
//usleep(1);
}
printf("Generated %lu random numbers \n", count);
return 0;
}
Also since the system is booted with 4 CPUs, in order to completely load the
system I am also launching 4 instances of same test app under:
/sys/fs/cgroup/cpu/
3. We can see that both of the cgroups get similar CPU time:
# systemd-cgtop --depth 1
Path Tasks %CPU Memory Input/s Output/s
/ 659 - 5.5G - -
/system.slice - - 5.7G - -
/test_group_1 4 - - - -
/test_group_2 3 - - - -
/user.slice 31 - 56.5M - -
Path Tasks %CPU Memory Input/s Output/s
/ 659 394.6 5.5G - -
/test_group_2 3 65.7 - - -
/user.slice 29 55.1 48.0M - -
/test_group_1 4 47.3 - - -
/system.slice - 2.2 5.7G - -
Path Tasks %CPU Memory Input/s Output/s
/ 659 394.8 5.5G - -
/test_group_1 4 62.9 - - -
/user.slice 28 44.9 54.2M - -
/test_group_2 3 44.7 - - -
/system.slice - 0.9 5.7G - -
Path Tasks %CPU Memory Input/s Output/s
/ 659 394.4 5.5G - -
/test_group_2 3 58.8 - - -
/test_group_1 4 51.9 - - -
/user.slice 30 39.3 59.6M - -
/system.slice - 1.9 5.7G - -
Path Tasks %CPU Memory Input/s Output/s
/ 659 394.7 5.5G - -
/test_group_1 4 60.9 - - -
/test_group_2 3 57.9 - - -
/user.slice 28 43.5 36.9M - -
/system.slice - 3.0 5.7G - -
Path Tasks %CPU Memory Input/s Output/s
/ 659 395.0 5.5G - -
/test_group_1 4 66.8 - - -
/test_group_2 3 56.3 - - -
/user.slice 29 43.1 51.8M - -
/system.slice - 0.7 5.7G - -
4. Now move systemd-udevd to one of these test groups, say test_group_1, and
perform scale up to 124 CPUs followed by scale down back to 4 CPUs from the
host side.
5. Run the same workload i.e 4 instances of CPU hogger under /sys/fs/cgroup/cpu
and one instance of CPU hogger each in /sys/fs/cgroup/cpu/test_group_1 and
/sys/fs/cgroup/test_group_2.
It can be seen that test_group_1 (the one where systemd-udevd was moved) is getting
much less CPU time than the test_group_2, even though at this point of time both of
these groups have only CPU hogger running:
# systemd-cgtop --depth 1
Path Tasks %CPU Memory Input/s Output/s
/ 1219 - 5.4G - -
/system.slice - - 5.6G - -
/test_group_1 4 - - - -
/test_group_2 3 - - - -
/user.slice 26 - 91.3M - -
Path Tasks %CPU Memory Input/s Output/s
/ 1221 394.3 5.4G - -
/test_group_2 3 82.7 - - -
/test_group_1 4 14.3 - - -
/system.slice - 0.8 5.6G - -
/user.slice 26 0.4 91.2M - -
Path Tasks %CPU Memory Input/s Output/s
/ 1221 394.6 5.4G - -
/test_group_2 3 67.4 - - -
/system.slice - 24.6 5.6G - -
/test_group_1 4 12.5 - - -
/user.slice 26 0.4 91.2M - -
Path Tasks %CPU Memory Input/s Output/s
/ 1221 395.2 5.4G - -
/test_group_2 3 60.9 - - -
/system.slice - 27.9 5.6G - -
/test_group_1 4 12.2 - - -
/user.slice 26 0.4 91.2M - -
Path Tasks %CPU Memory Input/s Output/s
/ 1221 395.2 5.4G - -
/test_group_2 3 69.4 - - -
/test_group_1 4 13.9 - - -
/user.slice 28 1.6 92.0M - -
/system.slice - 1.0 5.6G - -
Path Tasks %CPU Memory Input/s Output/s
/ 1221 395.6 5.4G - -
/test_group_2 3 59.3 - - -
/test_group_1 4 14.1 - - -
/user.slice 28 1.3 92.2M - -
/system.slice - 0.7 5.6G - -
Path Tasks %CPU Memory Input/s Output/s
/ 1221 395.5 5.4G - -
/test_group_2 3 67.2 - - -
/test_group_1 4 11.5 - - -
/user.slice 28 1.3 92.5M - -
/system.slice - 0.6 5.6G - -
Path Tasks %CPU Memory Input/s Output/s
/ 1221 395.1 5.4G - -
/test_group_2 3 76.8 - - -
/test_group_1 4 12.9 - - -
/user.slice 28 1.3 92.8M - -
/system.slice - 1.2 5.6G - -
From sched_debug data it can be seen that in bad case the load.weight of per-CPU
sched entities corresponding to test_group_1 has reduced significantly and
also load_avg of test_group_1 remains much higher than that of test_group_2,
even though systemd-udevd stopped running long time back and at this point of
time both cgroups just have the CPU hogger app as running entity."
[ mingo: Added details from the original discussion, plus minor edits to the patch. ]
Reported-by: Imran Khan <[email protected]>
Tested-by: Imran Khan <[email protected]>
Tested-by: Aaron Lu <[email protected]>
Signed-off-by: Vincent Guittot <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Imran Khan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
check_preempt_wakeup_fair()
This variable became unused in:
5e963f2bd465 ("sched/fair: Commit to EEVDF")
Signed-off-by: Wang Jinchao <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Vincent Guittot <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Running N CPU-bound tasks on an N CPUs platform:
- with asymmetric CPU capacity
- not being a DynamIq system (i.e. having a PKG level sched domain
without the SD_SHARE_PKG_RESOURCES flag set)
.. might result in a task placement where two tasks run on a big CPU
and none on a little CPU. This placement could be more optimal by
using all CPUs.
Testing platform:
Juno-r2:
- 2 big CPUs (1-2), maximum capacity of 1024
- 4 little CPUs (0,3-5), maximum capacity of 383
Testing workload ([1]):
Spawn 6 CPU-bound tasks. During the first 100ms (step 1), each tasks
is affine to a CPU, except for:
- one little CPU which is left idle.
- one big CPU which has 2 tasks affine.
After the 100ms (step 2), remove the cpumask affinity.
Behavior before the patch:
During step 2, the load balancer running from the idle CPU tags sched
domains as:
- little CPUs: 'group_has_spare'. Cf. group_has_capacity() and
group_is_overloaded(), 3 CPU-bound tasks run on a 4 CPUs
sched-domain, and the idle CPU provides enough spare capacity
regarding the imbalance_pct
- big CPUs: 'group_overloaded'. Indeed, 3 tasks run on a 2 CPUs
sched-domain, so the following path is used:
group_is_overloaded()
\-if (sgs->sum_nr_running <= sgs->group_weight) return true;
The following path which would change the migration type to
'migrate_task' is not taken:
calculate_imbalance()
\-if (env->idle != CPU_NOT_IDLE && env->imbalance == 0)
as the local group has some spare capacity, so the imbalance
is not 0.
The migration type requested is 'migrate_util' and the busiest
runqueue is the big CPU's runqueue having 2 tasks (each having a
utilization of 512). The idle little CPU cannot pull one of these
task as its capacity is too small for the task. The following path
is used:
detach_tasks()
\-case migrate_util:
\-if (util > env->imbalance) goto next;
After the patch:
As the number of failed balancing attempts grows (with
'nr_balance_failed'), progressively make it easier to migrate
a big task to the idling little CPU. A similar mechanism is
used for the 'migrate_load' migration type.
Improvement:
Running the testing workload [1] with the step 2 representing
a ~10s load for a big CPU:
Before patch: ~19.3s
After patch: ~18s (-6.7%)
Similar issue reported at:
https://lore.kernel.org/lkml/[email protected]/
Suggested-by: Vincent Guittot <[email protected]>
Signed-off-by: Pierre Gondois <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Reviewed-by: Vincent Guittot <[email protected]>
Reviewed-by: Dietmar Eggemann <[email protected]>
Acked-by: Qais Yousef <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
With UTIL_EST_FASTUP now being permanent, we can take advantage of the
fact that the ewma jumps directly to a higher utilization at dequeue to
simplify util_est and remove the enqueued field.
Signed-off-by: Vincent Guittot <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Lukasz Luba <[email protected]>
Reviewed-by: Lukasz Luba <[email protected]>
Reviewed-by: Dietmar Eggemann <[email protected]>
Reviewed-by: Hongyan Xia <[email protected]>
Reviewed-by: Alex Shi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
sched_feat(UTIL_EST_FASTUP) has been added to easily disable the feature
in order to check for possibly related regressions. After 3 years, it has
never been used and no regression has been reported. Let's remove it
and make fast increase a permanent behavior.
Signed-off-by: Vincent Guittot <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Lukasz Luba <[email protected]>
Reviewed-by: Lukasz Luba <[email protected]>
Reviewed-by: Dietmar Eggemann <[email protected]>
Reviewed-by: Hongyan Xia <[email protected]>
Reviewed-by: Tang Yizhou <[email protected]>
Reviewed-by: Yanteng Si <[email protected]> [for the Chinese translation]
Reviewed-by: Alex Shi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Use the new capacity_ref_freq() method to set the ratio that is used by AMU for
computing the arch_scale_freq_capacity().
This helps to keep everything aligned using the same reference for
computing CPUs capacity.
The default value of the ratio (stored in per_cpu(arch_max_freq_scale))
ensures that arch_scale_freq_capacity() returns max capacity until it is
set to its correct value with the cpu capacity and capacity_ref_freq().
Signed-off-by: Vincent Guittot <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Sudeep Holla <[email protected]>
Acked-by: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Save the frequency associated to the performance that has been used when
initializing the capacity of CPUs.
Also, cppc cpufreq driver can register an artificial energy model. In such
case, it needs the frequency for this compute capacity.
Signed-off-by: Vincent Guittot <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Pierre Gondois <[email protected]>
Acked-by: Sudeep Holla <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Move and rename cppc_cpufreq_perf_to_khz() and cppc_cpufreq_khz_to_perf() to
use them outside cppc_cpufreq in topology_init_cpu_capacity_cppc().
Modify the interface to use struct cppc_perf_caps *caps instead of
struct cppc_cpudata *cpu_data as we only use the fields of cppc_perf_caps.
cppc_cpufreq was converting the lowest and nominal freq from MHz to kHz
before using them. We move this conversion inside cppc_perf_to_khz and
cppc_khz_to_perf to make them generic and usable outside cppc_cpufreq.
No functional change
Signed-off-by: Vincent Guittot <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Pierre Gondois <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The last item of a performance domain is not always the performance point
that has been used to compute CPU's capacity. This can lead to different
target frequency compared with other part of the system like schedutil and
would result in wrong energy estimation.
A new arch_scale_freq_ref() is available to return a fixed and coherent
frequency reference that can be used when computing the CPU's frequency
for an level of utilization. Use this function to get this reference
frequency.
Energy model is never used without defining arch_scale_freq_ref() but
can be compiled. Define a default arch_scale_freq_ref() returning 0
in such case.
Signed-off-by: Vincent Guittot <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Lukasz Luba <[email protected]>
Reviewed-by: Lukasz Luba <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
cpuinfo.max_freq can change at runtime because of boost as an example. This
implies that the value could be different than the one that has been
used when computing the capacity of a CPU.
The new arch_scale_freq_ref() returns a fixed and coherent reference
frequency that can be used when computing a frequency based on utilization.
Use this arch_scale_freq_ref() when available and fallback to
policy otherwise.
Signed-off-by: Vincent Guittot <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Lukasz Luba <[email protected]>
Reviewed-by: Lukasz Luba <[email protected]>
Reviewed-by: Dietmar Eggemann <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
cpuinfo.max_freq can change at runtime because of boost as an example. This
implies that the value could be different from the frequency that has been
used to compute the capacity of a CPU.
The new arch_scale_freq_ref() returns a fixed and coherent frequency
that can be used to compute the capacity for a given frequency.
[ Also fix a arch_set_freq_scale() newline style wart in <linux/cpufreq.h>. ]
Signed-off-by: Vincent Guittot <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Lukasz Luba <[email protected]>
Reviewed-by: Lukasz Luba <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Create a new method to get a unique and fixed max frequency. Currently
cpuinfo.max_freq or the highest (or last) state of performance domain are
used as the max frequency when computing the frequency for a level of
utilization, but:
- cpuinfo_max_freq can change at runtime. boost is one example of
such change.
- cpuinfo.max_freq and last item of the PD can be different leading to
different results between cpufreq and energy model.
We need to save the reference frequency that has been used when computing
the CPUs capacity and use this fixed and coherent value to convert between
frequency and CPU's capacity.
In fact, we already save the frequency that has been used when computing
the capacity of each CPU. We extend the precision to save kHz instead of
MHz currently and we modify the type to be aligned with other variables
used when converting frequency to capacity and the other way.
[ mingo: Minor edits. ]
Signed-off-by: Vincent Guittot <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Tested-by: Lukasz Luba <[email protected]>
Reviewed-by: Lukasz Luba <[email protected]>
Acked-by: Sudeep Holla <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Pull block fixes from Jens Axboe:
"Just an NVMe pull request this time, with a fix for bad sleeping
context, and a revert of a patch that caused some trouble"
* tag 'block-6.7-2023-12-22' of git://git.kernel.dk/linux:
nvme-pci: fix sleeping function called from interrupt context
Revert "nvme-fc: fix race between error recovery and creating association"
|
|
Pull kvm fixes from Paolo Bonzini:
"RISC-V:
- Fix a race condition in updating external interrupt for
trap-n-emulated IMSIC swfile
- Fix print_reg defaults in get-reg-list selftest
ARM:
- Ensure a vCPU's redistributor is unregistered from the MMIO bus if
vCPU creation fails
- Fix building KVM selftests for arm64 from the top-level Makefile
x86:
- Fix breakage for SEV-ES guests that use XSAVES
Selftests:
- Fix bad use of strcat(), by not using strcat() at all"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: SEV: Do not intercept accesses to MSR_IA32_XSS for SEV-ES guests
KVM: selftests: Fix dynamic generation of configuration names
RISCV: KVM: update external interrupt atomically for IMSIC swfile
KVM: riscv: selftests: Fix get-reg-list print_reg defaults
KVM: selftests: Ensure sysreg-defs.h is generated at the expected path
KVM: Convert comment into an assertion in kvm_io_bus_register_dev()
KVM: arm64: vgic: Ensure that slots_lock is held in vgic_register_all_redist_iodevs()
KVM: arm64: vgic: Force vcpu vgic teardown on vcpu destroy
KVM: arm64: vgic: Add a non-locking primitive for kvm_vgic_vcpu_destroy()
KVM: arm64: vgic: Simplify kvm_vgic_destroy()
|
|
kvm-master
KVM/riscv fixes for 6.7, take #1
- Fix a race condition in updating external interrupt for
trap-n-emulated IMSIC swfile
- Fix print_reg defaults in get-reg-list selftest
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master
KVM/arm64 fixes for 6.7, part #2
- Ensure a vCPU's redistributor is unregistered from the MMIO bus
if vCPU creation fails
- Fix building KVM selftests for arm64 from the top-level Makefile
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux
Pull printk fix from Petr Mladek:
- Prevent refcount warning from code releasing a fwnode
* tag 'printk-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
lib/vsprintf: Fix %pfwf when current node refcount == 0
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Apparently there were so many kids wishing bug fixes that made Santa
busy; here we have lots of fixes although it's a bit late. But all
changes are device-specific, hence it should be relatively safe to
apply.
Most of changes are for Cirrus codecs (for both ASoC and HD-audio),
while the remaining are fixes for TI codecs, HD-audio and USB-audio
quirks"
* tag 'sound-6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (24 commits)
ALSA: hda: cs35l41: Only add SPI CS GPIO if SPI is enabled in kernel
ALSA: hda: cs35l41: Do not allow uninitialised variables to be freed
ASoC: fsl_sai: Fix channel swap issue on i.MX8MP
ASoC: hdmi-codec: fix missing report for jack initial status
ALSA: hda/realtek: Add quirks for ASUS Zenbook 2023 Models
ALSA: hda: cs35l41: Support additional ASUS Zenbook 2023 Models
ALSA: hda/realtek: Add quirks for ASUS Zenbook 2022 Models
ALSA: hda: cs35l41: Support additional ASUS Zenbook 2022 Models
ALSA: hda/realtek: Add quirks for ASUS ROG 2023 models
ALSA: hda: cs35l41: Support additional ASUS ROG 2023 models
ALSA: hda: cs35l41: Add config table to support many laptops without _DSD
ASoC: Intel: bytcr_rt5640: Add new swapped-speakers quirk
ASoC: Intel: bytcr_rt5640: Add quirk for the Medion Lifetab S10346
kselftest: alsa: fixed a print formatting warning
ALSA: usb-audio: Increase delay in MOTU M quirk
ASoC: tas2781: check the validity of prm_no/cfg_no
ALSA: hda/tas2781: select program 0, conf 0 by default
ALSA: hda/realtek: Add quirk for ASUS ROG GV302XA
ASoC: cs42l43: Don't enable bias sense during type detect
ASoC: Intel: soc-acpi-intel-mtl-match: Change CS35L56 prefixes to AMPn
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
- error path fixes (qcom-geni)
- polling mode fix (rk3x)
- target mode state machine fix (aspeed)
* tag 'i2c-for-6.7-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: aspeed: Handle the coalesced stop conditions with the start conditions.
i2c: rk3x: fix potential spinlock recursion on poll
i2c: qcom-geni: fix missing clk_disable_unprepare() and geni_se_resources_off()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
"Here's another round of fixes from the GPIO subsystem for this release
cycle.
There's one commit adding synchronization to an ioctl() we overlooked
previously and another synchronization changeset for one of the
drivers:
- add protection against GPIO device removal to an overlooked ioctl()
- synchronize the interrupt mask register manually in gpio-dwapb"
* tag 'gpio-fixes-for-v6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio: dwapb: mask/unmask IRQ when disable/enale it
gpiolib: cdev: add gpio_device locking wrapper around gpio_ioctl()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
"A single patch fixing a build issue for x86 32-bit configurations with
CONFIG_XEN, which was introduced in the 6.7 development cycle"
* tag 'for-linus-6.7a-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: add CPU dependencies for 32-bit build
|
|
Pull drm fixes from Dave Airlie:
"Pretty quiet for this week, just i915 and amdgpu fixes,
I think the misc tree got lost this week, but didn't seem to have too
much in it, so it can wait. I've also got a bunch of nouveau GSP fixes
sailing around that'll probably land next time as well.
amdgpu:
- DCN 3.5 fixes
- DCN 3.2 SubVP fix
- GPUVM fix
amdkfd:
- SVM fix for APUs
i915:
- Fix state readout and check for DSC and bigjoiner combo
- Fix a potential integer overflow
- Reject async flips with bigjoiner
- Fix MTL HDMI/DP PLL clock selection
- Fix various issues by disabling pipe DMC events"
* tag 'drm-fixes-2023-12-22' of git://anongit.freedesktop.org/drm/drm:
drm/amdgpu: re-create idle bo's PTE during VM state machine reset
drm/amd/display: dereference variable before checking for zero
drm/amd/display: get dprefclk ss info from integration info table
drm/amd/display: Add case for dcn35 to support usb4 dmub hpd event
drm/amd/display: disable FPO and SubVP for older DMUB versions on DCN32x
drm/amdkfd: svm range always mapped flag not working on APU
drm/amd/display: Revert " drm/amd/display: Use channel_width = 2 for vram table 3.0"
drm/i915/dmc: Don't enable any pipe DMC events
drm/i915/mtl: Fix HDMI/DP PLL clock selection
drm/i915: Reject async flips with bigjoiner
drm/i915/hwmon: Fix static analysis tool reported issues
drm/i915/display: Get bigjoiner config before dsc config during readout
|
|
Pull 9p fixes from Dominique Martinet:
"Two small fixes scheduled for stable trees:
A tracepoint fix that's been reading past the end of messages forever,
but semi-recently also went over the end of the buffer. And a
potential incorrectly freeing garbage in pdu parsing error path"
* tag '9p-for-6.7-rc7' of https://github.com/martinetd/linux:
net: 9p: avoid freeing uninit memory in p9pdu_vreadf
9p: prevent read overrun in protocol dump tracepoint
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v6.7-rc7:
- Fix state readout and check for DSC and bigjoiner combo
- Fix a potential integer overflow
- Reject async flips with bigjoiner
- Fix MTL HDMI/DP PLL clock selection
- Fix various issues by disabling pipe DMC events
Signed-off-by: Dave Airlie <[email protected]>
From: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.7-2023-12-20:
amdgpu:
- DCN 3.5 fixes
- DCN 3.2 SubVP fix
- GPUVM fix
amdkfd:
- SVM fix for APUs
Signed-off-by: Dave Airlie <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
"Some driver fixes for v6.7, all are in drivers, the most interesting
one is probably the AMD laptop suspend bug which really needs fixing.
Freedestop org has the bug description:
https://gitlab.freedesktop.org/drm/amd/-/issues/2812
Summary:
- Ignore disabled device tree nodes in the Starfive 7100 and 7100
drivers.
- Mask non-wake source pins with interrupt enabled at suspend in the
AMD driver, this blocks unnecessary wakeups from misc interrupts.
This can be power consuming because in many cases the system
doesn't really suspend, it just wakes right back up.
- Fix a typo breaking compilation of the cy8c95x0 driver, and fix up
bugs in the get/set config callbacks.
- Use a dedicated lock class for the PIO4 drivers IRQ. This fixes a
crash on suspend"
* tag 'pinctrl-v6.7-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: at91-pio4: use dedicated lock class for IRQ
pinctrl: cy8c95x0: Fix get_pincfg
pinctrl: cy8c95x0: Fix regression
pinctrl: cy8c95x0: Fix typo
pinctrl: amd: Mask non-wake source pins with interrupt enabled at suspend
pinctrl: starfive: jh7100: ignore disabled device tree nodes
pinctrl: starfive: jh7110: ignore disabled device tree nodes
|
|
Pull NVMe fixes from Keith:
"nvme fixes for Linux 6.7
- Revert a commit with improper sleep context (Keith)
- Fix async event handling sleep context (Maurizio)"
* tag 'nvme-6.7-2023-12-21' of git://git.infradead.org/nvme:
nvme-pci: fix sleeping function called from interrupt context
Revert "nvme-fc: fix race between error recovery and creating association"
|
|
When an afs_volume struct is put, its refcount is reduced to 0 before
the cell->volume_lock is taken and the volume removed from the
cell->volumes tree.
Unfortunately, this means that the lookup code can race and see a volume
with a zero ref in the tree, resulting in a use-after-free:
refcount_t: addition on 0; use-after-free.
WARNING: CPU: 3 PID: 130782 at lib/refcount.c:25 refcount_warn_saturate+0x7a/0xda
...
RIP: 0010:refcount_warn_saturate+0x7a/0xda
...
Call Trace:
afs_get_volume+0x3d/0x55
afs_create_volume+0x126/0x1de
afs_validate_fc+0xfe/0x130
afs_get_tree+0x20/0x2e5
vfs_get_tree+0x1d/0xc9
do_new_mount+0x13b/0x22e
do_mount+0x5d/0x8a
__do_sys_mount+0x100/0x12a
do_syscall_64+0x3a/0x94
entry_SYSCALL_64_after_hwframe+0x62/0x6a
Fix this by:
(1) When putting, use a flag to indicate if the volume has been removed
from the tree and skip the rb_erase if it has.
(2) When looking up, use a conditional ref increment and if it fails
because the refcount is 0, replace the node in the tree and set the
removal flag.
Fixes: 20325960f875 ("afs: Reorganise volume and server trees to be rooted on the cell")
Signed-off-by: David Howells <[email protected]>
Reviewed-by: Jeffrey Altman <[email protected]>
cc: Marc Dionne <[email protected]>
cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The IDA usually detects double-frees, but that detection failed to
consider the case when there are no nearby IDs allocated and so we have a
NULL bitmap rather than simply having a clear bit. Add some tests to the
test-suite to be sure we don't inadvertently reintroduce this problem.
Unfortunately they're quite noisy so include a message to disregard
the warnings.
Reported-by: Zhenghan Wang <[email protected]>
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
In afs_update_cell(), ret is the result of the DNS lookup and the errors
are to be handled by a switch - however, the value gets clobbered in
between by setting it to -ENOMEM in case afs_alloc_vlserver_list()
fails.
Fix this by moving the setting of -ENOMEM into the error handling for
OOM failure. Further, only do it if we don't have an alternative error
to return.
Found by Linux Verification Center (linuxtesting.org) with SVACE. Based
on a patch from Anastasia Belova [1].
Fixes: d5c32c89b208 ("afs: Fix cell DNS lookup")
Signed-off-by: David Howells <[email protected]>
Reviewed-by: Jeffrey Altman <[email protected]>
cc: Anastasia Belova <[email protected]>
cc: Marc Dionne <[email protected]>
cc: [email protected]
cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]/ [1]
Link: https://lore.kernel.org/r/[email protected]/ # v1
Signed-off-by: Linus Torvalds <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull AFS fixes from David Howells:
"Improve the interaction of arbitrary lookups in the AFS dynamic root
that hit DNS lookup failures [1] where kafs behaves differently from
openafs and causes some applications to fail that aren't expecting
that. Further, negative DNS results aren't getting removed and are
causing failures to persist.
- Always delete unused (particularly negative) dentries as soon as
possible so that they don't prevent future lookups from retrying.
- Fix the handling of new-style negative DNS lookups in ->lookup() to
make them return ENOENT so that userspace doesn't get confused when
stat succeeds but the following open on the looked up file then
fails.
- Fix key handling so that DNS lookup results are reclaimed almost as
soon as they expire rather than sitting round either forever or for
an additional 5 mins beyond a set expiry time returning
EKEYEXPIRED. They persist for 1s as /bin/ls will do a second stat
call if the first fails"
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216637 [1]
Reviewed-by: Jeffrey Altman <[email protected]>
* tag 'afs-fixes-20231221' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
keys, dns: Allow key types (eg. DNS) to be reclaimed immediately on expiry
afs: Fix dynamic root lookup DNS check
afs: Fix the dynamic root's d_delete to always delete unused dentries
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Fix another kerneldoc warning
- Fix eventfs files to inherit the ownership of its parent directory.
The dynamic creation of dentries in eventfs did not take into account
if the tracefs file system was mounted with a gid/uid, and would
still default to the gid/uid of root. This is a regression.
- Fix warning when synthetic event testing is enabled along with
startup event tracing testing is enabled
* tag 'trace-v6.7-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing / synthetic: Disable events after testing in synth_event_gen_test_init()
eventfs: Have event files and directories default to parent uid and gid
tracing/synthetic: fix kernel-doc warnings
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from WiFi and bpf.
Current release - regressions:
- bpf: syzkaller found null ptr deref in unix_bpf proto add
- eth: i40e: fix ST code value for clause 45
Previous releases - regressions:
- core: return error from sk_stream_wait_connect() if sk_wait_event()
fails
- ipv6: revert remove expired routes with a separated list of routes
- wifi rfkill:
- set GPIO direction
- fix crash with WED rx support enabled
- bluetooth:
- fix deadlock in vhci_send_frame
- fix use-after-free in bt_sock_recvmsg
- eth: mlx5e: fix a race in command alloc flow
- eth: ice: fix PF with enabled XDP going no-carrier after reset
- eth: bnxt_en: do not map packet buffers twice
Previous releases - always broken:
- core:
- check vlan filter feature in vlan_vids_add_by_dev() and
vlan_vids_del_by_dev()
- check dev->gso_max_size in gso_features_check()
- mptcp: fix inconsistent state on fastopen race
- phy: skip LED triggers on PHYs on SFP modules
- eth: mlx5e:
- fix double free of encap_header
- fix slab-out-of-bounds in mlx5_query_nic_vport_mac_list()"
* tag 'net-6.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits)
net: check dev->gso_max_size in gso_features_check()
kselftest: rtnetlink.sh: use grep_fail when expecting the cmd fail
net/ipv6: Revert remove expired routes with a separated list of routes
net: avoid build bug in skb extension length calculation
net: ethernet: mtk_wed: fix possible NULL pointer dereference in mtk_wed_wo_queue_tx_clean()
net: stmmac: fix incorrect flag check in timestamp interrupt
selftests: add vlan hw filter tests
net: check vlan filter feature in vlan_vids_add_by_dev() and vlan_vids_del_by_dev()
net: hns3: add new maintainer for the HNS3 ethernet driver
net: mana: select PAGE_POOL
net: ks8851: Fix TX stall caused by TX buffer overrun
ice: Fix PF with enabled XDP going no-carrier after reset
ice: alter feature support check for SRIOV and LAG
ice: stop trashing VF VSI aggregator node ID information
mailmap: add entries for Geliang Tang
mptcp: fill in missing MODULE_DESCRIPTION()
mptcp: fix inconsistent state on fastopen race
selftests: mptcp: join: fix subflow_send_ack lookup
net: phy: skip LED triggers on PHYs on SFP modules
bpf: Add missing BPF_LINK_TYPE invocations
...
|
|
The synth_event_gen_test module can be built in, if someone wants to run
the tests at boot up and not have to load them.
The synth_event_gen_test_init() function creates and enables the synthetic
events and runs its tests.
The synth_event_gen_test_exit() disables the events it created and
destroys the events.
If the module is builtin, the events are never disabled. The issue is, the
events should be disable after the tests are run. This could be an issue
if the rest of the boot up tests are enabled, as they expect the events to
be in a known state before testing. That known state happens to be
disabled.
When CONFIG_SYNTH_EVENT_GEN_TEST=y and CONFIG_EVENT_TRACE_STARTUP_TEST=y
a warning will trigger:
Running tests on trace events:
Testing event create_synth_test:
Enabled event during self test!
------------[ cut here ]------------
WARNING: CPU: 2 PID: 1 at kernel/trace/trace_events.c:4150 event_trace_self_tests+0x1c2/0x480
Modules linked in:
CPU: 2 PID: 1 Comm: swapper/0 Not tainted 6.7.0-rc2-test-00031-gb803d7c664d5-dirty #276
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:event_trace_self_tests+0x1c2/0x480
Code: bb e8 a2 ab 5d fc 48 8d 7b 48 e8 f9 3d 99 fc 48 8b 73 48 40 f6 c6 01 0f 84 d6 fe ff ff 48 c7 c7 20 b6 ad bb e8 7f ab 5d fc 90 <0f> 0b 90 48 89 df e8 d3 3d 99 fc 48 8b 1b 4c 39 f3 0f 85 2c ff ff
RSP: 0000:ffffc9000001fdc0 EFLAGS: 00010246
RAX: 0000000000000029 RBX: ffff88810399ca80 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffffb9f19478 RDI: ffff88823c734e64
RBP: ffff88810399f300 R08: 0000000000000000 R09: fffffbfff79eb32a
R10: ffffffffbcf59957 R11: 0000000000000001 R12: ffff888104068090
R13: ffffffffbc89f0a0 R14: ffffffffbc8a0f08 R15: 0000000000000078
FS: 0000000000000000(0000) GS:ffff88823c700000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000001f6282001 CR4: 0000000000170ef0
Call Trace:
<TASK>
? __warn+0xa5/0x200
? event_trace_self_tests+0x1c2/0x480
? report_bug+0x1f6/0x220
? handle_bug+0x6f/0x90
? exc_invalid_op+0x17/0x50
? asm_exc_invalid_op+0x1a/0x20
? tracer_preempt_on+0x78/0x1c0
? event_trace_self_tests+0x1c2/0x480
? __pfx_event_trace_self_tests_init+0x10/0x10
event_trace_self_tests_init+0x27/0xe0
do_one_initcall+0xd6/0x3c0
? __pfx_do_one_initcall+0x10/0x10
? kasan_set_track+0x25/0x30
? rcu_is_watching+0x38/0x60
kernel_init_freeable+0x324/0x450
? __pfx_kernel_init+0x10/0x10
kernel_init+0x1f/0x1e0
? _raw_spin_unlock_irq+0x33/0x50
ret_from_fork+0x34/0x60
? __pfx_kernel_init+0x10/0x10
ret_from_fork_asm+0x1b/0x30
</TASK>
This is because the synth_event_gen_test_init() left the synthetic events
that it created enabled. By having it disable them after testing, the
other selftests will run fine.
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: [email protected]
Cc: Mathieu Desnoyers <[email protected]>
Cc: Tom Zanussi <[email protected]>
Fixes: 9fe41efaca084 ("tracing: Add synth event generation test module")
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Reported-by: Alexander Graf <[email protected]>
Tested-by: Alexander Graf <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
Dongliang reported:
I found that in the latest version, the nodes of tracefs have been
changed to dynamically created.
This has caused me to encounter a problem where the gid I specified in
the mounting parameters cannot apply to all files, as in the following
situation:
/data/tmp/events # mount | grep tracefs
tracefs on /data/tmp type tracefs (rw,seclabel,relatime,gid=3012)
gid 3012 = readtracefs
/data/tmp # ls -lh
total 0
-r--r----- 1 root readtracefs 0 1970-01-01 08:00 README
-r--r----- 1 root readtracefs 0 1970-01-01 08:00 available_events
ums9621_1h10:/data/tmp/events # ls -lh
total 0
drwxr-xr-x 2 root root 0 2023-12-19 00:56 alarmtimer
drwxr-xr-x 2 root root 0 2023-12-19 00:56 asoc
It will prevent certain applications from accessing tracefs properly, I
try to avoid this issue by making the following modifications.
To fix this, have the files created default to taking the ownership of
the parent dentry unless the ownership was previously set by the user.
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]/
Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
Cc: [email protected]
Cc: Mathieu Desnoyers <[email protected]>
Cc: Hongyu Jin <[email protected]>
Fixes: 28e12c09f5aa0 ("eventfs: Save ownership and mode")
Acked-by: Masami Hiramatsu (Google) <[email protected]>
Reported-by: Dongliang Cui <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
|
|
If a key has an expiration time, then when that time passes, the key is
left around for a certain amount of time before being collected (5 mins by
default) so that EKEYEXPIRED can be returned instead of ENOKEY. This is a
problem for DNS keys because we want to redo the DNS lookup immediately at
that point.
Fix this by allowing key types to be marked such that keys of that type
don't have this extra period, but are reclaimed as soon as they expire and
turn this on for dns_resolver-type keys. To make this easier to handle,
key->expiry is changed to be permanent if TIME64_MAX rather than 0.
Furthermore, give such new-style negative DNS results a 1s default expiry
if no other expiry time is set rather than allowing it to stick around
indefinitely. This shouldn't be zero as ls will follow a failing stat call
immediately with a second with AT_SYMLINK_NOFOLLOW added.
Fixes: 1a4240f4764a ("DNS: Separate out CIFS DNS Resolver code")
Signed-off-by: David Howells <[email protected]>
Tested-by: Markus Suvanto <[email protected]>
cc: Wang Lei <[email protected]>
cc: Jeff Layton <[email protected]>
cc: Steve French <[email protected]>
cc: Marc Dionne <[email protected]>
cc: Jarkko Sakkinen <[email protected]>
cc: "David S. Miller" <[email protected]>
cc: Eric Dumazet <[email protected]>
cc: Jakub Kicinski <[email protected]>
cc: Paolo Abeni <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2023-12-21
Hi David, hi Jakub, hi Paolo, hi Eric,
The following pull-request contains BPF updates for your *net* tree.
We've added 3 non-merge commits during the last 5 day(s) which contain
a total of 4 files changed, 45 insertions(+).
The main changes are:
1) Fix a syzkaller splat which triggered an oob issue in bpf_link_show_fdinfo(),
from Jiri Olsa.
2) Fix another syzkaller-found issue which triggered a NULL pointer dereference
in BPF sockmap for unconnected unix sockets, from John Fastabend.
bpf-for-netdev
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Add missing BPF_LINK_TYPE invocations
bpf: sockmap, test for unconnected af_unix sock
bpf: syzkaller found null ptr deref in unix_bpf proto add
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
In the hardware implementation of the I2C HID driver based on DesignWare
GPIO IRQ chip, when the user continues to use the I2C HID device in the
suspend process, the I2C HID interrupt will be masked after the resume
process is finished.
This is because the disable_irq()/enable_irq() of the DesignWare GPIO
driver does not synchronize the IRQ mask register state. In normal use
of the I2C HID procedure, the GPIO IRQ irq_mask()/irq_unmask() functions
are called in pairs. In case of an exception, i2c_hid_core_suspend()
calls disable_irq() to disable the GPIO IRQ. With low probability, this
causes irq_unmask() to not be called, which causes the GPIO IRQ to be
masked and not unmasked in enable_irq(), raising an exception.
Add synchronization to the masked register state in the
dwapb_irq_enable()/dwapb_irq_disable() function. mask the GPIO IRQ
before disabling it. After enabling the GPIO IRQ, unmask the IRQ.
Fixes: 7779b3455697 ("gpio: add a driver for the Synopsys DesignWare APB GPIO block")
Cc: [email protected]
Co-developed-by: Riwen Lu <[email protected]>
Signed-off-by: Riwen Lu <[email protected]>
Signed-off-by: xiongxin <[email protected]>
Acked-by: Serge Semin <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: Bartosz Golaszewski <[email protected]>
|
|
While the GPIO cdev gpio_ioctl() call is in progress, the kernel can
call gpiochip_remove() which will set gdev->chip to NULL, after which
any subsequent access will cause a crash.
gpio_ioctl() was overlooked by the previous fix to protect syscalls
(bdbbae241a04), so add protection for that.
Fixes: bdbbae241a04 ("gpiolib: protect the GPIO device against being dropped while in use by user-space")
Fixes: d7c51b47ac11 ("gpio: userspace ABI for reading/writing GPIO lines")
Fixes: 3c0d9c635ae2 ("gpiolib: cdev: support GPIO_V2_GET_LINE_IOCTL and GPIO_V2_LINE_GET_VALUES_IOCTL")
Fixes: aad955842d1c ("gpiolib: cdev: support GPIO_V2_GET_LINEINFO_IOCTL and GPIO_V2_GET_LINEINFO_WATCH_IOCTL")
Signed-off-by: Kent Gibson <[email protected]>
Signed-off-by: Bartosz Golaszewski <[email protected]>
|
|
Some drivers might misbehave if TSO packets get too big.
GVE for instance uses a 16bit field in its TX descriptor,
and will do bad things if a packet is bigger than 2^16 bytes.
Linux TCP stack honors dev->gso_max_size, but there are
other ways for too big packets to reach an ndo_start_xmit()
handler : virtio_net, af_packet, GRO...
Add a generic check in gso_features_check() and fallback
to GSO when needed.
gso_max_size was added in the blamed commit.
Fixes: 82cc1a7a5687 ("[NET]: Add per-connection option to set max TSO frame size")
Signed-off-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
run_cmd_grep_fail should be used when expecting the cmd fail, or the ret
will be set to 1, and the total test return 1 when exiting. This would cause
the result report to fail if run via run_kselftest.sh.
Before fix:
# ./rtnetlink.sh -t kci_test_addrlft
PASS: preferred_lft addresses have expired
# echo $?
1
After fix:
# ./rtnetlink.sh -t kci_test_addrlft
PASS: preferred_lft addresses have expired
# echo $?
0
Fixes: 9c2a19f71515 ("kselftest: rtnetlink.sh: add verbose flag")
Signed-off-by: Hangbin Liu <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
If CONFIG_SPI is not set in the kernel, there is no point in trying
to set the chip selects. We can selectively compile it.
Fixes: 8c4c216db8fb ("ALSA: hda: cs35l41: Add config table to support many laptops without _DSD")
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Signed-off-by: Stefan Binding <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
|
|
Initialise the variables to NULL so that they cannot be uninitialised
when devm_kfree is called.
Found by static analysis.
Fixes: 8c4c216db8fb ("ALSA: hda: cs35l41: Add config table to support many laptops without _DSD")
Signed-off-by: Stefan Binding <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v6.7
Quite a big collection of fixes, as ever mostly in drivers. There's one
framework fix for the HDMI CODEC where it wasn't handling startup
properly for some controllers, and one new x86 quirk, but otherwise all
local fixes or dropping things we don't want to see in a release.
|
|
Trying to suspend to RAM on SAMA5D27 EVK leads to the following lockdep
warning:
============================================
WARNING: possible recursive locking detected
6.7.0-rc5-wt+ #532 Not tainted
--------------------------------------------
sh/92 is trying to acquire lock:
c3cf306c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0xe8/0x100
but task is already holding lock:
c3d7c46c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0xe8/0x100
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&irq_desc_lock_class);
lock(&irq_desc_lock_class);
*** DEADLOCK ***
May be due to missing lock nesting notation
6 locks held by sh/92:
#0: c3aa0258 (sb_writers#6){.+.+}-{0:0}, at: ksys_write+0xd8/0x178
#1: c4c2df44 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x138/0x284
#2: c32684a0 (kn->active){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x148/0x284
#3: c232b6d4 (system_transition_mutex){+.+.}-{3:3}, at: pm_suspend+0x13c/0x4e8
#4: c387b088 (&dev->mutex){....}-{3:3}, at: __device_suspend+0x1e8/0x91c
#5: c3d7c46c (&irq_desc_lock_class){-.-.}-{2:2}, at: __irq_get_desc_lock+0xe8/0x100
stack backtrace:
CPU: 0 PID: 92 Comm: sh Not tainted 6.7.0-rc5-wt+ #532
Hardware name: Atmel SAMA5
unwind_backtrace from show_stack+0x18/0x1c
show_stack from dump_stack_lvl+0x34/0x48
dump_stack_lvl from __lock_acquire+0x19ec/0x3a0c
__lock_acquire from lock_acquire.part.0+0x124/0x2d0
lock_acquire.part.0 from _raw_spin_lock_irqsave+0x5c/0x78
_raw_spin_lock_irqsave from __irq_get_desc_lock+0xe8/0x100
__irq_get_desc_lock from irq_set_irq_wake+0xa8/0x204
irq_set_irq_wake from atmel_gpio_irq_set_wake+0x58/0xb4
atmel_gpio_irq_set_wake from irq_set_irq_wake+0x100/0x204
irq_set_irq_wake from gpio_keys_suspend+0xec/0x2b8
gpio_keys_suspend from dpm_run_callback+0xe4/0x248
dpm_run_callback from __device_suspend+0x234/0x91c
__device_suspend from dpm_suspend+0x224/0x43c
dpm_suspend from dpm_suspend_start+0x9c/0xa8
dpm_suspend_start from suspend_devices_and_enter+0x1e0/0xa84
suspend_devices_and_enter from pm_suspend+0x460/0x4e8
pm_suspend from state_store+0x78/0xe4
state_store from kernfs_fop_write_iter+0x1a0/0x284
kernfs_fop_write_iter from vfs_write+0x38c/0x6f4
vfs_write from ksys_write+0xd8/0x178
ksys_write from ret_fast_syscall+0x0/0x1c
Exception stack(0xc52b3fa8 to 0xc52b3ff0)
3fa0: 00000004 005a0ae8 00000001 005a0ae8 00000004 00000001
3fc0: 00000004 005a0ae8 00000001 00000004 00000004 b6c616c0 00000020 0059d190
3fe0: 00000004 b6c61678 aec5a041 aebf1a26
This warning is raised because pinctrl-at91-pio4 uses chained IRQ. Whenever
a wake up source configures an IRQ through irq_set_irq_wake, it will
lock the corresponding IRQ desc, and then call irq_set_irq_wake on "parent"
IRQ which will do the same on its own IRQ desc, but since those two locks
share the same class, lockdep reports this as an issue.
Fix lockdep false positive by setting a different class for parent and
children IRQ
Fixes: 776180848b57 ("pinctrl: introduce driver for Atmel PIO4 controller")
Signed-off-by: Alexis Lothoré <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Linus Walleij <[email protected]>
|
|
This reverts commit 3dec89b14d37ee635e772636dad3f09f78f1ab87.
The commit has some race conditions given how expires is managed on a
fib6_info in relation to gc start, adding the entry to the gc list and
setting the timer value leading to UAF. Revert the commit and try again
in a later release.
Fixes: 3dec89b14d37 ("net/ipv6: Remove expired routes with a separated list of routes")
Cc: Kui-Feng Lee <[email protected]>
Signed-off-by: David Ahern <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-12-18 (ice)
This series contains updates to ice driver only.
Jakes stops clearing of needed aggregator information.
Dave adds a check for LAG device support before initializing the
associated event handler.
Larysa restores accounting of XDP queues in TC configurations.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: Fix PF with enabled XDP going no-carrier after reset
ice: alter feature support check for SRIOV and LAG
ice: stop trashing VF VSI aggregator node ID information
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Xen only supports modern CPUs even when running a 32-bit kernel, and it now
requires a kernel built for a 64 byte (or larger) cache line:
In file included from <command-line>:
In function 'xen_vcpu_setup',
inlined from 'xen_vcpu_setup_restore' at arch/x86/xen/enlighten.c:111:3,
inlined from 'xen_vcpu_restore' at arch/x86/xen/enlighten.c:141:3:
include/linux/compiler_types.h:435:45: error: call to '__compiletime_assert_287' declared with attribute error: BUILD_BUG_ON failed: sizeof(*vcpup) > SMP_CACHE_BYTES
arch/x86/xen/enlighten.c:166:9: note: in expansion of macro 'BUILD_BUG_ON'
166 | BUILD_BUG_ON(sizeof(*vcpup) > SMP_CACHE_BYTES);
| ^~~~~~~~~~~~
Enforce the dependency with a whitelist of CPU configurations. In normal
distro kernels, CONFIG_X86_GENERIC is enabled, and this works fine. When this
is not set, still allow Xen to be built on kernels that target a 64-bit
capable CPU.
Fixes: db2832309a82 ("x86/xen: fix percpu vcpu_info allocation")
Signed-off-by: Arnd Bergmann <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Tested-by: Alyssa Ross <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Juergen Gross <[email protected]>
|