Age | Commit message (Collapse) | Author | Files | Lines |
|
Check if the incoming interface is available and NFT_BREAK
in case neither skb->sk nor input device are set.
Because nf_sk_lookup_slow*() assume packet headers are in the
'in' direction, use in postrouting is not going to yield a meaningful
result. Same is true for the forward chain, so restrict the use
to prerouting, input and output.
Use in output work if a socket is already attached to the skb.
Fixes: 554ced0a6e29 ("netfilter: nf_tables: add support for native socket matching")
Reported-and-tested-by: Topi Miettinen <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Merge cpuidle fixes for 5.18-rc5:
- Make intel_idle enable C1E promotion on all CPUs when C1E is
preferred to C1 (Artem Bityutskiy).
- Make C6 optimization on Sapphire Rapids added recently work as
expected if both C1E and C1 are "preferred" (Artem Bityutskiy).
* pm-cpuidle:
intel_idle: Fix SPR C6 optimization
intel_idle: Fix the 'preferred_cstates' module parameter
|
|
Now the generic code can handle kallsyms fixup properly so no need to
keep the arch-functions anymore.
Fixes: 3cf6a32f3f2a4594 ("perf symbols: Fix symbol size calculation condition")
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Now arch-specific functions all do the same thing. When it fixes the
symbol address it needs to check the boundary between the kernel image
and modules. For the last symbol in the previous region, it cannot
know the exact size as it's discarded already. Thus it just uses a
small page size (4096) and rounds it up like the last symbol.
Fixes: 3cf6a32f3f2a4594 ("perf symbols: Fix symbol size calculation condition")
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The symbol fixup is necessary for symbols in kallsyms since they don't
have size info. So we use the next symbol's address to calculate the
size. Now it's also used for user binaries because sometimes they miss
size for hand-written asm functions.
There's a arch-specific function to handle kallsyms differently but
currently it cannot distinguish kallsyms from others. Pass this
information explicitly to handle it properly. Note that those arch
functions will be moved to the generic function so I didn't added it to
the arch-functions.
Fixes: 3cf6a32f3f2a4594 ("perf symbols: Fix symbol size calculation condition")
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Adds a perf_event_attr test for Arm SPE in which the presence of
physical addresses are checked when SPE unit is run with pa_enable=1.
Reviewed-by: Leo Yan <[email protected]>
Signed-off-by: Timothy Hayes <[email protected]>
Tested-by: Leo Yan <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Fastabend <[email protected]>
Cc: John Garry <[email protected]>
Cc: KP Singh <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yonghong Song <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This patch corrects a bug whereby SPE collection is invoked with
pa_enable=1 but synthesized events fail to show physical addresses.
Reviewed-by: Leo Yan <[email protected]>
Signed-off-by: Timothy Hayes <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Fastabend <[email protected]>
Cc: John Garry <[email protected]>
Cc: KP Singh <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yonghong Song <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This patch corrects a bug whereby synthesized events from SPE
samples are missing virtual addresses.
Fixes: 54f7815efef7fad9 ("perf arm-spe: Fill address info for samples")
Reviewed-by: Leo Yan <[email protected]>
Signed-off-by: Timothy Hayes <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: [email protected]
Cc: Jiri Olsa <[email protected]>
Cc: John Fastabend <[email protected]>
Cc: John Garry <[email protected]>
Cc: KP Singh <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: [email protected]
Cc: Mark Rutland <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: [email protected]
Cc: Song Liu <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Intel PT does not capture data in separate directories, so do not
use separate directory processing because it doesn't work for
timeless decoding. It also looks like it doesn't support one_mmap
handling.
Example:
Before:
# perf record --kcore -a -e intel_pt/tsc=0/k sleep 0.1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.799 MB perf.data ]
# perf script --itrace=bep | head
#
After:
# perf script --itrace=bep | head
perf 21073 [000] psb: psb offs: 0 ffffffffaa68faf4 native_write_msr+0x4 ([kernel.kallsyms])
perf 21073 [000] cbr: cbr: 45 freq: 4505 MHz (161%) ffffffffaa68faf4 native_write_msr+0x4 ([kernel.kallsyms])
perf 21073 [000] 1 branches:k: 0 [unknown] ([unknown]) => ffffffffaa68faf6 native_write_msr+0x6 ([kernel.kallsyms])
perf 21073 [000] 1 branches:k: ffffffffaa68faf8 native_write_msr+0x8 ([kernel.kallsyms]) => ffffffffaa61aab0 pt_config_start+0x60 ([kernel.kallsyms])
perf 21073 [000] 1 branches:k: ffffffffaa61aabd pt_config_start+0x6d ([kernel.kallsyms]) => ffffffffaa61b8ad pt_event_start+0x27d ([kernel.kallsyms])
perf 21073 [000] 1 branches:k: ffffffffaa61b8bb pt_event_start+0x28b ([kernel.kallsyms]) => ffffffffaa61ba60 pt_event_add+0x40 ([kernel.kallsyms])
perf 21073 [000] 1 branches:k: ffffffffaa61ba76 pt_event_add+0x56 ([kernel.kallsyms]) => ffffffffaa880e86 event_sched_in+0xc6 ([kernel.kallsyms])
perf 21073 [000] 1 branches:k: ffffffffaa880e9b event_sched_in+0xdb ([kernel.kallsyms]) => ffffffffaa880ea5 event_sched_in+0xe5 ([kernel.kallsyms])
perf 21073 [000] 1 branches:k: ffffffffaa880eba event_sched_in+0xfa ([kernel.kallsyms]) => ffffffffaa880f96 event_sched_in+0x1d6 ([kernel.kallsyms])
perf 21073 [000] 1 branches:k: ffffffffaa880fc8 event_sched_in+0x208 ([kernel.kallsyms]) => ffffffffaa880ec0 event_sched_in+0x100 ([kernel.kallsyms])
Fixes: bb6be405c4a2a5 ("perf session: Load data directory files for analysis")
Cc: [email protected]
Signed-off-by: Adrian Hunter <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Cc: Ian Rogers <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Alexey Bayduraev <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: [email protected]
|
|
Commit 00bfe02f4796 ("gfs2: Fix mmap + page fault deadlocks for buffered
I/O") changed gfs2_file_read_iter() and gfs2_file_buffered_write() to
allow dropping the inode glock while faulting in user buffers. When the
lock was dropped, a short result was returned to indicate that the
operation was interrupted.
As pointed out by Linus (see the link below), this behavior is broken
and the operations should always re-acquire the inode glock and resume
the operation instead.
Link: https://lore.kernel.org/lkml/CAHk-=whaz-g_nOOoo8RRiWNjnv2R+h6_xk2F1J4TuSRxk1MtLw@mail.gmail.com/
Fixes: 00bfe02f4796 ("gfs2: Fix mmap + page fault deadlocks for buffered I/O")
Signed-off-by: Andreas Gruenbacher <[email protected]>
|
|
Unfortunately, the name/value choice for the MTE ELF segment type
(PT_ARM_MEMTAG_MTE) was pretty poor: LOPROC+1 is already in use by
PT_AARCH64_UNWIND, as defined in the AArch64 ELF ABI
(https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst).
Update the ELF segment type value to LOPROC+2 and also change the define
to PT_AARCH64_MEMTAG_MTE to match the AArch64 ELF ABI namespace. The
AArch64 ELF ABI document is updating accordingly (segment type not
previously mentioned in the document).
Signed-off-by: Catalin Marinas <[email protected]>
Fixes: 761b9b366cec ("elf: Introduce the ARM MTE ELF segment type")
Cc: Will Deacon <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Eric Biederman <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Luis Machado <[email protected]>
Cc: Richard Earnshaw <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- Fix regression causing some HCI events to be discarded when they
shouldn't.
* tag 'for-net-2022-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: hci_sync: Cleanup hci_conn if it cannot be aborted
Bluetooth: hci_event: Fix creating hci_conn object on error status
Bluetooth: hci_event: Fix checking for invalid handle on error status
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
valid data
With tape devices, the SCF_TREAT_READ_AS_NORMAL flag is used by the target
subsystem to mark commands which have both data to return as well as sense
data. But with pscsi, SCF_TREAT_READ_AS_NORMAL can be set even if there is
no data to return. The SCF_TREAT_READ_AS_NORMAL flag causes the target core
to call iscsit data-in callbacks even if there is no data, which iscsit
does not support. This results in iscsit going into an error state
requiring recovery and being unable to complete the command to the
initiator.
This issue can be resolved by fixing pscsi to only set
SCF_TREAT_READ_AS_NORMAL if there is valid data to return alongside the
sense data.
Link: https://lore.kernel.org/r/[email protected]
Fixes: bd81372065fa ("scsi: target: transport should handle st FM/EOM/ILI reads")
Reported-by: Scott Hamilton <[email protected]>
Tested-by: Laurence Oberman <[email protected]>
Reviewed-by: Laurence Oberman <[email protected]>
Signed-off-by: David Jeffery <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
Put device node in error path in fec_enet_init_stop_mode().
Fixes: 8a448bf832af ("net: ethernet: fec: move GPR register offset and bit into DT")
Signed-off-by: Yang Yingliang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
While handling PCI errors (AER flow) driver tries to
disable NAPI [napi_disable()] after NAPI is deleted
[__netif_napi_del()] which causes unexpected system
hang/crash.
System message log shows the following:
=======================================
[ 3222.537510] EEH: Detected PCI bus error on PHB#384-PE#800000 [ 3222.537511] EEH: This PCI device has failed 2 times in the last hour and will be permanently disabled after 5 failures.
[ 3222.537512] EEH: Notify device drivers to shutdown [ 3222.537513] EEH: Beginning: 'error_detected(IO frozen)'
[ 3222.537514] EEH: PE#800000 (PCI 0384:80:00.0): Invoking
bnx2x->error_detected(IO frozen)
[ 3222.537516] bnx2x: [bnx2x_io_error_detected:14236(eth14)]IO error detected [ 3222.537650] EEH: PE#800000 (PCI 0384:80:00.0): bnx2x driver reports:
'need reset'
[ 3222.537651] EEH: PE#800000 (PCI 0384:80:00.1): Invoking
bnx2x->error_detected(IO frozen)
[ 3222.537651] bnx2x: [bnx2x_io_error_detected:14236(eth13)]IO error detected [ 3222.537729] EEH: PE#800000 (PCI 0384:80:00.1): bnx2x driver reports:
'need reset'
[ 3222.537729] EEH: Finished:'error_detected(IO frozen)' with aggregate recovery state:'need reset'
[ 3222.537890] EEH: Collect temporary log [ 3222.583481] EEH: of node=0384:80:00.0 [ 3222.583519] EEH: PCI device/vendor: 168e14e4 [ 3222.583557] EEH: PCI cmd/status register: 00100140 [ 3222.583557] EEH: PCI-E capabilities and status follow:
[ 3222.583744] EEH: PCI-E 00: 00020010 012c8da2 00095d5e 00455c82 [ 3222.583892] EEH: PCI-E 10: 10820000 00000000 00000000 00000000 [ 3222.583893] EEH: PCI-E 20: 00000000 [ 3222.583893] EEH: PCI-E AER capability register set follows:
[ 3222.584079] EEH: PCI-E AER 00: 13c10001 00000000 00000000 00062030 [ 3222.584230] EEH: PCI-E AER 10: 00002000 000031c0 000001e0 00000000 [ 3222.584378] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000 [ 3222.584416] EEH: PCI-E AER 30: 00000000 00000000 [ 3222.584416] EEH: of node=0384:80:00.1 [ 3222.584454] EEH: PCI device/vendor: 168e14e4 [ 3222.584491] EEH: PCI cmd/status register: 00100140 [ 3222.584492] EEH: PCI-E capabilities and status follow:
[ 3222.584677] EEH: PCI-E 00: 00020010 012c8da2 00095d5e 00455c82 [ 3222.584825] EEH: PCI-E 10: 10820000 00000000 00000000 00000000 [ 3222.584826] EEH: PCI-E 20: 00000000 [ 3222.584826] EEH: PCI-E AER capability register set follows:
[ 3222.585011] EEH: PCI-E AER 00: 13c10001 00000000 00000000 00062030 [ 3222.585160] EEH: PCI-E AER 10: 00002000 000031c0 000001e0 00000000 [ 3222.585309] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000 [ 3222.585347] EEH: PCI-E AER 30: 00000000 00000000 [ 3222.586872] RTAS: event: 5, Type: Platform Error (224), Severity: 2 [ 3222.586873] EEH: Reset without hotplug activity [ 3224.762767] EEH: Beginning: 'slot_reset'
[ 3224.762770] EEH: PE#800000 (PCI 0384:80:00.0): Invoking
bnx2x->slot_reset()
[ 3224.762771] bnx2x: [bnx2x_io_slot_reset:14271(eth14)]IO slot reset initializing...
[ 3224.762887] bnx2x 0384:80:00.0: enabling device (0140 -> 0142) [ 3224.768157] bnx2x: [bnx2x_io_slot_reset:14287(eth14)]IO slot reset
--> driver unload
Uninterruptible tasks
=====================
crash> ps | grep UN
213 2 11 c000000004c89e00 UN 0.0 0 0 [eehd]
215 2 0 c000000004c80000 UN 0.0 0 0
[kworker/0:2]
2196 1 28 c000000004504f00 UN 0.1 15936 11136 wickedd
4287 1 9 c00000020d076800 UN 0.0 4032 3008 agetty
4289 1 20 c00000020d056680 UN 0.0 7232 3840 agetty
32423 2 26 c00000020038c580 UN 0.0 0 0
[kworker/26:3]
32871 4241 27 c0000002609ddd00 UN 0.1 18624 11648 sshd
32920 10130 16 c00000027284a100 UN 0.1 48512 12608 sendmail
33092 32987 0 c000000205218b00 UN 0.1 48512 12608 sendmail
33154 4567 16 c000000260e51780 UN 0.1 48832 12864 pickup
33209 4241 36 c000000270cb6500 UN 0.1 18624 11712 sshd
33473 33283 0 c000000205211480 UN 0.1 48512 12672 sendmail
33531 4241 37 c00000023c902780 UN 0.1 18624 11648 sshd
EEH handler hung while bnx2x sleeping and holding RTNL lock
===========================================================
crash> bt 213
PID: 213 TASK: c000000004c89e00 CPU: 11 COMMAND: "eehd"
#0 [c000000004d477e0] __schedule at c000000000c70808
#1 [c000000004d478b0] schedule at c000000000c70ee0
#2 [c000000004d478e0] schedule_timeout at c000000000c76dec
#3 [c000000004d479c0] msleep at c0000000002120cc
#4 [c000000004d479f0] napi_disable at c000000000a06448
^^^^^^^^^^^^^^^^
#5 [c000000004d47a30] bnx2x_netif_stop at c0080000018dba94 [bnx2x]
#6 [c000000004d47a60] bnx2x_io_slot_reset at c0080000018a551c [bnx2x]
#7 [c000000004d47b20] eeh_report_reset at c00000000004c9bc
#8 [c000000004d47b90] eeh_pe_report at c00000000004d1a8
#9 [c000000004d47c40] eeh_handle_normal_event at c00000000004da64
And the sleeping source code
============================
crash> dis -ls c000000000a06448
FILE: ../net/core/dev.c
LINE: 6702
6697 {
6698 might_sleep();
6699 set_bit(NAPI_STATE_DISABLE, &n->state);
6700
6701 while (test_and_set_bit(NAPI_STATE_SCHED, &n->state))
* 6702 msleep(1);
6703 while (test_and_set_bit(NAPI_STATE_NPSVC, &n->state))
6704 msleep(1);
6705
6706 hrtimer_cancel(&n->timer);
6707
6708 clear_bit(NAPI_STATE_DISABLE, &n->state);
6709 }
EEH calls into bnx2x twice based on the system log above, first through
bnx2x_io_error_detected() and then bnx2x_io_slot_reset(), and executes
the following call chains:
bnx2x_io_error_detected()
+-> bnx2x_eeh_nic_unload()
+-> bnx2x_del_all_napi()
+-> __netif_napi_del()
bnx2x_io_slot_reset()
+-> bnx2x_netif_stop()
+-> bnx2x_napi_disable()
+->napi_disable()
Fix this by correcting the sequence of NAPI APIs usage,
that is delete the NAPI after disabling it.
Fixes: 7fa6f34081f1 ("bnx2x: AER revised")
Reported-by: David Christensen <[email protected]>
Tested-by: David Christensen <[email protected]>
Signed-off-by: Manish Chopra <[email protected]>
Signed-off-by: Ariel Elior <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Calling tls_append_frag when max_open_record_len == record->len might
add an empty fragment to the TLS record if the call happens to be on the
page boundary. Normally tls_append_frag coalesces the zero-sized
fragment to the previous one, but not if it's on page boundary.
If a resync happens then, the mlx5 driver posts dump WQEs in
tx_post_resync_dump, and the empty fragment may become a data segment
with byte_count == 0, which will confuse the NIC and lead to a CQE
error.
This commit fixes the described issue by skipping tls_append_frag on
zero size to avoid adding empty fragments. The fix is not in the driver,
because an empty fragment is hardly the desired behavior.
Fixes: e8f69799810c ("net/tls: Add generic NIC offload infrastructure")
Signed-off-by: Maxim Mikityanskiy <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Daniel Borkmann says:
====================
pull-request: bpf 2022-04-27
We've added 5 non-merge commits during the last 20 day(s) which contain
a total of 6 files changed, 34 insertions(+), 12 deletions(-).
The main changes are:
1) Fix xsk sockets when rx and tx are separately bound to the same umem, also
fix xsk copy mode combined with busy poll, from Maciej Fijalkowski.
2) Fix BPF tunnel/collect_md helpers with bpf_xmit lwt hook usage which triggered
a crash due to invalid metadata_dst access, from Eyal Birger.
3) Fix release of page pool in XDP live packet mode, from Toke Høiland-Jørgensen.
4) Fix potential NULL pointer dereference in kretprobes, from Adam Zabrocki.
(Masami & Steven preferred this small fix to be routed via bpf tree given it's
follow-up fix to Masami's rethook work that went via bpf earlier, too.)
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
xsk: Fix possible crash when multiple sockets are created
kprobes: Fix KRETPROBES when CONFIG_KRETPROBE_ON_RETHOOK is set
bpf, lwt: Fix crash when using bpf_skb_set_tunnel_key() from bpf_xmit lwt hook
bpf: Fix release of page_pool in BPF_PROG_RUN in test runner
xsk: Fix l2fwd for copy mode + busy poll combo
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Without MMHUB clock gating being enabled then MMHUB will not disconnect
from DF and will result in DF C-state entry can't be accessed during S2idle
suspend, and eventually s0ix entry will be blocked.
Signed-off-by: Prike Liang <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The adev->pm.mutx is already held at the beginning of
amdgpu_dpm_compute_clocks/amdgpu_dpm_enable_uvd/amdgpu_dpm_enable_vce.
But on their calling path, amdgpu_display_bandwidth_update will be
called and thus its sub functions amdgpu_dpm_get_sclk/mclk. They
will then try to acquire the same adev->pm.mutex and deadlock will
occur.
By placing amdgpu_display_bandwidth_update outside of adev->pm.mutex
protection(considering logically they do not need such protection) and
restructuring the call flow accordingly, we can eliminate the deadlock
issue. This comes with no real logics change.
Fixes: 3712e7a49459 ("drm/amd/pm: unified lock protections in amdgpu_dpm.c")
Reported-by: Paul Menzel <[email protected]>
Reported-by: Arthur Marsh <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/
BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1957
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
When dcn20_clk_src_construct() fails, we need to release clk_src.
Fixes: 6f4e6361c3ff ("drm/amd/display: Add Renoir resource (v2)")
Signed-off-by: Miaoqian Lin <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
We normally runtime suspend when there are displays attached if they
are in the DPMS off state, however, if something wakes the GPU
we send a hotplug event on resume (in case any displays were connected
while the GPU was in suspend) which can cause userspace to light
up the displays again soon after they were turned off.
Prior to
commit 087451f372bf76 ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's."),
the driver took a runtime pm reference when the fbdev emulation was
enabled because we didn't implement proper shadowing support for
vram access when the device was off so the device never runtime
suspended when there was a console bound. Once that commit landed,
we now utilize the core fb helper implementation which properly
handles the emulation, so runtime pm now suspends in cases where it did
not before. Ultimately, we need to sort out why runtime suspend in not
working in this case for some users, but this should restore similar
behavior to before.
v2: move check into runtime_suspend
v3: wake ups -> wakeups in comment, retain pm_runtime behavior in
runtime_idle callback
Fixes: 087451f372bf76 ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.")
Link: https://lore.kernel.org/r/[email protected]/
Tested-by: Michele Ballabio <[email protected]>
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
Add support to checkpoint/restore GWS (Global Wave Sync) queues.
Signed-off-by: David Yat Sin <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
dqm->gws_queue_count and pdd->qpd.mapped_gws_queue need to be updated
each time the queue gets evicted.
Fixes: b8020b0304c8 ("drm/amdkfd: Enable over-subscription with >1 GWS queue")
Signed-off-by: David Yat Sin <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Merge fixes from Andrew Morton:
"Two patches.
Subsystems affected by this patch series: mm/kasan and mm/debug"
* emailed patches from Andrew Morton <[email protected]>:
docs: vm/page_owner: use literal blocks for param description
kasan: prevent cpu_quarantine corruption when CPU offline and cache shrink occur at same time
|
|
Sphinx generates hard-to-read lists of parameters at the bottom of the
page. Fix them by putting literal-block markers of "::" in front of
them.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Akira Yokosawa <[email protected]>
Fixes: 57f2b54a9379 ("Documentation/vm/page_owner.rst: update the documentation")
Cc: Shenghong Han <[email protected]>
Cc: Haowen Bai <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Alex Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
occur at same time
kasan_quarantine_remove_cache() is called in kmem_cache_shrink()/
destroy(). The kasan_quarantine_remove_cache() call is protected by
cpuslock in kmem_cache_destroy() to ensure serialization with
kasan_cpu_offline().
However the kasan_quarantine_remove_cache() call is not protected by
cpuslock in kmem_cache_shrink(). When a CPU is going offline and cache
shrink occurs at same time, the cpu_quarantine may be corrupted by
interrupt (per_cpu_remove_cache operation).
So add a cpu_quarantine offline flags check in per_cpu_remove_cache().
[[email protected]: add comment, per Zqiang]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Zqiang <[email protected]>
Reviewed-by: Dmitry Vyukov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This reverts commit 0006707723233cb2a9a23ca19fc3d0864835704c. It has a
couple problems:
* bio_issue_time() is stored in bio->bi_issue truncated to 51 bits. This
overflows in slightly over 26 days. Setting rq->io_start_time_ns with it
means that io duration calculation would yield >26days after 26 days of
uptime. This, for example, confuses kyber making it cause high IO
latencies.
* rq->io_start_time_ns should record the time that the IO is issued to the
device so that on-device latency can be measured. However,
bio_issue_time() is set before the bio goes through the rq-qos controllers
(wbt, iolatency, iocost), so when the bio gets throttled in any of the
mechanisms, the measured latencies make no sense - on-device latencies end
up higher than request-alloc-to-completion latencies.
We'll need a smarter way to avoid calling ktime_get_ns() repeatedly
back-to-back. For now, let's revert the commit.
Signed-off-by: Tejun Heo <[email protected]>
Cc: [email protected] # v5.16+
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
The Sapphire Rapids (SPR) C6 optimization was added to the end of the
'spr_idle_state_table_update()' function. However, the function has a
'return' which may happen before the optimization has a chance to run.
And this may prevent the optimization from happening.
This is an unlikely scenario, but possible if user boots with, say,
the 'intel_idle.preferred_cstates=6' kernel boot option.
This patch fixes the issue by eliminating the problematic 'return'
statement.
Fixes: 3a9cf77b60dc ("intel_idle: add core C6 optimization for SPR")
Suggested-by: Jan Beulich <[email protected]>
Reported-by: Jan Beulich <[email protected]>
Signed-off-by: Artem Bityutskiy <[email protected]>
[ rjw: Minor changelog edits ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
Problem description.
When user boots kernel up with the 'intel_idle.preferred_cstates=4' option,
we enable C1E and disable C1 states on Sapphire Rapids Xeon (SPR). In order
for C1E to work on SPR, we have to enable the C1E promotion bit on all
CPUs. However, we enable it only on one CPU.
Fix description.
The 'intel_idle' driver already has the infrastructure for disabling C1E
promotion on every CPU. This patch uses the same infrastructure for
enabling C1E promotion on every CPU. It changes the boolean
'disable_promotion_to_c1e' variable to a tri-state 'c1e_promotion'
variable.
Tested on a 2-socket SPR system. I verified the following combinations:
* C1E promotion enabled and disabled in BIOS.
* Booted with and without the 'intel_idle.preferred_cstates=4' kernel
argument.
In all 4 cases C1E promotion was correctly set on all CPUs.
Also tested on an old Broadwell system, just to make sure it does not cause
a regression. C1E promotion was correctly disabled on that system, both C1
and C1E were exposed (as expected).
Fixes: da0e58c038e6 ("intel_idle: add 'preferred_cstates' module argument")
Reported-by: Jan Beulich <[email protected]>
Signed-off-by: Artem Bityutskiy <[email protected]>
[ rjw: Minor changelog edits ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm
Pull ARM cpufreq fixes for 5.18-rc5 from Viresh Kumar:
"- Fix issues with the Qualcomm's cpufreq driver (Dmitry Baryshkov and
Vladimir Zapolskiy).
- Fix memory leak with the Sun501 driver (Xiaobing Luo)."
* tag 'cpufreq-arm-fixes-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm:
cpufreq: qcom-cpufreq-hw: Clear dcvs interrupts
cpufreq: fix memory leak in sun50i_cpufreq_nvmem_probe
cpufreq: qcom-cpufreq-hw: Fix throttle frequency value on EPSS platforms
cpufreq: qcom-hw: provide online/offline operations
cpufreq: qcom-hw: fix the opp entries refcounting
cpufreq: qcom-hw: fix the race between LMH worker and cpuhp
cpufreq: qcom-hw: drop affinity hint before freeing the IRQ
|
|
If we pass too short string to "hex2bin" (and the string size without
the terminating NUL character is even), "hex2bin" reads one byte after
the terminating NUL character. This patch fixes it.
Note that hex_to_bin returns -1 on error and hex2bin return -EINVAL on
error - so we can't just return the variable "hi" or "lo" on error.
This inconsistency may be fixed in the next merge window, but for the
purpose of fixing this bug, we just preserve the existing behavior and
return -1 and -EINVAL.
Signed-off-by: Mikulas Patocka <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Fixes: b78049831ffe ("lib: add error checking to hex2bin")
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The function hex2bin is used to load cryptographic keys into device
mapper targets dm-crypt and dm-integrity. It should take constant time
independent on the processed data, so that concurrently running
unprivileged code can't infer any information about the keys via
microarchitectural convert channels.
This patch changes the function hex_to_bin so that it contains no
branches and no memory accesses.
Note that this shouldn't cause performance degradation because the size
of the new function is the same as the size of the old function (on
x86-64) - and the new function causes no branch misprediction penalties.
I compile-tested this function with gcc on aarch64 alpha arm hppa hppa64
i386 ia64 m68k mips32 mips64 powerpc powerpc64 riscv sh4 s390x sparc32
sparc64 x86_64 and with clang on aarch64 arm hexagon i386 mips32 mips64
powerpc powerpc64 s390x sparc32 sparc64 x86_64 to verify that there are
no branches in the generated code.
Signed-off-by: Mikulas Patocka <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
kernfs_remove supported NULL kernfs_node param to bail out but revent
per-fs lock change introduced regression that dereferencing the
param without NULL check so kernel goes crash.
This patch checks the NULL kernfs_node in kernfs_remove and if so,
just return.
Quote from bug report by Jirka
```
The bug is triggered by running NAS Parallel benchmark suite on
SuperMicro servers with 2x Xeon(R) Gold 6126 CPU. Here is the error
log:
[ 247.035564] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 247.036009] #PF: supervisor read access in kernel mode
[ 247.036009] #PF: error_code(0x0000) - not-present page
[ 247.036009] PGD 0 P4D 0
[ 247.036009] Oops: 0000 [#1] PREEMPT SMP PTI
[ 247.058060] CPU: 1 PID: 6546 Comm: umount Not tainted
5.16.0393c3714081a53795bbff0e985d24146def6f57f+ #16
[ 247.058060] Hardware name: Supermicro Super Server/X11DDW-L, BIOS
2.0b 03/07/2018
[ 247.058060] RIP: 0010:kernfs_remove+0x8/0x50
[ 247.058060] Code: 4c 89 e0 5b 5d 41 5c 41 5d 41 5e c3 49 c7 c4 f4
ff ff ff eb b2 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00
41 54 55 <48> 8b 47 08 48 89 fd 48 85 c0 48 0f 44 c7 4c 8b 60 50 49 83
c4 60
[ 247.058060] RSP: 0018:ffffbbfa48a27e48 EFLAGS: 00010246
[ 247.058060] RAX: 0000000000000001 RBX: ffffffff89e31f98 RCX: 0000000080200018
[ 247.058060] RDX: 0000000080200019 RSI: fffff6760786c900 RDI: 0000000000000000
[ 247.058060] RBP: ffffffff89e31f98 R08: ffff926b61b24d00 R09: 0000000080200018
[ 247.122048] R10: ffff926b61b24d00 R11: ffff926a8040c000 R12: ffff927bd09a2000
[ 247.122048] R13: ffffffff89e31fa0 R14: dead000000000122 R15: dead000000000100
[ 247.122048] FS: 00007f01be0a8c40(0000) GS:ffff926fa8e40000(0000)
knlGS:0000000000000000
[ 247.122048] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 247.122048] CR2: 0000000000000008 CR3: 00000001145c6003 CR4: 00000000007706e0
[ 247.122048] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 247.122048] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 247.122048] PKRU: 55555554
[ 247.122048] Call Trace:
[ 247.122048] <TASK>
[ 247.122048] rdt_kill_sb+0x29d/0x350
[ 247.122048] deactivate_locked_super+0x36/0xa0
[ 247.122048] cleanup_mnt+0x131/0x190
[ 247.122048] task_work_run+0x5c/0x90
[ 247.122048] exit_to_user_mode_prepare+0x229/0x230
[ 247.122048] syscall_exit_to_user_mode+0x18/0x40
[ 247.122048] do_syscall_64+0x48/0x90
[ 247.122048] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 247.122048] RIP: 0033:0x7f01be2d735b
```
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215696
Link: https://lore.kernel.org/lkml/CAE4VaGDZr_4wzRn2___eDYRtmdPaGGJdzu_LCSkJYuY9BEO3cw@mail.gmail.com/
Fixes: 393c3714081a (kernfs: switch global kernfs_rwsem lock to per-fs lock)
Cc: [email protected]
Reported-by: Jirka Hladky <[email protected]>
Tested-by: Jirka Hladky <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Signed-off-by: Minchan Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs
Pull zonefs fixes from Damien Le Moal:
"Two fixes for rc5:
- Fix inode initialization to make sure that the inode flags are all
cleared.
- Use zone reset operation instead of close to make sure that the
zone of an empty sequential file in never in an active state after
closing the file"
* tag 'zonefs-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
zonefs: Fix management of open zones
zonefs: Clear inode information flags on inode creation
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull MTD fixes from Miquel Raynal:
"Core fix:
- Fix a possible data corruption of the 'part' field in mtd_info
Rawnand fixes:
- Fix the check on the return value of wait_for_completion_timeout
- Fix wrong ECC parameters for mt7622
- Fix a possible memory corruption that might panic in the Qcom
driver"
* tag 'mtd/fixes-for-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
mtd: rawnand: qcom: fix memory corruption that causes panic
mtd: fix 'part' field data corruption in mtd_info
mtd: rawnand: Fix return value check of wait_for_completion_timeout
mtd: rawnand: fix ecc parameters for mt7622
|
|
Welcome Eric!
Acked-by: Paolo Abeni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Minh Yuan reported a concurrency use-after-free issue in the floppy code
between raw_cmd_ioctl and seek_interrupt.
[ It turns out this has been around, and that others have reported the
KASAN splats over the years, but Minh Yuan had a reproducer for it and
so gets primary credit for reporting it for this fix - Linus ]
The problem is, this driver tends to break very easily and nowadays,
nobody is expected to use FDRAWCMD anyway since it was used to
manipulate non-standard formats. The risk of breaking the driver is
higher than the risk presented by this race, and accessing the device
requires privileges anyway.
Let's just add a config option to completely disable this ioctl and
leave it disabled by default. Distros shouldn't use it, and only those
running on antique hardware might need to enable it.
Link: https://lore.kernel.org/all/[email protected]/
Link: https://lore.kernel.org/lkml/CAKcFiNC=MfYVW-Jt9A3=FPJpTwCD2PL_ULNCpsCVE5s8ZeBQgQ@mail.gmail.com
Link: https://lore.kernel.org/all/CAEAjamu1FRhz6StCe_55XY5s389ZP_xmCF69k987En+1z53=eg@mail.gmail.com
Reported-by: Minh Yuan <[email protected]>
Reported-by: [email protected]
Reported-by: cruise k <[email protected]>
Reported-by: Kyungtae Kim <[email protected]>
Suggested-by: Linus Torvalds <[email protected]>
Tested-by: Denis Efremov <[email protected]>
Signed-off-by: Willy Tarreau <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Sparse reports this issue
core.c: note: in included file:
core.h:239:12: warning: symbol 'pmc_lpm_modes' was not declared. Should it be static?
Global variables should not be defined in headers. This only works
because core.h is only included by core.c. Single file use
variables should be static, so change its storage-class specifier
to static.
Signed-off-by: Tom Rix <[email protected]>
Reviewed-by: David E. Box <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Hans de Goede <[email protected]>
|
|
Fix bug that added an offset to the mailbox addr during multi-packet
reads. Did not affect current ABI since it doesn't support multi-packet
transactions.
Fixes: 2546c6000430 ("platform/x86: Add Intel Software Defined Silicon driver")
Signed-off-by: David E. Box <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
|
|
Due to change in firmware flow, update mailbox writes to poll on ready bit
instead of run_busy bit. This change makes the polling method consistent
for both writes and reads, which also uses the ready bit.
Fixes: 2546c6000430 ("platform/x86: Add Intel Software Defined Silicon driver")
Signed-off-by: David E. Box <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
|
|
To prevent an agent from indefinitely holding the mailbox firmware has
implemented a leaky bucket algorithm. Repeated access to the mailbox may
now incur a delay of up to 2.1 seconds. Add a retry loop that tries for
up to 2.5 seconds to acquire the mailbox.
Fixes: 2546c6000430 ("platform/x86: Add Intel Software Defined Silicon driver")
Signed-off-by: David E. Box <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
|
|
Loading this driver in guests results in unchecked MSR access error for
MSR 0x620.
There is no use of reading and modifying package/die scope uncore MSRs
in guests. So check for CPU feature X86_FEATURE_HYPERVISOR to prevent
loading of this driver in guests.
Fixes: dbce412a7733 ("platform/x86/intel-uncore-freq: Split common and enumeration part")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=215870
Suggested-by: Borislav Petkov <[email protected]>
Signed-off-by: Srinivas Pandruvada <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Hans de Goede <[email protected]>
|
|
This works on my system.
Signed-off-by: Darryn Anton Jordan <[email protected]>
Acked-by: Thomas Weißschuh <[email protected]>
Link: https://lore.kernel.org/r/Ylguq87YG+9L3foV@hark
Signed-off-by: Hans de Goede <[email protected]>
|
|
The Latitude 7520 supports AC timeouts, but it has no KBD_LED_AC_TOKEN
and so changes to stop_timeout appear to have no effect if the laptop
is plugged in.
Signed-off-by: Gabriele Mazzotta <[email protected]>
Acked-by: Pali Rohár <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Hans de Goede <[email protected]>
|
|
fails
Before this commit fan_curve_check_present() was trying to not cause
the probe to fail on devices without fan curve control by testing for
known error codes returned by asus_wmi_evaluate_method_buf().
Checking for ENODATA or ENODEV, with the latter being returned by this
function when an ACPI integer with a value of ASUS_WMI_UNSUPPORTED_METHOD
is returned. But for other ACPI integer returns this function just returns
them as is, including the ASUS_WMI_DSTS_UNKNOWN_BIT value of 2.
On the Asus U36SD ASUS_WMI_DSTS_UNKNOWN_BIT gets returned, leading to:
asus-nb-wmi: probe of asus-nb-wmi failed with error 2
Instead of playing whack a mole with error codes here, simply treat all
errors as there not being any fan curves, fixing the driver no longer
loading on the Asus U36SD laptop.
Fixes: e3d13da7f77d ("platform/x86: asus-wmi: Fix regression when probing for fan curve control")
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=2079125
Cc: Luke D. Jones <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
asus_wmi_evaluate_method_buf()
This code tests for if the obj->buffer.length is larger than the buffer
but then it just does the memcpy() anyway.
Fixes: 0f0ac158d28f ("platform/x86: asus-wmi: Add support for custom fan curves")
Signed-off-by: Dan Carpenter <[email protected]>
Link: https://lore.kernel.org/r/20220413073744.GB8812@kili
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
|
|
When an iocg is in debt, its inuse weight is owned by debt handling and
should stay at 1. This invariant was broken when determining the amount of
surpluses at the beginning of donation calculation - when an iocg's
hierarchical weight is too low, the iocg is excluded from donation
calculation and its inuse is reset to its active regardless of its
indebtedness, triggering warnings like the following:
WARNING: CPU: 5 PID: 0 at block/blk-iocost.c:1416 iocg_kick_waitq+0x392/0x3a0
...
RIP: 0010:iocg_kick_waitq+0x392/0x3a0
Code: 00 00 be ff ff ff ff 48 89 4d a8 e8 98 b2 70 00 48 8b 4d a8 85 c0 0f 85 4a fe ff ff 0f 0b e9 43 fe ff ff 0f 0b e9 4d fe ff ff <0f> 0b e9 50 fe ff ff e8 a2 ae 70 00 66 90 0f 1f 44 00 00 55 48 89
RSP: 0018:ffffc90000200d08 EFLAGS: 00010016
...
<IRQ>
ioc_timer_fn+0x2e0/0x1470
call_timer_fn+0xa1/0x2c0
...
As this happens only when an iocg's hierarchical weight is negligible, its
impact likely is limited to triggering the warnings. Fix it by skipping
resetting inuse of under-weighted debtors.
Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Rik van Riel <[email protected]>
Fixes: c421a3eb2e27 ("blk-iocost: revamp debt handling")
Cc: [email protected] # v5.10+
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
`nf_flowtable_udp_timeout` sysctl option is available only
if CONFIG_NFT_FLOW_OFFLOAD enabled. But infra for this flow
offload UDP timeout was added under CONFIG_NF_FLOW_TABLE
config option. So, if you have CONFIG_NFT_FLOW_OFFLOAD
disabled and CONFIG_NF_FLOW_TABLE enabled, the
`nf_flowtable_udp_timeout` is not present in sysfs.
Please note, that TCP flow offload timeout sysctl option
is present even CONFIG_NFT_FLOW_OFFLOAD is disabled.
I suppose it was a typo in commit that adds UDP flow offload
timeout and CONFIG_NF_FLOW_TABLE should be used instead.
Fixes: 975c57504da1 ("netfilter: conntrack: Introduce udp offload timeout configuration")
Signed-off-by: Volodymyr Mytnyk <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Jaco Kroon reported tcp problems that Eric Dumazet and Neal Cardwell
pinpointed to nf_conntrack tcp_in_window() bug.
tcp trace shows following sequence:
I > R Flags [S], seq 3451342529, win 62580, options [.. tfo [|tcp]>
R > I Flags [S.], seq 2699962254, ack 3451342530, win 65535, options [..]
R > I Flags [P.], seq 1:89, ack 1, [..]
Note 3rd ACK is from responder to initiator so following branch is taken:
} else if (((state->state == TCP_CONNTRACK_SYN_SENT
&& dir == IP_CT_DIR_ORIGINAL)
|| (state->state == TCP_CONNTRACK_SYN_RECV
&& dir == IP_CT_DIR_REPLY))
&& after(end, sender->td_end)) {
... because state == TCP_CONNTRACK_SYN_RECV and dir is REPLY.
This causes the scaling factor to be reset to 0: window scale option
is only present in syn(ack) packets. This in turn makes nf_conntrack
mark valid packets as out-of-window.
This was always broken, it exists even in original commit where
window tracking was added to ip_conntrack (nf_conntrack predecessor)
in 2.6.9-rc1 kernel.
Restrict to 'tcph->syn', just like the 3rd condtional added in
commit 82b72cb94666 ("netfilter: conntrack: re-init state for retransmitted syn-ack").
Upon closer look, those conditionals/branches can be merged:
Because earlier checks prevent syn-ack from showing up in
original direction, the 'dir' checks in the conditional quoted above are
redundant, remove them. Return early for pure syn retransmitted in reply
direction (simultaneous open).
Fixes: 9fb9cbb1082d ("[NETFILTER]: Add nf_conntrack subsystem.")
Reported-by: Jaco Kroon <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Acked-by: Jozsef Kadlecsik <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-04-26
This series contains updates to ice driver only.
Ivan Vecera removes races related to VF message processing by changing
mutex_trylock() call to mutex_lock() and moving additional operations
to occur under mutex.
Petr Oros increases wait time after firmware flash as current time is
not sufficient.
Jake resolves a use-after-free issue for mailbox snapshot.
====================
Signed-off-by: David S. Miller <[email protected]>
|