Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull bootconfig update from Masami Hiramatsu:
- Remove duplicate included header file linux/bootconfig.h from
lib/bootconfig.c. This is a cleanup, no behavior change.
* tag 'bootconfig-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
bootconfig: Remove duplicate included header file linux/bootconfig.h
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes updates from Masami Hiramatsu:
"Uprobes:
- x86/shstk: Make return uprobe work with shadow stack
- Add uretprobe syscall which speeds up the uretprobe 10-30% faster.
This syscall is automatically used from user-space trampolines
which are generated by the uretprobe. If this syscall is used by
normal user program, it will cause SIGILL. Note that this is
currently only implemented on x86_64.
(This also has two fixes for adjusting the syscall number to avoid
conflict with new *attrat syscalls.)
- uprobes/perf: fix user stack traces in the presence of pending
uretprobe. This corrects the uretprobe's trampoline address in the
stacktrace with correct return address
- selftests/x86: Add a return uprobe with shadow stack test
- selftests/bpf: Add uretprobe syscall related tests.
- test case for register integrity check
- test case with register changing case
- test case for uretprobe syscall without uprobes (expected to fail)
- test case for uretprobe with shadow stack
- selftests/bpf: add test validating uprobe/uretprobe stack traces
- MAINTAINERS: Add uprobes entry. This does not specify the tree but
to clarify who maintains and reviews the uprobes
Kprobes:
- tracing/kprobes: Test case cleanups.
Replace redundant WARN_ON_ONCE() + pr_warn() with WARN_ONCE() and
remove unnecessary code from selftest
- tracing/kprobes: Add symbol counting check when module loads.
This checks the uniqueness of the probed symbol on modules. The
same check has already done for kernel symbols
(This also has a fix for build error with CONFIG_MODULES=n)
Cleanup:
- Add MODULE_DESCRIPTION() macros for fprobe and kprobe examples"
* tag 'probes-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
MAINTAINERS: Add uprobes entry
selftests/bpf: Change uretprobe syscall number in uprobe_syscall test
uprobe: Change uretprobe syscall scope and number
tracing/kprobes: Fix build error when find_module() is not available
tracing/kprobes: Add symbol counting check when module loads
selftests/bpf: add test validating uprobe/uretprobe stack traces
perf,uprobes: fix user stack traces in the presence of pending uretprobes
tracing/kprobe: Remove cleanup code unrelated to selftest
tracing/kprobe: Integrate test warnings into WARN_ONCE
selftests/bpf: Add uretprobe shadow stack test
selftests/bpf: Add uretprobe syscall call from user space test
selftests/bpf: Add uretprobe syscall test for regs changes
selftests/bpf: Add uretprobe syscall test for regs integrity
selftests/x86: Add return uprobe shadow stack test
uprobe: Add uretprobe syscall to speed up return probe
uprobe: Wire up uretprobe system call
x86/shstk: Make return uprobe work with shadow stack
samples: kprobes: add missing MODULE_DESCRIPTION() macros
fprobe: add missing MODULE_DESCRIPTION() macro
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
Pull fbdev updates from Helge Deller:
- Detect VGA compatibility from VESA attributes (Thomas Zimmermann)
- Make I2C terminology more inclusive in smscufx and viafb (Easwar
Hariharan)
- Add lots of missing MODULE_DESCRIPTION() macros (Jeff Johnson)
- Logo code cleanups (Geert Uytterhoeven)
- Minor fixes by Chen Ni, Kuninori Morimoto, Uwe Kleine-König and
Christophe Jaillett
* tag 'fbdev-for-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev: (21 commits)
fbdev: viafb: Make I2C terminology more inclusive
fbdev: smscufx: Make I2C terminology more inclusive
fbdev: omap2: Return clk_prepare_enable to transfer the error
fbdev: mmp: Constify struct mmp_overlay_ops
fbdev: Drop explicit initialization of struct i2c_device_id::driver_data to 0
video: agp: add remaining missing MODULE_DESCRIPTION() macros
video: console: add missing MODULE_DESCRIPTION() macros
fbdev: amifb: add missing MODULE_DESCRIPTION() macro
fbdev: c2p_planar: add missing MODULE_DESCRIPTION() macro
fbdev: vesafb: Detect VGA compatibility from screen info's VESA attributes
fbdev: omapfb: use of_graph_get_remote_port()
fbdev: omapdss: use for_each_endpoint_of_node()
fbdev: offb: add missing MODULE_DESCRIPTION() macro
fbdev: vfb: add missing MODULE_DESCRIPTION() macro
fbdev: macmodes: add missing MODULE_DESCRIPTION() macro
fbdev: goldfishfb: add missing MODULE_DESCRIPTION() macro
fbdev: kyro: add missing MODULE_DESCRIPTION() macro
fbdev: viafb: add missing MODULE_DESCRIPTION() macro
fbdev: matroxfb: add missing MODULE_DESCRIPTION() macros
video/logo: Remove linux_serial_image comments
...
|
|
Drop unnecessary blank space from binding documentation.
Reported-by: Pavel Machek <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Biju Das <[email protected]>
Acked-by: Conor Dooley <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>
|
|
Replace a comma between expression statements by a semicolon.
Fixes: d65112f58464 ("watchdog: Add Renesas RZ/N1 Watchdog driver")
Signed-off-by: Chen Ni <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>
|
|
Replace a comma between expression statements by a semicolon.
Fixes: 1f6602c8ed1e ("watchdog: lenovo_se10_wdt: Watchdog driver for Lenovo SE10 platform")
Signed-off-by: Chen Ni <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Reviewed-by: Mark Pearson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Wim Van Sebroeck <[email protected]>
|
|
Similarly to PCI where msi-map/msi-mask are used to compute the full RID
(aka DID in ITS speak), use the msi-parent as the discovery mechanism,
since there is no way a device can generally express its ID.
However, since switching to a per-device MSI domain model, the domain
passed to its_pmsi_prepare() is the wrong one, and points to the device's
instead of the ITS'. Bad.
Use the parent domain instead, which is the ITS domain.
Fixes: 80b63cc1cc146 ("irqchip/gic-v3-its: Switch platform MSI to MSI parent")
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Since 6adb35ff43a16 ("irqchip/gic-v3-its: Provide MSI parent for
PCI/MSI[-X]"), the primary domain a PCI device allocates its interrupts
from is the one that is directly attached to the device itself.
By virtue of being a PCI device, it has no OF node.
This domain is (through more layer than it is worth describing)
passed to its_pci_msi_prepare(), which tries to compute the
full RID that is presented to the ITS by the device. This is ultimately
done by calling pci_msi_domain_get_msi_rid(), passing both the
domain and the PCI device as arguments.
The baked-in assumption is that either the domain that is passed
to pci_msi_domain_get_msi_rid() describes an interrupt controller
with either an OF node or an entry in an ACPI IORT table.
In this case, it is *neither*. This domain is does not represent
anything firmware-based, but just an allocation unit for the device.
As a result, it fails to provide the full RID (which requires inspecting
the msi-map/msi-mask properties in the DT), and stick to the BDF, which
isn't very useful.
Tragedy follows with a litany of devices that randomly die as they fail to
see any MSI (because the RID is wrong) or fail to get an allocation
(because they try to steal LPIs from their neighbour's pool).
This will happen on any system where a single ITS is shared by multiple
root ports and end-points with overlapping BDF numbers, and has the
topology described in the device-tree. Simpler DT topologies will luckily
work, and so will ACPI-based systems.
Solve it by pointing pci_msi_domain_get_msi_rid() at the *parent* domain,
which is the ITS, resulting in a correct mapping and a restored happiness
in my personal zoo.
Fixes: 6adb35ff43a16 ("irqchip/gic-v3-its: Provide MSI parent for PCI/MSI[-X]")
Reported-by: Johan Hovold <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Johan Hovold <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Now that the platform MSI hack is gone, nothing needs to know about struct
msi_device_data outside of the core code.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Anna-Maria Behnsen <[email protected]>
Signed-off-by: Shivamurthy Shastri <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
No more users!
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Anna-Maria Behnsen <[email protected]>
Signed-off-by: Shivamurthy Shastri <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
All related domains provide MSI parent functionality, so the fallback code
to the original platform MSI implementation is not longer required.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Anna-Maria Behnsen <[email protected]>
Signed-off-by: Shivamurthy Shastri <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
All platform MSI users and the PCI/MSI code handle per device MSI domains
when the irqdomain associated to the device provides MSI parent
functionality.
Remove the "global" platform domain related code and provide the MSI parent
functionality by filling in msi_parent_ops.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Anna-Maria Behnsen <[email protected]>
Signed-off-by: Shivamurthy Shastri <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
All platform MSI users and the PCI/MSI code handle per device MSI domains
when the irqdomain associated to the device provides MSI parent
functionality.
Remove the "global" platform domain related code and provide the MSI parent
functionality by filling in msi_parent_ops.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Anna-Maria Behnsen <[email protected]>
Signed-off-by: Shivamurthy Shastri <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
All platform MSI users and the PCI/MSI code handle per device MSI domains
when the irqdomain associated to the device provides MSI parent
functionality.
Remove the "global" platform domain related code and provide the MSI parent
functionality by filling in msi_parent_ops.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Anna-Maria Behnsen <[email protected]>
Signed-off-by: Shivamurthy Shastri <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The core infrastructure has everything in place to switch ICU to per
device MSI domains and avoid the convoluted construct of the existing
platform-MSI layering violation.
The new infrastructure provides a wired interrupt specific interface in the
MSI core which converts the 'hardware interrupt number + trigger type'
allocation which is required for wired interrupts in the regular irqdomain
code to a normal MSI allocation.
The hardware interrupt number and the trigger type are stored in the MSI
descriptor device cookie by the core code so the ICU specific code can
retrieve them.
The new per device domain is only instantiated when the irqdomain which is
associated to the ICU device provides MSI parent functionality. Up to
that point it invokes the existing code. Once the parent is converted the
code for the current platform-MSI mechanism is removed.
The new domain shares the interrupt chip callbacks and the translation
function. The only new functionality aside of filling out the
msi_domain_templates is a domain specific set_desc() callback, which will go
away once all platform-MSI code has been converted.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Anna-Maria Behnsen <[email protected]>
Signed-off-by: Shivamurthy Shastri <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
All platform MSI users and the PCI/MSI code handle per device MSI domains
when the irqdomain associated to the device provides MSI parent
functionality.
Remove the "global" platform domain related code and provide the MSI parent
functionality by filling in msi_parent_ops.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Anna-Maria Behnsen <[email protected]>
Signed-off-by: Shivamurthy Shastri <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
All platform MSI users and the PCI/MSI code handle per device MSI domains
when the irqdomain associated to the device provides MSI parent
functionality.
Remove the "global" PCI/MSI and platform domain related code and provide
the MSI parent functionality by filling in msi_parent_ops.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Anna-Maria Behnsen <[email protected]>
Signed-off-by: Shivamurthy Shastri <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The MBI chip creates two MSI domains:
- PCI/MSI
- Platform device domain
Both have the MBI domain as parent and differ slightly in the interrupt
chip callbacks and the platform device domain supports level type
signaling.
Convert it over to the MSI parent domain mechanism by:
- Providing the required templates
- Implementing a custom init_dev_msi_info() callback which sets the chip
callbacks and the level support flags depending on the domain bus token
type of the per device domain.
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
No more users.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Anna-Maria Behnsen <[email protected]>
Signed-off-by: Shivamurthy Shastri <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Now that ITS provides the MSI parent domain, remove the unused fallback
code.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Anna-Maria Behnsen <[email protected]>
Signed-off-by: Shivamurthy Shastri <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Similar to the previous conversion of the PCI/MSI support lift the
prepare() callback from the existing platform MSI code and enable
platform MSI and the related device domain bus tokens in select
and the child domain initialization code.
All platform MSI users are automatically using the new per device MSI model
now.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Anna-Maria Behnsen <[email protected]>
Signed-off-by: Shivamurthy Shastri <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Add the new bus token to the accepted list of child domain tokens.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Anna-Maria Behnsen <[email protected]>
Signed-off-by: Shivamurthy Shastri <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The core infrastructure has everything in place to switch MBIGEN to per
device MSI domains and avoid the convoluted construct of the existing
platform-MSI layering violation.
The new infrastructure provides a wired interrupt specific interface in the
MSI core which converts the 'hardware interrupt number + trigger type'
allocation which is required for wired interrupts in the regular irqdomain
code to a normal MSI allocation.
The hardware interrupt number and the trigger type are stored in the MSI
descriptor device cookie by the core code so the MBIGEN specific code can
retrieve them.
The new per device domain is only instantiated when the irqdomain which is
associated to the MBIGEN device provides MSI parent functionality. Up to
that point it invokes the existing code. Once the parent is converted the
code for the current platform-MSI mechanism is removed.
The new domain shares the interrupt chip callbacks and the translation
function. The only new functionality aside of filling out the
msi_domain_template is a domain specific set_desc() callback, which will go
away once all platform-MSI code has been converted.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Anna-Maria Behnsen <[email protected]>
Signed-off-by: Shivamurthy Shastri <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Add the prerequisites for DEVICE MSI into the shared select() and child
domain init function. These domains are really trivial and just provide a
custom irq chip callback to write the MSI message.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Anna-Maria Behnsen <[email protected]>
Signed-off-by: Shivamurthy Shastri <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The its_pci_msi_prepare() function from the ITS-PCI/MSI code provides the
'global' PCI/MSI domains. Move this function to the ITS-MSI parent code and
amend the function to use the domain hardware size, which is the MSI[X]
vector count, for allocating the ITS slots for the PCI device.
Enable PCI matching in msi_parent_ops and provide the necessary update to
the ITS specific child domain initialization function so that the prepare
callback gets invoked on allocations.
The latter might be optimized to do the allocation right at the point where
the child domain is initialized, but keep it simple for now.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Anna-Maria Behnsen <[email protected]>
Signed-off-by: Shivamurthy Shastri <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Add the bus tokens for DOMAIN_BUS_PCI_DEVICE_MSI and
DOMAIN_BUS_PCI_DEVICE_MSIX to the common child init
function.
Provide the match mask which can be used by parent domain
implementation so the bitmask based child bus token match
works.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Anna-Maria Behnsen <[email protected]>
Signed-off-by: Shivamurthy Shastri <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
To support per device MSI domains the ITS must provide MSI parent domain
functionality.
Provide the basic skeleton for this:
- msi_parent_ops
- child domain init callback
- the MSI parent flag set in irqdomain::flags
This does not make ITS a functional parent domain as there is no bit set in
the bus_select_mask yet, but it provides the base to implement PCI and
platform MSI support gradually on top.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Anna-Maria Behnsen <[email protected]>
Signed-off-by: Shivamurthy Shastri <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
All irqdomains which provide MSI parent domain functionality for per device
MSI domains need to provide a select() callback for the irqdomain and a
function to initialize the child domain.
Most of these functions would just be copy&paste with minimal
modifications, so provide a library function which implements the required
functionality and is customizable via parent_domain::msi_parent_ops. The
check for the supported bus tokens in msi_lib_init_dev_msi_info() is
expanded step by step within the next patches.
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Anna-Maria Behnsen <[email protected]>
Signed-off-by: Shivamurthy Shastri <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Most ARM(64) PCI/MSI domains mask and unmask in the parent domain after or
before the PCI mask/unmask operation takes place. So there are more than a
dozen of the same wrapper implementation all over the place.
Don't make the same mistake with the new per device PCI/MSI domains and
provide a new MSI feature flag, which lets the domain implementation
enable this sequence in the PCI/MSI code.
Signed-off-by: Shivamurthy Shastri <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
Link: https://lore.kernel.org/r/87ed8j34pj.ffs@tglx
|
|
of_platform_populate()
Commit 50b040ef3732 ("PCI/pwrctl: only call of_platform_populate() if
CONFIG_OF is enabled") added the CONFIG_OF guard for the
of_platform_populate() API. But it missed the fact that the CONFIG_OF
platforms can also run on ACPI without devicetree (so dev.of_node will
be NULL). In those cases, of_platform_populate() will fail with below
error messages as seen on the Ampere Altra box:
pci 000c:00:01.0: failed to populate child OF nodes (-22)
pci 000c:00:02.0: failed to populate child OF nodes (-22)
Fix this by checking for the existence of 'dev.of_node' before calling
the of_platform_populate() API. This also warrants the removal of
CONFIG_OF check, since dev_of_node() helper will return NULL if
CONFIG_OF is not enabled.
While at it, let's also use dev_of_node() to pass device OF node pointer
to of_platform_populate().
Fixes: 50b040ef3732 ("PCI/pwrctl: only call of_platform_populate() if CONFIG_OF is enabled")
Reported-by: Linus Torvalds <[email protected]>
Closes: https://lore.kernel.org/linux-arm-msm/CAHk-=wjcO_9dkNf-bNda6bzykb5ZXWtAYA97p7oDsXPHmMRi6g@mail.gmail.com
Reviewed-by: Bartosz Golaszewski <[email protected]>
Signed-off-by: Manivannan Sadhasivam <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Several versions of GCC mis-compile asm goto with outputs. We try to
workaround this, but our workaround is demonstrably incomplete and
liable to result in subtle bugs, especially on arm64 where get_user()
has recently been moved over to using asm goto with outputs.
From discussion(s) with Linus at:
https://lore.kernel.org/linux-arm-kernel/[email protected]/
https://lore.kernel.org/linux-arm-kernel/[email protected]/
... it sounds like the best thing to do for now is to remove the
workaround and make CC_HAS_ASM_GOTO_OUTPUT depend on working compiler
versions.
The issue was originally reported to GCC by Sean Christopherson:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921
... and Jakub Jelinek fixed this for GCC 14, with the fix backported to
13.3.0, 12.4.0, and 11.5.0.
In the kernel, we tried to workaround broken compilers in commits:
4356e9f841f7 ("work around gcc bugs with 'asm goto' with outputs")
68fb3ca0e408 ("update workarounds for gcc "asm goto" issue")
... but the workaround of adding an empty asm("") after the asm volatile
goto(...) demonstrably does not always avoid the problem, as can be seen
in the following test case:
| #define asm_goto_output(x...) \
| do { asm volatile goto(x); asm (""); } while (0)
|
| #define __good_or_bad(__val, __key) \
| do { \
| __label__ __failed; \
| unsigned long __tmp; \
| asm_goto_output( \
| " cbnz %[key], %l[__failed]\n" \
| " mov %[val], #0x900d\n" \
| : [val] "=r" (__tmp) \
| : [key] "r" (__key) \
| : \
| : __failed); \
| (__val) = __tmp; \
| break; \
| __failed: \
| (__val) = 0xbad; \
| } while (0)
|
| unsigned long get_val(unsigned long key);
| unsigned long get_val(unsigned long key)
| {
| unsigned long val = 0xbad;
|
| __good_or_bad(val, key);
|
| return val;
| }
GCC 13.2.0 (at -O2) compiles this to:
| cbnz x0, .Lfailed
| mov x0, #0x900d
| .Lfailed:
| ret
GCC 14.1.0 (at -O2) compiles this to:
| cbnz x0, .Lfailed
| mov x0, #0x900d
| ret
| .Lfailed:
| mov x0, #0xbad
| ret
Note that GCC 13.2.0 erroneously omits the assignment to 'val' in the
error path (even though this does not depend on an output of the asm
goto). GCC 14.1.0 correctly retains the assignment.
This problem can be seen within the kernel with the following test case:
| #include <linux/uaccess.h>
| #include <linux/types.h>
|
| noinline unsigned long test_unsafe_get_user(unsigned long __user *ptr);
| noinline unsigned long test_unsafe_get_user(unsigned long __user *ptr)
| {
| unsigned long val;
|
| unsafe_get_user(val, ptr, Efault);
| return val;
|
| Efault:
| val = 0x900d;
| return val;
| }
GCC 13.2.0 (arm64 defconfig) compiles this to:
| and x0, x0, #0xff7fffffffffffff
| ldtr x0, [x0]
| .Lextable_fixup:
| ret
GCC 13.2.0 (x86_64 defconfig + MITIGATION_RETPOLINE=n) compiles this to:
| endbr64
| mov (%rdi),%rax
| .Lextable_fixup:
| ret
... omitting the assignment to 'val' in the error path, and leaving
garbage in the result register returned by the function (which happens
to contain the faulting address in the generated code).
GCC 14.1.0 (arm64 defconfig) compiles this to:
| and x0, x0, #0xff7fffffffffffff
| ldtr x0, [x0]
| ret
| .Lextable_fixup:
| mov x0, #0x900d // #36877
| ret
GCC 14.1.0 (x86_64 defconfig + MITIGATION_RETPOLINE=n) compiles this to:
| endbr64
| mov (%rdi),%rax
| ret
| .Lextable_fixup:
| mov $0x900d,%eax
| ret
... retaining the expected assignment to 'val' in the error path.
We don't have a complete and reasonable workaround. While placing empty
asm("") blocks after each goto label *might* be sufficient, we don't
know for certain, this is tedious and error-prone, and there doesn't
seem to be a neat way to wrap this up (which is especially painful for
cases with multiple goto labels).
Avoid this issue by disabling CONFIG_CC_HAS_ASM_GOTO_OUTPUT for
known-broken compiler versions and removing the workaround (along with
the CONFIG_GCC_ASM_GOTO_OUTPUT_WORKAROUND config option).
For the moment I've left the default implementation of asm_goto_output()
unchanged. This should now be redundant since any compiler with the fix
for the clobbering issue whould also have a fix for the (earlier)
volatile issue, but it's far less churny to leave it around, which makes
it easier to backport this patch if necessary.
Signed-off-by: Mark Rutland <[email protected]>
Cc: Alex Coplan <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Jakub Jelinek <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Szabolcs Nagy <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Commit fbb5c0606fa4 ("kbuild: add syscall table generation to
scripts/Makefile.asm-headers") started to generate syscall headers
for architectures using generic syscalls.
However, these headers are always rebuilt using GNU Make 4.4.1 or newer.
When using GNU Make 4.4 or older, these headers are not rebuilt when the
command to generate them is changed, despite the use of the if_changed
macro.
scripts/Makefile.asm-headers now uses FORCE, but it is not marked as
.PHONY. To handle the command line change correctly, .*.cmd files must
be included.
Fixes: fbb5c0606fa4 ("kbuild: add syscall table generation to scripts/Makefile.asm-headers")
Reported-by: Linus Torvalds <[email protected]>
Closes: https://lore.kernel.org/lkml/CAHk-=wibB7SvXnUftBgAt+4-3vEKRpvEgBeDEH=i=j2GvDitoA@mail.gmail.com/
Signed-off-by: Masahiro Yamada <[email protected]>
Tested-by: Arnd Bergmann <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Pull drm updates from Dave Airlie:
"There's a lot of stuff in here, amd, i915 and xe have new platform
work, lots of core rework around EDID handling, some new COMPILE_TEST
options, maintainer changes and a lots of other stuff. Summary:
core:
- deprecate DRM data and return 0 date
- connector: Create a set of helpers to help with HDMI support
- Remove driver owner assignments
- Allow more drivers to compile with COMPILE_TEST
- Conversions to drm_edid
- Sprinkle MODULE_DESCRIPTIONS everywhere they are missing
- Remove drm_mm_replace_node
- print: Add a drm prefix to warn level messages too, remove
___drm_dbg, consolidate prefix handling
- New monochrome TV mode variant
ttm:
- improve number of page faults on some platforms
- fix test builds under PREEMPT_RT
- more test coverage
ci:
- Require a more recent version of mesa
- improve farm setup and test generation
dma-buf:
- warn if reserving 0 fence slots
- internal API heap enhancements
fbdev:
- Create memory manager optimized fbdev emulation
panic:
- Allow to select fonts
- improve drm_fb_dma_get_scanout_buffer
- Allow to dump kmsg to the screen
bridge:
- Remove redundant checks on bridge->encoder
- Remove drm_bridge_chain_mode_fixup
- bridge-connector: Plumb in the new HDMI helper
- analogix_dp: Various improvements, handle AUX transfers timeout
- samsung-dsim: Fix timings calculation
- tc358767: Plenty of small fixes, fix no connector attach, fix
clocks
- sii902x: state validation improvements
panels:
- Switch panels from register table initialization to proper code
- Now that the panel code tracks the panel state, remove every ad-hoc
implementation in the panel drivers
- More cleanup of prepare / enable state tracking in drivers
- edp: Drop legacy panel compatibles
- simple-bridge: Switch to devm_drm_bridge_add
- New panels: Lincoln Tech Sol LCD185-101CT, Microtips Technology
13-101HIEBCAF0-C, Microtips Technology MF-103HIEB0GA0,
BOE nv110wum-l60, IVO t109nw41, WL-355608-A8, PrimeView
PM070WL4, Lincoln Technologies LCD197, Ortustech
COM35H3P70ULC, AUO G104STN01, K&d kd101ne3-40ti
amdgpu:
- DCN 4.0.x support
- GC 12.0 support
- GMC 12.0 support
- SDMA 7.0 support
- MES12 support
- MMHUB 4.1 support
- GFX12 modifier and DCC support
- lots of IP fixes/updates
amdkfd:
- Contiguous VRAM allocations
- GC 12.0 support
- SDMA 7.0 support
- SR-IOV fixes
- KFD GFX ALU exceptions
i915:
- Battlemage Xe2 HPD display enablement
- Panel Replay enabling
- DP AUX-less ALPM/LOBF
- Enable link training failure fallback for DP MST links
- CMRR (Content Match Refresh Rate) enabling
- Increase ADL-S/ADL-P/DG2+ max TMDS bitrate to 6 Gbps
- Enable eDP AUX based HDR backlight
- Support replaying GPU hangs with captured context image
- Automate CCS Mode setting during engine resets
- lots of refactoring
- Support replaying GPU hangs with captured context image
- Increase FLR timeout from 3s to 9s
- Enable w/a 16021333562 for DG2, MTL and ARL [guc]
xe:
- update MAINATINERS
- New uapi adding OA functionality to Xe
- expose l3 bank mask
- fix display detect on ADL-N
- runtime PM Fixes
- Fix silent backmerge issues
- More prep for SR-IOV
- HWmon additions
- per client usage info
- Rework GPU page fault handling
- Drop EXEC_QUEUE_FLAG_BANNED
- Add BMG PCI IDs
- Scheduler fixes and improvements
- Rename xe_exec_queue::compute to xe_exec_queue::lr
- Use ttm_uncached for BO with NEEDS_UC flag
- Rename xe perf layer as xe observation layer
- lots of refactoring
radeon:
- Backlight workaround for iMac
- Silence UBSAN flex array warnings
msm:
- Validate registers XML description against schema in CI
- core/dpu: SM7150 support
- mdp5: Add support for MSM8937
- gpu: Add param for userspace to know if raytracing is supported
- gpu: X185 support (aka gpu in X1 laptop chips)
- gpu: a505 support
ivpu:
- hardware scheduler support
- profiling support
- improvements to the platform support layer
- firmware handling improvements
- clocks/power mgmt improvements
- scheduler/logging improvements
habanalabs:
- Gradual sleep in polling memory macro
- Reduce Gaudi2 MSI-X interrupt count to 128
- Add Gaudi2-D revision support
- Add timestamp to CPLD info
- Gaudi2: Assume hard-reset by firmware upon MC SEI severe error
- Align Gaudi2 interrupt names
- Check for errors after preboot is ready
- Change habanalabs maintainer and git repo path
mgag200:
- refactoring and improvements
- Add BMC output
- enable polling
nouveau:
- add registry command line
v3d:
- perf counters improvements
zynqmp:
- irq and debugfs improvements
atmel-hlcdc:
- Support XLCDC in sam9x7
mipi-dbi:
- Remove mipi_dbi_machine_little_endian
- make SPI bits per word configurable
- support RGB888
- allow pixel formats to be specified in the DT
sun4i:
- Rework the blender setup for DE2
panfrost:
- Enable MT8188 support
vc4:
- Monochrome TV support
exynos:
- fix fallback mode regression
- fix memory leak
- Use drm_edid_duplicate() instead of kmemdup()
etnaviv:
- fix i.MX8MP NPU clock gating
- workaround FE register cdc issues on some cores
- fix DMA sync handling for cached buffers
- fix job timeout handling
- keep TS enabled on MMUv2 cores for improved performance
mediatek:
- Convert to platform remove callback returning void-
- Drop chain_mode_fixup call in mode_valid()
- Fixes the errors of MediaTek display driver found by IGT
- Add display support for the MT8365-EVK board
- Fix bit depth overwritten for mtk_ovl_set bit_depth()
- Fix possible_crtcs calculation
- Fix spurious kfree()
ast:
- refactor mode setting code
stm:
- Add LVDS support
- DSI PHY updates"
* tag 'drm-next-2024-07-18' of https://gitlab.freedesktop.org/drm/kernel: (2501 commits)
drm/amdgpu/mes12: add missing opcode string
drm/amdgpu/mes11: update opcode strings
Revert "drm/amd/display: Reset freesync config before update new state"
drm/omap: Restrict compile testing to PAGE_SIZE less than 64KB
drm/xe: Drop trace_xe_hw_fence_free
drm/xe/uapi: Rename xe perf layer as xe observation layer
drm/amdgpu: remove exp hw support check for gfx12
drm/amdgpu: timely save bad pages to eeprom after gpu ras reset is completed
drm/amdgpu: flush all cached ras bad pages to eeprom
drm/amdgpu: select compute ME engines dynamically
drm/amd/display: Allow display DCC for DCN401
drm/amdgpu: select compute ME engines dynamically
drm/amdgpu/job: Replace DRM_INFO/ERROR logging
drm/amdgpu: select compute ME engines dynamically
drm/amd/pm: Ignore initial value in smu response register
drm/amdgpu: Initialize VF partition mode
drm/amd/amdgpu: fix SDMA IRQ client ID <-> req mapping
MAINTAINERS: fix Xinhui's name
MAINTAINERS: update powerplay and swsmu
drm/qxl: Pin buffer objects for internal mappings
...
|
|
The GSS routine errors are values, not flags.
Fixes: 0c77668ddb4e ("SUNRPC: Introduce trace points in rpc_auth_gss.ko")
Signed-off-by: Benjamin Coddington <[email protected]>
Reviewed-by: Chuck Lever <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
|
|
We've observed NFS clients with sync tasks sleeping in __rpc_execute
waiting on RPC_TASK_QUEUED that have not responded to a wake-up from
rpc_make_runnable(). I suspect this problem usually goes unnoticed,
because on a busy client the task will eventually be re-awoken by another
task completion or xprt event. However, if the state manager is draining
the slot table, a sync task missing a wake-up can result in a hung client.
We've been able to prove that the waker in rpc_make_runnable() successfully
calls wake_up_bit() (ie- there's no race to tk_runstate), but the
wake_up_bit() call fails to wake the waiter. I suspect the waker is
missing the load of the bit's wait_queue_head, so waitqueue_active() is
false. There are some very helpful comments about this problem above
wake_up_bit(), prepare_to_wait(), and waitqueue_active().
Fix this by inserting smp_mb__after_atomic() before the wake_up_bit(),
which pairs with prepare_to_wait() calling set_current_state().
Signed-off-by: Benjamin Coddington <[email protected]>
Reviewed-by: Jeff Layton <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
|
|
Drivers report a string with a name for each PCM, log it during startup of
pcm-test as a diagnostic aid.
Signed-off-by: Mark Brown <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
|
|
Currently for the PCM and mixer tests we report test names which identify
the card being tested with the card number. This ensures we have unique
names but since card numbers are dynamically assigned at runtime the names
we end up with will often not be stable on systems with multiple cards
especially where those cards are provided by separate modules loeaded at
runtime. This makes it difficult for automated systems and UIs to relate
test results between runs on affected platforms.
Address this by replacing our use of card numbers with card names which are
more likely to be stable across runs. We use the card ID since it is
guaranteed to be unique by default, unlike the long name. There is still
some vulnerability to ordering issues if multiple cards with the same base
ID are present in the system but have separate dependencies but not all
drivers put distinguishing information in their long names.
Signed-off-by: Mark Brown <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
|
|
Samsung Galaxy Book Pro 360 (13" 2022 NT935QDB-KC71S) with codec SSID
144d:c1a4 requires the same workaround to enable the speaker amp
as other Samsung models with the ALC298 codec.
Signed-off-by: Seunghun Han <[email protected]>
Cc: <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
|
|
Kconfig will ask the user twice about power sequencing: once for the QCom
WCN power sequencing driver and then again for the PCI power control
driver using it.
Let's automate the selection of PCI_PWRCTL by introducing a new hidden
symbol: HAVE_PWRCTL which should be selected by all platforms that have
the need to include PCI power control code (right now: only ARCH_QCOM).
The pwrseq-based PCI pwrctl driver itself will then be selected by the
drivers binding to devices that may require external handling of the
power-up sequence (currently: ath11k and ath12k) based on the value
of HAVE_PWRCTL.
Make all PCI pwrctl Kconfig symbols hidden so that no questions are
asked during configuration.
Fixes: 4565d2652a37 ("PCI/pwrctl: Add PCI power control core code")
Reported-by: Linus Torvalds <[email protected]>
Closes: https://lore.kernel.org/lkml/CAHk-=wjWc5dzcj2O1tEgNHY1rnQW63JwtuZi_vAZPqy6wqpoUQ@mail.gmail.com/
Acked-by: Jeff Johnson <[email protected]> # drivers/net/wireless/ath
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bartosz Golaszewski <[email protected]>
|
|
The iwlwifi wireless driver registers a thermal zone that is only needed
when the network interface handled by it is up and it wants that thermal
zone to be effectively ignored by the core otherwise.
Before commit a8a261774466 ("thermal: core: Call monitor_thermal_zone()
if zone temperature is invalid") that could be achieved by returning
an error code from the thermal zone's .get_temp() callback because the
core did not really handle errors returned by it almost at all.
However, commit a8a261774466 made the core attempt to recover from the
situation in which the temperature of a thermal zone cannot be
determined due to errors returned by its .get_temp() and is always
invalid from the core's perspective.
That was done because there are thermal zones in which .get_temp()
returns errors to start with due to some difficulties related to the
initialization ordering, but then it will start to produce valid
temperature values at one point.
Unfortunately, the simple approach taken by commit a8a261774466,
which is to poll the thermal zone periodically until its .get_temp()
callback starts to return valid temperature values, is at odds with
the special thermal zone in iwlwifi in which .get_temp() may always
return an error because its network interface may always be down. If
that happens, every attempt to invoke the thermal zone's .get_temp()
callback resulting in an error causes the thermal core to print a
dev_warn() message to the kernel log which is super-noisy.
To address this problem, make the core handle the case in which
.get_temp() returns 0, but the temperature value returned by it
is not actually valid, in a special way. Namely, make the core
completely ignore the invalid temperature value coming from
.get_temp() in that case, which requires folding in
update_temperature() into its caller and a few related changes.
On the iwlwifi side, modify iwl_mvm_tzone_get_temp() to return 0
and put THERMAL_TEMP_INVALID into the temperature return memory
location instead of returning an error when the firmware is not
running or it is not of the right type.
Also, to clearly separate the handling of invalid temperature
values from the thermal zone initialization, introduce a special
THERMAL_TEMP_INIT value specifically for the latter purpose.
Fixes: a8a261774466 ("thermal: core: Call monitor_thermal_zone() if zone temperature is invalid")
Closes: https://lore.kernel.org/linux-pm/[email protected]/
Reported-by: Eric Biggers <[email protected]>
Reported-by: Stefan Lippers-Hollmann <[email protected]>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=201761
Tested-by: Oleksandr Natalenko <[email protected]>
Tested-by: Stefan Lippers-Hollmann <[email protected]>
Cc: 6.10+ <[email protected]> # 6.10+
Signed-off-by: Rafael J. Wysocki <[email protected]>
Link: https://patch.msgid.link/[email protected]
[ rjw: Rebased on top of the current mainline ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains Netfilter/IPVS fixes for net:
1) Call nf_expect_get_id() to delete expectation by ID. By trial and
error it is possible to leak the LSB of the expectation address on
x86_64. This bug is a leftover when converting the existing code
to use nf_expect_get_id().
2) Incorrect initialization in pipapo set backend leads to packet
mismatches. From Florian Westphal.
3) Extend netfilter's selftests to cover for the pipapo set backend,
also from Florian.
4) Fix sparse warning in IPVS when adding service, from Chen Hanxiao.
netfilter pull request 24-07-17
* tag 'nf-24-07-17' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
ipvs: properly dereference pe in ip_vs_add_service
selftests: netfilter: add test case for recent mismatch bug
netfilter: nf_set_pipapo: fix initial map fill
netfilter: ctnetlink: use helper function to calculate expect ID
====================
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Martin Willi says:
====================
net: dsa: Fix chip-wide frame size config in some drivers
Some DSA chips support a chip-wide frame size configurations, only. Some
drivers adjust that chip-wide setting for user port changes, overriding
the frame size requirements on the CPU port that includes tagger overhead.
Fix the mv88e6xxx and b53 drivers and align them to the behavior of other
drivers.
v2:
- Skip chip-wide config for non-CPU ports instead of finding the maximim
MTU over all ports
- Add a patch fixing the b53 driver as well
v1: https://lore.kernel.org/netdev/[email protected]/
====================
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Broadcom switches supported by the b53 driver use a chip-wide jumbo frame
configuration. In the commit referenced with the Fixes tag, the setting
is applied just for the last port changing its MTU.
While configuring CPU ports accounts for tagger overhead, user ports do
not. When setting the MTU for a user port, the chip-wide setting is
reduced to not include the tagger overhead, resulting in an potentially
insufficient chip-wide maximum frame size for the CPU port.
As, by design, the CPU port MTU is adjusted for any user port change,
apply the chip-wide setting only for CPU ports. This aligns the driver
to the behavior of other switch drivers.
Fixes: 6ae5834b983a ("net: dsa: b53: add MTU configuration support")
Suggested-by: Vladimir Oltean <[email protected]>
Signed-off-by: Martin Willi <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Marvell chips not supporting per-port jumbo frame size configurations use
a chip-wide frame size configuration. In the commit referenced with the
Fixes tag, the setting is applied just for the last port changing its MTU.
While configuring CPU ports accounts for tagger overhead, user ports do
not. When setting the MTU for a user port, the chip-wide setting is
reduced to not include the tagger overhead, resulting in an potentially
insufficient maximum frame size for the CPU port. Specifically, sending
full-size frames from the CPU port on a MV88E6097 having a user port MTU
of 1500 bytes results in dropped frames.
As, by design, the CPU port MTU is adjusted for any user port change,
apply the chip-wide setting only for CPU ports.
Fixes: 1baf0fac10fb ("net: dsa: mv88e6xxx: Use chip-wide max frame size for MTU")
Suggested-by: Vladimir Oltean <[email protected]>
Signed-off-by: Martin Willi <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Move page_pool_get_dma_dir() inside the while loop of
airoha_qdma_cleanup_rx_queue routine in order to avoid possible NULL
pointer dereference if airoha_qdma_init_rx_queue() fails before
properly allocating the page_pool pointer.
Fixes: 23020f049327 ("net: airoha: Introduce ethernet support for EN7581 SoC")
Signed-off-by: Lorenzo Bianconi <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/7330a41bba720c33abc039955f6172457a3a34f0.1721205981.git.lorenzo@kernel.org
Signed-off-by: Paolo Abeni <[email protected]>
|
|
add support for Dell DW5933e (0x14c0, 0x4d75)
Signed-off-by: Jack Wu <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Ido Schimmel says:
====================
ipv4: Fix incorrect TOS in route get reply
Two small fixes for incorrect TOS in route get reply. See more details
in the commit messages.
No regressions in FIB tests:
# ./fib_tests.sh
[...]
Tests passed: 218
Tests failed: 0
====================
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
The TOS value that is returned to user space in the route get reply is
the one with which the lookup was performed ('fl4->flowi4_tos'). This is
fine when the matched route is configured with a TOS as it would not
match if its TOS value did not match the one with which the lookup was
performed.
However, matching on TOS is only performed when the route's TOS is not
zero. It is therefore possible to have the kernel incorrectly return a
non-zero TOS:
# ip link add name dummy1 up type dummy
# ip address add 192.0.2.1/24 dev dummy1
# ip route get fibmatch 192.0.2.2 tos 0xfc
192.0.2.0/24 tos 0x1c dev dummy1 proto kernel scope link src 192.0.2.1
Fix by instead returning the DSCP field from the FIB result structure
which was populated during the route lookup.
Output after the patch:
# ip link add name dummy1 up type dummy
# ip address add 192.0.2.1/24 dev dummy1
# ip route get fibmatch 192.0.2.2 tos 0xfc
192.0.2.0/24 dev dummy1 proto kernel scope link src 192.0.2.1
Extend the existing selftests to not only verify that the correct route
is returned, but that it is also returned with correct "tos" value (or
without it).
Fixes: b61798130f1b ("net: ipv4: RTM_GETROUTE: return matched fib result when requested")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Reviewed-by: Guillaume Nault <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
The TOS value that is returned to user space in the route get reply is
the one with which the lookup was performed ('fl4->flowi4_tos'). This is
fine when the matched route is configured with a TOS as it would not
match if its TOS value did not match the one with which the lookup was
performed.
However, matching on TOS is only performed when the route's TOS is not
zero. It is therefore possible to have the kernel incorrectly return a
non-zero TOS:
# ip link add name dummy1 up type dummy
# ip address add 192.0.2.1/24 dev dummy1
# ip route get 192.0.2.2 tos 0xfc
192.0.2.2 tos 0x1c dev dummy1 src 192.0.2.1 uid 0
cache
Fix by adding a DSCP field to the FIB result structure (inside an
existing 4 bytes hole), populating it in the route lookup and using it
when filling the route get reply.
Output after the patch:
# ip link add name dummy1 up type dummy
# ip address add 192.0.2.1/24 dev dummy1
# ip route get 192.0.2.2 tos 0xfc
192.0.2.2 dev dummy1 src 192.0.2.1 uid 0
cache
Fixes: 1a00fee4ffb2 ("ipv4: Remove rt_key_{src,dst,tos} from struct rtable.")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Reviewed-by: Guillaume Nault <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
The following splat is easy to reproduce upstream as well as in -stable
kernels. Florian Westphal provided the following commit:
d1dab4f71d37 ("net: add and use __skb_get_hash_symmetric_net")
but this complementary fix has been also suggested by Willem de Bruijn
and it can be easily backported to -stable kernel which consists in
using DEBUG_NET_WARN_ON_ONCE instead to silence the following splat
given __skb_get_hash() is used by the nftables tracing infrastructure to
to identify packets in traces.
[69133.561393] ------------[ cut here ]------------
[69133.561404] WARNING: CPU: 0 PID: 43576 at net/core/flow_dissector.c:1104 __skb_flow_dissect+0x134f/
[...]
[69133.561944] CPU: 0 PID: 43576 Comm: socat Not tainted 6.10.0-rc7+ #379
[69133.561959] RIP: 0010:__skb_flow_dissect+0x134f/0x2ad0
[69133.561970] Code: 83 f9 04 0f 84 b3 00 00 00 45 85 c9 0f 84 aa 00 00 00 41 83 f9 02 0f 84 81 fc ff
ff 44 0f b7 b4 24 80 00 00 00 e9 8b f9 ff ff <0f> 0b e9 20 f3 ff ff 41 f6 c6 20 0f 84 e4 ef ff ff 48 8d 7b 12 e8
[69133.561979] RSP: 0018:ffffc90000006fc0 EFLAGS: 00010246
[69133.561988] RAX: 0000000000000000 RBX: ffffffff82f33e20 RCX: ffffffff81ab7e19
[69133.561994] RDX: dffffc0000000000 RSI: ffffc90000007388 RDI: ffff888103a1b418
[69133.562001] RBP: ffffc90000007310 R08: 0000000000000000 R09: 0000000000000000
[69133.562007] R10: ffffc90000007388 R11: ffffffff810cface R12: ffff888103a1b400
[69133.562013] R13: 0000000000000000 R14: ffffffff82f33e2a R15: ffffffff82f33e28
[69133.562020] FS: 00007f40f7131740(0000) GS:ffff888390800000(0000) knlGS:0000000000000000
[69133.562027] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[69133.562033] CR2: 00007f40f7346ee0 CR3: 000000015d200001 CR4: 00000000001706f0
[69133.562040] Call Trace:
[69133.562044] <IRQ>
[69133.562049] ? __warn+0x9f/0x1a0
[ 1211.841384] ? __skb_flow_dissect+0x107e/0x2860
[...]
[ 1211.841496] ? bpf_flow_dissect+0x160/0x160
[ 1211.841753] __skb_get_hash+0x97/0x280
[ 1211.841765] ? __skb_get_hash_symmetric+0x230/0x230
[ 1211.841776] ? mod_find+0xbf/0xe0
[ 1211.841786] ? get_stack_info_noinstr+0x12/0xe0
[ 1211.841798] ? bpf_ksym_find+0x56/0xe0
[ 1211.841807] ? __rcu_read_unlock+0x2a/0x70
[ 1211.841819] nft_trace_init+0x1b9/0x1c0 [nf_tables]
[ 1211.841895] ? nft_trace_notify+0x830/0x830 [nf_tables]
[ 1211.841964] ? get_stack_info+0x2b/0x80
[ 1211.841975] ? nft_do_chain_arp+0x80/0x80 [nf_tables]
[ 1211.842044] nft_do_chain+0x79c/0x850 [nf_tables]
Fixes: 9b52e3f267a6 ("flow_dissector: handle no-skb use case")
Suggested-by: Willem de Bruijn <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|