Age | Commit message (Collapse) | Author | Files | Lines |
|
Include and use the new FDMA header, which now provides the required
masks and bit offsets for operating on the DCB's and DB's.
Signed-off-by: Daniel Machon <[email protected]>
Reviewed-by: Horatiu Vultur <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Select the newly introduced FDMA library.
Signed-off-by: Daniel Machon <[email protected]>
Reviewed-by: Horatiu Vultur <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
It may be possible for the sum of the values derived from
i915_ggtt_offset() and __get_parent_scratch_offset()/
i915_ggtt_offset() to go over the u32 limit before being assigned
to wq offsets of u64 type.
Mitigate these issues by expanding one of the right operands
to u64 to avoid any overflow issues just in case.
Found by Linux Verification Center (linuxtesting.org) with static
analysis tool SVACE.
Fixes: c2aa552ff09d ("drm/i915/guc: Add multi-lrc context registration")
Cc: Matthew Brost <[email protected]>
Cc: John Harrison <[email protected]>
Signed-off-by: Nikita Zhandarovich <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
(cherry picked from commit 1f1c1bd56620b80ae407c5790743e17caad69cec)
Signed-off-by: Tvrtko Ursulin <[email protected]>
|
|
Brett Creeley says:
====================
ionic: convert Rx queue buffers to use page_pool
Our home-grown buffer management needs to go away and we need to play
nicely with the page_pool infrastructure. This patchset cleans up some
of our API use and converts the Rx traffic queues to use page_pool.
The first few patches are for tidying up things, then a small XDP
configuration refactor, adding page_pool support, and finally adding
support to hot swap an XDP program without having to reconfigure
anything.
The result is code that more closely follows current patterns, as well as
a either a performance boost or equivalent performance as seen with
iperf testing:
mss netio tx_pps rx_pps total_pps tx_bw rx_bw total_bw
---- ------- ---------- ---------- ----------- ------- ------- ----------
Before:
256 bidir 13,839,293 15,515,227 29,354,520 34 38 71
512 bidir 13,913,249 14,671,693 28,584,942 62 65 127
1024 bidir 13,006,189 13,695,413 26,701,602 109 115 224
1448 bidir 12,489,905 12,791,734 25,281,639 145 149 294
2048 bidir 9,195,622 9,247,649 18,443,271 148 149 297
4096 bidir 5,149,716 5,247,917 10,397,633 160 163 323
8192 bidir 3,029,993 3,008,882 6,038,875 179 179 358
9000 bidir 2,789,358 2,800,744 5,590,102 181 180 361
After:
256 bidir 21,540,037 21,344,644 42,884,681 52 52 104
512 bidir 23,170,014 19,207,260 42,377,274 103 85 188
1024 bidir 17,934,280 17,819,247 35,753,527 150 149 299
1448 bidir 15,242,515 14,907,030 30,149,545 167 174 341
2048 bidir 10,692,542 10,663,023 21,355,565 177 176 353
4096 bidir 6,024,977 6,083,580 12,108,557 187 180 367
8192 bidir 3,090,449 3,048,266 6,138,715 180 176 356
9000 bidir 2,859,146 2,864,226 5,723,372 178 180 358
v2: https://lore.kernel.org/[email protected]
v1: https://lore.kernel.org/[email protected]
====================
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Using examples of other driver(s), add the ability to hot-swap an XDP
program without having to reconfigure the queues. To prevent the
q->xdp_prog to be read/written more than once use READ_ONCE() and
WRITE_ONCE() on the q->xdp_prog.
The q->xdp_prog was being checked in multiple different for loops in the
hot path. The change to allow xdp_prog hot swapping created the
possibility for many READ_ONCE(q->xdp_prog) calls during a single napi
callback. Refactor the Rx napi handling to allow a previous
READ_ONCE(q->xdp_prog) (or NULL for hwstamp_rxq) to be passed into the
relevant functions.
Also, move other Rx related hotpath handling into the newly created
ionic_rx_cq_service() function to reduce the scope of the xdp_prog
local variable and put all Rx handling in one function similar to Tx.
Signed-off-by: Brett Creeley <[email protected]>
Signed-off-by: Shannon Nelson <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Our home-grown buffer management needs to go away and we need
to be playing nicely with the page_pool infrastructure. This
converts the Rx traffic queues to use page_pool.
Also, since ionic_rx_buf_size() was removed, redefine
IONIC_PAGE_SIZE to account for IONIC_MAX_BUF_LEN being the
largest allowed buffer to prevent overflowing u16 variables,
which could happen when PAGE_SIZE is defined as >= 64KB.
include/linux/minmax.h:93:37: warning: conversion from 'long unsigned int' to 'u16' {aka 'short unsigned int'} changes value from '65536' to '0' [-Woverflow]
Signed-off-by: Shannon Nelson <[email protected]>
Signed-off-by: Brett Creeley <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Currently when going to/from a NULL XDP program the driver uses
ionic_stop_queues_reconfig() and then ionic_start_queues_reconfig() in
order to re-register the xdp_rxq_info and re-init the queues. This is
fine until page_pool(s) are used in an upcoming patch.
In preparation for adding page_pool support make sure to completely
rebuild the queues when going to/from a NULL XDP program. Without this
change the call to mem_allocator_disconnect() never happens when going
to a NULL XDP program, which eventually results in
xdp_rxq_info_reg_mem_model() failing with -ENOSPC due to the mem_id_pool
ida having no remaining space.
Signed-off-by: Brett Creeley <[email protected]>
Signed-off-by: Shannon Nelson <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Instead of setting up and tearing down the rxq_info only when the XDP
program is loaded or unloaded, we will build the rxq_info whether or not
XDP is in use. This is the more common use pattern and better supports
future conversion to page_pool. Since the rxq_info wants the napi_id
we re-order things slightly to tie this into the queue init and deinit
functions where we do the add and delete of napi.
Signed-off-by: Shannon Nelson <[email protected]>
Signed-off-by: Brett Creeley <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
We originally were using a per-interface xdp_prog variable to track
a loaded XDP program since we knew there would never be support for a
per-queue XDP program. With that, we only built the per queue rxq_info
struct when an XDP program was loaded and removed it on XDP program unload,
and used the pointer as an indicator in the Rx hotpath to know to how build
the buffers. However, that's really not the model generally used, and
makes a conversion to page_pool Rx buffer cacheing a little problematic.
This patch converts the driver to use the more common approach of using
a per-queue xdp_prog pointer to work out buffer allocations and need
for bpf_prog_run_xdp(). We jostle a couple of fields in the queue struct
in order to keep the new xdp_prog pointer in a warm cacheline.
Signed-off-by: Shannon Nelson <[email protected]>
Signed-off-by: Brett Creeley <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
We aren't "putting" buf, we're just unlinking them from our tracking in
order to let the XDP_TX and XDP_REDIRECT tx clean paths take care of the
pages when they are done with them. This rename clears up the intent.
Signed-off-by: Shannon Nelson <[email protected]>
Signed-off-by: Brett Creeley <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Here's a little debugging aid in case the device starts throwing
Tx completion errors.
Signed-off-by: Shannon Nelson <[email protected]>
Signed-off-by: Brett Creeley <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Gal Pressman says:
====================
RX software timestamp for all - round 3
Rounds 1 & 2 of drivers conversion were merged [1][2], this round will
complete the work.
[1] https://lore.kernel.org/netdev/[email protected]/
[2] https://lore.kernel.org/netdev/[email protected]/
====================
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <[email protected]>
Reviewed-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Gal Pressman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <[email protected]>
Reviewed-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Gal Pressman <[email protected]>
Acked-by: Richard Cochran <[email protected]>
Acked-by: Linus Walleij <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <[email protected]>
Reviewed-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Gal Pressman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <[email protected]>
Reviewed-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Gal Pressman <[email protected]>
Reviewed-by: Martin Habets <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <[email protected]>
Reviewed-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Gal Pressman <[email protected]>
Reviewed-by: Martin Habets <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <[email protected]>
Reviewed-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Gal Pressman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <[email protected]>
Reviewed-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Gal Pressman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <[email protected]>
Reviewed-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Gal Pressman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <[email protected]>
Reviewed-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Gal Pressman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <[email protected]>
Reviewed-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Gal Pressman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <[email protected]>
Reviewed-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Gal Pressman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <[email protected]>
Reviewed-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Gal Pressman <[email protected]>
Acked-by: Nicolas Ferre <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <[email protected]>
Reviewed-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Gal Pressman <[email protected]>
Acked-by: Shyam Sundar S K <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <[email protected]>
Reviewed-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Gal Pressman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <[email protected]>
Reviewed-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Gal Pressman <[email protected]>
Reviewed-by: Michael Chan <[email protected]>
Reviewed-by: Pavan Chebbi <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The responsibility for reporting of RX software timestamp has moved to
the core layer (see __ethtool_get_ts_info()), remove usage from the
device drivers.
Reviewed-by: Carolina Jubran <[email protected]>
Reviewed-by: Rahul Rameshbabu <[email protected]>
Signed-off-by: Gal Pressman <[email protected]>
Reviewed-by: Michael Chan <[email protected]>
Reviewed-by: Pavan Chebbi <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
pa_stats is optional in dt bindings, make it optional in driver as well.
Currently if pa_stats syscon regmap is not found driver returns -ENODEV.
Fix this by not returning an error in case pa_stats is not found and
continue generating ethtool stats without pa_stats.
Fixes: 550ee90ac61c ("net: ti: icssg-prueth: Add support for PA Stats")
Signed-off-by: MD Danish Anwar <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
dev->emacp contains an __iomem pointer and values derived
from it are used as __iomem pointers. So use this annotation
in the return type for helpers that derive pointers from dev->emacp.
Flagged by Sparse as:
.../core.c:444:36: warning: incorrect type in argument 1 (different address spaces)
.../core.c:444:36: expected unsigned int volatile [noderef] [usertype] __iomem *addr
.../core.c:444:36: got unsigned int [usertype] *
.../core.c: note: in included file:
.../core.h:416:25: warning: cast removes address space '__iomem' of expression
Compile tested only.
No functional change intended.
Signed-off-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Willem de Bruijn says:
====================
selftests/net: add packetdrill
Lay the groundwork to import into kselftests the over 150 packetdrill
TCP/IP conformance tests on github.com/google/packetdrill.
1/2: add kselftest infra for TEST_PROGS that need an interpreter
2/2: add the specific packetdrill tests
====================
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Lay the groundwork to import into kselftests the over 150 packetdrill
TCP/IP conformance tests on github.com/google/packetdrill.
Florian recently added support for packetdrill tests in nf_conntrack,
in commit a8a388c2aae49 ("selftests: netfilter: add packetdrill based
conntrack tests").
This patch takes a slightly different approach. It relies on
ksft_runner.sh to run every *.pkt file in the directory.
Any future imports of packetdrill tests should require no additional
coding. Just add the *.pkt files.
Initially import only two features/directories from github. One with a
single script, and one with two. This was the only reason to pick
tcp/inq and tcp/md5.
The path replaces the directory hierarchy in github with a flat space
of files: $(subst /,_,$(wildcard tcp/**/*.pkt)). This is the most
straightforward option to integrate with kselftests. The Linked thread
reviewed two ways to maintain the hierarchy: TEST_PROGS_RECURSE and
PRESERVE_TEST_DIRS. But both introduce significant changes to
kselftest infra and with that risk to existing tests.
Implementation notes:
- restore alphabetical order when adding the new directory to
tools/testing/selftests/Makefile
- imported *.pkt files and support verbatim from the github project,
except for
- update `source ./defaults.sh` path (to adjust for flat dir)
- add SPDX headers
- remove one author statement
- Acknowledgment: drop an e (checkpatch)
Tested:
make -C tools/testing/selftests \
TARGETS=net/packetdrill \
run_tests
make -C tools/testing/selftests \
TARGETS=net/packetdrill \
install INSTALL_PATH=$KSFT_INSTALL_PATH
# in virtme-ng
./run_kselftest.sh -c net/packetdrill
./run_kselftest.sh -t net/packetdrill:tcp_inq_client.pkt
Link: https://lore.kernel.org/netdev/[email protected]/
Signed-off-by: Willem de Bruijn <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Support testcases that are themselves not executable, but need an
interpreter to run them.
If a test file is not executable, but an executable file
ksft_runner.sh exists in the TARGET dir, kselftest will run
./ksft_runner.sh ./$BASENAME_TEST
Packetdrill may add hundreds of packetdrill scripts for testing. These
scripts must be passed to the packetdrill process.
Have kselftest run each test directly, as it already solves common
runner requirements like parallel execution and isolation (netns).
A previous RFC added a wrapper in between, which would have to
reimplement such functionality.
Link: https://lore.kernel.org/netdev/[email protected]/T/
Signed-off-by: Willem de Bruijn <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The grc must be initialize first. There can be a condition where if
fou is NULL, goto out will be executed and grc would be used
uninitialized.
Fixes: 7e4196935069 ("fou: Fix null-ptr-deref in GRO.")
Signed-off-by: Muhammad Usama Anjum <[email protected]>
Reviewed-by: Kuniyuki Iwashima <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Rosen Penev says:
====================
various cleanups
Allow CI to build. Also a bugfix for dual GMAC devices.
====================
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
ag71xx_probe is registering ag71xx_interrupt as handler for gmac0/gmac1
interrupts. The handler is trying to use napi_schedule to handle the
processing of packets. But the netif_napi_add for this device is
called a lot later in ag71xx_probe.
It can therefore happen that a still running gmac0/gmac1 is triggering the
interrupt handler with a bit from AG71XX_INT_POLL set in
AG71XX_REG_INT_STATUS. The handler will then call napi_schedule and the
napi code will crash the system because the ag->napi is not yet
initialized.
The gmcc0/gmac1 must be brought in a state in which it doesn't signal a
AG71XX_INT_POLL related status bits as interrupt before registering the
interrupt handler. ag71xx_hw_start will take care of re-initializing the
AG71XX_REG_INT_ENABLE.
This will become relevant when dual GMAC devices get added here.
Signed-off-by: Sven Eckelmann <[email protected]>
Signed-off-by: Rosen Penev <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The opposite of this condition is checked above and if true, function
returns. Which means this can never be false.
Signed-off-by: Rosen Penev <[email protected]>
Reviewed-by: Oleksij Rempel <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Currently, the of variant is missing reset_control_put in error paths.
The devm variant does not require it.
Allows removing mdio_reset from the struct as it is not used outside the
function.
Signed-off-by: Rosen Penev <[email protected]>
Reviewed-by: Oleksij Rempel <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Allows simplifying get_strings and avoids manual pointer manipulation.
Signed-off-by: Rosen Penev <[email protected]>
Reviewed-by: Oleksij Rempel <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Taken from QCA SDK. No functional difference as same bits get applied.
Signed-off-by: Rosen Penev <[email protected]>
Reviewed-by: Oleksij Rempel <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Now that COMPILE_TEST is enabled, it gets flagged when building with
allmodconfig W=1 builds. Text taken from the beginning of the file.
Signed-off-by: Rosen Penev <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
While this driver is meant for MIPS only, it can be compiled on x86 just
fine. Remove pointless parentheses while at it.
Enables CI building of this driver.
Signed-off-by: Rosen Penev <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Kuniyuki Iwashima says:
====================
af_unix: Correct manage_oob() when OOB follows a consumed OOB.
Recently syzkaller reported UAF of OOB skb.
The bug was introduced by commit 93c99f21db36 ("af_unix: Don't stop
recv(MSG_DONTWAIT) if consumed OOB skb is at the head.") but uncovered
by another recent commit 8594d9b85c07 ("af_unix: Don't call skb_get()
for OOB skb.").
[0]: https://lore.kernel.org/netdev/[email protected]/
====================
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
syzbot reported use-after-free in unix_stream_recv_urg(). [0]
The scenario is
1. send(MSG_OOB)
2. recv(MSG_OOB)
-> The consumed OOB remains in recv queue
3. send(MSG_OOB)
4. recv()
-> manage_oob() returns the next skb of the consumed OOB
-> This is also OOB, but unix_sk(sk)->oob_skb is not cleared
5. recv(MSG_OOB)
-> unix_sk(sk)->oob_skb is used but already freed
The recent commit 8594d9b85c07 ("af_unix: Don't call skb_get() for OOB
skb.") uncovered the issue.
If the OOB skb is consumed and the next skb is peeked in manage_oob(),
we still need to check if the skb is OOB.
Let's do so by falling back to the following checks in manage_oob()
and add the test case in selftest.
Note that we need to add a similar check for SIOCATMARK.
[0]:
BUG: KASAN: slab-use-after-free in unix_stream_read_actor+0xa6/0xb0 net/unix/af_unix.c:2959
Read of size 4 at addr ffff8880326abcc4 by task syz-executor178/5235
CPU: 0 UID: 0 PID: 5235 Comm: syz-executor178 Not tainted 6.11.0-rc5-syzkaller-00742-gfbdaffe41adc #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:93 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
unix_stream_read_actor+0xa6/0xb0 net/unix/af_unix.c:2959
unix_stream_recv_urg+0x1df/0x320 net/unix/af_unix.c:2640
unix_stream_read_generic+0x2456/0x2520 net/unix/af_unix.c:2778
unix_stream_recvmsg+0x22b/0x2c0 net/unix/af_unix.c:2996
sock_recvmsg_nosec net/socket.c:1046 [inline]
sock_recvmsg+0x22f/0x280 net/socket.c:1068
____sys_recvmsg+0x1db/0x470 net/socket.c:2816
___sys_recvmsg net/socket.c:2858 [inline]
__sys_recvmsg+0x2f0/0x3e0 net/socket.c:2888
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f5360d6b4e9
Code: 48 83 c4 28 c3 e8 37 17 00 00 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fff29b3a458 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
RAX: ffffffffffffffda RBX: 00007fff29b3a638 RCX: 00007f5360d6b4e9
RDX: 0000000000002001 RSI: 0000000020000640 RDI: 0000000000000003
RBP: 00007f5360dde610 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007fff29b3a628 R14: 0000000000000001 R15: 0000000000000001
</TASK>
Allocated by task 5235:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
unpoison_slab_object mm/kasan/common.c:312 [inline]
__kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:338
kasan_slab_alloc include/linux/kasan.h:201 [inline]
slab_post_alloc_hook mm/slub.c:3988 [inline]
slab_alloc_node mm/slub.c:4037 [inline]
kmem_cache_alloc_node_noprof+0x16b/0x320 mm/slub.c:4080
__alloc_skb+0x1c3/0x440 net/core/skbuff.c:667
alloc_skb include/linux/skbuff.h:1320 [inline]
alloc_skb_with_frags+0xc3/0x770 net/core/skbuff.c:6528
sock_alloc_send_pskb+0x91a/0xa60 net/core/sock.c:2815
sock_alloc_send_skb include/net/sock.h:1778 [inline]
queue_oob+0x108/0x680 net/unix/af_unix.c:2198
unix_stream_sendmsg+0xd24/0xf80 net/unix/af_unix.c:2351
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
____sys_sendmsg+0x525/0x7d0 net/socket.c:2597
___sys_sendmsg net/socket.c:2651 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2680
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Freed by task 5235:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
poison_slab_object+0xe0/0x150 mm/kasan/common.c:240
__kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
kasan_slab_free include/linux/kasan.h:184 [inline]
slab_free_hook mm/slub.c:2252 [inline]
slab_free mm/slub.c:4473 [inline]
kmem_cache_free+0x145/0x350 mm/slub.c:4548
unix_stream_read_generic+0x1ef6/0x2520 net/unix/af_unix.c:2917
unix_stream_recvmsg+0x22b/0x2c0 net/unix/af_unix.c:2996
sock_recvmsg_nosec net/socket.c:1046 [inline]
sock_recvmsg+0x22f/0x280 net/socket.c:1068
__sys_recvfrom+0x256/0x3e0 net/socket.c:2255
__do_sys_recvfrom net/socket.c:2273 [inline]
__se_sys_recvfrom net/socket.c:2269 [inline]
__x64_sys_recvfrom+0xde/0x100 net/socket.c:2269
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The buggy address belongs to the object at ffff8880326abc80
which belongs to the cache skbuff_head_cache of size 240
The buggy address is located 68 bytes inside of
freed 240-byte region [ffff8880326abc80, ffff8880326abd70)
The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x326ab
ksm flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xfdffffff(slab)
raw: 00fff00000000000 ffff88801eaee780 ffffea0000b7dc80 dead000000000003
raw: 0000000000000000 00000000800c000c 00000001fdffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x52cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 4686, tgid 4686 (udevadm), ts 32357469485, free_ts 28829011109
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1493
prep_new_page mm/page_alloc.c:1501 [inline]
get_page_from_freelist+0x2e4c/0x2f10 mm/page_alloc.c:3439
__alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4695
__alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
alloc_slab_page+0x5f/0x120 mm/slub.c:2321
allocate_slab+0x5a/0x2f0 mm/slub.c:2484
new_slab mm/slub.c:2537 [inline]
___slab_alloc+0xcd1/0x14b0 mm/slub.c:3723
__slab_alloc+0x58/0xa0 mm/slub.c:3813
__slab_alloc_node mm/slub.c:3866 [inline]
slab_alloc_node mm/slub.c:4025 [inline]
kmem_cache_alloc_node_noprof+0x1fe/0x320 mm/slub.c:4080
__alloc_skb+0x1c3/0x440 net/core/skbuff.c:667
alloc_skb include/linux/skbuff.h:1320 [inline]
alloc_uevent_skb+0x74/0x230 lib/kobject_uevent.c:289
uevent_net_broadcast_untagged lib/kobject_uevent.c:326 [inline]
kobject_uevent_net_broadcast+0x2fd/0x580 lib/kobject_uevent.c:410
kobject_uevent_env+0x57d/0x8e0 lib/kobject_uevent.c:608
kobject_synth_uevent+0x4ef/0xae0 lib/kobject_uevent.c:207
uevent_store+0x4b/0x70 drivers/base/bus.c:633
kernfs_fop_write_iter+0x3a1/0x500 fs/kernfs/file.c:334
new_sync_write fs/read_write.c:497 [inline]
vfs_write+0xa72/0xc90 fs/read_write.c:590
page last free pid 1 tgid 1 stack trace:
reset_page_owner include/linux/page_owner.h:25 [inline]
free_pages_prepare mm/page_alloc.c:1094 [inline]
free_unref_page+0xd22/0xea0 mm/page_alloc.c:2612
kasan_depopulate_vmalloc_pte+0x74/0x90 mm/kasan/shadow.c:408
apply_to_pte_range mm/memory.c:2797 [inline]
apply_to_pmd_range mm/memory.c:2841 [inline]
apply_to_pud_range mm/memory.c:2877 [inline]
apply_to_p4d_range mm/memory.c:2913 [inline]
__apply_to_page_range+0x8a8/0xe50 mm/memory.c:2947
kasan_release_vmalloc+0x9a/0xb0 mm/kasan/shadow.c:525
purge_vmap_node+0x3e3/0x770 mm/vmalloc.c:2208
__purge_vmap_area_lazy+0x708/0xae0 mm/vmalloc.c:2290
_vm_unmap_aliases+0x79d/0x840 mm/vmalloc.c:2885
change_page_attr_set_clr+0x2fe/0xdb0 arch/x86/mm/pat/set_memory.c:1881
change_page_attr_set arch/x86/mm/pat/set_memory.c:1922 [inline]
set_memory_nx+0xf2/0x130 arch/x86/mm/pat/set_memory.c:2110
free_init_pages arch/x86/mm/init.c:924 [inline]
free_kernel_image_pages arch/x86/mm/init.c:943 [inline]
free_initmem+0x79/0x110 arch/x86/mm/init.c:970
kernel_init+0x31/0x2b0 init/main.c:1476
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
Memory state around the buggy address:
ffff8880326abb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8880326abc00: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
>ffff8880326abc80: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8880326abd00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc
ffff8880326abd80: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
Fixes: 93c99f21db36 ("af_unix: Don't stop recv(MSG_DONTWAIT) if consumed OOB skb is at the head.")
Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=8811381d455e3e9ec788
Signed-off-by: Kuniyuki Iwashima <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
When OOB skb has been already consumed, manage_oob() returns the next
skb if exists. In such a case, we need to fall back to the else branch
below.
Then, we want to keep holding spin_lock(&sk->sk_receive_queue.lock).
Let's move it out of if-else branch and add lightweight check before
spin_lock() for major use cases without OOB skb.
Signed-off-by: Kuniyuki Iwashima <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
When OOB skb has been already consumed, manage_oob() returns the next
skb if exists. In such a case, we need to fall back to the else branch
below.
Then, we need to keep two skbs and free them later with consume_skb()
and kfree_skb().
Let's rename unlinked_skb accordingly.
Signed-off-by: Kuniyuki Iwashima <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
This is a prep for the later fix.
No functional change intended.
Signed-off-by: Kuniyuki Iwashima <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:
====================
pull-request: can-next 2024-09-09
The first patch is by Rob Herring and simplifies the DT parsing in the
cc770 driver.
The next 2 patches both target the rockchip_canfd driver added in the
last pull request to net-next. The first one is by Nathan Chancellor
and fixes the return type of rkcanfd_start_xmit(), the second one is
by me and fixes a 64 bit division on 32 bit platforms in
rkcanfd_timestamp_init().
* tag 'linux-can-next-for-6.12-20240909' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
can: rockchip_canfd: rkcanfd_timestamp_init(): fix 64 bit division on 32 bit platforms
can: rockchip_canfd: fix return type of rkcanfd_start_xmit()
net: can: cc770: Simplify parsing DT properties
====================
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
dev_pick_tx_cpu_id() has been introduced with two users by
commit a4ea8a3dacc3 ("net: Add generic ndo_select_queue functions").
The use in AF_PACKET has been removed in 2019 by
commit b71b5837f871 ("packet: rework packet_pick_tx_queue() to use common code selection")
The other user was a Netlogic XLP driver, removed in 2021 by
commit 47ac6f567c28 ("staging: Remove Netlogic XLP network driver").
It's relatively unlikely that any modern driver will need an
.ndo_select_queue implementation which picks purely based on CPU ID
and skips XPS, delete dev_pick_tx_cpu_id()
Found by code inspection.
Reviewed-by: Eric Dumazet <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Matthieu Baerts says:
====================
selftests: mptcp: add time per subtests in TAP output
Patches here add 'time=<N>ms' in the diagnostic data of the TAP output,
e.g.
ok 1 - pm_netlink: defaults addr list # time=9ms
This addition is useful to quickly identify which subtests are taking a
longer time than the others, or more than expected.
Note that there are no specific formats to follow to show this time
according to the TAP 13, TAP 14 and KTAP specifications, but we follow
the format being parsed by NIPA [1].
Patch 1 modifies mptcp_lib.sh to add this support to all MPTCP
selftests.
Patch 2 removes the now duplicated info in mptcp_connect.sh
Patch 3 slightly improves the precision of the first subtests in all
MPTCP subtests.
Patches 4 and 5 remove duplicated spaces in TAP output, for the TAP
parsers that cannot handle them properly.
v1: https://lore.kernel.org/20240902-net-next-mptcp-ksft-subtest-time-v1-0-f1ed499a11b1@kernel.org
Link: https://github.com/linux-netdev/nipa/pull/36
====================
Link: https://patch.msgid.link/20240906-net-next-mptcp-ksft-subtest-time-v2-0-31d5ee4f3bdf@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
|