Age | Commit message (Collapse) | Author | Files | Lines |
|
Convert fsl-fman from txt to yaml format and split it fsl,fman.yam,
fsl,fman-port.yaml, fsl-muram.yaml, fsl-mdio.yaml.
Addition changes:
fsl,fman.yaml:
- Fixed interrupts in example.
- Fixed ethernet@e8000 miss } in example.
- ptp-timer add label in example.
- Ref to new fsl,fman*.yaml.
- Reorder property in example.
- Keep only one example.
- Add const for #address-cells and #size-cells.
- Use defined interrupt type.
- ptp example use node name phc.
fsl,fman-port:
- Keep only one example.
fsl,fman-mdio:
- Add little-endian property.
- Add ref to mdio.yaml.
- Remove suppress-preamble.
- Add #address-cells and #size-cells in example.
- Remove clock-frequency, which already describe in mmio.yaml.
fsl,muram.yaml:
- Add reg property.
- Remove range property.
- Use reg instead of range in example.
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20240618-ls_fman-v2-2-f00a82623d8e@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Convert ptp-qoirq from txt to yaml format.
Additional change:
- Fixed example interrupts proptery. Need only 1 irq by check MPC8313 spec.
- Move Reference clock context under clk,sel.
- Interrupts is not required property.
- Use low case for hex value.
- Check reference manual of MPC8313, p1010 and so on, which dts use more
than 1 irqs. Only 1 irq for each ptp device. Check driver code
(drivers/ptp/ptp_qoriq.c) and only 1 irq used. So original description is
wrong.
- Remove comments for compatible string.
Signed-off-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20240618-ls_fman-v2-1-f00a82623d8e@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Embedding net_device into structures prohibits the usage of flexible
arrays in the net_device structure. For more details, see the discussion
at [1].
Un-embed the net_devices from struct mt76_dev by converting them
into pointers, and allocating them dynamically. Use the leverage
alloc_netdev_dummy() to allocate the net_device object at
mt76_dma_init().
The free of the device occurs at mt76_dma_cleanup().
Link: https://lore.kernel.org/all/20240229225910.79e224cf@kernel.org/ [1]
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Acked-by: Kalle Valo <kvalo@kernel.org>
Link: https://patch.msgid.link/20240619105311.3144908-1-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
These functions are only used within the compilation unit they're defined
in so there's no reason to export them.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240619140119.26777-1-brgl@bgdev.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
These drivers don't use the driver_data member of struct i2c_device_id,
so don't explicitly initialize this member.
This prepares putting driver_data in an anonymous union which requires
either no initialization or named designators. But it's also a nice
cleanup on its own.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://patch.msgid.link/20240619195631.2545407-2-u.kleine-koenig@baylibre.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
drivers/net/ethernet/broadcom/bnxt/bnxt.c
1e7962114c10 ("bnxt_en: Restore PTP tx_avail count in case of skb_pad() error")
165f87691a89 ("bnxt_en: add timestamping statistics support")
No adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from wireless, bpf and netfilter.
Happy summer solstice! The line count is a bit inflated by a selftest
and update to a driver's FW interface header, in reality this is
slightly below average for us. We are expecting one driver fix from
Intel, but there are no big known issues.
Current release - regressions:
- ipv6: bring NLM_DONE out to a separate recv() again
Current release - new code bugs:
- wifi: cfg80211: wext: set ssids=NULL for passive scans via old wext API
Previous releases - regressions:
- wifi: mac80211: fix monitor channel setting with chanctx emulation
(probably most awaited of the fixes in this PR, tracked by Thorsten)
- usb: ax88179_178a: bring back reset on init, if PHY is disconnected
- bpf: fix UML x86_64 compile failure with BPF
- bpf: avoid splat in pskb_pull_reason(), sanity check added can be hit
with malicious BPF
- eth: mvpp2: use slab_build_skb() for packets in slab, driver was
missed during API refactoring
- wifi: iwlwifi: add missing unlock of mvm mutex
Previous releases - always broken:
- ipv6: add a number of missing null-checks for in6_dev_get(), in case
IPv6 disabling races with the datapath
- bpf: fix reg_set_min_max corruption of fake_reg
- sched: act_ct: add netns as part of the key of tcf_ct_flow_table"
* tag 'net-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (63 commits)
net: usb: rtl8150 fix unintiatilzed variables in rtl8150_get_link_ksettings
selftests: virtio_net: add forgotten config options
bnxt_en: Restore PTP tx_avail count in case of skb_pad() error
bnxt_en: Set TSO max segs on devices with limits
bnxt_en: Update firmware interface to 1.10.3.44
net: stmmac: Assign configured channel value to EXTTS event
net: do not leave a dangling sk pointer, when socket creation fails
net/tcp_ao: Don't leak ao_info on error-path
ice: Fix VSI list rule with ICE_SW_LKUP_LAST type
ipv6: bring NLM_DONE out to a separate recv() again
selftests: add selftest for the SRv6 End.DX6 behavior with netfilter
selftests: add selftest for the SRv6 End.DX4 behavior with netfilter
netfilter: move the sysctl nf_hooks_lwtunnel into the netfilter core
seg6: fix parameter passing when calling NF_HOOK() in End.DX4 and End.DX6 behaviors
netfilter: ipset: Fix suspicious rcu_dereference_protected()
selftests: openvswitch: Set value to nla flags.
octeontx2-pf: Fix linking objects into multiple modules
octeontx2-pf: Add error handling to VLAN unoffload handling
virtio_net: fixing XDP for fully checksummed packets handling
virtio_net: checksum offloading handling fix
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Lots of small HD-audio quirks and fixes (mostly Realtek codec and
Cirrus stuff).
Also a small MIDI 2.0 fix and a fix for missing module description
are included"
* tag 'sound-6.10-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda: cs35l56: Select SERIAL_MULTI_INSTANTIATE
ALSA: hda/realtek: Add more codec ID to no shutup pins list
sound/oss/dmasound: add missing MODULE_DESCRIPTION() macro
ALSA: hda/realtek: Add quirk for Lenovo Yoga Pro 7 14ARP8
ALSA: hda/realtek: Enable headset mic on IdeaPad 330-17IKB 81DM
ALSA: hda: tas2781: Component should be unbound before deconstruction
ALSA: hda: cs35l41: Component should be unbound before deconstruction
ALSA: hda: cs35l56: Component should be unbound before deconstruction
ALSA/hda: intel-dsp-config: Document AVS as dsp_driver option
ALSA: hda/realtek: Support Lenovo Thinkbook 13x Gen 4
ALSA: hda/realtek: Support Lenovo Thinkbook 16P Gen 5
ALSA: hda: cs35l41: Support Lenovo Thinkbook 13x Gen 4
ALSA: hda: cs35l41: Support Lenovo Thinkbook 16P Gen 5
ALSA: hda/realtek: Remove Framework Laptop 16 from quirks
ALSA: hda/realtek: Limit mic boost on N14AP7
ALSA: hda/realtek: fix mute/micmute LEDs don't work for ProBook 445/465 G11.
ALSA: seq: ump: Fix missing System Reset message handling
ALSA: hda: cs35l41: Possible null pointer dereference in cs35l41_hda_unbind()
ALSA: hda: cs35l56: Fix lifecycle of codec pointer
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd
Pull mfd fix from Lee Jones:
- Fix AXP717 PMIC probe and by extension its consumers
* tag 'mfd-fixes-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd:
mfd: axp20x: AXP717: Fix missing IRQ status registers range
|
|
This functions retrieves values by passing a pointer. As the function
that retrieves them can fail before touching the pointers, the variables
must be initialized.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+5186630949e3c55f0799@syzkaller.appspotmail.com
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Link: https://lore.kernel.org/r/20240619132816.11526-1-oneukum@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
One may use tools/testing/selftests/drivers/net/virtio_net/config
for example for vng build command like this one:
$ vng -v -b -f tools/testing/selftests/drivers/net/virtio_net/config
In that case, the needed kernel config options are not turned on.
Add the missed kernel config options.
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20240617072614.75fe79e7@kernel.org/
Reported-by: Matthieu Baerts <matttbe@kernel.org>
Closes: https://lore.kernel.org/netdev/1a63f209-b1d4-4809-bc30-295a5cafa296@kernel.org/
Fixes: ccfaed04db5e ("selftests: virtio_net: add initial tests")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20240619061748.1869404-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Michael Chan says:
====================
bnxt_en: Bug fixes for net
The first firmware interface update is needed by the second patch to
limit the number of TSO segments on the 5760X chips. The third patch
fixes the TX error path for PTP packets.
====================
Link: https://lore.kernel.org/r/20240618215313.29631-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The current code only restores PTP tx_avail count when we get DMA
mapping errors. Fix it so that the PTP tx_avail count will be
restored for both DMA mapping errors and skb_pad() errors.
Otherwise PTP TX timestamp will not be available after a PTP
packet hits the skb_pad() error.
Fixes: 83bb623c968e ("bnxt_en: Transmit and retrieve packet timestamps")
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240618215313.29631-4-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Firmware will now advertise a non-zero TSO max segments if the
device has a limit. 0 means no limit. The latest 5760X chip
(early revs) has a limit of 2047 that cannot be exceeded. If
exceeded, the chip will send out just a small number of segments.
Call netif_set_tso_max_segs() if the device has a limit.
Fixes: 2012a6abc876 ("bnxt_en: Add 5760X (P7) PCI IDs")
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20240618215313.29631-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The relevant change is the max_tso_segs value returned by firmware
in the HWRM_FUNC_QCAPS response. This value will be used in the next
patch to cap the TSO segments.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20240618215313.29631-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
To facilitate running PHY parametric tests, add support for the SIOCSMIIREG
ioctl. This allows a userspace application to write to the PHY registers
to enable the test modes.
Signed-off-by: Jackie Jone <jackie.jone@alliedtelesis.co.nz>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20240618213330.982046-1-anthony.l.nguyen@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Jiawen Wu says:
====================
add flow director for txgbe
Add flow director support for Wangxun 10Gb NICs.
v2 -> v3: https://lore.kernel.org/all/20240605020852.24144-1-jiawenwu@trustnetic.com/
- Wrap the code at 80 chars where possible. (Jakub Kicinski)
- Add function description address on kernel-doc. (Jakub Kicinski)
- Correct return code. (Simon Horman)
- Remove redundant size check. (Hariprasad Kelam)
v1 -> v2: https://lore.kernel.org/all/20240529093821.27108-1-jiawenwu@trustnetic.com/
- Fix build warnings reported by kernel test robot.
====================
Link: https://lore.kernel.org/r/20240618101609.3580-1-jiawenwu@trustnetic.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add flow director filter match and miss statistics to ethtool -S.
And change the number of queues when using flow director for ehtool -l.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Support the addition and deletion of Flow Director filters.
Supported fields: src-ip, dst-ip, src-port, dst-port
Supported flow-types: tcp4, udp4, sctp4, ipv4
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add flow director ATR filter. ATR mode is enabled by default to filter
TCP packets.
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Assign the configured channel value to the EXTTS event in the timestamp
interrupt handler. Without assigning the correct channel, applications
like ts2phc will refuse to accept the event, resulting in errors such
as:
...
ts2phc[656.834]: config item end1.ts2phc.pin_index is 0
ts2phc[656.834]: config item end1.ts2phc.channel is 3
ts2phc[656.834]: config item end1.ts2phc.extts_polarity is 2
ts2phc[656.834]: config item end1.ts2phc.extts_correction is 0
...
ts2phc[656.862]: extts on unexpected channel
ts2phc[658.141]: extts on unexpected channel
ts2phc[659.140]: extts on unexpected channel
Fixes: f4da56529da60 ("net: stmmac: Add support for external trigger timestamping")
Cc: stable@vger.kernel.org
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Link: https://lore.kernel.org/r/20240618073821.619751-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
Patch #1 fixes the suspicious RCU usage warning that resulted from the
recent fix for the race between namespace cleanup and gc in
ipset left out checking the pernet exit phase when calling
rcu_dereference_protected(), from Jozsef Kadlecsik.
Patch #2 fixes incorrect input and output netdevice in SRv6 prerouting
hooks, from Jianguo Wu.
Patch #3 moves nf_hooks_lwtunnel sysctl toggle to the netfilter core.
The connection tracking system is loaded on-demand, this
ensures availability of this knob regardless.
Patch #4-#5 adds selftests for SRv6 netfilter hooks also from Jianguo Wu.
netfilter pull request 24-06-19
* tag 'nf-24-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
selftests: add selftest for the SRv6 End.DX6 behavior with netfilter
selftests: add selftest for the SRv6 End.DX4 behavior with netfilter
netfilter: move the sysctl nf_hooks_lwtunnel into the netfilter core
seg6: fix parameter passing when calling NF_HOOK() in End.DX4 and End.DX6 behaviors
netfilter: ipset: Fix suspicious rcu_dereference_protected()
====================
Link: https://lore.kernel.org/r/20240619170537.2846-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
It is possible to trigger a use-after-free by:
* attaching an fentry probe to __sock_release() and the probe calling the
bpf_get_socket_cookie() helper
* running traceroute -I 1.1.1.1 on a freshly booted VM
A KASAN enabled kernel will log something like below (decoded and stripped):
==================================================================
BUG: KASAN: slab-use-after-free in __sock_gen_cookie (./arch/x86/include/asm/atomic64_64.h:15 ./include/linux/atomic/atomic-arch-fallback.h:2583 ./include/linux/atomic/atomic-instrumented.h:1611 net/core/sock_diag.c:29)
Read of size 8 at addr ffff888007110dd8 by task traceroute/299
CPU: 2 PID: 299 Comm: traceroute Tainted: G E 6.10.0-rc2+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl (lib/dump_stack.c:117 (discriminator 1))
print_report (mm/kasan/report.c:378 mm/kasan/report.c:488)
? __sock_gen_cookie (./arch/x86/include/asm/atomic64_64.h:15 ./include/linux/atomic/atomic-arch-fallback.h:2583 ./include/linux/atomic/atomic-instrumented.h:1611 net/core/sock_diag.c:29)
kasan_report (mm/kasan/report.c:603)
? __sock_gen_cookie (./arch/x86/include/asm/atomic64_64.h:15 ./include/linux/atomic/atomic-arch-fallback.h:2583 ./include/linux/atomic/atomic-instrumented.h:1611 net/core/sock_diag.c:29)
kasan_check_range (mm/kasan/generic.c:183 mm/kasan/generic.c:189)
__sock_gen_cookie (./arch/x86/include/asm/atomic64_64.h:15 ./include/linux/atomic/atomic-arch-fallback.h:2583 ./include/linux/atomic/atomic-instrumented.h:1611 net/core/sock_diag.c:29)
bpf_get_socket_ptr_cookie (./arch/x86/include/asm/preempt.h:94 ./include/linux/sock_diag.h:42 net/core/filter.c:5094 net/core/filter.c:5092)
bpf_prog_875642cf11f1d139___sock_release+0x6e/0x8e
bpf_trampoline_6442506592+0x47/0xaf
__sock_release (net/socket.c:652)
__sock_create (net/socket.c:1601)
...
Allocated by task 299 on cpu 2 at 78.328492s:
kasan_save_stack (mm/kasan/common.c:48)
kasan_save_track (mm/kasan/common.c:68)
__kasan_slab_alloc (mm/kasan/common.c:312 mm/kasan/common.c:338)
kmem_cache_alloc_noprof (mm/slub.c:3941 mm/slub.c:4000 mm/slub.c:4007)
sk_prot_alloc (net/core/sock.c:2075)
sk_alloc (net/core/sock.c:2134)
inet_create (net/ipv4/af_inet.c:327 net/ipv4/af_inet.c:252)
__sock_create (net/socket.c:1572)
__sys_socket (net/socket.c:1660 net/socket.c:1644 net/socket.c:1706)
__x64_sys_socket (net/socket.c:1718)
do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
Freed by task 299 on cpu 2 at 78.328502s:
kasan_save_stack (mm/kasan/common.c:48)
kasan_save_track (mm/kasan/common.c:68)
kasan_save_free_info (mm/kasan/generic.c:582)
poison_slab_object (mm/kasan/common.c:242)
__kasan_slab_free (mm/kasan/common.c:256)
kmem_cache_free (mm/slub.c:4437 mm/slub.c:4511)
__sk_destruct (net/core/sock.c:2117 net/core/sock.c:2208)
inet_create (net/ipv4/af_inet.c:397 net/ipv4/af_inet.c:252)
__sock_create (net/socket.c:1572)
__sys_socket (net/socket.c:1660 net/socket.c:1644 net/socket.c:1706)
__x64_sys_socket (net/socket.c:1718)
do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
Fix this by clearing the struct socket reference in sk_common_release() to cover
all protocol families create functions, which may already attached the
reference to the sk object with sock_init_data().
Fixes: c5dbb89fc2ac ("bpf: Expose bpf_get_socket_cookie to tracing programs")
Suggested-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/netdev/20240613194047.36478-1-kuniyu@amazon.com/T/
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: D. Wythe <alibuda@linux.alibaba.com>
Link: https://lore.kernel.org/r/20240617210205.67311-1-ignat@cloudflare.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The ACPI IDs used in the CS35L56 HDA drivers are all handled by the
serial multi-instantiate driver which starts multiple Linux device
instances from a single ACPI Device() node.
As serial multi-instantiate is not an optional part of the system add it
as a dependency in Kconfig so that it is not overlooked.
Signed-off-by: Simon Trimmer <simont@opensource.cirrus.com>
Link: https://lore.kernel.org/20240619161602.117452-1-simont@opensource.cirrus.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
Shannon Nelson says:
====================
ionic: rework fix for doorbell miss
A latency test in a scaled out setting (many VMs with many queues)
has uncovered an issue with our missed doorbell fix from
commit b69585bfcece ("ionic: missed doorbell workaround")
As a refresher, the Elba ASIC has an issue where once in a blue
moon it might miss/drop a queue doorbell notification from
the driver. This can result in Tx timeouts and potential Rx
buffer misses.
The basic problem with the original solution is that
we're delaying things with a timer for every single queue,
periodically using mod_timer() to reset to reset the alarm, and
mod_timer() becomes a more and more expensive thing as there
are more and more VFs and queues each with their own timer.
A ping-pong latency test tends to exacerbate the effect such
that every napi is doing a mod_timer() in every cycle.
An alternative has been worked out to replace this using
periodic workqueue items outside the napi cycle to request a
napi_schedule driven by a single delayed-workqueue per device
rather than a timer for every queue. Also, now that newer
firmware is actually reporting its ASIC type, we can restrict
this to the appropriate chip.
The testing scenario used 128 VFs in UP state, 16 queues per
VF, and latency tests were done using TCP_RR with adaptive
interrupt coalescing enabled, running on 1 VF. We would see
99th percentile latencies of up to 900us range, with some max
fliers as much as 4ms.
With these fixes the 99th percentile latencies are typically well
under 50us with the occasional max under 500us.
v1:
https://lore.kernel.org/netdev/20240610230706.34883-1-shannon.nelson@amd.com/
====================
Link: https://lore.kernel.org/r/20240619003257.6138-1-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If the doorbell workaround isn't required for a certain
asic_type then there is no need to run the associated
code. Since newer FW versions are finally reporting their
asic_type we can use a flag to determine whether or not to
do the workaround.
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240619003257.6138-9-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We only support (u16)-1 size for rx_copybreak, so we can reduce the
field size and move a couple other fields around to save a little
space in the ionic_lif struct.
Before:
/* size: 17440, cachelines: 273, members: 56 */
/* sum members: 17403, holes: 9, sum holes: 37 */
/* last cacheline: 32 bytes */
After:
/* size: 17424, cachelines: 273, members: 56 */
/* sum members: 17401, holes: 7, sum holes: 23 */
/* last cacheline: 16 bytes */
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240619003257.6138-8-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Check the deadline against the last time run and only
schedule a new napi if we haven't been run recently.
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240619003257.6138-7-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a work item for each queue that will be run on the queue's
preferred cpu and will schedule another napi. This napi is
run in case the device missed a doorbell and didn't process
a packet. This is a problem for the Elba asic that happens
very rarely.
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240619003257.6138-6-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add the first queued work for checking on the missed doorbell.
This is a delayed work item that reschedules itself every cycle
starting at probe.
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240619003257.6138-5-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Instead of using the system's default workqueue, add a private
workqueue for the device to use for its little jobs. This is
to better support the new work items we will be adding in the
next patches for PF and VF specific jobs, without inundating
the system workqueue in a couple of customer cases where our
devices get scaled out to 100-200 VFs.
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240619003257.6138-4-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently the driver either sets the initial interrupt affinity for its
adminq and tx/rx queues on probe or resets it on various
down/up/reconfigure flows. If any user and/or user process
(i.e. irqbalance) changes IRQ affinity for any of the driver's interrupts
that will be reset to driver defaults whenever any down/up/reconfigure
operation happens. This is incorrect and is fixed by making 2 changes:
1. Allocate an array of cpumasks that's only allocated on probe and
destroyed on remove.
2. Update the cpumask(s) for interrupts that are in use by registering
for affinity notifiers.
Signed-off-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240619003257.6138-3-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove the timer-per-queue mechanics from the missed doorbell
check in preparation for the new missed doorbell fix.
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240619003257.6138-2-shannon.nelson@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Petr Machata says:
====================
mlxsw: Use page pool for Rx buffers allocation
Amit Cohen writes:
After using NAPI to process events from hardware, the next step is to
use page pool for Rx buffers allocation, which is also enhances
performance.
To simplify this change, first use page pool to allocate one continuous
buffer for each packet, later memory consumption can be improved by using
fragmented buffers.
This set significantly enhances mlxsw driver performance, CPU can handle
about 370% of the packets per second it previously handled.
The next planned improvement is using XDP to optimize telemetry.
Patch set overview:
Patches #1-#2 are small preparations for page pool usage
Patch #3 initializes page pool, but do not use it
Patch #4 converts the driver to use page pool for buffers allocations
Patch #5 is an optimization for buffer access
Patch #6 cleans up an unused structure
Patch #7 uses napi_consume_skb() as part of Tx completion
====================
Link: https://lore.kernel.org/r/cover.1718709196.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, as part of Tx completion, the driver calls dev_kfree_skb_any()
to free the SKB. For this flow, the correct function is napi_consume_skb().
This function and dev_consume_skb_any() were added to be used for consumed
SKBs, which were not dropped, so the skb:kfree_skb tracepoint is not
triggered, and we can get better diagnostics about dropped packets.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/a9f9f3dc884c0d1be4bd4c9d72030c88c7ac004f.1718709196.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The previous patch used page pool to allocate buffers for RDQ. With this
change, 'elem_info->u.rdq.skb' is not used anymore, as we do not allocate
SKB before getting the packet, we hold page pointer and build the SKB
around it once packet is received.
Remove the union and store SKB pointer for SDQ only.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/23a531008936dc9a1a298643fb1e4f9a7b8e6eb3.1718709196.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Before accessing data buffer, call net_prefetch() to load it into the
cache. This change improves driver performance, CPU can handle about
7.1% more packets per second.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/1fa07c510890866a6f201163ab7e78890ba28b3b.1718709196.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As part of driver init, all Rx queues are filled with buffers for
hardware usage. Later, when a packet is received, a new buffer should be
allocated to be used by hardware instead of the received buffer.
Packet's processing time includes allocation time, which can be improved
using page pool.
Using page pool, DMA mapping is done only for first allocation of buffers.
As subsequent buffers allocation avoid DMA mapping, it results in
performance improvement. The purpose of page pool is to allocate pages fast
from cache without locking. This lockless guarantee naturally comes from
running under a NAPI.
Use page pool to allocate the data buffer only, so hardware will use it to
fill the packet. At completion time, attach the data buffer (now filled
with packet payload) to new SKB which is allocated around the received
buffer. SKB building at completion time prevents cache miss for each
packet, as now the SKB is allocated right before packets will be handled by
networking stack.
Page pool for each Rx queue enhances Rx side performance by reclaiming
buffers back to each queue specific pool. This change significantly
improves driver performance, CPU can handle about 345% of the packets per
second it previously handled.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/1cf788a8f43c70aae6d526018ef77becb27ad6d3.1718709196.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Next patch will use page pool to allocate buffers for RDQ. Initialize
page pool for each CQ, which is mapped 1:1 to RDQ. Page pool for each Rx
queue enhances Rx side performance by reclaiming buffers back to each queue
specific pool.
When only one NAPI instance is the consumer of pages from page pool, it is
recommended to pass it as part of 'page_pool_params', then page pool APIs
will be done without special locks. mlxsw driver holds NAPI instance per
CQ, so add page pool per CQ and use the existing NAPI instance.
For now, pages are not allocated from the pool, next patch will use it.
Some notes regarding 'page_pool_params':
* Use PP_FLAG_DMA_MAP to allow page pool handles DMA mapping, for now
do not use sync flag, as only the device writes to this memory and we
read it only when it finishes writing there. This will probably be
changed when we will support XDP.
* Define 'order' according to maximum MTU and take into account software
overhead. Some round up are done, which means that we allocate more pages
than we really need. This can be improved later by using fragmented
buffers.
* Use pool_size = MLXSW_PCI_WQE_COUNT. This will be the size of 'ptr_ring',
and should be the maximum amount of packets that page pool will allocate
memory for. In our case, this is the queue size, defined as
MLXSW_PCI_WQE_COUNT.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/02e5856ae7c572d4293ce6bb92c286ee6cfec800.1718709196.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Next patches will add support for page pool in mlxsw driver. Page pool will
be used to allocate buffers for RDQ and will use NAPI instance of the
appropriate CQ (RDQ is mapped 1:1 to CQ).
To allow pool initialization as part of CQ init, when NAPI is initialized,
page_pool structure will be as part of CQ structure. Later, the allocations
for RDQ will be done from the pool in the appropriate CQ. To allow access
to the appropriate pool, set CQ pointer as part of RDQ initialization.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/d60918ca1e142a554af1df9c1152cdac83854a3b.1718709196.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
mlxsw_pci_cq_napi_setup() includes both NAPI initialization and
enablement, similar to teardown function. Next patches will add support
for page pool in mlxsw driver, then we use NAPI instance for page pool.
Page pool initialization should be done before NAPI enablement, same for
page pool destruction which should be done after NAPI disablement.
As preparation, split NAPI setup/teardown into two steps, then page pool
setup will be done between the phases.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/8dbf37e859f07247498fca17109b8858ff2b0498.1718709196.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This change just removes extra (i.e. not needed) white space in
prp_drop_frame() function.
No functional changes.
Signed-off-by: Lukasz Majewski <lukma@denx.de>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240618125817.1111070-1-lukma@denx.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
It seems I introduced it together with TCP_AO_CMDF_AO_REQUIRED, on
version 5 [1] of TCP-AO patches. Quite frustrative that having all these
selftests that I've written, running kmemtest & kcov was always in todo.
[1]: https://lore.kernel.org/netdev/20230215183335.800122-5-dima@arista.com/
Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20240617072451.1403e1d2@kernel.org/
Fixes: 0aadc73995d0 ("net/tcp: Prevent TCP-MD5 with TCP-AO being set")
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240619-tcp-ao-required-leak-v1-1-6408f3c94247@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add support for Byte Queue Limits (BQL).
Tested on qemu emulated virtio_net device with 1, 2 and 4 queues.
Tested with fq_codel and pfifo_fast. Super netperf with 50 threads is
running in background. Netperf TCP_RR results:
NOBQL FQC 1q: 159.56 159.33 158.50 154.31 agv: 157.925
NOBQL FQC 2q: 184.64 184.96 174.73 174.15 agv: 179.62
NOBQL FQC 4q: 994.46 441.96 416.50 499.56 agv: 588.12
NOBQL PFF 1q: 148.68 148.92 145.95 149.48 agv: 148.2575
NOBQL PFF 2q: 171.86 171.20 170.42 169.42 agv: 170.725
NOBQL PFF 4q: 1505.23 1137.23 2488.70 3507.99 agv: 2159.7875
BQL FQC 1q: 1332.80 1297.97 1351.41 1147.57 agv: 1282.4375
BQL FQC 2q: 768.30 817.72 864.43 974.40 agv: 856.2125
BQL FQC 4q: 945.66 942.68 878.51 822.82 agv: 897.4175
BQL PFF 1q: 149.69 151.49 149.40 147.47 agv: 149.5125
BQL PFF 2q: 2059.32 798.74 1844.12 381.80 agv: 1270.995
BQL PFF 4q: 1871.98 4420.02 4916.59 13268.16 agv: 6119.1875
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/r/20240618144456.1688998-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With ARCH=m68k, make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/smsc/smc9194.o
Add the missing invocation of the MODULE_DESCRIPTION() macro.
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240618-md-m68k-drivers-net-ethernet-smsc-v1-1-ad3d7200421e@quicinc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With ARCH=m68k, make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/cirrus/mac89x0.o
Add the missing invocation of the MODULE_DESCRIPTION() macro.
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240618-md-m68k-drivers-net-ethernet-cirrus-v1-1-07f5bd0b64cb@quicinc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With ARCH=m68k, make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/amd/a2065.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/amd/ariadne.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/amd/atarilance.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/amd/hplance.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/amd/7990.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/amd/mvme147.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/ethernet/amd/sun3lance.o
Add the missing invocation of the MODULE_DESCRIPTION() macro to all
files which have a MODULE_LICENSE().
This includes drivers/net/ethernet/amd/lance.c which, although it did
not produce a warning with the m68k allmodconfig configuration, may
cause this warning with other configurations.
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240618-md-m68k-drivers-net-ethernet-amd-v1-1-50ee7a9ad50e@quicinc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With ARCH=m68k, make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/arcnet/com20020-isa.o
Add the missing invocation of the MODULE_DESCRIPTION() macro.
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240618-md-m68k-drivers-net-arcnet-v1-1-90e42bc58102@quicinc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Adding/updating VSI list rule, as well as allocating/freeing VSI list
resource are called several times with type ICE_SW_LKUP_LAST, which fails
because ice_update_vsi_list_rule() and ice_aq_alloc_free_vsi_list()
consider it invalid. Allow calling these functions with ICE_SW_LKUP_LAST.
This fixes at least one issue in switchdev mode, where the same rule with
different action cannot be added, e.g.:
tc filter add dev $PF1 ingress protocol arp prio 0 flower skip_sw \
dst_mac ff:ff:ff:ff:ff:ff action mirred egress redirect dev $VF1_PR
tc filter add dev $PF1 ingress protocol arp prio 0 flower skip_sw \
dst_mac ff:ff:ff:ff:ff:ff action mirred egress redirect dev $VF2_PR
Fixes: 0f94570d0cae ("ice: allow adding advanced rules")
Suggested-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20240618210206.981885-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit under Fixes optimized the number of recv() calls
needed during RTM_GETROUTE dumps, but we got multiple
reports of applications hanging on recv() calls.
Applications expect that a route dump will be terminated
with a recv() reading an individual NLM_DONE message.
Coalescing NLM_DONE is perfectly legal in netlink,
but even tho reporters fixed the code in respective
projects, chances are it will take time for those
applications to get updated. So revert to old behavior
(for now)?
This is an IPv6 version of commit 460b0d33cf10 ("inet: bring
NLM_DONE out to a separate recv() again").
Reported-by: Maciej Żenczykowski <zenczykowski@gmail.com>
Link: https://lore.kernel.org/all/CANP3RGc1RG71oPEBXNx_WZFP9AyphJefdO4paczN92n__ds4ow@mail.gmail.com
Reported-by: Stefano Brivio <sbrivio@redhat.com>
Link: https://lore.kernel.org/all/20240315124808.033ff58d@elisabeth
Reported-by: Ilya Maximets <i.maximets@ovn.org>
Link: https://lore.kernel.org/all/02b50aae-f0e9-47a4-8365-a977a85975d3@ovn.org
Fixes: 5fc68320c1fb ("ipv6: remove RTNL protection from inet6_dump_fib()")
Tested-by: Ilya Maximets <i.maximets@ovn.org>
Link: https://lore.kernel.org/r/20240618193914.561782-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|