Age | Commit message (Collapse) | Author | Files | Lines |
|
Packet length check needs to be located after size and align_count
calculation to prevent kernel panic in skb_pull() in case
rx_cmd_a & RX_CMD_A_RED evaluates to true.
Fixes: d8b228318935 ("net: usb: smsc75xx: Limit packet length to skb->len")
Signed-off-by: Szymon Heidrich <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add PTP capability to the Ethernet MAC.
Signed-off-by: Durai Manickam KR <[email protected]>
Reviewed-by: Claudiu Beznea <[email protected]>
Reviewed-by: Michal Swiatkowski <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add PTP capability to the Gigabit Ethernet MAC.
Signed-off-by: Durai Manickam KR <[email protected]>
Reviewed-by: Claudiu Beznea <[email protected]>
Reviewed-by: Michal Swiatkowski <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
SH_ETH doesn't need mdiobus suspend/resume, that's why it sets
'mac_managed_pm'. However, setting it needs to be moved from init to
probe, so mdiobus PM functions will really never be called (e.g. when
the interface is not up yet during suspend/resume).
Fixes: 6a1dbfefdae4 ("net: sh_eth: Fix PHY state warning splat during system resume")
Suggested-by: Heiner Kallweit <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
Reviewed-by: Michal Kubiak <[email protected]>
Reviewed-by: Sergey Shtylyov <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
RAVB doesn't need mdiobus suspend/resume, that's why it sets
'mac_managed_pm'. However, setting it needs to be moved from init to
probe, so mdiobus PM functions will really never be called (e.g. when
the interface is not up yet during suspend/resume).
Fixes: 4924c0cdce75 ("net: ravb: Fix PHY state warning splat during system resume")
Suggested-by: Heiner Kallweit <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
Reviewed-by: Michal Kubiak <[email protected]>
Reviewed-by: Sergey Shtylyov <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
LED core provides a helper to parse default state from firmware node.
Use it instead of custom implementation.
Signed-off-by: Andy Shevchenko <[email protected]>
Reviewed-by: Kurt Kanzenbach <[email protected]>
Reviewed-by: Michal Swiatkowski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
build_skb_from_xdp_buff() may return NULL, in this case
we need to free the frags of xdp shinfo.
Fixes: fab89bafa95b ("virtio-net: support multi-buffer xdp")
Signed-off-by: Xuan Zhuo <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Reviewed-by: Yunsheng Lin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Because headroom is not passed to page_to_skb(), this causes the shinfo
exceeds the range. Then the frags of shinfo are changed by other process.
[ 157.724634] stack segment: 0000 [#1] PREEMPT SMP NOPTI
[ 157.725358] CPU: 3 PID: 679 Comm: xdp_pass_user_f Tainted: G E 6.2.0+ #150
[ 157.726401] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/4
[ 157.727820] RIP: 0010:skb_release_data+0x11b/0x180
[ 157.728449] Code: 44 24 02 48 83 c3 01 39 d8 7e be 48 89 d8 48 c1 e0 04 41 80 7d 7e 00 49 8b 6c 04 30 79 0c 48 89 ef e8 89 b
[ 157.730751] RSP: 0018:ffffc90000178b48 EFLAGS: 00010202
[ 157.731383] RAX: 0000000000000010 RBX: 0000000000000001 RCX: 0000000000000000
[ 157.732270] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff888100dd0b00
[ 157.733117] RBP: 5d5d76010f6e2408 R08: ffff888100dd0b2c R09: 0000000000000000
[ 157.734013] R10: ffffffff82effd30 R11: 000000000000a14e R12: ffff88810981ffc0
[ 157.734904] R13: ffff888100dd0b00 R14: 0000000000000002 R15: 0000000000002310
[ 157.735793] FS: 00007f06121d9740(0000) GS:ffff88842fcc0000(0000) knlGS:0000000000000000
[ 157.736794] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 157.737522] CR2: 00007ffd9a56c084 CR3: 0000000104bda001 CR4: 0000000000770ee0
[ 157.738420] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 157.739283] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 157.740146] PKRU: 55555554
[ 157.740502] Call Trace:
[ 157.740843] <IRQ>
[ 157.741117] kfree_skb_reason+0x50/0x120
[ 157.741613] __udp4_lib_rcv+0x52b/0x5e0
[ 157.742132] ip_protocol_deliver_rcu+0xaf/0x190
[ 157.742715] ip_local_deliver_finish+0x77/0xa0
[ 157.743280] ip_sublist_rcv_finish+0x80/0x90
[ 157.743834] ip_list_rcv_finish.constprop.0+0x16f/0x190
[ 157.744493] ip_list_rcv+0x126/0x140
[ 157.744952] __netif_receive_skb_list_core+0x29b/0x2c0
[ 157.745602] __netif_receive_skb_list+0xed/0x160
[ 157.746190] ? udp4_gro_receive+0x275/0x350
[ 157.746732] netif_receive_skb_list_internal+0xf2/0x1b0
[ 157.747398] napi_gro_receive+0xd1/0x210
[ 157.747911] virtnet_receive+0x75/0x1c0
[ 157.748422] virtnet_poll+0x48/0x1b0
[ 157.748878] __napi_poll+0x29/0x1b0
[ 157.749330] net_rx_action+0x27a/0x340
[ 157.749812] __do_softirq+0xf3/0x2fb
[ 157.750298] do_softirq+0xa2/0xd0
[ 157.750745] </IRQ>
[ 157.751563] <TASK>
[ 157.752329] __local_bh_enable_ip+0x6d/0x80
[ 157.753178] virtnet_xdp_set+0x482/0x860
[ 157.754159] ? __pfx_virtnet_xdp+0x10/0x10
[ 157.755129] dev_xdp_install+0xa4/0xe0
[ 157.756033] dev_xdp_attach+0x20b/0x5e0
[ 157.756933] do_setlink+0x82e/0xc90
[ 157.757777] ? __nla_validate_parse+0x12b/0x1e0
[ 157.758744] rtnl_setlink+0xd8/0x170
[ 157.759549] ? mod_objcg_state+0xcb/0x320
[ 157.760328] ? security_capable+0x37/0x60
[ 157.761209] ? security_capable+0x37/0x60
[ 157.762072] rtnetlink_rcv_msg+0x145/0x3d0
[ 157.762929] ? ___slab_alloc+0x327/0x610
[ 157.763754] ? __alloc_skb+0x141/0x170
[ 157.764533] ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[ 157.765422] netlink_rcv_skb+0x58/0x110
[ 157.766229] netlink_unicast+0x21f/0x330
[ 157.766951] netlink_sendmsg+0x240/0x4a0
[ 157.767654] sock_sendmsg+0x93/0xa0
[ 157.768434] ? sockfd_lookup_light+0x12/0x70
[ 157.769245] __sys_sendto+0xfe/0x170
[ 157.770079] ? handle_mm_fault+0xe9/0x2d0
[ 157.770859] ? preempt_count_add+0x51/0xa0
[ 157.771645] ? up_read+0x3c/0x80
[ 157.772340] ? do_user_addr_fault+0x1e9/0x710
[ 157.773166] ? kvm_read_and_reset_apf_flags+0x49/0x60
[ 157.774087] __x64_sys_sendto+0x29/0x30
[ 157.774856] do_syscall_64+0x3c/0x90
[ 157.775518] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 157.776382] RIP: 0033:0x7f06122def70
Fixes: 18117a842ab0 ("virtio-net: remove xdp related info from page_to_skb()")
Signed-off-by: Xuan Zhuo <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
It is preferred to use typed property access functions (i.e.
of_property_read_<type> functions) rather than low-level
of_get_property/of_find_property functions for reading properties.
Convert reading boolean properties to of_property_read_bool().
Reviewed-by: Simon Horman <[email protected]>
Acked-by: Marc Kleine-Budde <[email protected]> # for net/can
Acked-by: Kalle Valo <[email protected]>
Acked-by: Nicolas Ferre <[email protected]>
Acked-by: Francois Romieu <[email protected]>
Reviewed-by: Wei Fang <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
There are 3 classes of switch families that the driver is aware of, as
far as mv88e6xxx_change_mtu() is concerned:
- MTU configuration is available per port. Here, the
chip->info->ops->port_set_jumbo_size() method will be present.
- MTU configuration is global to the switch. Here, the
chip->info->ops->set_max_frame_size() method will be present.
- We don't know how to change the MTU. Here, none of the above methods
will be present.
Switch families MV88E6165, MV88E6191, MV88E6220, MV88E6250 and MV88E6290
fall in category 3.
The blamed commit has adjusted the MTU for all 3 categories by EDSA_HLEN
(8 bytes), resulting in a new maximum MTU of 1492 being reported by the
driver for these switches.
I don't have the hardware to test, but I do have a MV88E6390 switch on
which I can simulate this by commenting out its .port_set_jumbo_size
definition from mv88e6390_ops. The result is this set of messages at
probe time:
mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 1
mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 2
mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 3
mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 4
mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 5
mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 6
mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 7
mv88e6085 d0032004.mdio-mii:10: nonfatal error -34 setting MTU to 1500 on port 8
It is highly implausible that there exist Ethernet switches which don't
support the standard MTU of 1500 octets, and this is what the DSA
framework says as well - the error comes from dsa_slave_create() ->
dsa_slave_change_mtu(slave_dev, ETH_DATA_LEN).
But the error messages are alarming, and it would be good to suppress
them.
As a consequence of this unlikeliness, we reimplement mv88e6xxx_get_max_mtu()
and mv88e6xxx_change_mtu() on switches from the 3rd category as follows:
the maximum supported MTU is 1500, and any request to set the MTU to a
value larger than that fails in dev_validate_mtu().
Fixes: b9c587fed61c ("dsa: mv88e6xxx: Include tagger overhead when setting MTU for DSA and CPU ports")
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
ice_qp_dis() intends to stop a given queue pair that is a target of xsk
pool attach/detach. One of the steps is to disable interrupts on these
queues. It currently is broken in a way that txq irq is turned off
*after* HW flush which in turn takes no effect.
ice_qp_dis():
-> ice_qvec_dis_irq()
--> disable rxq irq
--> flush hw
-> ice_vsi_stop_tx_ring()
-->disable txq irq
Below splat can be triggered by following steps:
- start xdpsock WITHOUT loading xdp prog
- run xdp_rxq_info with XDP_TX action on this interface
- start traffic
- terminate xdpsock
[ 256.312485] BUG: kernel NULL pointer dereference, address: 0000000000000018
[ 256.319560] #PF: supervisor read access in kernel mode
[ 256.324775] #PF: error_code(0x0000) - not-present page
[ 256.329994] PGD 0 P4D 0
[ 256.332574] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 256.337006] CPU: 3 PID: 32 Comm: ksoftirqd/3 Tainted: G OE 6.2.0-rc5+ #51
[ 256.345218] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
[ 256.355807] RIP: 0010:ice_clean_rx_irq_zc+0x9c/0x7d0 [ice]
[ 256.361423] Code: b7 8f 8a 00 00 00 66 39 ca 0f 84 f1 04 00 00 49 8b 47 40 4c 8b 24 d0 41 0f b7 45 04 66 25 ff 3f 66 89 04 24 0f 84 85 02 00 00 <49> 8b 44 24 18 0f b7 14 24 48 05 00 01 00 00 49 89 04 24 49 89 44
[ 256.380463] RSP: 0018:ffffc900088bfd20 EFLAGS: 00010206
[ 256.385765] RAX: 000000000000003c RBX: 0000000000000035 RCX: 000000000000067f
[ 256.393012] RDX: 0000000000000775 RSI: 0000000000000000 RDI: ffff8881deb3ac80
[ 256.400256] RBP: 000000000000003c R08: ffff889847982710 R09: 0000000000010000
[ 256.407500] R10: ffffffff82c060c0 R11: 0000000000000004 R12: 0000000000000000
[ 256.414746] R13: ffff88811165eea0 R14: ffffc9000d255000 R15: ffff888119b37600
[ 256.421990] FS: 0000000000000000(0000) GS:ffff8897e0cc0000(0000) knlGS:0000000000000000
[ 256.430207] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 256.436036] CR2: 0000000000000018 CR3: 0000000005c0a006 CR4: 00000000007706e0
[ 256.443283] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 256.450527] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 256.457770] PKRU: 55555554
[ 256.460529] Call Trace:
[ 256.463015] <TASK>
[ 256.465157] ? ice_xmit_zc+0x6e/0x150 [ice]
[ 256.469437] ice_napi_poll+0x46d/0x680 [ice]
[ 256.473815] ? _raw_spin_unlock_irqrestore+0x1b/0x40
[ 256.478863] __napi_poll+0x29/0x160
[ 256.482409] net_rx_action+0x136/0x260
[ 256.486222] __do_softirq+0xe8/0x2e5
[ 256.489853] ? smpboot_thread_fn+0x2c/0x270
[ 256.494108] run_ksoftirqd+0x2a/0x50
[ 256.497747] smpboot_thread_fn+0x1c1/0x270
[ 256.501907] ? __pfx_smpboot_thread_fn+0x10/0x10
[ 256.506594] kthread+0xea/0x120
[ 256.509785] ? __pfx_kthread+0x10/0x10
[ 256.513597] ret_from_fork+0x29/0x50
[ 256.517238] </TASK>
In fact, irqs were not disabled and napi managed to be scheduled and run
while xsk_pool pointer was still valid, but SW ring of xdp_buff pointers
was already freed.
To fix this, call ice_qvec_dis_irq() after ice_vsi_stop_tx_ring(). Also
while at it, remove redundant ice_clean_rx_ring() call - this is handled
in ice_qp_clean_rings().
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Signed-off-by: Maciej Fijalkowski <[email protected]>
Reviewed-by: Larysa Zaremba <[email protected]>
Tested-by: Chandan Kumar Rout <[email protected]> (A Contingent Worker at Intel)
Acked-by: John Fastabend <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
GPY2xx devices need 3 seconds to fully switch out of loopback mode
before it can safely re-enter loopback mode. Implement timeout mechanism
to guarantee 3 seconds waited before re-enter loopback mode.
Signed-off-by: Xu Liang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
mac_len is of type unsigned, which can never be less than zero.
mac_len = ieee802154_hdr_peek_addrs(skb, &header);
if (mac_len < 0)
return mac_len;
Change this to type int as ieee802154_hdr_peek_addrs() can return negative
integers, this is found by static analysis with smatch.
Fixes: 6c993779ea1d ("ca8210: fix mac_len negative array access")
Signed-off-by: Harshit Mogalapalli <[email protected]>
Acked-by: Alexander Aring <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Stefan Schmidt <[email protected]>
|
|
The check introduced in the commit a5fd39464a40 ("igc: Lift TAPRIO schedule
restriction") can detect a false positive error in some corner case.
For instance,
tc qdisc replace ... taprio num_tc 4
...
sched-entry S 0x01 100000 # slot#1
sched-entry S 0x03 100000 # slot#2
sched-entry S 0x04 100000 # slot#3
sched-entry S 0x08 200000 # slot#4
flags 0x02 # hardware offload
Here the queue#0 (the first queue) is on at the slot#1 and #2,
and off at the slot#3 and #4. Under the current logic, when the slot#4
is examined, validate_schedule() returns *false* since the enablement
count for the queue#0 is two and it is already off at the previous slot
(i.e. #3). But this definition is truely correct.
Let's fix the logic to enforce a strict validation for consecutively-opened
slots.
Fixes: a5fd39464a40 ("igc: Lift TAPRIO schedule restriction")
Signed-off-by: AKASHI Takahiro <[email protected]>
Reviewed-by: Kurt Kanzenbach <[email protected]>
Acked-by: Vinicius Costa Gomes <[email protected]>
Tested-by: Naama Meir <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
vf reset nack actually represents the reset operation itself is
performed but no address is assigned. Therefore, e1000_reset_hw_vf
should fill the "perm_addr" with the zero address and return success on
such an occasion. This prevents its callers in netdev.c from saying PF
still resetting, and instead allows them to correctly report that no
address is assigned.
Fixes: 6ddbc4cf1f4d ("igb: Indicate failure on vf reset for empty mac address")
Signed-off-by: Akihiko Odaki <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Tested-by: Marek Szlosek <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
In igbvf_request_msix(), irqs have not been freed on the err path,
we need to free it. Fix it.
Fixes: d4e0fe01a38a ("igbvf: add new driver to support 82576 virtual functions")
Signed-off-by: Gaosheng Cui <[email protected]>
Reviewed-by: Maciej Fijalkowski <[email protected]>
Tested-by: Marek Szlosek <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Enabling SR-IOV causes the virtual functions to make requests to the
PF via the mailbox. Notably, E1000_VF_RESET request will happen during
the initialization of the VF. However, unless the reinit is done, the
VMMB interrupt, which delivers mailbox interrupt from VF to PF will be
kept masked and such requests will be silently ignored.
Enable SR-IOV at the very end of the procedure to configure the device
for SR-IOV so that the PF is configured properly for SR-IOV when a VF is
activated.
Fixes: fa44f2f185f7 ("igb: Enable SR-IOV configuration via PCI sysfs interface")
Signed-off-by: Akihiko Odaki <[email protected]>
Tested-by: Marek Szlosek <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The commit 6faee3d4ee8b ("igb: Add lock to avoid data race") adds
rtnl_lock to eliminate a false data race shown below
(FREE from device detaching) | (USE from netdev core)
igb_remove | igb_ndo_get_vf_config
igb_disable_sriov | vf >= adapter->vfs_allocated_count?
kfree(adapter->vf_data) |
adapter->vfs_allocated_count = 0 |
| memcpy(... adapter->vf_data[vf]
The above race will never happen and the extra rtnl_lock causes deadlock
below
[ 141.420169] <TASK>
[ 141.420672] __schedule+0x2dd/0x840
[ 141.421427] schedule+0x50/0xc0
[ 141.422041] schedule_preempt_disabled+0x11/0x20
[ 141.422678] __mutex_lock.isra.13+0x431/0x6b0
[ 141.423324] unregister_netdev+0xe/0x20
[ 141.423578] igbvf_remove+0x45/0xe0 [igbvf]
[ 141.423791] pci_device_remove+0x36/0xb0
[ 141.423990] device_release_driver_internal+0xc1/0x160
[ 141.424270] pci_stop_bus_device+0x6d/0x90
[ 141.424507] pci_stop_and_remove_bus_device+0xe/0x20
[ 141.424789] pci_iov_remove_virtfn+0xba/0x120
[ 141.425452] sriov_disable+0x2f/0xf0
[ 141.425679] igb_disable_sriov+0x4e/0x100 [igb]
[ 141.426353] igb_remove+0xa0/0x130 [igb]
[ 141.426599] pci_device_remove+0x36/0xb0
[ 141.426796] device_release_driver_internal+0xc1/0x160
[ 141.427060] driver_detach+0x44/0x90
[ 141.427253] bus_remove_driver+0x55/0xe0
[ 141.427477] pci_unregister_driver+0x2a/0xa0
[ 141.428296] __x64_sys_delete_module+0x141/0x2b0
[ 141.429126] ? mntput_no_expire+0x4a/0x240
[ 141.429363] ? syscall_trace_enter.isra.19+0x126/0x1a0
[ 141.429653] do_syscall_64+0x5b/0x80
[ 141.429847] ? exit_to_user_mode_prepare+0x14d/0x1c0
[ 141.430109] ? syscall_exit_to_user_mode+0x12/0x30
[ 141.430849] ? do_syscall_64+0x67/0x80
[ 141.431083] ? syscall_exit_to_user_mode_prepare+0x183/0x1b0
[ 141.431770] ? syscall_exit_to_user_mode+0x12/0x30
[ 141.432482] ? do_syscall_64+0x67/0x80
[ 141.432714] ? exc_page_fault+0x64/0x140
[ 141.432911] entry_SYSCALL_64_after_hwframe+0x72/0xdc
Since the igb_disable_sriov() will call pci_disable_sriov() before
releasing any resources, the netdev core will synchronize the cleanup to
avoid any races. This patch removes the useless rtnl_(un)lock to guarantee
correctness.
CC: [email protected]
Fixes: 6faee3d4ee8b ("igb: Add lock to avoid data race")
Reported-by: Corinna Vinschen <[email protected]>
Link: https://lore.kernel.org/intel-wired-lan/[email protected]/
Signed-off-by: Lin Ma <[email protected]>
Tested-by: Corinna Vinschen <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
When an interface with the maximum number of VLAN filters is brought up,
a spurious error is logged:
[257.483082] 8021q: adding VLAN 0 to HW filter on device enp0s3
[257.483094] iavf 0000:00:03.0 enp0s3: Max allowed VLAN filters 8. Remove existing VLANs or disable filtering via Ethtool if supported.
The VF driver complains that it cannot add the VLAN 0 filter.
On the other hand, the PF driver always adds VLAN 0 filter on VF
initialization. The VF does not need to ask the PF for that filter at
all.
Fix the error by not tracking VLAN 0 filters altogether. With that, the
check added by commit 0e710a3ffd0c ("iavf: Fix VF driver counting VLAN 0
filters") in iavf_virtchnl.c is useless and might be confusing if left as
it suggests that we track VLAN 0.
Fixes: 0e710a3ffd0c ("iavf: Fix VF driver counting VLAN 0 filters")
Signed-off-by: Ahmed Zaki <[email protected]>
Reviewed-by: Michal Kubiak <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Currently, IAVF's decode_rx_desc_ptype() correctly reports payload type
of L4 for IPv4 UDP packets and IPv{4,6} TCP, but only L3 for IPv6 UDP.
Originally, i40e, ice and iavf were affected.
Commit 73df8c9e3e3d ("i40e: Correct UDP packet header for non_tunnel-ipv6")
fixed that in i40e, then
commit 638a0c8c8861 ("ice: fix incorrect payload indicator on PTYPE")
fixed that for ice.
IPv6 UDP is L4 obviously. Fix it and make iavf report correct L4 hash
type for such packets, so that the stack won't calculate it on CPU when
needs it.
Fixes: 206812b5fccb ("i40e/i40evf: i40e implementation for skb_set_hash")
Reviewed-by: Larysa Zaremba <[email protected]>
Reviewed-by: Michal Kubiak <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Condition, which checks whether the netdev has hashing enabled is
inverted. Basically, the tagged commit effectively disabled passing flow
hash from descriptor to skb, unless user *disables* it via Ethtool.
Commit a876c3ba59a6 ("i40e/i40evf: properly report Rx packet hash")
fixed this problem, but only for i40e.
Invert the condition now in iavf and unblock passing hash to skbs again.
Fixes: 857942fd1aa1 ("i40e: Fix Rx hash reported to the stack by our driver")
Reviewed-by: Larysa Zaremba <[email protected]>
Reviewed-by: Michal Kubiak <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Clang errors:
drivers/net/wireless/intel/iwlwifi/iwl-devtrace.c:15:32: error: unknown warning group '-Wsuggest-attribute=format', ignored [-Werror,-Wunknown-warning-option]
#pragma GCC diagnostic ignored "-Wsuggest-attribute=format"
^
1 error generated.
The warning being disabled by this pragma is GCC specific. Guard its use
with CONFIG_CC_IS_GCC so that it is not used with clang to clear up the
error.
Fixes: 4eca8cbf7ba8 ("wifi: iwlwifi: suppress printf warnings in tracing")
Link: https://github.com/ClangBuiltLinux/linux/issues/1818
Signed-off-by: Nathan Chancellor <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
|
|
There is a spelling mistake in a pr_warn_ratelimited message. Fix it.
Signed-off-by: Colin Ian King <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
If goto_chain action present in the post ct flow rule, merge flow rules
in this ct-zone, create a new pre_ct entry as the pre ct flow rule of
next ct-zone, but do not offload merged flow rules to firmware. Repeat
the process in the next ct-zone until no goto_chain action present in
the post ct flow rule in a certain ct-zone, merged all the flow rules.
Offload to firmware finally.
Signed-off-by: Wentao Jia <[email protected]>
Acked-by: Simon Horman <[email protected]>
Signed-off-by: Louis Peens <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The fixed number of offload flow rule is only supported scenario of one
ct zone, in the scenario of multiple ct zones, dynamic number and more
number of offload flow rules are required. In order to support scenario
of multiple ct zones, parameter num_rules is added for to offload flow
rules
Signed-off-by: Wentao Jia <[email protected]>
Acked-by: Simon Horman <[email protected]>
Signed-off-by: Louis Peens <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The chain_index has different means in pre ct entry and post ct entry.
In pre ct entry, it means chain index, but in post ct entry, it means
goto chain index, it is confused.
chain_index and goto_chain_index may be present in one flow rule, It
cannot be distinguished by one field chain_index, both chain_index
and goto_chain_index are required in the follow-up patch to support
multiple ct zones
Another field goto_chain_index is added to record the goto chain index.
If no goto action in post ct entry, goto_chain_index is 0.
Signed-off-by: Wentao Jia <[email protected]>
Acked-by: Simon Horman <[email protected]>
Signed-off-by: Louis Peens <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
'ct_clear' action only or no ct action is supported for 'post_ct_flow'.
But in scenario of multiple ct zones, one non 'ct_clear' ct action or
more ct actions, including 'ct_clear action', may be present in one flow
rule. If ct state match key is 'ct_established', the flow rule is still
expected to be classified as 'post_ct_flow'. Check ct status first in
function "is_post_ct_flow" to achieve this.
Signed-off-by: Wentao Jia <[email protected]>
Acked-by: Simon Horman <[email protected]>
Signed-off-by: Louis Peens <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
In the scenario of multiple ct zones, ct state key match and ct action
is present in one flow rule, the flow rule is classified to post_ct_flow
in design.
There is no ct state key match for pre ct flow, the judging condition
is added to function "is_pre_ct_flow".
Chain_index is another field for judging which flows are pre ct flow
If chain_index not 0, the flow is not pre ct flow.
Signed-off-by: Wentao Jia <[email protected]>
Acked-by: Simon Horman <[email protected]>
Signed-off-by: Louis Peens <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
CT action is a special case different from other actions, CT clear action
is not required when get ct action, but this case is not considered.
If CT clear action in the flow rule, skip the CT clear action when get ct
action, return the first ct action that is not a CT clear action
Signed-off-by: Wentao Jia <[email protected]>
Acked-by: Simon Horman <[email protected]>
Signed-off-by: Louis Peens <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Support offloading of TC rules that mirror/redirect egress traffic to a
MACVLAN device, which is attached to bond device which master mlx5 devices.
Signed-off-by: Maor Dickman <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Support offloading of TC rules that filter ingress traffic from a MACVLAN
device, which is attached to bond device.
Signed-off-by: Maor Dickman <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
In preparation for next patch which will add new check
if device block can be setup, extract all existing checks
to function to make it more readable and maintainable.
Signed-off-by: Maor Dickman <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Print the number of hairpin queues and size as part of the hairpin table
dump.
Signed-off-by: Gal Pressman <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
We refer to a TC NIC rule that involves forwarding as "hairpin".
Hairpin queues are mlx5 hardware specific implementation for hardware
forwarding of such packets.
Per the discussion in [1], move the hairpin queues control (number and
size) from debugfs to devlink.
Expose two devlink params:
- hairpin_num_queues: control the number of hairpin queues
- hairpin_queue_size: control the size (in packets) of the hairpin queues
[1] https://lore.kernel.org/all/[email protected]/
Signed-off-by: Gal Pressman <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Downstream patches require devlink params to access the PTYS register,
move the needed functions from mlx5e to the core layer.
Signed-off-by: Gal Pressman <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Currently RQ health diagnostics doesn't inform the user whether an RQ
is an XSK RQ or not.
Address this, by adding XSK state flag to RQ SW state enum in core/en.h.
XSK will be '1' if current RQ is an XSK RQ, and it will be '0' if it's
not.
In this example below, it can be seen that XSK field value is '1' since
xdpsock program have been attached to channel 0 before issuing the
devlink query command:
$ devlink health diagnose auxiliary/mlx5_core.eth.0/65535 reporter rx
Output:
=======================================================================
Common config:
RQ:
type: 2 stride size: 4096 size: 16 ts_format: FRC
CQ:
stride size: 64 size: 1024
RQs:
channel ix: 0 rqn: 4236 HW state: 1 WQE counter: 15 posted WQEs: 15 cc: 15
SW State:
enabled: 1 recovering: 0 am: 1 no_csum_complete: 1 csum_full: 0 mini_cqe_hw_stridx: 1 shampo: 0 mini_cqe_enhanced: 0 xsk: 1
CQ:
cqn: 1085 HW status: 0 ci: 0 size: 1024
EQ:
eqn: 7 irqn: 32 vecidx: 0 ci: 5 size: 2048
ICOSQ:
sqn: 4229 HW state: 1 cc: 158 pc: 158 WQE size: 2048
CQ:
cqn: 1080 cc: 1 size: 2048
Signed-off-by: Adham Faris <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add SQ SW state textual representation to devlink health diagnostics
for tx reporter.
SQ SW state can be retrieved by issuing the devlink command below:
$ devlink health diagnose auxiliary/mlx5_core.eth.0/65535 reporter tx
Output
=======================================================================
Common Config:
SQ:
stride size: 64 size: 1024 ts_format: FRC
CQ:
stride size: 64 size: 1024
SQs:
channel ix: 0 tc: 0 txq ix: 0 sqn: 4170 HW state: 1 stopped: false cc: 0 pc: 0
SW State:
enabled: 1 mpwqe: 1 recovering: 0 ipsec: 0 am: 1 vlan_need_l2_inline: 1 pending_xsk_tx: 0 pending_tls_rx_resync: 0 xdp_multibuf: 0
CQ:
cqn: 1031 HW status: 0 ci: 0 size: 1024
EQ:
eqn: 7 irqn: 32 vecidx: 0 ci: 2 size: 2048
Signed-off-by: Adham Faris <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
One of the parameters that is retrieved/printed as a response to
devlink health diagnostics for rx reporter is the RQ SW state.
It's printed as a bitmap decimal number. Printing it as bitmap is
problematic and non informative.
In addition User can't count on SW state without accessing the kernel
sources (mlx5e rq state enum in en.h).
This patch prints RQ SW state in a textual representation, as a key:
value pairs, where disabled rq states will appear as '0' and enabled
ones will appear as '1'.
See below the generated output for rx health diagnostics devlink
command:
$ devlink health diagnose auxiliary/mlx5_core.eth.0/65535 reporter rx
Before:
=======================================================================
Common config:
RQ:
type: 2 stride size: 2048 size: 8 ts_format: FRC
CQ:
stride size: 64 size: 1024
RQs:
channel ix: 0 rqn: 4172 HW state: 1 SW state: 37 WQE counter: 7 posted WQEs: 7 cc: 7
CQ:
cqn: 1033 HW status: 0 ci: 0 size: 1024
EQ:
eqn: 7 irqn: 32 vecidx: 0 ci: 2 size: 2048
ICOSQ:
sqn: 4169 HW state: 1 cc: 74 pc: 74 WQE size: 128
CQ:
cqn: 1030 cc: 1 size: 128
channel ix: 1 ...
.
.
After:
=======================================================================
Common config:
RQ:
type: 2 stride size: 2048 size: 8 ts_format: FRC
CQ:
stride size: 64 size: 1024
RQs:
channel ix: 0 rqn: 4172 HW state: 1 WQE counter: 7 posted WQEs: 7 cc: 7
SW State:
enabled: 1 recovering: 0 am: 1 no_csum_complete: 0 csum_full: 0 mini_cqe_hw_stridx: 1 shampo: 0 mini_cqe_enhanced: 0
CQ:
cqn: 1033 HW status: 0 ci: 0 size: 1024
EQ:
eqn: 7 irqn: 32 vecidx: 0 ci: 2 size: 2048
ICOSQ:
sqn: 4169 HW state: 1 cc: 74 pc: 74 WQE size: 128
CQ:
cqn: 1030 cc: 1 size: 128
channel: ix: 1 ...
.
.
Signed-off-by: Adham Faris <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Dynamic interrupt moderation RQ and SQ feature represented by
MLX5E_RQ_STATE_AM and MLX5E_SQ_STATE_AM enums respectively, is not
consistent with the feature naming in the driver, and with the formal
feature and library names.
Hence, change MLX5E_RQ_STATE_AM and MLX5E_SQ_STATE_AM enum type names in
core/en.h to MLX5E_RQ_STATE_DIM and MLX5E_SQ_STATE_DIM respectively.
Signed-off-by: Adham Faris <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Previous check was comparing against the fifo mask. The mask is size of the
fifo (power of two) minus one, so a less than or equal comparator should be
used for checking if the fifo has room for the SKB.
Signed-off-by: Rahul Rameshbabu <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Implement thermal zone support for mlx5 based HW. The NIC
uses temperature sensor provided by ASIC to report current temperature
to thermal core.
Signed-off-by: Sandipan Patra <[email protected]>
Reviewed-by: Gal Pressman <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add comment to mlx5_devlink_params_register() functions so it is clear
that only driver init params should be registered here.
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
If driver teardown is called while PCI is turned off, there is a race
between health recovery and teardown. If health recovery already started
it will wait 60 sec trying to see if PCI gets back and it can recover,
but actually there is no need to wait anymore once teardown was called.
Use the MLX5_BREAK_FW_WAIT flag which is set on driver teardown to break
waiting for PCI up.
Signed-off-by: Moshe Shemesh <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
When shutdown or remove callbacks are called the driver sets the flag
MLX5_BREAK_FW_WAIT, to stop waiting for FW as teardown was called. There
is no need to clear the bit as once shutdown or remove were called as
there is no way back, the driver is going down. Furthermore, if not
cleared the flag can be used also in other loops where we may wait while
teardown was already called.
Use test_bit() instead of test_and_clear_bit() as there is no need to
clear the flag.
Signed-off-by: Moshe Shemesh <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Since the blamed commit, phy_ethtool_get_wol() and phy_ethtool_set_wol()
acquire phydev->lock, but the mscc phy driver implementations,
vsc85xx_wol_get() and vsc85xx_wol_set(), acquire the same lock as well,
resulting in a deadlock.
$ ip link set swp3 down
============================================
WARNING: possible recursive locking detected
mscc_felix 0000:00:00.5 swp3: Link is Down
--------------------------------------------
ip/375 is trying to acquire lock:
ffff3d7e82e987a8 (&dev->lock){+.+.}-{4:4}, at: vsc85xx_wol_get+0x2c/0xf4
but task is already holding lock:
ffff3d7e82e987a8 (&dev->lock){+.+.}-{4:4}, at: phy_ethtool_get_wol+0x3c/0x6c
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&dev->lock);
lock(&dev->lock);
*** DEADLOCK ***
May be due to missing lock nesting notation
2 locks held by ip/375:
#0: ffffd43b2a955788 (rtnl_mutex){+.+.}-{4:4}, at: rtnetlink_rcv_msg+0x144/0x58c
#1: ffff3d7e82e987a8 (&dev->lock){+.+.}-{4:4}, at: phy_ethtool_get_wol+0x3c/0x6c
Call trace:
__mutex_lock+0x98/0x454
mutex_lock_nested+0x2c/0x38
vsc85xx_wol_get+0x2c/0xf4
phy_ethtool_get_wol+0x50/0x6c
phy_suspend+0x84/0xcc
phy_state_machine+0x1b8/0x27c
phy_stop+0x70/0x154
phylink_stop+0x34/0xc0
dsa_port_disable_rt+0x2c/0xa4
dsa_slave_close+0x38/0xec
__dev_close_many+0xc8/0x16c
__dev_change_flags+0xdc/0x218
dev_change_flags+0x24/0x6c
do_setlink+0x234/0xea4
__rtnl_newlink+0x46c/0x878
rtnl_newlink+0x50/0x7c
rtnetlink_rcv_msg+0x16c/0x58c
Removing the mutex_lock(&phydev->lock) calls from the driver restores
the functionality.
Fixes: 2f987d486610 ("net: phy: Add locks to ethtool functions")
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
'temp' was used before commit c0c99d0cd107 ("net: phy: micrel: remove
the use of .ack_interrupt()") refactored the code. Now, we can simplify
it a little.
Signed-off-by: Wolfram Sang <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Reviewed-by: Geert Uytterhoeven <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Commit 899a3cbbf77a ("net: phy: remove states PHY_STARTING and
PHY_PENDING") missed to update a comment in phy_probe. Remove
superfluous "Description:" prefix while we are here.
Signed-off-by: Wolfram Sang <[email protected]>
Reviewed-by: Heiner Kallweit <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
ice: refactor mailbox overflow detection
Jake Keller says:
The primary motivation of this series is to cleanup and refactor the mailbox
overflow detection logic such that it will work with Scalable IOV. In
addition a few other minor cleanups are done while I was working on the
code in the area.
First, the mailbox overflow functions in ice_vf_mbx.c are refactored to
store the data per-VF as an embedded structure in struct ice_vf, rather than
stored separately as a fixed-size array which only works with Single Root
IOV. This reduces the overall memory footprint when only a handful of VFs
are used.
The overflow detection functions are also cleaned up to reduce the need for
multiple separate calls to determine when to report a VF as potentially
malicious.
Finally, the ice_is_malicious_vf function is cleaned up and moved into
ice_virtchnl.c since it is not Single Root IOV specific, and thus does not
belong in ice_sriov.c
I could probably have done this in fewer patches, but I split pieces out to
hopefully aid in reviewing the overall sequence of changes. This does cause
some additional thrash as it results in intermediate versions of the
refactor, but I think its worth it for making each step easier to
understand.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: call ice_is_malicious_vf() from ice_vc_process_vf_msg()
ice: move ice_is_malicious_vf() to ice_virtchnl.c
ice: print message if ice_mbx_vf_state_handler returns an error
ice: pass mbxdata to ice_is_malicious_vf()
ice: remove unnecessary &array[0] and just use array
ice: always report VF overflowing mailbox even without PF VSI
ice: declare ice_vc_process_vf_msg in ice_virtchnl.h
ice: initialize mailbox snapshot earlier in PF init
ice: merge ice_mbx_report_malvf with ice_mbx_vf_state_handler
ice: remove ice_mbx_deinit_snapshot
ice: move VF overflow message count into struct ice_mbx_vf_info
ice: track malicious VFs in new ice_mbx_vf_info structure
ice: convert ice_mbx_clear_malvf to void and use WARN
ice: re-order ice_mbx_reset_snapshot function
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Commit 718a18a0c8a6 ("veth: Rework veth_xdp_rcv_skb in order
to accept non-linear skb") introduced a bug where it tried to
use pskb_expand_head() if the headroom was less than
XDP_PACKET_HEADROOM. This however uses kmalloc to expand the head,
which will later allow consume_skb() to free the skb while is it still
in use by AF_XDP.
Previously if the headroom was less than XDP_PACKET_HEADROOM we
continued on to allocate a new skb from pages so this restores that
behavior.
BUG: KASAN: use-after-free in __xsk_rcv+0x18d/0x2c0
Read of size 78 at addr ffff888976250154 by task napi/iconduit-g/148640
CPU: 5 PID: 148640 Comm: napi/iconduit-g Kdump: loaded Tainted: G O 6.1.4-cloudflare-kasan-2023.1.2 #1
Hardware name: Quanta Computer Inc. QuantaPlex T41S-2U/S2S-MB, BIOS S2S_3B10.03 06/21/2018
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x48
print_report+0x170/0x473
? __xsk_rcv+0x18d/0x2c0
kasan_report+0xad/0x130
? __xsk_rcv+0x18d/0x2c0
kasan_check_range+0x149/0x1a0
memcpy+0x20/0x60
__xsk_rcv+0x18d/0x2c0
__xsk_map_redirect+0x1f3/0x490
? veth_xdp_rcv_skb+0x89c/0x1ba0 [veth]
xdp_do_redirect+0x5ca/0xd60
veth_xdp_rcv_skb+0x935/0x1ba0 [veth]
? __netif_receive_skb_list_core+0x671/0x920
? veth_xdp+0x670/0x670 [veth]
veth_xdp_rcv+0x304/0xa20 [veth]
? do_xdp_generic+0x150/0x150
? veth_xdp_rcv_one+0xde0/0xde0 [veth]
? _raw_spin_lock_bh+0xe0/0xe0
? newidle_balance+0x887/0xe30
? __perf_event_task_sched_in+0xdb/0x800
veth_poll+0x139/0x571 [veth]
? veth_xdp_rcv+0xa20/0xa20 [veth]
? _raw_spin_unlock+0x39/0x70
? finish_task_switch.isra.0+0x17e/0x7d0
? __switch_to+0x5cf/0x1070
? __schedule+0x95b/0x2640
? io_schedule_timeout+0x160/0x160
__napi_poll+0xa1/0x440
napi_threaded_poll+0x3d1/0x460
? __napi_poll+0x440/0x440
? __kthread_parkme+0xc6/0x1f0
? __napi_poll+0x440/0x440
kthread+0x2a2/0x340
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x22/0x30
</TASK>
Freed by task 148640:
kasan_save_stack+0x23/0x50
kasan_set_track+0x21/0x30
kasan_save_free_info+0x2a/0x40
____kasan_slab_free+0x169/0x1d0
slab_free_freelist_hook+0xd2/0x190
__kmem_cache_free+0x1a1/0x2f0
skb_release_data+0x449/0x600
consume_skb+0x9f/0x1c0
veth_xdp_rcv_skb+0x89c/0x1ba0 [veth]
veth_xdp_rcv+0x304/0xa20 [veth]
veth_poll+0x139/0x571 [veth]
__napi_poll+0xa1/0x440
napi_threaded_poll+0x3d1/0x460
kthread+0x2a2/0x340
ret_from_fork+0x22/0x30
The buggy address belongs to the object at ffff888976250000
which belongs to the cache kmalloc-2k of size 2048
The buggy address is located 340 bytes inside of
2048-byte region [ffff888976250000, ffff888976250800)
The buggy address belongs to the physical page:
page:00000000ae18262a refcount:2 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x976250
head:00000000ae18262a order:3 compound_mapcount:0 compound_pincount:0
flags: 0x2ffff800010200(slab|head|node=0|zone=2|lastcpupid=0x1ffff)
raw: 002ffff800010200 0000000000000000 dead000000000122 ffff88810004cf00
raw: 0000000000000000 0000000080080008 00000002ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888976250000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff888976250080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff888976250100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff888976250180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff888976250200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Fixes: 718a18a0c8a6 ("veth: Rework veth_xdp_rcv_skb in order to accept non-linear skb")
Signed-off-by: Shawn Bohrer <[email protected]>
Acked-by: Lorenzo Bianconi <[email protected]>
Acked-by: Toshiaki Makita <[email protected]>
Acked-by: Toke Høiland-Jørgensen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The maximum VMID for assign_mem is 63. Use a u64 to represent this
bitmap instead of architecture-dependent "unsigned int" which varies in
size on 32-bit and 64-bit platforms.
Acked-by: Kalle Valo <[email protected]> (ath10k)
Tested-by: Gokul krishna Krishnakumar <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
Reviewed-by: Bjorn Andersson <[email protected]>
Signed-off-by: Bjorn Andersson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|