aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2023-04-21pds_core: add devcmd device interfacesShannon Nelson2-0/+193
The devcmd interface is the basic connection to the device through the PCI BAR for low level identification and command services. This does the early device initialization and finds the identity data, and adds devcmd routines to be used by later driver bits. Signed-off-by: Shannon Nelson <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21pds_core: initial framework for pds_core PF driverShannon Nelson2-0/+585
This is the initial PCI driver framework for the new pds_core device driver and its family of devices. This does the very basics of registering for the new PF PCI device 1dd8:100c, setting up debugfs entries, and registering with devlink. Signed-off-by: Shannon Nelson <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21bridge: Allow setting per-{Port, VLAN} neighbor suppression stateIdo Schimmel1-0/+1
Add a new bridge port attribute that allows user space to enable per-{Port, VLAN} neighbor suppression. Example: # bridge -d -j -p link show dev swp1 | jq '.[]["neigh_vlan_suppress"]' false # bridge link set dev swp1 neigh_vlan_suppress on # bridge -d -j -p link show dev swp1 | jq '.[]["neigh_vlan_suppress"]' true # bridge link set dev swp1 neigh_vlan_suppress off # bridge -d -j -p link show dev swp1 | jq '.[]["neigh_vlan_suppress"]' false Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21bridge: vlan: Allow setting VLAN neighbor suppression stateIdo Schimmel1-0/+1
Add a new VLAN attribute that allows user space to set the neighbor suppression state of the port VLAN. Example: # bridge -d -j -p vlan show dev swp1 vid 10 | jq '.[]["vlans"][]["neigh_suppress"]' false # bridge vlan set vid 10 dev swp1 neigh_suppress on # bridge -d -j -p vlan show dev swp1 vid 10 | jq '.[]["vlans"][]["neigh_suppress"]' true # bridge vlan set vid 10 dev swp1 neigh_suppress off # bridge -d -j -p vlan show dev swp1 vid 10 | jq '.[]["vlans"][]["neigh_suppress"]' false # bridge vlan set vid 10 dev br0 neigh_suppress on Error: bridge: Can't set neigh_suppress for non-port vlans. Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21bridge: Add internal flags for per-{Port, VLAN} neighbor suppressionIdo Schimmel1-0/+1
Add two internal flags that will be used to enable / disable per-{Port, VLAN} neighbor suppression: 1. 'BR_NEIGH_VLAN_SUPPRESS': A per-port flag used to indicate that per-{Port, VLAN} neighbor suppression is enabled on the bridge port. When set, 'BR_NEIGH_SUPPRESS' has no effect. 2. 'BR_VLFLAG_NEIGH_SUPPRESS_ENABLED': A per-VLAN flag used to indicate that neighbor suppression is enabled on the given VLAN. Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21sctp: delete the nested flexible array payloadXin Long1-1/+1
This patch deletes the flexible-array payload[] from the structure sctp_datahdr to avoid some sparse warnings: # make C=2 CF="-Wflexible-array-nested" M=./net/sctp/ net/sctp/socket.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h): ./include/linux/sctp.h:230:29: warning: nested flexible array This member is not even used anywhere. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21sctp: delete the nested flexible array hmacXin Long1-1/+1
This patch deletes the flexible-array hmac[] from the structure sctp_authhdr to avoid some sparse warnings: # make C=2 CF="-Wflexible-array-nested" M=./net/sctp/ net/sctp/auth.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h): ./include/linux/sctp.h:735:29: warning: nested flexible array Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21sctp: delete the nested flexible array peer_initXin Long1-1/+1
This patch deletes the flexible-array peer_init[] from the structure sctp_cookie to avoid some sparse warnings: # make C=2 CF="-Wflexible-array-nested" M=./net/sctp/ net/sctp/sm_make_chunk.c: note: in included file (through include/net/sctp/sctp.h): ./include/net/sctp/structs.h:1588:28: warning: nested flexible array ./include/net/sctp/structs.h:343:28: warning: nested flexible array Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21sctp: delete the nested flexible array variableXin Long1-2/+2
This patch deletes the flexible-array variable[] from the structure sctp_sackhdr and sctp_errhdr to avoid some sparse warnings: # make C=2 CF="-Wflexible-array-nested" M=./net/sctp/ net/sctp/sm_statefuns.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h): ./include/linux/sctp.h:451:28: warning: nested flexible array ./include/linux/sctp.h:393:29: warning: nested flexible array Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21sctp: delete the nested flexible array skipXin Long2-4/+4
This patch deletes the flexible-array skip[] from the structure sctp_ifwdtsn/fwdtsn_hdr to avoid some sparse warnings: # make C=2 CF="-Wflexible-array-nested" M=./net/sctp/ net/sctp/stream_interleave.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h): ./include/linux/sctp.h:611:32: warning: nested flexible array ./include/linux/sctp.h:628:33: warning: nested flexible array Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-21sctp: delete the nested flexible array paramsXin Long2-7/+7
This patch deletes the flexible-array params[] from the structure sctp_inithdr, sctp_addiphdr and sctp_reconf_chunk to avoid some sparse warnings: # make C=2 CF="-Wflexible-array-nested" M=./net/sctp/ net/sctp/input.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h): ./include/linux/sctp.h:278:29: warning: nested flexible array ./include/linux/sctp.h:675:30: warning: nested flexible array This warning is reported if a structure having a flexible array member is included by other structures. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-20mac80211: use the new drop reasons infrastructureJohannes Berg1-0/+12
It can be really hard to analyse or debug why packets are going missing in mac80211, so add the needed infrastructure to use use the new per-subsystem drop reasons. We actually use two drop reason subsystems here because of the different handling of frames that are dropped but still go to monitor for old versions of hostapd, and those that are just completely unusable (e.g. crypto failed.) Annotate a few reasons here just to illustrate this, we'll need to go through and annotate more of them later. Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-20net: extend drop reasons for multiple subsystemsJohannes Berg2-4/+41
Extend drop reasons to make them usable by subsystems other than core by reserving the high 16 bits for a new subsystem ID, of which 0 of course is used for the existing reasons immediately. To still be able to have string reasons, restructure that code a bit to make the loopup under RCU, the only user of this (right now) is drop_monitor. Link: https://lore.kernel.org/netdev/[email protected] Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-20net: move dropreason.h to dropreason-core.hJohannes Berg4-6/+7
This will, after the next patch, hold only the core drop reasons and minimal infrastructure. Fix a small kernel-doc issue while at it, to avoid the move triggering a checker. Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-20ipv6: add icmpv6_error_anycast_as_unicast for ICMPv6Mahesh Bandewar1-0/+1
ICMPv6 error packets are not sent to the anycast destinations and this prevents things like traceroute from working. So create a setting similar to ECHO when dealing with Anycast sources (icmpv6_echo_ignore_anycast). Signed-off-by: Mahesh Bandewar <[email protected]> Reviewed-by: David Ahern <[email protected]> Reviewed-by: Maciej Żenczykowski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-20net: skbuff: update and rename __kfree_skb_defer()Jakub Kicinski1-1/+1
__kfree_skb_defer() uses the old naming where "defer" meant slab bulk free/alloc APIs. In the meantime we also made __kfree_skb_defer() feed the per-NAPI skb cache, which implies bulk APIs. So take away the 'defer' and add 'napi'. While at it add a drop reason. This only matters on the tx_action path, if the skb has a frag_list. But getting rid of a SKB_DROP_REASON_NOT_SPECIFIED seems like a net benefit so why not. Reviewed-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-20flow_dissector: Address kdoc warningsSimon Horman1-18/+20
Address a number of warnings flagged by ./scripts/kernel-doc -none include/net/flow_dissector.h include/net/flow_dissector.h:23: warning: Function parameter or member 'addr_type' not described in 'flow_dissector_key_control' include/net/flow_dissector.h:23: warning: Function parameter or member 'flags' not described in 'flow_dissector_key_control' include/net/flow_dissector.h:46: warning: Function parameter or member 'padding' not described in 'flow_dissector_key_basic' include/net/flow_dissector.h:145: warning: Function parameter or member 'tipckey' not described in 'flow_dissector_key_addrs' include/net/flow_dissector.h:157: warning: cannot understand function prototype: 'struct flow_dissector_key_arp ' include/net/flow_dissector.h:171: warning: cannot understand function prototype: 'struct flow_dissector_key_ports ' include/net/flow_dissector.h:203: warning: cannot understand function prototype: 'struct flow_dissector_key_icmp ' Also improve indentation on adjacent lines to those changed to address the above. No functional changes intended. Signed-off-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-20page_pool: unlink from napi during destroyJakub Kicinski1-0/+5
Jesper points out that we must prevent recycling into cache after page_pool_destroy() is called, because page_pool_destroy() is not synchronized with recycling (some pages may still be outstanding when destroy() gets called). I assumed this will not happen because NAPI can't be scheduled if its page pool is being destroyed. But I missed the fact that NAPI may get reused. For instance when user changes ring configuration driver may allocate a new page pool, stop NAPI, swap, start NAPI, and then destroy the old pool. The NAPI is running so old page pool will think it can recycle to the cache, but the consumer at that point is the destroy() path, not NAPI. To avoid extra synchronization let the drivers do "unlinking" during the "swap" stage while NAPI is indeed disabled. Fixes: 8c48eea3adf3 ("page_pool: allow caching from safely localized NAPI") Reported-by: Jesper Dangaard Brouer <[email protected]> Link: https://lore.kernel.org/all/[email protected]/ Acked-by: Jesper Dangaard Brouer <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski4-25/+28
Adjacent changes: net/mptcp/protocol.h 63740448a32e ("mptcp: fix accept vs worker race") 2a6a870e44dd ("mptcp: stops worker on unaccepted sockets at listener close") ddb1a072f858 ("mptcp: move first subflow allocation at mpc access time") Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-20Merge tag 'net-6.3-rc8' of ↵Linus Torvalds3-7/+7
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from netfilter and bpf. There are a few fixes for new code bugs, including the Mellanox one noted in the last networking pull. No known regressions outstanding. Current release - regressions: - sched: clear actions pointer in miss cookie init fail - mptcp: fix accept vs worker race - bpf: fix bpf_arch_text_poke() with new_addr == NULL on s390 - eth: bnxt_en: fix a possible NULL pointer dereference in unload path - eth: veth: take into account peer device for NETDEV_XDP_ACT_NDO_XMIT xdp_features flag Current release - new code bugs: - eth: revert "net/mlx5: Enable management PF initialization" Previous releases - regressions: - netfilter: fix recent physdev match breakage - bpf: fix incorrect verifier pruning due to missing register precision taints - eth: virtio_net: fix overflow inside xdp_linearize_page() - eth: cxgb4: fix use after free bugs caused by circular dependency problem - eth: mlxsw: pci: fix possible crash during initialization Previous releases - always broken: - sched: sch_qfq: prevent slab-out-of-bounds in qfq_activate_agg - netfilter: validate catch-all set elements - bridge: don't notify FDB entries with "master dynamic" - eth: bonding: fix memory leak when changing bond type to ethernet - eth: i40e: fix accessing vsi->active_filters without holding lock Misc: - Mat is back as MPTCP co-maintainer" * tag 'net-6.3-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (33 commits) net: bridge: switchdev: don't notify FDB entries with "master dynamic" Revert "net/mlx5: Enable management PF initialization" MAINTAINERS: Resume MPTCP co-maintainer role mailmap: add entries for Mat Martineau e1000e: Disable TSO on i219-LM card to increase speed bnxt_en: fix free-runnig PHC mode net: dsa: microchip: ksz8795: Correctly handle huge frame configuration bpf: Fix incorrect verifier pruning due to missing register precision taints hamradio: drop ISA_DMA_API dependency mlxsw: pci: Fix possible crash during initialization mptcp: fix accept vs worker race mptcp: stops worker on unaccepted sockets at listener close net: rpl: fix rpl header size calculation net: vmxnet3: Fix NULL pointer dereference in vmxnet3_rq_rx_complete() bonding: Fix memory leak when changing bond type to Ethernet veth: take into account peer device for NETDEV_XDP_ACT_NDO_XMIT xdp_features flag mlxfw: fix null-ptr-deref in mlxfw_mfa2_tlv_next() bnxt_en: Fix a possible NULL pointer dereference in unload path bnxt_en: Do not initialize PTP on older P3/P4 chips netfilter: nf_tables: tighten netlink attribute requirements for catch-all elements ...
2023-04-19Revert "net/mlx5: Enable management PF initialization"Jakub Kicinski1-5/+0
This reverts commit fe998a3c77b9f989a30a2a01fb00d3729a6d53a4. Paul reports that it causes a regression with IB on CX4 and FW 12.18.1000. In addition I think that the concept of "management PF" is not fully accepted and requires a discussion. Fixes: fe998a3c77b9 ("net/mlx5: Enable management PF initialization") Reported-by: Paul Moore <[email protected]> Link: https://lore.kernel.org/all/CAHC9VhQ7A4+msL38WpbOMYjAqLp0EtOjeLh4Dc6SQtD6OUvCQg@mail.gmail.com/ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-19net/handshake: Add a kernel API for requesting a TLSv1.3 handshakeChuck Lever2-0/+45
To enable kernel consumers of TLS to request a TLS handshake, add support to net/handshake/ to request a handshake upcall. This patch also acts as a template for adding handshake upcall support for other kernel transport layer security providers. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-19net/handshake: Create a NETLINK service for handling handshake requestsChuck Lever2-0/+230
When a kernel consumer needs a transport layer security session, it first needs a handshake to negotiate and establish a session. This negotiation can be done in user space via one of the several existing library implementations, or it can be done in the kernel. No in-kernel handshake implementations yet exist. In their absence, we add a netlink service that can: a. Notify a user space daemon that a handshake is needed. b. Once notified, the daemon calls the kernel back via this netlink service to get the handshake parameters, including an open socket on which to establish the session. c. Once the handshake is complete, the daemon reports the session status and other information via a second netlink operation. This operation marks that it is safe for the kernel to use the open socket and the security session established there. The notification service uses a multicast group. Each handshake mechanism (eg, tlshd) adopts its own group number so that the handshake services are completely independent of one another. The kernel can then tell via netlink_has_listeners() whether a handshake service is active and prepared to handle a handshake request. A new netlink operation, ACCEPT, acts like accept(2) in that it instantiates a file descriptor in the user space daemon's fd table. If this operation is successful, the reply carries the fd number, which can be treated as an open and ready file descriptor. While user space is performing the handshake, the kernel keeps its muddy paws off the open socket. A second new netlink operation, DONE, indicates that the user space daemon is finished with the socket and it is safe for the kernel to use again. The operation also indicates whether a session was established successfully. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-19Merge tag 'mm-hotfixes-stable-2023-04-19-16-36' of ↵Linus Torvalds1-18/+21
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "22 hotfixes. 19 are cc:stable and the remainder address issues which were introduced during this merge cycle, or aren't considered suitable for -stable backporting. 19 are for MM and the remainder are for other subsystems" * tag 'mm-hotfixes-stable-2023-04-19-16-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (22 commits) nilfs2: initialize unused bytes in segment summary blocks mm: page_alloc: skip regions with hugetlbfs pages when allocating 1G pages mm/mmap: regression fix for unmapped_area{_topdown} maple_tree: fix mas_empty_area() search maple_tree: make maple state reusable after mas_empty_area_rev() mm: kmsan: handle alloc failures in kmsan_ioremap_page_range() mm: kmsan: handle alloc failures in kmsan_vmap_pages_range_noflush() tools/Makefile: do missed s/vm/mm/ mm: fix memory leak on mm_init error handling mm/page_alloc: fix potential deadlock on zonelist_update_seq seqlock kernel/sys.c: fix and improve control flow in __sys_setres[ug]id() Revert "userfaultfd: don't fail on unrecognized features" writeback, cgroup: fix null-ptr-deref write in bdi_split_work_to_wbs maple_tree: fix a potential memory leak, OOB access, or other unpredictable bug tools/mm/page_owner_sort.c: fix TGID output when cull=tg is used mailmap: update jtoppins' entry to reference correct email mm/mempolicy: fix use-after-free of VMA iterator mm/huge_memory.c: warn with pr_warn_ratelimited instead of VM_WARN_ON_ONCE_FOLIO mm/mprotect: fix do_mprotect_pkey() return on error mm/khugepaged: check again on anon uffd-wp during isolation ...
2023-04-19net: skbuff: hide nf_trace and ipvs_propertyJakub Kicinski1-0/+4
Accesses to nf_trace and ipvs_property are already wrapped by ifdefs where necessary. Don't allocate the bits for those fields at all if possible. Acked-by: Florian Westphal <[email protected]> Acked-by: Simon Horman <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-19net: skbuff: push nf_trace down the bitfieldJakub Kicinski1-3/+3
nf_trace is a debug feature, AFAIU, and yet it sits oddly high in the sk_buff bitfield. Move it down, pushing up dst_pending_confirm and inner_protocol_type. Next change will make nf_trace optional (under Kconfig) and all optional fields should be placed after 2b fields to avoid 2b fields straddling bytes. dst_pending_confirm is L3, so it makes sense next to ignore_df. inner_protocol_type goes up just to keep the balance. Acked-by: Florian Westphal <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-19net: skbuff: move alloc_cpu into a potential holeJakub Kicinski1-1/+2
alloc_cpu is currently between 4 byte fields, so it's almost guaranteed to create a 2B hole. It has a knock on effect of creating a 4B hole after @end (and @end and @tail being in different cachelines). None of this matters hugely, but for kernel configs which don't enable all the features there may well be a 2B hole after the bitfield. Move alloc_cpu there. Reviewed-by: Florian Fainelli <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-19net: skbuff: hide csum_not_inet when CONFIG_IP_SCTP not setJakub Kicinski1-0/+14
SCTP is not universally deployed, allow hiding its bit from the skb. Reviewed-by: Florian Fainelli <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-19net: skbuff: hide wifi_acked when CONFIG_WIRELESS not setJakub Kicinski2-1/+12
Datacenter kernel builds will very likely not include WIRELESS, so let them shave 2 bits off the skb by hiding the wifi fields. Reviewed-by: Florian Fainelli <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]> Acked-by: Johannes Berg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-19net: phy: phy_device: Call into the PHY driver to set LED blinkingAndrew Lunn1-0/+12
Linux LEDs can be requested to perform hardware accelerated blinking. Pass this to the PHY driver, if it implements the op. Signed-off-by: Andrew Lunn <[email protected]> Signed-off-by: Christian Marangi <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-19net: phy: phy_device: Call into the PHY driver to set LED brightnessAndrew Lunn1-0/+13
Linux LEDs can be software controlled via the brightness file in /sys. LED drivers need to implement a brightness_set function which the core will call. Implement an intermediary in phy_device, which will call into the phy driver if it implements the necessary function. Signed-off-by: Andrew Lunn <[email protected]> Signed-off-by: Christian Marangi <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-19net: phy: Add a binding for PHY LEDsAndrew Lunn1-0/+16
Define common binding parsing for all PHY drivers with LEDs using phylib. Parse the DT as part of the phy_probe and add LEDs to the linux LED class infrastructure. For the moment, provide a dummy brightness function, which will later be replaced with a call into the PHY driver. This allows testing since the LED core might otherwise reject an LED whose brightness cannot be set. Add a dependency on LED_CLASS. It either needs to be built in, or not enabled, since a modular build can result in linker errors. Signed-off-by: Andrew Lunn <[email protected]> Signed-off-by: Christian Marangi <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-19leds: Provide stubs for when CLASS_LED & NEW_LEDS are disabledAndrew Lunn1-0/+18
Provide stubs for devm_led_classdev_register_ext() and led_init_default_state_get() so that LED drivers embedded within other drivers such as PHYs and Ethernet switches still build when LEDS_CLASS or NEW_LEDS are disabled. This also helps with Kconfig dependencies, which are somewhat hairy for phylib and mdio and only get worse when adding a dependency on LED_CLASS. Signed-off-by: Andrew Lunn <[email protected]> Signed-off-by: Christian Marangi <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-18bonding: add software tx timestamping supportHangbin Liu1-0/+1
Currently, bonding only obtain the timestamp (ts) information of the active slave, which is available only for modes 1, 5, and 6. For other modes, bonding only has software rx timestamping support. However, some users who use modes such as LACP also want tx timestamp support. To address this issue, let's check the ts information of each slave. If all slaves support tx timestamping, we can enable tx timestamping support for the bond. Add a note that the get_ts_info may be called with RCU, or rtnl or reference on the device in ethtool.h> Suggested-by: Miroslav Lichvar <[email protected]> Signed-off-by: Hangbin Liu <[email protected]> Acked-by: Jay Vosburgh <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfJakub Kicinski2-2/+7
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Unbreak br_netfilter physdev match support, from Florian Westphal. 2) Use GFP_KERNEL_ACCOUNT for stateful/policy objects, from Chen Aotian. 3) Use IS_ENABLED() in nf_reset_trace(), from Florian Westphal. 4) Fix validation of catch-all set element. 5) Tighten requirements for catch-all set elements. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables: tighten netlink attribute requirements for catch-all elements netfilter: nf_tables: validate catch-all set elements netfilter: nf_tables: fix ifdef to also consider nf_tables=m netfilter: nf_tables: Modify nla_memdup's flag to GFP_KERNEL_ACCOUNT netfilter: br_netfilter: fix recent physdev match breakage ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-18mm: kmsan: handle alloc failures in kmsan_ioremap_page_range()Alexander Potapenko1-9/+10
Similarly to kmsan_vmap_pages_range_noflush(), kmsan_ioremap_page_range() must also properly handle allocation/mapping failures. In the case of such, it must clean up the already created metadata mappings and return an error code, so that the error can be propagated to ioremap_page_range(). Without doing so, KMSAN may silently fail to bring the metadata for the page range into a consistent state, which will result in user-visible crashes when trying to access them. Link: https://lkml.kernel.org/r/[email protected] Fixes: b073d7f8aee4 ("mm: kmsan: maintain KMSAN metadata for page operations") Signed-off-by: Alexander Potapenko <[email protected]> Reported-by: Dipanjan Das <[email protected]> Link: https://lore.kernel.org/linux-mm/CANX2M5ZRrRA64k0hOif02TjmY9kbbO2aCBPyq79es34RXZ=cAw@mail.gmail.com/ Reviewed-by: Marco Elver <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-18mm: kmsan: handle alloc failures in kmsan_vmap_pages_range_noflush()Alexander Potapenko1-9/+11
As reported by Dipanjan Das, when KMSAN is used together with kernel fault injection (or, generally, even without the latter), calls to kcalloc() or __vmap_pages_range_noflush() may fail, leaving the metadata mappings for the virtual mapping in an inconsistent state. When these metadata mappings are accessed later, the kernel crashes. To address the problem, we return a non-zero error code from kmsan_vmap_pages_range_noflush() in the case of any allocation/mapping failure inside it, and make vmap_pages_range_noflush() return an error if KMSAN fails to allocate the metadata. This patch also removes KMSAN_WARN_ON() from vmap_pages_range_noflush(), as these allocation failures are not fatal anymore. Link: https://lkml.kernel.org/r/[email protected] Fixes: b073d7f8aee4 ("mm: kmsan: maintain KMSAN metadata for page operations") Signed-off-by: Alexander Potapenko <[email protected]> Reported-by: Dipanjan Das <[email protected]> Link: https://lore.kernel.org/linux-mm/CANX2M5ZRrRA64k0hOif02TjmY9kbbO2aCBPyq79es34RXZ=cAw@mail.gmail.com/ Reviewed-by: Marco Elver <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Uladzislau Rezki (Sony) <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-04-18net: add macro netif_subqueue_completed_wakeHeiner Kallweit1-0/+10
Add netif_subqueue_completed_wake, complementing the subqueue versions netif_subqueue_try_stop and netif_subqueue_maybe_stop. Reviewed-by: Jacob Keller <[email protected]> Signed-off-by: Heiner Kallweit <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-04-18netfilter: nf_tables: validate catch-all set elementsPablo Neira Ayuso1-0/+4
catch-all set element might jump/goto to chain that uses expressions that require validation. Fixes: aaa31047a6d2 ("netfilter: nftables: add catch-all set element support") Signed-off-by: Pablo Neira Ayuso <[email protected]>
2023-04-17net: mscc: ocelot: add support for preemptible traffic classesVladimir Oltean1-0/+3
In order to not transmit (preemptible) frames which will be received by the link partner as corrupted (because it doesn't support FP), the hardware requires the driver to program the QSYS_PREEMPTION_CFG_P_QUEUES register only after the MAC Merge layer becomes active (verification succeeds, or was disabled). There are some cases when FP is known (through experimentation) to be broken. Give priority to FP over cut-through switching, and disable FP for known broken link modes. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net: mscc: ocelot: add support for mqprio offloadVladimir Oltean1-0/+4
This doesn't apply anything to hardware and in general doesn't do anything that the software variant doesn't do, except for checking that there isn't more than 1 TXQ per TC (TXQs for a DSA switch are a dubious concept anyway). The reason we add this is to be able to parse one more field added to struct tc_mqprio_qopt_offload, namely preemptible_tcs. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Ferenc Fejes <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net: mscc: ocelot: optimize ocelot_mm_irq()Vladimir Oltean1-0/+1
The MAC Merge IRQ of all ports is shared with the PTP TX timestamp IRQ of all ports, which means that currently, when a PTP TX timestamp is generated, felix_irq_handler() also polls for the MAC Merge layer status of all ports, looking for changes. This makes the kernel do more work, and under certain circumstances may make ptp4l require a tx_timestamp_timeout argument higher than before. Changes to the MAC Merge layer status are only to be expected under certain conditions - its TX direction needs to be enabled - so we can check early if that is the case, and omit register access otherwise. Make ocelot_mm_update_port_status() skip register access if mm->tx_enabled is unset, and also call it once more, outside IRQ context, from ocelot_port_set_mm(), when mm->tx_enabled transitions from true to false, because an IRQ is also expected in that case. Also, a port may have its MAC Merge layer enabled but it may not have generated the interrupt. In that case, there's no point in writing to DEV_MM_STATUS to acknowledge that IRQ. We can reduce the number of register writes per port with MM enabled by keeping an "ack" variable which writes the "write-one-to-clear" bits. Those are 3 in number: PRMPT_ACTIVE_STICKY, UNEXP_RX_PFRM_STICKY and UNEXP_TX_PFRM_STICKY. The other fields in DEV_MM_STATUS are read-only and it doesn't matter what is written to them, so writing zero is just fine. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net: mscc: ocelot: remove struct ocelot_mm_state :: lockVladimir Oltean1-1/+0
Unfortunately, the workarounds for the hardware bugs make it pointless to keep fine-grained locking for the MAC Merge state of each port. Our vsc9959_cut_through_fwd() implementation requires ocelot->fwd_domain_lock to be held, in order to serialize with changes to the bridging domains and to port speed changes (which affect which ports can be cut-through). Simultaneously, the traffic classes which can be cut-through cannot be preemptible at the same time, and this will depend on the MAC Merge layer state (which changes from threaded interrupt context). Since vsc9959_cut_through_fwd() would have to hold the mm->lock of all ports for a correct and race-free implementation with respect to ocelot_mm_irq(), in practice it means that any time a port's mm->lock is held, it would potentially block holders of ocelot->fwd_domain_lock. In the interest of simple locking rules, make all MAC Merge layer state changes (and preemptible traffic class changes) be serialized by the ocelot->fwd_domain_lock. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net: mscc: ocelot: export a single ocelot_mm_irq()Vladimir Oltean1-1/+1
When the switch emits an IRQ, we don't know what caused it, and we iterate through all ports to check the MAC Merge status. Move that iteration inside the ocelot lib; we will change the locking in a future change and it would be good to encapsulate that lock completely within the ocelot lib. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17net/mlx5e: Add IPsec packet offload tunnel bitsLeon Romanovsky1-2/+6
Extend packet reformat types and flow table capabilities with IPsec packet offload tunnel bits. Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Sridhar Samudrala <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-17netfilter: nf_tables: fix ifdef to also consider nf_tables=mFlorian Westphal1-2/+2
nftables can be built as a module, so fix the preprocessor conditional accordingly. Fixes: 478b360a47b7 ("netfilter: nf_tables: fix nf_trace always-on with XT_TRACE=n") Reported-by: Florian Fainelli <[email protected]> Reported-by: Jakub Kicinski <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2023-04-17sctp: delete the obsolete code for the host name address paramXin Long1-1/+0
In the latest RFC9260, the Host Name Address param has been deprecated. For INIT chunk: Note 3: An INIT chunk MUST NOT contain the Host Name Address parameter. The receiver of an INIT chunk containing a Host Name Address parameter MUST send an ABORT chunk and MAY include an "Unresolvable Address" error cause. For Supported Address Types: The value indicating the Host Name Address parameter MUST NOT be used when sending this parameter and MUST be ignored when receiving this parameter. Currently Linux SCTP doesn't really support Host Name Address param, but only saves some flag and print debug info, which actually won't even be triggered due to the verification in sctp_verify_param(). This patch is to delete those dead code. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-14page_pool: allow caching from safely localized NAPIJakub Kicinski3-8/+18
Recent patches to mlx5 mentioned a regression when moving from driver local page pool to only using the generic page pool code. Page pool has two recycling paths (1) direct one, which runs in safe NAPI context (basically consumer context, so producing can be lockless); and (2) via a ptr_ring, which takes a spin lock because the freeing can happen from any CPU; producer and consumer may run concurrently. Since the page pool code was added, Eric introduced a revised version of deferred skb freeing. TCP skbs are now usually returned to the CPU which allocated them, and freed in softirq context. This places the freeing (producing of pages back to the pool) enticingly close to the allocation (consumer). If we can prove that we're freeing in the same softirq context in which the consumer NAPI will run - lockless use of the cache is perfectly fine, no need for the lock. Let drivers link the page pool to a NAPI instance. If the NAPI instance is scheduled on the same CPU on which we're freeing - place the pages in the direct cache. With that and patched bnxt (XDP enabled to engage the page pool, sigh, bnxt really needs page pool work :() I see a 2.6% perf boost with a TCP stream test (app on a different physical core than softirq). The CPU use of relevant functions decreases as expected: page_pool_refill_alloc_cache 1.17% -> 0% _raw_spin_lock 2.41% -> 0.98% Only consider lockless path to be safe when NAPI is scheduled - in practice this should cover majority if not all of steady state workloads. It's usually the NAPI kicking in that causes the skb flush. The main case we'll miss out on is when application runs on the same CPU as NAPI. In that case we don't use the deferred skb free path. Reviewed-by: Tariq Toukan <[email protected]> Acked-by: Jesper Dangaard Brouer <[email protected]> Tested-by: Dragos Tatulea <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-04-14net: mana: Add support for jumbo frameHaiyang Zhang2-0/+18
During probe, get the hardware-allowed max MTU by querying the device configuration. Users can select MTU up to the device limit. When XDP is in use, limit MTU settings so the buffer size is within one page. And, when MTU is set to a too large value, XDP is not allowed to run. Also, to prevent changing MTU fails, and leaves the NIC in a bad state, pre-allocate all buffers before starting the change. So in low memory condition, it will return error, without affecting the NIC. Signed-off-by: Haiyang Zhang <[email protected]> Reviewed-by: Jesse Brandeburg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-04-14net: mana: Enable RX path to handle various MTU sizesHaiyang Zhang1-0/+7
Update RX data path to allocate and use RX queue DMA buffers with proper size based on potentially various MTU sizes. Signed-off-by: Haiyang Zhang <[email protected]> Reviewed-by: Jesse Brandeburg <[email protected]> Signed-off-by: David S. Miller <[email protected]>