author | Linus Torvalds <torvalds@linux-foundation.org> | 2022-12-13 15:47:48 -0800 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2022-12-13 15:47:48 -0800 |
commit | 7e68dd7d07a28faa2e6574dd6b9dbd90cdeaae91 (patch) | |
tree | ae0427c5a3b905f24b3a44b510a9bcf35d9b67a3 /net/core/rtnetlink.c | |
parent | 1ca06f1c1acecbe02124f14a37cce347b8c1a90c (diff) | |
parent | 7c4a6309e27f411743817fe74a832ec2d2798a4b (diff) |
Merge tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
"Core:
- Allow live renaming when an interface is up
- Add retpoline wrappers for tc, considerably improving the
performance of complex queue discipline configurations
- Add inet drop monitor support
- A few GRO performance improvements
- Add infrastructure for atomic dev stats, addressing long standing
data races
- De-duplicate common code between OVS and conntrack offloading
infrastructure
- A bunch of UBSAN_BOUNDS/FORTIFY_SOURCE improvements
- Netfilter: introduce packet parser for tunneled packets
- Replace IPVS timer-based estimators with kthreads to scale up the
workload with the number of available CPUs
- Add the helper support for connection-tracking OVS offload
BPF:
- Support for user defined BPF objects: the use case is to allocate
own objects, build own object hierarchies and use the building
blocks to build own data structures flexibly, for example, linked
lists in BPF
- Make cgroup local storage available to non-cgroup attached BPF
programs
- Avoid unnecessary deadlock detection and failures wrt BPF task
storage helpers
- A relevant bunch of BPF verifier fixes and improvements
- Veristat tool improvements to support custom filtering, sorting,
and replay of results
- Add LLVM disassembler as default library for dumping JITed code
- Lots of new BPF documentation for various BPF maps
- Add bpf_rcu_read_{,un}lock() support for sleepable programs
- Add RCU grace period chaining to BPF to wait for the completion of
access from both sleepable and non-sleepable BPF programs
- Add support for storing struct task_struct objects as kptrs in maps
- Improve helper UAPI by explicitly defining BPF_FUNC_xxx integer
values
- Add libbpf *_opts API-variants for bpf_*_get_fd_by_id() functions
Protocols:
- TCP: implement Protective Load Balancing across switch links
- TCP: allow dynamically disabling TCP-MD5 static key, reverting back
to fast[er]-path
- UDP: Introduce optional per-netns hash lookup table
- IPv6: simplify and cleanup sockets disposal
- Netlink: support different type policies for each generic netlink
operation
- MPTCP: add MSG_FASTOPEN and FastOpen listener side support
- MPTCP: add netlink notification support for listener socket events
- SCTP: add VRF support, allowing SCTP sockets to bind to VRF devices
- Add bridging MAC Authentication Bypass (MAB) support
- Extensions for Ethernet VPN bridging implementation to better
support multicast scenarios
- More work for Wi-Fi 7 support, comprising conversion of all the
existing drivers to internal TX queue usage
- IPSec: introduce a new offload type (packet offload) allowing
complete header processing and crypto offloading
- IPSec: extended ack support for more descriptive XFRM error
reporting
- RXRPC: increase SACK table size and move processing into a
per-local endpoint kernel thread, reducing considerably the
required locking
- IEEE 802154: synchronous send frame and extended filtering support,
initial support for scanning available 15.4 networks
- Tun: bump the link speed from 10Mbps to 10Gbps
- Tun/VirtioNet: implement UDP segmentation offload support
Driver API:
- PHY/SFP: improve power level switching between standard level 1 and
the higher power levels
- New API for netdev <-> devlink_port linkage
- PTP: convert existing drivers to new frequency adjustment
implementation
- DSA: add support for rx offloading
- Autoload DSA tagging driver when dynamically changing protocol
- Add new PCP and APPTRUST attributes to Data Center Bridging
- Add configuration support for 800Gbps link speed
- Add devlink port function attribute to enable/disable RoCE and
migratable
- Extend devlink-rate to support strict priority and weighted fair
queuing
- Add devlink support for directly reading from region memory
- New device tree helper to fetch MAC address from nvmem
- New big TCP helper to simplify temporary header stripping
New hardware / drivers:
- Ethernet:
- Marvell Octeon CNF95N and CN10KB Ethernet Switches
- Marvell Prestera AC5X Ethernet Switch
- WangXun 10 Gigabit NIC
- Motorcomm yt8521 Gigabit Ethernet
- Microchip ksz9563 Gigabit Ethernet Switch
- Microsoft Azure Network Adapter
- Linux Automation 10Base-T1L adapter
- PHY:
- Aquantia AQR112 and AQR412
- Motorcomm YT8531S
- PTP:
- Orolia ART-CARD
- WiFi:
- MediaTek Wi-Fi 7 (802.11be) devices
- RealTek rtw8821cu, rtw8822bu, rtw8822cu and rtw8723du USB
devices
- Bluetooth:
- Broadcom BCM4377/4378/4387 Bluetooth chipsets
- Realtek RTL8852BE and RTL8723DS
- Cypress CYW4373A0 WiFi + Bluetooth combo device
Drivers:
- CAN:
- gs_usb: bus error reporting support
- kvaser_usb: listen only and bus error reporting support
- Ethernet NICs:
- Intel (100G):
- extend action skbedit to RX queue mapping
- implement devlink-rate support
- support direct read from memory
- nVidia/Mellanox (mlx5):
- SW steering improvements, increasing rules update rate
- Support for enhanced events compression
- extend H/W offload packet manipulation capabilities
- implement IPSec packet offload mode
- nVidia/Mellanox (mlx4):
- better big TCP support
- Netronome Ethernet NICs (nfp):
- IPsec offload support
- add support for multicast filter
- Broadcom:
- RSS and PTP support improvements
- AMD/SolarFlare:
- netlink extended ack improvements
- add basic flower matches to offload, and related stats
- Virtual NICs:
- ibmvnic: introduce affinity hint support
- small / embedded:
- FreeScale fec: add initial XDP support
- Marvell mv643xx_eth: support MII/GMII/RGMII modes for Kirkwood
- TI am65-cpsw: add suspend/resume support
- Mediatek MT7986: add RX Wireless Ethernet Dispatch support
- Realtek 8169: enable GRO software interrupt coalescing by
default
- Ethernet high-speed switches:
- Microchip (sparx5):
- add support for Sparx5 TC/flower H/W offload via VCAP
- Mellanox mlxsw:
- add 802.1X and MAC Authentication Bypass offload support
- add ip6gre support
- Embedded Ethernet switches:
- Mediatek (mtk_eth_soc):
- improve PCS implementation, add DSA untag support
- enable flow offload support
- Renesas:
- add rswitch R-Car Gen4 gPTP support
- Microchip (lan966x):
- add full XDP support
- add TC H/W offload via VCAP
- enable PTP on bridge interfaces
- Microchip (ksz8):
- add MTU support for KSZ8 series
- Qualcomm 802.11ax WiFi (ath11k):
- support configuring channel dwell time during scan
- MediaTek WiFi (mt76):
- enable Wireless Ethernet Dispatch (WED) offload support
- add ack signal support
- enable coredump support
- remain_on_channel support
- Intel WiFi (iwlwifi):
- enable Wi-Fi 7 Extremely High Throughput (EHT) PHY capabilities
- 320 MHz channels support
- RealTek WiFi (rtw89):
- new dynamic header firmware format support
- wake-over-WLAN support"
* tag 'net-next-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2002 commits)
ipvs: fix type warning in do_div() on 32 bit
net: lan966x: Remove a useless test in lan966x_ptp_add_trap()
net: ipa: add IPA v4.7 support
dt-bindings: net: qcom,ipa: Add SM6350 compatible
bnxt: Use generic HBH removal helper in tx path
IPv6/GRO: generic helper to remove temporary HBH/jumbo header in driver
selftests: forwarding: Add bridge MDB test
selftests: forwarding: Rename bridge_mdb test
bridge: mcast: Support replacement of MDB port group entries
bridge: mcast: Allow user space to specify MDB entry routing protocol
bridge: mcast: Allow user space to add (*, G) with a source list and filter mode
bridge: mcast: Add support for (*, G) with a source list and filter mode
bridge: mcast: Avoid arming group timer when (S, G) corresponds to a source
bridge: mcast: Add a flag for user installed source entries
bridge: mcast: Expose __br_multicast_del_group_src()
bridge: mcast: Expose br_multicast_new_group_src()
bridge: mcast: Add a centralized error path
bridge: mcast: Place netlink policy before validation functions
bridge: mcast: Split (*, G) and (S, G) addition into different functions
bridge: mcast: Do not derive entry type from its filter mode
...
Diffstat (limited to 'net/core/rtnetlink.c')
-rw-r--r-- | net/core/rtnetlink.c | 90 |
1 files changed, 70 insertions, 20 deletions
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 74864dc46a7e..64289bc98887 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -53,6 +53,7 @@
 #include <net/fib_rules.h>
 #include <net/rtnetlink.h>
 #include <net/net_namespace.h>
+#include <net/devlink.h>
 
 #include "dev.h"
 
@@ -760,7 +761,7 @@ int rtnl_unicast(struct sk_buff *skb, struct net *net, u32 pid)
 EXPORT_SYMBOL(rtnl_unicast);
 
 void rtnl_notify(struct sk_buff *skb, struct net *net, u32 pid, u32 group,
-		 struct nlmsghdr *nlh, gfp_t flags)
+		 const struct nlmsghdr *nlh, gfp_t flags)
 {
 	struct sock *rtnl = net->rtnl;
 
@@ -1038,6 +1039,16 @@ static size_t rtnl_proto_down_size(const struct net_device *dev)
 	return size;
 }
 
+static size_t rtnl_devlink_port_size(const struct net_device *dev)
+{
+	size_t size = nla_total_size(0); /* nest IFLA_DEVLINK_PORT */
+
+	if (dev->devlink_port)
+		size += devlink_nl_port_handle_size(dev->devlink_port);
+
+	return size;
+}
+
 static noinline size_t if_nlmsg_size(const struct net_device *dev,
 				     u32 ext_filter_mask)
 {
@@ -1091,6 +1102,7 @@ static noinline size_t if_nlmsg_size(const struct net_device *dev,
 	       + nla_total_size(4)  /* IFLA_MAX_MTU */
 	       + rtnl_prop_list_size(dev)
 	       + nla_total_size(MAX_ADDR_LEN) /* IFLA_PERM_ADDRESS */
+	       + rtnl_devlink_port_size(dev)
 	       + 0;
 }
 
@@ -1728,6 +1740,30 @@ nla_put_failure:
 	return -EMSGSIZE;
 }
 
+static int rtnl_fill_devlink_port(struct sk_buff *skb,
+				  const struct net_device *dev)
+{
+	struct nlattr *devlink_port_nest;
+	int ret;
+
+	devlink_port_nest = nla_nest_start(skb, IFLA_DEVLINK_PORT);
+	if (!devlink_port_nest)
+		return -EMSGSIZE;
+
+	if (dev->devlink_port) {
+		ret = devlink_nl_port_handle_fill(skb, dev->devlink_port);
+		if (ret < 0)
+			goto nest_cancel;
+	}
+
+	nla_nest_end(skb, devlink_port_nest);
+	return 0;
+
+nest_cancel:
+	nla_nest_cancel(skb, devlink_port_nest);
+	return ret;
+}
+
 static int rtnl_fill_ifinfo(struct sk_buff *skb,
 			    struct net_device *dev, struct net *src_net,
 			    int type, u32 pid, u32 seq, u32 change,
@@ -1865,6 +1901,9 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb,
 			   dev->dev.parent->bus->name))
 		goto nla_put_failure;
 
+	if (rtnl_fill_devlink_port(skb, dev))
+		goto nla_put_failure;
+
 	nlmsg_end(skb, nlh);
 	return 0;
 
@@ -3110,7 +3149,7 @@ static int rtnl_group_dellink(const struct net *net, int group)
 	return 0;
 }
 
-int rtnl_delete_link(struct net_device *dev)
+int rtnl_delete_link(struct net_device *dev, u32 portid, const struct nlmsghdr *nlh)
 {
 	const struct rtnl_link_ops *ops;
 	LIST_HEAD(list_kill);
@@ -3120,7 +3159,7 @@ int rtnl_delete_link(struct net_device *dev)
 		return -EOPNOTSUPP;
 
 	ops->dellink(dev, &list_kill);
-	unregister_netdevice_many(&list_kill);
+	unregister_netdevice_many_notify(&list_kill, portid, nlh);
 
 	return 0;
 }
@@ -3130,6 +3169,7 @@ static int rtnl_dellink(struct sk_buff *skb, struct nlmsghdr *nlh,
 			struct netlink_ext_ack *extack)
 {
 	struct net *net = sock_net(skb->sk);
+	u32 portid = NETLINK_CB(skb).portid;
 	struct net *tgt_net = net;
 	struct net_device *dev = NULL;
 	struct ifinfomsg *ifm;
@@ -3171,7 +3211,7 @@ static int rtnl_dellink(struct sk_buff *skb, struct nlmsghdr *nlh,
 		goto out;
 	}
 
-	err = rtnl_delete_link(dev);
+	err = rtnl_delete_link(dev, portid, nlh);
 
 out:
 	if (netnsid >= 0)
@@ -3180,7 +3220,8 @@ out:
 	return err;
 }
 
-int rtnl_configure_link(struct net_device *dev, const struct ifinfomsg *ifm)
+int rtnl_configure_link(struct net_device *dev, const struct ifinfomsg *ifm,
+			u32 portid, const struct nlmsghdr *nlh)
 {
 	unsigned int old_flags;
 	int err;
@@ -3194,10 +3235,10 @@ int rtnl_configure_link(struct net_device *dev, const struct ifinfomsg *ifm)
 	}
 
 	if (dev->rtnl_link_state == RTNL_LINK_INITIALIZED) {
-		__dev_notify_flags(dev, old_flags, (old_flags ^ dev->flags));
+		__dev_notify_flags(dev, old_flags, (old_flags ^ dev->flags), portid, nlh);
 	} else {
 		dev->rtnl_link_state = RTNL_LINK_INITIALIZED;
-		__dev_notify_flags(dev, old_flags, ~0U);
+		__dev_notify_flags(dev, old_flags, ~0U, portid, nlh);
 	}
 	return 0;
 }
@@ -3311,11 +3352,13 @@ static int rtnl_group_changelink(const struct sk_buff *skb,
 
 static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
 			       const struct rtnl_link_ops *ops,
+			       const struct nlmsghdr *nlh,
 			       struct nlattr **tb, struct nlattr **data,
 			       struct netlink_ext_ack *extack)
 {
 	unsigned char name_assign_type = NET_NAME_USER;
 	struct net *net = sock_net(skb->sk);
+	u32 portid = NETLINK_CB(skb).portid;
 	struct net *dest_net, *link_net;
 	struct net_device *dev;
 	char ifname[IFNAMSIZ];
@@ -3369,7 +3412,7 @@ static int rtnl_newlink_create(struct sk_buff *skb, struct ifinfomsg *ifm,
 		goto out;
 	}
 
-	err = rtnl_configure_link(dev, ifm);
+	err = rtnl_configure_link(dev, ifm, portid, nlh);
 	if (err < 0)
 		goto out_unregister;
 	if (link_net) {
@@ -3578,7 +3621,7 @@ replay:
 		return -EOPNOTSUPP;
 	}
 
-	return rtnl_newlink_create(skb, ifm, ops, tb, data, extack);
+	return rtnl_newlink_create(skb, ifm, ops, nlh, tb, data, extack);
 }
 
 static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
@@ -3896,7 +3939,7 @@ static int rtnl_dump_all(struct sk_buff *skb, struct netlink_callback *cb)
 struct sk_buff *rtmsg_ifinfo_build_skb(int type, struct net_device *dev,
 				       unsigned int change,
 				       u32 event, gfp_t flags, int *new_nsid,
-				       int new_ifindex)
+				       int new_ifindex, u32 portid, u32 seq)
 {
 	struct net *net = dev_net(dev);
 	struct sk_buff *skb;
@@ -3907,7 +3950,7 @@ struct sk_buff *rtmsg_ifinfo_build_skb(int type, struct net_device *dev,
 		goto errout;
 
 	err = rtnl_fill_ifinfo(skb, dev, dev_net(dev),
-			       type, 0, 0, change, 0, 0, event,
+			       type, portid, seq, change, 0, 0, event,
 			       new_nsid, new_ifindex, -1, flags);
 	if (err < 0) {
 		/* -EMSGSIZE implies BUG in if_nlmsg_size() */
@@ -3922,16 +3965,18 @@ errout:
 	return NULL;
 }
 
-void rtmsg_ifinfo_send(struct sk_buff *skb, struct net_device *dev, gfp_t flags)
+void rtmsg_ifinfo_send(struct sk_buff *skb, struct net_device *dev, gfp_t flags,
+		       u32 portid, const struct nlmsghdr *nlh)
 {
 	struct net *net = dev_net(dev);
 
-	rtnl_notify(skb, net, 0, RTNLGRP_LINK, NULL, flags);
+	rtnl_notify(skb, net, portid, RTNLGRP_LINK, nlh, flags);
 }
 
 static void rtmsg_ifinfo_event(int type, struct net_device *dev,
 			       unsigned int change, u32 event,
-			       gfp_t flags, int *new_nsid, int new_ifindex)
+			       gfp_t flags, int *new_nsid, int new_ifindex,
+			       u32 portid, const struct nlmsghdr *nlh)
 {
 	struct sk_buff *skb;
 
@@ -3939,23 +3984,23 @@ static void rtmsg_ifinfo_event(int type, struct net_device *dev,
 		return;
 
 	skb = rtmsg_ifinfo_build_skb(type, dev, change, event, flags, new_nsid,
-				     new_ifindex);
+				     new_ifindex, portid, nlmsg_seq(nlh));
 	if (skb)
-		rtmsg_ifinfo_send(skb, dev, flags);
+		rtmsg_ifinfo_send(skb, dev, flags, portid, nlh);
 }
 
 void rtmsg_ifinfo(int type, struct net_device *dev, unsigned int change,
-		  gfp_t flags)
+		  gfp_t flags, u32 portid, const struct nlmsghdr *nlh)
 {
 	rtmsg_ifinfo_event(type, dev, change, rtnl_get_event(0), flags,
-			   NULL, 0);
+			   NULL, 0, portid, nlh);
 }
 
 void rtmsg_ifinfo_newnet(int type, struct net_device *dev, unsigned int change,
 			 gfp_t flags, int *new_nsid, int new_ifindex)
 {
 	rtmsg_ifinfo_event(type, dev, change, rtnl_get_event(0), flags,
-			   new_nsid, new_ifindex);
+			   new_nsid, new_ifindex, 0, NULL);
 }
 
 static int nlmsg_populate_fdb_fill(struct sk_buff *skb,
@@ -4045,6 +4090,11 @@ int ndo_dflt_fdb_add(struct ndmsg *ndm,
 		return err;
 	}
 
+	if (tb[NDA_FLAGS_EXT]) {
+		netdev_info(dev, "invalid flags given to default FDB implementation\n");
+		return err;
+	}
+
 	if (vid) {
 		netdev_info(dev, "vlans aren't supported yet for dev_uc|mc_add()\n");
 		return err;
@@ -6140,7 +6190,7 @@ static int rtnetlink_event(struct notifier_block *this, unsigned long event, voi
 	case NETDEV_CHANGELOWERSTATE:
 	case NETDEV_CHANGE_TX_QUEUE_LEN:
 		rtmsg_ifinfo_event(RTM_NEWLINK, dev, 0, rtnl_get_event(event),
-				   GFP_KERNEL, NULL, 0);
+				   GFP_KERNEL, NULL, 0, 0, NULL);
 		break;
 	default:
 		break;
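The rtnl_devlink_port_size() / rtnl_fill_devlink_port() pair in the diff above follows the usual netlink rule that the size estimate must cover everything the fill function may emit: the nest header is always counted, and the port handle only when dev->devlink_port is set. The following is a minimal userspace sketch of that accounting, not kernel code. Every sketch_* name is hypothetical, the 4-byte header and 4-byte alignment are written out by hand to mirror the netlink attribute layout, and the port handle is assumed to be a single u32 attribute purely for illustration (what the real devlink_nl_port_handle_size() returns is not shown in this diff).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hand-written stand-ins for the netlink attribute layout:
 * a 4-byte attribute header and 4-byte alignment. */
#define SKETCH_NLA_ALIGNTO 4u
#define SKETCH_NLA_HDRLEN  4u

/* Round len up to the next multiple of the attribute alignment. */
static size_t sketch_nla_align(size_t len)
{
	return (len + SKETCH_NLA_ALIGNTO - 1) & ~(size_t)(SKETCH_NLA_ALIGNTO - 1);
}

/* Space taken by one attribute with the given payload, padding included. */
static size_t sketch_nla_total_size(size_t payload)
{
	return sketch_nla_align(SKETCH_NLA_HDRLEN + payload);
}

/* Hypothetical port and device types, for illustration only. */
struct sketch_port {
	uint32_t index;
};

struct sketch_dev {
	const struct sketch_port *port; /* NULL when no port is linked */
};

/* ASSUMPTION: the handle is modelled as a single u32 attribute. */
static size_t sketch_port_handle_size(const struct sketch_port *port)
{
	(void)port;
	return sketch_nla_total_size(sizeof(uint32_t));
}

/* Same shape as rtnl_devlink_port_size(): the nest header is always
 * reserved, the handle only when a port is attached. */
static size_t sketch_devlink_port_size(const struct sketch_dev *dev)
{
	size_t size;

	assert(dev != NULL);
	size = sketch_nla_total_size(0); /* empty nest header */
	if (dev->port)
		size += sketch_port_handle_size(dev->port);
	return size;
}
```

Under these assumptions a device without a port reserves only the 4-byte nest header, and a device with a port reserves 12 bytes. Because the fill side always opens the nest, the size side has to count the header unconditionally; the existing comment in rtmsg_ifinfo_build_skb() notes that -EMSGSIZE there implies a bug in if_nlmsg_size().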