author | Alexei Starovoitov <[email protected]> | 2023-12-13 16:16:42 -0800 |
---|---|---|
committer | Alexei Starovoitov <[email protected]> | 2023-12-13 16:16:42 -0800 |
commit | ec14325c7339bf1d40fc29bb8a0d2121cfe649aa (patch) | |
tree | 69dbb3887b907d62888bd4cad1781053d55ddd3c | |
parent | 733763285acfe8dffd6e39ad2ed3d1222b32a901 (diff) | |
parent | 4c6612f6100c2d85212865dbd1a5d8a7e391d3cb (diff) |
Merge branch 'xdp-metadata-via-kfuncs-for-ice-vlan-hint'
Larysa Zaremba says:
====================
XDP metadata via kfuncs for ice + VLAN hint
This series introduces XDP hints via kfuncs [0] to the ice driver.
Series brings the following existing hints to the ice driver:
- HW timestamp
- RX hash with type
Series also introduces a VLAN tag with protocol XDP hint, which can now be
accessed by XDP and userspace (AF_XDP) programs. The hints can also be checked
with the xdp_metadata test and the xdp_hw_metadata program.
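A consumer of the VLAN hint typically splits the TCI into its 802.1Q fields. The following is a minimal userspace sketch of that step, assuming the TCI arrives in host byte order (the diff below declares `vlan_tci` as a plain `u16`). The `TCI_*` constants, `struct vlan_fields` and `vlan_tci_decode()` are illustrative names, not kernel symbols.

```c
#include <stdint.h>

/* Local constants for the 802.1Q TCI layout; names are illustrative only. */
#define TCI_VID_MASK  0x0fffu /* bits 11..0: VLAN ID */
#define TCI_DEI_MASK  0x1000u /* bit 12: Drop Eligible Indicator */
#define TCI_PCP_SHIFT 13      /* bits 15..13: Priority Code Point */

struct vlan_fields {
	uint16_t vid;
	uint8_t dei;
	uint8_t pcp;
};

/* Split a host-byte-order TCI into VID, DEI and PCP. */
static struct vlan_fields vlan_tci_decode(uint16_t tci)
{
	struct vlan_fields f;

	f.vid = tci & TCI_VID_MASK;
	f.dei = (tci & TCI_DEI_MASK) ? 1 : 0;
	f.pcp = (uint8_t)(tci >> TCI_PCP_SHIFT);
	return f;
}
```

For example, a TCI of 0xB064 decodes to priority 5, DEI set, VLAN ID 100.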
Impact of these patches on ice performance:
ZC:
* Full hints implementation decreases pps in ZC mode by less than 3%
(64B, rxdrop)
skb (packets with invalid IP, dropped by stack):
* Overall, patchset improves peak performance in skb mode by about 0.5%
[0] https://patchwork.kernel.org/project/netdevbpf/cover/[email protected]/
v7:
https://lore.kernel.org/bpf/[email protected]/
v6:
https://lore.kernel.org/bpf/[email protected]/
Intermediate RFC v2:
https://lore.kernel.org/bpf/[email protected]/
Intermediate RFC v1:
https://lore.kernel.org/bpf/[email protected]/
v5:
https://lore.kernel.org/bpf/[email protected]/
v4:
https://lore.kernel.org/bpf/[email protected]/
v3:
https://lore.kernel.org/bpf/[email protected]/
v2:
https://lore.kernel.org/bpf/[email protected]/
v1:
https://lore.kernel.org/all/[email protected]/
Changes since v7:
* shorten timestamp assignment in ice
* change first argument of ice_fill_rx_descs back to xsk_buff_pool
* fix kernel-doc for ice_run_xdp_zc
* add missing XSK_CHECK_PRIV_TYPE() in ice
* resolved selftests merge conflicts with TX hints
* AF_INET patch adds new packet generation instead of replacing the AF_XDP one
* fix destination port in xdp_metadata
Changes since v6:
* add ability to fill cb of all xdp_buffs in xsk_buff_pool
* place just pointer to packet context in ice_xdp_buff
* add const qualifiers in veth implementation
* generate uapi for VLAN hint
Changes since v5:
* drop checksum hint from the patchset entirely
* Alex's patch that lifts the data_meta size limitation is no longer
required in this patchset, so will be sent separately
* new patch: hide some ice hints code behind a static key
* fix several bugs in ZC mode (ice)
* change argument order in VLAN hint kfunc (tci, proto -> proto, tci)
* cosmetic changes
* analyze performance impact
Changes since v4:
* Drop the concept of partial checksum from the hint design
* Drop the concept of checksum level from the hint design
Changes since v3:
* use XDP_CHECKSUM_VALID_LVL0 + csum_level instead of csum_level + 1
* fix spelling mistakes
* read XDP timestamp unconditionally
* add TO_STR() macro
Changes since v2:
* redesign checksum hint, so now it gives full status
* rename vlan_tag -> vlan_tci, where applicable
* use open_netns() and close_netns() in xdp_metadata
* improve VLAN hint documentation
* replace CFI with DEI
* use VLAN_VID_MASK in xdp_metadata
* make vlan_get_tag() return -ENODATA
* remove unused rx_ptype in ice_xsk.c
* fix ice timestamp code division between patches
Changes since v1:
* directly return RX hash, RX timestamp and RX checksum status
in skb-common functions
* use intermediate enum value for checksum status in ice
* get rid of ring structure dependency in ice kfunc implementation
* make variables const, when possible, in ice implementation
* use -ENODATA instead of -EOPNOTSUPP for driver implementation
* instead of having 2 separate functions for c-tag and s-tag,
use 1 function that outputs both VLAN tag and protocol ID
* improve documentation for introduced hints
* update xdp_metadata selftest to test new hints
* implement new hints in veth, so they can be tested in xdp_metadata
* parse VLAN tag in xdp_hw_metadata
====================
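The changelog above settles on two interface decisions: one function that outputs both the VLAN protocol and the tag (in the order proto, tci), and `-ENODATA` as the driver-side error. The sketch below models that calling convention in plain userspace C. `struct fake_rx_pkt` and `fake_rx_vlan_tag()` are hypothetical stand-ins, not kernel symbols, and the condition that triggers `-ENODATA` here (no stripped tag recorded) is an assumption made for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <arpa/inet.h> /* htons(), ntohs() */

/* Hypothetical per-packet state standing in for what a driver would read
 * from its Rx descriptor. Not a kernel structure.
 */
struct fake_rx_pkt {
	uint16_t stripped_proto; /* host-order ethertype, 0 if nothing stripped */
	uint16_t tci;            /* host-order TCI */
};

/* Report protocol (network byte order) and TCI through one call.
 * Returns 0 on success, -ENODATA when the packet carries no hint.
 */
static int fake_rx_vlan_tag(const struct fake_rx_pkt *pkt,
			    uint16_t *proto_be, uint16_t *tci)
{
	if (!pkt->stripped_proto)
		return -ENODATA;

	*proto_be = htons(pkt->stripped_proto);
	*tci = pkt->tci;
	return 0;
}
```

A caller checks the return value before touching either output, so a packet without a hint is distinguishable from one whose tag happens to be zero.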
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
31 files changed, 850 insertions, 309 deletions
diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
index eef6358ec587..aeec090e1387 100644
--- a/Documentation/netlink/specs/netdev.yaml
+++ b/Documentation/netlink/specs/netdev.yaml
@@ -54,6 +54,10 @@ definitions:
         name: hash
         doc:
           Device is capable of exposing receive packet hash via bpf_xdp_metadata_rx_hash().
+      -
+        name: vlan-tag
+        doc:
+          Device is capable of exposing receive packet VLAN tag via bpf_xdp_metadata_rx_vlan_tag().
   -
     type: flags
     name: xsk-flags
diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
index e3e9420fd817..a6e0ece18be5 100644
--- a/Documentation/networking/xdp-rx-metadata.rst
+++ b/Documentation/networking/xdp-rx-metadata.rst
@@ -20,7 +20,13 @@ Currently, the following kfuncs are supported. In the future, as more
 metadata is supported, this set will grow:
 
 .. kernel-doc:: net/core/xdp.c
-   :identifiers: bpf_xdp_metadata_rx_timestamp bpf_xdp_metadata_rx_hash
+   :identifiers: bpf_xdp_metadata_rx_timestamp
+
+.. kernel-doc:: net/core/xdp.c
+   :identifiers: bpf_xdp_metadata_rx_hash
+
+.. kernel-doc:: net/core/xdp.c
+   :identifiers: bpf_xdp_metadata_rx_vlan_tag
 
 An XDP program can use these kfuncs to read the metadata into stack
 variables for its own consumption. Or, to pass the metadata on to other
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index cd7dcd0fa7f2..9cf4ed3d2885 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -996,4 +996,6 @@ static inline void ice_clear_rdma_cap(struct ice_pf *pf)
 	set_bit(ICE_FLAG_UNPLUG_AUX_DEV, pf->flags);
 	clear_bit(ICE_FLAG_RDMA_ENA, pf->flags);
 }
+
+extern const struct xdp_metadata_ops ice_xdp_md_ops;
 #endif /* _ICE_H_ */
diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c
index 7fa43827a3f0..a040f02a342e 100644
--- a/drivers/net/ethernet/intel/ice/ice_base.c
+++ b/drivers/net/ethernet/intel/ice/ice_base.c
@@ -519,6 +519,19 @@ static int ice_setup_rx_ctx(struct ice_rx_ring *ring)
 	return 0;
 }
 
+static void ice_xsk_pool_fill_cb(struct ice_rx_ring *ring)
+{
+	void *ctx_ptr = &ring->pkt_ctx;
+	struct xsk_cb_desc desc = {};
+
+	XSK_CHECK_PRIV_TYPE(struct ice_xdp_buff);
+	desc.src = &ctx_ptr;
+	desc.off = offsetof(struct ice_xdp_buff, pkt_ctx) -
+		   sizeof(struct xdp_buff);
+	desc.bytes = sizeof(ctx_ptr);
+	xsk_pool_fill_cb(ring->xsk_pool, &desc);
+}
+
 /**
  * ice_vsi_cfg_rxq - Configure an Rx queue
  * @ring: the ring being configured
@@ -553,6 +566,7 @@ int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
 			if (err)
 				return err;
 			xsk_pool_set_rxq_info(ring->xsk_pool, &ring->xdp_rxq);
+			ice_xsk_pool_fill_cb(ring);
 
 			dev_info(dev, "Registered XDP mem model MEM_TYPE_XSK_BUFF_POOL on Rx ring %d\n",
 				 ring->q_index);
@@ -575,6 +589,7 @@ int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
 
 	xdp_init_buff(&ring->xdp, ice_rx_pg_size(ring) / 2, &ring->xdp_rxq);
 	ring->xdp.data = NULL;
+	ring->xdp_ext.pkt_ctx = &ring->pkt_ctx;
 	err = ice_setup_rx_ctx(ring);
 	if (err) {
 		dev_err(dev, "ice_setup_rx_ctx failed for RxQ %d, err %d\n",
diff --git a/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h b/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h
index 89f986a75cc8..d384ddfcb83e 100644
--- a/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h
+++ b/drivers/net/ethernet/intel/ice/ice_lan_tx_rx.h
@@ -673,6 +673,212 @@ struct ice_tlan_ctx {
  *      Use the enum ice_rx_l2_ptype to decode the packet type
  * ENDIF
  */
+#define ICE_PTYPES \
+	/* L2 Packet types */ \
+	ICE_PTT_UNUSED_ENTRY(0), \
+	ICE_PTT(1, L2, NONE, NOF, NONE, NONE, NOF, NONE, PAY2), \
+	ICE_PTT_UNUSED_ENTRY(2), \
+	ICE_PTT_UNUSED_ENTRY(3), \
+	ICE_PTT_UNUSED_ENTRY(4), \
+	ICE_PTT_UNUSED_ENTRY(5), \
+	ICE_PTT(6, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE), \
+	ICE_PTT(7, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE), \
+	ICE_PTT_UNUSED_ENTRY(8), \
+	ICE_PTT_UNUSED_ENTRY(9), \
+	ICE_PTT(10, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE), \
+	ICE_PTT(11, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE), \
+	ICE_PTT_UNUSED_ENTRY(12), \
+	ICE_PTT_UNUSED_ENTRY(13), \
+	ICE_PTT_UNUSED_ENTRY(14), \
+	ICE_PTT_UNUSED_ENTRY(15), \
+	ICE_PTT_UNUSED_ENTRY(16), \
+	ICE_PTT_UNUSED_ENTRY(17), \
+	ICE_PTT_UNUSED_ENTRY(18), \
+	ICE_PTT_UNUSED_ENTRY(19), \
+	ICE_PTT_UNUSED_ENTRY(20), \
+	ICE_PTT_UNUSED_ENTRY(21), \
+	\
+	/* Non Tunneled IPv4 */ \
+	ICE_PTT(22, IP, IPV4, FRG, NONE, NONE, NOF, NONE, PAY3), \
+	ICE_PTT(23, IP, IPV4, NOF, NONE, NONE, NOF, NONE, PAY3), \
+	ICE_PTT(24, IP, IPV4, NOF, NONE, NONE, NOF, UDP, PAY4), \
+	ICE_PTT_UNUSED_ENTRY(25), \
+	ICE_PTT(26, IP, IPV4, NOF, NONE, NONE, NOF, TCP, PAY4), \
+	ICE_PTT(27, IP, IPV4, NOF, NONE, NONE, NOF, SCTP, PAY4), \
+	ICE_PTT(28, IP, IPV4, NOF, NONE, NONE, NOF, ICMP, PAY4), \
+	\
+	/* IPv4 --> IPv4 */ \
+	ICE_PTT(29, IP, IPV4, NOF, IP_IP, IPV4, FRG, NONE, PAY3), \
+	ICE_PTT(30, IP, IPV4, NOF, IP_IP, IPV4, NOF, NONE, PAY3), \
+	ICE_PTT(31, IP, IPV4, NOF, IP_IP, IPV4, NOF, UDP, PAY4), \
+	ICE_PTT_UNUSED_ENTRY(32), \
+	ICE_PTT(33, IP, IPV4, NOF, IP_IP, IPV4, NOF, TCP, PAY4), \
+	ICE_PTT(34, IP, IPV4, NOF, IP_IP, IPV4, NOF, SCTP, PAY4), \
+	ICE_PTT(35, IP, IPV4, NOF, IP_IP, IPV4, NOF, ICMP, PAY4), \
+	\
+	/* IPv4 --> IPv6 */ \
+	ICE_PTT(36, IP, IPV4, NOF, IP_IP, IPV6, FRG, NONE, PAY3), \
+	ICE_PTT(37, IP, IPV4, NOF, IP_IP, IPV6, NOF, NONE, PAY3), \
+	ICE_PTT(38, IP, IPV4, NOF, IP_IP, IPV6, NOF, UDP, PAY4), \
+	ICE_PTT_UNUSED_ENTRY(39), \
+	ICE_PTT(40, IP, IPV4, NOF, IP_IP, IPV6, NOF, TCP, PAY4), \
+	ICE_PTT(41, IP, IPV4, NOF, IP_IP, IPV6, NOF, SCTP, PAY4), \
+	ICE_PTT(42, IP, IPV4, NOF, IP_IP, IPV6, NOF, ICMP, PAY4), \
+	\
+	/* IPv4 --> GRE/NAT */ \
+	ICE_PTT(43, IP, IPV4, NOF, IP_GRENAT, NONE, NOF, NONE, PAY3), \
+	\
+	/* IPv4 --> GRE/NAT --> IPv4 */ \
+	ICE_PTT(44, IP, IPV4, NOF, IP_GRENAT, IPV4, FRG, NONE, PAY3), \
+	ICE_PTT(45, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, NONE, PAY3), \
+	ICE_PTT(46, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, UDP, PAY4), \
+	ICE_PTT_UNUSED_ENTRY(47), \
+	ICE_PTT(48, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, TCP, PAY4), \
+	ICE_PTT(49, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, SCTP, PAY4), \
+	ICE_PTT(50, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, ICMP, PAY4), \
+	\
+	/* IPv4 --> GRE/NAT --> IPv6 */ \
+	ICE_PTT(51, IP, IPV4, NOF, IP_GRENAT, IPV6, FRG, NONE, PAY3), \
+	ICE_PTT(52, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, NONE, PAY3), \
+	ICE_PTT(53, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, UDP, PAY4), \
+	ICE_PTT_UNUSED_ENTRY(54), \
+	ICE_PTT(55, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, TCP, PAY4), \
+	ICE_PTT(56, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, SCTP, PAY4), \
+	ICE_PTT(57, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, ICMP, PAY4), \
+	\
+	/* IPv4 --> GRE/NAT --> MAC */ \
+	ICE_PTT(58, IP, IPV4, NOF, IP_GRENAT_MAC, NONE, NOF, NONE, PAY3), \
+	\
+	/* IPv4 --> GRE/NAT --> MAC --> IPv4 */ \
+	ICE_PTT(59, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, FRG, NONE, PAY3), \
+	ICE_PTT(60, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, NONE, PAY3), \
+	ICE_PTT(61, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, UDP, PAY4), \
+	ICE_PTT_UNUSED_ENTRY(62), \
+	ICE_PTT(63, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, TCP, PAY4), \
+	ICE_PTT(64, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, SCTP, PAY4), \
+	ICE_PTT(65, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, ICMP, PAY4), \
+	\
+	/* IPv4 --> GRE/NAT -> MAC --> IPv6 */ \
+	ICE_PTT(66, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, FRG, NONE, PAY3), \
+	ICE_PTT(67, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, NONE, PAY3), \
+	ICE_PTT(68, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, UDP, PAY4), \
+	ICE_PTT_UNUSED_ENTRY(69), \
+	ICE_PTT(70, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, TCP, PAY4), \
+	ICE_PTT(71, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, SCTP, PAY4), \
+	ICE_PTT(72, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, ICMP, PAY4), \
+	\
+	/* IPv4 --> GRE/NAT --> MAC/VLAN */ \
+	ICE_PTT(73, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, NONE, NOF, NONE, PAY3), \
+	\
+	/* IPv4 ---> GRE/NAT -> MAC/VLAN --> IPv4 */ \
+	ICE_PTT(74, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, FRG, NONE, PAY3), \
+	ICE_PTT(75, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, NONE, PAY3), \
+	ICE_PTT(76, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, UDP, PAY4), \
+	ICE_PTT_UNUSED_ENTRY(77), \
+	ICE_PTT(78, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, TCP, PAY4), \
+	ICE_PTT(79, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, SCTP, PAY4), \
+	ICE_PTT(80, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, ICMP, PAY4), \
+	\
+	/* IPv4 -> GRE/NAT -> MAC/VLAN --> IPv6 */ \
+	ICE_PTT(81, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, FRG, NONE, PAY3), \
+	ICE_PTT(82, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, NONE, PAY3), \
+	ICE_PTT(83, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, UDP, PAY4), \
+	ICE_PTT_UNUSED_ENTRY(84), \
+	ICE_PTT(85, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, TCP, PAY4), \
+	ICE_PTT(86, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, SCTP, PAY4), \
+	ICE_PTT(87, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, ICMP, PAY4), \
+	\
+	/* Non Tunneled IPv6 */ \
+	ICE_PTT(88, IP, IPV6, FRG, NONE, NONE, NOF, NONE, PAY3), \
+	ICE_PTT(89, IP, IPV6, NOF, NONE, NONE, NOF, NONE, PAY3), \
+	ICE_PTT(90, IP, IPV6, NOF, NONE, NONE, NOF, UDP, PAY4), \
+	ICE_PTT_UNUSED_ENTRY(91), \
+	ICE_PTT(92, IP, IPV6, NOF, NONE, NONE, NOF, TCP, PAY4), \
+	ICE_PTT(93, IP, IPV6, NOF, NONE, NONE, NOF, SCTP, PAY4), \
+	ICE_PTT(94, IP, IPV6, NOF, NONE, NONE, NOF, ICMP, PAY4), \
+	\
+	/* IPv6 --> IPv4 */ \
+	ICE_PTT(95, IP, IPV6, NOF, IP_IP, IPV4, FRG, NONE, PAY3), \
+	ICE_PTT(96, IP, IPV6, NOF, IP_IP, IPV4, NOF, NONE, PAY3), \
+	ICE_PTT(97, IP, IPV6, NOF, IP_IP, IPV4, NOF, UDP, PAY4), \
+	ICE_PTT_UNUSED_ENTRY(98), \
+	ICE_PTT(99, IP, IPV6, NOF, IP_IP, IPV4, NOF, TCP, PAY4), \
+	ICE_PTT(100, IP, IPV6, NOF, IP_IP, IPV4, NOF, SCTP, PAY4), \
+	ICE_PTT(101, IP, IPV6, NOF, IP_IP, IPV4, NOF, ICMP, PAY4), \
+	\
+	/* IPv6 --> IPv6 */ \
+	ICE_PTT(102, IP, IPV6, NOF, IP_IP, IPV6, FRG, NONE, PAY3), \
+	ICE_PTT(103, IP, IPV6, NOF, IP_IP, IPV6, NOF, NONE, PAY3), \
+	ICE_PTT(104, IP, IPV6, NOF, IP_IP, IPV6, NOF, UDP, PAY4), \
+	ICE_PTT_UNUSED_ENTRY(105), \
+	ICE_PTT(106, IP, IPV6, NOF, IP_IP, IPV6, NOF, TCP, PAY4), \
+	ICE_PTT(107, IP, IPV6, NOF, IP_IP, IPV6, NOF, SCTP, PAY4), \
+	ICE_PTT(108, IP, IPV6, NOF, IP_IP, IPV6, NOF, ICMP, PAY4), \
+	\
+	/* IPv6 --> GRE/NAT */ \
+	ICE_PTT(109, IP, IPV6, NOF, IP_GRENAT, NONE, NOF, NONE, PAY3), \
+	\
+	/* IPv6 --> GRE/NAT -> IPv4 */ \
+	ICE_PTT(110, IP, IPV6, NOF, IP_GRENAT, IPV4, FRG, NONE, PAY3), \
+	ICE_PTT(111, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, NONE, PAY3), \
+	ICE_PTT(112, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, UDP, PAY4), \
+	ICE_PTT_UNUSED_ENTRY(113), \
+	ICE_PTT(114, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, TCP, PAY4), \
+	ICE_PTT(115, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, SCTP, PAY4), \
+	ICE_PTT(116, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, ICMP, PAY4), \
+	\
+	/* IPv6 --> GRE/NAT -> IPv6 */ \
+	ICE_PTT(117, IP, IPV6, NOF, IP_GRENAT, IPV6, FRG, NONE, PAY3), \
+	ICE_PTT(118, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, NONE, PAY3), \
+	ICE_PTT(119, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, UDP, PAY4), \
+	ICE_PTT_UNUSED_ENTRY(120), \
+	ICE_PTT(121, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, TCP, PAY4), \
+	ICE_PTT(122, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, SCTP, PAY4), \
+	ICE_PTT(123, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, ICMP, PAY4), \
+	\
+	/* IPv6 --> GRE/NAT -> MAC */ \
+	ICE_PTT(124, IP, IPV6, NOF, IP_GRENAT_MAC, NONE, NOF, NONE, PAY3), \
+	\
+	/* IPv6 --> GRE/NAT -> MAC -> IPv4 */ \
+	ICE_PTT(125, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, FRG, NONE, PAY3), \
+	ICE_PTT(126, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, NONE, PAY3), \
+	ICE_PTT(127, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, UDP, PAY4), \
+	ICE_PTT_UNUSED_ENTRY(128), \
+	ICE_PTT(129, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, TCP, PAY4), \
+	ICE_PTT(130, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, SCTP, PAY4), \
+	ICE_PTT(131, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, ICMP, PAY4), \
+	\
+	/* IPv6 --> GRE/NAT -> MAC -> IPv6 */ \
+	ICE_PTT(132, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, FRG, NONE, PAY3), \
+	ICE_PTT(133, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, NONE, PAY3), \
+	ICE_PTT(134, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, UDP, PAY4), \
+	ICE_PTT_UNUSED_ENTRY(135), \
+	ICE_PTT(136, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, TCP, PAY4), \
+	ICE_PTT(137, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, SCTP, PAY4), \
+	ICE_PTT(138, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, ICMP, PAY4), \
+	\
+	/* IPv6 --> GRE/NAT -> MAC/VLAN */ \
+	ICE_PTT(139, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, NONE, NOF, NONE, PAY3), \
+	\
+	/* IPv6 --> GRE/NAT -> MAC/VLAN --> IPv4 */ \
+	ICE_PTT(140, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, FRG, NONE, PAY3), \
+	ICE_PTT(141, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, NONE, PAY3), \
+	ICE_PTT(142, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, UDP, PAY4), \
+	ICE_PTT_UNUSED_ENTRY(143), \
+	ICE_PTT(144, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, TCP, PAY4), \
+	ICE_PTT(145, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, SCTP, PAY4), \
+	ICE_PTT(146, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, ICMP, PAY4), \
+	\
+	/* IPv6 --> GRE/NAT -> MAC/VLAN --> IPv6 */ \
+	ICE_PTT(147, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, FRG, NONE, PAY3), \
+	ICE_PTT(148, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, NONE, PAY3), \
+	ICE_PTT(149, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, UDP, PAY4), \
+	ICE_PTT_UNUSED_ENTRY(150), \
+	ICE_PTT(151, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, TCP, PAY4), \
+	ICE_PTT(152, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, SCTP, PAY4), \
+	ICE_PTT(153, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, ICMP, PAY4),
+
+#define ICE_NUM_DEFINED_PTYPES 154
 
 /* macro to make the table lines short, use explicit indexing with [PTYPE] */
 #define ICE_PTT(PTYPE, OUTER_IP, OUTER_IP_VER, OUTER_FRAG, T, TE, TEF, I, PL)\
@@ -695,212 +901,10 @@ struct ice_tlan_ctx {
 
 /* Lookup table mapping in the 10-bit HW PTYPE to the bit field for decoding */
 static const struct ice_rx_ptype_decoded ice_ptype_lkup[BIT(10)] = {
-	/* L2 Packet types */
-	ICE_PTT_UNUSED_ENTRY(0),
-	ICE_PTT(1, L2, NONE, NOF, NONE, NONE, NOF, NONE, PAY2),
-	ICE_PTT_UNUSED_ENTRY(2),
-	ICE_PTT_UNUSED_ENTRY(3),
-	ICE_PTT_UNUSED_ENTRY(4),
-	ICE_PTT_UNUSED_ENTRY(5),
-	ICE_PTT(6, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),
-	ICE_PTT(7, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),
-	ICE_PTT_UNUSED_ENTRY(8),
-	ICE_PTT_UNUSED_ENTRY(9),
-	ICE_PTT(10, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),
-	ICE_PTT(11, L2, NONE, NOF, NONE, NONE, NOF, NONE, NONE),
-	ICE_PTT_UNUSED_ENTRY(12),
-	ICE_PTT_UNUSED_ENTRY(13),
-	ICE_PTT_UNUSED_ENTRY(14),
-	ICE_PTT_UNUSED_ENTRY(15),
-	ICE_PTT_UNUSED_ENTRY(16),
-	ICE_PTT_UNUSED_ENTRY(17),
-	ICE_PTT_UNUSED_ENTRY(18),
-	ICE_PTT_UNUSED_ENTRY(19),
-	ICE_PTT_UNUSED_ENTRY(20),
-	ICE_PTT_UNUSED_ENTRY(21),
-
-	/* Non Tunneled IPv4 */
-	ICE_PTT(22, IP, IPV4, FRG, NONE, NONE, NOF, NONE, PAY3),
-	ICE_PTT(23, IP, IPV4, NOF, NONE, NONE, NOF, NONE, PAY3),
-	ICE_PTT(24, IP, IPV4, NOF, NONE, NONE, NOF, UDP, PAY4),
-	ICE_PTT_UNUSED_ENTRY(25),
-	ICE_PTT(26, IP, IPV4, NOF, NONE, NONE, NOF, TCP, PAY4),
-	ICE_PTT(27, IP, IPV4, NOF, NONE, NONE, NOF, SCTP, PAY4),
-	ICE_PTT(28, IP, IPV4, NOF, NONE, NONE, NOF, ICMP, PAY4),
-
-	/* IPv4 --> IPv4 */
-	ICE_PTT(29, IP, IPV4, NOF, IP_IP, IPV4, FRG, NONE, PAY3),
-	ICE_PTT(30, IP, IPV4, NOF, IP_IP, IPV4, NOF, NONE, PAY3),
-	ICE_PTT(31, IP, IPV4, NOF, IP_IP, IPV4, NOF, UDP, PAY4),
-	ICE_PTT_UNUSED_ENTRY(32),
-	ICE_PTT(33, IP, IPV4, NOF, IP_IP, IPV4, NOF, TCP, PAY4),
-	ICE_PTT(34, IP, IPV4, NOF, IP_IP, IPV4, NOF, SCTP, PAY4),
-	ICE_PTT(35, IP, IPV4, NOF, IP_IP, IPV4, NOF, ICMP, PAY4),
-
-	/* IPv4 --> IPv6 */
-	ICE_PTT(36, IP, IPV4, NOF, IP_IP, IPV6, FRG, NONE, PAY3),
-	ICE_PTT(37, IP, IPV4, NOF, IP_IP, IPV6, NOF, NONE, PAY3),
-	ICE_PTT(38, IP, IPV4, NOF, IP_IP, IPV6, NOF, UDP, PAY4),
-	ICE_PTT_UNUSED_ENTRY(39),
-	ICE_PTT(40, IP, IPV4, NOF, IP_IP, IPV6, NOF, TCP, PAY4),
-	ICE_PTT(41, IP, IPV4, NOF, IP_IP, IPV6, NOF, SCTP, PAY4),
-	ICE_PTT(42, IP, IPV4, NOF, IP_IP, IPV6, NOF, ICMP, PAY4),
-
-	/* IPv4 --> GRE/NAT */
-	ICE_PTT(43, IP, IPV4, NOF, IP_GRENAT, NONE, NOF, NONE, PAY3),
-
-	/* IPv4 --> GRE/NAT --> IPv4 */
-	ICE_PTT(44, IP, IPV4, NOF, IP_GRENAT, IPV4, FRG, NONE, PAY3),
-	ICE_PTT(45, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, NONE, PAY3),
-	ICE_PTT(46, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, UDP, PAY4),
-	ICE_PTT_UNUSED_ENTRY(47),
-	ICE_PTT(48, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, TCP, PAY4),
-	ICE_PTT(49, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, SCTP, PAY4),
-	ICE_PTT(50, IP, IPV4, NOF, IP_GRENAT, IPV4, NOF, ICMP, PAY4),
-
-	/* IPv4 --> GRE/NAT --> IPv6 */
-	ICE_PTT(51, IP, IPV4, NOF, IP_GRENAT, IPV6, FRG, NONE, PAY3),
-	ICE_PTT(52, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, NONE, PAY3),
-	ICE_PTT(53, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, UDP, PAY4),
-	ICE_PTT_UNUSED_ENTRY(54),
-	ICE_PTT(55, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, TCP, PAY4),
-	ICE_PTT(56, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, SCTP, PAY4),
-	ICE_PTT(57, IP, IPV4, NOF, IP_GRENAT, IPV6, NOF, ICMP, PAY4),
-
-	/* IPv4 --> GRE/NAT --> MAC */
-	ICE_PTT(58, IP, IPV4, NOF, IP_GRENAT_MAC, NONE, NOF, NONE, PAY3),
-
-	/* IPv4 --> GRE/NAT --> MAC --> IPv4 */
-	ICE_PTT(59, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, FRG, NONE, PAY3),
-	ICE_PTT(60, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, NONE, PAY3),
-	ICE_PTT(61, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, UDP, PAY4),
-	ICE_PTT_UNUSED_ENTRY(62),
-	ICE_PTT(63, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, TCP, PAY4),
-	ICE_PTT(64, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, SCTP, PAY4),
-	ICE_PTT(65, IP, IPV4, NOF, IP_GRENAT_MAC, IPV4, NOF, ICMP, PAY4),
-
-	/* IPv4 --> GRE/NAT -> MAC --> IPv6 */
-	ICE_PTT(66, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, FRG, NONE, PAY3),
-	ICE_PTT(67, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, NONE, PAY3),
-	ICE_PTT(68, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, UDP, PAY4),
-	ICE_PTT_UNUSED_ENTRY(69),
-	ICE_PTT(70, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, TCP, PAY4),
-	ICE_PTT(71, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, SCTP, PAY4),
-	ICE_PTT(72, IP, IPV4, NOF, IP_GRENAT_MAC, IPV6, NOF, ICMP, PAY4),
-
-	/* IPv4 --> GRE/NAT --> MAC/VLAN */
-	ICE_PTT(73, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, NONE, NOF, NONE, PAY3),
-
-	/* IPv4 ---> GRE/NAT -> MAC/VLAN --> IPv4 */
-	ICE_PTT(74, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, FRG, NONE, PAY3),
-	ICE_PTT(75, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, NONE, PAY3),
-	ICE_PTT(76, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, UDP, PAY4),
-	ICE_PTT_UNUSED_ENTRY(77),
-	ICE_PTT(78, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, TCP, PAY4),
-	ICE_PTT(79, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, SCTP, PAY4),
-	ICE_PTT(80, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, ICMP, PAY4),
-
-	/* IPv4 -> GRE/NAT -> MAC/VLAN --> IPv6 */
-	ICE_PTT(81, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, FRG, NONE, PAY3),
-	ICE_PTT(82, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, NONE, PAY3),
-	ICE_PTT(83, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, UDP, PAY4),
-	ICE_PTT_UNUSED_ENTRY(84),
-	ICE_PTT(85, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, TCP, PAY4),
-	ICE_PTT(86, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, SCTP, PAY4),
-	ICE_PTT(87, IP, IPV4, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, ICMP, PAY4),
-
-	/* Non Tunneled IPv6 */
-	ICE_PTT(88, IP, IPV6, FRG, NONE, NONE, NOF, NONE, PAY3),
-	ICE_PTT(89, IP, IPV6, NOF, NONE, NONE, NOF, NONE, PAY3),
-	ICE_PTT(90, IP, IPV6, NOF, NONE, NONE, NOF, UDP, PAY4),
-	ICE_PTT_UNUSED_ENTRY(91),
-	ICE_PTT(92, IP, IPV6, NOF, NONE, NONE, NOF, TCP, PAY4),
-	ICE_PTT(93, IP, IPV6, NOF, NONE, NONE, NOF, SCTP, PAY4),
-	ICE_PTT(94, IP, IPV6, NOF, NONE, NONE, NOF, ICMP, PAY4),
-
-	/* IPv6 --> IPv4 */
-	ICE_PTT(95, IP, IPV6, NOF, IP_IP, IPV4, FRG, NONE, PAY3),
-	ICE_PTT(96, IP, IPV6, NOF, IP_IP, IPV4, NOF, NONE, PAY3),
-	ICE_PTT(97, IP, IPV6, NOF, IP_IP, IPV4, NOF, UDP, PAY4),
-	ICE_PTT_UNUSED_ENTRY(98),
-	ICE_PTT(99, IP, IPV6, NOF, IP_IP, IPV4, NOF, TCP, PAY4),
-	ICE_PTT(100, IP, IPV6, NOF, IP_IP, IPV4, NOF, SCTP, PAY4),
-	ICE_PTT(101, IP, IPV6, NOF, IP_IP, IPV4, NOF, ICMP, PAY4),
-
-	/* IPv6 --> IPv6 */
-	ICE_PTT(102, IP, IPV6, NOF, IP_IP, IPV6, FRG, NONE, PAY3),
-	ICE_PTT(103, IP, IPV6, NOF, IP_IP, IPV6, NOF, NONE, PAY3),
-	ICE_PTT(104, IP, IPV6, NOF, IP_IP, IPV6, NOF, UDP, PAY4),
-	ICE_PTT_UNUSED_ENTRY(105),
-	ICE_PTT(106, IP, IPV6, NOF, IP_IP, IPV6, NOF, TCP, PAY4),
-	ICE_PTT(107, IP, IPV6, NOF, IP_IP, IPV6, NOF, SCTP, PAY4),
-	ICE_PTT(108, IP, IPV6, NOF, IP_IP, IPV6, NOF, ICMP, PAY4),
-
-	/* IPv6 --> GRE/NAT */
-	ICE_PTT(109, IP, IPV6, NOF, IP_GRENAT, NONE, NOF, NONE, PAY3),
-
-	/* IPv6 --> GRE/NAT -> IPv4 */
-	ICE_PTT(110, IP, IPV6, NOF, IP_GRENAT, IPV4, FRG, NONE, PAY3),
-	ICE_PTT(111, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, NONE, PAY3),
-	ICE_PTT(112, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, UDP, PAY4),
-	ICE_PTT_UNUSED_ENTRY(113),
-	ICE_PTT(114, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, TCP, PAY4),
-	ICE_PTT(115, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, SCTP, PAY4),
-	ICE_PTT(116, IP, IPV6, NOF, IP_GRENAT, IPV4, NOF, ICMP, PAY4),
-
-	/* IPv6 --> GRE/NAT -> IPv6 */
-	ICE_PTT(117, IP, IPV6, NOF, IP_GRENAT, IPV6, FRG, NONE, PAY3),
-	ICE_PTT(118, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, NONE, PAY3),
-	ICE_PTT(119, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, UDP, PAY4),
-	ICE_PTT_UNUSED_ENTRY(120),
-	ICE_PTT(121, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, TCP, PAY4),
-	ICE_PTT(122, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, SCTP, PAY4),
-	ICE_PTT(123, IP, IPV6, NOF, IP_GRENAT, IPV6, NOF, ICMP, PAY4),
-
-	/* IPv6 --> GRE/NAT -> MAC */
-	ICE_PTT(124, IP, IPV6, NOF, IP_GRENAT_MAC, NONE, NOF, NONE, PAY3),
-
-	/* IPv6 --> GRE/NAT -> MAC -> IPv4 */
-	ICE_PTT(125, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, FRG, NONE, PAY3),
-	ICE_PTT(126, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, NONE, PAY3),
-	ICE_PTT(127, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, UDP, PAY4),
-	ICE_PTT_UNUSED_ENTRY(128),
-	ICE_PTT(129, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, TCP, PAY4),
-	ICE_PTT(130, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, SCTP, PAY4),
-	ICE_PTT(131, IP, IPV6, NOF, IP_GRENAT_MAC, IPV4, NOF, ICMP, PAY4),
-
-	/* IPv6 --> GRE/NAT -> MAC -> IPv6 */
-	ICE_PTT(132, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, FRG, NONE, PAY3),
-	ICE_PTT(133, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, NONE, PAY3),
-	ICE_PTT(134, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, UDP, PAY4),
-	ICE_PTT_UNUSED_ENTRY(135),
-	ICE_PTT(136, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, TCP, PAY4),
-	ICE_PTT(137, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, SCTP, PAY4),
-	ICE_PTT(138, IP, IPV6, NOF, IP_GRENAT_MAC, IPV6, NOF, ICMP, PAY4),
-
-	/* IPv6 --> GRE/NAT -> MAC/VLAN */
-	ICE_PTT(139, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, NONE, NOF, NONE, PAY3),
-
-	/* IPv6 --> GRE/NAT -> MAC/VLAN --> IPv4 */
-	ICE_PTT(140, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, FRG, NONE, PAY3),
-	ICE_PTT(141, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, NONE, PAY3),
-	ICE_PTT(142, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, UDP, PAY4),
-	ICE_PTT_UNUSED_ENTRY(143),
-	ICE_PTT(144, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, TCP, PAY4),
-	ICE_PTT(145, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, SCTP, PAY4),
-	ICE_PTT(146, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV4, NOF, ICMP, PAY4),
-
-	/* IPv6 --> GRE/NAT -> MAC/VLAN --> IPv6 */
-	ICE_PTT(147, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, FRG, NONE, PAY3),
-	ICE_PTT(148, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, NONE, PAY3),
-	ICE_PTT(149, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, UDP, PAY4),
-	ICE_PTT_UNUSED_ENTRY(150),
-	ICE_PTT(151, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, TCP, PAY4),
-	ICE_PTT(152, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, SCTP, PAY4),
-	ICE_PTT(153, IP, IPV6, NOF, IP_GRENAT_MAC_VLAN, IPV6, NOF, ICMP, PAY4),
+	ICE_PTYPES
 
 	/* unused entries */
-	[154 ... 1023] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }
+	[ICE_NUM_DEFINED_PTYPES ... 1023] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 }
 };
 
 static inline struct ice_rx_ptype_decoded ice_decode_rx_desc_ptype(u16 ptype)
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 43ba3e55b8c1..86f704850aa6 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -3397,6 +3397,7 @@ static void ice_set_ops(struct ice_vsi *vsi)
 
 	netdev->netdev_ops = &ice_netdev_ops;
 	netdev->udp_tunnel_nic_info = &pf->hw.udp_tunnel_nic;
+	netdev->xdp_metadata_ops = &ice_xdp_md_ops;
 	ice_set_ethtool_ops(netdev);
 
 	if (vsi->type != ICE_VSI_PF)
@@ -6043,6 +6044,23 @@ ice_fix_features(struct net_device *netdev, netdev_features_t features)
 }
 
 /**
+ * ice_set_rx_rings_vlan_proto - update rings with new stripped VLAN proto
+ * @vsi: PF's VSI
+ * @vlan_ethertype: VLAN ethertype (802.1Q or 802.1ad) in network byte order
+ *
+ * Store current stripped VLAN proto in ring packet context,
+ * so it can be accessed more efficiently by packet processing code.
+ */
+static void
+ice_set_rx_rings_vlan_proto(struct ice_vsi *vsi, __be16 vlan_ethertype)
+{
+	u16 i;
+
+	ice_for_each_alloc_rxq(vsi, i)
+		vsi->rx_rings[i]->pkt_ctx.vlan_proto = vlan_ethertype;
+}
+
+/**
  * ice_set_vlan_offload_features - set VLAN offload features for the PF VSI
  * @vsi: PF's VSI
  * @features: features used to determine VLAN offload settings
@@ -6084,6 +6102,9 @@ ice_set_vlan_offload_features(struct ice_vsi *vsi, netdev_features_t features)
 	if (strip_err || insert_err)
 		return -EIO;
 
+	ice_set_rx_rings_vlan_proto(vsi, enable_stripping ?
+				    htons(vlan_ethertype) : 0);
+
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c
index 71f405f8a6fe..a4d3a9ee409a 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.c
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.c
@@ -2127,30 +2127,26 @@ int ice_ptp_set_ts_config(struct ice_pf *pf, struct ifreq *ifr)
 }
 
 /**
- * ice_ptp_rx_hwtstamp - Check for an Rx timestamp
- * @rx_ring: Ring to get the VSI info
+ * ice_ptp_get_rx_hwts - Get packet Rx timestamp in ns
  * @rx_desc: Receive descriptor
- * @skb: Particular skb to send timestamp with
+ * @pkt_ctx: Packet context to get the cached time
  *
  * The driver receives a notification in the receive descriptor with timestamp.
- * The timestamp is in ns, so we must convert the result first.
  */
-void
-ice_ptp_rx_hwtstamp(struct ice_rx_ring *rx_ring,
-		    union ice_32b_rx_flex_desc *rx_desc, struct sk_buff *skb)
+u64 ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc,
+			const struct ice_pkt_ctx *pkt_ctx)
 {
-	struct skb_shared_hwtstamps *hwtstamps;
 	u64 ts_ns, cached_time;
 	u32 ts_high;
 
 	if (!(rx_desc->wb.time_stamp_low & ICE_PTP_TS_VALID))
-		return;
+		return 0;
 
-	cached_time = READ_ONCE(rx_ring->cached_phctime);
+	cached_time = READ_ONCE(pkt_ctx->cached_phctime);
 
 	/* Do not report a timestamp if we don't have a cached PHC time */
 	if (!cached_time)
-		return;
+		return 0;
 
 	/* Use ice_ptp_extend_32b_ts directly, using the ring-specific cached
 	 * PHC value, rather than accessing the PF. This also allows us to
@@ -2161,9 +2157,7 @@ ice_ptp_rx_hwtstamp(struct ice_rx_ring *rx_ring,
 	ts_high = le32_to_cpu(rx_desc->wb.flex_ts.ts_high);
 	ts_ns = ice_ptp_extend_32b_ts(cached_time, ts_high);
 
-	hwtstamps = skb_hwtstamps(skb);
-	memset(hwtstamps, 0, sizeof(*hwtstamps));
-	hwtstamps->hwtstamp = ns_to_ktime(ts_ns);
+	return ts_ns;
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.h b/drivers/net/ethernet/intel/ice/ice_ptp.h
index 06a330867fc9..5c6450e4f2f2 100644
--- a/drivers/net/ethernet/intel/ice/ice_ptp.h
+++ b/drivers/net/ethernet/intel/ice/ice_ptp.h
@@ -298,9 +298,8 @@ void ice_ptp_extts_event(struct ice_pf *pf);
 s8 ice_ptp_request_ts(struct ice_ptp_tx *tx, struct sk_buff *skb);
 enum ice_tx_tstamp_work ice_ptp_process_ts(struct ice_pf *pf);
 
-void
-ice_ptp_rx_hwtstamp(struct ice_rx_ring *rx_ring,
-		    union ice_32b_rx_flex_desc *rx_desc, struct sk_buff *skb);
+u64 ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc,
+			const struct ice_pkt_ctx *pkt_ctx);
 void ice_ptp_reset(struct ice_pf *pf);
 void ice_ptp_prepare_for_reset(struct ice_pf *pf);
 void ice_ptp_init(struct ice_pf *pf);
@@ -329,9 +328,14 @@ static inline bool ice_ptp_process_ts(struct ice_pf *pf)
 {
 	return true;
 }
-static inline void
-ice_ptp_rx_hwtstamp(struct ice_rx_ring *rx_ring,
-		    union ice_32b_rx_flex_desc *rx_desc, struct sk_buff *skb) { }
+
+static inline u64
+ice_ptp_get_rx_hwts(const union ice_32b_rx_flex_desc *rx_desc,
+		    const struct ice_pkt_ctx *pkt_ctx)
+{
+	return 0;
+}
+
 static inline void ice_ptp_reset(struct ice_pf *pf) { }
 static inline void ice_ptp_prepare_for_reset(struct ice_pf *pf) { }
 static inline void ice_ptp_init(struct ice_pf *pf) { }
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 9e97ea863068..59617f055e35 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -557,13 +557,14 @@ ice_rx_frame_truesize(struct ice_rx_ring *rx_ring, const unsigned int size)
  * @xdp_prog: XDP program to run
  * @xdp_ring: ring to be used for XDP_TX action
  * @rx_buf: Rx buffer to store the XDP action
+ * @eop_desc: Last descriptor in packet to read metadata from
  *
  * Returns any of ICE_XDP_{PASS, CONSUMED, TX, REDIR}
  */
 static void
 ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
 	    struct bpf_prog *xdp_prog, struct ice_tx_ring *xdp_ring,
-	    struct ice_rx_buf *rx_buf)
+	    struct ice_rx_buf *rx_buf, union ice_32b_rx_flex_desc *eop_desc)
 {
 	unsigned int ret = ICE_XDP_PASS;
 	u32 act;
@@ -571,6 +572,8 @@ ice_run_xdp(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
 	if (!xdp_prog)
 		goto exit;
 
+	ice_xdp_meta_set_desc(xdp, eop_desc);
+
 	act = bpf_prog_run_xdp(xdp_prog, xdp);
 	switch (act) {
 	case XDP_PASS:
@@ -1180,8 +1183,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 		struct sk_buff *skb;
 		unsigned int size;
 		u16 stat_err_bits;
-		u16 vlan_tag = 0;
-		u16 rx_ptype;
+		u16 vlan_tci;
 
 		/* get the Rx desc from Rx ring based on 'next_to_clean' */
 		rx_desc = ICE_RX_DESC(rx_ring, ntc);
@@ -1241,7 +1243,7 @@ int ice_clean_rx_irq(struct ice_rx_ring *rx_ring, int budget)
 		if (ice_is_non_eop(rx_ring, rx_desc))
 			continue;
 
-		ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf);
+		ice_run_xdp(rx_ring, xdp, xdp_prog, xdp_ring, rx_buf, rx_desc);
 		if (rx_buf->act == ICE_XDP_PASS)
 			goto construct_skb;
 		total_rx_bytes += xdp_get_buff_len(xdp);
@@ -1276,7 +1278,7 @@ construct_skb:
 			continue;
 		}
 
-		vlan_tag = ice_get_vlan_tag_from_rx_desc(rx_desc);
+		vlan_tci = ice_get_vlan_tci(rx_desc);
 
 		/* pad the skb if needed, to make a valid ethernet frame */
 		if (eth_skb_pad(skb))
@@ -1286,14 +1288,11 @@ construct_skb:
 		total_rx_bytes += skb->len;
 
 		/* populate checksum, VLAN, and protocol */
-		rx_ptype = le16_to_cpu(rx_desc->wb.ptype_flex_flags0) &
-			ICE_RX_FLEX_DESC_PTYPE_M;
-
-		ice_process_skb_fields(rx_ring, rx_desc, skb, rx_ptype);
+		ice_process_skb_fields(rx_ring, rx_desc, skb);
 
 		ice_trace(clean_rx_irq_indicate, rx_ring, rx_desc, skb);
 		/* send completed skb up the stack */
-		ice_receive_skb(rx_ring, skb, vlan_tag);
+		ice_receive_skb(rx_ring, skb, vlan_tci);
 
 		/* update budget accounting */
 		total_rx_pkts++;
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.h b/drivers/net/ethernet/intel/ice/ice_txrx.h
index daf7b9dbb143..b3379ff73674 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.h
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.h
@@ -257,6 +257,20 @@ enum ice_rx_dtype {
 	ICE_RX_DTYPE_SPLIT_ALWAYS = 2,
 };
 
+struct ice_pkt_ctx {
+	u64 cached_phctime;
+	__be16 vlan_proto;
+};
+
+struct ice_xdp_buff {
+	struct xdp_buff xdp_buff;
+	const union ice_32b_rx_flex_desc *eop_desc;
+	const struct ice_pkt_ctx *pkt_ctx;
+};
+
+/* Required for compatibility with xdp_buffs from xsk_pool */
+static_assert(offsetof(struct ice_xdp_buff, xdp_buff) == 0);
+
 /* indices into GLINT_ITR registers */
 #define ICE_RX_ITR ICE_IDX_ITR0
 #define ICE_TX_ITR ICE_IDX_ITR1
@@ -298,7 +312,6 @@ enum ice_dynamic_itr {
 /* descriptor ring, associated with a VSI */
 struct ice_rx_ring {
 	/* CL1 - 1st cacheline starts here */
-	struct ice_rx_ring *next; /* pointer to next ring in q_vector */
 	void *desc; /* Descriptor ring memory */
 	struct device *dev; /* Used for DMA mapping */
 	struct net_device *netdev; /* netdev ring maps to */
@@ -310,13 +323,24 @@ struct ice_rx_ring {
u16 count; /* Number of descriptors */ u16 reg_idx; /* HW register index of the ring */ u16 next_to_alloc; - /* CL2 - 2nd cacheline starts here */ + union { struct ice_rx_buf *rx_buf; struct xdp_buff **xdp_buf; }; - struct xdp_buff xdp; + /* CL2 - 2nd cacheline starts here */ + union { + struct ice_xdp_buff xdp_ext; + struct xdp_buff xdp; + }; /* CL3 - 3rd cacheline starts here */ + union { + struct ice_pkt_ctx pkt_ctx; + struct { + u64 cached_phctime; + __be16 vlan_proto; + }; + }; struct bpf_prog *xdp_prog; u16 rx_offset; @@ -332,9 +356,9 @@ struct ice_rx_ring { /* CL4 - 4th cacheline starts here */ struct ice_channel *ch; struct ice_tx_ring *xdp_ring; + struct ice_rx_ring *next; /* pointer to next ring in q_vector */ struct xsk_buff_pool *xsk_pool; dma_addr_t dma; /* physical address of ring */ - u64 cached_phctime; u16 rx_buf_len; u8 dcb_tc; /* Traffic class of ring */ u8 ptp_rx; diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c index 7e06373e14d9..839e5da24ad5 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c @@ -63,28 +63,42 @@ static enum pkt_hash_types ice_ptype_to_htype(u16 ptype) } /** - * ice_rx_hash - set the hash value in the skb + * ice_get_rx_hash - get RX hash value from descriptor + * @rx_desc: specific descriptor + * + * Returns hash, if present, 0 otherwise. 
+ */ +static u32 ice_get_rx_hash(const union ice_32b_rx_flex_desc *rx_desc) +{ + const struct ice_32b_rx_flex_desc_nic *nic_mdid; + + if (unlikely(rx_desc->wb.rxdid != ICE_RXDID_FLEX_NIC)) + return 0; + + nic_mdid = (struct ice_32b_rx_flex_desc_nic *)rx_desc; + return le32_to_cpu(nic_mdid->rss_hash); +} + +/** + * ice_rx_hash_to_skb - set the hash value in the skb * @rx_ring: descriptor ring * @rx_desc: specific descriptor * @skb: pointer to current skb * @rx_ptype: the ptype value from the descriptor */ static void -ice_rx_hash(struct ice_rx_ring *rx_ring, union ice_32b_rx_flex_desc *rx_desc, - struct sk_buff *skb, u16 rx_ptype) +ice_rx_hash_to_skb(const struct ice_rx_ring *rx_ring, + const union ice_32b_rx_flex_desc *rx_desc, + struct sk_buff *skb, u16 rx_ptype) { - struct ice_32b_rx_flex_desc_nic *nic_mdid; u32 hash; if (!(rx_ring->netdev->features & NETIF_F_RXHASH)) return; - if (rx_desc->wb.rxdid != ICE_RXDID_FLEX_NIC) - return; - - nic_mdid = (struct ice_32b_rx_flex_desc_nic *)rx_desc; - hash = le32_to_cpu(nic_mdid->rss_hash); - skb_set_hash(skb, hash, ice_ptype_to_htype(rx_ptype)); + hash = ice_get_rx_hash(rx_desc); + if (likely(hash)) + skb_set_hash(skb, hash, ice_ptype_to_htype(rx_ptype)); } /** @@ -171,11 +185,38 @@ checksum_fail: } /** + * ice_ptp_rx_hwts_to_skb - Put RX timestamp into skb + * @rx_ring: Ring to get the VSI info + * @rx_desc: Receive descriptor + * @skb: Particular skb to send timestamp with + * + * The timestamp is in ns, so we must convert the result first. 
+ */ +static void +ice_ptp_rx_hwts_to_skb(struct ice_rx_ring *rx_ring, + const union ice_32b_rx_flex_desc *rx_desc, + struct sk_buff *skb) +{ + u64 ts_ns = ice_ptp_get_rx_hwts(rx_desc, &rx_ring->pkt_ctx); + + skb_hwtstamps(skb)->hwtstamp = ns_to_ktime(ts_ns); +} + +/** + * ice_get_ptype - Read HW packet type from the descriptor + * @rx_desc: RX descriptor + */ +static u16 ice_get_ptype(const union ice_32b_rx_flex_desc *rx_desc) +{ + return le16_to_cpu(rx_desc->wb.ptype_flex_flags0) & + ICE_RX_FLEX_DESC_PTYPE_M; +} + +/** * ice_process_skb_fields - Populate skb header fields from Rx descriptor * @rx_ring: Rx descriptor ring packet is being transacted on * @rx_desc: pointer to the EOP Rx descriptor * @skb: pointer to current skb being populated - * @ptype: the packet type decoded by hardware * * This function checks the ring, descriptor, and packet information in * order to populate the hash, checksum, VLAN, protocol, and @@ -184,9 +225,11 @@ checksum_fail: void ice_process_skb_fields(struct ice_rx_ring *rx_ring, union ice_32b_rx_flex_desc *rx_desc, - struct sk_buff *skb, u16 ptype) + struct sk_buff *skb) { - ice_rx_hash(rx_ring, rx_desc, skb, ptype); + u16 ptype = ice_get_ptype(rx_desc); + + ice_rx_hash_to_skb(rx_ring, rx_desc, skb, ptype); /* modifies the skb - consumes the enet header */ skb->protocol = eth_type_trans(skb, rx_ring->netdev); @@ -194,28 +237,24 @@ ice_process_skb_fields(struct ice_rx_ring *rx_ring, ice_rx_csum(rx_ring, skb, rx_desc, ptype); if (rx_ring->ptp_rx) - ice_ptp_rx_hwtstamp(rx_ring, rx_desc, skb); + ice_ptp_rx_hwts_to_skb(rx_ring, rx_desc, skb); } /** * ice_receive_skb - Send a completed packet up the stack * @rx_ring: Rx ring in play * @skb: packet to send up - * @vlan_tag: VLAN tag for packet + * @vlan_tci: VLAN TCI for packet * * This function sends the completed packet (via. 
skb) up the stack using * gro receive functions (with/without VLAN tag) */ void -ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag) +ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tci) { - netdev_features_t features = rx_ring->netdev->features; - bool non_zero_vlan = !!(vlan_tag & VLAN_VID_MASK); - - if ((features & NETIF_F_HW_VLAN_CTAG_RX) && non_zero_vlan) - __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), vlan_tag); - else if ((features & NETIF_F_HW_VLAN_STAG_RX) && non_zero_vlan) - __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021AD), vlan_tag); + if ((vlan_tci & VLAN_VID_MASK) && rx_ring->vlan_proto) + __vlan_hwaccel_put_tag(skb, rx_ring->vlan_proto, + vlan_tci); napi_gro_receive(&rx_ring->q_vector->napi, skb); } @@ -464,3 +503,125 @@ void ice_finalize_xdp_rx(struct ice_tx_ring *xdp_ring, unsigned int xdp_res, spin_unlock(&xdp_ring->tx_lock); } } + +/** + * ice_xdp_rx_hw_ts - HW timestamp XDP hint handler + * @ctx: XDP buff pointer + * @ts_ns: destination address + * + * Copy HW timestamp (if available) to the destination address. + */ +static int ice_xdp_rx_hw_ts(const struct xdp_md *ctx, u64 *ts_ns) +{ + const struct ice_xdp_buff *xdp_ext = (void *)ctx; + + *ts_ns = ice_ptp_get_rx_hwts(xdp_ext->eop_desc, + xdp_ext->pkt_ctx); + if (!*ts_ns) + return -ENODATA; + + return 0; +} + +/* Define a ptype index -> XDP hash type lookup table. + * It uses the same ptype definitions as ice_decode_rx_desc_ptype[], + * avoiding possible copy-paste errors. + */ +#undef ICE_PTT +#undef ICE_PTT_UNUSED_ENTRY + +#define ICE_PTT(PTYPE, OUTER_IP, OUTER_IP_VER, OUTER_FRAG, T, TE, TEF, I, PL)\ + [PTYPE] = XDP_RSS_L3_##OUTER_IP_VER | XDP_RSS_L4_##I | XDP_RSS_TYPE_##PL + +#define ICE_PTT_UNUSED_ENTRY(PTYPE) [PTYPE] = 0 + +/* A few supplementary definitions for when XDP hash types do not coincide + * with what can be generated from ptype definitions + * by means of preprocessor concatenation. 
+ */ +#define XDP_RSS_L3_NONE XDP_RSS_TYPE_NONE +#define XDP_RSS_L4_NONE XDP_RSS_TYPE_NONE +#define XDP_RSS_TYPE_PAY2 XDP_RSS_TYPE_L2 +#define XDP_RSS_TYPE_PAY3 XDP_RSS_TYPE_NONE +#define XDP_RSS_TYPE_PAY4 XDP_RSS_L4 + +static const enum xdp_rss_hash_type +ice_ptype_to_xdp_hash[ICE_NUM_DEFINED_PTYPES] = { + ICE_PTYPES +}; + +#undef XDP_RSS_L3_NONE +#undef XDP_RSS_L4_NONE +#undef XDP_RSS_TYPE_PAY2 +#undef XDP_RSS_TYPE_PAY3 +#undef XDP_RSS_TYPE_PAY4 + +#undef ICE_PTT +#undef ICE_PTT_UNUSED_ENTRY + +/** + * ice_xdp_rx_hash_type - Get XDP-specific hash type from the RX descriptor + * @eop_desc: End of Packet descriptor + */ +static enum xdp_rss_hash_type +ice_xdp_rx_hash_type(const union ice_32b_rx_flex_desc *eop_desc) +{ + u16 ptype = ice_get_ptype(eop_desc); + + if (unlikely(ptype >= ICE_NUM_DEFINED_PTYPES)) + return 0; + + return ice_ptype_to_xdp_hash[ptype]; +} + +/** + * ice_xdp_rx_hash - RX hash XDP hint handler + * @ctx: XDP buff pointer + * @hash: hash destination address + * @rss_type: XDP hash type destination address + * + * Copy RX hash (if available) and its type to the destination address. + */ +static int ice_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash, + enum xdp_rss_hash_type *rss_type) +{ + const struct ice_xdp_buff *xdp_ext = (void *)ctx; + + *hash = ice_get_rx_hash(xdp_ext->eop_desc); + *rss_type = ice_xdp_rx_hash_type(xdp_ext->eop_desc); + if (!likely(*hash)) + return -ENODATA; + + return 0; +} + +/** + * ice_xdp_rx_vlan_tag - VLAN tag XDP hint handler + * @ctx: XDP buff pointer + * @vlan_proto: destination address for VLAN protocol + * @vlan_tci: destination address for VLAN TCI + * + * Copy VLAN tag (if was stripped) and corresponding protocol + * to the destination address. 
+ */ +static int ice_xdp_rx_vlan_tag(const struct xdp_md *ctx, __be16 *vlan_proto, + u16 *vlan_tci) +{ + const struct ice_xdp_buff *xdp_ext = (void *)ctx; + + *vlan_proto = xdp_ext->pkt_ctx->vlan_proto; + if (!*vlan_proto) + return -ENODATA; + + *vlan_tci = ice_get_vlan_tci(xdp_ext->eop_desc); + if (!*vlan_tci) + return -ENODATA; + + return 0; +} + +const struct xdp_metadata_ops ice_xdp_md_ops = { + .xmo_rx_timestamp = ice_xdp_rx_hw_ts, + .xmo_rx_hash = ice_xdp_rx_hash, + .xmo_rx_vlan_tag = ice_xdp_rx_vlan_tag, +}; diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h index 115969ecdf7b..762047508619 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.h +++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.h @@ -84,7 +84,7 @@ ice_build_ctob(u64 td_cmd, u64 td_offset, unsigned int size, u64 td_tag) } /** - * ice_get_vlan_tag_from_rx_desc - get VLAN from Rx flex descriptor + * ice_get_vlan_tci - get VLAN TCI from Rx flex descriptor * @rx_desc: Rx 32b flex descriptor with RXDID=2 * * The OS and current PF implementation only support stripping a single VLAN tag @@ -92,7 +92,7 @@ ice_build_ctob(u64 td_cmd, u64 td_offset, unsigned int size, u64 td_tag) * one is found return the tag, else return 0 to mean no VLAN tag was found. 
*/ static inline u16 -ice_get_vlan_tag_from_rx_desc(union ice_32b_rx_flex_desc *rx_desc) +ice_get_vlan_tci(const union ice_32b_rx_flex_desc *rx_desc) { u16 stat_err_bits; @@ -148,7 +148,17 @@ void ice_release_rx_desc(struct ice_rx_ring *rx_ring, u16 val); void ice_process_skb_fields(struct ice_rx_ring *rx_ring, union ice_32b_rx_flex_desc *rx_desc, - struct sk_buff *skb, u16 ptype); + struct sk_buff *skb); void -ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tag); +ice_receive_skb(struct ice_rx_ring *rx_ring, struct sk_buff *skb, u16 vlan_tci); + +static inline void +ice_xdp_meta_set_desc(struct xdp_buff *xdp, + union ice_32b_rx_flex_desc *eop_desc) +{ + struct ice_xdp_buff *xdp_ext = container_of(xdp, struct ice_xdp_buff, + xdp_buff); + + xdp_ext->eop_desc = eop_desc; +} #endif /* !_ICE_TXRX_LIB_H_ */ diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c index 99954508184f..5d1ae8e4058a 100644 --- a/drivers/net/ethernet/intel/ice/ice_xsk.c +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c @@ -458,6 +458,11 @@ static u16 ice_fill_rx_descs(struct xsk_buff_pool *pool, struct xdp_buff **xdp, rx_desc->read.pkt_addr = cpu_to_le64(dma); rx_desc->wb.status_error0 = 0; + /* Put private info that changes on a per-packet basis + * into xdp_buff_xsk->cb. 
+ */ + ice_xdp_meta_set_desc(*xdp, rx_desc); + rx_desc++; xdp++; } @@ -863,8 +868,7 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget) struct xdp_buff *xdp; struct sk_buff *skb; u16 stat_err_bits; - u16 vlan_tag = 0; - u16 rx_ptype; + u16 vlan_tci; rx_desc = ICE_RX_DESC(rx_ring, ntc); @@ -942,13 +946,10 @@ construct_skb: total_rx_bytes += skb->len; total_rx_packets++; - vlan_tag = ice_get_vlan_tag_from_rx_desc(rx_desc); - - rx_ptype = le16_to_cpu(rx_desc->wb.ptype_flex_flags0) & - ICE_RX_FLEX_DESC_PTYPE_M; + vlan_tci = ice_get_vlan_tci(rx_desc); - ice_process_skb_fields(rx_ring, rx_desc, skb, rx_ptype); - ice_receive_skb(rx_ring, skb, vlan_tag); + ice_process_skb_fields(rx_ring, rx_desc, skb); + ice_receive_skb(rx_ring, skb, vlan_tci); } rx_ring->next_to_clean = ntc; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c index e2e7d82cfca4..9e695ed122ee 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c @@ -256,9 +256,24 @@ static int mlx5e_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash, return 0; } +static int mlx5e_xdp_rx_vlan_tag(const struct xdp_md *ctx, __be16 *vlan_proto, + u16 *vlan_tci) +{ + const struct mlx5e_xdp_buff *_ctx = (void *)ctx; + const struct mlx5_cqe64 *cqe = _ctx->cqe; + + if (!cqe_has_vlan(cqe)) + return -ENODATA; + + *vlan_proto = htons(ETH_P_8021Q); + *vlan_tci = be16_to_cpu(cqe->vlan_info); + return 0; +} + const struct xdp_metadata_ops mlx5e_xdp_metadata_ops = { .xmo_rx_timestamp = mlx5e_xdp_rx_timestamp, .xmo_rx_hash = mlx5e_xdp_rx_hash, + .xmo_rx_vlan_tag = mlx5e_xdp_rx_vlan_tag, }; struct mlx5e_xsk_tx_complete { diff --git a/drivers/net/veth.c b/drivers/net/veth.c index 57efb3454c57..1efdbe4b92f5 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -1722,6 +1722,24 @@ static int veth_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash, return 0; } +static int veth_xdp_rx_vlan_tag(const struct 
xdp_md *ctx, __be16 *vlan_proto, + u16 *vlan_tci) +{ + const struct veth_xdp_buff *_ctx = (void *)ctx; + const struct sk_buff *skb = _ctx->skb; + int err; + + if (!skb) + return -ENODATA; + + err = __vlan_hwaccel_get_tag(skb, vlan_tci); + if (err) + return err; + + *vlan_proto = skb->vlan_proto; + return err; +} + static const struct net_device_ops veth_netdev_ops = { .ndo_init = veth_dev_init, .ndo_open = veth_open, @@ -1746,6 +1764,7 @@ static const struct net_device_ops veth_netdev_ops = { static const struct xdp_metadata_ops veth_xdp_metadata_ops = { .xmo_rx_timestamp = veth_xdp_rx_timestamp, .xmo_rx_hash = veth_xdp_rx_hash, + .xmo_rx_vlan_tag = veth_xdp_rx_vlan_tag, }; #define VETH_FEATURES (NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_HW_CSUM | \ diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h index 3028af87716e..c1645c86eed9 100644 --- a/include/linux/if_vlan.h +++ b/include/linux/if_vlan.h @@ -540,7 +540,7 @@ static inline int __vlan_get_tag(const struct sk_buff *skb, u16 *vlan_tci) struct vlan_ethhdr *veth = skb_vlan_eth_hdr(skb); if (!eth_type_vlan(veth->h_vlan_proto)) - return -EINVAL; + return -ENODATA; *vlan_tci = ntohs(veth->h_vlan_TCI); return 0; @@ -561,7 +561,7 @@ static inline int __vlan_hwaccel_get_tag(const struct sk_buff *skb, return 0; } else { *vlan_tci = 0; - return -EINVAL; + return -ENODATA; } } diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h index 820bca965fb6..01275c6e8468 100644 --- a/include/linux/mlx5/device.h +++ b/include/linux/mlx5/device.h @@ -918,7 +918,7 @@ static inline u8 get_cqe_tls_offload(struct mlx5_cqe64 *cqe) return (cqe->tls_outer_l3_tunneled >> 3) & 0x3; } -static inline bool cqe_has_vlan(struct mlx5_cqe64 *cqe) +static inline bool cqe_has_vlan(const struct mlx5_cqe64 *cqe) { return cqe->l4_l3_hdr_type & 0x1; } diff --git a/include/net/xdp.h b/include/net/xdp.h index 5d3673afc037..8cd04a74dba5 100644 --- a/include/net/xdp.h +++ b/include/net/xdp.h @@ -404,6 +404,10 @@ void 
xdp_attachment_setup(struct xdp_attachment_info *info, NETDEV_XDP_RX_METADATA_HASH, \ bpf_xdp_metadata_rx_hash, \ xmo_rx_hash) \ + XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_VLAN_TAG, \ + NETDEV_XDP_RX_METADATA_VLAN_TAG, \ + bpf_xdp_metadata_rx_vlan_tag, \ + xmo_rx_vlan_tag) \ enum xdp_rx_metadata { #define XDP_METADATA_KFUNC(name, _, __, ___) name, @@ -432,6 +436,7 @@ enum xdp_rss_hash_type { XDP_RSS_L4_UDP = BIT(5), XDP_RSS_L4_SCTP = BIT(6), XDP_RSS_L4_IPSEC = BIT(7), /* L4 based hash include IPSEC SPI */ + XDP_RSS_L4_ICMP = BIT(8), /* Second part: RSS hash type combinations used for driver HW mapping */ XDP_RSS_TYPE_NONE = 0, @@ -447,11 +452,13 @@ enum xdp_rss_hash_type { XDP_RSS_TYPE_L4_IPV4_UDP = XDP_RSS_L3_IPV4 | XDP_RSS_L4 | XDP_RSS_L4_UDP, XDP_RSS_TYPE_L4_IPV4_SCTP = XDP_RSS_L3_IPV4 | XDP_RSS_L4 | XDP_RSS_L4_SCTP, XDP_RSS_TYPE_L4_IPV4_IPSEC = XDP_RSS_L3_IPV4 | XDP_RSS_L4 | XDP_RSS_L4_IPSEC, + XDP_RSS_TYPE_L4_IPV4_ICMP = XDP_RSS_L3_IPV4 | XDP_RSS_L4 | XDP_RSS_L4_ICMP, XDP_RSS_TYPE_L4_IPV6_TCP = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_TCP, XDP_RSS_TYPE_L4_IPV6_UDP = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_UDP, XDP_RSS_TYPE_L4_IPV6_SCTP = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_SCTP, XDP_RSS_TYPE_L4_IPV6_IPSEC = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_IPSEC, + XDP_RSS_TYPE_L4_IPV6_ICMP = XDP_RSS_L3_IPV6 | XDP_RSS_L4 | XDP_RSS_L4_ICMP, XDP_RSS_TYPE_L4_IPV6_TCP_EX = XDP_RSS_TYPE_L4_IPV6_TCP | XDP_RSS_L3_DYNHDR, XDP_RSS_TYPE_L4_IPV6_UDP_EX = XDP_RSS_TYPE_L4_IPV6_UDP | XDP_RSS_L3_DYNHDR, @@ -462,6 +469,8 @@ struct xdp_metadata_ops { int (*xmo_rx_timestamp)(const struct xdp_md *ctx, u64 *timestamp); int (*xmo_rx_hash)(const struct xdp_md *ctx, u32 *hash, enum xdp_rss_hash_type *rss_type); + int (*xmo_rx_vlan_tag)(const struct xdp_md *ctx, __be16 *vlan_proto, + u16 *vlan_tci); }; #ifdef CONFIG_NET diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h index 81e02de3f453..b62bb8525a5f 100644 --- a/include/net/xdp_sock_drv.h +++ 
b/include/net/xdp_sock_drv.h @@ -14,6 +14,12 @@ #ifdef CONFIG_XDP_SOCKETS +struct xsk_cb_desc { + void *src; + u8 off; + u8 bytes; +}; + void xsk_tx_completed(struct xsk_buff_pool *pool, u32 nb_entries); bool xsk_tx_peek_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc); u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, u32 max); @@ -47,6 +53,12 @@ static inline void xsk_pool_set_rxq_info(struct xsk_buff_pool *pool, xp_set_rxq_info(pool, rxq); } +static inline void xsk_pool_fill_cb(struct xsk_buff_pool *pool, + struct xsk_cb_desc *desc) +{ + xp_fill_cb(pool, desc); +} + static inline unsigned int xsk_pool_get_napi_id(struct xsk_buff_pool *pool) { #ifdef CONFIG_NET_RX_BUSY_POLL @@ -274,6 +286,11 @@ static inline void xsk_pool_set_rxq_info(struct xsk_buff_pool *pool, { } +static inline void xsk_pool_fill_cb(struct xsk_buff_pool *pool, + struct xsk_cb_desc *desc) +{ +} + static inline unsigned int xsk_pool_get_napi_id(struct xsk_buff_pool *pool) { return 0; diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h index 8d48d37ab7c0..99dd7376df6a 100644 --- a/include/net/xsk_buff_pool.h +++ b/include/net/xsk_buff_pool.h @@ -12,6 +12,7 @@ struct xsk_buff_pool; struct xdp_rxq_info; +struct xsk_cb_desc; struct xsk_queue; struct xdp_desc; struct xdp_umem; @@ -135,6 +136,7 @@ static inline void xp_init_xskb_dma(struct xdp_buff_xsk *xskb, struct xsk_buff_p /* AF_XDP ZC drivers, via xdp_sock_buff.h */ void xp_set_rxq_info(struct xsk_buff_pool *pool, struct xdp_rxq_info *rxq); +void xp_fill_cb(struct xsk_buff_pool *pool, struct xsk_cb_desc *desc); int xp_dma_map(struct xsk_buff_pool *pool, struct device *dev, unsigned long attrs, struct page **pages, u32 nr_pages); void xp_dma_unmap(struct xsk_buff_pool *pool, unsigned long attrs); diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h index 6244c0164976..966638b08ccf 100644 --- a/include/uapi/linux/netdev.h +++ b/include/uapi/linux/netdev.h @@ -44,10 +44,13 @@ enum 
netdev_xdp_act { * timestamp via bpf_xdp_metadata_rx_timestamp(). * @NETDEV_XDP_RX_METADATA_HASH: Device is capable of exposing receive packet * hash via bpf_xdp_metadata_rx_hash(). + * @NETDEV_XDP_RX_METADATA_VLAN_TAG: Device is capable of exposing receive + * packet VLAN tag via bpf_xdp_metadata_rx_vlan_tag(). */ enum netdev_xdp_rx_metadata { NETDEV_XDP_RX_METADATA_TIMESTAMP = 1, NETDEV_XDP_RX_METADATA_HASH = 2, + NETDEV_XDP_RX_METADATA_VLAN_TAG = 4, }; /** diff --git a/net/core/xdp.c b/net/core/xdp.c index b6f1d6dab3f2..4869c1c2d8f3 100644 --- a/net/core/xdp.c +++ b/net/core/xdp.c @@ -736,6 +736,39 @@ __bpf_kfunc int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash, return -EOPNOTSUPP; } +/** + * bpf_xdp_metadata_rx_vlan_tag - Get XDP packet outermost VLAN tag + * @ctx: XDP context pointer. + * @vlan_proto: Destination pointer for VLAN Tag protocol identifier (TPID). + * @vlan_tci: Destination pointer for VLAN TCI (VID + DEI + PCP) + * + * In case of success, ``vlan_proto`` contains *Tag protocol identifier (TPID)*, + * usually ``ETH_P_8021Q`` or ``ETH_P_8021AD``, but some networks can use + * custom TPIDs. ``vlan_proto`` is stored in **network byte order (BE)** + * and should be used as follows: + * ``if (vlan_proto == bpf_htons(ETH_P_8021Q)) do_something();`` + * + * ``vlan_tci`` contains the remaining 16 bits of a VLAN tag. + * Driver is expected to provide those in **host byte order (usually LE)**, + * so the bpf program should not perform byte conversion. + * According to 802.1Q standard, *VLAN TCI (Tag control information)* + * is a bit field that contains: + * *VLAN identifier (VID)* that can be read with ``vlan_tci & 0xfff``, + * *Drop eligible indicator (DEI)* - 1 bit, + * *Priority code point (PCP)* - 3 bits. + * For detailed meaning of DEI and PCP, please refer to other sources. + * + * Return: + * * Returns 0 on success or ``-errno`` on error. 
+ * * ``-EOPNOTSUPP`` : device driver doesn't implement kfunc + * * ``-ENODATA`` : VLAN tag was not stripped or is not available + */ +__bpf_kfunc int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx, + __be16 *vlan_proto, u16 *vlan_tci) +{ + return -EOPNOTSUPP; +} + __bpf_kfunc_end_defs(); BTF_SET8_START(xdp_metadata_kfunc_ids) diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c index 4f6f538a5462..28711cc44ced 100644 --- a/net/xdp/xsk_buff_pool.c +++ b/net/xdp/xsk_buff_pool.c @@ -125,6 +125,18 @@ void xp_set_rxq_info(struct xsk_buff_pool *pool, struct xdp_rxq_info *rxq) } EXPORT_SYMBOL(xp_set_rxq_info); +void xp_fill_cb(struct xsk_buff_pool *pool, struct xsk_cb_desc *desc) +{ + u32 i; + + for (i = 0; i < pool->heads_cnt; i++) { + struct xdp_buff_xsk *xskb = &pool->heads[i]; + + memcpy(xskb->cb + desc->off, desc->src, desc->bytes); + } +} +EXPORT_SYMBOL(xp_fill_cb); + static void xp_disable_drv_zc(struct xsk_buff_pool *pool) { struct netdev_bpf bpf; diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h index 6244c0164976..966638b08ccf 100644 --- a/tools/include/uapi/linux/netdev.h +++ b/tools/include/uapi/linux/netdev.h @@ -44,10 +44,13 @@ enum netdev_xdp_act { * timestamp via bpf_xdp_metadata_rx_timestamp(). * @NETDEV_XDP_RX_METADATA_HASH: Device is capable of exposing receive packet * hash via bpf_xdp_metadata_rx_hash(). + * @NETDEV_XDP_RX_METADATA_VLAN_TAG: Device is capable of exposing receive + * packet VLAN tag via bpf_xdp_metadata_rx_vlan_tag(). 
*/ enum netdev_xdp_rx_metadata { NETDEV_XDP_RX_METADATA_TIMESTAMP = 1, NETDEV_XDP_RX_METADATA_HASH = 2, + NETDEV_XDP_RX_METADATA_VLAN_TAG = 4, }; /** diff --git a/tools/net/ynl/generated/netdev-user.c b/tools/net/ynl/generated/netdev-user.c index 3b9dee94d4ce..e3fe748086bd 100644 --- a/tools/net/ynl/generated/netdev-user.c +++ b/tools/net/ynl/generated/netdev-user.c @@ -53,6 +53,7 @@ const char *netdev_xdp_act_str(enum netdev_xdp_act value) static const char * const netdev_xdp_rx_metadata_strmap[] = { [0] = "timestamp", [1] = "hash", + [2] = "vlan-tag", }; const char *netdev_xdp_rx_metadata_str(enum netdev_xdp_rx_metadata value) diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c index 33cdf88efa6b..05edcf32f528 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c @@ -20,7 +20,7 @@ #define UDP_PAYLOAD_BYTES 4 -#define AF_XDP_SOURCE_PORT 1234 +#define UDP_SOURCE_PORT 1234 #define AF_XDP_CONSUMER_PORT 8080 #define UMEM_NUM 16 @@ -33,6 +33,18 @@ #define RX_ADDR "10.0.0.2" #define PREFIX_LEN "8" #define FAMILY AF_INET +#define TX_NETNS_NAME "xdp_metadata_tx" +#define RX_NETNS_NAME "xdp_metadata_rx" +#define TX_MAC "00:00:00:00:00:01" +#define RX_MAC "00:00:00:00:00:02" + +#define VLAN_ID 59 +#define VLAN_PROTO "802.1Q" +#define VLAN_PID htons(ETH_P_8021Q) +#define TX_NAME_VLAN TX_NAME "." 
TO_STR(VLAN_ID) + +#define XDP_RSS_TYPE_L4 BIT(3) +#define VLAN_VID_MASK 0xfff struct xsk { void *umem_area; @@ -181,7 +193,7 @@ static int generate_packet(struct xsk *xsk, __u16 dst_port) ASSERT_EQ(inet_pton(FAMILY, RX_ADDR, &iph->daddr), 1, "inet_pton(RX_ADDR)"); ip_csum(iph); - udph->source = htons(AF_XDP_SOURCE_PORT); + udph->source = htons(UDP_SOURCE_PORT); udph->dest = htons(dst_port); udph->len = htons(sizeof(*udph) + UDP_PAYLOAD_BYTES); udph->check = ~csum_tcpudp_magic(iph->saddr, iph->daddr, @@ -204,6 +216,30 @@ static int generate_packet(struct xsk *xsk, __u16 dst_port) return 0; } +static int generate_packet_inet(void) +{ + char udp_payload[UDP_PAYLOAD_BYTES]; + struct sockaddr_in rx_addr; + int sock_fd, err = 0; + + /* Build a packet */ + memset(udp_payload, 0xAA, UDP_PAYLOAD_BYTES); + rx_addr.sin_addr.s_addr = inet_addr(RX_ADDR); + rx_addr.sin_family = AF_INET; + rx_addr.sin_port = htons(AF_XDP_CONSUMER_PORT); + + sock_fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP); + if (!ASSERT_GE(sock_fd, 0, "socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)")) + return sock_fd; + + err = sendto(sock_fd, udp_payload, UDP_PAYLOAD_BYTES, MSG_DONTWAIT, + (void *)&rx_addr, sizeof(rx_addr)); + ASSERT_GE(err, 0, "sendto"); + + close(sock_fd); + return err; +} + static void complete_tx(struct xsk *xsk) { struct xsk_tx_metadata *meta; @@ -236,7 +272,7 @@ static void refill_rx(struct xsk *xsk, __u64 addr) } } -static int verify_xsk_metadata(struct xsk *xsk) +static int verify_xsk_metadata(struct xsk *xsk, bool sent_from_af_xdp) { const struct xdp_desc *rx_desc; struct pollfd fds = {}; @@ -290,17 +326,42 @@ static int verify_xsk_metadata(struct xsk *xsk) if (!ASSERT_NEQ(meta->rx_hash, 0, "rx_hash")) return -1; + if (!sent_from_af_xdp) { + if (!ASSERT_NEQ(meta->rx_hash_type & XDP_RSS_TYPE_L4, 0, "rx_hash_type")) + return -1; + + if (!ASSERT_EQ(meta->rx_vlan_tci & VLAN_VID_MASK, VLAN_ID, "rx_vlan_tci")) + return -1; + + if (!ASSERT_EQ(meta->rx_vlan_proto, VLAN_PID, "rx_vlan_proto")) + 
return -1;
+		goto done;
+	}
+
 	ASSERT_EQ(meta->rx_hash_type, 0, "rx_hash_type");
 	/* checksum offload */
 	ASSERT_EQ(udph->check, htons(0x721c), "csum");
+done:
 	xsk_ring_cons__release(&xsk->rx, 1);
 	refill_rx(xsk, comp_addr);
 	return 0;
 }
+static void switch_ns_to_rx(struct nstoken **tok)
+{
+	close_netns(*tok);
+	*tok = open_netns(RX_NETNS_NAME);
+}
+
+static void switch_ns_to_tx(struct nstoken **tok)
+{
+	close_netns(*tok);
+	*tok = open_netns(TX_NETNS_NAME);
+}
+
 void test_xdp_metadata(void)
 {
 	struct xdp_metadata2 *bpf_obj2 = NULL;
@@ -318,27 +379,35 @@ void test_xdp_metadata(void)
 	int sock_fd;
 	int ret;
-	/* Setup new networking namespace, with a veth pair. */
+	/* Setup new networking namespaces, with a veth pair. */
+	SYS(out, "ip netns add " TX_NETNS_NAME);
+	SYS(out, "ip netns add " RX_NETNS_NAME);
-	SYS(out, "ip netns add xdp_metadata");
-	tok = open_netns("xdp_metadata");
+	tok = open_netns(TX_NETNS_NAME);
 	SYS(out, "ip link add numtxqueues 1 numrxqueues 1 " TX_NAME
 	    " type veth peer " RX_NAME " numtxqueues 1 numrxqueues 1");
-	SYS(out, "ip link set dev " TX_NAME " address 00:00:00:00:00:01");
-	SYS(out, "ip link set dev " RX_NAME " address 00:00:00:00:00:02");
+	SYS(out, "ip link set " RX_NAME " netns " RX_NETNS_NAME);
+
+	SYS(out, "ip link set dev " TX_NAME " address " TX_MAC);
 	SYS(out, "ip link set dev " TX_NAME " up");
+
+	SYS(out, "ip link add link " TX_NAME " " TX_NAME_VLAN
+		 " type vlan proto " VLAN_PROTO " id " TO_STR(VLAN_ID));
+	SYS(out, "ip link set dev " TX_NAME_VLAN " up");
+	SYS(out, "ip addr add " TX_ADDR "/" PREFIX_LEN " dev " TX_NAME_VLAN);
+
+	/* Avoid ARP calls */
+	SYS(out, "ip -4 neigh add " RX_ADDR " lladdr " RX_MAC " dev " TX_NAME_VLAN);
+
+	switch_ns_to_rx(&tok);
+
+	SYS(out, "ip link set dev " RX_NAME " address " RX_MAC);
 	SYS(out, "ip link set dev " RX_NAME " up");
-	SYS(out, "ip addr add " TX_ADDR "/" PREFIX_LEN " dev " TX_NAME);
 	SYS(out, "ip addr add " RX_ADDR "/" PREFIX_LEN " dev " RX_NAME);
 	rx_ifindex = if_nametoindex(RX_NAME);
-	tx_ifindex = if_nametoindex(TX_NAME);
-
-	/* Setup separate AF_XDP for TX and RX interfaces. */
-	ret = open_xsk(tx_ifindex, &tx_xsk);
-	if (!ASSERT_OK(ret, "open_xsk(TX_NAME)"))
-		goto out;
+	/* Setup separate AF_XDP for RX interface. */
 	ret = open_xsk(rx_ifindex, &rx_xsk);
 	if (!ASSERT_OK(ret, "open_xsk(RX_NAME)"))
@@ -379,18 +448,38 @@ void test_xdp_metadata(void)
 	if (!ASSERT_GE(ret, 0, "bpf_map_update_elem"))
 		goto out;
-	/* Send packet destined to RX AF_XDP socket. */
+	switch_ns_to_tx(&tok);
+
+	/* Setup separate AF_XDP for TX interface nad send packet to the RX socket. */
+	tx_ifindex = if_nametoindex(TX_NAME);
+	ret = open_xsk(tx_ifindex, &tx_xsk);
+	if (!ASSERT_OK(ret, "open_xsk(TX_NAME)"))
+		goto out;
+
 	if (!ASSERT_GE(generate_packet(&tx_xsk, AF_XDP_CONSUMER_PORT), 0,
 		       "generate AF_XDP_CONSUMER_PORT"))
 		goto out;
-	/* Verify AF_XDP RX packet has proper metadata. */
-	if (!ASSERT_GE(verify_xsk_metadata(&rx_xsk), 0,
+	switch_ns_to_rx(&tok);
+
+	/* Verify packet sent from AF_XDP has proper metadata. */
+	if (!ASSERT_GE(verify_xsk_metadata(&rx_xsk, true), 0,
 		       "verify_xsk_metadata"))
 		goto out;
+	switch_ns_to_tx(&tok);
 	complete_tx(&tx_xsk);
+	/* Now check metadata of packet, generated with network stack */
+	if (!ASSERT_GE(generate_packet_inet(), 0, "generate UDP packet"))
+		goto out;
+
+	switch_ns_to_rx(&tok);
+
+	if (!ASSERT_GE(verify_xsk_metadata(&rx_xsk, false), 0,
+		       "verify_xsk_metadata"))
+		goto out;
+
 	/* Make sure freplace correctly picks up original bound device
 	 * and doesn't crash.
 	 */
@@ -408,11 +497,15 @@ void test_xdp_metadata(void)
 	if (!ASSERT_OK(xdp_metadata2__attach(bpf_obj2), "attach freplace"))
 		goto out;
+	switch_ns_to_tx(&tok);
+
 	/* Send packet to trigger . */
 	if (!ASSERT_GE(generate_packet(&tx_xsk, AF_XDP_CONSUMER_PORT), 0,
 		       "generate freplace packet"))
 		goto out;
+	switch_ns_to_rx(&tok);
+
 	while (!retries--) {
 		if (bpf_obj2->bss->called)
 			break;
@@ -427,5 +520,6 @@ out:
 	xdp_metadata__destroy(bpf_obj);
 	if (tok)
 		close_netns(tok);
-	SYS_NOFAIL("ip netns del xdp_metadata");
+	SYS_NOFAIL("ip netns del " RX_NETNS_NAME);
+	SYS_NOFAIL("ip netns del " TX_NETNS_NAME);
 }
diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
index f6d1cc9ad892..330ece2eabdb 100644
--- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
+++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
@@ -20,21 +20,32 @@ extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
 					 __u64 *timestamp) __ksym;
 extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash,
 				    enum xdp_rss_hash_type *rss_type) __ksym;
+extern int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
+					__be16 *vlan_proto,
+					__u16 *vlan_tci) __ksym;
 SEC("xdp.frags")
 int rx(struct xdp_md *ctx)
 {
 	void *data, *data_meta, *data_end;
 	struct ipv6hdr *ip6h = NULL;
-	struct ethhdr *eth = NULL;
 	struct udphdr *udp = NULL;
 	struct iphdr *iph = NULL;
 	struct xdp_meta *meta;
+	struct ethhdr *eth;
 	int err;
 	data = (void *)(long)ctx->data;
 	data_end = (void *)(long)ctx->data_end;
 	eth = data;
+
+	if (eth + 1 < data_end && (eth->h_proto == bpf_htons(ETH_P_8021AD) ||
+				   eth->h_proto == bpf_htons(ETH_P_8021Q)))
+		eth = (void *)eth + sizeof(struct vlan_hdr);
+
+	if (eth + 1 < data_end && eth->h_proto == bpf_htons(ETH_P_8021Q))
+		eth = (void *)eth + sizeof(struct vlan_hdr);
+
 	if (eth + 1 < data_end) {
 		if (eth->h_proto == bpf_htons(ETH_P_IP)) {
 			iph = (void *)(eth + 1);
@@ -76,15 +87,28 @@ int rx(struct xdp_md *ctx)
 		return XDP_PASS;
 	}
+	meta->hint_valid = 0;
+
+	meta->xdp_timestamp = bpf_ktime_get_tai_ns();
 	err = bpf_xdp_metadata_rx_timestamp(ctx, &meta->rx_timestamp);
-	if (!err)
-		meta->xdp_timestamp = bpf_ktime_get_tai_ns();
+	if (err)
+		meta->rx_timestamp_err = err;
 	else
-		meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */
+		meta->hint_valid |= XDP_META_FIELD_TS;
-	err = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash, &meta->rx_hash_type);
-	if (err < 0)
-		meta->rx_hash_err = err; /* Used by AF_XDP as no hash signal */
+	err = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash,
+				       &meta->rx_hash_type);
+	if (err)
+		meta->rx_hash_err = err;
+	else
+		meta->hint_valid |= XDP_META_FIELD_RSS;
+
+	err = bpf_xdp_metadata_rx_vlan_tag(ctx, &meta->rx_vlan_proto,
+					   &meta->rx_vlan_tci);
+	if (err)
+		meta->rx_vlan_tag_err = err;
+	else
+		meta->hint_valid |= XDP_META_FIELD_VLAN_TAG;
 	__sync_add_and_fetch(&pkts_redir, 1);
 	return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
diff --git a/tools/testing/selftests/bpf/progs/xdp_metadata.c b/tools/testing/selftests/bpf/progs/xdp_metadata.c
index 5d6c1245c310..31ca229bb3c0 100644
--- a/tools/testing/selftests/bpf/progs/xdp_metadata.c
+++ b/tools/testing/selftests/bpf/progs/xdp_metadata.c
@@ -23,6 +23,9 @@ extern int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx,
 					 __u64 *timestamp) __ksym;
 extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash,
 				    enum xdp_rss_hash_type *rss_type) __ksym;
+extern int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
+					__be16 *vlan_proto,
+					__u16 *vlan_tci) __ksym;
 SEC("xdp")
 int rx(struct xdp_md *ctx)
@@ -86,6 +89,8 @@ int rx(struct xdp_md *ctx)
 		meta->rx_timestamp = 1;
 	bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash, &meta->rx_hash_type);
+	bpf_xdp_metadata_rx_vlan_tag(ctx, &meta->rx_vlan_proto,
+				     &meta->rx_vlan_tci);
 	return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS);
 }
diff --git a/tools/testing/selftests/bpf/testing_helpers.h b/tools/testing/selftests/bpf/testing_helpers.h
index 5b7a55136741..35284faff4f2 100644
--- a/tools/testing/selftests/bpf/testing_helpers.h
+++ b/tools/testing/selftests/bpf/testing_helpers.h
@@ -9,6 +9,9 @@
 #include <bpf/libbpf.h>
 #include <time.h>
+#define __TO_STR(x) #x
+#define TO_STR(x) __TO_STR(x)
+
 int parse_num_list(const char *s, bool **set, int *set_len);
 __u32 link_info_prog_id(const struct bpf_link *link, struct bpf_link_info *info);
 int bpf_prog_test_load(const char *file, enum bpf_prog_type type,
diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
index c69c08933fdd..878d68db0325 100644
--- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
+++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
@@ -21,6 +21,9 @@
 #include "xsk.h"
 #include <error.h>
+#include <linux/kernel.h>
+#include <linux/bits.h>
+#include <linux/bitfield.h>
 #include <linux/errqueue.h>
 #include <linux/if_link.h>
 #include <linux/net_tstamp.h>
@@ -182,19 +185,31 @@ static void print_tstamp_delta(const char *name, const char *refname,
 	       (double)delta / 1000);
 }
+#define VLAN_PRIO_MASK GENMASK(15, 13) /* Priority Code Point */
+#define VLAN_DEI_MASK GENMASK(12, 12) /* Drop Eligible Indicator */
+#define VLAN_VID_MASK GENMASK(11, 0) /* VLAN Identifier */
+static void print_vlan_tci(__u16 tag)
+{
+	__u16 vlan_id = FIELD_GET(VLAN_VID_MASK, tag);
+	__u8 pcp = FIELD_GET(VLAN_PRIO_MASK, tag);
+	bool dei = FIELD_GET(VLAN_DEI_MASK, tag);
+
+	printf("PCP=%u, DEI=%d, VID=0x%X\n", pcp, dei, vlan_id);
+}
+
 static void verify_xdp_metadata(void *data, clockid_t clock_id)
 {
 	struct xdp_meta *meta;
 	meta = data - sizeof(*meta);
-	if (meta->rx_hash_err < 0)
-		printf("No rx_hash err=%d\n", meta->rx_hash_err);
-	else
+	if (meta->hint_valid & XDP_META_FIELD_RSS)
 		printf("rx_hash: 0x%X with RSS type:0x%X\n",
 		       meta->rx_hash, meta->rx_hash_type);
+	else
+		printf("No rx_hash, err=%d\n", meta->rx_hash_err);
-	if (meta->rx_timestamp) {
+	if (meta->hint_valid & XDP_META_FIELD_TS) {
 		__u64 ref_tstamp = gettime(clock_id);
 		/* store received timestamps to calculate a delta at tx */
@@ -206,7 +221,16 @@ static void verify_xdp_metadata(void *data, clockid_t clock_id)
 		print_tstamp_delta("XDP RX-time", "User RX-time",
 				   meta->xdp_timestamp, ref_tstamp);
 	} else {
-		printf("No rx_timestamp\n");
+		printf("No rx_timestamp, err=%d\n", meta->rx_timestamp_err);
+	}
+
+	if (meta->hint_valid & XDP_META_FIELD_VLAN_TAG) {
+		printf("rx_vlan_proto: 0x%X\n", ntohs(meta->rx_vlan_proto));
+		printf("rx_vlan_tci: ");
+		print_vlan_tci(meta->rx_vlan_tci);
+	} else {
+		printf("No rx_vlan_tci or rx_vlan_proto, err=%d\n",
+		       meta->rx_vlan_tag_err);
 	}
 }
diff --git a/tools/testing/selftests/bpf/xdp_metadata.h b/tools/testing/selftests/bpf/xdp_metadata.h
index 938a729bd307..87318ad1117a 100644
--- a/tools/testing/selftests/bpf/xdp_metadata.h
+++ b/tools/testing/selftests/bpf/xdp_metadata.h
@@ -9,12 +9,44 @@
 #define ETH_P_IPV6 0x86DD
 #endif
+#ifndef ETH_P_8021Q
+#define ETH_P_8021Q 0x8100
+#endif
+
+#ifndef ETH_P_8021AD
+#define ETH_P_8021AD 0x88A8
+#endif
+
+#ifndef BIT
+#define BIT(nr) (1 << (nr))
+#endif
+
+/* Non-existent checksum status */
+#define XDP_CHECKSUM_MAGIC BIT(2)
+
+enum xdp_meta_field {
+	XDP_META_FIELD_TS = BIT(0),
+	XDP_META_FIELD_RSS = BIT(1),
+	XDP_META_FIELD_VLAN_TAG = BIT(2),
+};
+
 struct xdp_meta {
-	__u64 rx_timestamp;
+	union {
+		__u64 rx_timestamp;
+		__s32 rx_timestamp_err;
+	};
 	__u64 xdp_timestamp;
 	__u32 rx_hash;
 	union {
 		__u32 rx_hash_type;
 		__s32 rx_hash_err;
 	};
+	union {
+		struct {
+			__be16 rx_vlan_proto;
+			__u16 rx_vlan_tci;
+		};
+		__s32 rx_vlan_tag_err;
+	};
+	enum xdp_meta_field hint_valid;
 };
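Illustrative note (not part of the patch): in the xdp_metadata.h hunk above, each hint shares storage with its error code through a union, so a consumer has to test hint_valid before reading either member. The sketch below shows that pattern for the VLAN hint in plain userspace C, without kernel headers. The names struct meta_sketch, vlan_read() and the TCI_* macros are hypothetical stand-ins invented for this example; the shifts and masks assume the usual 802.1Q TCI layout that the patch encodes with GENMASK(15, 13), GENMASK(12, 12) and GENMASK(11, 0).

```c
#include <stdint.h>

/* Validity bits, mirroring enum xdp_meta_field from the diff. */
enum xdp_meta_field {
	XDP_META_FIELD_TS       = 1 << 0,
	XDP_META_FIELD_RSS      = 1 << 1,
	XDP_META_FIELD_VLAN_TAG = 1 << 2,
};

/* Open-coded equivalents of the patch's GENMASK/FIELD_GET usage. */
#define TCI_PCP(tci) (((tci) >> 13) & 0x7)   /* bits 15..13 */
#define TCI_DEI(tci) (((tci) >> 12) & 0x1)   /* bit 12 */
#define TCI_VID(tci) ((tci) & 0x0FFF)        /* bits 11..0 */

/* Hypothetical reduced copy of struct xdp_meta: VLAN fields only. */
struct meta_sketch {
	union {
		struct {
			uint16_t rx_vlan_proto; /* network byte order in the patch */
			uint16_t rx_vlan_tci;
		};
		int32_t rx_vlan_tag_err;
	};
	uint32_t hint_valid;
};

/*
 * Returns 0 and stores the VLAN ID in *vid when the VLAN hint is valid;
 * otherwise returns the error code that occupies the same union.
 */
static int vlan_read(const struct meta_sketch *m, uint16_t *vid)
{
	if (!(m->hint_valid & XDP_META_FIELD_VLAN_TAG))
		return m->rx_vlan_tag_err;

	*vid = TCI_VID(m->rx_vlan_tci);
	return 0;
}
```

For a TCI of 0xA03B the macros yield PCP=5, DEI=0 and VID=0x3B. Reading rx_vlan_tci without checking hint_valid first would reinterpret the bytes of a stored error code as a tag, which is why the patch replaces the older "zero means unavailable" convention with explicit validity bits.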