aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/google
AgeCommit message (Collapse)AuthorFilesLines
2024-08-19gve: Remove unused declaration gve_rx_alloc_rings()Yue Haibing1-1/+0
Commit f13697cc7a19 ("gve: Switch to config-aware queue allocation") convert this function to gve_rx_alloc_rings_gqi(). Signed-off-by: Yue Haibing <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-08-13gve: Add RSS adminq commands and ethtool supportJeroen de Borst4-1/+235
Introduce adminq commands to configure and retrieve RSS settings from the device. Implement corresponding ethtool ops for user-level management. Signed-off-by: Jeroen de Borst <[email protected]> Co-developed-by: Ziwei Xiao <[email protected]> Signed-off-by: Ziwei Xiao <[email protected]> Signed-off-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Reviewed-by: Hariprasad Kelam <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-08-13gve: Add RSS device optionZiwei Xiao3-3/+51
Add a device option to inform the driver about the hash key size and hash table size used by the device. This information will be stored and made available for RSS ethtool operations. Signed-off-by: Ziwei Xiao <[email protected]> Signed-off-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-08-02gve: Fix use of netif_carrier_ok()Praveen Kaligineedi2-7/+7
GVE driver wrongly relies on netif_carrier_ok() to check the interface administrative state when resources are being allocated/deallocated for queue(s). netif_carrier_ok() needs to be replaced with netif_running() for all such cases. Administrative state is the result of "ip link set dev <dev> up/down". It reflects whether the administrator wants to use the device for traffic and the corresponding resources have been allocated. Fixes: 5f08cd3d6423 ("gve: Alloc before freeing when adjusting queues") Signed-off-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Shailend Chand <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-25gve: Fix an edge case for TSO skb validity checkBailey Forrest1-1/+21
The NIC requires each TSO segment to not span more than 10 descriptors. NIC further requires each descriptor to not exceed 16KB - 1 (GVE_TX_MAX_BUF_SIZE_DQO). The descriptors for an skb are generated by gve_tx_add_skb_no_copy_dqo() for DQO RDA queue format. gve_tx_add_skb_no_copy_dqo() loops through each skb frag and generates a descriptor for the entire frag if the frag size is not greater than GVE_TX_MAX_BUF_SIZE_DQO. If the frag size is greater than GVE_TX_MAX_BUF_SIZE_DQO, it is split into descriptor(s) of size GVE_TX_MAX_BUF_SIZE_DQO and a descriptor is generated for the remainder (frag size % GVE_TX_MAX_BUF_SIZE_DQO). gve_can_send_tso() checks if the descriptors thus generated for an skb would meet the requirement that each TSO-segment not span more than 10 descriptors. However, the current code misses an edge case when a TSO segment spans multiple descriptors within a large frag. This change fixes the edge case. gve_can_send_tso() relies on the assumption that max gso size (9728) is less than GVE_TX_MAX_BUF_SIZE_DQO and therefore within an skb fragment a TSO segment can never span more than 2 descriptors. Fixes: a57e5de476be ("gve: DQO: Add TX path") Signed-off-by: Praveen Kaligineedi <[email protected]> Signed-off-by: Bailey Forrest <[email protected]> Reviewed-by: Jeroen de Borst <[email protected]> Cc: [email protected] Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-07-17gve: Fix XDP TX completion handling when counters overflowJoshua Washington1-2/+3
In gve_clean_xdp_done, the driver processes the TX completions based on a 32-bit NIC counter and a 32-bit completion counter stored in the tx queue. Fix the for loop so that the counter wraparound is handled correctly. Fixes: 75eaae158b1b ("gve: Add XDP DROP and TX support for GQI-QPL format") Signed-off-by: Joshua Washington <[email protected]> Signed-off-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-06-25gve: Add flow steering ethtool supportJeroen de Borst5-9/+398
Implement the ethtool commands that can be used to configure and query flow-steering rules. A large part of this change consists of translating the ethtool representation of 'ntuples' to our internal gve_flow_rule and vice-versa in the new created gve_flow_rule.c Considering the possible large amount of flow rules, the driver doesn't store all the rules locally. When the user runs 'ethtool -n <nic>' to check the registered rules, the driver will send adminq command to query a limited amount of rules/rule ids(that filled in a 4096 bytes dma memory) at a time as a cache for the ethtool queries. The adminq query commands will be repeated for several times until the ethtool has queried all the needed rules. Signed-off-by: Jeroen de Borst <[email protected]> Co-developed-by: Ziwei Xiao <[email protected]> Signed-off-by: Ziwei Xiao <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-06-25gve: Add flow steering adminq commandsJeroen de Borst5-4/+314
Add new adminq commands for the driver to configure and query flow rules that are stored in the device. Flow steering rules are assigned with a location that determines the relative order of the rules. Flow rules can run up to an order of millions. In such cases, storing a full copy of the rules in the driver to prepare for the ethtool query is infeasible while querying them from the device is better. That needs to be optimized too so that we don't send a lot of adminq commands. The solution here is to store a limited number of rules/rule ids in the driver in a cache. Use dma_pool to allocate 4k bytes which lets device write at most 46 flow rules(4096/88) or 1024 rule ids(4096/4) at a time. For configuring flow rules, there are 3 sub-commands: - ADD which adds a rule at the location supplied - DEL which deletes the rule at the location supplied - RESET which clears all currently active rules in the device For querying flow rules, there are also 3 sub-commands: - QUERY_RULES corresponds to ETHTOOL_GRXCLSRULE. It fills the rules in the allocated cache after querying the device - QUERY_RULES_IDS corresponds to ETHTOOL_GRXCLSRLALL. It fills the rule_ids in the allocated cache after querying the device - QUERY_RULES_STATS corresponds to ETHTOOL_GRXCLSRLCNT. It queries the device's current flow rule number and the supported max flow rule limit Signed-off-by: Jeroen de Borst <[email protected]> Co-developed-by: Ziwei Xiao <[email protected]> Signed-off-by: Ziwei Xiao <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-06-25gve: Add flow steering device optionJeroen de Borst3-2/+51
Add a new device option to signal to the driver that the device supports flow steering. This device option also carries the maximum number of flow steering rules that the device can store. Signed-off-by: Jeroen de Borst <[email protected]> Co-developed-by: Ziwei Xiao <[email protected]> Signed-off-by: Ziwei Xiao <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-06-25gve: Add adminq extended commandJeroen de Borst2-0/+43
The adminq command is limited to 64 bytes per entry and it's 56 bytes for the command itself at maximum. To support larger commands, we need to dma_alloc a separate memory to put the command in that memory and send the dma memory address instead of the actual command. Introduce an extended adminq command to wrap the real command with the inner opcode and the allocated dma memory address specified. Once the device receives it, it can get the real command from the given dma memory address. As designed with the device, all the extended commands will use inner opcode larger than 0xFF. Signed-off-by: Jeroen de Borst <[email protected]> Co-developed-by: Ziwei Xiao <[email protected]> Signed-off-by: Ziwei Xiao <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-06-25gve: Add adminq mutex lockZiwei Xiao2-10/+13
We were depending on the rtnl_lock to make sure there is only one adminq command running at a time. But some commands may take too long to hold the rtnl_lock, such as the upcoming flow steering operations. For such situations, it can temporarily drop the rtnl_lock, and replace it for these operations with a new adminq lock, which can ensure the adminq command execution to be thread-safe. Signed-off-by: Ziwei Xiao <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-06-13gve: Clear napi->skb before dev_kfree_skb_any()Ziwei Xiao1-3/+5
gve_rx_free_skb incorrectly leaves napi->skb referencing an skb after it is freed with dev_kfree_skb_any(). This can result in a subsequent call to napi_get_frags returning a dangling pointer. Fix this by clearing napi->skb before the skb is freed. Fixes: 9b8dd5e5ea48 ("gve: DQO: Add RX path") Cc: [email protected] Reported-by: Shailend Chand <[email protected]> Signed-off-by: Ziwei Xiao <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Reviewed-by: Shailend Chand <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-06-11gve: ignore nonrelevant GSO type bits when processing TSO headersJoshua Washington1-15/+5
TSO currently fails when the skb's gso_type field has more than one bit set. TSO packets can be passed from userspace using PF_PACKET, TUNTAP and a few others, using virtio_net_hdr (e.g., PACKET_VNET_HDR). This includes virtualization, such as QEMU, a real use-case. The gso_type and gso_size fields as passed from userspace in virtio_net_hdr are not trusted blindly by the kernel. It adds gso_type |= SKB_GSO_DODGY to force the packet to enter the software GSO stack for verification. This issue might similarly come up when the CWR bit is set in the TCP header for congestion control, causing the SKB_GSO_TCP_ECN gso_type bit to be set. Fixes: a57e5de476be ("gve: DQO: Add TX path") Signed-off-by: Joshua Washington <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Suggested-by: Eric Dumazet <[email protected]> Acked-by: Andrei Vagin <[email protected]> v2 - Remove unnecessary comments, remove line break between fixes tag and signoffs. v3 - Add back unrelated empty line removal. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-10gve: Use ethtool_sprintf/puts() to fill stats stringsSimon Horman1-25/+17
Make use of standard helpers to simplify filling in stats strings. The first two ethtool_puts() changes address the following fortification warnings flagged by W=1 builds with clang-18. (The last ethtool_puts change does not because the warning relates to writing beyond the first element of an array, and gve_gstrings_priv_flags only has one element.) .../fortify-string.h:562:4: warning: call to '__read_overflow2_field' declared with 'warning' attribute: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] 562 | __read_overflow2_field(q_size_field, size); | ^ .../fortify-string.h:562:4: warning: call to '__read_overflow2_field' declared with 'warning' attribute: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] Likewise, the same changes resolve the same problems flagged by Smatch. .../gve_ethtool.c:100 gve_get_strings() error: __builtin_memcpy() '*gve_gstrings_main_stats' too small (32 vs 576) .../gve_ethtool.c:120 gve_get_strings() error: __builtin_memcpy() '*gve_gstrings_adminq_stats' too small (32 vs 512) Compile tested only. Reviewed-by: Shailend Chand <[email protected]> Reviewed-by: Larysa Zaremba <[email protected]> Signed-off-by: Simon Horman <[email protected]> Acked-by: Justin Stitt <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-10gve: Avoid unnecessary use of comma operatorSimon Horman1-2/+2
Although it does not seem to have any untoward side-effects, the use of ';' to separate to assignments seems more appropriate than ','. Flagged by clang-18 -Wcomma No functional change intended. Compile tested only. Reviewed-by: Shailend Chand <[email protected]> Reviewed-by: Larysa Zaremba <[email protected]> Signed-off-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-06gve: Implement queue apiShailend Chand5-24/+189
The new netdev queue api is implemented for gve. Tested-by: Mina Almasry <[email protected]> Reviewed-by: Mina Almasry <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Signed-off-by: Shailend Chand <[email protected]> Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-05gve: Alloc and free QPLs with the ringsShailend Chand7-331/+171
Every tx and rx ring has its own queue-page-list (QPL) that serves as the bounce buffer. Previously we were allocating QPLs for all queues before the queues themselves were allocated and later associating a QPL with a queue. This is avoidable complexity: it is much more natural for each queue to allocate and free its own QPL. Moreover, the advent of new queue-manipulating ndo hooks make it hard to keep things as is: we would need to transfer a QPL from an old queue to a new queue, and that is unpleasant. Tested-by: Mina Almasry <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Signed-off-by: Shailend Chand <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-05gve: Account for stopped queues when reading NIC statsShailend Chand1-6/+35
We now account for the fact that the NIC might send us stats for a subset of queues. Without this change, gve_get_ethtool_stats might make an invalid access on the priv->stats_report->stats array. Tested-by: Mina Almasry <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Signed-off-by: Shailend Chand <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-05gve: Reset Rx ring state in the ring-stop funcsShailend Chand2-30/+120
This does not fix any existing bug. In anticipation of the ndo queue api hooks that alloc/free/start/stop a single Rx queue, the already existing per-queue stop functions are being made more robust. Specifically for this use case: rx_queue_n.stop() + rx_queue_n.start() Note that this is not the use case being used in devmem tcp (the first place these new ndo hooks would be used). There the usecase is: new_queue.alloc() + old_queue.stop() + new_queue.start() + old_queue.free() Tested-by: Mina Almasry <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Signed-off-by: Shailend Chand <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-05gve: Avoid rescheduling napi if on wrong cpuShailend Chand2-2/+32
In order to make possible the implementation of per-queue ndo hooks, gve_turnup was changed in a previous patch to account for queues already having some unprocessed descriptors: it does a one-off napi_schdule to handle them. If conditions of consistent high traffic persist in the immediate aftermath of this, the poll routine for a queue can be "stuck" on the cpu on which the ndo hooks ran, instead of the cpu its irq has affinity with. This situation is exacerbated by the fact that the ndo hooks for all the queues are invoked on the same cpu, potentially causing all the napi poll routines to be residing on the same cpu. A self correcting mechanism in the poll method itself solves this problem. Tested-by: Mina Almasry <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Signed-off-by: Shailend Chand <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-05gve: Make gve_turnup work for nonempty queuesShailend Chand1-0/+14
gVNIC has a requirement that all queues have to be quiesced before any queue is operated on (created or destroyed). To enable the implementation of future ndo hooks that work on a single queue, we need to evolve gve_turnup to account for queues already having some unprocessed descriptors in the ring. Say rxq 4 is being stopped and started via the queue api. Due to gve's requirement of quiescence, queues 0 through 3 are not processing their rings while queue 4 is being toggled. Once they are made live, these queues need to be poked to cause them to check their rings for descriptors that were written during their brief period of quiescence. Tested-by: Mina Almasry <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Signed-off-by: Shailend Chand <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-05gve: Make gve_turn(up|down) ignore stopped queuesShailend Chand1-0/+10
Currently the queues are either all live or all dead, toggling from one state to the other via the ndo open and stop hooks. The future addition of single-queue ndo hooks changes this, and thus gve_turnup and gve_turndown should evolve to account for a state where some queues are live and some aren't. Tested-by: Mina Almasry <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Signed-off-by: Shailend Chand <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-05gve: Add adminq funcs to add/remove a single Rx queueShailend Chand2-27/+54
This allows for implementing future ndo hooks that act on a single queue. Tested-by: Mina Almasry <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Signed-off-by: Shailend Chand <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-05gve: Make the GQ RX free queue funcs idempotentShailend Chand1-10/+19
Although this is not fixing any existing double free bug, making these functions idempotent allows for a simpler implementation of future ndo hooks that act on a single queue. Tested-by: Mina Almasry <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Signed-off-by: Shailend Chand <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-18gve: Remove qpl_cfg struct since qpl_ids map with queues respectivelyZiwei Xiao7-113/+20
The qpl_cfg struct was used to make sure that no two different queues are using QPL with the same qpl_id. We can remove that qpl_cfg struct since now the qpl_ids map with the queues respectively as follows: For tx queues: qpl_id = tx_qid For rx queues: qpl_id = max_tx_queues + rx_qid And when XDP is used, it will need the user to reduce the tx queues to be at most half of the max_tx_queues. Then it will use the same number of tx queues starting from the end of existing tx queues for XDP. So the XDP queues will not exceed the max_tx_queues range and will not overlap with the rx queues, where the qpl_ids will not have overlapping too. Considering of that, we remove the qpl_cfg struct to get the qpl_id directly based on the queue id. Unless we are erroneously allocating a rx/tx queue that has already been allocated, we would never allocate the qpl with the same qpl_id twice. In that case, it should fail much earlier than the QPL assignment. Suggested-by: Praveen Kaligineedi <[email protected]> Signed-off-by: Ziwei Xiao <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Reviewed-by: Shailend Chand <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-04-15gve: Correctly report software timestamping capabilitiesJohn Fraker1-1/+2
gve has supported software timestamp generation since its inception, but has not advertised that support via ethtool. This patch correctly advertises that support. Signed-off-by: John Fraker <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-03gve: add support to change ring size via ethtoolHarshitha Ramamurthy3-14/+95
Allow the user to change ring size via ethtool if supported by the device. The driver relies on the ring size ranges queried from device to validate ring sizes requested by the user. Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: Harshitha Ramamurthy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-03gve: add support to read ring size ranges from the deviceHarshitha Ramamurthy3-24/+102
Add support to read ring size change capability and the min and max descriptor counts from the device and store it in the driver. Also accommodate a special case where the device does not provide minimum ring size depending on the version of the device. In that case, rely on default values for the minimums. Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: Harshitha Ramamurthy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-03gve: set page count for RX QPL for GQI and DQO queue formatsHarshitha Ramamurthy5-22/+20
Fulfill the requirement that for GQI, the number of pages per RX QPL is equal to the ring size. Set this value to be equal to ring size. Because of this change, the rx_data_slot_cnt and rx_pages_per_qpl fields stored in the priv structure are not needed, so remove their usage. And for DQO, the number of pages per RX QPL is more than ring size to account for out-of-order completions. So set it to two times of rx ring size. Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: Harshitha Ramamurthy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-03gve: make the completion and buffer ring size equal for DQOHarshitha Ramamurthy5-43/+13
For the DQO queue format, the gve driver stores two ring sizes for both TX and RX - one for completion queue ring and one for data buffer ring. This is supposed to enable asymmetric sizes for these two rings but that is not supported. Make both fields reference the same single variable. This change renders reading supported TX completion ring size and RX buffer ring size for DQO from the device useless, so change those fields to reserved and remove related code. Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: Harshitha Ramamurthy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-04-03gve: simplify setting decriptor count defaultsHarshitha Ramamurthy1-29/+15
Combine the gve_set_desc_cnt and gve_set_desc_cnt_dqo into one function which sets the counts after checking the queue format. Both the functions in the previous code and the new combined function never return an error so make the new function void and remove the goto on error. Also rename the new function to gve_set_default_desc_cnt to be clearer about its intention. Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: Harshitha Ramamurthy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-03-28gve: Add counter adminq_get_ptype_map_cnt to stats reportJohn Fraker1-1/+2
This counter counts the number of times get_ptype_map is executed on the admin queue, and was previously missing from the stats report. Signed-off-by: John Fraker <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-03-05net: introduce page_frag_cache_drain()Yunsheng Lin1-9/+2
When draining a page_frag_cache, most user are doing the similar steps, so introduce an API to avoid code duplication. Signed-off-by: Yunsheng Lin <[email protected]> Acked-by: Jason Wang <[email protected]> Reviewed-by: Alexander Duyck <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2024-03-04gve: Add header split ethtool statsJeroen de Borst3-11/+38
To record the stats of header split packets, three stats are added in the driver's ethtool stats. - rx_hsplit_pkt is the split packets count with header split - rx_hsplit_bytes is the received header bytes count with header split - rx_hsplit_unsplit_pkt is the unsplit packet count due to header buffer overflow or zero header length when header split is enabled Currently, it's entering the stats_update critical section more than once per packet. We have plans to avoid that in the future change to let all the stats_update happen in one place at the end of `gve_rx_poll_dqo`. Co-developed-by: Ziwei Xiao <[email protected]> Signed-off-by: Ziwei Xiao <[email protected]> Signed-off-by: Jeroen de Borst <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-03-04gve: Add header split data pathJeroen de Borst7-8/+169
Add header buffers and ethtool support to enable header split via the tcp-data-split flag in ethtool's ringparam config. A coherent dma memory is allocated for the header buffers. There is one header buffer per ring entry by calculating the offset to the header-buffers starting address. The header buffer is always copied directly into the skb and payload is always added as frags. When there is a header buffer overflow or the header length is 0, the driver places the whole unsplit packet in frags. When toggling header split, the driver will call gve_adjust_config to set its queues appropriately. If header split is enabled by the user and the max packet buffer size is no less than 4KB, driver will set the packet buffer size as 4KB to support TCP_ZEROCOPY_RECEIVE. Otherwise the driver will use the default 2KB as the packet buffer size. `ethtool -G <dev> tcp-data-split on/off` is the command to toggle header split. `ethtool -g <dev>` will show the status of header split with the field of `tcp-data-split`. Co-developed-by: Ziwei Xiao <[email protected]> Signed-off-by: Ziwei Xiao <[email protected]> Signed-off-by: Jeroen de Borst <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-03-04gve: Add header split device optionJeroen de Borst5-16/+70
To enable header split via ethtool, we first need to query the device to get the max rx buffer size and header buffer size. Add a device option to get these values and store them in the driver. If the header buffer size received from the device is non-zero, it means header split is supported in the device. Currently the max rx buffer size will only be used when header split is enabled which will set the data_buffer_size_dqo to be the max rx buffer size. Also change the data_buffer_size_dqo from int to u16 since we are modifying it and making it to be consistent with max_rx_buffer_size. Co-developed-by: Ziwei Xiao <[email protected]> Signed-off-by: Ziwei Xiao <[email protected]> Signed-off-by: Jeroen de Borst <[email protected]> Reviewed-by: Praveen Kaligineedi <[email protected]> Reviewed-by: Harshitha Ramamurthy <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-02-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-4/+4
Cross-merge networking fixes after downstream PR. No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <[email protected]>
2024-01-25gve: Modify rx_buf_alloc_fail counter centrally and closer to failureAnkit Garg1-8/+11
Previously, each caller of gve_rx_alloc_buffer had to increase counter and as a result one caller was not tracking those failure. Increasing counters at a common location now so callers don't have to duplicate code or miss counter management. Signed-off-by: Ankit Garg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-01-25gve: Fix skb truesize underestimationPraveen Kaligineedi1-4/+4
For a skb frag with a newly allocated copy page, the true size is incorrectly set to packet buffer size. It should be set to PAGE_SIZE instead. Fixes: 82fd151d38d9 ("gve: Reduce alloc and copy costs in the GQ rx path") Signed-off-by: Praveen Kaligineedi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-01-23gve: Alloc before freeing when changing featuresShailend Chand1-20/+21
Previously, existing queues were being freed before the resources for the new queues were being allocated. This would take down the interface if someone were to attempt to change feature flags under a resource crunch. Signed-off-by: Shailend Chand <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Reviewed-by: Jeroen de Borst <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-01-23gve: Alloc before freeing when adjusting queuesShailend Chand1-22/+67
Previously, existing queues were being freed before the resources for the new queues were being allocated. This would take down the interface if someone were to attempt to change queue counts under a resource crunch. Signed-off-by: Shailend Chand <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Reviewed-by: Jeroen de Borst <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-01-23gve: Refactor gve_open and gve_closeShailend Chand1-40/+119
gve_open is rewritten to be composed of two funcs: gve_queues_mem_alloc and gve_queues_start. The former only allocates queue resources without doing anything to install the queues, which is taken up by the latter. Similarly gve_close is split into gve_queues_stop and gve_queues_mem_free. Separating the acts of queue resource allocation and making the queue become live help with subsequent changes that aim to not take down the datapath when applying new configurations. Signed-off-by: Shailend Chand <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Reviewed-by: Jeroen de Borst <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-01-23gve: Switch to config-aware queue allocationShailend Chand9-425/+745
The new config-aware functions will help achieve the goal of being able to allocate resources for new queues while there already are active queues serving traffic. These new functions work off of arbitrary queue allocation configs rather than just the currently active config in priv, and they return the newly allocated resources instead of writing them into priv. Signed-off-by: Shailend Chand <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Reviewed-by: Jeroen de Borst <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-01-23gve: Refactor napi add and remove functionsShailend Chand5-17/+26
This change makes the napi poll functions non-static and moves the gve_(add|remove)_napi functions to gve_utils.c, to make possible future "start queue" hooks in the datapath files. Signed-off-by: Shailend Chand <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Reviewed-by: Jeroen de Borst <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-01-23gve: Define config structs for queue allocationShailend Chand1-0/+49
Queue allocation functions currently can only allocate into priv and free memory in priv. These new structs would be passed into the queue functions in a subsequent change to make them capable of returning newly allocated resources and not just writing them into priv. They also make it possible to allocate resources for queues with a different config than that of the currently active queues. Signed-off-by: Shailend Chand <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Reviewed-by: Jeroen de Borst <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-11-29gve: Remove dependency on 4k page size.John Fraker5-10/+11
Prior to this change, gve crashes when attempting to run in kernels with page sizes other than 4k. This change removes unnecessary references to PAGE_SIZE and replaces them with more meaningful constants. Signed-off-by: Jordan Kimbrough <[email protected]> Signed-off-by: John Fraker <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-11-29gve: Add page size register to the register_page_list command.John Fraker2-1/+3
This register is required on platforms with page sizes greater than 4k. This is because the tx side of the driver vmaps the entire queue page list of pages into a single flat address space, then uses the entire space. Without communicating the guest page size to the backend, the backend will only access the first 4k of each page in the queue page list. Signed-off-by: Jordan Kimbrough <[email protected]> Signed-off-by: John Fraker <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-11-29gve: Remove obsolete checks that rely on page size.John Fraker2-18/+1
These checks are safe to remove as they are no longer enforced by the backend. Retaining them would require updating them to work differently with page sizes larger than 4k. Signed-off-by: Jordan Kimbrough <[email protected]> Signed-off-by: John Fraker <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-11-29gve: Deprecate adminq_pfn for pci revision 0x1.John Fraker2-13/+44
adminq_pfn assumes a page size of 4k, causing this mechanism to break in kernels compiled with different page sizes. A new PCI device revision was needed for the device to be able to communicate with the driver how to set up the admin queue prior to having access to the admin queue. Signed-off-by: Jordan Kimbrough <[email protected]> Signed-off-by: John Fraker <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-11-29gve: Perform adminq allocations through a dma_pool.John Fraker2-10/+22
This allows the adminq to be smaller than a page, paving the way for non 4k page support. This is to support platforms where PAGE_SIZE is not 4k, such as some ARM platforms. Signed-off-by: Jordan Kimbrough <[email protected]> Signed-off-by: John Fraker <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>