aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/xen-netfront.c
AgeCommit message (Collapse)AuthorFilesLines
2022-02-25xen/netfront: destroy queues before real_num_tx_queues is zeroedMarek Marczykowski-Górecki1-16/+23
xennet_destroy_queues() relies on info->netdev->real_num_tx_queues to delete queues. Since d7dac083414eb5bb99a6d2ed53dc2c1b405224e5 ("net-sysfs: update the queue counts in the unregistration path"), unregister_netdev() indirectly sets real_num_tx_queues to 0. Those two facts together means, that xennet_destroy_queues() called from xennet_remove() cannot do its job, because it's called after unregister_netdev(). This results in kfree-ing queues that are still linked in napi, which ultimately crashes: BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 1 PID: 52 Comm: xenwatch Tainted: G W 5.16.10-1.32.fc32.qubes.x86_64+ #226 RIP: 0010:free_netdev+0xa3/0x1a0 Code: ff 48 89 df e8 2e e9 00 00 48 8b 43 50 48 8b 08 48 8d b8 a0 fe ff ff 48 8d a9 a0 fe ff ff 49 39 c4 75 26 eb 47 e8 ed c1 66 ff <48> 8b 85 60 01 00 00 48 8d 95 60 01 00 00 48 89 ef 48 2d 60 01 00 RSP: 0000:ffffc90000bcfd00 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff88800edad000 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffffc90000bcfc30 RDI: 00000000ffffffff RBP: fffffffffffffea0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: ffff88800edad050 R13: ffff8880065f8f88 R14: 0000000000000000 R15: ffff8880066c6680 FS: 0000000000000000(0000) GS:ffff8880f3300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000000e998c006 CR4: 00000000003706e0 Call Trace: <TASK> xennet_remove+0x13d/0x300 [xen_netfront] xenbus_dev_remove+0x6d/0xf0 __device_release_driver+0x17a/0x240 device_release_driver+0x24/0x30 bus_remove_device+0xd8/0x140 device_del+0x18b/0x410 ? _raw_spin_unlock+0x16/0x30 ? klist_iter_exit+0x14/0x20 ? xenbus_dev_request_and_reply+0x80/0x80 device_unregister+0x13/0x60 xenbus_dev_changed+0x18e/0x1f0 xenwatch_thread+0xc0/0x1a0 ? do_wait_intr_irq+0xa0/0xa0 kthread+0x16b/0x190 ? set_kthread_struct+0x40/0x40 ret_from_fork+0x22/0x30 </TASK> Fix this by calling xennet_destroy_queues() from xennet_uninit(), when real_num_tx_queues is still available. This ensures that queues are destroyed when real_num_tx_queues is set to 0, regardless of how unregister_netdev() was called. Originally reported at https://github.com/QubesOS/qubes-issues/issues/7257 Fixes: d7dac083414eb5bb9 ("net-sysfs: update the queue counts in the unregistration path") Cc: [email protected] Signed-off-by: Marek Marczykowski-Górecki <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-12-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller1-1/+1
Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-12-30 The following pull-request contains BPF updates for your *net-next* tree. We've added 72 non-merge commits during the last 20 day(s) which contain a total of 223 files changed, 3510 insertions(+), 1591 deletions(-). The main changes are: 1) Automatic setrlimit in libbpf when bpf is memcg's in the kernel, from Andrii. 2) Beautify and de-verbose verifier logs, from Christy. 3) Composable verifier types, from Hao. 4) bpf_strncmp helper, from Hou. 5) bpf.h header dependency cleanup, from Jakub. 6) get_func_[arg|ret|arg_cnt] helpers, from Jiri. 7) Sleepable local storage, from KP. 8) Extend kfunc with PTR_TO_CTX, PTR_TO_MEM argument support, from Kumar. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-12-16xen/netfront: harden netfront against event channel stormsJuergen Gross1-31/+94
The Xen netfront driver is still vulnerable for an attack via excessive number of events sent by the backend. Fix that by using lateeoi event channels. For being able to detect the case of no rx responses being added while the carrier is down a new lock is needed in order to update and test rsp_cons and the number of seen unconsumed responses atomically. This is part of XSA-391 Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Jan Beulich <[email protected]> --- V2: - don't eoi irq in case of interface set broken (Jan Beulich) - handle carrier off + no new responses added (Jan Beulich) V3: - add rx_ prefix to rsp_unconsumed (Jan Beulich) - correct xennet_set_rx_rsp_cons() spelling (Jan Beulich)
2021-12-13bpf: Let bpf_warn_invalid_xdp_action() report more infoPaolo Abeni1-1/+1
In non trivial scenarios, the action id alone is not sufficient to identify the program causing the warning. Before the previous patch, the generated stack-trace pointed out at least the involved device driver. Let's additionally include the program name and id, and the relevant device name. If the user needs additional infos, he can fetch them via a kernel probe, leveraging the arguments added here. Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Toke Høiland-Jørgensen <[email protected]> Link: https://lore.kernel.org/bpf/ddb96bb975cbfddb1546cf5da60e77d5100b533c.1638189075.git.pabeni@redhat.com
2021-10-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-0/+8
include/net/sock.h 7b50ecfcc6cd ("net: Rename ->stream_memory_read to ->sock_is_readable") 4c1e34c0dbff ("vsock: Enable y2038 safe timeval for timeout") drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c 0daa55d033b0 ("octeontx2-af: cn10k: debugfs for dumping LMTST map table") e77bcdd1f639 ("octeontx2-af: Display all enabled PF VF rsrc_alloc entries.") Adjacent code addition in both cases, keep both. Signed-off-by: Jakub Kicinski <[email protected]>
2021-10-25xen/netfront: stop tx queues during live migrationDongli Zhang1-0/+8
The tx queues are not stopped during the live migration. As a result, the ndo_start_xmit() may access netfront_info->queues which is freed by talk_to_netback()->xennet_destroy_queues(). This patch is to netif_device_detach() at the beginning of xen-netfront resuming, and netif_device_attach() at the end of resuming. CPU A CPU B talk_to_netback() -> if (info->queues) xennet_destroy_queues(info); to free netfront_info->queues xennet_start_xmit() to access netfront_info->queues -> err = xennet_create_queues(info, &num_queues); The idea is borrowed from virtio-net. Cc: Joe Jin <[email protected]> Signed-off-by: Dongli Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-10-22net: xen: use eth_hw_addr_set()Jakub Kicinski1-1/+3
Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it got through appropriate helpers. Signed-off-by: Jakub Kicinski <[email protected]>
2021-08-25xen/netfront: don't trust the backend response data blindlyJuergen Gross1-5/+84
Today netfront will trust the backend to send only sane response data. In order to avoid privilege escalations or crashes in case of malicious backends verify the data to be within expected limits. Especially make sure that the response always references an outstanding request. Note that only the tx queue needs special id handling, as for the rx queue the id is equal to the index in the ring page. Introduce a new indicator for the device whether it is broken and let the device stop working when it is set. Set this indicator in case the backend sets any weird data. Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Jan Beulich <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-25xen/netfront: disentangle tx_skb_freelistJuergen Gross1-36/+25
The tx_skb_freelist elements are in a single linked list with the request id used as link reference. The per element link field is in a union with the skb pointer of an in use request. Move the link reference out of the union in order to enable a later reuse of it for requests which need a populated skb pointer. Rename add_id_to_freelist() and get_id_from_freelist() to add_id_to_list() and get_id_from_list() in order to prepare using those for other lists as well. Define ~0 as value to indicate the end of a list and place that value into the link for a request not being on the list. When freeing a skb zero the skb pointer in the request. Use a NULL value of the skb pointer instead of skb_entry_is_link() for deciding whether a request has a skb linked to it. Remove skb_entry_set_link() and open code it instead as it is really trivial now. Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-25xen/netfront: don't read data from request on the ring pageJuergen Gross1-44/+42
In order to avoid a malicious backend being able to influence the local processing of a request build the request locally first and then copy it to the ring page. Any reading from the request influencing the processing in the frontend needs to be done on the local instance. Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Jan Beulich <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-08-25xen/netfront: read response from backend only onceJuergen Gross1-19/+19
In order to avoid problems in case the backend is modifying a response on the ring page while the frontend has already seen it, just read the response into a local buffer in one go and then operate on that buffer only. Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Jan Beulich <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-18bpf, devmap: Move drop error path to devmap for XDP_REDIRECTLorenzo Bianconi1-9/+9
We want to change the current ndo_xdp_xmit drop semantics because it will allow us to implement better queue overflow handling. This is working towards the larger goal of a XDP TX queue-hook. Move XDP_REDIRECT error path handling from each XDP ethernet driver to devmap code. According to the new APIs, the driver running the ndo_xdp_xmit pointer, will break tx loop whenever the hw reports a tx error and it will just return to devmap caller the number of successfully transmitted frames. It will be devmap responsibility to free dropped frames. Move each XDP ndo_xdp_xmit capable driver to the new APIs: - veth - virtio-net - mvneta - mvpp2 - socionext - amazon ena - bnxt - freescale (dpaa2, dpaa) - xen-frontend - qede - ice - igb - ixgbe - i40e - mlx5 - ti (cpsw, cpsw-new) - tun - sfc Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Ioana Ciornei <[email protected]> Reviewed-by: Ilias Apalodimas <[email protected]> Reviewed-by: Camelia Groza <[email protected]> Acked-by: Edward Cree <[email protected]> Acked-by: Jesper Dangaard Brouer <[email protected]> Acked-by: Shay Agroskin <[email protected]> Link: https://lore.kernel.org/bpf/ed670de24f951cfd77590decf0229a0ad7fd12f6.1615201152.git.lorenzo@kernel.org
2021-02-04drivers: net: xen-netfront: Simplify the calculation of variablesJiapeng Chong1-1/+1
Fix the following coccicheck warnings: ./drivers/net/xen-netfront.c:1816:52-54: WARNING !A || A && B is equivalent to !A || B. Reported-by: Abaci Robot <[email protected]> Signed-off-by: Jiapeng Chong <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Link: https://lore.kernel.org/r/1612261069-13315-1-git-send-email-jiapeng.chong@linux.alibaba.com Signed-off-by: Jakub Kicinski <[email protected]>
2021-01-16net: ethernet: ibm: ibmvnic: Fix some kernel-doc misdemeanoursLee Jones1-3/+3
Fixes the following W=1 kernel build warning(s): from drivers/net/ethernet/ibm/ibmvnic.c:35: inlined from ‘handle_vpd_rsp’ at drivers/net/ethernet/ibm/ibmvnic.c:4124:3: drivers/net/ethernet/ibm/ibmvnic.c:1362: warning: Function parameter or member 'hdr_field' not described in 'build_hdr_data' drivers/net/ethernet/ibm/ibmvnic.c:1362: warning: Function parameter or member 'skb' not described in 'build_hdr_data' drivers/net/ethernet/ibm/ibmvnic.c:1362: warning: Function parameter or member 'hdr_len' not described in 'build_hdr_data' drivers/net/ethernet/ibm/ibmvnic.c:1362: warning: Function parameter or member 'hdr_data' not described in 'build_hdr_data' drivers/net/ethernet/ibm/ibmvnic.c:1423: warning: Function parameter or member 'hdr_field' not described in 'create_hdr_descs' drivers/net/ethernet/ibm/ibmvnic.c:1423: warning: Function parameter or member 'hdr_data' not described in 'create_hdr_descs' drivers/net/ethernet/ibm/ibmvnic.c:1423: warning: Function parameter or member 'len' not described in 'create_hdr_descs' drivers/net/ethernet/ibm/ibmvnic.c:1423: warning: Function parameter or member 'hdr_len' not described in 'create_hdr_descs' drivers/net/ethernet/ibm/ibmvnic.c:1423: warning: Function parameter or member 'scrq_arr' not described in 'create_hdr_descs' drivers/net/ethernet/ibm/ibmvnic.c:1474: warning: Function parameter or member 'txbuff' not described in 'build_hdr_descs_arr' drivers/net/ethernet/ibm/ibmvnic.c:1474: warning: Function parameter or member 'num_entries' not described in 'build_hdr_descs_arr' drivers/net/ethernet/ibm/ibmvnic.c:1474: warning: Function parameter or member 'hdr_field' not described in 'build_hdr_descs_arr' drivers/net/ethernet/ibm/ibmvnic.c:1832: warning: Function parameter or member 'adapter' not described in 'do_change_param_reset' drivers/net/ethernet/ibm/ibmvnic.c:1832: warning: Function parameter or member 'rwi' not described in 'do_change_param_reset' drivers/net/ethernet/ibm/ibmvnic.c:1832: warning: Function parameter or member 'reset_state' not described in 'do_change_param_reset' drivers/net/ethernet/ibm/ibmvnic.c:1911: warning: Function parameter or member 'adapter' not described in 'do_reset' drivers/net/ethernet/ibm/ibmvnic.c:1911: warning: Function parameter or member 'rwi' not described in 'do_reset' drivers/net/ethernet/ibm/ibmvnic.c:1911: warning: Function parameter or member 'reset_state' not described in 'do_reset' Signed-off-by: Lee Jones <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-01-08net, xdp: Introduce xdp_prepare_buff utility routineLorenzo Bianconi1-4/+2
Introduce xdp_prepare_buff utility routine to initialize per-descriptor xdp_buff fields (e.g. xdp_buff pointers). Rely on xdp_prepare_buff() in all XDP capable drivers. Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Duyck <[email protected]> Acked-by: Jesper Dangaard Brouer <[email protected]> Acked-by: John Fastabend <[email protected]> Acked-by: Shay Agroskin <[email protected]> Acked-by: Martin Habets <[email protected]> Acked-by: Camelia Groza <[email protected]> Acked-by: Marcin Wojtas <[email protected]> Link: https://lore.kernel.org/bpf/45f46f12295972a97da8ca01990b3e71501e9d89.1608670965.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <[email protected]>
2021-01-08net, xdp: Introduce xdp_init_buff utility routineLorenzo Bianconi1-2/+2
Introduce xdp_init_buff utility routine to initialize xdp_buff fields const over NAPI iterations (e.g. frame_sz or rxq pointer). Rely on xdp_init_buff in all XDP capable drivers. Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Alexander Duyck <[email protected]> Acked-by: Jesper Dangaard Brouer <[email protected]> Acked-by: John Fastabend <[email protected]> Acked-by: Shay Agroskin <[email protected]> Acked-by: Martin Habets <[email protected]> Acked-by: Camelia Groza <[email protected]> Acked-by: Marcin Wojtas <[email protected]> Link: https://lore.kernel.org/bpf/7f8329b6da1434dc2b05a77f2e800b29628a8913.1608670965.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <[email protected]>
2020-12-01xsk: Propagate napi_id to XDP socket Rx pathBjörn Töpel1-1/+1
Add napi_id to the xdp_rxq_info structure, and make sure the XDP socket pick up the napi_id in the Rx path. The napi_id is used to find the corresponding NAPI structure for socket busy polling. Signed-off-by: Björn Töpel <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Ilias Apalodimas <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Acked-by: Tariq Toukan <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-11-02drivers: net: xen-netfront: Fixed W=1 set but unused warningsAndrew Lunn1-2/+1
drivers/net/xen-netfront.c:2416:16: warning: variable ‘target’ set but not used [-Wunused-but-set-variable] 2416 | unsigned long target; Remove target and just discard the return value from simple_strtoul(). This patch does give a checkpatch warning, but the warning was there before anyway, as this file has lots of checkpatch warnings. Signed-off-by: Andrew Lunn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-08-23treewide: Use fallthrough pseudo-keywordGustavo A. R. Silva1-1/+1
Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <[email protected]>
2020-07-25bpf, xdp: Remove XDP_QUERY_PROG and XDP_QUERY_PROG_HW XDP commandsAndrii Nakryiko1-21/+0
Now that BPF program/link management is centralized in generic net_device code, kernel code never queries program id from drivers, so XDP_QUERY_PROG/XDP_QUERY_PROG_HW commands are unnecessary. This patch removes all the implementations of those commands in kernel, along the xdp_attachment_query(). This patch was compile-tested on allyesconfig. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-07-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-22/+42
The UDP reuseport conflict was a little bit tricky. The net-next code, via bpf-next, extracted the reuseport handling into a helper so that the BPF sk lookup code could invoke it. At the same time, the logic for reuseport handling of unconnected sockets changed via commit efc6b6f6c3113e8b203b9debfb72d81e0f3dcace which changed the logic to carry on the reuseport result into the rest of the lookup loop if we do not return immediately. This requires moving the reuseport_has_conns() logic into the callers. While we are here, get rid of inline directives as they do not belong in foo.c files. The other changes were cases of more straightforward overlapping modifications. Signed-off-by: David S. Miller <[email protected]>
2020-07-24xen-netfront: fix potential deadlock in xennet_remove()Andrea Righi1-22/+42
There's a potential race in xennet_remove(); this is what the driver is doing upon unregistering a network device: 1. state = read bus state 2. if state is not "Closed": 3. request to set state to "Closing" 4. wait for state to be set to "Closing" 5. request to set state to "Closed" 6. wait for state to be set to "Closed" If the state changes to "Closed" immediately after step 1 we are stuck forever in step 4, because the state will never go back from "Closed" to "Closing". Make sure to check also for state == "Closed" in step 4 to prevent the deadlock. Also add a 5 sec timeout any time we wait for the bus state to change, to avoid getting stuck forever in wait_event(). Signed-off-by: Andrea Righi <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-03net/xen-netfront: add kernel TX timestampsDaniel Drown1-0/+4
This adds kernel TX timestamps to the xen-netfront driver. Tested with chrony on an AWS EC2 instance. Signed-off-by: Daniel Drown <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-02xen-netfront: remove redundant assignment to variable 'act'Colin Ian King1-1/+1
The variable act is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-01xen networking: add basic XDP support for xen-netfrontDenis Kirjanov1-10/+326
The patch adds a basic XDP processing to xen-netfront driver. We ran an XDP program for an RX response received from netback driver. Also we request xen-netback to adjust data offset for bpf_xdp_adjust_head() header space for custom headers. synchronization between frontend and backend parts is done by using xenbus state switching: Reconfiguring -> Reconfigured- > Connected UDP packets drop rate using xdp program is around 310 kpps using ./pktgen_sample04_many_flows.sh and 160 kpps without the patch. Signed-off-by: Denis Kirjanov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-10-01xen-netfront: do not use ~0U as error return value for xennet_fill_frags()Dongli Zhang1-8/+9
xennet_fill_frags() uses ~0U as return value when the sk_buff is not able to cache extra fragments. This is incorrect because the return type of xennet_fill_frags() is RING_IDX and 0xffffffff is an expected value for ring buffer index. In the situation when the rsp_cons is approaching 0xffffffff, the return value of xennet_fill_frags() may become 0xffffffff which xennet_poll() (the caller) would regard as error. As a result, queue->rx.rsp_cons is set incorrectly because it is updated only when there is error. If there is no error, xennet_poll() would be responsible to update queue->rx.rsp_cons. Finally, queue->rx.rsp_cons would point to the rx ring buffer entries whose queue->rx_skbs[i] and queue->grant_rx_ref[i] are already cleared to NULL. This leads to NULL pointer access in the next iteration to process rx ring buffer entries. The symptom is similar to the one fixed in commit 00b368502d18 ("xen-netfront: do not assume sk_buff_head list is empty in error handling"). This patch changes the return type of xennet_fill_frags() to indicate whether it is successful or failed. The queue->rx.rsp_cons will be always updated inside this function. Fixes: ad4f15dc2c70 ("xen/netfront: don't bug in case of too many frags") Signed-off-by: Dongli Zhang <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-09-17Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/netDavid S. Miller1-1/+1
Pull in bug fixes from 'net' tree for the merge window. Signed-off-by: David S. Miller <[email protected]>
2019-09-16xen-netfront: do not assume sk_buff_head list is empty in error handlingDongli Zhang1-1/+1
When skb_shinfo(skb) is not able to cache extra fragment (that is, skb_shinfo(skb)->nr_frags >= MAX_SKB_FRAGS), xennet_fill_frags() assumes the sk_buff_head list is already empty. As a result, cons is increased only by 1 and returns to error handling path in xennet_poll(). However, if the sk_buff_head list is not empty, queue->rx.rsp_cons may be set incorrectly. That is, queue->rx.rsp_cons would point to the rx ring buffer entries whose queue->rx_skbs[i] and queue->grant_rx_ref[i] are already cleared to NULL. This leads to NULL pointer access in the next iteration to process rx ring buffer entries. Below is how xennet_poll() does error handling. All remaining entries in tmpq are accounted to queue->rx.rsp_cons without assuming how many outstanding skbs are remained in the list. 985 static int xennet_poll(struct napi_struct *napi, int budget) ... ... 1032 if (unlikely(xennet_set_skb_gso(skb, gso))) { 1033 __skb_queue_head(&tmpq, skb); 1034 queue->rx.rsp_cons += skb_queue_len(&tmpq); 1035 goto err; 1036 } It is better to always have the error handling in the same way. Fixes: ad4f15dc2c70 ("xen/netfront: don't bug in case of too many frags") Signed-off-by: Dongli Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-30net: Use skb_frag_off accessorsJonathan Lemon1-4/+4
Use accessor functions for skb fragment's page_offset instead of direct references, in preparation for bvec conversion. Signed-off-by: Jonathan Lemon <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-16xen-netfront: mark expected switch fall-throughGustavo A. R. Silva1-1/+1
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. This patch fixes the following warning: drivers/net/xen-netfront.c: In function ‘netback_changed’: drivers/net/xen-netfront.c:2038:6: warning: this statement may fall through [-Wimplicit-fallthrough=] if (dev->state == XenbusStateClosed) ^ drivers/net/xen-netfront.c:2041:2: note: here case XenbusStateClosing: ^~~~ Warning level 3 was used: -Wimplicit-fallthrough=3 Notice that, in this particular case, the code comment is modified in accordance with what GCC is expecting to find. This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough. Signed-off-by: Gustavo A. R. Silva <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-03-20net: remove 'fallback' argument from dev->ndo_select_queue()Paolo Abeni1-2/+1
After the previous patch, all the callers of ndo_select_queue() provide as a 'fallback' argument netdev_pick_tx. The only exceptions are nested calls to ndo_select_queue(), which pass down the 'fallback' available in the current scope - still netdev_pick_tx. We can drop such argument and replace fallback() invocation with netdev_pick_tx(). This avoids an indirect call per xmit packet in some scenarios (TCP syn, UDP unconnected, XDP generic, pktgen) with device drivers implementing such ndo. It also clean the code a bit. Tested with ixgbe and CONFIG_FCOE=m With pktgen using queue xmit: threads vanilla patched (kpps) (kpps) 1 2334 2428 2 4166 4278 4 7895 8100 v1 -> v2: - rebased after helper's name change Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-1/+1
Lots of conflicts, by happily all cases of overlapping changes, parallel adds, things of that nature. Thanks to Stephen Rothwell, Saeed Mahameed, and others for their guidance in these resolutions. Signed-off-by: David S. Miller <[email protected]>
2018-12-18xen/netfront: tolerate frags with no dataJuergen Gross1-1/+1
At least old Xen net backends seem to send frags with no real data sometimes. In case such a fragment happens to occur with the frag limit already reached the frontend will BUG currently even if this situation is easily recoverable. Modify the BUG_ON() condition accordingly. Tested-by: Dietmar Hahn <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-11-09xen/netfront: remove unnecessary wmbJacob Wen1-2/+0
RING_PUSH_REQUESTS_AND_CHECK_NOTIFY is already able to make sure backend sees requests before req_prod is updated. Signed-off-by: Jacob Wen <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Reviewed-by: Wei Liu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-09-13xen/netfront: don't bug in case of too many fragsJuergen Gross1-1/+7
Commit 57f230ab04d291 ("xen/netfront: raise max number of slots in xennet_get_responses()") raised the max number of allowed slots by one. This seems to be problematic in some configurations with netback using a larger MAX_SKB_FRAGS value (e.g. old Linux kernel with MAX_SKB_FRAGS defined as 18 instead of nowadays 17). Instead of BUG_ON() in this case just fall back to retransmission. Fixes: 57f230ab04d291 ("xen/netfront: raise max number of slots in xennet_get_responses()") Cc: [email protected] Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-09-07xen/netfront: fix waiting for xenbus state changeJuergen Gross1-14/+10
Commit 822fb18a82aba ("xen-netfront: wait xenbus state change when load module manually") added a new wait queue to wait on for a state change when the module is loaded manually. Unfortunately there is no wakeup anywhere to stop that waiting. Instead of introducing a new wait queue rename the existing module_unload_q to module_wq and use it for both purposes (loading and unloading). As any state change of the backend might be intended to stop waiting do the wake_up_all() in any case when netback_changed() is called. Fixes: 822fb18a82aba ("xen-netfront: wait xenbus state change when load module manually") Cc: <[email protected]> #4.18 Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-08-14xen-netfront: fix warn message as irq device name has '/'Xiao Liang1-2/+4
There is a call trace generated after commit 2d408c0d4574b01b9ed45e02516888bf925e11a9( xen-netfront: fix queue name setting). There is no 'device/vif/xx-q0-tx' file found under /proc/irq/xx/. This patch only picks up device type and id as its name. With the patch, now /proc/interrupts looks like below and the warning message gone: 70: 21 0 0 0 xen-dyn -event vif0-q0-tx 71: 15 0 0 0 xen-dyn -event vif0-q0-rx 72: 14 0 0 0 xen-dyn -event vif0-q1-tx 73: 33 0 0 0 xen-dyn -event vif0-q1-rx 74: 12 0 0 0 xen-dyn -event vif0-q2-tx 75: 24 0 0 0 xen-dyn -event vif0-q2-rx 76: 19 0 0 0 xen-dyn -event vif0-q3-tx 77: 21 0 0 0 xen-dyn -event vif0-q3-rx Below is call trace information without this patch: name 'device/vif/0-q0-tx' WARNING: CPU: 2 PID: 37 at fs/proc/generic.c:174 __xlate_proc_name+0x85/0xa0 RIP: 0010:__xlate_proc_name+0x85/0xa0 RSP: 0018:ffffb85c40473c18 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000006 RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff984c7f516930 RBP: ffffb85c40473cb8 R08: 000000000000002c R09: 0000000000000229 R10: 0000000000000000 R11: 0000000000000001 R12: ffffb85c40473c98 R13: ffffb85c40473cb8 R14: ffffb85c40473c50 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff984c7f500000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f69b6899038 CR3: 000000001c20a006 CR4: 00000000001606e0 Call Trace: __proc_create+0x45/0x230 ? snprintf+0x49/0x60 proc_mkdir_data+0x35/0x90 register_handler_proc+0xef/0x110 ? proc_register+0xfc/0x110 ? proc_create_data+0x70/0xb0 __setup_irq+0x39b/0x660 ? request_threaded_irq+0xad/0x160 request_threaded_irq+0xf5/0x160 ? xennet_tx_buf_gc+0x1d0/0x1d0 [xen_netfront] bind_evtchn_to_irqhandler+0x3d/0x70 ? xenbus_alloc_evtchn+0x41/0xa0 netback_changed+0xa46/0xcda [xen_netfront] ? find_watch+0x40/0x40 xenwatch_thread+0xc5/0x160 ? finish_wait+0x80/0x80 kthread+0x112/0x130 ? kthread_create_worker_on_cpu+0x70/0x70 ret_from_fork+0x35/0x40 Code: 81 5c 00 48 85 c0 75 cc 5b 49 89 2e 31 c0 5d 4d 89 3c 24 41 5c 41 5d 41 5e 41 5f c3 4c 89 ee 48 c7 c7 40 4f 0e b4 e8 65 ea d8 ff <0f> 0b b8 fe ff ff ff 5b 5d 41 5c 41 5d 41 5e 41 5f c3 66 0f 1f ---[ end trace 650e5561b0caab3a ]--- Signed-off-by: Xiao Liang <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-08-11Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-4/+4
2018-08-11xen/netfront: don't cache skb_shinfo()Juergen Gross1-4/+4
skb_shinfo() can change when calling __pskb_pull_tail(): Don't cache its return value. Cc: [email protected] Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Wei Liu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-08-02Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-0/+6
The BTF conflicts were simple overlapping changes. The virtio_net conflict was an overlap of a fix of statistics counter, happening alongisde a move over to a bonafide statistics structure rather than counting value on the stack. Signed-off-by: David S. Miller <[email protected]>
2018-07-30xen-netfront: wait xenbus state change when load module manuallyXiao Liang1-0/+6
When loading module manually, after call xenbus_switch_state to initializes the state of the netfront device, the driver state did not change so fast that may lead no dev created in latest kernel. This patch adds wait to make sure xenbus knows the driver is not in closed/unknown state. Current state: [vm]# ethtool eth0 Settings for eth0: Link detected: yes [vm]# modprobe -r xen_netfront [vm]# modprobe xen_netfront [vm]# ethtool eth0 Settings for eth0: Cannot get device settings: No such device Cannot get wake-on-lan settings: No such device Cannot get message level: No such device Cannot get link status: No such device No data available With the patch installed. [vm]# ethtool eth0 Settings for eth0: Link detected: yes [vm]# modprobe -r xen_netfront [vm]# modprobe xen_netfront [vm]# ethtool eth0 Settings for eth0: Link detected: yes Signed-off-by: Xiao Liang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-22xen-netfront: fix queue name settingVitaly Kuznetsov1-1/+1
Commit f599c64fdf7d ("xen-netfront: Fix race between device setup and open") changed the initialization order: xennet_create_queues() now happens before we do register_netdev() so using netdev->name in xennet_init_queue() is incorrect, we end up with the following in /proc/interrupts: 60: 139 0 xen-dyn -event eth%d-q0-tx 61: 265 0 xen-dyn -event eth%d-q0-rx 62: 234 0 xen-dyn -event eth%d-q1-tx 63: 1 0 xen-dyn -event eth%d-q1-rx and this looks ugly. Actually, using early netdev name (even when it's already set) is also not ideal: nowadays we tend to rename eth devices and queue name may end up not corresponding to the netdev name. Use nodename from xenbus device for queue naming: this can't change in VM's lifetime. Now /proc/interrupts looks like 62: 202 0 xen-dyn -event device/vif/0-q0-tx 63: 317 0 xen-dyn -event device/vif/0-q0-rx 64: 262 0 xen-dyn -event device/vif/0-q1-tx 65: 17 0 xen-dyn -event device/vif/0-q1-rx Fixes: f599c64fdf7d ("xen-netfront: Fix race between device setup and open") Signed-off-by: Vitaly Kuznetsov <[email protected]> Reviewed-by: Ross Lagerwall <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-09net: allow ndo_select_queue to pass netdevAlexander Duyck1-1/+2
This patch makes it so that instead of passing a void pointer as the accel_priv we instead pass a net_device pointer as sb_dev. Making this change allows us to pass the subordinate device through to the fallback function eventually so that we can keep the actual code in the ndo_select_queue call as focused on possible on the exception cases. Signed-off-by: Alexander Duyck <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2018-06-22xen-netfront: Update features after registering netdevRoss Lagerwall1-4/+4
Update the features after calling register_netdev() otherwise the device features are not set up correctly and it not possible to change the MTU of the device. After this change, the features reported by ethtool match the device's features before the commit which introduced the issue and it is possible to change the device's MTU. Fixes: f599c64fdf7d ("xen-netfront: Fix race between device setup and open") Reported-by: Liam Shepherd <[email protected]> Signed-off-by: Ross Lagerwall <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-22xen-netfront: Fix mismatched rtnl_unlockRoss Lagerwall1-1/+2
Fixes: f599c64fdf7d ("xen-netfront: Fix race between device setup and open") Reported-by: Ben Hutchings <[email protected]> Signed-off-by: Ross Lagerwall <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-12xen/netfront: raise max number of slots in xennet_get_responses()Juergen Gross1-2/+2
The max number of slots used in xennet_get_responses() is set to MAX_SKB_FRAGS + (rx->status <= RX_COPY_THRESHOLD). In old kernel-xen MAX_SKB_FRAGS was 18, while nowadays it is 17. This difference is resulting in frequent messages "too many slots" and a reduced network throughput for some workloads (factor 10 below that of a kernel-xen based guest). Replacing MAX_SKB_FRAGS by XEN_NETIF_NR_SLOTS_MIN for calculation of the max number of slots to use solves that problem (tests showed no more messages "too many slots" and throughput was as high as with the kernel-xen based guest system). Replace MAX_SKB_FRAGS-2 by XEN_NETIF_NR_SLOTS_MIN-1 in netfront_tx_slot_available() for making it clearer what is really being tested without actually modifying the tested value. Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-05-14xen-netfront: fix xennet_start_xmit()'s return typeLuc Van Oostenryck1-1/+1
The method ndo_start_xmit() is defined as returning an 'netdev_tx_t', which is a typedef for an enum type, but the implementation in this driver returns an 'int'. Fix this by returning 'netdev_tx_t' in this driver too. Signed-off-by: Luc Van Oostenryck <[email protected]> Reviewed-by: Wei Liu <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2018-03-26drivers/net: Use octal not symbolic permissionsJoe Perches1-3/+3
Prefer the direct use of octal for permissions. Done with checkpatch -f --types=SYMBOLIC_PERMS --fix-inplace and some typing. Miscellanea: o Whitespace neatening around these conversions. Signed-off-by: Joe Perches <[email protected]> Reviewed-by: Wei Liu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-02-28xen-netfront: Fix hang on device removalJason Andryuk1-1/+6
A toolstack may delete the vif frontend and backend xenstore entries while xen-netfront is in the removal code path. In that case, the checks for xenbus_read_driver_state would return XenbusStateUnknown, and xennet_remove would hang indefinitely. This hang prevents system shutdown. xennet_remove must be able to handle XenbusStateUnknown, and netback_changed must also wake up the wake_queue for that state as well. Fixes: 5b5971df3bc2 ("xen-netfront: remove warning when unloading module") Signed-off-by: Jason Andryuk <[email protected]> Cc: Eduardo Otubo <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2018-02-06xen-netfront: Fix race between device setup and openRoss Lagerwall1-22/+24
When a netfront device is set up it registers a netdev fairly early on, before it has set up the queues and is actually usable. A userspace tool like NetworkManager will immediately try to open it and access its state as soon as it appears. The bug can be reproduced by hotplugging VIFs until the VM runs out of grant refs. It registers the netdev but fails to set up any queues (since there are no more grant refs). In the meantime, NetworkManager opens the device and the kernel crashes trying to access the queues (of which there are none). Fix this in two ways: * For initial setup, register the netdev much later, after the queues are setup. This avoids the race entirely. * During a suspend/resume cycle, the frontend reconnects to the backend and the queues are recreated. It is possible (though highly unlikely) to race with something opening the device and accessing the queues after they have been destroyed but before they have been recreated. Extend the region covered by the rtnl semaphore to protect against this race. There is a possibility that we fail to recreate the queues so check for this in the open function. Signed-off-by: Ross Lagerwall <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Signed-off-by: Juergen Gross <[email protected]>