aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2019-08-21cfg80211: util: add 6GHz channel to freq conversion and vice versaArend van Spriel1-1/+9
Extend the functions ieee80211_channel_to_frequency() and ieee80211_frequency_to_channel() to support 6GHz band according specification in 802.11ax D4.1 27.3.22.2. Reviewed-by: Pieter-Paul Giesberts <[email protected]> Reviewed-by: Leon Zegers <[email protected]> Signed-off-by: Arend van Spriel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Johannes Berg <[email protected]>
2019-08-21cfg80211: add 6GHz UNII band definitionsArend van Spriel1-2/+19
For the new 6GHz there are new UNII band definitions as listed in the FCC notice [1]. [1] https://docs.fcc.gov/public/attachments/FCC-18-147A1_Rcd.pdf Reviewed-by: Pieter-Paul Giesberts <[email protected]> Reviewed-by: Leon Zegers <[email protected]> Signed-off-by: Arend van Spriel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Johannes Berg <[email protected]>
2019-08-21nl80211: add 6GHz band definition to enum nl80211_bandArend van Spriel1-0/+1
In the 802.11ax specification a new band is introduced, which is also proposed by FCC for unlicensed use. This band is referred to as 6GHz spanning frequency range from 5925 to 7125 MHz. Reviewed-by: Pieter-Paul Giesberts <[email protected]> Reviewed-by: Leon Zegers <[email protected]> Signed-off-by: Arend van Spriel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Johannes Berg <[email protected]>
2019-08-21Revert "cfg80211: fix processing world regdomain when non modular"Hodaszi, Robert1-1/+1
This reverts commit 96cce12ff6e0 ("cfg80211: fix processing world regdomain when non modular"). Re-triggering a reg_process_hint with the last request on all events, can make the regulatory domain fail in case of multiple WiFi modules. On slower boards (espacially with mdev), enumeration of the WiFi modules can end up in an intersected regulatory domain, and user cannot set it with 'iw reg set' anymore. This is happening, because: - 1st module enumerates, queues up a regulatory request - request gets processed by __reg_process_hint_driver(): - checks if previous was set by CORE -> yes - checks if regulator domain changed -> yes, from '00' to e.g. 'US' -> sends request to the 'crda' - 2nd module enumerates, queues up a regulator request (which triggers the reg_todo() work) - reg_todo() -> reg_process_pending_hints() sees, that the last request is not processed yet, so it tries to process it again. __reg_process_hint driver() will run again, and: - checks if the last request's initiator was the core -> no, it was the driver (1st WiFi module) - checks, if the previous initiator was the driver -> yes - checks if the regulator domain changed -> yes, it was '00' (set by core, and crda call did not return yet), and should be changed to 'US' ------> __reg_process_hint_driver calls an intersect Besides, the reg_process_hint call with the last request is meaningless since the crda call has a timeout work. If that timeout expires, the first module's request will lost. Cc: [email protected] Fixes: 96cce12ff6e0 ("cfg80211: fix processing world regdomain when non modular") Signed-off-by: Robert Hodaszi <[email protected]> Link: https://lore.kernel.org/r/20190614131600.GA13897@a1-hr Signed-off-by: Johannes Berg <[email protected]>
2019-08-21mac80211: add missing length field increment when generating Radiotap headerJohn Crispin1-0/+1
The code generating the Tx Radiotap header when using tx_status_ext was missing a field increment after setting the VHT bandwidth. Fixes: 3d07ffcaf320 ("mac80211: add struct ieee80211_tx_status support to ieee80211_add_tx_radiotap_header") Signed-off-by: John Crispin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Johannes Berg <[email protected]>
2019-08-21mac80211: 80Mhz was not reported properly when using tx_status_extJohn Crispin1-1/+1
When reporting 80MHz, we need to set 4 and not 2 inside the corresponding field inside the Tx Radiotap header. Fixes: 3d07ffcaf320 ("mac80211: add struct ieee80211_tx_status support to ieee80211_add_tx_radiotap_header") Signed-off-by: John Crispin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Johannes Berg <[email protected]>
2019-08-21mac80211: fix bad guard when reporting legacy ratesJohn Crispin1-7/+7
When reporting legacy rates inside the TX Radiotap header we need to split the check between "uses tx_statua_ext" and "is legacy rate". Not doing so would make the code drop into the !tx_status_ext path. Fixes: 3d07ffcaf320 ("mac80211: add struct ieee80211_tx_status support to ieee80211_add_tx_radiotap_header") Signed-off-by: John Crispin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Johannes Berg <[email protected]>
2019-08-21mac80211: fix TX legacy rate reporting when tx_status_ext is usedJohn Crispin1-3/+9
The RX Radiotap header length was not calculated properly when reporting legacy rates using tx_status_ext. Fixes: 3d07ffcaf320 ("mac80211: add struct ieee80211_tx_status support to ieee80211_add_tx_radiotap_header") Signed-off-by: John Crispin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Johannes Berg <[email protected]>
2019-08-21cfg80211: Fix Extended Key ID key install checksAlexander Wetzel1-9/+14
Fix two shortcomings in the Extended Key ID API: 1) Allow the userspace to install pairwise keys using keyid 1 without NL80211_KEY_NO_TX set. This allows the userspace to install and activate pairwise keys with keyid 1 in the same way as for keyid 0, simplifying the API usage for e.g. FILS and FT key installs. 2) IEEE 802.11 - 2016 restricts Extended Key ID usage to CCMP/GCMP ciphers in IEEE 802.11 - 2016 "9.4.2.25.4 RSN capabilities". Enforce that when installing a key. Cc: [email protected] # 5.2 Fixes: 6cdd3979a2bd ("nl80211/cfg80211: Extended Key ID support") Signed-off-by: Alexander Wetzel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Johannes Berg <[email protected]>
2019-08-21mac80211: fix possible sta leakJohannes Berg1-4/+5
If TDLS station addition is rejected, the sta memory is leaked. Avoid this by moving the check before the allocation. Cc: [email protected] Fixes: 7ed5285396c2 ("mac80211: don't initiate TDLS connection if station is not associated to AP") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Johannes Berg <[email protected]>
2019-08-20xprtrdma: Ensure creating an MR does not trigger FS writebackChuck Lever2-4/+5
Probably would be good to also pass GFP flags to ib_alloc_mr. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2019-08-20xprtrdma: Move rpcrdma_mr_get out of frwr_mapChuck Lever4-43/+35
Refactor: Retrieve an MR and handle error recovery entirely in rpc_rdma.c, as this is not a device-specific function. Note that since commit 89f90fe1ad8b ("SUNRPC: Allow calls to xprt_transmit() to drain the entire transmit queue"), the xprt_transmit function handles the cond_resched. The transport no longer has to do this itself. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2019-08-20xprtrdma: Combine rpcrdma_mr_put and rpcrdma_mr_unmap_and_putChuck Lever3-28/+11
Clean up. There is only one remaining rpcrdma_mr_put call site, and it can be directly replaced with unmap_and_put because mr->mr_dir is set to DMA_NONE just before the call. Now all the call sites do a DMA unmap, and we can just rename mr_unmap_and_put to mr_put, which nicely matches mr_get. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2019-08-20xprtrdma: Simplify rpcrdma_mr_popChuck Lever4-21/+11
Clean up: rpcrdma_mr_pop call sites check if the list is empty first. Let's replace the list_empty with less costly logic. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2019-08-20net: fix __ip_mc_inc_group usageLi RongQing1-2/+2
in ip_mc_inc_group, memory allocation flag, not mcast mode, is expected by __ip_mc_inc_group similar issue in __ip_mc_join_group, both mcase mode and gfp_t are needed here, so use ____ip_mc_inc_group(...) Fixes: 9fb20801dab4 ("net: Fix ip_mc_{dec,inc}_group allocation context") Signed-off-by: Li RongQing <[email protected]> Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: Zhang Yu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-20net/ncsi: Ensure 32-bit boundary for data cksumTerry S. Duncan2-4/+7
The NCSI spec indicates that if the data does not end on a 32 bit boundary, one to three padding bytes equal to 0x00 shall be present to align the checksum field to a 32-bit boundary. Signed-off-by: Terry S. Duncan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-20net: dsa: enable and disable all portsVivien Didelot1-0/+11
Call the .port_enable and .port_disable functions for all ports, not only the user ports, so that drivers may optimize the power consumption of all ports after a successful setup. Unused ports are now disabled on setup. CPU and DSA ports are now enabled on setup and disabled on teardown. User ports were already enabled at slave creation and disabled at slave destruction. Signed-off-by: Vivien Didelot <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-20net: dsa: use a single switch statement for port setupVivien Didelot1-48/+39
It is currently difficult to read the different steps involved in the setup and teardown of ports in the DSA code. Keep it simple with a single switch statement for each port type: UNUSED, CPU, DSA, or USER. Also no need to call devlink_port_unregister from within dsa_port_setup as this step is inconditionally handled by dsa_port_teardown on error. Signed-off-by: Vivien Didelot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-20net/smc: make sure EPOLLOUT is raisedJason Baron1-4/+2
Currently, we are only explicitly setting SOCK_NOSPACE on a write timeout for non-blocking sockets. Epoll() edge-trigger mode relies on SOCK_NOSPACE being set when -EAGAIN is returned to ensure that EPOLLOUT is raised. Expand the setting of SOCK_NOSPACE to non-blocking sockets as well that can use SO_SNDTIMEO to adjust their write timeout. This mirrors the behavior that Eric Dumazet introduced for tcp sockets. Signed-off-by: Jason Baron <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Ursula Braun <[email protected]> Cc: Karsten Graul <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-20xprtrdma: Toggle XPRT_CONGESTED in xprtrdma's slot methodsChuck Lever1-1/+3
Commit 48be539dd44a ("xprtrdma: Introduce ->alloc_slot call-out for xprtrdma") added a separate alloc_slot and free_slot to the RPC/RDMA transport. Later, commit 75891f502f5f ("SUNRPC: Support for congestion control when queuing is enabled") modified the generic alloc/free_slot methods, but neglected the methods in xprtrdma. Found via code review. Fixes: 75891f502f5f ("SUNRPC: Support for congestion control ... ") Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2019-08-20xprtrdma: Rename rpcrdma_buffer::rb_allChuck Lever2-19/+9
Clean up: There are other "all" list heads. For code clarity distinguish this one as for use only for MRs by renaming it. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2019-08-20xprtrdma: Rename CQE field in Receive trace pointsChuck Lever1-1/+1
Make the field name the same for all trace points that handle pointers to struct rpcrdma_rep. That makes it easy to grep for matching rep points in trace output. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2019-08-20xprtrdma: Boost maximum transport header sizeChuck Lever2-14/+18
Although I haven't seen any performance results that justify it, I've received several complaints that NFS/RDMA no longer supports a maximum rsize and wsize of 1MB. These days it is somewhat smaller. To simplify the logic that determines whether a chunk list is necessary, the implementation uses a fixed maximum size of the transport header. Currently that maximum size is 256 bytes, one quarter of the default inline threshold size for RPC/RDMA v1. Since commit a78868497c2e ("xprtrdma: Reduce max_frwr_depth"), the size of chunks is also smaller to take advantage of inline page lists in device internal MR data structures. The combination of these two design choices has reduced the maximum NFS rsize and wsize that can be used for most RNIC/HCAs. Increasing the maximum transport header size and the maximum number of RDMA segments it can contain increases the negotiated maximum rsize/wsize on common RNIC/HCAs. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2019-08-20xprtrdma: Fix calculation of ri_max_segs againChuck Lever1-2/+2
Commit 302d3deb206 ("xprtrdma: Prevent inline overflow") added this calculation back in 2016, but got it wrong. I tested only the lower bound, which is why there is a max_t there. The upper bound should be rounded up too. Now, when using DIV_ROUND_UP, that takes care of the lower bound as well. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2019-08-20xprtrdma: Update obsolete commentChuck Lever1-3/+0
Comment was made obsolete by commit 8cec3dba76a4 ("xprtrdma: rpcrdma_regbuf alignment"). Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2019-08-20xprtrdma: Refresh the documenting comment in frwr_ops.cChuck Lever1-48/+18
Things have changed since this comment was written. In particular, the reworking of connection closing, on-demand creation of MRs, and the removal of fr_state all mean that deferring MR recovery to frwr_map is no longer needed. The description is obsolete. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2019-08-20xdp: unpin xdp umem pages in error pathIvan Khoronzhuk1-1/+3
Fix mem leak caused by missed unpin routine for umem pages. Fixes: 8aef7340ae9695 ("xsk: introduce xdp_umem_page") Signed-off-by: Ivan Khoronzhuk <[email protected]> Acked-by: Jonathan Lemon <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2019-08-20SUNRPC: Inline xdr_commit_encodeChuck Lever1-1/+1
Micro-optimization: For xdr_commit_encode call sites in net/sunrpc/xdr.c, eliminate the extra calling sequence. On my client, this change saves about a microsecond for every 30 calls to xdr_reserve_space(). Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2019-08-20SUNRPC: Remove rpc_wake_up_queued_task_on_wq()Chuck Lever1-23/+4
Clean up: commit c544577daddb ("SUNRPC: Clean up transport write space handling") appears to have removed the last caller of rpc_wake_up_queued_task_on_wq(). Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2019-08-20xfrm: policy: avoid warning splat when merging nodesFlorian Westphal1-2/+4
syzbot reported a splat: xfrm_policy_inexact_list_reinsert+0x625/0x6e0 net/xfrm/xfrm_policy.c:877 CPU: 1 PID: 6756 Comm: syz-executor.1 Not tainted 5.3.0-rc2+ #57 Call Trace: xfrm_policy_inexact_node_reinsert net/xfrm/xfrm_policy.c:922 [inline] xfrm_policy_inexact_node_merge net/xfrm/xfrm_policy.c:958 [inline] xfrm_policy_inexact_insert_node+0x537/0xb50 net/xfrm/xfrm_policy.c:1023 xfrm_policy_inexact_alloc_chain+0x62b/0xbd0 net/xfrm/xfrm_policy.c:1139 xfrm_policy_inexact_insert+0xe8/0x1540 net/xfrm/xfrm_policy.c:1182 xfrm_policy_insert+0xdf/0xce0 net/xfrm/xfrm_policy.c:1574 xfrm_add_policy+0x4cf/0x9b0 net/xfrm/xfrm_user.c:1670 xfrm_user_rcv_msg+0x46b/0x720 net/xfrm/xfrm_user.c:2676 netlink_rcv_skb+0x1f0/0x460 net/netlink/af_netlink.c:2477 xfrm_netlink_rcv+0x74/0x90 net/xfrm/xfrm_user.c:2684 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline] netlink_unicast+0x809/0x9a0 net/netlink/af_netlink.c:1328 netlink_sendmsg+0xa70/0xd30 net/netlink/af_netlink.c:1917 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg net/socket.c:657 [inline] There is no reproducer, however, the warning can be reproduced by adding rules with ever smaller prefixes. The sanity check ("does the policy match the node") uses the prefix value of the node before its updated to the smaller value. To fix this, update the prefix earlier. The bug has no impact on tree correctness, this is only to prevent a false warning. Reported-by: [email protected] Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Steffen Klassert <[email protected]>
2019-08-19sctp: remove net sctp.x_enable working as a global switchXin Long1-16/+12
The netns sctp feature flags shouldn't work as a global switch, which is mostly like a firewall/netfilter's job. Also, it will break asoc as it discard or accept chunks incorrectly when net sctp.x_enable is changed after the asoc is created. Since each type of chunk's processing function will check the corresp asoc's feature flag, this 'global switch' should be removed, and net sctp.x_enable will only work as the default feature flags for the future sctp sockets/endpoints. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-19sctp: add SCTP_AUTH_SUPPORTED sockoptXin Long1-0/+86
SCTP_AUTH_SUPPORTED sockopt is used to set enpoint's auth flag. With this feature, each endpoint will have its own flag for its future asoc's auth_capable, instead of netns auth flag. Note that when both ep's auth_enable is enabled, endpoint auth related data should be initialized. If asconf_enable is also set, SCTP_CID_ASCONF/SCTP_CID_ASCONF_ACK should be added into auth_chunk_list. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-19sctp: add sctp_auth_init and sctp_auth_freeXin Long2-56/+74
This patch is to factor out sctp_auth_init and sctp_auth_free functions, and sctp_auth_init will also be used in the next patch for SCTP_AUTH_SUPPORTED sockopt. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-19sctp: use ep and asoc auth_enable properlyXin Long2-33/+44
sctp has per endpoint auth flag and per asoc auth flag, and the asoc one should be checked when coming to asoc and the endpoint one should be checked when coming to endpoint. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-19sctp: add SCTP_ASCONF_SUPPORTED sockoptXin Long1-0/+82
SCTP_ASCONF_SUPPORTED sockopt is used to set enpoint's asconf flag. With this feature, each endpoint will have its own flag for its future asoc's asconf_capable, instead of netns asconf flag. Note that when both ep's asconf_enable and auth_enable are enabled, SCTP_CID_ASCONF and SCTP_CID_ASCONF_ACK should be added into auth_chunk_list. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-19sctp: check asoc peer.asconf_capable before processing asconfXin Long1-2/+4
asconf chunks should be dropped when the asoc doesn't support asconf feature. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-19sctp: not set peer.asconf_capable in sctp_association_initXin Long1-9/+0
asoc->peer.asconf_capable is to be set during handshake, and its value should be initialized to 0. net->sctp.addip_noauth will be checked in sctp_process_init when processing INIT_ACK on client and COOKIE_ECHO on server. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-19sctp: add asconf_enable in struct sctp_endpointXin Long3-20/+18
This patch is to make addip/asconf flag per endpoint, and its value is initialized by the per netns flag, net->sctp.addip_enable. It also replaces the checks of net->sctp.addip_enable with ep->asconf_enable in some places. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-19net: remove empty inet_exit_netLi RongQing1-5/+0
Pointer members of an object with static storage duration, if not explicitly initialized, will be initialized to a NULL pointer. The net namespace API checks if this pointer is not NULL before using it, it are safe to remove the function. Signed-off-by: Li RongQing <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller3-15/+35
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Remove IP MASQUERADING record in MAINTAINERS file, from Denis Efremov. 2) Counter arguments are swapped in ebtables, from Todd Seidelmann. 3) Missing netlink attribute validation in flow_offload extension. 4) Incorrect alignment in xt_nfacct that breaks 32-bits userspace / 64-bits kernels, from Juliana Rodrigueiro. 5) Missing include guard in nf_conntrack_h323_types.h, from Masahiro Yamada. ==================== Signed-off-by: David S. Miller <[email protected]>
2019-08-19tcp: make sure EPOLLOUT wont be missedEric Dumazet1-7/+9
As Jason Baron explained in commit 790ba4566c1a ("tcp: set SOCK_NOSPACE under memory pressure"), it is crucial we properly set SOCK_NOSPACE when needed. However, Jason patch had a bug, because the 'nonblocking' status as far as sk_stream_wait_memory() is concerned is governed by MSG_DONTWAIT flag passed at sendmsg() time : long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT); So it is very possible that tcp sendmsg() calls sk_stream_wait_memory(), and that sk_stream_wait_memory() returns -EAGAIN with SOCK_NOSPACE cleared, if sk->sk_sndtimeo has been set to a small (but not zero) value. This patch removes the 'noblock' variable since we must always set SOCK_NOSPACE if -EAGAIN is returned. It also renames the do_nonblock label since we might reach this code path even if we were in blocking mode. Fixes: 790ba4566c1a ("tcp: set SOCK_NOSPACE under memory pressure") Signed-off-by: Eric Dumazet <[email protected]> Cc: Jason Baron <[email protected]> Reported-by: Vladimir Rutsky <[email protected]> Acked-by: Soheil Hassas Yeganeh <[email protected]> Acked-by: Neal Cardwell <[email protected]> Acked-by: Jason Baron <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-19net: flow_offload: convert block_ing_cb_list to regular list typeVlad Buslov1-6/+7
RCU list block_ing_cb_list is protected by rcu read lock in flow_block_ing_cmd() and with flow_indr_block_ing_cb_lock mutex in all functions that use it. However, flow_block_ing_cmd() needs to call blocking functions while iterating block_ing_cb_list which leads to following suspicious RCU usage warning: [ 401.510948] ============================= [ 401.510952] WARNING: suspicious RCU usage [ 401.510993] 5.3.0-rc3+ #589 Not tainted [ 401.510996] ----------------------------- [ 401.511001] include/linux/rcupdate.h:265 Illegal context switch in RCU read-side critical section! [ 401.511004] other info that might help us debug this: [ 401.511008] rcu_scheduler_active = 2, debug_locks = 1 [ 401.511012] 7 locks held by test-ecmp-add-v/7576: [ 401.511015] #0: 00000000081d71a5 (sb_writers#4){.+.+}, at: vfs_write+0x166/0x1d0 [ 401.511037] #1: 000000002bd338c3 (&of->mutex){+.+.}, at: kernfs_fop_write+0xef/0x1b0 [ 401.511051] #2: 00000000c921c634 (kn->count#317){.+.+}, at: kernfs_fop_write+0xf7/0x1b0 [ 401.511062] #3: 00000000a19cdd56 (&dev->mutex){....}, at: sriov_numvfs_store+0x6b/0x130 [ 401.511079] #4: 000000005425fa52 (pernet_ops_rwsem){++++}, at: unregister_netdevice_notifier+0x30/0x140 [ 401.511092] #5: 00000000c5822793 (rtnl_mutex){+.+.}, at: unregister_netdevice_notifier+0x35/0x140 [ 401.511101] #6: 00000000c2f3507e (rcu_read_lock){....}, at: flow_block_ing_cmd+0x5/0x130 [ 401.511115] stack backtrace: [ 401.511121] CPU: 21 PID: 7576 Comm: test-ecmp-add-v Not tainted 5.3.0-rc3+ #589 [ 401.511124] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017 [ 401.511127] Call Trace: [ 401.511138] dump_stack+0x85/0xc0 [ 401.511146] ___might_sleep+0x100/0x180 [ 401.511154] __mutex_lock+0x5b/0x960 [ 401.511162] ? find_held_lock+0x2b/0x80 [ 401.511173] ? __tcf_get_next_chain+0x1d/0xb0 [ 401.511179] ? mark_held_locks+0x49/0x70 [ 401.511194] ? __tcf_get_next_chain+0x1d/0xb0 [ 401.511198] __tcf_get_next_chain+0x1d/0xb0 [ 401.511251] ? uplink_rep_async_event+0x70/0x70 [mlx5_core] [ 401.511261] tcf_block_playback_offloads+0x39/0x160 [ 401.511276] tcf_block_setup+0x1b0/0x240 [ 401.511312] ? mlx5e_rep_indr_setup_tc_cb+0xca/0x290 [mlx5_core] [ 401.511347] ? mlx5e_rep_indr_tc_block_unbind+0x50/0x50 [mlx5_core] [ 401.511359] tc_indr_block_get_and_ing_cmd+0x11b/0x1e0 [ 401.511404] ? mlx5e_rep_indr_tc_block_unbind+0x50/0x50 [mlx5_core] [ 401.511414] flow_block_ing_cmd+0x7e/0x130 [ 401.511453] ? mlx5e_rep_indr_tc_block_unbind+0x50/0x50 [mlx5_core] [ 401.511462] __flow_indr_block_cb_unregister+0x7f/0xf0 [ 401.511502] mlx5e_nic_rep_netdevice_event+0x75/0xb0 [mlx5_core] [ 401.511513] unregister_netdevice_notifier+0xe9/0x140 [ 401.511554] mlx5e_cleanup_rep_tx+0x6f/0xe0 [mlx5_core] [ 401.511597] mlx5e_detach_netdev+0x4b/0x60 [mlx5_core] [ 401.511637] mlx5e_vport_rep_unload+0x71/0xc0 [mlx5_core] [ 401.511679] esw_offloads_disable+0x5b/0x90 [mlx5_core] [ 401.511724] mlx5_eswitch_disable.cold+0xdf/0x176 [mlx5_core] [ 401.511759] mlx5_device_disable_sriov+0xab/0xb0 [mlx5_core] [ 401.511794] mlx5_core_sriov_configure+0xaf/0xd0 [mlx5_core] [ 401.511805] sriov_numvfs_store+0xf8/0x130 [ 401.511817] kernfs_fop_write+0x122/0x1b0 [ 401.511826] vfs_write+0xdb/0x1d0 [ 401.511835] ksys_write+0x65/0xe0 [ 401.511847] do_syscall_64+0x5c/0xb0 [ 401.511857] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 401.511862] RIP: 0033:0x7fad892d30f8 [ 401.511868] Code: 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 25 96 0d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 60 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89 [ 401.511871] RSP: 002b:00007ffca2a9fad8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 401.511875] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fad892d30f8 [ 401.511878] RDX: 0000000000000002 RSI: 000055afeb072a90 RDI: 0000000000000001 [ 401.511881] RBP: 000055afeb072a90 R08: 00000000ffffffff R09: 000000000000000a [ 401.511884] R10: 000055afeb058710 R11: 0000000000000246 R12: 0000000000000002 [ 401.511887] R13: 00007fad893a8780 R14: 0000000000000002 R15: 00007fad893a3740 To fix the described incorrect RCU usage, convert block_ing_cb_list from RCU list to regular list and protect it with flow_indr_block_ing_cb_lock mutex in flow_block_ing_cmd(). Fixes: 1150ab0f1b33 ("flow_offload: support get multi-subsystem block") Signed-off-by: Vlad Buslov <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-08-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller39-196/+429
Merge conflict of mlx5 resolved using instructions in merge commit 9566e650bf7fdf58384bb06df634f7531ca3a97e. Signed-off-by: David S. Miller <[email protected]>
2019-08-19sunrpc: add a new cache_detail operation for when a cache is flushedJeff Layton1-0/+3
When the exports table is changed, exportfs will usually write a new time to the "flush" file in the nfsd.export cache procfile. This tells the kernel to flush any entries that are older than that value. This gives us a mechanism to tell whether an unexport might have occurred. Add a new ->flush cache_detail operation that is called after flushing the cache whenever someone writes to a "flush" file. Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2019-08-19svcrdma: Use llist for managing cache of recv_ctxtsChuck Lever2-16/+11
Use a wait-free mechanism for managing the svc_rdma_recv_ctxts free list. Subsequently, sc_recv_lock can be eliminated. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2019-08-19svcrdma: Remove svc_rdma_wqChuck Lever2-8/+2
Clean up: the system workqueue will work just as well. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2019-08-19netfilter: xt_nfacct: Fix alignment mismatch in xt_nfacct_match_infoJuliana Rodrigueiro1-11/+25
When running a 64-bit kernel with a 32-bit iptables binary, the size of the xt_nfacct_match_info struct diverges. kernel: sizeof(struct xt_nfacct_match_info) : 40 iptables: sizeof(struct xt_nfacct_match_info)) : 36 Trying to append nfacct related rules results in an unhelpful message. Although it is suggested to look for more information in dmesg, nothing can be found there. # iptables -A <chain> -m nfacct --nfacct-name <acct-object> iptables: Invalid argument. Run `dmesg' for more information. This patch fixes the memory misalignment by enforcing 8-byte alignment within the struct's first revision. This solution is often used in many other uapi netfilter headers. Signed-off-by: Juliana Rodrigueiro <[email protected]> Acked-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2019-08-19netfilter: nft_flow_offload: missing netlink attribute policyPablo Neira Ayuso1-0/+6
The netlink attribute policy for NFTA_FLOW_TABLE_NAME is missing. Fixes: a3c90f7a2323 ("netfilter: nf_tables: flow offload expression") Signed-off-by: Pablo Neira Ayuso <[email protected]>
2019-08-19netfilter: ebtables: Fix argument order to ADD_COUNTERTodd Seidelmann1-4/+4
The ordering of arguments to the x_tables ADD_COUNTER macro appears to be wrong in ebtables (cf. ip_tables.c, ip6_tables.c, and arp_tables.c). This causes data corruption in the ebtables userspace tools because they get incorrect packet & byte counts from the kernel. Fixes: d72133e628803 ("netfilter: ebtables: use ADD_COUNTER macro") Signed-off-by: Todd Seidelmann <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2019-08-18netfilter: nf_tables: map basechain priority to hardware priorityPablo Neira Ayuso2-3/+18
This patch adds initial support for offloading basechains using the priority range from 1 to 65535. This is restricting the netfilter priority range to 16-bit integer since this is what most drivers assume so far from tc. It should be possible to extend this range of supported priorities later on once drivers are updated to support for 32-bit integer priorities. Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: David S. Miller <[email protected]>