Age | Commit message | Author | Files | Lines
2017-04-30 | nfp: replace -ENOTSUPP with -EOPNOTSUPP | Jakub Kicinski | 4 | -21/+21
As Or points out in commit 423b3aecf290 ("net/mlx4: Change ENOTSUPP to EOPNOTSUPP"), ENOTSUPP is an NFS-specific error. Replace it with EOPNOTSUPP. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-30 | virtio-net: use netif_tx_napi_add for tx napi | Willem de Bruijn | 1 | -2/+2
Avoid hashing the tx napi struct into napi_hash[], which is used for busy polling receive queues. Signed-off-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-30 | net: Initialise init_net.count to 1 | David Howells | 1 | -1/+2
Initialise init_net.count to 1 for its pointer from init_nsproxy lest someone tries to do a get_net() and a put_net() in a process in which current->ns_proxy->net_ns points to the initial network namespace. Signed-off-by: David Howells <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-30 | geneve: fix incorrect setting of UDP checksum flag | Girish Moodalbail | 1 | -1/+1
Creating a geneve link with 'udpcsum' set results in the creation of a link for which the UDP checksum will NOT be computed on outbound packets, as can be seen below.
    11: gen0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
        link/ether c2:85:27:b6:b4:15 brd ff:ff:ff:ff:ff:ff promiscuity 0
        geneve id 200 remote 192.168.13.1 dstport 6081 noudpcsum
Similarly, creating a link with 'noudpcsum' set results in the creation of a link for which the UDP checksum will be computed on outbound packets. Fixes: 9b4437a5b870 ("geneve: Unify LWT and netdev handling.") Signed-off-by: Girish Moodalbail <[email protected]> Acked-by: Pravin B Shelar <[email protected]> Acked-by: Lance Richardson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-30 | Merge branch 'vxlan-disabled-ipv6' | David S. Miller | 1 | -5/+7
Jiri Benc says: ==================== vxlan: do not error out on disabled IPv6 This patchset fixes a bug with metadata based tunnels when booted with ipv6.disable=1. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-04-30 | vxlan: do not output confusing error message | Jiri Benc | 1 | -2/+0
The message "Cannot bind port X, err=Y" creates only confusion. In metadata based mode, failure of IPv6 socket creation is okay if IPv6 is disabled and no error message should be printed. But when IPv6 tunnel was requested, such failure is fatal. The vxlan_socket_create does not know when the error is harmless and when it's not. Instead of passing such information down to vxlan_socket_create, remove the message completely. It's not useful. We propagate the error code up to the user space and the port number comes from the user space. There's nothing in the message that the process creating vxlan interface does not know. Signed-off-by: Jiri Benc <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-30 | vxlan: correctly handle ipv6.disable module parameter | Jiri Benc | 1 | -3/+7
When IPv6 is compiled but disabled at runtime, __vxlan_sock_add returns -EAFNOSUPPORT. For metadata based tunnels, this causes failure of the whole operation of bringing up the tunnel. Ignore failure of IPv6 socket creation for metadata based tunnels caused by IPv6 not being available. Fixes: b1be00a6c39f ("vxlan: support both IPv4 and IPv6 sockets in a single vxlan device") Signed-off-by: Jiri Benc <[email protected]> Signed-off-by: David S. Miller <[email protected]>
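The error handling described above can be sketched in user-space C. This is only an illustration under stated assumptions: `sock_add_sketch()` and `tunnel_open_sketch()` are hypothetical stand-ins, not the driver's functions, and the real patch may structure the check differently.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-family socket setup; returns 0 or -errno. */
static int sock_add_sketch(bool ipv6, bool ipv6_available)
{
	if (ipv6 && !ipv6_available)
		return -EAFNOSUPPORT;
	return 0;
}

/*
 * Metadata-based tunnels ask for both an IPv4 and an IPv6 socket. An IPv6
 * failure caused by IPv6 being unavailable is tolerated in that mode; an
 * explicitly requested IPv6 tunnel still fails with the error.
 */
int tunnel_open_sketch(bool metadata, bool want_ipv6, bool ipv6_available)
{
	int ret = 0;

	if (want_ipv6 || metadata) {
		ret = sock_add_sketch(true, ipv6_available);
		if (ret < 0 && !(metadata && ret == -EAFNOSUPPORT))
			return ret;
		ret = 0;	/* harmless failure in metadata mode */
	}
	if (!want_ipv6 || metadata)
		ret = sock_add_sketch(false, ipv6_available);
	return ret;
}
```

With IPv6 unavailable, `tunnel_open_sketch(true, false, false)` succeeds, while `tunnel_open_sketch(false, true, false)` returns `-EAFNOSUPPORT`.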
2017-04-30 | bnx2x: Get rid of useless temporary variable | Andy Shevchenko | 1 | -9/+5
Replace pattern int status; ... status = func(...); return status; by return func(...); No functional change intended. Signed-off-by: Andy Shevchenko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
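A minimal before/after sketch of the pattern, with `read_status()` as a hypothetical helper rather than a real bnx2x function:

```c
/* Hypothetical stand-in for a driver helper; not a real bnx2x function. */
static int read_status(int reg)
{
	return reg < 0 ? -1 : 0;
}

/* Before: the result is parked in a temporary that is only returned. */
int query_before(int reg)
{
	int status;

	status = read_status(reg);
	return status;
}

/* After: return the callee's result directly. */
int query_after(int reg)
{
	return read_status(reg);
}
```

Both functions return the same value for every input, which is why the change is purely cosmetic.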
2017-04-30 | bnx2x: Reuse bnx2x_null_format_ver() | Andy Shevchenko | 1 | -11/+9
Reuse bnx2x_null_format_ver() in functions where it's appropriate instead of open-coded variants. Signed-off-by: Andy Shevchenko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-30 | bnx2x: Replace custom scnprintf() | Andy Shevchenko | 1 | -70/+9
Use scnprintf() when printing version instead of custom open coded variants. Signed-off-by: Andy Shevchenko <[email protected]> Acked-by: Yuval Mintz <[email protected]> Signed-off-by: David S. Miller <[email protected]>
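For readers unfamiliar with scnprintf(), the following user-space approximation shows the property that makes it convenient here: it returns the number of characters actually stored, not the would-be length. `scnprintf_sketch()` and `format_ver()` are illustrative names, and the version format string is an assumption, not taken from the driver.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * User-space approximation of the kernel's scnprintf(): returns the number
 * of characters actually stored in buf (excluding the trailing NUL), never
 * the would-be length that snprintf() reports on truncation.
 */
int scnprintf_sketch(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int len;

	if (size == 0)
		return 0;

	va_start(args, fmt);
	len = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (len < 0)
		return 0;
	if ((size_t)len >= size)
		return (int)(size - 1);
	return len;
}

/* Hypothetical formatter: prints a three-part version into buf. */
int format_ver(char *buf, size_t size, unsigned int major,
	       unsigned int minor, unsigned int rev)
{
	return scnprintf_sketch(buf, size, "%u.%u.%u", major, minor, rev);
}
```

Because the return value never exceeds `size - 1`, a caller can advance a write pointer by it without a separate truncation check.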
2017-04-30 | Merge tag 'linux-can-next-for-4.12-20170427' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next | David S. Miller | 1 | -8/+13
Marc Kleine-Budde says: ==================== pull-request: can-next 2017-04-25 this is a pull request of 1 patch for net-next/master. This patch by Oliver Hartkopp fixes the build of the broadcast manager with CONFIG_PROC_FS disabled. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-04-30 | bpf: Fix inaccurate helper function description | Chenbo Feng | 1 | -2/+1
The description inside uapi/linux/bpf.h about bpf_get_socket_uid helper function is no longer valid. It returns overflowuid rather than 0 when failed. Signed-off-by: Chenbo Feng <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-30 | tcp: fix access to sk->sk_state in tcp_poll() | Davide Caratti | 1 | -1/+1
Avoid direct access to sk->sk_state when tcp_poll() is called on a socket using active TCP fastopen with deferred connect. Use local variable 'state', which stores the result of sk_state_load(), like it was done in commit 00fd38d938db ("tcp: ensure proper barriers in lockless contexts"). Fixes: 19f6d3f3c842 ("net/tcp-fastopen: Add new API support") Signed-off-by: Davide Caratti <[email protected]> Acked-by: Wei Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-30 | bpf: restore skb->sk before pskb_trim() call | Eric Dumazet | 1 | -1/+1
While testing a fix [1] in ___pskb_trim(), addressing the WARN_ON_ONCE() in skb_try_coalesce() reported by Andrey, I found that we had an skb with skb->sk set but no skb->destructor. This invalidated heuristic found in commit 158f323b9868 ("net: adjust skb->truesize in pskb_expand_head()") and in cited patch. Considering the BUG_ON(skb->sk) we have in skb_orphan(), we should restrain the temporary setting to a minimal section. [1] https://patchwork.ozlabs.org/patch/755570/ net: adjust skb->truesize in ___pskb_trim() Fixes: 8f917bba0042 ("bpf: pass sk to helper functions") Signed-off-by: Eric Dumazet <[email protected]> Cc: Willem de Bruijn <[email protected]> Cc: Andrey Konovalov <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Acked-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-30 | net: macb: fix phy interrupt parsing | Alexandre Belloni | 1 | -8/+10
Since 83a77e9ec415, the phydev irq is explicitly set to PHY_POLL when there is no pdata. It doesn't work on DT enabled platforms because the phydev irq is already set by libphy before. Fixes: 83a77e9ec415 ("net: macb: Added PCI wrapper for Platform Driver.") Signed-off-by: Alexandre Belloni <[email protected]> Acked-by: Nicolas Ferre <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-30 | Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue | David S. Miller | 12 | -334/+372
Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-04-30 This series contains updates to i40e and i40evf only. Jake provides the majority of the changes in this series, starting with the renaming of a flag to avoid confusion. Then renamed a variable to a more meaningful name to clarify what is actually being done and to reduce confusion. Amortizes the wait time when initializing or disabling lots of VFs by using i40e_reset_all_vfs() and i40e_vsi_stop_rings_no_wait(). Cleaned up an unnecessary delay since pci_disable_sriov() already has its own delay, so there is no need to add an additional delay when removing VFs. Avoid using the same name flags for both vsi->state and pf->state, to make code review easier and assist future work to use the correct state field when checking bits. Use DECLARE_BITMAP() to ensure that we always allocate enough space for flags. Replace hw_disabled_flags with the new _AUTO_DISABLED flags, which are more readable because we are not setting an *_ENABLED flag to disable the feature. Alex corrects an oversight where we were not reprogramming the ports after a reset, which was causing us to lose all of the receive tunnel offloads. Arnd Bergmann moves the declaration of a local variable to avoid a warning seen on architectures with larger pages about an unused variable. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-04-30 | Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue | David S. Miller | 6 | -91/+144
Jeff Kirsher says: ==================== 1GbE Intel Wired LAN Driver Updates 2017-04-30 This series contains updates to e1000e only. Jarod Wilson fixes an issue where the workaround for 82574 & 82583 is needed for i218 as well, so set the appropriate flags. Sasha adds support for the upcoming new i219 devices for the client platform (CannonLake), which includes the support for 38.4MHz frequency to support PTP on CannonLake. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-04-30 | Bluetooth: Add selftest for ECDH key generation | Marcel Holtmann | 2 | -3/+38
Since the ECDH key generation takes a different path, it needs to be tested as well. For this generate the public debug key from the private debug key and compare both. This also moves the seeding of the private key into the SMP calling code to allow for easier re-use of the ECDH key generation helper. Signed-off-by: Marcel Holtmann <[email protected]> Signed-off-by: Johan Hedberg <[email protected]>
2017-04-30 | Bluetooth: zero kpp input for key generation | Marcel Holtmann | 1 | -0/+1
When generating new ECDH keys with kpp, the shared secret input needs to be set to NULL. Fix this by including kpp_request_set_input call. Fixes: 58771c1c ("Bluetooth: convert smp and selftest to crypto kpp API") Signed-off-by: Marcel Holtmann <[email protected]> Signed-off-by: Johan Hedberg <[email protected]>
2017-04-30 | net/mlx5: E-Switch, Avoid redundant memory allocation | Eli Cohen | 2 | -19/+10
struct esw_mc_addr is a small struct that can be part of struct mlx5_eswitch. Define it as a field and not as a pointer and save the kzalloc call and then error flow handling. Signed-off-by: Eli Cohen <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2017-04-30 | net/mlx5e: Disable HW LRO when PCI is slower than link on striding RQ | Eran Ben Elisha | 1 | -7/+14
We will activate the HW LRO only on servers with PCI BW > MAX LINK BW, or when PCI BW > 16Gbps. In other cases we do not want LRO by default, as LRO sessions might time out and add redundant software overhead. Tested: ethtool -k <ifs-name> | grep large-receive-offload on systems with and without the limitations. Signed-off-by: Eran Ben Elisha <[email protected]> Cc: [email protected] Signed-off-by: Saeed Mahameed <[email protected]>
2017-04-30 | net/mlx5e: Use u8 as ownership type in mlx5e_get_cqe() | Tariq Toukan | 1 | -2/+2
CQE ownership indication is as small as a single bit. Use u8 to speed up the comparison. Signed-off-by: Tariq Toukan <[email protected]> Cc: [email protected] Signed-off-by: Saeed Mahameed <[email protected]>
2017-04-30 | net/mlx5e: Use prefetchw when a write is to follow | Tariq Toukan | 1 | -1/+1
"prefetchw()" prefetches the cacheline for write. Use it for skb->data, as soon we'll be copying the packet header there. Performance: Single-stream packet-rate tested with pktgen. Packets are dropped in tc level to zoom into driver data-path. Larger gain is expected for smaller packets, as less time is spent on handling SKB fragments, making the path shorter and the improvement more significant.
---------------------------------------------
packet size | before    | after     | gain |
64B         | 4,113,306 | 4,778,720 | 16%  |
1024B       | 3,633,819 | 3,950,593 | 8.7% |
Signed-off-by: Tariq Toukan <[email protected]> Cc: [email protected] Signed-off-by: Saeed Mahameed <[email protected]>
2017-04-30 | net/mlx5e: Optimize poll ICOSQ completion queue | Tariq Toukan | 1 | -29/+33
UMR operations are more frequent and important. Check them first, and add a compiler branch predictor hint. According to current design, ICOSQ CQ can contain at most one pending CQE per napi. Poll function is optimized accordingly. Performance: Single-stream packet-rate tested with pktgen. Packets are dropped in tc level to zoom into driver data-path. Larger gain is expected for larger packet sizes, as BW is higher and UMR posts are more frequent.
---------------------------------------------
packet size | before    | after     | gain |
64B         | 4,092,370 | 4,113,306 | 0.5% |
1024B       | 3,421,435 | 3,633,819 | 6.2% |
Signed-off-by: Tariq Toukan <[email protected]> Cc: [email protected] Signed-off-by: Saeed Mahameed <[email protected]>
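The "check the common case first and hint the compiler" idea can be sketched as follows. The opcode enum and handler are hypothetical; only `__builtin_expect` (GCC/Clang), on which the kernel's likely()/unlikely() macros are built, is a real facility.

```c
/* User-space equivalents of the kernel's likely()/unlikely() (GCC/Clang). */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical completion opcodes, for illustration only. */
enum opcode_sketch { OP_UMR, OP_NOP, OP_OTHER };

/* Returns 1 for UMR, 0 for NOP, -1 otherwise; the common case is tested first. */
int handle_cqe_sketch(enum opcode_sketch op)
{
	if (likely(op == OP_UMR))
		return 1;
	if (op == OP_NOP)
		return 0;
	return -1;
}
```

The hint does not change the result, only the code layout the compiler is encouraged to produce for the expected path.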
2017-04-30 | net/mlx5e: Act on delay probe time updates | Hadar Hen Zion | 1 | -0/+39
The user can change the delay_first_probe_time parameter through sysctl. Listen to NETEVENT_DELAY_PROBE_TIME_UPDATE notifications and update the intervals for the periodic task that updates the neighbours' 'used' value and for the periodic task that queries flow HW counters. Both intervals will be updated only if the new delay probe time value is lower than the current interval. Since the driver saves only one min interval value and not one per device, users will be able to set a lower interval for the neighbour 'used' value periodic task, but they won't be able to schedule a higher interval for it. The interval used for scheduling the neighbour 'used' value periodic task is the minimal delay probe time parameter ever seen by the driver. Signed-off-by: Hadar Hen Zion <[email protected]> Reviewed-by: Or Gerlitz <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
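The "interval only ever shrinks" behaviour can be sketched like this. The struct and function names are hypothetical; the driver's actual state and notifier plumbing are not shown.

```c
/* Hypothetical driver-wide state: the smallest probe delay seen so far. */
struct neigh_update_sketch {
	unsigned long min_interval;
};

/*
 * Called on a delay-probe-time change. The stored interval only ever
 * shrinks, because a single minimum is kept rather than one per device.
 */
void on_delay_probe_time_update(struct neigh_update_sketch *s,
				unsigned long new_interval)
{
	if (new_interval < s->min_interval)
		s->min_interval = new_interval;
}
```

Starting from 5000, an update to 3000 lowers the interval, and a later update to 8000 leaves it at 3000.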
2017-04-30 | net/mlx5e: Update neighbour 'used' state using HW flow rules counters | Hadar Hen Zion | 7 | -2/+152
When IP tunnel encapsulation rules are offloaded, the kernel can't see the traffic of the offloaded flow. The neighbour for the IP tunnel destination of the offloaded flow can mistakenly become STALE and deleted by the kernel since its 'used' value wasn't changed. To make sure that a neighbour which is used by the HW won't become STALE, we proactively update the neighbour 'used' value every DELAY_PROBE_TIME period, when packets were matched and counted by the HW for one of the tunnel encap flows related to this neighbour. The periodic task that updates the used neighbours is scheduled when a tunnel encap rule is successfully offloaded into HW and keeps re-scheduling itself as long as the representor's neighbours list isn't empty. Add, remove, lookup and status change operations done over the representor's neighbours list or the neighbour hash entry encaps list are all serialized by RTNL lock. Signed-off-by: Hadar Hen Zion <[email protected]> Reviewed-by: Or Gerlitz <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2017-04-30 | net/mlx5e: Add support to neighbour update flow | Hadar Hen Zion | 5 | -43/+434
In order to offload TC encap rules, the driver does a lookup for the IP tunnel neighbour according to the output device and the destination IP given by the user. To keep tracking after the validity state of such neighbours, we keep the neighbours information (pair of device pointer and destination IP) in a hash table maintained at the relevant egress representor and register to get NETEVENT_NEIGH_UPDATE events. When getting neighbour update netevent, we search for a match among the cached neighbours entries used for encapsulation. In case the neighbour isn't valid, we can't offload the flow into the HW. We cache the flow (requested matching and actions) in the driver and offload the rule later, when the neighbour is resolved and becomes valid. When a flow is only cached in the driver and not offloaded into HW yet, we use EAGAIN return value to mark it internally, the TC ndo still returns success. Listen to kernel neighbour update netevents to trace relevant neighbours validity state: 1. If a neighbour becomes valid, offload the related rules to HW. 2. If the neighbour becomes invalid, remove the related rules from HW. 3. If the neighbour mac address was changed, update the encap header. Remove all the offloaded rules using the old encap header from the HW and insert new rules to HW with updated encap header. Access to the neighbors hash table is protected by RTNL lock of its caller or by the table's spinlock. Details of the locking/synchronization among the different actions applied on the neighbour table: Add/remove operations - protected by RTNL lock of its caller (all TC commands are protected by RTNL lock). Add and remove operations are initiated only when the user inserts/removes a TC rule into/from the driver. Lookup/remove operations - since the lookup operation is done from netevent notifier block, RTNL lock can't be used (atomic context). Use the table's spin lock to protect lookups from TC user removal operation. 
bh is used since netevent can be called from a softirq context. Lookup/add operations - The hash table access functions are taking care of the protection between lookup and add operations. When adding/removing encap headers and rules to/from the HW, RTNL lock is used. It can happen when: 1. The user inserts/removes a TC rule into/from the driver (TC commands are protected by RTNL lock of its caller). 2. The driver gets neighbour notification event, which reports about neighbour validity status change. Before adding/removing encap headers and rules to/from the HW, RTNL lock is taken. A neighbour hash table entry should be freed when its encap list is empty. Since the neighbour update netevent notification schedules a neighbour update work that uses the neighbour hash entry, it can't be freed unconditionally when the encap list becomes empty during TC delete rule flow. Use reference count to protect from freeing neighbour hash table entry while it's still in use. When the user asks to unregister a netdevice used by one of the neighbours, neighbour removal notification is received. Then we take a reference on the neighbour and don't free it until the relevant encap entries (and flows) are marked as invalid (not offloaded) and removed from HW. As long as the encap entry is still valid (checked under RTNL lock) we can safely access the neighbour device saved on mlx5e_neigh struct. Signed-off-by: Hadar Hen Zion <[email protected]> Reviewed-by: Or Gerlitz <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
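The reference-counting rule for neighbour hash entries can be sketched in user-space C. All names here are hypothetical, and a plain int stands in for the atomic counter the kernel would need.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical neighbour hash entry; field names are illustrative. */
struct nhe_sketch {
	int refcnt;	/* the kernel would use an atomic or refcount type */
};

/* Creates an entry holding one reference, owned by the hash table. */
struct nhe_sketch *nhe_create(void)
{
	struct nhe_sketch *nhe = calloc(1, sizeof(*nhe));

	if (nhe)
		nhe->refcnt = 1;
	return nhe;
}

/* Taken by the scheduled update work before it uses the entry. */
void nhe_hold(struct nhe_sketch *nhe)
{
	nhe->refcnt++;
}

/* Drops one reference; frees the entry and returns true on the last put. */
bool nhe_put(struct nhe_sketch *nhe)
{
	if (--nhe->refcnt > 0)
		return false;
	free(nhe);
	return true;
}
```

If the update work holds a reference, the put performed when the encap list empties does not free the entry; the memory is released only when the work drops its own reference.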
2017-04-30 | net/mlx5e: Add neighbour hash table to the representors | Hadar Hen Zion | 2 | -8/+129
Add hash table to the representors which is to be used by the next patch to save neighbours information in the driver. In order to offload IP tunnel encapsulation rules, the driver must find the tunnel dst neighbour according to the output device and the destination address given by the user. The next patch will cache the neighbors information in the driver to allow support in neigh update flow for tunnel encap rules. The neighbour entries are also saved in a list so we easily iterate over them when querying statistics in order to provide 'used' feedback to the kernel neighbour NUD core. Signed-off-by: Hadar Hen Zion <[email protected]> Reviewed-by: Or Gerlitz <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2017-04-30 | net/mlx5e: Read neigh parameters with proper locking | Hadar Hen Zion | 1 | -6/+14
The nud_state and hardware address fields are protected by the neighbour lock, we should acquire it before accessing those parameters. Use this lock to avoid inconsistency between the neighbour validity state and its hardware address. Signed-off-by: Hadar Hen Zion <[email protected]> Reviewed-by: Or Gerlitz <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2017-04-30 | net/mlx5e: Use flag to properly monitor a flow rule offloading state | Hadar Hen Zion | 1 | -1/+7
Instead of relying on the 'flow->rule' pointer value which can be valid or invalid (in case the FW returns an error while trying to offload the rule), monitor the rule state using a flag. In downstream patch which adds support to IP tunneling neigh update flow, a TC rule could be cached in the driver and not offloaded into the HW. In this case, the flow handle pointer stays NULL. Check the offloaded flag to properly deal with rules which are currently not offloaded when querying rule statistics. This patch doesn't add any new functionality. Signed-off-by: Hadar Hen Zion <[email protected]> Reviewed-by: Or Gerlitz <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
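A small sketch of tracking offload state with an explicit flag instead of inspecting a pointer. The flag name, struct layout and stats helper are assumptions made for illustration, not the driver's definitions.

```c
#include <stdbool.h>

#define FLOW_OFFLOADED_SKETCH (1u << 0)	/* hypothetical flag bit */

struct flow_sketch {
	unsigned int flags;
	void *rule;	/* may be NULL or error-encoded, so not a reliable state */
};

bool flow_is_offloaded(const struct flow_sketch *f)
{
	return (f->flags & FLOW_OFFLOADED_SKETCH) != 0;
}

/*
 * Copies the hardware counter into *packets only for flows that are really
 * in hardware; returns whether a counter was reported.
 */
bool flow_stats_sketch(const struct flow_sketch *f, unsigned long hw_packets,
		       unsigned long *packets)
{
	if (!flow_is_offloaded(f))
		return false;	/* cached in the driver only: nothing to query */
	*packets = hw_packets;
	return true;
}
```

A flow that is only cached in the driver has the flag clear, so the statistics path skips it regardless of what the rule pointer contains.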
2017-04-30 | net/mlx5e: Remove output device parameter from create encap header helpers definition | Hadar Hen Zion | 1 | -15/+14
Passing output device parameter to the helper functions that deal with creation of encapsulation headers is redundant. Output device parameter can be defined inside those helpers, no need to pass it. Refactor the code by removing the parameter from the function signature. This patch doesn't change any functionality. Signed-off-by: Hadar Hen Zion <[email protected]> Reviewed-by: Or Gerlitz <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2017-04-30 | net/mlx5e: Move the encap entry structure from the eswitch header | Or Gerlitz | 3 | -19/+18
The encap entry structure isn't manipulated by the eswitch code, hence it can/needs to be removed from the eswitch header. Do that, and change it to have mlx5e_ prefix. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2017-04-30 | net/mlx5: Remove encap entry pointer from the eswitch flow attributes | Or Gerlitz | 3 | -15/+18
Encap wise, the tc eswitch flow attribute struct needs to have only the encap ID which is programmed later to the HW and none of the higher level encap params, fix that. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2017-04-30 | net/mlx5e: Extendable vport representor netdev private data | Saeed Mahameed | 6 | -130/+224
Make representor netdev private data extendable by adding new struct "mlx5e_rep_priv" and use it as the rep netdev private data struct instead of directly pointing to mlx5_eswitch_rep. Added new en_rep.h header file to contain all representor related definitions and prototypes, and moved all representor specific logic into en_rep.c. Needed for downstream patches to extend representor functionality to support neighbour update. Signed-off-by: Saeed Mahameed <[email protected]> Reviewed-by: Or Gerlitz <[email protected]>
2017-04-30 | e1000e: Add Support for 38.4MHZ frequency | Sasha Neftin | 2 | -29/+48
Support for the 38.4MHz frequency is required for PTP on CannonLake. SYSTIM frequency adjustment attributes for TIMINCA are read and set depending on the hardware clock frequency for different types of adapters. The 38.4MHz frequency is supported by CannonLake and is active once the time synchronisation mechanism is enabled. Changed abbreviation from Hz to HZ to be compliant with checkpatch code style. Signed-off-by: Sasha Neftin <[email protected]> Reviewed-by: Raanan Avargil <[email protected]> Reviewed-by: Dima Ruinskiy <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-04-30 | e1000e: Add Support for CannonLake | Sasha Neftin | 4 | -62/+63
Propagate the CannonLake MAC type to driver functionality. Signed-off-by: Sasha Neftin <[email protected]> Reviewed-by: Raanan Avargil <[email protected]> Reviewed-by: Dima Ruinskiy <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-04-30 | e1000e: Initial Support for CannonLake | Sasha Neftin | 4 | -1/+33
i219 (6) and i219 (7) are the next LOM generations that will be available on the next Intel Client platform (CannonLake). This patch provides the initial support for these devices. Signed-off-by: Sasha Neftin <[email protected]> Reviewed-by: Raanan Avargil <[email protected]> Reviewed-by: Dima Ruinskiy <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-04-30 | e1000e: fix PTP on e1000_pch_lpt variants | Jarod Wilson | 1 | -1/+2
I've got reports that the Intel I-218V NIC in Intel NUC5i5RYH systems used as a PTP slave experiences random ~10 hour clock jumps, which are resolved if the same workaround for the 82574 and 82583 is employed, so set the appropriate flag2 in e1000_pch_lpt_info too. Reported-by: Rupesh Patel <[email protected]> Signed-off-by: Jarod Wilson <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-04-30 | i40evf: hide unused variable | Arnd Bergmann | 1 | -1/+2
On architectures with larger pages, we get a warning about an unused variable: drivers/net/ethernet/intel/i40evf/i40evf_main.c: In function 'i40evf_configure_rx': drivers/net/ethernet/intel/i40evf/i40evf_main.c:690:21: error: unused variable 'netdev' [-Werror=unused-variable] This moves the declaration into the #ifdef to avoid the warning. Fixes: dab86afdbbd1 ("i40e/i40evf: Change the way we limit the maximum frame size for Rx") Signed-off-by: Arnd Bergmann <[email protected]> Acked-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-04-30 | i40evf: allocate queues before we setup the interrupts and q_vectors | Jacob Keller | 1 | -9/+9
This matches the ordering of how we free stuff during reset and remove. It also makes logical sense because we set the interrupts based on the number of queues. Currently this doesn't really matter in practice. However a future patch moves the assignment of num_active_queues into i40evf_alloc_queues, which is required by i40evf_set_interrupt_capability. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-04-30 | i40evf: remove I40E_FLAG_FDIR_ATR_ENABLED | Jacob Keller | 1 | -1/+0
The flag used by the common code and PF code is I40E_FLAG_FD_ATR_ENABLED, not *FDIR*. It turns out none of the txrx code actually shared with the VF driver actually checks the ATR flag. This is made even more obvious by the typo in the VF header file. Let's just remove the flag from the VF driver since it's not needed. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-04-30 | i40e: remove hw_disabled_flags in favor of using separate flag bits | Jacob Keller | 4 | -52/+38
The hw_disabled_flags field was added as a way of signifying that a feature was automatically or temporarily disabled. However, we actually only use this for FDir features. Replace its use with new _AUTO_DISABLED flags instead. This is more readable, because you aren't setting an *_ENABLED flag to *disable* the feature. Additionally, clean up a few areas where we used these bits. First, we don't really need to set the auto-disable flag for ATR if we're fully disabling the feature via ethtool. Second, we should always clear the auto-disable bits in case they somehow got set when the feature was disabled. However, avoid displaying a message that we've re-enabled the feature. Third, we shouldn't be re-enabling ATR in the SB ntuple add flow, because it might have been disabled due to space constraints. Instead, we should just wait for the fdir_check_and_reenable to be called by the watchdog. Overall, this change allows us to simplify some code by removing an extra field we didn't need, and the result should make it more clear as to what we're actually doing with these flags. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-04-30 | i40evf: remove needless min_t() on num_online_cpus()*2 | Jacob Keller | 1 | -6/+6
We already set pairs to the value of adapter->num_active_queues. This value is limited by vsi_res->num_queue_pairs and num_online_cpus(). This means that pairs by definition is already smaller than num_online_cpus()*2, so we don't even need to bother with this check. Let's just remove it and update the comment. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-04-30 | i40e: use DECLARE_BITMAP for state fields | Jacob Keller | 11 | -181/+189
Instead of assuming our flags fit within an unsigned long, use DECLARE_BITMAP which will ensure that we always allocate enough space. Additionally, use __I40E_STATE_SIZE__ markers as the last element of the enumeration so that the size of the BITMAP is compile-time assigned rather than programmer-time assigned. This ensures that potential future flag additions do not actually overrun the array. This is especially important as 32bit systems would only have 32bit longs instead of 64bit longs as we generally have assumed in the prior code. This change also removes a dereference of the state fields throughout the code, so it does have a bit of code churn. The conversions were automated using sed replacements with an alternation s/&(vsi->back|vsi|pf)->state/\1->state/ s/&adapter->vsi.state/adapter->vsi.state/ For debugfs, we modify the printing so that we can display chunks of the state value on new lines. This ensures that we can print the entire set of state values. Additionally, we now print them as 08lx to ensure that they display nicely. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
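The sizing technique can be sketched in user-space C. The macros below are stand-ins that mirror the kernel macros of the same names; the enum values, struct and bit helpers are hypothetical, and the helpers are non-atomic, unlike the kernel's set_bit().

```c
#include <limits.h>
#include <stdbool.h>

/* User-space stand-ins for the kernel macros of the same names. */
#define BITS_PER_LONG     (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n)  (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)
#define DECLARE_BITMAP(name, bits) unsigned long name[BITS_TO_LONGS(bits)]

/* Hypothetical state bits; the final marker sizes the bitmap. */
enum pf_state_sketch {
	STATE_TESTING,
	STATE_RESETTING,
	STATE_DOWN,
	/* new flags are added above this line */
	STATE_SIZE__,	/* must stay last */
};

struct pf_sketch {
	DECLARE_BITMAP(state, STATE_SIZE__);
};

/* Non-atomic helpers for illustration only. */
void set_bit_sketch(unsigned int nr, unsigned long *map)
{
	map[nr / BITS_PER_LONG] |= 1UL << (nr % BITS_PER_LONG);
}

bool test_bit_sketch(unsigned int nr, const unsigned long *map)
{
	return (map[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1UL;
}
```

Because `state` is now an array, callers pass `pf.state` rather than `&pf.state`, which is the dereference change the sed expressions in the commit message perform.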
2017-04-30 | i40e: separate PF and VSI state flags | Jacob Keller | 8 | -55/+64
Avoid using the same named flags for both vsi->state and pf->state. This makes code review easier, as it is more likely that future authors will use the correct state field when checking bits. Previous commits already found issues with at least one check, and possibly others may be incorrect. This reduces confusion as it is more clear what each flag represents, and which flags are valid for which state field. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-04-30 | i40e: remove unnecessary msleep() delay in i40e_free_vfs | Jacob Keller | 3 | -4/+2
The delay was added because of a desire to ensure that the VF driver can finish up removing. However, pci_disable_sriov already has its own ssleep() call that will sleep for an entire second, so there is no reason to add extra delay on top of this by using msleep here. In practice, an msleep() won't have a huge impact on timing but there is no real value in keeping it, so let's just simplify the code and remove it. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-04-30 | i40e: amortize wait time when disabling lots of VFs | Jacob Keller | 1 | -2/+14
Just as we do in i40e_reset_all_vfs, save some time when freeing VFs by amortizing the wait time for stopping queues. We can use i40e_vsi_stop_rings_no_wait() to begin the process of stopping all the VF rings at once. Then, once we've started the process on each VF we can begin waiting for the VFs to stop. This helps reduce the total wait time by a large factor. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-04-30 | i40e: Reprogram port offloads after reset | Alexander Duyck | 1 | -0/+20
This patch corrects a major oversight in that we were not reprogramming the ports after a reset. As a result we completely lost all of the Rx tunnel offloads on receive including Rx checksum, RSS on inner headers, and ATR. The fix for this is pretty standard as all we needed to do is reset the filter bits to pending for all active filters and schedule the sync event. Signed-off-by: Alexander Duyck <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-04-30 | i40e: rename index to port to avoid confusion | Jacob Keller | 2 | -6/+6
The .index field of i40e_udp_port_config represents the udp port number. Rename this variable to port so that it is more obvious. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-04-30 | i40e: make use of i40e_reset_all_vfs when initializing new VFs | Jacob Keller | 2 | -3/+8
When allocating a large number of VFs, the driver previously used i40e_reset_vf in a sequence. Just as when performing a normal reset, this accumulates a large amount of delay for handling all of the VFs in sequence. This delay is mainly due to a hardware requirement to wait after initiating a reset on the VF. We recently added a new function, i40e_reset_all_vfs() which can be used to amortize the delay time, by first triggering all VF resets, then waiting once, and finally cleaning up and allocating the VFs. This is almost as good as truly running the resets in parallel. In order to avoid sending a spurious reset message to a client interface, we have a check to see whether we've assigned pf->num_alloc_vfs yet. This was originally intended as a way to distinguish the "initialization" case from the regular reset case. Unfortunately, this means that we can't directly use i40e_reset_all_vfs yet. Let's avoid this check of pf->num_alloc_vfs by replacing it with a proper VSI state bit which we can use instead. This makes the intention much clearer and allows us to re-use the i40e_reset_all_vfs function directly. Change-ID: I694279b37eb6b5a91b6670182d0c15d10244fd6e Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Mitch Williams <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
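A toy cost model of the amortization, assuming each reset needs a fixed settle time and that the waits can fully overlap. The function names and the single `settle_ms` parameter are assumptions for illustration; real hardware timing is more involved.

```c
/* Sequential: trigger one VF reset, wait for it, then move to the next. */
unsigned int total_wait_sequential(unsigned int num_vfs, unsigned int settle_ms)
{
	unsigned int total = 0;

	for (unsigned int i = 0; i < num_vfs; i++)
		total += settle_ms;	/* trigger VF i, then wait for it */
	return total;
}

/* Amortized: trigger every reset first, then wait once while all settle. */
unsigned int total_wait_amortized(unsigned int num_vfs, unsigned int settle_ms)
{
	/* Phase 1 triggers all resets without waiting (no delay modelled). */
	/* Phase 2 is a single wait that covers every VF at once. */
	return num_vfs ? settle_ms : 0;
}
```

Under this model, 64 VFs with a 10 ms settle time cost 640 ms sequentially but 10 ms when the wait is shared.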