aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel
AgeCommit message (Collapse)AuthorFilesLines
2017-10-06i40e/i40evf: organize and re-number feature flagsJacob Keller3-68/+68
Now that we've reduced the number of flags, organize similar flags together and re-number them accordingly. Since we don't yet have more than 32 flags, we'll use a u32 for both the hw_features and flag field. Should we gain more flags in the future, we may need to convert to a u64 or separate flags out into two fields. One alternative approach considered, but not implemented here, was to use an enumeration for the flag variables, and create a macro I40E_FLAG() which used string concatenation to generate BIT_ULL values. This has the advantage of making the actual bit values compile-time dynamic so that we do not need to worry about matching the order to the bit value. However, this does produce a high level of code churn, and makes it more difficult to read a dumped flags value when debugging. Change-ID: I8653fff69453cd547d6fe98d29dfa9d8710387d1 Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Mitch Williams <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-06i40e: ignore skb->xmit_more when deciding to set RS bitJacob Keller1-30/+4
Since commit 6a7fded776a7 ("i40e: Fix RS bit update in Tx path and disable force WB workaround") we've tried to "optimize" setting the RS bit based around skb->xmit_more. This same logic was refactored in commit 1dc8b538795f ("i40e: Reorder logic for coalescing RS bits"), but ultimately was not functionally changed. Using skb->xmit_more in this way is incorrect, because in certain circumstances we may see a large number of skbs in sequence with xmit_more set. This leads to a performance loss as the hardware does not writeback anything for those packets, which delays the time it takes for us to respond to the stack transmit requests. This significantly impacts UDP performance, especially when layered with multiple devices, such as bonding, VLANs, and vnet setups. This was not noticed until now because it is difficult to create a setup which reproduces the issue. It was discovered in a UDP_STREAM test in a VM, connected using a vnet device to a bridge, which is connected to a bonded pair of X710 ports in active-backup mode with a VLAN. These layered devices seem to compound the number of skbs transmitted at once by the qdisc. Additionally, the problem can be masked by reducing the ITR value. Since the original commit does not provide strong justification for this RS bit "optimization", revert to the previous behavior of setting the RS bit every 4th packet. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-06i40evf: enable support for VF VLAN tag stripping controlJacob Keller1-8/+14
A recent commit 809481484e5d ("i40e/i40evf: support for VF VLAN tag stripping control") added support for VFs to negotiate the control of VLAN tag stripping. This should have allowed VFs to disable the feature. Unfortunately, the flag was set only in netdev->feature flags and not in netdev->hw_features. This ultimately causes the stack to assume that it cannot change the flag, so it was unchangeable and marked as [fixed] in the ethtool -k output. Fix this by setting the feature in hw_features first, just as we do for the PF code. This enables ethtool -K to disable the feature correctly, and fully enables user control of the VLAN tag stripping feature. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-06i40e: do not enter PHY debug mode while setting LEDs behaviourMariusz Stachura1-2/+5
Previous implementation of LED set/get functions required to enter PHY debug mode, in order to prevent access to it from FW and SW at the same time. Reset of all ports was a unwanted side effect. Signed-off-by: Mariusz Stachura <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-06i40e: implement split PCI error reset handlerAlan Brady1-0/+24
This patch implements the PCI error handler reset_prepare and reset_done. This allows us to handle function level reset. Without this patch we are unable to perform and recover from an FLR correctly and this will cause VFs to be unable to recover from an FLR on the PF. Signed-off-by: Alan Brady <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-06i40e: Properly maintain flow director filters listFilip Sadowski1-0/+1
When there is no space for more flow director filters and user requested to add a new one it is rejected by firmware and automatically removed from the filter list maintained by driver. This behaviour is correct. Afterwards existing filter can be removed making free slot for the new one. This however causes the newly added filter to be accepted by firmware but removed from driver filter list resulting in not showing after issuing 'ethtool -n <dev_name>'. This happened due to not clearing the variable pf->fd_inv which stores filter number to be removed from the list when firmware refused to add the requested filter. It caused the filter with this specific ID to be constantly removed once it was added to the list although it has been accepted by firmware and effectively applied to the NIC. It was fixed by clearing pf->fd_inv variable after removal of the filter from the list when it was rejected by firmware. Signed-off-by: Filip Sadowski <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-06i40e: Display error message if module does not meet thermal requirementsFilip Sadowski4-5/+22
This patch causes error message to be displayed when NIC detects insertion of module that does not meet thermal requirements. Signed-off-by: Filip Sadowski <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-06i40e: fix merge errorAlice Michael1-5/+0
This patch removes some code that was accidentally added to the wrong function with a merge error. Fixes: c53934c6d1b1 ("i40e: fix: do not sleep in netdev_ops") Signed-off-by: Alice Michael <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-06i40e/i40evf: use DECLARE_BITMAP for stateJesse Brandeburg4-8/+10
When using set_bit and friends, we should be using actual bitmaps, and fix all the locations where we might access it. Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-06i40e: fix incorrect register definitionMitch Williams1-1/+1
This register was defined incorrectly. Fix the increment value to 8, and replace the iterator with _i to make the definition consistent with other statistics registers. Signed-off-by: Mitch Williams <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-06i40e: redfine I40E_PHY_TYPE_MAXMitch Williams2-2/+2
Since I40E_PHY_TYPE_MAX is used as an iterator, usually combined with some sort of bit-shifting, it should only include actual PHY types and not error cases. Move it up in the enum declaration so that loops only iterate across valid PHY types. Signed-off-by: Mitch Williams <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-06i40e: re-enable PTP L4 capabilities for XL710 if FW >6.0Alan Brady1-0/+5
Starting with XL710 FW 5.3 PTP L4 was disabled for XL710 due to a bug. The bug has since been resolved in XL710 FW >6.0 and PTP L4 can now be re-enabled on those devices with updated firmware. Signed-off-by: Alan Brady <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-06i40e/i40evf: spread CPU affinity hints across online CPUs onlyJacob Keller2-8/+17
Currently, when setting up the IRQ for a q_vector, we set an affinity hint based on the v_idx of that q_vector. Meaning a loop iterates on v_idx, which is an incremental value, and the cpumask is created based on this value. This is a problem in systems with multiple logical CPUs per core (like in simultaneous multithreading (SMT) scenarios). If we disable some logical CPUs, by turning SMT off for example, we will end up with a sparse cpu_online_mask, i.e., only the first CPU in a core is online, and incremental filling in q_vector cpumask might lead to multiple offline CPUs being assigned to q_vectors. Example: if we have a system with 8 cores each one containing 8 logical CPUs (SMT == 8 in this case), we have 64 CPUs in total. But if SMT is disabled, only the 1st CPU in each core remains online, so the cpu_online_mask in this case would have only 8 bits set, in a sparse way. In general case, when SMT is off the cpu_online_mask has only C bits set: 0, 1*N, 2*N, ..., C*(N-1) where C == # of cores; N == # of logical CPUs per core. In our example, only bits 0, 8, 16, 24, 32, 40, 48, 56 would be set. Instead, we should only assign hints for CPUs which are online. Even better, the kernel already provides a function, cpumask_local_spread() which takes an index and returns a CPU, spreading the interrupts across local NUMA nodes first, and then remote ones if necessary. Since we generally have a 1:1 mapping between vectors and CPUs, there is no real advantage to spreading vectors to local CPUs first. In order to avoid mismatch of the default XPS hints, we'll pass -1 so that it spreads across all CPUs without regard to the node locality. Note that we don't need to change the q_vector->affinity_mask as this is initialized to cpu_possible_mask, until an actual affinity is set and then notified back to us. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-06i40e: add private flag to control source pruningMitch Williams3-2/+31
By default, our devices do source pruning, that is, they drop receive packets that have the source MAC matching one of the receive filters. Unfortunately, this breaks ARP monitoring in channel bonding, as the bonding driver expects devices to receive ARPs containing their own source address. Add an ethtool private flag to control this feature. Also, remove the netif_running() check when we process our private flags. It's OK to reset when the device is closed and in most cases we need the reset the apply these changes. Signed-off-by: Mitch Williams <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-06i40e: fix a typo in i40e_pf documentationRami Rosen1-1/+1
This patch fixes a typo in i40e_pf object documentation; num_req_vfs refers to the number of VFs requested for the PF. Signed-off-by: Rami Rosen <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-03fm10k: fix mis-ordered parameters in declaration for .ndo_set_vf_bwJacob Keller2-6/+7
We've had support for setting both a minimum and maximum bandwidth via .ndo_set_vf_bw since commit 883a9ccbae56 ("fm10k: Add support for SR-IOV to driver", 2014-09-20). Likely because we do not support minimum rates, the declaration mis-ordered the "unused" parameter, which causes warnings when analyzed with cppcheck. Fix this warning by properly declaring the min_rate and max_rate variables in the declaration and definition (rather than using "unused"). Also rename "rate" to max_rate so as to clarify that we only support setting the maximum rate. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Krishneil Singh <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-03fm10k: prefer %s and __func__ for diagnostic printsJacob Keller1-2/+16
Don't hard code the function names in the diagnostic output when these reset related routines fail. Instead, use %s and __func__ so that future refactors don't need to change the print outs. Additionally, while we are here, add missing function header comments for the new reset_prepare and reset_done function handlers. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Krishneil Singh <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-03fm10k: Fix misuse of net_ratelimit()Joe Perches1-1/+2
Correct the backward logic using !net_ratelimit() Miscellanea: o Add a blank line before the error return label Signed-off-by: Joe Perches <[email protected]> Tested-by: Krishneil Singh <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-03fm10k: bump version numberJacob Keller1-1/+1
Signed-off-by: Jacob Keller <[email protected]> Tested-by: Krishneil Singh <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-03fm10k: use the MAC/VLAN queue for VF<->PF MAC/VLAN requestsJacob Keller3-4/+133
Now that we have a working MAC/VLAN queue for handling MAC/VLAN messages from the netdev, replace the default handler for the VF<->PF messages. This new handler is very similar to the default code, but uses the MAC/VLAN queue instead of sending the message directly. Unfortunately we can't easily re-use the default code, so we'll just replace the entire function. This ensures that a VF requesting a large number of VLANs or MAC addresses does not start a reset cycle, as explained in the commit which introduced the message queue. Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Ngai-mint Kwan <[email protected]> Tested-by: Krishneil Singh <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-03fm10k: introduce a message queue for MAC/VLAN messagesJacob Keller3-42/+397
Under some circumstances, when dealing with a large number of MAC address or VLAN updates at once, the fm10k driver, particularly the VFs can overload the mailbox with too many messages at once. This results in a mailbox timeout, which causes the driver to initiate a reset. During the reset, we re-send all the same messages that originally caused the timeout. This results in a cycle of resets each triggering a future reset. To fix or avoid this, we introduce a workqueue item which monitors a queue of MAC and VLAN requests. These requests are queued to the end of the list, and we process as a FIFO periodically. Initially we only handle requests for the netdev, but we do handle unicast MAC addresses, multicast MAC addresses, and update VLAN requests. A future patch will add support to use this queue for handling MAC update requests from the VF<->PF mailbox. The MAC/VLAN work item will keep checking to make sure that each request does not overflow the mailbox and cause a timeout. If it might, then the work item will reschedule itself a short time later. This avoids any reset cycle, since we never send the message if the mailbox is not ready. As an alternative, we tried increasing the mailbox message FIFO, but this just delays the problem and results in needless memory waste on the system. Our new message queue is dynamically allocated so only uses as much memory as it needs. Additionally, it need not be contiguous like the Tx and Rx FIFOs. Note that this patch chose to only create a queue for MAC and VLAN messages, since these are the only messages sent in a large enough volume to cause the reset loop. Other messages are very unlikely to overflow the mailbox Tx FIFO so easily. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Krishneil Singh <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-03fm10k: use generic PM hooks instead of legacy PCIe power hooksJacob Keller1-45/+22
Replace the PCI specific legacy power management hooks with the new generic power management hooks which work properly for both suspend and hibernate. The new generic system is better and properly handles the lower level PCIe power management rather than forcing the driver to handle it. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Krishneil Singh <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-03fm10k: use spinlock to implement mailbox lockJacob Keller2-10/+8
Lets not re-invent the locking wheel. Remove our bitlock and use a proper spinlock instead. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Krishneil Singh <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-03fm10k: prepare_for_reset() when we lose PCIe LinkJacob Keller2-26/+79
If we lose PCIe link, such as when an unannounced PFLR event occurs, or when a device is surprise removed, we currently detach the device and close the netdev. This unfortunately leaves a lot of things still active, such as the msix_mbx_pf IRQ, and Tx/Rx resources. This can cause problems because the register reads will return potentially invalid values which may result in unknown driver behavior. Begin the process of resetting using fm10k_prepare_for_reset(), much in the same way as the suspend and resume cycle does. This will attempt to shutdown as much as possible, in order to prevent possible issues. A naive implementation for this has issues, because there are now multiple flows calling the reset logic and setting a reset bit. This would cause problems, because the "re-attach" routine might call fm10k_handle_reset() prior to the reset actually finishing. Instead, we'll add state bits to indicate which flow actually initiated the reset. For the general reset flow, we'll assume that if someone else is resetting that we do not need to handle it at all, so it does not need its own state bit. For the suspend case, we will simply issue a warning indicating that we are attempting to recover from this case when resuming. For the detached subtask, we'll simply refuse to re-attach until we've actually initiated a reset as part of that flow. Finally, we'll stop attempting to manage the mailbox subtask when we're detached, since there's nothing we can do if we don't have a PCIe address. Overall this produces a much cleaner shutdown and recovery cycle for a PCIe surprise remove event. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Krishneil Singh <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02Merge branch '40GbE' of ↵David S. Miller17-74/+837
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-10-02 This series contains updates to i40e and i40evf. Shannon Nelson fixes an issue where when a machine has more CPUs than queue pairs, the counting gets a "little funky" and turns off Flow Director. So to correct it, limit the number of LAN queues initially allocated to be sure there are some left for Flow Director and other features. Lihong cleans up dead code by removing a condition check which cannot ever be true. Christophe Jaillet fixes a potential NULL pointer dereference, which could happen if kzalloc() fails. Filip corrects the reporting of supported link modes, which was incorrect for some NICs. Added support for 'ethtool -m' command, which displays information about QSFP+ modules. Mariusz adds functions to read/write the LED registers to control the LEDS, instead of accessing the registers directly whenever the LEDs need to be controlled. Jake fixes a regression where we introduced a scheduling while atomic, so introduce a separate helper function which will manage its own need for the mac_filter_hash_lock. Also cleaned up the "PF" parameter in i40e_vc_disable_vf() since it is never used and is not needed. Fixed a rare case where it is possible that a reset does not occur when i40e_vc_disable_vf() is called, so modify i40e_reset_vf() to return a bool to indicate whether it reset or not so that i40e_vc_disable_vf() can wait until a reset actually occurs. Alan adds the ability for the VF to request more or less underlying allocated queues from the PF. Fixes the incorrect method for clearing the vf_states variable with a NULL assignment, when we should be using atomic bitops since we don't actually want to clear all the flags. Fixed a resource leak, where the PF driver fails to inform clients of a VF reset because we were incorrectly checking the I40E_VF_STATE_PRE_ENABLE bit. Mitch converts i40evf_map_rings_to_vectors() to a void function since it cannot fail and allows us to clean up the checks for the function return value. Scott enables the driver(s) to pass traffic with VLAN tags using the 802.1ad Ethernet protocol. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-10-02i40e: Stop dropping 802.1ad tags - eth proto 0x88a8Scott Peterson7-3/+62
Enable i40e to pass traffic with VLAN tags using the 802.1ad ethernet protocol ID (0x88a8). This requires NIC firmware providing version 1.7 of the API. With older NIC firmware 802.1ad tagged packets will continue to be dropped. No VLAN offloads nor RSS are supported for 802.1ad VLANs. Signed-off-by: Scott Peterson <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02i40e: fix client notify of VF resetAlan Brady1-2/+2
Currently there is a bug in which the PF driver fails to inform clients of a VF reset which then causes clients to leak resources. The bug exists because we were incorrectly checking the I40E_VF_STATE_PRE_ENABLE bit. When a VF is first init we go through a reset to initialize variables and allocate resources but we don't want to inform clients of this first reset since the client isn't fully enabled yet so we set a state bit signifying we're in a "pre-enabled" client state. During the first reset we should be clearing the bit, allowing all following resets to notify the client of the reset when the bit is not set. This patch fixes the issue by negating the 'test_and_clear_bit' check to accurately reflect the behavior we want. Signed-off-by: Alan Brady <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02i40e: fix handling of vf_states variableAlan Brady2-2/+4
Currently we inappropriately clear the vf_states variable with a null assignment. This is problematic because we should be using atomic bitops on this variable and we don't actually want to clear all the flags. We should just clear the ones we know we want to clear. Additionally remove the I40E_VF_STATE_FCOEENA bit because it is no longer being used. Signed-off-by: Alan Brady <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02i40e: make i40evf_map_rings_to_vectors voidMitch Williams1-7/+2
This function cannot fail, so why is it returning a value? And why are we checking it? Why shouldn't we just make it void? Why is this commit message made up of only questions? Signed-off-by: Mitch Williams <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02i40evf: Enable VF to request an alternate queue allocationAlan Brady4-7/+173
Currently the VF gets a default number of allocated queues from HW on init and it could choose to enable or disable those allocated queues. This makes it such that the VF can request more or less underlying allocated queues from the PF. First the VF negotiates the number of queues it wants that can be supported by the PF and if successful asks for a reset. During reset the PF will reallocate the HW queues for the VF and will then remap the new queues. Signed-off-by: Alan Brady <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02i40e: ensure reset occurs when disabling VFJacob Keller2-11/+35
It is possible although rare that we may not reset when i40e_vc_disable_vf() is called. This can lead to some weird circumstances with some values not being properly set. Modify i40e_reset_vf() to return a code indicating whether it reset or not. Now, i40e_vc_disable_vf() can wait until a reset actually occurs. If it fails to free up within a reasonable time frame we'll display a warning message. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02i40e: make use of i40e_vc_disable_vfJacob Keller1-2/+1
Replace i40e_vc_notify_vf_reset and i40e_reset_vf with a call to i40e_vc_disable_vf which does this exact thing. This matches similar code patterns throughout the driver. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02i40e: drop i40e_pf *pf from i40e_vc_disable_vf()Jacob Keller1-4/+3
It's never used, and the vf structure could get back to the PF if necessary. Lets just drop the extra unneeded parameter. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02i40e: don't hold spinlock while resetting VFJacob Keller1-4/+32
When we refactored handling of the PVID in commit 9af52f60b2d9 ("i40e: use (add|rm)_vlan_all_mac helper functions when changing PVID") we introduced a scheduling while atomic regression. This occurred because we now held the spinlock across a call to i40e_reset_vf(), which results in a usleep_range() call that triggers a scheduling while atomic bug. This was rare as it only occurred if the user configured a VLAN on a VF and also attempted to reconfigure the VF from the host system with a port VLAN. We do need to hold the lock while calling i40e_is_vsi_in_vlan(), but we should not be holding it while we reset the VF. We'll fix this by introducing a separate helper function i40e_vsi_has_vlans which checks whether we have a PVID and whether the VSI has configured VLANs. This helper function will manage its own need for the mac_filter_hash_lock. Then, we can move the acquiring of the spinlock until after we reset the VF, which ensures that we do not sleep while holding the lock. Using a separate function like this makes the code more clear and is easier to read than attempting to release and re-acquire the spinlock when we reset the VF. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02i40e: use admin queue for setting LEDs behaviorMariusz Stachura3-24/+99
Instead of accessing register directly, use newly added AQC in order to blink LEDs. Introduce and utilize a new flag to prevent excessive API version checking. Signed-off-by: Mariusz Stachura <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02i40e: Add support for 'ethtool -m'Filip Sadowski9-0/+371
This patch adds support for 'ethtool -m' command which displays information about (Q)SFP+ module plugged into NIC's cage. Signed-off-by: Filip Sadowski <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02i40e: Fix reporting of supported link modesFilip Sadowski3-5/+46
This patch fixes incorrect reporting of supported link modes on some NICs. Signed-off-by: Filip Sadowski <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02i40e: Fix a potential NULL pointer dereferenceChristophe JAILLET1-0/+3
If 'kzalloc()' fails, a NULL pointer will be dereferenced. Return an error code (-ENOMEM) instead. Signed-off-by: Christophe JAILLET <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02i40e: remove logically dead codeLihong Yang1-2/+0
This patch removes the !vf condition check that cannot be true in i40e_ndo_set_vf_trust function Detected by CoverityScan, CID 1397531 Logically dead code Signed-off-by: Lihong Yang <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02i40e: limit lan queue count in large CPU count machineShannon Nelson1-4/+7
When a machine has more CPUs than queue pairs, e.g. 512 cores, the counting gets a little funky and turns off Flow Director with the message: not enough queues for Flow Director. Flow Director feature is disabled This patch limits the number of lan queues initially allocated to be sure we have some left for FD and other features. Signed-off-by: Shannon Nelson <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02fm10k: prevent race condition of __FM10K_SERVICE_SCHEDJacob Keller1-7/+25
Although very unlikely, it is possible that cancel_work_sync() may stop the service_task before it actually started. In this case, the __FM10K_SERVICE_SCHED bit will never be cleared. This results in the service task being unable to reschedule in the future. Add a helper function which sets the service disable bit, waits for the service task to stop and clears the schedule bit, thus avoiding the race condition. We know the schedule bit is safe to clear because the cancel_work_sync() guarantees the service task is not running. Add a helper function also to restart the service task, for symmetry. This is not strictly needed but helps the mental model of how to stop and start the service task. This race could only happen in fm10k_suspend/fm10k_resume as this is the only place where the service task is actually restarted. Thus, suspend/resume testing would be ideal. However, note that the chance of this happening is very slim as the service event is scheduled for immediate execution, and you would have to trigger a suspend at almost the exact same time as the service task was scheduled. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Krishneil Singh <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02fm10k: move fm10k_prepare_for_reset and fm10k_handle_resetJacob Keller1-29/+29
A future patch needs these functions defined earlier in the file. Move them closer to above where they will be called. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Krishneil Singh <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02fm10k: avoid divide by zero in rare cases when device is resettingJacob Keller1-2/+6
It is possible that under rare circumstances the device is undergoing a reset, such as when a PFLR occurs, and the device may be transmitting simultaneously. In this case, we might attempt to divide by zero when finding the proper r_idx. Instead, lets read the num_tx_queues once, and make sure it's non-zero. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Krishneil Singh <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02fm10k: don't loop while resetting VFs due to VFLR eventJacob Keller1-13/+11
We've always had a really weird looping construction for resetting VFs. We read the VFLRE register and reset the VF if the corresponding bit is set, which makes sense. However we loop continuously until we no longer have any bits left unset. At first this makes sense, as a sort of "keep trying until we succeed" concept. Unfortunately this causes a problem if we happen to surprise remove while this code is executing, because in this case we'll always read all 1s for the VFLRE register. This results in a hard lockup on the CPU because the loop will never terminate. Because our own reset function will clear the VFLR event register always, (except when we've lost PCIe link obviously) there is no real reason to loop. In practice, we'll loop over once and find that no VFs are pending anymore. Lets just check once. Since we're clear the notification when we reset there's no benefit to the loop. Additionally, there shouldn't be a race as future VLFRE events should trigger an interrupt. Additionally, we didn't warn or do anything in the looped case anyways. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Krishneil Singh <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02fm10k: simplify reading PFVFLRE registerJacob Keller1-4/+2
We're doing a really convoluted bitshift and read for the PFVFLRE register. Just reading the PFVFLRE(1), shifting it by 32, then reading PFVFLRE(0) should be sufficient. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Krishneil Singh <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02fm10k: avoid needless delay when loading driverJacob Keller1-4/+2
When we load the driver, we set the last_reset to be in the future, which delays the initial driver reset. Additionally, the service task isn't scheduled to run automatically until the timer runs out. This causes a needless delay of the first reset to begin talking to the switch manager. We can avoid this by simply not setting last_reset and immediately scheduling the service task while in probe. This allows the device to wake up faster, and avoids this delay. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Krishneil Singh <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02fm10k: add missing fall through commentJacob Keller1-0/+1
Newer versions of GCC starting with 7 now additionally warn when a case statement may fall through without an explicit comment mentioning it. Add such a comment to silence the warning, as this is expected. Unfortunately the comment must come directly before the next case statement, so we put it outside the #ifdef. Otherwise, the compiler cannot properly detect it and thus the warning is displayed regardless. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Krishneil Singh <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02fm10k: avoid possible truncation of q_vector->nameJacob Keller1-7/+7
New versions of GCC since version 7 began warning about possible truncation of calls to snprintf. We can fix this and avoid false positives. First, we should pass the full buffer size to snprintf, because it guarantees a NULL character as part of its passed length, so passing len-1 is simply wasting a byte of possible storage. Second, if we make the ri and ti variables unsigned, the compiler is able to correctly reason that the value never gets larger than 256, so it doesn't need to warn about the full space required to print a signed integer. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Krishneil Singh <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02fm10k: fix typos on fall through commentsJacob Keller2-7/+7
Newer versions of GCC since version 7 now warn when a case statement may fall through without an explicit comment. "Fallthough" does not count as it is misspelled. Fix the typos for these comments to appease the new warnings. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Krishneil Singh <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-02fm10k: stop spurious link down messages when Tx FIFO is fullJacob Keller1-3/+3
In fm10k_get_host_state_generic, we check the mailbox tx_read() function to ensure that the mailbox is still open. This function also checks to make sure we have space to transmit another message. Unfortunately, if we just recently sent a bunch of messages (such as enabling hundreds of VLANs on a VF) this can result in a race where the watchdog task thinks the link went down just because we haven't had time to process all these messages yet. Instead, lets just check whether the mailbox is still open. This ensures that we don't race with the Tx FIFO, and we only link down once the mailbox is not open. This is safe, because if the FIFO fills up and we're unable to send a message for too long, we'll end up triggering the timeout detection which results in a reset. Additionally, since we still check to ensure the mailbox state is OPEN, we'll transition to link down whenever the mailbox closes as well. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Krishneil Singh <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>