aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel
AgeCommit message (Collapse)AuthorFilesLines
2017-10-13i40e/i40evf: don't trust VF to reset itselfAlan Brady2-6/+10
When using 'ethtool -L' on a VF to change number of requested queues from PF, we shouldn't trust the VF to reset itself after making the request. Doing it that way opens the door for a potentially malicious VF to do nasty things to the PF which should never be the case. This makes it such that after VF makes a successful request, PF will then reset the VF to institute required changes. Only if the request fails will PF send a message back to VF letting it know the request was unsuccessful. Testing-hints: There should be no real functional changes. This is simply hardening against a potentially malicious VF. Signed-off-by: Alan Brady <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-13i40e: fix link reportingAlan Brady1-1/+6
When querying the NVM for supported phy_types, on some firmware versions, we were failing to actually fill out the phy_types which means ethtool wouldn't report any link types. Testing-hints: Check 'ethtool <iface>' if you have the right (wrong?) firmware. Without this patch, no link modes will be reported. Signed-off-by: Alan Brady <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-13i40e: make const array patterns static, reduces object code sizeColin Ian King1-1/+3
Don't populate const array patterns on the stack, instead make it static. Makes the object code smaller by over 60 bytes: Before: text data bss dec hex filename 1953 496 0 2449 991 i40e_diag.o After: text data bss dec hex filename 1798 584 0 2382 94e i40e_diag.o (gcc 6.3.0, x86-64) Signed-off-by: Colin Ian King <[email protected]> Acked-by: Jesse Brandeburg <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-13i40e: Add support setting TC max bandwidth ratesAmritha Nambiar2-9/+93
This patch enables setting up maximum Tx rates for the traffic classes in i40e. The maximum rate is offloaded to the hardware through the mqprio framework by specifying the mode option as 'channel' and shaper option as 'bw_rlimit' and is configured for the VSI. Configuring minimum Tx rate limit is not supported in the device. The minimum usable value for Tx rate is 50Mbps. Example: # tc qdisc add dev eth0 root mqprio num_tc 2 map 0 0 0 0 1 1 1 1\ queues 4@0 4@4 hw 1 mode channel shaper bw_rlimit\ max_rate 4Gbit 5Gbit To dump the bandwidth rates: # tc qdisc show dev eth0 qdisc mqprio 804a: root tc 2 map 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 queues:(0:3) (4:7) mode:channel shaper:bw_rlimit max_rate:4Gbit 5Gbit Signed-off-by: Amritha Nambiar <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-13i40e: Refactor VF BW rate limitingAmritha Nambiar3-43/+71
This patch refactors the BW rate limiting for Tx traffic on the VF to be reused in the next patch for rate limiting Tx traffic for the VSIs on the PF as well. Signed-off-by: Amritha Nambiar <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-13i40e: Enable 'channel' mode in mqprio for TC configsAmritha Nambiar3-106/+362
The i40e driver is modified to enable the new mqprio hardware offload mode and factor the TCs and queue configuration by creating channel VSIs. In this mode, the priority to traffic class mapping and the user specified queue ranges are used to configure the traffic classes by setting the mode option to 'channel'. Example: map 0 0 0 0 1 2 2 3 queues 2@0 2@2 1@4 1@5\ hw 1 mode channel qdisc mqprio 8038: root tc 4 map 0 0 0 0 1 2 2 3 0 0 0 0 0 0 0 0 queues:(0:1) (2:3) (4:4) (5:5) mode:channel shaper:dcb The HW channels created are removed and all the queue configuration is set to default when the qdisc is detached from the root of the device. This patch also disables setting up channels via ethtool (ethtool -L) when the TCs are configured using mqprio scheduler. The patch also limits setting ethtool Rx flow hash indirection (ethtool -X eth0 equal N) to max queues configured via mqprio. The Rx flow hash indirection input through ethtool should be validated so that it is within in the queue range configured via tc/mqprio. The bound checking is achieved by reporting the current rss size to the kernel when queues are configured via mqprio. Example: map 0 0 0 1 0 2 3 0 queues 2@0 4@2 8@6 11@14\ hw 1 mode channel Cannot set RX flow hash configuration: Invalid argument Signed-off-by: Amritha Nambiar <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-13i40e: Add infrastructure for queue channel supportAmritha Nambiar3-9/+743
This patch sets up the infrastructure for offloading TCs and queue configurations to the hardware by creating HW channels(VSI). A new channel is created for each of the traffic class configuration offloaded via mqprio framework except for the first TC (TC0). TC0 for the main VSI is also reconfigured as per user provided queue parameters. Queue counts that are not power-of-2 are handled by reconfiguring RSS by reprogramming LUTs using the queue count value. This patch also handles configuring the TX rings for the channels, setting up the RX queue map for channel. Also, the channels so created are removed and all the queue configuration is set to default when the qdisc is detached from the root of the device. Signed-off-by: Amritha Nambiar <[email protected]> Signed-off-by: Kiran Patil <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-13i40e: Add macro for PF reset bitAmritha Nambiar4-10/+9
Introduce a macro for the bit setting the PF reset flag and update its usages. This makes it easier to use this flag in functions to be introduced in future without encountering checkpatch issues related to alignment and line over 80 characters. Signed-off-by: Amritha Nambiar <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-10igb: check memory allocation failureChristophe JAILLET1-0/+2
Check memory allocation failures and return -ENOMEM in such cases, as already done for other memory allocations in this function. This avoids NULL pointers dereference. Signed-off-by: Christophe JAILLET <[email protected]> Tested-by: Aaron Brown <[email protected] Acked-by: PJ Waskiewicz <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-10e1000e: Be drop monitor friendlyFlorian Fainelli1-7/+11
e1000e_put_txbuf() can be called from normal reclamation path as well as when a DMA mapping failure, so we need to differentiate these two cases when freeing SKBs to be drop monitor friendly. e1000e_tx_hwtstamp_work() and e1000_remove() are processing TX timestamped SKBs and those should not be accounted as drops either. Signed-off-by: Florian Fainelli <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-10e1000e: apply burst mode settings only on defaultWillem de Bruijn3-13/+15
Devices that support FLAG2_DMA_BURST have different default values for RDTR and RADV. Apply burst mode default settings only when no explicit value was passed at module load. The RDTR default is zero. If the module is loaded for low latency operation with RxIntDelay=0, do not override this value with a burst default of 32. Move the decision to apply burst values earlier, where explicitly initialized module variables can be distinguished from defaults. Signed-off-by: Willem de Bruijn <[email protected]> Acked-by: Alexander Duyck <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-10e1000e: fix buffer overrun while the I219 is processing DMA transactionsSasha Neftin1-3/+5
Intel® 100/200 Series Chipset platforms reduced the round-trip latency for the LAN Controller DMA accesses, causing in some high performance cases a buffer overrun while the I219 LAN Connected Device is processing the DMA transactions. I219LM and I219V devices can fall into unrecovered Tx hang under very stressfully UDP traffic and multiple reconnection of Ethernet cable. This Tx hang of the LAN Controller is only recovered if the system is rebooted. Slightly slow down DMA access by reducing the number of outstanding requests. This workaround could have an impact on TCP traffic performance on the platform. Disabling TSO eliminates performance loss for TCP traffic without a noticeable impact on CPU performance. Please, refer to I218/I219 specification update: https://www.intel.com/content/www/us/en/embedded/products/networking/ ethernet-connection-i218-family-documentation.html Signed-off-by: Sasha Neftin <[email protected]> Reviewed-by: Dima Ruinskiy <[email protected]> Reviewed-by: Raanan Avargil <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-10e1000e: Avoid receiver overrun interrupt burstsBenjamin Poirier2-8/+26
When e1000e_poll() is not fast enough to keep up with incoming traffic, the adapter (when operating in msix mode) raises the Other interrupt to signal Receiver Overrun. This is a double problem because 1) at the moment e1000_msix_other() assumes that it is only called in case of Link Status Change and 2) if the condition persists, the interrupt is repeatedly raised again in quick succession. Ideally we would configure the Other interrupt to not be raised in case of receiver overrun but this doesn't seem possible on this adapter. Instead, we handle the first part of the problem by reverting to the practice of reading ICR in the other interrupt handler, like before commit 16ecba59bc33 ("e1000e: Do not read ICR in Other interrupt"). Thanks to commit 0a8047ac68e5 ("e1000e: Fix msi-x interrupt automask") which cleared IAME from CTRL_EXT, reading ICR doesn't interfere with RxQ0, TxQ0 interrupts anymore. We handle the second part of the problem by not re-enabling the Other interrupt right away when there is overrun. Instead, we wait until traffic subsides, napi polling mode is exited and interrupts are re-enabled. Reported-by: Lennart Sorensen <[email protected]> Fixes: 16ecba59bc33 ("e1000e: Do not read ICR in Other interrupt") Signed-off-by: Benjamin Poirier <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-10e1000e: Separate signaling for link check/link upBenjamin Poirier2-4/+9
Lennart reported the following race condition: \ e1000_watchdog_task \ e1000e_has_link \ hw->mac.ops.check_for_link() === e1000e_check_for_copper_link /* link is up */ mac->get_link_status = false; /* interrupt */ \ e1000_msix_other hw->mac.get_link_status = true; link_active = !hw->mac.get_link_status /* link_active is false, wrongly */ This problem arises because the single flag get_link_status is used to signal two different states: link status needs checking and link status is down. Avoid the problem by using the return value of .check_for_link to signal the link status to e1000e_has_link(). Reported-by: Lennart Sorensen <[email protected]> Signed-off-by: Benjamin Poirier <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-10e1000e: Fix return value testBenjamin Poirier1-1/+1
All the helpers return -E1000_ERR_PHY. Signed-off-by: Benjamin Poirier <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-10e1000e: Fix wrong comment related to link detectionBenjamin Poirier1-2/+2
Reading e1000e_check_for_copper_link() shows that get_link_status is set to false after link has been detected. Therefore, it stays TRUE until then. Signed-off-by: Benjamin Poirier <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-10e1000e: Fix error path in link detectionBenjamin Poirier1-3/+4
In case of error from e1e_rphy(), the loop will exit early and "success" will be set to true erroneously. Signed-off-by: Benjamin Poirier <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-10i40e: Fix memory leak related filter programming statusAlexander Duyck1-27/+36
It looks like we weren't correctly placing the pages from buffers that had been used to return a filter programming status back on the ring. As a result they were being overwritten and tracking of the pages was lost. This change works to correct that by incorporating part of i40e_put_rx_buffer into the programming status handler code. As a result we should now be correctly placing the pages for those buffers on the re-allocation list instead of letting them stay in place. Fixes: 0e626ff7ccbf ("i40e: Fix support for flow director programming status") Reported-by: Anders K. Pedersen <[email protected]> Signed-off-by: Alexander Duyck <[email protected]> Tested-by: Anders K Pedersen <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-10i40e: Fix comment about locking for __i40e_read_nvm_word()Stefano Brivio1-1/+1
Caller needs to acquire the lock. Called functions will not. Fixes: 09f79fd49d94 ("i40e: avoid NVM acquire deadlock during NVM update") Signed-off-by: Stefano Brivio <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller4-50/+13
2017-10-09Merge branch '40GbE' of ↵David S. Miller12-81/+107
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 40GbE Intel Wired LAN Driver Updates 2017-10-09 This series contains updates to i40e and i40evf only. Jake fixes missed flag conversion from u64 to u32. Fixes a deafult ITR value issue where the driver defaults to an ITR value of half the expected value (in terms of minimum microseconds between interrupts). So fix this by changing the default values to be calculated using the ITR_REG_TO_USEC() macro which indicates that we are converting from the register units into microseconds. Updates the drivers to bump the tail in increments of 8 and double the number of descriptors we will bundle into one tail bump when receiving. With the recent kernel support for enabling XPS and QoS at the same time, we no longer need to worry about the number of traffic classes when enabling XPS. Lihong converts the use of hash_for_each() to hash_for_each_safe() to safely remove a hash entry. Adds a check for the return value for find_first_bit() in the case that it returns the size passed to search. Alan fixes a bug in which filters are erroneously removed if they are removed and then added again. So make sure that when adding a filter, if we find it already existed in our list, make sure it is not marked to be removed. Jayaprakash adds the retrying of PHY reads when the I2C is busy for a maximum period of 500ms. Rami fixes code comment typo. Stefano Brivio simplifies the code by removing the use of a local return code variable and simply return the results of the read function. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-10-09i40e: Avoid some useless variables and initializers in NVM functionsStefano Brivio1-13/+7
Fixes: 09f79fd49d94 ("i40e: avoid NVM acquire deadlock during NVM update") Signed-off-by: Stefano Brivio <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09i40e: fix a typoRami Rosen1-1/+1
This patch fixes a typo in i40e_vsi_alloc_arrays() documentation. The first parameter name should be "vsi" instead of "type". Signed-off-by: Rami Rosen <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09i40e: use a local variable instead of calculating multiple timesLihong Yang1-13/+7
The computed result of I40E_MAX_VSI_QP * I40E_VIRTCHNL_SUPPORTED_QTYPES is used more than three times in function i40e_config_irq_link_list. Simply declare a local variable to store it to improve readability. Signed-off-by: Lihong Yang <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09i40e: Retry AQC GetPhyAbilities to overcome I2CRead hangsJayaprakash Shanmugam3-13/+35
- When the I2C is busy, the PHY reads are delayed. The firmware will return EGAIN in these cases with an expectation that the SW will trigger the reads again - This patch retries the operation for a maximum period of 500ms Signed-off-by: Jayaprakash Shanmugam <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09i40e: add check for return from find_first_bit callLihong Yang1-0/+4
The find_first_bit function will return the size passed to search if the first set bit is not found. This patch adds the check in case that happens as the return value would be used as the index in an array and that would have caused the out-of-bounds access. Detected by CoverityScan, CID 1295969 Out-of-bounds access Signed-off-by: Lihong Yang <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09i40e: allow XPS with QoS enabledJacob Keller1-11/+6
Recently, the kernel gained support for enabling XPS and QoS at the same time. Thus, we no longer need to worry about the number of traffic classes when enabling XPS. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09i40e/i40evf: bundle more descriptors when allocating buffersJacob Keller2-2/+2
Double the number of descriptors we'll bundle into one tail bump when receiving. Empirical testing has shown that we reduce CPU utilization and don't appear to reduce throughput or packet rate. 32 seems to be the sweet spot, as it's half the default polling budget, so we'd essentially reduce from 4 tail writes when polling down to 2. Increasing this up to 64 appears to have negative impacts as it may become possible that we don't bump the tail each time we get polled, which could cause a long delay between returning descriptors to the hardware. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09i40e/i40evf: bump tail only in multiples of 8Jacob Keller2-0/+18
Hardware only fetches descriptors on cachelines of 8, essentially ignoring the lower 3 bits of the tail register. Thus, it is pointless to bump tail by an unaligned access as the hardware will ignore some of the new descriptors we allocated. Thus, it's ideal if we can ensure tail writes are always aligned to 8. At first, it seems like we'd already do this, since we allocate descriptors in batches which are a multiple of 8. Since we'd always increment by a multiple of 8, it seems like the value should always be aligned. However, this ignores allocation failures. If we fail to allocate a buffer, our tail register will become unaligned. Once it has become unaligned it will essentially be stuck unaligned until a buffer allocation happens to fail at the exact amount necessary to re-align it. We can do better, by simply rounding down the number of buffers we're about to allocate (cleaned_count) such that "next_to_clean + cleaned_count" is rounded to the nearest multiple of 8. We do this by calculating how far off that value is and subtracting it from the cleaned_count. This essentially defers allocation of buffers if they're going to be ignored by hardware anyways, and re-aligns our next_to_use and tail values after a failure to allocate a descriptor. This calculation ensures that we always align the tail writes in a way the hardware expects and don't unnecessarily allocate buffers which won't be fetched immediately. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09i40e: reduce lrxqthresh from 2 to 1Jacob Keller2-2/+2
The lrxq thresh value tells hardware to immediately interrupt when there are fewer than N*64 packets left in the ring. Counter intuitively, empirical testing has shown that decreasing this value from 2 to 1, and thus changing from an immediate interrupt at fewer than 128 descriptors down to 64 descriptors causes a small increase in the maximum total packets per second we can receive. This increase occurs even when we're polling with interrupts masked, as the hardware must still handle interrupts internally even if we've disabled them in software. Also reduce the value for any VFs we allocate. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09i40e/i40evf: always set the CLEARPBA flag when re-enabling interruptsJacob Keller5-18/+10
In the past we changed driver behavior to not clear the PBA when re-enabling interrupts. This change was motivated by the flawed belief that clearing the PBA would cause a lost interrupt if a receive interrupt occurred while interrupts were disabled. According to empirical testing this isn't the case. Additionally, the data sheet specifically says that we should set the CLEARPBA bit when re-enabling interrupts in a polling setup. This reverts commit 40d72a509862 ("i40e/i40evf: don't lose interrupts") Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09i40e/i40evf: fix incorrect default ITR values on driver loadJacob Keller4-8/+12
The ITR register expects to be programmed in units of 2 microseconds. Because of this, all of the drivers I40E_ITR_* constants are in terms of this 2 microsecond register. Unfortunately, the rx_itr_default value is expected to be programmed in microseconds. Effectively the driver defaults to an ITR value of half the expected value (in terms of minimum microseconds between interrupts). Fix this by changing the default values to be calculated using ITR_REG_TO_USEC macro which indicates that we're converting from the register units into microseconds. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09i40evf: fix mac filter removal timing issueAlan Brady1-0/+2
Due to the asynchronous nature in which mac filters are added and deleted, there exists a bug in which filters are erroneously removed if removed then added again quickly. The events are as such: - filter marked for removal - same filter is re-added before watchdog that cleans up filters - we skip re-adding the filter because we have it already in the list - watchdog filter cleanup kicks off and filter is removed So when we were re-adding the same filter, it didn't actually get added because it already existed in the list, but was marked for removal and had yet to actually be removed. This patch fixes the issue by making sure that when adding a filter, if we find it already existing in our list, make sure it is not marked to be removed. Signed-off-by: Alan Brady <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09i40e: use the safe hash table iterator when deleting mac filtersLihong Yang1-1/+2
This patch replaces hash_for_each function with hash_for_each_safe when calling __i40e_del_filter. The hash_for_each_safe function is the right one to use when iterating over a hash table to safely remove a hash entry. Otherwise, incorrect values may be read from freed memory. Detected by CoverityScan, CID 1402048 Read from pointer after free Signed-off-by: Lihong Yang <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09i40e: fix flags declarationJacob Keller1-1/+1
Since we don't yet have more than 32 flags, we'll use a u32 for both the hw_features and flag field. Should we gain more flags in the future, we may need to convert to a u64 or separate flags out into two fields. This was overlooked in the previous commit 2781de2134c4 ("i40e/i40evf: organize and re-number feature flags"), where the feature flag was not converted form u64 to u32. Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Mitch Williams <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09ixgbe: fix crash when injecting AER after failed resetEmil Tantilov1-0/+3
In case where AER recovery fails the device is left in a down state. Consecutive AER error injection can lead to a double IRQ free. Signed-off-by: Emil Tantilov <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09ixgbe: Update adaptive ITR algorithmAlexander Duyck3-55/+178
The following change is meant to update the adaptive ITR algorithm to better support the needs of the network. Specifically with this change what I have done is make it so that our ITR algorithm will try to prevent either starving a socket buffer for memory in the case of Tx, or overrunning an Rx socket buffer on receive. In addition a side effect of the calculations used is that we should function better with new features such as XDP which can handle small packets at high rates without needing to lock us into NAPI polling mode. Signed-off-by: Alexander Duyck <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09ixgbe: fix the FWSM.PT check in ixgbe_mng_present()Emil Tantilov1-2/+2
Bits other than FWSM.PT can be set in IXGBE_SWFW_MODE_MASK making the previous check invalid. Change the check for MNG present to be only based on FWSM.PT bit. Signed-off-by: Emil Tantilov <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09ixgbe: fix use of uninitialized paddingEmil Tantilov2-2/+4
This patch is resolving Coverity hits where padding in a structure could be used uninitialized. - Initialize fwd_cmd.pad/2 before ixgbe_calculate_checksum() - Initialize buffer.pad2/3 before ixgbe_hic_unlocked() Signed-off-by: Emil Tantilov <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09ixgbe: add counter for times Rx pages gets allocated, not recycledJesper Dangaard Brouer3-0/+7
The ixgbe driver have page recycle scheme based around the RX-ring queue, where a RX page is shared between two packets. Based on the refcnt, the driver can determine if the RX-page is currently only used by a single packet, if so it can then directly refill/recycle the RX-slot by with the opposite "side" of the page. While this is a clever trick, it is hard to determine when this recycling is successful and when it fails. Adding a counter, which is available via ethtool --statistics as 'alloc_rx_page'. Which counts the number of times the recycle fails and the real page allocator is invoked. When interpreting the stats, do remember that every alloc will serve two packets. The counter is collected per rx_ring, but is summed and ethtool exported as 'alloc_rx_page'. It would be relevant to know what rx_ring that cannot keep up, but that can be exported later if someone experience a need for this. Signed-off-by: Jesper Dangaard Brouer <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09ixgbe: split Tx/Rx ring clearing for ethtool loopback testEmil Tantilov1-19/+34
Commit: fed21bcee7a5 ("ixgbe: Don't bother clearing buffer memory for descriptor rings) exposed some issues with the logic in the current implementation of ixgbe_clean_test_rings() that are being addressed in this patch: - Split the clearing of the Tx and Rx rings in separate loops. Previously both Tx and Rx rings were cleared in a rx_desc->wb.upper.length based loop which could lead to issues if for w/e reason packets were received outside of the frames transmitted for the loopback test. - Add check for IXGBE_TXD_STAT_DD to avoid clearing the rings if the transmits have not comlpeted by the time we enter ixgbe_clean_test_rings() - Exit early on ixgbe_check_lbtest_frame() failure. This change fixes a crash during ethtool diagnostic (ethtool -t). Signed-off-by: Emil Tantilov <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09ixgbe: add error checks when initializing the PHYEmil Tantilov1-2/+6
Ignoring errors when attempting to identify the PHY can lead to a crash. Specifically in the case of FW controlled PHYs where the PHY read/write operations are set to NULL. Removed redundant comment. Signed-off-by: Emil Tantilov <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09ixgbe: restore normal RSS after last macvlan offload is removedShannon Nelson1-0/+11
Just like when the last VF is removed, we need to restore normal operations after the last macvlan offload is removed, else we get stuck in single queue operations. To test: ethtool -l eth1 # note the number of queues in use, ~= cpus ethtool -K eth1 l2-fwd-offload on ip link add mv1 link eth1 type macvlan mode bridge ip link set dev mv1 up ip link del mv1 ethtool -l eth1 # are we back to the same # of queues, or stuck on 1? Signed-off-by: Shannon Nelson <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09ixgbe: declare ixgbe_mac_operations structures as constBhumika Goyal1-2/+2
Declare ixgbe_mac_operations structures as const as they are only stored in the mac_ops field of ixgbe_info structure. This field is of type const and therefore ixgbe_mac_operations structure can be made const too. Signed-off-by: Bhumika Goyal <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09ixgbe: Clear SWFW_SYNC register during initEmil Tantilov1-7/+12
Added clearing of SW resource bits in the SW/FW synchronization register to ixgbe_init_swfw_sync_X540(). Updated ixgbe_acquire_swfw_sync_X540 SW Manageability host interface resource bit error case to match the error handling of the other SW resource bits. Which is to release the SW resource bits if SW times out while attempting to acquire the resource. This allows the driver to load in cases where the semaphore bits could be stuck after a reset or a crash. Signed-off-by: Emil Tantilov <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09ixgbe: incorrect XDP ring accounting in ethtool tx_frame paramJohn Fastabend1-8/+8
Changing the TX ring parameters with an XDP program attached may cause the XDP queues to be cleared and the TX rings to be incorrectly configured. Fix by doing correct ring accounting in setup call. Fixes: 33fdc82f0883 ("ixgbe: add support for XDP_TX action") Signed-off-by: John Fastabend <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09net: ixgbe: Use new PCI_DEV_FLAGS_NO_RELAXED_ORDERING flagDing Tianhong2-41/+0
The ixgbe driver use the compile check to determine if it can send TLPs to Root Port with the Relaxed Ordering Attribute set, this is too inconvenient, now the new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING has been added to the kernel and we could check the bit4 in the PCIe Device Control register to determine whether we should use the Relaxed Ordering Attributes or not, so use this new way in the ixgbe driver. Signed-off-by: Ding Tianhong <[email protected]> Acked-by: Emil Tantilov <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09Revert commit 1a8b6d76dc5b ("net:add one common config...")Ding Tianhong1-1/+1
The new flag PCI_DEV_FLAGS_NO_RELAXED_ORDERING has been added to indicate that Relaxed Ordering Attributes (RO) should not be used for Transaction Layer Packets (TLP) targeted toward these affected Root Port, it will clear the bit4 in the PCIe Device Control register, so the PCIe device drivers could query PCIe configuration space to determine if it can send TLPs to Root Port with the Relaxed Ordering Attributes set. With this new flag we don't need the config ARCH_WANT_RELAX_ORDER to control the Relaxed Ordering Attributes for the ixgbe drivers just like the commit 1a8b6d76dc5b ("net:add one common config...") did, so revert this commit. Signed-off-by: Ding Tianhong <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09ixgbe: fix masking of bits read from IXGBE_VXLANCTRL registerSabrina Dubroca1-1/+1
In ixgbe_clear_udp_tunnel_port(), we read the IXGBE_VXLANCTRL register and then try to mask some bits out of the value, using the logical instead of bitwise and operator. Fixes: a21d0822ff69 ("ixgbe: add support for geneve Rx offload") Signed-off-by: Sabrina Dubroca <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-10-09ixgbe: Return error when getting PHY address if PHY access is not supportedMark D Rustad1-0/+4
In cases where PHY register access is not supported, don't mislead a caller into thinking that it is supported by returning a PHY address. Instead, return -EOPNOTSUPP when PHY access is not supported. Signed-off-by: Mark Rustad <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>