aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel/i40e/i40e_txrx.h
AgeCommit message (Collapse)AuthorFilesLines
2017-10-06i40e/i40evf: use DECLARE_BITMAP for stateJesse Brandeburg1-1/+2
When using set_bit and friends, we should be using actual bitmaps, and fix all the locations where we might access it. Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-08-27i40e/i40evf: avoid dynamic ITR updates when polling or low packet rateJacob Keller1-0/+1
The dynamic ITR algorithm depends on a calculation of usecs which assumes that the interrupts have been firing constantly at the interrupt throttle rate. This is not guaranteed because we could have a low packet rate, or have been polling in software. We'll estimate whether this is the case by using jiffies to determine if we've been too long. If the time difference of jiffies is larger we are guaranteed to have an incorrect calculation. If the time difference of jiffies is smaller we might have been polling some but the difference shouldn't affect the calculation too much. This ensures that we don't get stuck in BULK latency during certain rare situations where we receive bursts of packets that force us into NAPI polling. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-08-27i40e/i40evf: remove ULTRA latency modeJacob Keller1-1/+0
Since commit c56625d59726 ("i40e/i40evf: change dynamic interrupt thresholds") a new higher latency ITR setting called I40E_ULTRA_LATENCY was added with a cryptic comment about how it was meant for adjusting Rx more aggressively when streaming small packets. This mode was attempting to calculate packets per second and then kick in when we have a huge number of small packets. Unfortunately, the ULTRA setting was kicking in for workloads it wasn't intended for including single-thread UDP_STREAM workloads. This wasn't caught for a variety of reasons. First, the ip_defrag routines were improved somewhat which makes the UDP_STREAM test still reasonable at 10GbE, even when dropped down to 8k interrupts a second. Additionally, some other obvious workloads appear to work fine, such as TCP_STREAM. The number 40k doesn't make sense for a number of reasons. First, we absolutely can do more than 40k packets per second. Second, we calculate the value inline in an integer, which sometimes can overflow resulting in using incorrect values. If we fix this overflow it makes it even more likely that we'll enter ULTRA mode which is the opposite of what we want. The ULTRA mode was added originally as a way to reduce CPU utilization during a small packet workload where we weren't keeping up anyways. It should never have been kicking in during these other workloads. Given the issues outlined above, let's remove the ULTRA latency mode. If necessary, a better solution to the CPU utilization issue for small packet workloads will be added in a future patch. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-08-25i40e: separate hw_features from runtime changing flagsJacob Keller1-1/+1
The number of flags found in pf->flags has grown quite large, and there are a lot of different types of flags. Most of the flags are simply hardware features which are enabled on some firmware or some MAC types. Other flags are dynamic run-time flags which enable or disable certain features of the driver. Separate these two types of flags into pf->hw_features and pf->flags. The hw_features list will contain a set of features which are enabled at init time. This will not contain toggles or otherwise dynamically changing features. These flags should not need atomic protections, as they will be set once during init and then be essentially read only. Everything else will remain in the flags variable. These flags may be modified at any time during run time. A future patch may wish to convert these flags into set_bit/clear_bit/test_bit or similar approach to ensure atomic correctness. The I40E_FLAG_MFP_ENABLED flag may be a good fit for hw_features but currently is used by ethtool in the private flags settings, and thus has been left as part of flags. Additionally, I40E_FLAG_DCB_CAPABLE may be a good fit for the hw_features but this patch has not tried to untangle it yet. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-08-25i40e/i40evf: adjust packet size to account for double VLANsMitch Williams1-0/+1
Now that the kernel supports double VLAN tags, we should at least play nice. Adjust the max packet size to account for two VLAN tags, not just one. Signed-off-by: Mitch Williams <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-06-20i40e: add support for XDP_TX actionBjörn Töpel1-0/+11
This patch adds proper XDP_TX action support. For each Tx ring, an additional XDP Tx ring is allocated and setup. This version does the DMA mapping in the fast-path, which will penalize performance for IOMMU enabled systems. Further, debugfs support is not wired up for the XDP Tx rings. Signed-off-by: Björn Töpel <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-06-20i40e: add XDP support for pass and drop actionsBjörn Töpel1-0/+1
This commit adds basic XDP support for i40e derived NICs. All XDP actions will end up in XDP_DROP. Signed-off-by: Björn Töpel <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-04-08i40e/i40evf: Add support for padding start of framesAlexander Duyck1-1/+69
This patch adds padding to the start of frames to make room for headroom for us to eventually start using build_skb. Right now we guarantee at least NET_SKB_PAD + NET_IP_ALIGN, however we allocate more space if more is available. For example on x86 the headroom should be 192 bytes. On systems that have too large of a cache line size to support storing 1.5K padding and shared info we default to using 3K buffers and reserve everything that isn't used for skb_shared_info or the data buffer for headroom. Change-ID: I33c641c9a1ea10cf7cc484c2d20985368d2d709a Signed-off-by: Alexander Duyck <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-04-08i40e/i40evf: Add support for using order 1 pages with a 3K bufferAlexander Duyck1-0/+12
There are situations where adding padding to the front and back of an Rx buffer will require that we add additional padding. Specifically if NET_IP_ALIGN is non-zero, or the MTU size is larger than 7.5K we would need to use 2K buffers which leaves us with no room for the padding. To preemptively address these cases I am adding support for 3K buffers to the Rx path so that we can provide the additional padding needed in the event of NET_IP_ALIGN being non-zero or a cache line being greater than 64. Change-ID: I938bc1ba611285428df39a613cd66f98e60b55c7 Signed-off-by: Alexander Duyck <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-04-08i40e: Simplify i40e_detect_recover_hung_queue logicAlan Brady1-2/+1
This patch greatly reduces the unneeded complexity in the i40e_detect_recover_hung_queue code path. The previous implementation set a 'hung bit' which would then get cleared while polling. If the detection routine was called a second time with the bit already set, we would issue a software interrupt. This patch makes it such that if interrupts are disabled and we have pending TX descriptors, we trigger a software interrupt since in, the worst case, queues are already clean and we have an extra interrupt. Additionally this patch removes the workaround for lost interrupts as calling napi_reschedule in this context can cause software interrupts to fire on the wrong CPU. Change-ID: Iae108582a3ceb6229ed1d22e4ed6e69cf97aad8d Signed-off-by: Alan Brady <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-03-29i40e/i40evf: Change the way we limit the maximum frame size for RxAlexander Duyck1-3/+1
This patch changes the way we handle the maximum frame size for the Rx path. Previously we were rounding up to 2K for a 1500 MTU and then brining the max frame size down to MTU plus a fixed amount. With this patch applied what we now do is limit the maximum frame to 1.5K minus the value for NET_IP_ALIGN for standard MTU, and for any MTU greater than 1500 we allow up to the maximum frame size. This makes the behavior more consistent with the other drivers such as igb which had similar logic. In addition it reduces the test matrix for MTU since we only have two max frame sizes that are handled for Rx now. Change-ID: I23a9d3c857e7df04b0ef28c64df63e659c013f3f Signed-off-by: Alexander Duyck <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-03-27i40e: Drop FCoE code from core driver filesAlexander Duyck1-17/+0
Looking over the code for FCoE it looks like the Rx path has been broken at least since the last major Rx refactor almost a year ago. It seems like FCoE isn't supported for any of the Fortville/Fortpark hardware so there isn't much point in carrying the code around, especially if it is broken and untested. Change-ID: I892de8fa551cb129ce2361e738ff82ce55fa229e Signed-off-by: Alexander Duyck <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-03-27i40e/i40evf: Update code to better handle incrementing page countAlexander Duyck1-1/+6
Update the driver code so that we do bulk updates of the page reference count instead of just incrementing it by one reference at a time. The advantage to doing this is that we cut down on atomic operations and this in turn should give us a slight improvement in cycles per packet. In addition if we eventually move this over to using build_skb the gains will be more noticeable. I also found and fixed a store forwarding stall from where we were assigning "*new_buff = *old_buff". By breaking it up into individual copies we can avoid this and as a result the performance is slightly improved. Change-ID: I1d3880dece4133eca3c32423b04a5467321ccc52 Signed-off-by: Alexander Duyck <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-03-15i40e/i40evf: Add support for mapping pages with DMA attributesAlexander Duyck1-0/+3
This patch adds support for DMA_ATTR_SKIP_CPU_SYNC and DMA_ATTR_WEAK_ORDERING. By enabling both of these for the Rx path we are able to see performance improvements on architectures that implement either one due to the fact that page mapping and unmapping only has to sync what is actually being used instead of the entire buffer. In addition by enabling the weak ordering attribute enables a performance improvement for architectures that can associate a memory ordering with a DMA buffer such as Sparc. Change-ID: If176824e8231c5b24b8a5d55b339a6026738fc75 Signed-off-by: Alexander Duyck <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-02-11i40e/i40evf: Moves skb from i40e_rx_buffer to i40e_ringScott Peterson1-1/+8
This patch reduces the size of struct i40e_rx_buffer by one pointer, and makes the i40e driver a little more consistent with the igb driver in terms of packets that span buffers. We do this by moving the skb field from struct i40e_rx_buffer to struct i40e_ring. We pass the skb we already have (or NULL if we don't) to i40e_fetch_rx_buffer(), which skips the skb allocation if we already have one for this packet. Change-ID: I4ad48a531844494ba0c5d8e1a62209a057f661b0 Signed-off-by: Scott Peterson <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-02-02i40e: refactor macro INTRL_USEC_TO_REGAlan Brady1-1/+14
This patch refactors the macro INTRL_USEC_TO_REG into a static inline function and fixes a couple subtle bugs caused by the macro. This patch fixes a bug which was caused by passing a bad register value to the firmware. If enabling interrupt rate limiting, a non-zero value for the rate limit must be used. Otherwise the firmware sets the interrupt rate limit to the maximum value. Due to the limited resolution of the register, attempting to set a value of 1, 2, or 3 would be rounded down to 0 and limiting was left enabled, causing unexpected behavior. This patch also fixes a possible bug in which using the macro itself can introduce unintended side-affects because the macro argument is used more than once in the macro definition (e.g. a variable post-increment argument would perform a double increment on the variable). Without this patch, attempting to set interrupt rate limits of 1, 2, or 3 results in unexpected behavior and future use of this macro could cause subtle bugs. Change-Id: I83ac842de0ca9c86761923d6e3a4d7b1b95f2b3f Signed-off-by: Alan Brady <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2016-12-06i40e: simplify txd use count calculationMitch Williams1-17/+28
The i40e_txd_use_count function was fast but confusing. In the comments, it even admits that it's ugly. So replace it with a new function that is (very) slightly faster and has extensive commenting to help the thicker among us (including the author, who will forget in a week) understand how it works. Change-ID: Ifb533f13786a0bf39cb29f77969a5be2c83d9a87 Signed-off-by: Mitch Williams <[email protected]> Signed-off-by: Alexander Duyck <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2016-10-31i40e: Reorder logic for coalescing RS bitsAlexander Duyck1-1/+0
This patch reorders the logic at the end of i40e_tx_map to address the fact that the logic was rather convoluted and much larger than it needed to be. In order to try and coalesce the code paths I have updated some of the comments and repurposed some of the variables in order to reduce unnecessary overhead. This patch does the following: 1. Quit tracking skb->xmit_more with a flag, just max out packet_stride 2. Drop tail_bump and do_rs and instead just use desc_count and td_cmd 3. Pull comments from ixgbe that make need for wmb() more explicit. Change-ID: Ic7da85ec75043c634e87fef958109789bcc6317c Signed-off-by: Alexander Duyck <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2016-10-31i40e: replace PTP Rx timestamp hang logicJacob Keller1-2/+0
The current Rx timestamp hang logic is not very robust because it does not notice a register is hung until all four timestamps have been latched and we wait a full 5 seconds. Replace this logic with a newer Rx hang detection based on storing the jiffies when we first notice a receive timestamp event. We store each register's time separately, along with a flag indicating if it is currently latched. Upon first transitioning to latch, we will update the latch_events[i] jiffies value. This indicates the time we first noticed this event. The watchdog routine will simply check that the either the flag has been cleared, or we have passed at least one second. In this case, it is able to clear the Rx timestamp register under the assumption that it was for a dropped frame. The benefit if this strategy is that we should be able to detect and clear out stalled RXTIME_H registers before we exhaust the supply of 4, and avoid complete stall of Rx timestamp events. Change-ID: Id55458c0cd7a5dd0c951ff2b8ac0b2509364131f Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2016-09-24i40e/i40evf: Add txring_txq function to match fm10k and ixgbeAlexander Duyck1-0/+9
This patch adds a txring_txq function which allows us to convert a i40e_ring/i40evf_ring to a netdev_tx_queue structure. This way we can avoid having to make a multi-line function call for all the spots that need access to this. Change-ID: Ic063b71d8b92ea406d2c32e798c8e2b02809d65b Signed-off-by: Alexander Duyck <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2016-05-05i40e/i40evf: Remove unused hardware receive descriptor codeJesse Brandeburg1-14/+10
The hardware supports a 16 byte descriptor for receive, but the driver was never using it in production. There was no performance benefit to the real driver of 16 byte descriptors, so drop a whole lot of complexity while getting rid of the code. Also since the previous patch made us use no-split mode all the time, drop any support in the driver for any other value in dtype and assume it is always zero (aka no-split). Hooray for code removal! Change-ID: I2257e902e4dad84a07b94db6d2e6f4ce69b27bc0 Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2016-05-05i40e: Refactor receive routineJesse Brandeburg1-12/+25
This is part 1 of the Rx refactor series, just including changes to i40e. This refactor aligns the receive routine with the one in ixgbe which was highly optimized. This reduces the code we have to maintain and allows for (hopefully) more readable and maintainable RX hot path. In order to do this: - consolidate the receive path into a single function that doesn't use packet split but *does* use pages for Rx buffers. - remove the old _1buf routine - consolidate several routines into helper functions - remove ethtool control over packet split Change-ID: I5ca100721de65992aa0114f8b4bac844b84758e0 Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2016-05-05i40e/i40evf: Remove reference to ring->dtypeJesse Brandeburg1-1/+0
As part of the rx-refactor, the dtype variable in the i40e_ring struct is no longer used, so remove it. Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2016-05-05i40e: Drop packet split receive routineJesse Brandeburg1-7/+0
As part of preparation for the rx-refactor, remove the packet split receive routine and ancillary code. Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2016-04-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-3/+7
Conflicts were two cases of simple overlapping changes, nothing serious. In the UDP case, we need to add a hlist_add_tail_rcu() to linux/rculist.h, because we've moved UDP socket handling away from using nulls lists. Signed-off-by: David S. Miller <[email protected]>
2016-04-13i40e/i40evf: Limit TSO to 7 descriptors for payload instead of 8 per packetAlexander Duyck1-3/+7
This patch addresses a bug introduced based on my interpretation of the XL710 datasheet. Specifically section 8.4.1 states that "A single transmit packet may span up to 8 buffers (up to 8 data descriptors per packet including both the header and payload buffers)." It then later goes on to say that each segment for a TSO obeys the previous rule, however it then refers to TSO header and the segment payload buffers. I believe the actual limit for fragments with TSO and a skbuff that has payload data in the header portion of the buffer is actually only 7 fragments as the skb->data portion counts as 2 buffers, one for the TSO header, and one for a segment payload buffer. Fixes: 2d37490b82af ("i40e/i40evf: Rewrite logic for 8 descriptor per packet check") Reported-by: Sowmini Varadhan <[email protected]> Signed-off-by: Alexander Duyck <[email protected]> Acked-by: Jesse Brandeburg <[email protected]> Tested-by: Sowmini Varadhan <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2016-04-06i40e/i40evf: Faster RX via avoiding FCoEJesse Brandeburg1-0/+10
As it turns out, calling into other files from hot path hurts performance a lot. In this case the majority of the time we call "check FCoE" and the packet is *not* FCoE, but this call was taking 5% of our total cycles spent on receive. Change-ID: I080552c26e7060bc7b78504dc2763f6f0b3d8c76 Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2016-04-05i40e/i40evf: Allow up to 12K bytes of data per Tx descriptor instead of 8KAlexander Duyck1-3/+32
From what I can tell the practical limitation on the size of the Tx data buffer is the fact that the Tx descriptor is limited to 14 bits. As such we cannot use 16K as is typically used on the other Intel drivers. However artificially limiting ourselves to 8K can be expensive as this means that we will consume up to 10 descriptors (1 context, 1 for header, and 9 for payload, non-8K aligned) in a single send. I propose that we can reduce this by increasing the maximum data for a 4K aligned block to 12K. We can reduce the descriptors used for a 32K aligned block by 1 by increasing the size like this. In addition we still have the 4K - 1 of space that is still unused. We can use this as a bit of extra padding when dealing with data that is not aligned to 4K. By aligning the descriptors after the first to 4K we can improve the efficiency of PCIe accesses as we can avoid using byte enables and can fetch full TLP transactions after the first fetch of the buffer. This helps to improve PCIe efficiency. Below is the results of testing before and after with this patch: Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % U us/KB us/KB Before: 87380 16384 16384 10.00 33682.24 20.27 -1.00 0.592 -1.00 After: 87380 16384 16384 10.00 34204.08 20.54 -1.00 0.590 -1.00 So the net result of this patch is that we have a small gain in throughput due to a reduction in overhead for putting together the frame. Signed-off-by: Alexander Duyck <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2016-02-19i40e: queue-specific settings for interrupt moderationKan Liang1-0/+8
For i40e driver, each vector has its own ITR register. However, there are no concept of queue-specific settings in the driver proper. Only global variable is used to store ITR values. That will cause problems especially when resetting the vector. The specific ITR values could be lost. This patch move rx_itr_setting and tx_itr_setting to i40e_ring to store specific ITR register for each queue. i40e_get_coalesce and i40e_set_coalesce are also modified accordingly to support queue-specific settings. To make it compatible with old ethtool, if user doesn't specify the queue number, i40e_get_coalesce will return queue 0's value. While i40e_set_coalesce will apply value to all queues. Signed-off-by: Kan Liang <[email protected]> Acked-by: Shannon Nelson <[email protected]> Acked-by: Jeff Kirsher <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-02-18i40e/i40evf: Rewrite logic for 8 descriptor per packet checkAlexander Duyck1-0/+19
This patch is meant to rewrite the logic for how we determine if we can transmit the frame or if it needs to be linearized. The previous code for this function was using a mix of division and modulus division as a part of computing if we need to take the slow path. Instead I have replaced this by simply working with a sliding window which will tell us if the frame would be capable of causing a single packet to span several descriptors. The logic for the scan is fairly simple. If any given group of 6 fragments is less than gso_size - 1 then it is possible for us to have one byte coming out of the first fragment, 6 fragments, and one or more bytes coming out of the last fragment. This gives us a total of 8 fragments which exceeds what we can allow so we send such frames to be linearized. Arguably the use of modulus might be more exact as the approach I propose may generate some false positives. However the likelihood of us taking much of a hit for those false positives is fairly low, and I would rather not add more overhead in the case where we are receiving a frame composed of 4K pages. Signed-off-by: Alexander Duyck <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2016-02-18i40e/i40evf: Break up xmit_descriptor_count from maybe_stop_txAlexander Duyck1-2/+42
In an upcoming patch I would like to have access to the descriptor count used for the data portion of the frame. For this reason I am splitting up the descriptor count function from the function that stops the ring. Also in order to try and reduce unnecessary duplication of code I am moving the slow-path portions of the code out of being inline calls so that we can just jump to them and process them instead of having to build them into each function that calls them. Signed-off-by: Alexander Duyck <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2016-02-18i40e/i40evf: Add exception handling for Tx checksumAlexander Duyck1-1/+0
Add exception handling to the Tx checksum path so that we can handle cases of TSO where the frame is bad, or Tx checksum where we didn't recognize a protocol Drop I40E_TX_FLAGS_CSUM as it is unused, move the CHECKSUM_PARTIAL check into the function itself so that we can decrease indent. Signed-off-by: Alexander Duyck <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2016-02-18i40e/i40evf: Drop outer checksum offload that was not requestedAlexander Duyck1-1/+0
The i40e and i40evf drivers contained code for inserting an outer checksum on UDP tunnels. The issue however is that the upper levels of the stack never requested such an offload and it results in possible errors. In addition the same logic was being applied to the Rx side where it was attempting to validate the outer checksum, but the logic there was incorrect in that it was testing for the resultant sum to be equal to the header checksum instead of being equal to 0. Since this code is so massively flawed, and doing things that we didn't ask for it to do I am just dropping it, and will bring it back later to use as an offload for SKB_GSO_UDP_TUNNEL_CSUM which can make use of such a feature. As far as the Rx feature I am dropping it completely since it would need to be massively expanded and applied to IPv4 and IPv6 checksums for all parts, not just the one that supports Tx checksum offload for the outer. Signed-off-by: Alexander Duyck <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2016-02-17i40e: Add a SW workaround for lost interruptsAnjali Singhai Jain1-1/+2
This patch adds a workaround for cases where we might have interrupts that got lost but WB happened. If that happens without this patch we will see a tx_timeout. To work around it, this patch goes ahead and reschedules NAPI in that situation, if NAPI is not already scheduled. We also add a counter in ethtool to keep track of when we detect a case of tx_lost_interrupt. Note: napi_reschedule() can be safely called from process/service_task context and is done in other drivers as well without an issue. Change-ID: I00f98f1ce3774524d9421227652bef20fcbd0d20 Signed-off-by: Anjali Singhai Jain <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2016-02-17i40e: properly show packet split status in debugfsMitch Williams1-1/+0
Get rid of the unused hsplit field in the ring struct and use the existing macro to detect packet split enablement. This allows debugfs dumps of the VSI to properly show which Rx routine is in use. Change-ID: Ic4e9589e6a788ab196ed0850703f704e30c03781 Signed-off-by: Mitch Williams <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2016-02-17i40e/i40evf: use pages correctly in RxMitch Williams1-0/+2
Refactor the packet split Rx code to properly use half-pages for receives. The previous code was doing way more mapping and unmapping than it needed to, and wasn't properly using half-pages. Increment the page use count each time we give a half-page to an skb, knowing that the stack will probably process and release the page before we need it again. Only free and reallocate pages if the count shows that both half-pages are in use. Add counters to track reallocations and page reuse. Change-ID: I534b299196036b64be82b4861a0a4036310a8f22 Signed-off-by: Mitch Williams <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2016-02-17i40e/i40evf: try again after failureJesse Brandeburg1-3/+3
This is the "Don't Give Up" patch. Previously the driver could fail an allocation, and then possibly stall a queue forever, by never coming back to continue receiving or allocating buffers. With this patch, the driver will keep polling trying to allocate receive buffers until it succeeds. This should keep all receive queues running even in the face of memory pressure. Also update copyright year in file header. Change-ID: I2b103d1ce95b9831288a7222c3343ffa1988b81b Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2015-12-16i40e: geneve tunnel offload supportSinghai, Anjali1-1/+1
This patch adds driver hooks to implement ndo_ops to add/del udp port in the HW to identify GENEVE tunnels. Signed-off-by: Anjali Singhai Jain <[email protected]> Signed-off-by: Kiran Patil <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-11-25i40e/i40evf: Add a stat to track how many times we have to do a force WBAnjali Singhai Jain1-0/+1
When in NAPI with interrupts disabled, the HW needs to be forced to do a write back on TX if the number of descriptors pending are less than a cache line. This stat helps keep track of how many times we get into this situation. Change-ID: I76c1bcc7ebccd6bffcc5aa33bfe05f2fa1c9a984 Signed-off-by: Anjali Singhai Jain <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2015-10-19i40e/i40evf: adjust interrupt throttle less frequentlyJesse Brandeburg1-2/+2
The adaptive ITR (interrupt throttle rate) algorithm was adjusting the hardware's interrupt rate too frequently. This caused a lot of variation in the interrupt rate for fairly constant workloads. Change the code to have a counter and adjust only once every N number of interrupts. Change-ID: I0460f1f86571037484eca5aca36ac4d889cb8389 Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2015-10-19i40e/i40evf: change dynamic interrupt thresholdsJesse Brandeburg1-0/+3
The dynamic algorithm, while now working, doesn't have good performance in 40G mode. One part of this patch addresses the high CPU utilization of some small streaming workloads that the driver should reduce CPU in. It also changes the minimum ITR that the dynamic algorithm will settle on, causing our minimum latency to go from 12us to about 14us, when using adaptive mode. It also changes the BULK interrupt rate to allow maximum throughput on a 40Gb connection with a single thread of transmit, clamping interrupt rate to 8000 for TX makes single thread traffic go too slow. The new ULTRA bulk setting is introduced and is used when the Rx packet rate on this queue exceeds 40000 packets per second. This value of 40000 was chosen because the automatic tuning of minimum ITR=20us means that a single queue can't quite achieve that many packets per second from a round-robin test. Change-ID: Icce8faa128688ca5fd2c4229bdd9726877a92ea2 Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2015-10-16i40e/i40evf: moderate interrupts differentlyJesse Brandeburg1-0/+10
The XL710 hardware has a different interrupt moderation design that can support a limit of total interrupts per second per vector, in addition to the "number of interrupts per second" controls already established in the driver. This combination of hardware features allows us to set very low default latency settings but minimize the total CPU utilization by not making too many interrupts, should the user desire. The current driver implementation is still enabling the dynamic moderation in the driver, and only using the rx/tx-usecs limit in ethtool to limit the interrupt rate per second, by default. The new code implemented in this patch 2) adds init/use of the new "Interrupt Limit" register 3) adds ethtool knob to control/report the limits above Usage is ethtool -C ethx rx-usecs-high <value> Where <value> is number of microseconds to create a rate of 1/N interrupts per second, regardless of rx-usecs or tx-usecs values. Since there is a credit based scheme in the hardware, the rx-usecs and tx-usecs can be configured for very low latency for short bursts, but once the credit runs out the refill rate on the credits is limited by rx-usecs-high. Change-ID: I3a1075d3296123b0f4f50623c779b027af5b188d Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2015-10-08i40e/i40evf: clean up some codeJesse Brandeburg1-0/+1
Add missings spaces after declarations, remove another __func__ use, remove uncessary braces, remove unneeded breaks, and useless returns, and generally fix up some code. Change-ID: Ie715d6b64976c50e1c21531685fe0a2bd38c4244 Signed-off-by: Jesse Brandeburg <[email protected]> Signed-off-by: Shannon Nelson <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2015-10-07i40e/i40evf: Add a stat to keep track of linearization countAnjali Singhai Jain1-0/+1
Keep track of how many times we ask the stack to linearize the skb because the HW cannot handle skbs with more than 8 frags per segment/single packet. Change-ID: If455452060963a769bbe6112cba952e79e944b52 Signed-off-by: Anjali Singhai Jain <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2015-09-30i40e: fix 32 bit build warningsJesse Brandeburg1-6/+6
Sparse found some issues with 32 bit compilation, which probably should at least work without warning. Not only that, but the code was wrong. Thanks sparse!! And thanks to the kbuild robot zero day testing for finding this issue. $ make ARCH=i386 M=drivers/net/ethernet/intel/i40e C=2 CF="-D__CHECK_ENDIAN__" CHECK drivers/net/ethernet/intel/i40e/i40e_main.c include/linux/etherdevice.h:79:32: warning: restricted __be16 degrades to integer drivers/net/ethernet/intel/i40e/i40e_main.c:7565:17: warning: shift too big (32) for type unsigned long drivers/net/ethernet/intel/i40e/i40e_main.c:7565:17: warning: shift too big (42) for type unsigned long drivers/net/ethernet/intel/i40e/i40e_main.c:7565:17: warning: shift too big (39) for type unsigned long drivers/net/ethernet/intel/i40e/i40e_main.c:7565:17: warning: shift too big (40) for type unsigned long CC: [email protected] Signed-off-by: Jesse Brandeburg <[email protected]> Reported-by: kbuild test robot <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2015-09-28i40e: Fix RS bit update in Tx path and disable force WB workaroundAnjali Singhai1-0/+2
This patch fixes the issue of forcing WB too often causing us to not benefit from NAPI. Without this patch we were forcing WB/arming interrupt too often taking away the benefits of NAPI and causing a performance impact. With this patch we disable force WB in the clean routine for X710 and XL710 adapters. X722 adapters do not enable interrupt to force a WB and benefit from WB_ON_ITR and hence force WB is left enabled for those adapters. For XL710 and X710 adapters if we have less than 4 packets pending a software Interrupt triggered from service task will force a WB. This patch also changes the conditions for setting RS bit as described in code comments. This optimizes when the HW does a tail bump and when it does a WB. It also optimizes when we do a wmb. Signed-off-by: Anjali Singhai Jain <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2015-09-28i40e/i40evf: refactor tx timeout logicKiran Patil1-8/+2
This patch modifies the driver timeout logic by issuing a writeback request via a software interrupt to the hardware the first time the driver detects a hang. The driver was too aggressive in resetting a hung queue, so back that off by removing logic to down the netdevice after too many hangs, and move the function to the service task. Change-ID: Ife100b9d124cd08cbdb81ab659008c1b9abbedea Signed-off-by: Kiran Patil <[email protected]> Signed-off-by: Shannon Nelson <[email protected]> Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2015-09-28i40e: Move i40e_get_head into header fileKiran Patil1-0/+14
i40e_get_head needs to be called in multiple files in a further patch, prepare by moving the function into a header file. Signed-off-by: Kiran Patil <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2015-08-05i40e/i40evf: Add TX/RX outer UDP checksum support for X722Anjali Singhai Jain1-0/+2
X722 supports offloading of outer UDP TX and RX checksum for tunneled packets. This patch exposes the support and leaves it enabled by default. Signed-off-by: Anjali Singhai Jain <[email protected]> Signed-off-by: Catherine Sullivan <[email protected]> Tested-by: Jim Young <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2015-08-05i40e/i40evf: Add support for writeback on ITR feature for X722Anjali Singhai Jain1-0/+2
X722 fixes an issue from X710 where TX descriptor WB would not happen if the interrupts were disabled. In order for the write backs to happen a bit needs to be set in the dynamic interrupt control register called WB_ON_ITR. With this feature, the SW driver need not arm SW interrupts to work around the issue in X710. Signed-off-by: Anjali Singhai Jain <[email protected]> Signed-off-by: Catherine Sullivan <[email protected]> Tested-by: Jim Young <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>