path: root/drivers/net
Age, Commit message, Author, Files, Lines
2018-05-28virtio_net: Extend virtio to use VF datapath when availableSridhar Samudrala2-1/+38
This patch enables virtio_net to switch over to a VF datapath when STANDBY feature is enabled and a VF netdev is present with the same MAC address. It allows live migration of a VM with a direct attached VF without the need to set up a bond/team between a VF and virtio net device in the guest. It uses the API that is exported by the net_failover driver to create and destroy a master failover netdev. When STANDBY feature is enabled, an additional netdev (failover netdev) is created that acts as a master device and tracks the state of the 2 lower netdevs. The original virtio_net netdev is marked as 'standby' netdev and a passthru device with the same MAC is registered as 'primary' netdev. The hypervisor needs to unplug the VF device from the guest on the source host and reset the MAC filter of the VF to initiate failover of datapath to virtio before starting the migration. After the migration is completed, the destination hypervisor sets the MAC filter on the VF and plugs it back to the guest to switch over to VF datapath. This patch is based on the discussion initiated by Jesse on this thread. https://marc.info/?l=linux-virtualization&m=151189725224231&w=2 Signed-off-by: Sridhar Samudrala <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-05-28virtio_net: Introduce VIRTIO_NET_F_STANDBY feature bitSridhar Samudrala1-1/+1
This feature bit can be used by hypervisor to indicate virtio_net device to act as a standby for another device with the same MAC address. VIRTIO_NET_F_STANDBY is defined as bit 62 as it is a device feature bit. Signed-off-by: Sridhar Samudrala <[email protected]> Signed-off-by: David S. Miller <[email protected]>
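As a small illustration of what "defined as bit 62" means in practice, the sketch below tests that bit in a 64-bit feature mask. The helper name `has_standby` and the sample masks are hypothetical; only the bit position comes from the commit message.

```python
# Bit position taken from the commit message; everything else is illustrative.
VIRTIO_NET_F_STANDBY = 62


def has_standby(features: int) -> bool:
    """Return True if bit 62 is set in a 64-bit feature mask (hypothetical helper)."""
    return bool((features >> VIRTIO_NET_F_STANDBY) & 1)


# Made-up feature masks for demonstration only.
offered_with_standby = (1 << VIRTIO_NET_F_STANDBY) | 0x1
offered_without_standby = 0x1

print(has_standby(offered_with_standby))
print(has_standby(offered_without_standby))
```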
2018-05-28net: Introduce net_failover driverSridhar Samudrala3-0/+849
The net_failover driver provides an automated failover mechanism via APIs to create and destroy a failover master netdev and manages the primary and standby slave netdevs that get registered via the generic failover infrastructure. The failover netdev acts as a master device and controls 2 slave devices. The original paravirtual interface gets registered as 'standby' slave netdev and a passthru/vf device with the same MAC gets registered as 'primary' slave netdev. Both 'standby' and 'failover' netdevs are associated with the same 'pci' device. The user accesses the network interface via 'failover' netdev. The 'failover' netdev chooses 'primary' netdev as default for transmits when it is available with link up and running. This can be used by paravirtual drivers to enable an alternate low latency datapath. It also enables hypervisor controlled live migration of a VM with direct attached VF by failing over to the paravirtual datapath when the VF is unplugged. Signed-off-by: Sridhar Samudrala <[email protected]> Signed-off-by: David S. Miller <[email protected]>
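The transmit-path choice described above can be modelled in a few lines. This is a toy sketch, not the driver's code: the `Slave` type and `pick_xmit_dev` name are hypothetical, and the fallback to the standby device (and returning `None` when neither is usable) is an assumption about behaviour the message only implies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slave:
    """Toy stand-in for a lower netdev (hypothetical type)."""
    name: str
    link_up: bool
    running: bool


def usable(dev: Optional[Slave]) -> bool:
    # "available with link up and running", per the commit message
    return dev is not None and dev.link_up and dev.running


def pick_xmit_dev(primary: Optional[Slave], standby: Optional[Slave]) -> Optional[Slave]:
    """Prefer the primary (VF) device; otherwise fall back to the standby.

    The fallback order is an assumption made for this sketch.
    """
    if usable(primary):
        return primary
    if usable(standby):
        return standby
    return None


vf = Slave("vf", link_up=True, running=True)
virtio = Slave("virtio", link_up=True, running=True)
print(pick_xmit_dev(vf, virtio).name)    # VF present: primary is chosen
print(pick_xmit_dev(None, virtio).name)  # VF unplugged: standby is chosen
```

Unplugging the VF before migration corresponds to the `primary=None` case, which is why the datapath moves to the paravirtual device.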
2018-05-28netvsc: refactor notifier/event handling code to use the failover frameworkSridhar Samudrala3-165/+60
Use the registration/notification framework supported by the generic failover infrastructure. Signed-off-by: Sridhar Samudrala <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-05-28vrf: add CRC32c offload to device featuresDavide Caratti1-1/+1
SCTP sockets originated in a VRF can improve their performance if CRC32c computation is delegated to underlying devices: update device features, setting NETIF_F_SCTP_CRC. Iterating the following command in the topology proposed with [1], # ip vrf exec vrf-h2 netperf -H 192.0.2.1 -t SCTP_STREAM -- -m 10K the measured throughput in Mbit/s improved from 2395 ± 1% to 2720 ± 1%. [1] https://www.spinics.net/lists/netdev/msg486007.html Signed-off-by: Davide Caratti <[email protected]> Reviewed-by: Marcelo Ricardo Leitner <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-05-28net: stmmac: Use mutex instead of spinlockThierry Reding3-24/+21
Some drivers, such as DWC EQOS on Tegra, need to perform operations that can sleep under this lock (clk_set_rate() in tegra_eqos_fix_speed()) for proper operation. Since there is no need for this lock to be a spinlock, convert it to a mutex instead. Fixes: e6ea2d16fc61 ("net: stmmac: dwc-qos: Add Tegra186 support") Reported-by: Jon Hunter <[email protected]> Signed-off-by: Thierry Reding <[email protected]> Tested-by: Bhadram Varka <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-05-28bnx2x: Collect the device debug information during Tx timeout.Sudarsana Reddy Kalluru1-1/+6
Tx-timeout mostly happens due to some issue in the device. In such cases, debug dump would be helpful for identifying the cause of the issue. This patch adds support to spill debug data during the Tx timeout. Here bnx2x_panic_dump() API is used instead of bnx2x_panic(), since we still want to allow the Tx-timeout recovery a chance to succeed. Changes from previous version: ------------------------------- v2: Fixed a coding error. Please consider applying this to "net-next". Signed-off-by: Sudarsana Reddy Kalluru <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-05-28x86/pci-dma: remove the experimental forcesac boot optionChristoph Hellwig2-6/+4
Limiting the dma mask to avoid PCI (pre-PCIe) DAC cycles while paying the huge overhead of an IOMMU is rather pointless, and this seriously gets in the way of dma mapping work. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]>
2018-05-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller19-90/+133
Lots of easy overlapping changes in the conflict resolutions here. Signed-off-by: David S. Miller <[email protected]>
2018-05-26net: convert datagram_poll users tp ->poll_maskChristoph Hellwig1-1/+1
Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Greg Kroah-Hartman <[email protected]>
2018-05-26net: remove sock_no_pollChristoph Hellwig1-1/+0
Now that sock_poll handles a NULL ->poll or ->poll_mask there is no need for a stub. Signed-off-by: Christoph Hellwig <[email protected]>
2018-05-25net/mlx5e: Avoid reset netdev stats on configuration changesEran Ben Elisha8-100/+136
Move all RQ, SQ and channel counters from the channel objects into the priv structure. With this change, counters will not be reset upon channel configuration changes. Channel's statistics for SQs which are associated with TCs higher than zero will be presented in ethtool -S, only for SQs which were opened at least once since the module was loaded (regardless of their open/close current status). This is done in order to decrease the total amount of statistics presented and calculated for the common out of box use (no QoS). mlx5e_channel_stats is a compound of CH,RQ,SQs stats in order to create locality for the NAPI when handling TX and RX of the same channel. Align the new statistics struct per ring to avoid several channels update to the same cache line at the same time. Packet rate was tested, no degradation sensed. Signed-off-by: Eran Ben Elisha <[email protected]> CC: Qing Huang <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-05-25ixgbe: Report PCIe link properties with pcie_print_link_status()Bjorn Helgaas1-46/+1
Previously the driver used pcie_get_minimum_link() to warn when the NIC is in a slot that can't supply as much bandwidth as the NIC could use. pcie_get_minimum_link() can be misleading because it finds the slowest link and the narrowest link (which may be different links) without considering the total bandwidth of each link. For a path with a 16 GT/s x1 link and a 2.5 GT/s x16 link, it returns 2.5 GT/s x1, which corresponds to 250 MB/s of bandwidth, not the true available bandwidth of about 1969 MB/s for a 16 GT/s x1 link. Use pcie_print_link_status() to report PCIe link speed and possible limitations instead of implementing this in the driver itself. This finds the slowest link in the path to the device by computing the total bandwidth of each link and compares that with the capabilities of the device. The dmesg change is: - PCI Express bandwidth of %dGT/s available - (Speed:%s, Width: x%d, Encoding Loss:%s) + %u.%03u Gb/s available PCIe bandwidth (%s x%d link) or, if the device is capable of better performance than is available in the current slot: - This is not sufficient for optimal performance of this card. - For optimal performance, at least %dGT/s of bandwidth is required. - A slot with more lanes and/or higher speed is suggested. + %u.%03u Gb/s available PCIe bandwidth, limited by %s x%d link at %s (capable of %u.%03u Gb/s with %s x%d link) Note that the driver previously used dev_warn() to suggest using a different slot, but pcie_print_link_status() uses dev_info() because if the platform has no faster slot available, the user can't do anything about the warning and may not want to be bothered with it. Signed-off-by: Bjorn Helgaas <[email protected]> Acked-by: Jeff Kirsher <[email protected]>
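The bandwidth figures quoted above can be reproduced with a short calculation. This is a sketch of the arithmetic only, not of either kernel function: the helper names are hypothetical, and the encoding table assumes 8b/10b for 2.5 and 5 GT/s links and 128b/130b for 8 and 16 GT/s links.

```python
# Encoding overhead per link speed (payload bits, total bits); assumed values.
ENCODING = {2.5: (8, 10), 5.0: (8, 10), 8.0: (128, 130), 16.0: (128, 130)}


def link_bandwidth_mb_s(gt_per_s: float, lanes: int) -> float:
    """Usable bandwidth of one link in MB/s after encoding overhead."""
    payload, total = ENCODING[gt_per_s]
    return gt_per_s * 1000 * payload / total / 8 * lanes


def naive_minimum(path):
    """Old approach: slowest speed and narrowest width, taken independently."""
    speed = min(s for s, _ in path)
    width = min(w for _, w in path)
    return link_bandwidth_mb_s(speed, width)


def slowest_link(path):
    """New approach: compute each link's total bandwidth, then take the minimum."""
    return min(link_bandwidth_mb_s(s, w) for s, w in path)


# The path from the commit message: a 16 GT/s x1 link and a 2.5 GT/s x16 link.
path = [(16.0, 1), (2.5, 16)]
print(naive_minimum(path))        # 250 MB/s, the misleading result
print(int(slowest_link(path)))    # about 1969 MB/s, the real bottleneck
```

The 2.5 GT/s x16 link carries 4000 MB/s, so the 16 GT/s x1 link is the actual bottleneck, and mixing the minimum speed with the minimum width describes a link that does not exist on the path.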
2018-05-25cxgb4: Report PCIe link properties with pcie_print_link_status()Bjorn Helgaas1-74/+1
Previously the driver used pcie_get_minimum_link() to warn when the NIC is in a slot that can't supply as much bandwidth as the NIC could use. pcie_get_minimum_link() can be misleading because it finds the slowest link and the narrowest link (which may be different links) without considering the total bandwidth of each link. For a path with a 16 GT/s x1 link and a 2.5 GT/s x16 link, it returns 2.5 GT/s x1, which corresponds to 250 MB/s of bandwidth, not the true available bandwidth of about 1969 MB/s for a 16 GT/s x1 link. Use pcie_print_link_status() to report PCIe link speed and possible limitations instead of implementing this in the driver itself. This finds the slowest link in the path to the device by computing the total bandwidth of each link and compares that with the capabilities of the device. The dmesg change is: - PCIe link speed is %s, device supports %s - PCIe link width is x%d, device supports x%d + %u.%03u Gb/s available PCIe bandwidth (%s x%d link) or, if the device is capable of better performance than is available in the current slot: - A slot with more lanes and/or higher speed is suggested for optimal performance. + %u.%03u Gb/s available PCIe bandwidth, limited by %s x%d link at %s (capable of %u.%03u Gb/s with %s x%d link) Signed-off-by: Bjorn Helgaas <[email protected]>
2018-05-25bnxt_en: Report PCIe link properties with pcie_print_link_status()Bjorn Helgaas1-18/+1
Previously the driver used pcie_get_minimum_link() to warn when the NIC is in a slot that can't supply as much bandwidth as the NIC could use. pcie_get_minimum_link() can be misleading because it finds the slowest link and the narrowest link (which may be different links) without considering the total bandwidth of each link. For a path with a 16 GT/s x1 link and a 2.5 GT/s x16 link, it returns 2.5 GT/s x1, which corresponds to 250 MB/s of bandwidth, not the true available bandwidth of about 1969 MB/s for a 16 GT/s x1 link. Use pcie_print_link_status() to report PCIe link speed and possible limitations instead of implementing this in the driver itself. This finds the slowest link in the path to the device by computing the total bandwidth of each link and compares that with the capabilities of the device. The dmesg change is: - PCIe: Speed %s Width x%d + %u.%03u Gb/s available PCIe bandwidth (%s x%d link) Signed-off-by: Bjorn Helgaas <[email protected]>
2018-05-25bnx2x: Report PCIe link properties with pcie_print_link_status()Bjorn Helgaas1-17/+6
Previously the driver used pcie_get_minimum_link() to warn when the NIC is in a slot that can't supply as much bandwidth as the NIC could use. pcie_get_minimum_link() can be misleading because it finds the slowest link and the narrowest link (which may be different links) without considering the total bandwidth of each link. For a path with a 16 GT/s x1 link and a 2.5 GT/s x16 link, it returns 2.5 GT/s x1, which corresponds to 250 MB/s of bandwidth, not the true available bandwidth of about 1969 MB/s for a 16 GT/s x1 link. Use pcie_print_link_status() to report PCIe link speed and possible limitations instead of implementing this in the driver itself. This finds the slowest link in the path to the device by computing the total bandwidth of each link and compares that with the capabilities of the device. The dmesg change is: - %s (%c%d) PCI-E x%d %s found at mem %lx, IRQ %d, node addr %pM + %s (%c%d) PCI-E found at mem %lx, IRQ %d, node addr %pM + %u.%03u Gb/s available PCIe bandwidth (%s x%d link) Signed-off-by: Bjorn Helgaas <[email protected]>
2018-05-25net/mlx5e: Introducing new statistics rwlockShalom Lagziel5-9/+27
Introduce a new read/write lock that will protect statistics gathering from netdev channels configuration changes. e.g. when channels are being replaced (increase/decrease number of rings) prevent statistic gathering (ndo_get_stats64) to read the statistics of in-active channels (channels that are being closed). Plus update channels software statistics on the fly when calling ndo_get_stats64, and remove it from stats periodic work. Fixes: 9218b44dcc05 ("net/mlx5e: Statistics handling refactoring") Signed-off-by: Shalom Lagziel <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-05-25net/mlx5e: Move phy link down events counter out of SW statsSaeed Mahameed2-18/+22
PHY link down events counter belongs to phy_counters group. Although it has special handling, it doesn't mean it can't be there. Move it to phy_counters_grp handler. Signed-off-by: Saeed Mahameed <[email protected]>
2018-05-25net/mlx5: Use order-0 allocations for all WQ typesTariq Toukan8-90/+111
Complete the transition of all WQ types to use fragmented order-0 coherent memory instead of high-order allocations. CQ-WQ already uses order-0. Here we do the same for cyclic and linked-list WQs. This allows the driver to load cleanly on systems with a highly fragmented coherent memory. Performance tests: ConnectX-5 100Gbps, CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz Packet rate of 64B packets, single transmit ring, size 8K. No degradation is sensed. Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-05-25net/mlx5i: Use compilation flag in IPOIB headerTariq Toukan1-0/+3
If CONFIG_MLX5_CORE_IPOIB is not set, compile-out the IPOIB related headers. Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-05-25net/mlx5e: TX, Use actual WQE size for SQ edge fillTariq Toukan5-92/+178
We fill SQ edge with NOPs to avoid WQEs wrap. Here, instead of doing that in advance for the maximum possible WQE size, we do it on-demand using the actual WQE size. We re-order some parts in mlx5e_sq_xmit to finish the calculation of WQE size (ds_cnt) before doing any writes to the WQE buffer. When SQ work queue is fragmented (introduced in a downstream patch), dealing with WQE wraps becomes more frequent. This change would drastically reduce the overhead in this case. Performance tests: ConnectX-5 100Gbps, CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz Packet rate of 64B packets, single transmit ring, size 8K. Before: 14.9 Mpps After: 15.8 Mpps Improvement of 6%. Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-05-25net/mlx5e: Use WQ API functions instead of direct fields accessTariq Toukan5-38/+60
Use the WQ API to get the WQ size, and to map a counter into a WQ entry index. Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-05-25net/mlx5e: Split offloaded eswitch TC rules for port mirroringChris Mi3-21/+108
If a TC rule needs to be split for mirroring, create two HW rules, in the first level and the second level flow tables accordingly. In the first level flow table, forward the packet to the mirror port and forward the packet to the second level flow table for further processing, eg. encap, vlan push or header re-write. Currently the matching is repeated in both stages. While here, simplify the setup of the vhca id valid indicator also in the existing code. Signed-off-by: Chris Mi <[email protected]> Reviewed-by: Paul Blakey <[email protected]> Reviewed-by: Or Gerlitz <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-05-25net/mlx5e: Parse mirroring action for offloaded TC eswitch flowsChris Mi3-21/+43
Currently, we only support the mirred redirect TC sub-action. In order to support flow based vport mirroring, add support to parse the mirred mirror sub-action. For mirroring, user-space will typically set the action order such that the mirror port (mirror VF) sees packets as the original port (VF under mirroring) sent them or as it will receive them. In the general case, it means that packets are potentially sent to the mirror port before or after some actions were applied on them. To properly do that, we should follow on the exact action order as set for the flow and make sure this will also be the case when we program the HW offload. We introduce a counter for the output ports (attr->out_count), which we increase when parsing each mirred redirect/mirror sub-action and when dealing with encap. We introduce a counter (attr->mirror_count) telling us if split is needed. If no split is needed and mirroring is just multicasting to vport, the mirror count is zero, all the actions of the TC flow should apply on that single HW flow. If split is needed, the mirror count tells where to do the split, all non-mirred tc actions should apply only after the split. The mirror count is set while parsing the following actions encap/decap, header re-write, vlan push/pop. Signed-off-by: Chris Mi <[email protected]> Reviewed-by: Paul Blakey <[email protected]> Reviewed-by: Or Gerlitz <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-05-25net/mlx5: E-switch, Create a second level FDB flow tableChris Mi3-5/+32
If firmware supports the forward action with a destination list that includes a flow table, create a second level FDB flow table. This is going to be used for flow based mirroring under the switchdev offloads mode. Signed-off-by: Chris Mi <[email protected]> Reviewed-by: Paul Blakey <[email protected]> Reviewed-by: Or Gerlitz <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-05-25net/mlx5: E-Switch, Reorganize and rename fdb flow tablesChris Mi3-24/+25
We have several fdb flow tables for each of the legacy and switchdev modes. In the switchdev mode, there are fast path and slow path flow tables. Towards adding more flow tables in upcoming patches, reorganize and rename the various existing ones to reflect their functionality. Signed-off-by: Chris Mi <[email protected]> Reviewed-by: Or Gerlitz <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-05-25net: dsa: dsa_loop: Make dynamic debugging helpfulFlorian Fainelli1-14/+17
Remove redundant debug prints from phy_read/write since we can trace those calls through trace events. Enhance dynamic debug prints to print arguments which helps figuring out what is going on at the driver level with higher level configuration interfaces. Signed-off-by: Florian Fainelli <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-05-25Merge tag 'mlx5e-updates-2018-05-19' of ↵David S. Miller11-80/+860
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5e-updates-2018-05-19 This series contains updates for mlx5e netdevice driver with one subject, DSCP to priority mapping, in the first patch Huy adds the needed API in dcbnl, the second patch adds the needed mlx5 core capability bits for the feature, and all other patches are mlx5e (netdev) only changes to add support for the feature. From: Huy Nguyen Dscp to priority mapping for Ethernet packet: These patches enable differentiated services code point (dscp) to priority mapping for Ethernet packet. Once this feature is enabled, the packet is routed to the corresponding priority based on its dscp. User can combine this feature with priority flow control (pfc) feature to have priority flow control based on the dscp. Firmware interface: Mellanox firmware provides two control knobs for this feature: QPTS register allow changing the trust state between dscp and pcp mode. The default is pcp mode. Once in dscp mode, firmware will route the packet based on its dscp value if the dscp field exists. QPDPM register allow mapping a specific dscp (0 to 63) to a specific priority (0 to 7). By default, all the dscps are mapped to priority zero. Software interface: This feature is controlled via application priority TLV. IEEE specification P802.1Qcd/D2.1 defines priority selector id 5 for application priority TLV. This APP TLV selector defines DSCP to priority map. This APP TLV can be sent by the switch or can be set locally using software such as lldptool. In mlx5 drivers, we add the support for net dcb's getapp and setapp call back. Mlx5 driver only handles the selector id 5 application entry (dscp application priority application entry). If user sends multiple dscp to priority APP TLV entries on the same dscp, the last sent one will take effect. All the previous sent will be deleted. 
This attribute combined with pfc attribute allows advanced user to fine tune the qos setting for specific priority queue. For example, user can give dedicated buffer for one or more priorities or user can give large buffer to certain priorities. The dcb buffer configuration will be controlled by lldptool. >> lldptool -T -i eth2 -V BUFFER prio 0,2,5,7,1,2,3,6 maps priorities 0,1,2,3,4,5,6,7 to receive buffer 0,2,5,7,1,2,3,6 >> lldptool -T -i eth2 -V BUFFER size 87296,87296,0,87296,0,0,0,0 sets receive buffer size for buffer 0,1,2,3,4,5,6,7 respectively After discussion on mailing list with Jakub, Jiri, Ido and John, we agreed to choose dcbnl over devlink interface since this feature is intended to set port attributes which are governed by the netdev instance of that port, where devlink API is more suitable for global ASIC configurations. The firmware trust state (in QPTS register) is changed based on the number of dscp to priority application entries. When the first dscp to priority application entry is added by the user, the trust state is changed to dscp. When the last dscp to priority application entry is deleted by the user, the trust state is changed to pcp. When the port is in DSCP trust state, the transmit queue is selected based on the dscp of the skb. When the port is in DSCP trust state and vport inline mode is not NONE, firmware requires mlx5 driver to copy the IP header to the wqe ethernet segment inline header if the skb has it. This is done by changing the transmit queue sq's min inline mode to L3. Note that the min inline mode of sqs that belong to other features such as xdpsq, icosq are not modified. ==================== Signed-off-by: David S. Miller <[email protected]>
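The DSCP rules stated in this merge message (default priority zero, last entry wins for a given dscp, trust state following the number of entries) can be captured in a small model. This is an illustrative sketch only: the class and method names are hypothetical and do not correspond to the driver's dcbnl callbacks.

```python
class DscpTrustModel:
    """Toy model of the dscp-to-priority table and the derived trust state."""

    def __init__(self):
        self.dscp2prio = {}  # explicit APP entries only

    @property
    def trust_state(self) -> str:
        # "dscp" once the first entry exists, back to "pcp" when the last is deleted
        return "dscp" if self.dscp2prio else "pcp"

    def set_app(self, dscp: int, prio: int) -> None:
        if not 0 <= dscp <= 63 or not 0 <= prio <= 7:
            raise ValueError("dscp must be 0..63 and priority 0..7")
        # A later entry for the same dscp replaces the earlier one.
        self.dscp2prio[dscp] = prio

    def del_app(self, dscp: int) -> None:
        self.dscp2prio.pop(dscp, None)

    def priority_for(self, dscp: int) -> int:
        # By default all dscps map to priority zero.
        return self.dscp2prio.get(dscp, 0)


port = DscpTrustModel()
print(port.trust_state)        # pcp (default)
port.set_app(46, 5)
port.set_app(46, 3)            # overrides the previous entry for dscp 46
print(port.trust_state, port.priority_for(46))
port.del_app(46)
print(port.trust_state)        # back to pcp
```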
2018-05-258139too: Remove unnecessary netif_napi_del()Bo Chen1-2/+0
The call to free_netdev() in __rtl8139_cleanup_dev() clears the network device napi list, and explicit calls to netif_napi_del() are unnecessary. Signed-off-by: Bo Chen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-05-25ibmvnic: Fix partial success login retriesThomas Falcon1-1/+6
In its current state, the driver will handle backing device login in a loop for a certain number of retries while the device returns a partial success, indicating that the driver may need to try again using a smaller number of resources. The variable it checks to continue retrying may change over the course of operations, resulting in reallocation of resources but exits without sending the login attempt. Guard against this by introducing a boolean variable that will retain the state indicating that the driver needs to reattempt login with backing device firmware. Signed-off-by: Thomas Falcon <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-05-25qed*: Support drop action classificationManish Chopra5-15/+37
With this patch, User can configure for the supported flows to be dropped. Added a stat "gft_filter_drop" as well to be populated in ethtool for the dropped flows. For example - ethtool -N p5p1 flow-type udp4 dst-port 8000 action -1 ethtool -N p5p1 flow-type tcp4 src-ip 192.168.8.1 action -1 Signed-off-by: Manish Chopra <[email protected]> Signed-off-by: Shahed Shaikh <[email protected]> Signed-off-by: Ariel Elior <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-05-25qede: Support flow classification to the VFs.Manish Chopra1-4/+30
With the supported classification modes [4 tuples based, udp port based, src-ip based], flows can be classified to the VFs as well. With this patch, flows can be re-directed to the requested VF provided in "action" field of command. Please note that driver doesn't really care about the queue bits in "action" field for the VFs. Since queue will be still chosen by FW using RSS hash. [I.e., the classification would be done according to vport-only] For examples - ethtool -N p5p1 flow-type udp4 dst-port 8000 action 0x100000000 ethtool -N p5p1 flow-type tcp4 src-ip 192.16.6.10 action 0x200000000 ethtool -U p5p1 flow-type tcp4 src-ip 192.168.40.100 dst-ip \ 192.168.40.200 src-port 6660 dst-port 5550 \ action 0x100000000 Signed-off-by: Manish Chopra <[email protected]> Signed-off-by: Shahed Shaikh <[email protected]> Signed-off-by: Ariel Elior <[email protected]> Signed-off-by: David S. Miller <[email protected]>
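The `action` values used in these qed/qede examples (`-1`, `0x100000000`, `0x200000000`) are easier to read once split into fields. The sketch below assumes the layout of the ethtool ring cookie as I understand it from the uapi header: queue index in the low 32 bits, a VF field in bits 32 to 39, and an all-ones cookie meaning "discard". The function name and returned dictionary are hypothetical; how qede maps a nonzero VF field to a specific VF is not stated in the commit messages.

```python
# Constants mirroring the ethtool uapi definitions (assumed, not quoted from the commits).
RING_MASK = 0x00000000FFFFFFFF      # queue index
RING_VF_MASK = 0x000000FF00000000   # VF field
RING_VF_OFF = 32
RX_CLS_FLOW_DISC = 0xFFFFFFFFFFFFFFFF  # what "action -1" becomes as a u64


def decode_action(action: int) -> dict:
    """Split an ethtool 'action' value into drop / VF field / queue."""
    cookie = action & 0xFFFFFFFFFFFFFFFF  # treat as unsigned 64-bit
    if cookie == RX_CLS_FLOW_DISC:
        return {"drop": True}
    return {
        "drop": False,
        "vf_field": (cookie & RING_VF_MASK) >> RING_VF_OFF,
        "queue": cookie & RING_MASK,
    }


print(decode_action(-1))            # drop rule
print(decode_action(0x100000000))   # VF field 1, queue bits 0
print(decode_action(2))             # plain PF queue 2
```

Per the message above, the queue bits are ignored for VF rules because firmware still picks the queue through the RSS hash.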
2018-05-25qed*: Support other classification modes.Manish Chopra2-2/+31
Currently, driver supports flow classification to PF receive queues based on TCP/UDP 4 tuples [src_ip, dst_ip, src_port, dst_port] only. This patch enables to configure different flow profiles [For example - only UDP dest port or src_ip based] on the adapter so that classification can be done according to just those fields as well. Although, at a time just one type of flow configuration is supported due to limited number of flow profiles available on the device. For example - ethtool -N enp7s0f0 flow-type udp4 dst-port 45762 action 2 ethtool -N enp7s0f0 flow-type tcp4 src-ip 192.16.4.10 action 1 ethtool -N enp7s0f0 flow-type udp6 dst-port 45762 action 3 Signed-off-by: Manish Chopra <[email protected]> Signed-off-by: Shahed Shaikh <[email protected]> Signed-off-by: Ariel Elior <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-05-25qede: Validate unsupported configurationsManish Chopra1-0/+73
Validate and prevent some of the configurations for unsupported [by firmware] inputs [for example - mac ext, vlans, masks/prefix, tos/tclass] via ethtool -N/-U. Signed-off-by: Manish Chopra <[email protected]> Signed-off-by: Shahed Shaikh <[email protected]> Signed-off-by: Ariel Elior <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-05-25qede: Refactor ethtool rx classification flow.Manish Chopra1-182/+330
This patch simplifies the ethtool rx flow configuration [via ethtool -U/-N] flow code base by dividing it logically into various APIs based on given protocols. It also separates various validations and calculations done along the flow in their own APIs. Signed-off-by: Manish Chopra <[email protected]> Signed-off-by: Shahed Shaikh <[email protected]> Signed-off-by: Ariel Elior <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-05-25cxgb4/cxgb4vf: Notify link changes to OS-dependent codeArjun Vynipadath6-6/+64
We have a confusion of two different abstractions in the Common Code: Physical Link (Port) and Logical Network Interface (Virtual Interface), and we haven't been properly managing the state of the intersection of those two abstractions. On the one hand we have the Physical state of the Link -- up or down -- and on the other we have the logical state of the VI, enabled or not. {ethN} refers to both the Physical and Logical State. In this case, ifconfig only affects/interrogates the Logical State of a VI, and ethtool only deals with the Physical State. And these are different. So, just because we disable the VI, we don't really want to change the Physical Link Up/Down state. Thus, the previous hack to set "lc->link_ok = 0" when we disable a VI is completely incorrect. Where we get into trouble is where the Physical Link State and the Logical VI State cross swords. And that happens in t4_handle_get_port_info() where we need to manage/save the Physical Link State, but we also need to know when the Logical VI State has changed and pass that back up to the OS-dependent Driver routine t4_os_link_changed() which is concerned about the Logical Interface. So we enable a VI and that causes Firmware to send us a new Port Information message, but if none of the Physical Link State particulars have changed, we don't call t4_os_link_changed(). This fix uses the existing OS Contract APIs for the Common Code to inform the OS-dependent portion of the Host Driver when the "Link" (really Logical Network Interface) is "up" or "down". A new API t4_enable_pi_params() is added which calls t4_enable_vi_params() and, if that is successful, then calls back to the OS Contract API t4_os_link_changed() notifying the OS-dependent layer of the potential Link State change. Original Work by : Casey Leedom <[email protected]> Signed-off-by: Santosh Rastapur <[email protected]> Signed-off-by: Arjun Vynipadath <[email protected]> Signed-off-by: Ganesh Goudar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-05-25cxgb4: clean up init_oneGanesh Goudar2-21/+28
Clean up init_one() and use chip_ver consistently throughout init_one() for chip version. Signed-off-by: Casey Leedom <[email protected]> Signed-off-by: Ganesh Goudar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-05-25cxgb4/cxgb4vf: link management changes for new SFPGanesh Goudar3-18/+85
Newer SFPs like SFP28 and QSFP28 Transceiver Modules present several new possibilities which we haven't faced before. Fix the assumptions in the code reflecting the more limited capabilities of previous Transceiver Module systems. Original work by Casey Leedom <[email protected]> Signed-off-by: Ganesh Goudar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-05-25net: fec: remove stale commentYueHaibing1-6/+0
This comment is outdated as fec_ptp_ioctl has been replaced by fec_ptp_set/fec_ptp_get since commit 1d5244d0e43b ("fec: Implement the SIOCGHWTSTAMP ioctl") Signed-off-by: YueHaibing <[email protected]> Acked-by: Fugang Duan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-05-25sfc: stop the TX queue before pushing new buffersMartin Habets1-8/+25
efx_enqueue_skb() can push new buffers for the xmit_more functionality. We must stop the TX queue before this or else the TX queue does not get restarted and we get a netdev watchdog. In the error handling we may now need to unwind more than 1 packet, and we may need to push the new buffers onto the partner queue. v2: In the error leg also push this queue if xmit_more is set Fixes: e9117e5099ea ("sfc: Firmware-Assisted TSO version 2") Reported-by: Jarod Wilson <[email protected]> Tested-by: Jarod Wilson <[email protected]> Signed-off-by: Martin Habets <[email protected]> Acked-by: Edward Cree <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-05-25mlx4_core: allocate ICM memory in page size chunksQing Huang1-7/+9
When a system is under memory pressure (high usage with fragments), the original 256KB ICM chunk allocations will likely trigger kernel memory management to enter slow path doing memory compact/migration ops in order to complete high order memory allocations. When that happens, user processes calling uverb APIs may get stuck for more than 120s easily even though there are a lot of free pages in smaller chunks available in the system. Syslog: ... Dec 10 09:04:51 slcc03db02 kernel: [397078.572732] INFO: task oracle_205573_e:205573 blocked for more than 120 seconds. ... With 4KB ICM chunk size on x86_64 arch, the above issue is fixed. However in order to support smaller ICM chunk size, we need to fix another issue in large size kcalloc allocations. E.g. Setting log_num_mtt=30 requires 1G mtt entries. With the 4KB ICM chunk size, each ICM chunk can only hold 512 mtt entries (8 bytes for each mtt entry). So we need a 16MB allocation for a table->icm pointer array to hold 2M pointers which can easily cause kcalloc to fail. The solution is to use kvzalloc to replace kcalloc which will fall back to vmalloc automatically if kmalloc fails. Signed-off-by: Qing Huang <[email protected]> Acked-by: Daniel Jurgens <[email protected]> Reviewed-by: Zhu Yanjun <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
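The sizes quoted in this message follow from a short calculation, sketched below. The function name is hypothetical, and the 8-byte pointer size is an assumption (a 64-bit kernel, consistent with the x86_64 case mentioned above).

```python
MTT_ENTRY_SIZE = 8   # bytes per mtt entry, from the commit message
POINTER_SIZE = 8     # bytes per pointer; assumes a 64-bit kernel


def icm_pointer_array_bytes(log_num_mtt: int, chunk_size: int) -> int:
    """Size of the table->icm pointer array needed for a given ICM chunk size."""
    entries = 1 << log_num_mtt
    entries_per_chunk = chunk_size // MTT_ENTRY_SIZE
    chunks = entries // entries_per_chunk
    return chunks * POINTER_SIZE


# log_num_mtt=30 means 2**30 (1G) mtt entries.
new = icm_pointer_array_bytes(30, 4 * 1024)      # 4KB chunks
old = icm_pointer_array_bytes(30, 256 * 1024)    # original 256KB chunks
print(new // (1024 * 1024), "MiB")   # 16 MiB for 2M pointers
print(old // 1024, "KiB")            # 256 KiB for 32K pointers
```

Shrinking the chunk by a factor of 64 grows the pointer array by the same factor, which is why the array allocation itself needs the vmalloc fallback that kvzalloc provides.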
2018-05-25wcn36xx: Add support for Factory Test Mode (FTM)Eyal Ilsar9-0/+332
Introduce infrastructure for supporting Factory Test Mode (FTM) of the wireless LAN subsystem. In order for the user space to access the firmware in test mode the relevant netlink channel needs to be exposed from the kernel driver. The above is achieved as follows: 1) Register wcn36xx driver to testmode callback from netlink 2) Add testmode callback implementation to handle incoming FTM commands 3) Add FTM command packet structure 4) Add handling for GET_BUILD_RELEASE_NUMBER (msgid=0x32A2) 5) Add generic handling for all PTT_MSG packets Signed-off-by: Eyal Ilsar <[email protected]> Signed-off-by: Ramon Fried <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
2018-05-25ath10k: DFS Host ConfirmationSriram R5-10/+273
In the 10.4-3.6 firmware branch there's a new DFS Host confirmation feature which is advertised using WMI_SERVICE_HOST_DFS_CHECK_SUPPORT flag. This new feature enables the ath10k host to send information to the firmware on the specifications of detected radar type. This allows the firmware to validate if the host's radar pattern detector unit is operational and check if the radar information shared by host matches the radar pulses sent as phy error events from firmware. If the check fails the firmware won't allow use of DFS channels on AP mode when using FCC regulatory region. Hence this patch is mandatory when using a firmware from 10.4-3.6 branch. Else, DFS channels on FCC regions cannot be used. Supported Chipsets : QCA9984/QCA9888/QCA4019 Firmware Version : 10.4-3.6-00104 Signed-off-by: Sriram R <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
2018-05-25ath: add support to get the detected radar specificationsSriram R5-5/+10
This enables ath10k/ath9k drivers to collect the specifications of the radar type once it is detected by the dfs pattern detector unit. Usage of the collected info is specific to driver implementation. For example, collected radar info could be used by the host driver to send to co-processors for additional processing/validation. Note: 'radar_detector_specs' data containing the specifications of different radar types which was private within dfs_pattern_detector/ dfs_pri_detector is now shared with drivers as well for making use of this information. Signed-off-by: Sriram R <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
2018-05-25wcn36xx: improve debug and error messages for SMDDaniel Mack1-4/+10
Add a missing newline in wcn36xx_smd_send_and_wait() and also log the command request and response type that was processed. Signed-off-by: Daniel Mack <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
2018-05-25wcn36xx: simplify wcn36xx_smd_open()Daniel Mack1-9/+3
Drop the extra warning about failed allocations, both the core and the only caller of this function will warn loud enough in such cases. Signed-off-by: Daniel Mack <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
2018-05-25wcn36xx: drain pending indicator messages on shutdownDaniel Mack1-0/+6
When the interface is shut down, wcn36xx_smd_close() potentially races against the queue worker. Make sure to cancel the work, and then free all the remnants in hal_ind_queue manually. This is again just a theoretical issue, not something that was triggered in the wild. Signed-off-by: Daniel Mack <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
2018-05-25wcn36xx: set PREASSOC and IDLE states when BSS info changesDaniel Mack1-0/+4
When a BSSID is joined, set the link status to 'preassoc', and set it to 'idle' when the BSS is deleted. This is what the downstream driver is doing, and it seems to improve the reliability during connect/disconnect stress tests. Signed-off-by: Daniel Mack <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
2018-05-25wcn36xx: consider CTRL_EOP bit when looking for valid descriptorsDaniel Mack1-1/+3
In reap_tx_dxes(), when we iterate over the linked descriptors, only consider such valid that have WCN36xx_DXE_CTRL_EOP set. This is what the prima downstream driver is doing as well. Signed-off-by: Daniel Mack <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
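A rough user-space sketch of the descriptor walk described above follows. It is not the wcn36xx implementation: the structure, the bit positions, and the rule that the walk stops at the first descriptor still marked valid (owned by hardware) are all assumptions made for illustration. Only the EOP condition comes from the message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical control bits; the real values live in the driver headers. */
#define DXE_CTRL_VLD (1u << 0)  /* assumed: descriptor still owned by hardware */
#define DXE_CTRL_EOP (1u << 3)  /* assumed: end-of-packet marker */

/* Hypothetical, simplified descriptor. */
struct desc {
    uint32_t ctrl;     /* control word */
    int      has_buf;  /* non-zero if a buffer is attached */
};

/*
 * Walk a ring of n descriptors starting at index start. Stop at the first
 * descriptor that still has VLD set, and count only descriptors that have
 * a buffer attached and EOP set.
 */
static size_t count_reapable(const struct desc *ring, size_t n, size_t start)
{
    size_t reaped = 0;

    for (size_t i = 0; i < n; i++) {
        const struct desc *d = &ring[(start + i) % n];

        if (d->ctrl & DXE_CTRL_VLD)
            break;  /* hardware has not finished with this one */
        if (d->has_buf && (d->ctrl & DXE_CTRL_EOP))
            reaped++;
    }
    return reaped;
}
```

Descriptors without EOP are skipped rather than reaped, so a buffer is only released once the descriptor that ends its packet has been seen.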
2018-05-25wcn36xx: only handle packets when ED or DONE bit is setDaniel Mack1-4/+16
On RX and TX interrupts, check for the WCN36XX_CH_STAT_INT_ED_MASK or WCN36XX_CH_STAT_INT_DONE_MASK in the interrupt reason register, and only handle packets when it is set. This way, reap_tx_dxes() is only invoked when needed. This brings the dequeuing logic in line with what the prima downstream driver is doing. While at it, also log the interrupt reason. Signed-off-by: Daniel Mack <[email protected]> Signed-off-by: Kalle Valo <[email protected]>
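The gating check described above can be sketched as a plain C predicate. The mask names and bit positions below are hypothetical stand-ins, not the values from the driver headers; only the idea of testing the ED and DONE bits in the interrupt reason comes from the message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions for the interrupt reason register. */
#define CH_STAT_INT_DONE_MASK (1u << 0)
#define CH_STAT_INT_ED_MASK   (1u << 1)
#define CH_STAT_INT_ERR_MASK  (1u << 2)

/*
 * Return true when the interrupt reason reports ED or DONE, i.e. when
 * there may be packets to handle; any other reason skips packet handling.
 */
static bool should_handle_packets(uint32_t int_reason)
{
    return (int_reason & (CH_STAT_INT_ED_MASK | CH_STAT_INT_DONE_MASK)) != 0;
}
```

An interrupt handler built this way would read the reason register once, log it, and call its reap or receive routine only when the predicate is true.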