Age | Commit message | Author | Files | Lines
2017-04-28 | IB/qib: use setup_timer | Geliang Tang | 4 | -24/+16
Use setup_timer() instead of init_timer() to simplify the code. Signed-off-by: Geliang Tang <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-28 | IB/nes: use setup_timer | Geliang Tang | 3 | -9/+6
Use setup_timer() instead of init_timer() to simplify the code. Signed-off-by: Geliang Tang <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-28 | IB/i40iw: use setup_timer | Geliang Tang | 2 | -9/+6
Use setup_timer() instead of init_timer() to simplify the code. Signed-off-by: Geliang Tang <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-28 | IB/nes: Fix incorrect type in assignment | Leon Romanovsky | 1 | -1/+1
Fix mismatch between types: wqe_words are in le32 format, while opcode is in CPU format. The following sparse warnings helped to find it: drivers/infiniband/hw/nes/nes_hw.c:3058:24: warning: incorrect type in assignment (different base types) drivers/infiniband/hw/nes/nes_hw.c:3058:24: expected unsigned int [unsigned] [assigned] [usertype] opcode drivers/infiniband/hw/nes/nes_hw.c:3058:24: got restricted __le32 <noident> CC: Faisal Latif <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
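The type mismatch above can be illustrated outside the kernel. The following is a minimal userspace sketch, not the nes driver code: `sketch_le32_to_cpu()` is a hypothetical stand-in for the kernel's `le32_to_cpu()`, and the `0x3f` opcode mask is an assumption chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's le32_to_cpu():
 * decode a 32-bit little-endian value from its raw bytes, independent
 * of the host byte order. */
static uint32_t sketch_le32_to_cpu(const unsigned char raw[4])
{
    return (uint32_t)raw[0] |
           ((uint32_t)raw[1] << 8) |
           ((uint32_t)raw[2] << 16) |
           ((uint32_t)raw[3] << 24);
}

/* Extract an opcode field from a little-endian work-queue word.
 * The 0x3f mask is an illustrative assumption, not the nes layout. */
static uint32_t sketch_wqe_opcode(const unsigned char wqe_word[4])
{
    uint32_t cpu_word = sketch_le32_to_cpu(wqe_word); /* convert first */

    return cpu_word & 0x3f;                           /* then mask */
}
```

Converting before masking keeps the result the same on little-endian and big-endian hosts, which is the property sparse's `__le32` annotation is meant to protect.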
2017-04-28 | IB/usnic: Simplify the code to balance lock/unlock calls | Leon Romanovsky | 1 | -22/+23
Simplify code in find_free_vf_and_create_qp_grp() to avoid a sparse error about unlock being called in a block other than the one where lock was called. drivers/infiniband/hw/usnic/usnic_ib_verbs.c:206:9: warning: context imbalance in 'find_free_vf_and_create_qp_grp' - different lock contexts for basic block CC: Christian Benvenuti <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-28 | IB/usnic: Explicitly include usnic headers | Leon Romanovsky | 2 | -0/+2
The sparse tool complains about undeclared symbols in usnic_ib_verbs.c and usnic_ib_sysfs.c. This is caused by the lack of a direct include of the appropriate usnic_ib_verbs.h and usnic_ib_sysfs.h, where all these functions are declared. A simple include eliminates 30 warnings similar to the one below: drivers/infiniband/hw/usnic/usnic_ib_sysfs.c:304:6: warning: symbol 'usnic_ib_sysfs_unregister_usdev' was not declared. Should it be static? CC: Christian Benvenuti <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-28 | IB/core: Mark local uverbs_std_types functions to be static | Leon Romanovsky | 1 | -24/+24
Functions declared in uverbs_std_types.c are local to that file, but they lack static declarations. This produces a lot of sparse warnings, like the one below: drivers/infiniband/core/uverbs_std_types.c:41:5: warning: symbol 'uverbs_free_ah' was not declared. Should it be static? So mark them as static. CC: Matan Barak <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-28 | iw_cxgb4: check return value of alloc_skb | Pan Bian | 1 | -0/+2
Function alloc_skb() will return a NULL pointer when there is not enough memory. However, the return value of alloc_skb() is directly used without validation in function send_fw_pass_open_req(). This patch checks the return value of alloc_skb() against NULL. Signed-off-by: Pan Bian <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-28 | IB/rxe: fix typo: "algorithmi" -> "algorithm" | Colin Ian King | 1 | -1/+1
trivial fix to typo in pr_err message Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-28 | IB/rdmavt: restore IRQs on error path in rvt_create_ah() | Dan Carpenter | 1 | -1/+1
We need to call spin_unlock_irqrestore() instead of vanilla spin_unlock() on this error path. Fixes: 119a8e708d16 ("IB/rdmavt: Add AH to rdmavt") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Acked-by: Dennis Dalessandro <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-28 | infiniband: call ipv6 route lookup via the stub interface | Paolo Abeni | 1 | -2/+2
The infiniband address handle can be triggered to resolve an ipv6 address in response to MAD packets, regardless of the ipv6 module being disabled via the kernel command line argument. That will cause a call into the ipv6 routing code, which is not initialized, and a consequent oops. This commit addresses the above issue by replacing the direct lookup call with an indirect one via the ipv6 stub, which is properly initialized according to the ipv6 status (e.g. if ipv6 is disabled, the routing lookup fails gracefully). Cc: [email protected] # 3.12+ Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-28 | RDMA/qedr: add support for send+invalidate in poll CQ | Amrani, Ram | 2 | -38/+63
Split the poll responder CQ into two functions. Add support for send+invalidate in poll CQ. Signed-off-by: Ram Amrani <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-28 | RDMA/qedr: destroy CQ only after HW releases it | Amrani, Ram | 3 | -2/+75
Wait for all relevant CNQ interrupts before freeing the CQ. Don't invoke completion handlers for a destroyed CQ. Signed-off-by: Ram Amrani <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-28 | RDMA/qedr: enhance destroy flow for GSI QP | Amrani, Ram | 1 | -10/+11
Avoid attempting to release irrelevant (and unused) resources for GSI QP. Signed-off-by: Ram Amrani <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-28 | RDMA/qedr: properly check atomic capabilities | Amrani, Ram | 2 | -31/+47
After checking the path upwards towards the root complex, actually check the root complex atomic_req capability, and not our own NIC. Verify that the PCIe device control register's atomic egress block is cleared in the path. Verify that the PCIe version is at least 2. Signed-off-by: Ram Amrani <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-28 | RDMA/qedr: reset access control when registering a MR | Amrani, Ram | 1 | -0/+2
Signed-off-by: Ram Amrani <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-25 | IB/vmw_pvrdma: Sparse annotate imm_data | Jason Gunthorpe | 1 | -2/+2
imm_data is copied directly from the ib_send_wr and ib_wc, which have it marked as __be32; copy that mark into the uapi structures as well. Signed-off-by: Jason Gunthorpe <[email protected]> Tested-by: Adit Ranadive <[email protected]> Acked-by: Adit Ranadive <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-25 | IB/mlx5: Add ODP support to MW | Artemy Kovalyov | 3 | -43/+120
Internally, MW is implemented as a KLM MKey and filled by userspace UMR post sends. Handle page faults triggered by operations on these MKeys. Signed-off-by: Artemy Kovalyov <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-25 | IB/mlx5: Extract page fault code | Artemy Kovalyov | 1 | -99/+104
To make page fault handling code more flexible split pagefault_single_data_segment() function. Keep MR resolution in pagefault_single_data_segment() and move actual updates into pagefault_single_mr(). Signed-off-by: Artemy Kovalyov <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-25 | IB/umem: Add support to huge ODP | Artemy Kovalyov | 4 | -5/+23
Add IB_ACCESS_HUGETLB ib_reg_mr flag. A hugetlb region registered with this flag will use a single translation entry per huge page. Signed-off-by: Artemy Kovalyov <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-25 | IB/mlx5: Add contiguous ODP support | Artemy Kovalyov | 2 | -18/+19
Currently ODP supports only regular MMU pages. Add ODP support for regions consisting of physically contiguous chunks of arbitrary order (huge pages for instance) to improve performance. Signed-off-by: Artemy Kovalyov <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-25 | IB/umem: Add contiguous ODP support | Artemy Kovalyov | 2 | -21/+33
Currently ODP supports only regular MMU pages. Add ODP support for regions consisting of physically contiguous chunks of arbitrary order (huge pages for instance) to improve performance. Signed-off-by: Artemy Kovalyov <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-25 | IB/mlx5: Decrease verbosity level of ODP errors | Artemy Kovalyov | 1 | -7/+4
Decrease verbosity level of ODP error flows messages to debug level. Remove one redundant print since debug level message already exists in this flow. Fixes: d9aaed838765 ('{net,IB}/mlx5: Refactor page fault handling') Signed-off-by: Artemy Kovalyov <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-25 | IB/mlx5: Fix implicit MR GC | Artemy Kovalyov | 1 | -10/+1
When an implicit MR's leaf MKey becomes unused, i.e. when its last page is released by MMU invalidation, it is marked as "dying" and scheduled for release by the garbage collector. Currently a subsequent page fault may remove the "dying" flag. Treat a leaf MKey as non-existent once it has been scheduled for removal by the GC. Fixes: 81713d3788d2 ('IB/mlx5: Add implicit MR support') Signed-off-by: Artemy Kovalyov <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-25 | IB/mlx5: Fix UMR size calculation | Artemy Kovalyov | 1 | -1/+2
Translation table updates of large UMR may require multiple post send operations. The last operations can be of various lengths, but the current code sets them all to the same length. Fixes: 7d0cc6edcc70 ('IB/mlx5: Add MR cache for large UMR regions') Signed-off-by: Artemy Kovalyov <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
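The general shape of the problem, splitting one large update into fixed-size operations where the final one may be shorter, can be sketched as below. This is a generic illustration and not the mlx5 code; `sketch_split_ops()` and its parameters are hypothetical names.

```c
#include <stddef.h>

/* Split `total` entries into operations of at most `chunk` entries each,
 * writing the size of each operation into `sizes` (capacity `cap`).
 * Returns the number of operations written, or 0 if there is nothing to
 * do, `chunk` is zero, or `sizes` is too small. */
static size_t sketch_split_ops(size_t total, size_t chunk,
                               size_t *sizes, size_t cap)
{
    size_t n = 0;

    if (chunk == 0)
        return 0;

    while (total > 0) {
        /* Every operation is `chunk` long except possibly the last. */
        size_t len = total < chunk ? total : chunk;

        if (n == cap)
            return 0;
        sizes[n++] = len;
        total -= len;
    }
    return n;
}
```

Computing the length inside the loop from the remaining count is what lets the final operation differ from the others.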
2017-04-25 | IB/mlx5: Fix function updating xlt emergency path | Artemy Kovalyov | 1 | -1/+1
In the memory shortage path we fall back to using a spare buffer. mlx5_ib_update_xlt() is called from ib_uverbs_reg_mr when ibmr.ucontext is not initialized yet. How to test it: 1. trigger memory exhaustion so __get_free_pages(GFP_KERNEL, 4) will fail 2. register MR 3. there should be no kernel oops Fixes: 7d0cc6edcc70 ('IB/mlx5: Add MR cache for large UMR regions') Signed-off-by: Artemy Kovalyov <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-25 | IB: Replace ib_umem page_size by page_shift | Artemy Kovalyov | 23 | -77/+67
The page size is held by struct ib_umem in the page_size field. It is better to store it as an exponent, because a page size is by nature always a power of two and is used as a factor, divisor or ilog2's argument. Converting page_size to page_shift allows portable code and avoids the following error while compiling on ARM: ERROR: "__aeabi_uldivmod" [drivers/infiniband/core/ib_core.ko] undefined! CC: Selvin Xavier <[email protected]> CC: Steve Wise <[email protected]> CC: Lijun Ou <[email protected]> CC: Shiraz Saleem <[email protected]> CC: Adit Ranadive <[email protected]> CC: Dennis Dalessandro <[email protected]> CC: Ram Amrani <[email protected]> Signed-off-by: Artemy Kovalyov <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Acked-by: Ram Amrani <[email protected]> Acked-by: Shiraz Saleem <[email protected]> Acked-by: Selvin Xavier <[email protected]> Acked-by: Selvin Xavier <[email protected]> Acked-by: Adit Ranadive <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
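A short userspace sketch of why an exponent is convenient: with a shift, page arithmetic on 64-bit addresses needs only shifts and masks, so no 64-bit division helper (such as `__aeabi_uldivmod` on 32-bit ARM) is required. The function names here are hypothetical and are not taken from the ib_umem code.

```c
#include <stdint.h>

/* Byte offset of `addr` inside its page, for a page of 2^page_shift bytes. */
static uint64_t sketch_page_offset(uint64_t addr, unsigned int page_shift)
{
    uint64_t page_mask = ((uint64_t)1 << page_shift) - 1;

    return addr & page_mask;
}

/* Number of pages touched by the byte range [addr, addr + len).
 * Uses shifts only; no 64-bit division is emitted. */
static uint64_t sketch_num_pages(uint64_t addr, uint64_t len,
                                 unsigned int page_shift)
{
    uint64_t first, last;

    if (len == 0)
        return 0;

    first = addr >> page_shift;
    last = (addr + len - 1) >> page_shift;
    return last - first + 1;
}
```

For example, with `page_shift` set to 12 (4 KiB pages), a two-byte range starting at `0xfff` straddles a page boundary and therefore touches two pages.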
2017-04-25 | IB/core: change the return type to void | Zhu Yanjun | 2 | -3/+2
The function ib_unregister_mad_agent always returns zero. And this returned value is not checked. As such, change the return type to void. CC: Joe Jin <[email protected]> CC: Junxiao Bi <[email protected]> Signed-off-by: Zhu Yanjun <[email protected]> Reviewed-by: Yuval Shaia <[email protected]> Reviewed-by: Hal Rosenstock <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-25 | MAINTAINERS: Update ocrdma module status | Selvin Xavier | 1 | -4/+4
Since ocrdma driver is not going to be updated with any new development activity, except for critical bug fixes reported by partners or customers, changing the module status to "Odd Fixes". Also, updating the web page info and the maintainers email addresses. Signed-off-by: Selvin Xavier <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-25 | IB/hfi: Fix up comments in engine mapping | Ira Weiny | 2 | -36/+36
Fix off by 1 error in comments documenting the sdma and send context mappings. Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-25 | MAINTAINERS: Add file patterns for infiniband device tree bindings | Geert Uytterhoeven | 1 | -0/+1
Submitters of device tree binding documentation may forget to CC the subsystem maintainer if this is missing. Signed-off-by: Geert Uytterhoeven <[email protected]> Cc: Doug Ledford <[email protected]> Cc: Sean Hefty <[email protected]> Cc: Hal Rosenstock <[email protected]> Cc: [email protected] Acked-by: Doug Ledford <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-25 | infiniband/uverbs: Fix integer overflows | Vlad Tsyrklevich | 1 | -1/+12
The 'num_sge' variable is verified to be smaller than the 'sge_count' variable; however, since both are user-controlled it's possible to cause an integer overflow for the kmalloc multiply on 32-bit platforms (num_sge and sge_count are both defined as u32). By crafting an input that causes a smaller-than-expected allocation it's possible to write controlled data out-of-bounds. Signed-off-by: Vlad Tsyrklevich <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
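The class of bug described above, a size computed by multiplying user-controlled 32-bit counts, is usually guarded by a division-based check before the multiply. The sketch below shows that generic guard in userspace C; it is not the check the patch adds, and `sketch_mul_u32()` is a hypothetical helper.

```c
#include <stdint.h>

/* Multiply two 32-bit counts as a 32-bit size computation would.
 * Returns 0 and stores the product in *out when it fits in 32 bits,
 * or -1 when the multiplication would wrap around. */
static int sketch_mul_u32(uint32_t a, uint32_t b, uint32_t *out)
{
    /* a * b overflows exactly when b > UINT32_MAX / a (for a != 0). */
    if (a != 0 && b > UINT32_MAX / a)
        return -1;

    *out = a * b;
    return 0;
}
```

A caller would reject the request when the helper returns -1 rather than allocating a buffer sized from the wrapped product.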
2017-04-25 | infiniband: hns: avoid gcc-7.0.1 warning for uninitialized data | Arnd Bergmann | 1 | -0/+1
hns_roce_v1_cq_set_ci() calls roce_set_bit() on an uninitialized field, which will then change only a few of its bits, causing a warning with the latest gcc: infiniband/hw/hns/hns_roce_hw_v1.c: In function 'hns_roce_v1_cq_set_ci': infiniband/hw/hns/hns_roce_hw_v1.c:1854:23: error: 'doorbell[1]' is used uninitialized in this function [-Werror=uninitialized] roce_set_bit(doorbell[1], ROCEE_DB_OTHERS_H_ROCEE_DB_OTH_HW_SYNS_S, 1); The code is actually correct since we always set all bits of the port_vlan field, but gcc correctly points out that the first access does contain uninitialized data. This initializes the field to zero first before setting the individual bits. Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver") Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
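The warning comes from a read-modify-write on a word that has not been written yet. A minimal userspace sketch of the pattern follows; `sketch_set_bit()` is a hypothetical helper written in the spirit of `roce_set_bit()`, and the bit positions are arbitrary assumptions rather than the hns doorbell layout.

```c
#include <stdint.h>

/* Read-modify-write helper: set or clear one bit of *word.
 * Because it reads *word first, the word must already hold a
 * defined value when this is called. */
static void sketch_set_bit(uint32_t *word, unsigned int shift, uint32_t val)
{
    *word = (*word & ~((uint32_t)1 << shift)) | ((val & 1u) << shift);
}

/* Build a doorbell-like word field by field. Bit positions 31 and 3
 * are illustrative only. */
static uint32_t sketch_build_doorbell(void)
{
    uint32_t db = 0; /* initialize before the first read-modify-write */

    sketch_set_bit(&db, 31, 1);
    sketch_set_bit(&db, 3, 1);
    return db;
}
```

Without the `db = 0` initialization, the first call would read an indeterminate value, which is what the compiler diagnoses even if later writes happen to cover every bit.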
2017-04-25 | IB/fmr_pool: Convert the cleanup thread into kthread worker API | Petr Mladek | 1 | -30/+19
Kthreads are currently implemented as an infinite loop. Each has its own variant of checks for terminating, freezing, awakening. In many cases it is unclear to say in which state it is and sometimes it is done a wrong way. The plan is to convert kthreads into kthread_worker or workqueues API. It allows to split the functionality into separate operations. It helps to make a better structure. Also it defines a clean state where no locks are taken, IRQs blocked, the kthread might sleep or even be safely migrated. The kthread worker API is useful when we want to have a dedicated single thread for the work. It helps to make sure that it is available when needed. Also it allows a better control, e.g. define a scheduling priority. This patch converts the fmr_pool kthread into the kthread worker API because I am not sure how busy the thread is. It is well possible that it does not need a dedicated kthread and workqueues would be perfectly fine. Well, the conversion between kthread worker API and workqueues is pretty trivial. The patch moves one iteration from the kthread into the work function. It is queued only when there is a pending work. Therefore we do not need to compare flush_ser and req_ser at the beginning. On the contrary, the same work could be queued only once at a time. Therefore it has to re-queue itself if some requests are pending. Otherwise, wake_up_process() is replaced by queuing the work. Important: The change is only compile tested. I did not find an easy way how to check it in a real life. Signed-off-by: Petr Mladek <[email protected]> TO: Doug Ledford <[email protected]> CC: Sean Hefty <[email protected]> CC: Hal Rosenstock <[email protected]> CC: [email protected] Signed-off-by: Doug Ledford <[email protected]>
2017-04-25 | {net,IB}/{rxe,usnic}: Utilize generic mac to eui32 function | Yuval Shaia | 6 | -44/+27
This logic seems to be duplicated in (at least) three separate files. Move it to one place so the code can be re-used. Signed-off-by: Yuval Shaia <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]>
2017-04-25 | IB/usnic: Remove unused functions | Yuval Shaia | 1 | -29/+0
Signed-off-by: Yuval Shaia <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-25 | IB/iser: fix spelling mistake: "unexepected" -> "unexpected" | Colin Ian King | 1 | -1/+1
trivial fix to spelling mistake in iser_err error message Signed-off-by: Colin Ian King <[email protected]> Reviewed-by: Max Gurtovoy <[email protected]> Acked-by: Sagi Grimberg <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-25 | iw_cxgb4: Use dsgl by default | Ganesh Goudar | 1 | -3/+3
Enable the use of dsgl by default and determine whether dsgl is supported from lld info. Signed-off-by: Steve Wise <[email protected]> Signed-off-by: Bharat Potnuri <[email protected]> Signed-off-by: Ganesh Goudar <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-25 | RDMA/bnxt_re: Use IS_ERR_OR_NULL where appropriate | Doug Ledford | 1 | -4/+4
Constructs such as if (ptr && !IS_ERR(ptr)) can be shortened to just !IS_ERR_OR_NULL(ptr) instead. Make substitutions in the bnxt_re driver where appropriate. Signed-off-by: Doug Ledford <[email protected]>
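The equivalence can be checked with a userspace approximation of the kernel's error-pointer helpers. The `sketch_*` functions below are hypothetical re-implementations written for illustration, assuming the kernel convention that the top 4095 address values encode negative errno codes; they are not the kernel macros themselves.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer (mimics ERR_PTR()). */
static void *sketch_err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* True if the pointer lies in the error range (mimics IS_ERR()). */
static int sketch_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* True if the pointer is NULL or an error (mimics IS_ERR_OR_NULL()). */
static int sketch_is_err_or_null(const void *p)
{
    return p == NULL || sketch_is_err(p);
}
```

For any pointer `p`, `p && !sketch_is_err(p)` and `!sketch_is_err_or_null(p)` yield the same truth value, which is the substitution the commit message describes.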
2017-04-25 | RDMA/bnxt_re: remove redundant initialization of rc to zero | Colin Ian King | 1 | -1/+1
rc is initialized to zero but is then updated by calls to bnxt_qplib_free_fast_reg_page_list and/or bnxt_qplib_free_mrw, so the initialization is redundant and can be removed. Detected with CoverityScan, CID#1408448 ("Unused Value") Signed-off-by: Colin Ian King <[email protected]> Reviewed-by: Laurence Oberman <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-21 | IB/mlx5: Add support for active_width and active_speed in RoCE | Noa Osherovich | 1 | -5/+67
Add missing calculation and translation of active_width and active_speed for RoCE. Fixes: 3f89a643eb295 ('IB/mlx5: Extend query_device/port to ...') Signed-off-by: Noa Osherovich <[email protected]> Signed-off-by: Eran Ben Elisha <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-21 | IB/mlx5: Set mlx5_query_roce_port's return value to void | Noa Osherovich | 1 | -7/+8
In case of an error, the properties reported to user are zeroed out, so no need for a return value. Signed-off-by: Noa Osherovich <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-21 | IB/core: Add HDR speed enum | Noa Osherovich | 2 | -1/+6
Add high data rate speed to the ib_port_speed enumeration. Signed-off-by: Noa Osherovich <[email protected]> Signed-off-by: Eran Ben Elisha <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-21 | IB/mlx5: Set correct SL in completion for RoCE | Moni Shoua | 1 | -3/+16
There is a difference when parsing a completion entry between Ethernet and IB ports. When the link layer is Ethernet, the bits describe the type of L3 header in the packet. When the link layer is Ethernet and a VLAN header is present, the value of SL is equal to the 3 UP bits in the VLAN header. If a VLAN header is not present, then the SL is undefined and the consumer of the completion should check if IB_WC_WITH_VLAN is set. While at it, this patch also fills the vlan_id field in the completion if present. Signed-off-by: Moni Shoua <[email protected]> Reviewed-by: Majd Dibbiny <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
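For reference, the 3 user-priority (UP) bits and the VLAN ID both live in the 16-bit IEEE 802.1Q tag control field. The sketch below shows how they are separated, assuming the field is already in host byte order; the function names are hypothetical and how the mlx5 driver obtains the field from a completion entry is not shown here.

```c
#include <stdint.h>

/* IEEE 802.1Q tag control information layout (host byte order):
 *   bits 15..13  priority (the "UP" bits)
 *   bit  12      drop eligible indicator
 *   bits 11..0   VLAN ID */

/* Priority bits, which the commit message maps to the SL. */
static unsigned int sketch_vlan_priority(uint16_t tci)
{
    return (tci >> 13) & 0x7;
}

/* 12-bit VLAN identifier. */
static unsigned int sketch_vlan_id(uint16_t tci)
{
    return tci & 0x0fff;
}
```

For instance, a tag control value of `0xA064` carries priority 5 and VLAN ID 100.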
2017-04-21 | IB/cma: Send MRA for reply messages | Moni Shoua | 1 | -0/+3
Current implementation of RDMA_CM sends MRA (Message Receipt Acknowledgment) only for request messages but not for response messages. As a result, a slow active side of the connection may send a ready-to-use message to the passive side after a delay that is too long for the passive side to wait for. This patch adds a call to ib_send_cm_mra() upon receiving a response message and by this tells the other side to modify the service timeout to a bigger value, 16 times larger than before. As in the request case, MRA for reply will be sent only if a duplicate response has arrived. Signed-off-by: Moni Shoua <[email protected]> Reviewed-by: Matan Barak <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-21 | IB/mlx5: Support congestion related counters | Parav Pandit | 6 | -67/+150
This patch adds support to query the congestion related hardware counters through new command and links them with other hw counters being available in hw_counters sysfs location. In order to reuse existing infrastructure it renames related q_counter data structures to more generic counters to reflect q_counters and congestion counters and maybe some other counters in the future. New hardware counters: * rp_cnp_handled - CNP packets handled by the reaction point * rp_cnp_ignored - CNP packets ignored by the reaction point * np_cnp_sent - CNP packets sent by notification point to respond to CE marked RoCE packets * np_ecn_marked_roce_packets - CE marked RoCE packets received by notification point It also avoids returning ENOSYS which is specific for invalid system call and produces the following checkpatch.pl warning. WARNING: ENOSYS means 'invalid syscall nr' and nothing else + return -ENOSYS; Signed-off-by: Parav Pandit <[email protected]> Reviewed-by: Eli Cohen <[email protected]> Reviewed-by: Daniel Jurgens <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-21 | IB/mthca: Check validity of output parameter pointer | Leon Romanovsky | 1 | -2/+10
The mthca driver didn't check the pointer supplied to functions mthca_cmd_poll() and mthca_cmd_wait(). This caused the following smatch errors: drivers/infiniband/hw/mthca/mthca_cmd.c:371 mthca_cmd_poll() error: we previously assumed 'out_param' could be null (see line 353) drivers/infiniband/hw/mthca/mthca_cmd.c:454 mthca_cmd_wait() error: we previously assumed 'out_param' could be null (see line 432) In reality, all callers of these functions that set the out_is_imm flag provide the pointer too. However, it is better to check again to remove the smatch errors and achieve a warning-free subsystem. Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-21 | IB/mlx5: Add drop flow steering rule support | Slava Shwartsman | 1 | -5/+22
A drop rule is described by an action drop and no destination. If a user specified IB_FLOW_SPEC_ACTION_DROP then set the action to MLX5_FLOW_CONTEXT_ACTION_DROP and clear the destination. Signed-off-by: Slava Shwartsman <[email protected]> Reviewed-by: Maor Gottlieb <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-21 | IB/core: Introduce drop flow specification | Slava Shwartsman | 4 | -0/+26
This flow steering specification identifies a flow to be dropped by the HW. If the user creates a flow with only the drop specification, then all the packets that hit this flow will be dropped; otherwise the HW will drop only the packets that match the other L2/L3/L4 specifications. Signed-off-by: Slava Shwartsman <[email protected]> Reviewed-by: Maor Gottlieb <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-04-21 | IB/mlx5: Use IP version matching to classify IP traffic | Ariel Levkovich | 2 | -20/+52
This change adds the ability for flow steering to classify IPv4/6 packets with an MPLS tag (Ethertype 0x8847 and 0x8848) as standard IP packets and hit IPv4/6 classified steering rules. When a user added a flow rule with IP classification, the driver was implicitly adding ethertype matching to the created rule in order to distinguish between IPv4 and IPv6 protocols. Since IP packets with an MPLS tag header have the MPLS ethertype, they missed the rule and ended up hitting the default filters. Such behavior prevented MPLS packets from undergoing inbound traffic load balancing flows (if such were defined by configuring RSS) to achieve higher throughput, the way that non-MPLS IP packets did. Since our device is able to look past the MPLS tag and identify the next protocol, we introduce this solution, which replaces Ethertype matching by the device's capability to perform IP version parsing and matching in order to distinguish between IPv4 and IPv6. Therefore, whenever a flow with an IP spec is added and the device supports IP version matching, the driver will implicitly add IP version matching to the rule (based on the IP spec type) without Ethertype matching, which will cause relevant MPLS tagged packets to hit this rule as well. Otherwise (the device doesn't support IP version matching), we fall back to setting Ethertype matching. If the user's filters specify an L2 ethertype and an IP spec, the rule will then match both the ethertype and the IP version. The device's support for IP version matching is reported by the device via a dedicated capability bit in query_device_cap and named outer/inner_ip_version. Signed-off-by: Ariel Levkovich <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
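The difference between the two classification strategies can be sketched in plain C. This is a software illustration under stated assumptions, not a description of how the device parses packets: the function names are hypothetical, the payload is assumed to start right after the Ethernet header, and treating the first nibble after the MPLS label stack as an IP version is a heuristic.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ETH_P_IP      0x0800
#define SKETCH_ETH_P_IPV6    0x86DD
#define SKETCH_ETH_P_MPLS_UC 0x8847
#define SKETCH_ETH_P_MPLS_MC 0x8848

/* Classify by ethertype only. Returns 4, 6, or 0 when unknown.
 * MPLS-tagged IP packets are not recognized here. */
static int sketch_ip_version_by_ethertype(uint16_t ethertype)
{
    if (ethertype == SKETCH_ETH_P_IP)
        return 4;
    if (ethertype == SKETCH_ETH_P_IPV6)
        return 6;
    return 0;
}

/* Classify by skipping any MPLS label stack and reading the IP version
 * nibble. Returns 4, 6, or 0 when unknown or truncated. */
static int sketch_ip_version_by_header(uint16_t ethertype,
                                       const unsigned char *payload,
                                       size_t len)
{
    size_t off = 0;
    int version;

    if (ethertype == SKETCH_ETH_P_MPLS_UC ||
        ethertype == SKETCH_ETH_P_MPLS_MC) {
        /* Each label entry is 4 bytes; bit 0 of its third byte is the
         * bottom-of-stack flag that ends the label stack. */
        for (;;) {
            int bottom;

            if (len - off < 4)
                return 0;
            bottom = payload[off + 2] & 0x01;
            off += 4;
            if (bottom)
                break;
        }
    } else if (ethertype != SKETCH_ETH_P_IP &&
               ethertype != SKETCH_ETH_P_IPV6) {
        return 0;
    }

    if (off >= len)
        return 0;

    version = payload[off] >> 4; /* high nibble of the first IP byte */
    return (version == 4 || version == 6) ? version : 0;
}
```

An IPv4 packet behind a single MPLS label is reported as unknown by the ethertype-only function and as version 4 by the header-based one, which mirrors why ethertype-matched rules missed MPLS-tagged traffic.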