path: root/drivers
Age | Commit message | Author | Files | Lines
2020-05-27RDMA/cm: Send and receive ECE parameter over the wireLeon Romanovsky2-5/+42
ECE parameters are exchanged through REQ->REP/SIDR_REP messages. This patch adds the data to provide to the other side of the CMID communication channel. Link: https://lore.kernel.org/r/20200526103304.196371-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-27RDMA/ucma: Deliver ECE parameters through UCMA eventsLeon Romanovsky1-1/+5
Passive side of CMID connection receives ECE request through REQ message and needs to respond with relevant REP message which will be forwarded to active side. The UCMA events interface is responsible for such communication with the user space (librdmacm). Extend it to provide ECE wire data. Link: https://lore.kernel.org/r/20200526103304.196371-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-27RDMA/ucma: Extend ucma_connect to receive ECE parametersLeon Romanovsky3-3/+33
Active side of CMID initiates connection through librdmacm's rdma_connect() and kernel's ucma_connect(). Extend UCMA interface to handle those new parameters. Link: https://lore.kernel.org/r/20200526103304.196371-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-27Merge branch 'mellanox/mlx5-next' into rdma.git for/nextJason Gunthorpe6-4/+19
From the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Required for dependencies in following patches * branch 'mellanox/mlx5-next': net/mlx5: Add ability to read and write ECE options net/mlx5: Add support for RDMA TX FT headers modifying net/mlx5: Move iseg access helper routines close to mlx5_core driver net/mlx5: Cleanup mlx5_ifc_fte_match_set_misc2_bits Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-27IB/mlx5: Fix DEVX support for MLX5_CMD_OP_INIT2INIT_QP commandMark Zhang1-0/+4
The commit cited in the Fixes line wasn't complete and solved only part of the problems. Update mlx5_ib to properly support the MLX5_CMD_OP_INIT2INIT_QP command in DEVX, which is required when modifying the QP tx_port_affinity. Fixes: 819f7427bafd ("RDMA/mlx5: Add init2init as a modify command") Link: https://lore.kernel.org/r/20200527135703.482501-1-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-27RDMA/core: Use sizeof_field() helperGustavo A. R. Silva4-10/+10
Make use of the sizeof_field() helper instead of an open-coded version. Link: https://lore.kernel.org/r/20200527144152.GA22605@embeddedor Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
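As a rough illustration of what this kind of cleanup looks like, the sketch below contrasts an open-coded member-size expression with a `sizeof_field()` macro of the same shape as the kernel helper. The structure `demo_hdr` and both wrapper functions are hypothetical and exist only for this example; they are not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Same shape as the kernel helper: size of a struct member, no instance needed. */
#define sizeof_field(TYPE, MEMBER) sizeof(((TYPE *)0)->MEMBER)

/* Hypothetical structure, for illustration only. */
struct demo_hdr {
	uint8_t  opcode;
	uint16_t pkey;
	uint8_t  gid[16];
};

/* Open-coded form: the null-pointer cast is repeated at every call site. */
static size_t gid_size_open_coded(void)
{
	return sizeof(((struct demo_hdr *)0)->gid);
}

/* Helper form: the intent is named and the cast lives in one place. */
static size_t gid_size_helper(void)
{
	return sizeof_field(struct demo_hdr, gid);
}
```

Both forms are evaluated at compile time and never dereference the null pointer, so the change is purely about readability.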
2020-05-25RDMA/ipoib: Remove can_sleep parameter from iboib_mcast_allocKamal Heib1-6/+5
can_sleep is always 0 when iboib_mcast_alloc() is called, so remove it and use GFP_ATOMIC instead of GFP_KERNEL. Link: https://lore.kernel.org/r/20200525130305.171509-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-25RDMA/iw_cxgb4: cleanup device debugfs entries on ULD removePotnuri Bharat Teja1-0/+1
Remove device specific debugfs entries immediately if LLD detaches a particular ULD device in case of fatal PCI errors. Link: https://lore.kernel.org/r/20200524190814.17599-1-bharat@chelsio.com Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-25RDMA/hns: Make the end of sge process more clearYixian Liu1-2/+2
Comparing against the sge number of the wr instead of i makes the comparison clearer: when the sge number in the wr is smaller than the maximum sge number supported by the queue, a stop sge needs to be filled in at the end of the sges in the wr. Link: https://lore.kernel.org/r/1590152579-32364-5-git-send-email-liweihang@huawei.com Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-25RDMA/hns: Simplify process related to poll cqLang Cheng1-8/+3
Set hns_roce_v2_cq_set_ci to inline type and remove unnecessary next_cqe_sw_v2(). Link: https://lore.kernel.org/r/1590152579-32364-4-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-25RDMA/hns: Remove redundant parameters from free_srq/qp_wrid()Wenpeng Liang2-7/+6
The redundant parameters "hr_dev" need to be removed from free_kernel_wrid() and free_srq_wrid(). Link: https://lore.kernel.org/r/1590152579-32364-3-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-25RDMA/hns: Remove redundant type cast for general pointersWeihang Li3-137/+73
There is no need to do a type cast on general pointers, as they can be assigned to variables of any pointer type. In addition, optimize the initialization of some variables and adjust their order. Link: https://lore.kernel.org/r/1590152579-32364-2-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-25RDMA/hns: Optimize the usage of MTRXi Wang2-27/+30
Currently, the MTR region is configured before hns_roce_mtr_map() is invoked, but in some scenarios the region is configured at MTR creation, so the caller needs to store this config and call hns_roce_mtr_map() later. Optimize the usage by wrapping the MTR region config into the MTR. Link: https://lore.kernel.org/r/1589982799-28728-10-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-25RDMA/hns: Refactor the QP context filling process related to WQE buffer configureXi Wang1-115/+149
Split the code related to WQE buffer configure from the QPC filling process into two functions: config_qp_sq_buf() and config_qp_rq_buf(). This makes the code more readable. Link: https://lore.kernel.org/r/1589982799-28728-9-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-25RDMA/hns: Change variables representing quantity to unsignedWeihang Li2-10/+11
Number of sge/eqe is always non-negative, they should be defined in type of unsigned. Link: https://lore.kernel.org/r/1589982799-28728-8-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-25RDMA/hns: Change all page_shift to unsignedWeihang Li5-24/+27
page_shift is used to calculate the page size, it's always non-negative, and should be in type of unsigned. Link: https://lore.kernel.org/r/1589982799-28728-7-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-25RDMA/hns: Rename QP buffer related functionXi Wang1-4/+4
Rename the function related to QP buffer to make the code more readable. Link: https://lore.kernel.org/r/1589982799-28728-6-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-25RDMA/hns: Remove unused code about assertYangyang Li2-5/+0
The codes related to assert are no longer used and need to be deleted. Link: https://lore.kernel.org/r/1589982799-28728-5-git-send-email-liweihang@huawei.com Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-25RDMA/hns: Optimize post and poll processLang Cheng1-13/+14
Add unlikely() and likely() to optimize main I/O process code. Link: https://lore.kernel.org/r/1589982799-28728-4-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
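For readers unfamiliar with these annotations, here is a minimal user-space sketch of how `likely()` and `unlikely()` are typically applied on a hot path. The macros below mirror the usual GCC/Clang `__builtin_expect` definitions; `demo_post()` and its error codes are hypothetical and do not come from the hns driver.

```c
#include <stddef.h>

/* Branch-prediction hints, as commonly defined on top of a GCC/Clang builtin. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/*
 * Hypothetical post routine: the error checks are marked unlikely so the
 * compiler can lay out the success path as the straight-line fall-through.
 */
static int demo_post(const int *wr, size_t nreq, size_t free_slots)
{
	if (unlikely(wr == NULL))
		return -1;		/* invalid request */
	if (unlikely(nreq > free_slots))
		return -2;		/* queue overflow */
	return 0;			/* common case */
}
```

The hints change only code layout, never the result of the condition, so they are safe to add wherever the common outcome is well known.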
2020-05-25RDMA/hns: Add CQ flag instead of independent enable flagLang Cheng3-11/+11
It's easier to understand and maintain the enable flags of a cq using a single field of type u32 than defining a field for every flag in the structure hns_roce_cq, and we can add new flags for features more conveniently in the future. Link: https://lore.kernel.org/r/1589982799-28728-3-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
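The idea of one flags word instead of one field per feature can be sketched as follows. All names here (`demo_cq`, the `DEMO_CQ_FLAG_*` bits, and the helpers) are hypothetical placeholders rather than the driver's actual definitions.

```c
#include <stdint.h>

/* Hypothetical feature bits; a new feature only needs a new bit here. */
#define DEMO_CQ_FLAG_RECORD_DB	(1u << 0)
#define DEMO_CQ_FLAG_NOTIFY	(1u << 1)

/* One 32-bit word replaces a separate enable field per feature. */
struct demo_cq {
	uint32_t flags;
};

static void demo_cq_enable(struct demo_cq *cq, uint32_t flag)
{
	cq->flags |= flag;
}

static int demo_cq_has(const struct demo_cq *cq, uint32_t flag)
{
	return (cq->flags & flag) != 0;
}
```

With this layout the structure does not grow when a feature is added, and several flags can be tested or copied in a single operation.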
2020-05-25RDMA/hns: Let software PI/CI grow naturallyLang Cheng1-5/+4
The hardware can truncate PI/CI when posting or polling, the driver does not need to do truncation. Therefore keep the software's PI/CI consistent with it in the hardware. Link: https://lore.kernel.org/r/1589982799-28728-2-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
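A generic sketch of free-running producer/consumer indices is shown below. It is not the hns code: `demo_queue` and its helpers are hypothetical, and the example assumes a power-of-two depth so that truncation is a simple mask applied only when a slot is looked up.

```c
#include <stdint.h>

/* Hypothetical queue: head (PI) and tail (CI) are never masked when stored. */
struct demo_queue {
	uint32_t head;	/* producer index, grows naturally */
	uint32_t tail;	/* consumer index, grows naturally */
	uint32_t depth;	/* number of slots, assumed to be a power of two */
};

/* Truncate only when an index is turned into a slot number. */
static uint32_t demo_slot(const struct demo_queue *q, uint32_t idx)
{
	return idx & (q->depth - 1);
}

/* Unsigned subtraction stays correct across 32-bit wraparound. */
static uint32_t demo_used(const struct demo_queue *q)
{
	return q->head - q->tail;
}

static int demo_post(struct demo_queue *q)
{
	if (demo_used(q) >= q->depth)
		return -1;	/* queue full */
	q->head++;		/* no masking on update */
	return 0;
}
```

Keeping the stored indices untruncated also makes the full and empty states distinguishable without sacrificing a slot.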
2020-05-22RDMA/rtrs: Get rid of the do_next_path while_next_path macrosDanil Kipnis1-16/+13
The macros do_each_path/while_each_path lead to a smatch warning: drivers/infiniband/ulp/rtrs/rtrs-clt.c:1196 rtrs_clt_failover_req() warn: inconsistent indenting drivers/infiniband/ulp/rtrs/rtrs-clt.c:2890 rtrs_clt_request() warn: inconsistent indenting Also checkpatch complains: ERROR: Macros with multiple statements should be enclosed in a do - while loop The macros are used only in two places: for a normal IO path and for the failover path triggered after errors. Get rid of the macros and just use a for loop iterating over the list of paths in both places. It is easier to read and also less lines of code. Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality") Link: https://lore.kernel.org/r/20200522053924.528980-1-danil.kipnis@cloud.ionos.com Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-22RDMA/rtrs: server: Use already dereferenced rtrs_sess structureMd Haris Iqbal1-2/+2
The rtrs_sess structure has already been extracted above from the rtrs_srv_sess structure. Use that to avoid redundant dereferencing. Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality") Link: https://lore.kernel.org/r/20200522082833.1480551-1-haris.phnx@gmail.com Signed-off-by: Md Haris Iqbal <haris.phnx@gmail.com> Acked-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-22RDMA/rnbd: Fix compilation error when CONFIG_MODULES is disabledDanil Kipnis1-4/+7
The module_is_live function is only defined when CONFIG_MODULES is enabled. Use try_module_get instead to check whether the module is being removed. When module unload and manual unmapping happen in parallel, we can try removing the symlink twice: rnbd_client_exit vs. rnbd_clt_unmap_dev_store. This is probably not the best way to deal with this race in general, but for now this fixes the compilation issue when CONFIG_MODULES is disabled and has no functional impact. Regression tests passed. Fixes: 1eb54f8f5dd8 ("block/rnbd: client: sysfs interface functions") Link: https://lore.kernel.org/r/20200521185909.457245-1-danil.kipnis@cloud.ionos.com Reported-by: Randy Dunlap <rdunlap@infradead.org> Suggested-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Danil Kipnis <danil.kipnis@cloud.ionos.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-22IB/cma: Fix ports memory leak in cma_configfsMaor Gottlieb1-0/+13
The allocated ports structure is never freed. The free function should be called by release_cma_ports_group, but the group is never released since we don't remove its default group. Remove default groups when the device group is deleted. Fixes: 045959db65c6 ("IB/cma: Add configfs for rdma_cm") Link: https://lore.kernel.org/r/20200521072650.567908-1-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-22block/rnbd: Fix an IS_ERR() vs NULL check in find_or_create_sess()Dan Carpenter1-5/+4
The alloc_sess() function returns error pointers, it never returns NULL. Fixes: f7a7a5c228d4 ("block/rnbd: client: main functionality") Link: https://lore.kernel.org/r/20200519120347.GD42765@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Jack Wang <jinpu.wang@cloud.ionos.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
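The class of bug being fixed can be illustrated with a user-space imitation of the kernel's error-pointer convention. The `ERR_PTR()`, `PTR_ERR()` and `IS_ERR()` helpers below are simplified stand-ins written for this sketch, and `demo_alloc_sess()` / `demo_find_or_create()` are hypothetical; only the pattern (check with `IS_ERR()`, not against NULL) reflects the commit.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified user-space stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct demo_sess {
	int id;
};

/* Hypothetical allocator: reports failure as an error pointer, never NULL. */
static struct demo_sess *demo_alloc_sess(int simulate_failure)
{
	struct demo_sess *sess;

	if (simulate_failure)
		return ERR_PTR(-ENOMEM);
	sess = calloc(1, sizeof(*sess));
	if (!sess)
		return ERR_PTR(-ENOMEM);
	return sess;
}

static int demo_find_or_create(int simulate_failure)
{
	struct demo_sess *sess = demo_alloc_sess(simulate_failure);

	/* A "!sess" test here would never fire and the error would be missed. */
	if (IS_ERR(sess))
		return (int)PTR_ERR(sess);
	free(sess);
	return 0;
}
```

Note that `IS_ERR(NULL)` is false, which is exactly why mixing the two conventions silently lets a failure through.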
2020-05-21IB/uverbs: Introduce create/destroy QP commands over ioctlYishai Hadas5-41/+405
Introduce create/destroy QP commands over the ioctl interface to let it be extended to get an asynchronous event FD. Link: https://lore.kernel.org/r/20200519072711.257271-8-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/uverbs: Introduce create/destroy WQ commands over ioctlYishai Hadas5-24/+198
Introduce create/destroy WQ commands over the ioctl interface to let it be extended to get an asynchronous event FD. Link: https://lore.kernel.org/r/20200519072711.257271-7-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/uverbs: Introduce create/destroy SRQ commands over ioctlYishai Hadas5-33/+238
Introduce create/destroy SRQ commands over the ioctl interface to let it be extended to get an asynchronous event FD. Link: https://lore.kernel.org/r/20200519072711.257271-6-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/uverbs: Extend CQ to get its own asynchronous event FDYishai Hadas2-3/+24
Extend CQ to get its own asynchronous event FD. The event FD is an optional attribute; if it is not given, the ufile event FD will be used. Link: https://lore.kernel.org/r/20200519072711.257271-4-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/uverbs: Refactor related objects to use their own asynchronous event FDYishai Hadas4-9/+39
Refactor related objects to use their own asynchronous event FD. The ufile event FD will be the default in case an object won't have its own event FD. Link: https://lore.kernel.org/r/20200519072711.257271-3-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21RDMA/core: Allow the ioctl layer to abort a fully created uobjectJason Gunthorpe9-57/+63
While creating a uobject every create reaches a point where the uobject is fully initialized. For ioctls that go on to copy_to_user this means they need to open code the destruction of a fully created uobject - ie the RDMA_REMOVE_DESTROY sort of flow. Open coding this creates bugs, eg the CQ does not properly flush the events list when it does its error unwind. Provide a uverbs_finalize_uobj_create() function which indicates that the uobject is fully initialized and that abort should call to destroy_hw to destroy the uobj->object and related. Methods can call this function if they go on to have error cases after setting uobj->object. Once done those error cases can simply do return, without an error unwind. Link: https://lore.kernel.org/r/20200519072711.257271-2-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21Merge tag 'v5.7-rc6' into rdma.git for-nextJason Gunthorpe579-3004/+5596
Linux 5.7-rc6 Conflict in drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c resolved by deleting dr_cq_event, matching how netdev resolved it. Required for dependencies in the following patches. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/hfi1: Enable the transmit side of the datagram ipoib netdevPiotr Stankiewicz2-0/+3
This patch hooks the transmit side of the datagram netdev with ipoib by setting the rdma_netdev_get_params function for the hfi1 ib_device_ops structure. It also enables the receiving side by adding the AIP capability into the default capabilities. Link: https://lore.kernel.org/r/20200511160712.173205.65700.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Piotr Stankiewicz <piotr.stankiewicz@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/ipoib: Add capability to switch between datagram and connected modeGary Leshner1-8/+10
This is the prerequisite modification to the ipoib ulp to allow a rdma netdev to obtain the default ndo ops for init/uninit/open/close. This is accomplished by setting the netdev ops field within the callback function passed to the netdev allocation routine which in turn was passed into the rdma netdev allocation routine. This allows the rdma netdev to call back into the ulp to create the resources required for connected mode operation. Additionally as the ulp is not re-entrant, when switching modes, the number of real tx queues is set to 1 for the connected mode. For datagram mode the number of real tx queues is set to the actual number of tx queues specified at the netdev's allocation. For the internal ulp netdev the number of tx queues defaults to 1. It is up to the rdma netdev to specify the actual number it can support. When the driver does not support a rdma netdev for acceleration, (-ENOTSUPPORTED return code or the verbs function for allocation is NULL) the ipoib ulp functions are unaffected by using the internal netdev allocated by the ipoib ulp. Link: https://lore.kernel.org/r/20200511160706.173205.19086.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/hfi1: Add packet histogram trace eventGrzegorz Andrejczuk3-1/+43
Add a simple trace event taking context number and building simple histogram to print packets distribution between contexts. Link: https://lore.kernel.org/r/20200511160700.173205.84270.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/{hfi1, ipoib, rdma}: Broadcast ping sent packets which exceeded mtu sizeGary Leshner3-0/+6
When in connected mode ipoib sent broadcast pings which exceeded the mtu size for broadcast addresses. Add an mtu attribute to the rdma_netdev structure which ipoib sets to its mcast mtu size. The RDMA netdev uses this value to determine if the skb length is too long for the mtu specified and if it is, drops the packet and logs an error about the errant packet. Link: https://lore.kernel.org/r/20200511160655.173205.14546.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/hfi1: Activate the dummy netdevGrzegorz Andrejczuk11-342/+178
As described in earlier patches, ipoib netdev will share receive contexts with existing VNIC netdev through a dummy netdev. The following changes are made to achieve that: - Set up netdev receive contexts after user contexts. A function is added to count the available netdev receive contexts. - Add functions to set/get receive map table free index. - Rename NUM_VNIC_MAP_ENTRIES as NUM_NETDEV_MAP_ENTRIES. - Let the dummy netdev own the receive contexts instead of VNIC. - Allocate the dummy netdev when the hfi1 device is added and free it when the device is removed. - Initialize AIP RSM rules when the IpoIb rxq is initialized and remove the rules when it is de-initialized. - Convert VNIC to use the dummy netdev. Link: https://lore.kernel.org/r/20200511160649.173205.4626.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/hfi1: Add rx functions for dummy netdevGrzegorz Andrejczuk5-2/+414
This patch adds the rx functions for the dummy netdev: - Functions to allocate/free the dummy netdev. - Functions to allocate/free receiving contexts for the netdev. - Functions to initialize/de-initialize the receive queue. - Functions to enable/disable the receive queue. Link: https://lore.kernel.org/r/20200511160643.173205.75087.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/hfi1: Add interrupt handler functions for accelerated ipoibGrzegorz Andrejczuk10-7/+206
This patch adds the interrupt handler function, the NAPI poll function, and its associated helper functions for receiving accelerated ipoib packets. While we are here, fix the formats of two error printouts. Link: https://lore.kernel.org/r/20200511160637.173205.64890.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/hfi1: Add functions to receive accelerated ipoib packetsKaike Wan7-2/+355
Ipoib netdev will share receive contexts with existing VNIC netdev. To achieve that, a dummy netdev is allocated with hfi1_devdata to own the receive contexts, and ipoib and VNIC netdevs will be put on top of it. Each receive context is associated with a single NAPI object. This patch adds the functions to receive incoming packets for accelerated ipoib. Link: https://lore.kernel.org/r/20200511160631.173205.54184.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/hfi1: Rename num_vnic_contexts as num_netdev_contextsGrzegorz Andrejczuk4-20/+20
Rename num_vnic_contexts as num_netdev_contexts since VNIC and ipoib will share the same set of receive contexts. Link: https://lore.kernel.org/r/20200511160625.173205.53306.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/ipoib: Increase ipoib Datagram mode MTU's upper limitKaike Wan4-23/+10
Currently the ipoib UD mtu is restricted to 4K bytes. Remove this limitation so that the IPOIB module can potentially use an MTU (in UD mode) that is bounded by the MTU of the underlying device. A field is added to the ib_port_attr structure to indicate the maximum physical MTU the underlying device supports. Link: https://lore.kernel.org/r/20200511160618.173205.23053.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/hfi1: RSM rules for AIPGrzegorz Andrejczuk4-50/+136
This is the implementation of the RSM rule for AIP packets. The AIP rule will use rule RSM2 and will match a standard Infiniband packet containing BTH (LNH==BTH) and having Dest QPN prefixed with value 0x81. Spread between receive contexts will be done using source QPN bits. VNIC and AIP will share receive contexts, so their rules will point to the same RMT entries and their shared code is moved to separate functions. If any of the rules is active, RMT mapping will be skipped for the latter. Changed function hfi1_vnic_is_rsm_full to be more general and moved it from the main header to chip.c. Changed the order of RSM rules because the AIP rule, as the more specific one, needs to be placed before the more general QOS rule. Rules occupy the last two RSM registers. Link: https://lore.kernel.org/r/20200511160612.173205.73002.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/{rdmavt, hfi1}: Implement creation of accelerated UD QPsGary Leshner3-6/+23
Adds capability to create a qpn to be recognized as an accelerated UD QP for ipoib. This is accomplished by reserving 0x81 in byte[0] of the qpn as the prefix for these qp types and reserving qpns between 0x810000 and 0x81ffff. The hfi1 capability mask already contained a flag for the VNIC netdev. This has been renamed and extended to include both VNIC and ipoib. The rvt code to allocate qps now recognizes this flag and sets 0x81 into byte[0] of the qpn. The code to allocate qpns is modified to reset the qpn numbering when it is detected that a value is located in byte[0] for a UD QP and it is a qpn being requested for net dev use. If it is a regular UD QP then it is allowable to have bits set in byte[0] of the qpn and provide the previously normal behavior. The code to free the qpn now checks for the AIP prefix value of 0x81 and removes it from the qpn before being freed so that the lower 16 bit number can be reused. This patch requires minor changes in the IB core and ipoib to facilitate the creation of accelerated UD QPs. Link: https://lore.kernel.org/r/20200511160607.173205.11757.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
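The prefix arithmetic described above can be sketched in a few lines. This is an illustration only: the macro and function names are hypothetical, and it assumes that "byte[0]" refers to the most significant byte of the 24-bit QPN, which is what the stated range 0x810000 to 0x81ffff implies.

```c
#include <stdint.h>

/* Hypothetical names; values follow the range given in the commit message. */
#define DEMO_AIP_QP_PREFIX	0x81u
#define DEMO_AIP_QP_BASE	(DEMO_AIP_QP_PREFIX << 16)	/* 0x810000 */
#define DEMO_AIP_QP_MAX		(DEMO_AIP_QP_BASE | 0xffffu)	/* 0x81ffff */

/* Build an accelerated QPN from a 16-bit allocator index. */
static uint32_t demo_make_aip_qpn(uint32_t low16)
{
	return DEMO_AIP_QP_BASE | (low16 & 0xffffu);
}

/* Recognize the prefix in the top byte of the 24-bit QPN. */
static int demo_is_aip_qpn(uint32_t qpn)
{
	return ((qpn >> 16) & 0xffu) == DEMO_AIP_QP_PREFIX;
}

/* On free, drop the prefix so the lower 16-bit number can be reused. */
static uint32_t demo_strip_aip_prefix(uint32_t qpn)
{
	return qpn & 0xffffu;
}
```

A receive-side rule that matches on the 0x81 prefix can then steer these QPs without consulting any per-QP state.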
2020-05-21IB/hfi1: Remove module parameter for KDETH qpnsGary Leshner6-33/+11
The module parameter for KDETH qpns is being removed in favor of always using the default value of 0x80 as the qpn prefix. Defines have been added for various KDETH values including the prefix of 0x80. The reserved range now starts at the base value for KDETH qpns (0x80) and extends up to and including the last qpn for other reserved QP prefixed types. Adjust other QP prefixed define names to match KDETH defined names. Link: https://lore.kernel.org/r/20200511160600.173205.27508.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/hfi1: Add the transmit side of a datagram ipoib RDMA netdevGary Leshner3-0/+289
This implements the transmit side of the multiple transmit queue RDMA netdev used to accelerate ipoib. The receive side remains the ipoib internal implementation. The init/uninit/open/stop netdev operations are saved off and called by the versions within the hfi1 netdev in order to initialize the connected mode resources present in ipoib thus allowing us to switch modes between datagram and connected. The datagram queue pair instantiated by the ipoib ulp is used by this implementation for its queue pair number and to register with multicast. The above queue pair is not used on transmit other than its qpn as the verbs layer is skipped and packets are directly submitted to the sdma engines. Link: https://lore.kernel.org/r/20200511160554.173205.1369.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/hfi1: Add functions to transmit datagram ipoib packetsGary Leshner4-1/+983
This patch implements the mechanism to accelerate the transmit side of a multiple transmit queue RDMA netdev by submitting the packets to the SDMA engine directly instead of sending through the verbs layer. This patch also changes the UD/SEND_ONLY op to output the entropy value in byte 0 of deth[1]. UD/SEND_ONLY_WITH_IMMEDIATE uses the previous behavior with no entropy value being output. The code in the ipoib rdma netdev which submits tx requests upon successful submission will call trace_sdma_output_ibhdr to output the ibhdr to the trace buffer. Link: https://lore.kernel.org/r/20200511160548.173205.45616.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/hfi1: Add accelerated IP capability bitKaike Wan1-2/+3
The accelerated IP capability bit is added to allow users to control which feature is enabled and disabled. Link: https://lore.kernel.org/r/20200511160541.173205.96870.stgit@awfm-01.aw.intel.com Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21RDMA/efa: Report host information to the deviceGal Pressman4-10/+130
The host info feature allows the driver to inform the EFA device firmware of the system configuration for debugging and troubleshooting purposes. The host info buffer is passed as an admin command DMA mapped control buffer, and is unmapped and freed once the command CQE is consumed. Currently, the setting of host info is done for each device on its probe. Failing to set the host info for the device shall not disturb the probe flow; any errors will be discarded. Link: https://lore.kernel.org/r/20200512152204.93091-3-galpress@amazon.com Reviewed-by: Firas JahJah <firasj@amazon.com> Reviewed-by: Guy Tzalik <gtzalik@amazon.com> Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>