aboutsummaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw/mlx5/qp.c
AgeCommit message (Collapse)AuthorFilesLines
2019-02-07IB/mlx5: Simplify WQE count power of two checkGal Pressman1-3/+3
Use is_power_of_2() instead of hard coding it in the driver. While at it, fix the meaningless error print. Signed-off-by: Gal Pressman <[email protected]> Acked-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-02-05IB/mlx5: Do not use hw_access_flags for be and CPU dataBart Van Assche1-8/+8
Avoid that sparse reports the following for the mlx5 driver: drivers/infiniband/hw/mlx5/qp.c:2671:34: warning: invalid assignment: |= drivers/infiniband/hw/mlx5/qp.c:2671:34: left side has type restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2671:34: right side has type int drivers/infiniband/hw/mlx5/qp.c:2679:34: warning: invalid assignment: |= drivers/infiniband/hw/mlx5/qp.c:2679:34: left side has type restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2679:34: right side has type int drivers/infiniband/hw/mlx5/qp.c:2680:34: warning: invalid assignment: |= drivers/infiniband/hw/mlx5/qp.c:2680:34: left side has type restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2680:34: right side has type int drivers/infiniband/hw/mlx5/qp.c:2684:34: warning: invalid assignment: |= drivers/infiniband/hw/mlx5/qp.c:2684:34: left side has type restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2684:34: right side has type int drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: incorrect type in argument 1 (different base types) drivers/infiniband/hw/mlx5/qp.c:2686:28: expected unsigned int [usertype] val drivers/infiniband/hw/mlx5/qp.c:2686:28: got restricted __be32 [usertype] drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32 This patch does not change any functionality. Fixes: a60109dc9a95 ("IB/mlx5: Add support for extended atomic operations") Signed-off-by: Bart Van Assche <[email protected]> Acked-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-02-04Merge tag 'v5.0-rc5' into rdma.git for-nextJason Gunthorpe1-7/+9
Linux 5.0-rc5 Needed to merge the include/uapi changes so we have an up to date single-tree for these files. Patches already posted are also expected to need this for dependencies.
2019-02-04IB/mlx5: Let read user wqe also from SRQ bufferMoni Shoua1-47/+145
Reading a WQE from SRQ is almost identical to reading from regular RQ. The differences are the size of the queue, the size of a WQE and buffer location. Make necessary changes to mlx5_ib_read_user_wqe() to let it read a WQE from a SRQ or RQ by caller choice. Signed-off-by: Moni Shoua <[email protected]> Reviewed-by: Majd Dibbiny <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-01-21RDMA/mlx5: Fix check for supported user flags when creating a QPMark Bloch1-7/+9
When the flags verification was added two flags were missed from the check: * MLX5_QP_FLAG_TIR_ALLOW_SELF_LB_UC * MLX5_QP_FLAG_TIR_ALLOW_SELF_LB_MC This causes user applications that were using these flags to break. Fixes: 2e43bb31b8df ("IB/mlx5: Verify that driver supports user flags") Signed-off-by: Mark Bloch <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-01-10IB/{core,hw}: Have ib_umem_get extract the ib_ucontext from ib_udataJason Gunthorpe1-22/+18
ib_umem_get() can only be called in a method callback, which always has a udata parameter. This allows ib_umem_get() to derive the ucontext pointer directly from the udata without requiring the drivers to find it in some way or another. Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Shamir Rabinovitch <[email protected]>
2019-01-02IB/mlx5: Allow XRC INI usage via verbs in DEVX contextYishai Hadas1-1/+2
From device point of view both XRC target and initiator are XRC transport type. Fix to use the expected UID as was handled for the XRC target case to allow its usage via verbs in DEVX context. Fixes: 5aa3771ded54 ("IB/mlx5: Allow XRC usage via verbs in DEVX context") Signed-off-by: Yishai Hadas <[email protected]> Reviewed-by: Artemy Kovalyov <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-12-28Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds1-183/+264
Pull rdma updates from Jason Gunthorpe: "This has been a fairly typical cycle, with the usual sorts of driver updates. Several series continue to come through which improve and modernize various parts of the core code, and we finally are starting to get the uAPI command interface cleaned up. - Various driver fixes for bnxt_re, cxgb3/4, hfi1, hns, i40iw, mlx4, mlx5, qib, rxe, usnic - Rework the entire syscall flow for uverbs to be able to run over ioctl(). Finally getting past the historic bad choice to use write() for command execution - More functional coverage with the mlx5 'devx' user API - Start of the HFI1 series for 'TID RDMA' - SRQ support in the hns driver - Support for new IBTA defined 2x lane widths - A big series to consolidate all the driver function pointers into a big struct and have drivers provide a 'static const' version of the struct instead of open coding initialization - New 'advise_mr' uAPI to control device caching/loading of page tables - Support for inline data in SRPT - Modernize how umad uses the driver core and creates cdev's and sysfs files - First steps toward removing 'uobject' from the view of the drivers" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (193 commits) RDMA/srpt: Use kmem_cache_free() instead of kfree() RDMA/mlx5: Signedness bug in UVERBS_HANDLER() IB/uverbs: Signedness bug in UVERBS_HANDLER() IB/mlx5: Allocate the per-port Q counter shared when DEVX is supported IB/umad: Start using dev_groups of class IB/umad: Use class_groups and let core create class file IB/umad: Refactor code to use cdev_device_add() IB/umad: Avoid destroying device while it is accessed IB/umad: Simplify and avoid dynamic allocation of class IB/mlx5: Fix wrong error unwind IB/mlx4: Remove set but not used variable 'pd' RDMA/iwcm: Don't copy past the end of dev_name() string IB/mlx5: Fix long EEH recover time with NVMe offloads IB/mlx5: Simplify netdev unbinding IB/core: Move query port to ioctl RDMA/nldev: Expose port_cap_flags2 IB/core: uverbs copy to struct or zero helper IB/rxe: Reuse code which sets port state IB/rxe: Make counters thread safe IB/mlx5: Use the correct commands for UMEM and UCTX allocation ...
2018-12-18RDMA: Cleanup undesired pd->uobject usageShamir Rabinovitch1-5/+5
Drivers should be using udata to determine if a method is invoked from user space or kernel space. A pd does not necessarily say a different objects is kernel or user. Transforming the tests to use udata eliminates a large number of uobject references from the drivers. Signed-off-by: Shamir Rabinovitch <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-12-18RDMA/mlx5: Delete unreachable handle_atomic code by simplifying SW completionLeon Romanovsky1-1/+0
Handle atomic was left as unimplemented from 2013, remove the code till this part will be developed. Remove the dead code by simplifying SW completion logic which is supposed to be the same for send and receive paths. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Reported-by: Stephen Rothwell <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Tested-by: Stephen Rothwell <[email protected]> # compile tested Signed-off-by: Jason Gunthorpe <[email protected]>
2018-12-14net/mlx5: Make RoCE and SR-IOV LAG modes explicitAviv Heller1-1/+1
With the introduction of SR-IOV LAG, checking whether LAG is active is no longer good enough, since RoCE and SR-IOV LAG each entails different behavior by both the core and infiniband drivers. This patch introduces facilities to discern LAG type, in addition to mlx5_lag_is_active(). These are implemented in such a way as to allow more complex mode combinations in the future. Signed-off-by: Aviv Heller <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-12-11Merge tag 'v4.20-rc6' into rdma.git for-nextJason Gunthorpe1-11/+11
For dependencies in following patches.
2018-12-11IB/core: Add new IB ratesMichael Guralnik1-1/+1
Add the new rates that were added to Infiniband spec as part of HDR and 2x support. Signed-off-by: Michael Guralnik <[email protected]> Reviewed-by: Majd Dibbiny <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-12-07Merge branch 'mlx5-packet-credit-fc' into rdma.gitJason Gunthorpe1-2/+13
Danit Goldberg says: Packet based credit mode Packet based credit mode is an alternative end-to-end credit mode for QPs set during their creation. Credits are transported from the responder to the requester to optimize the use of its receive resources. In packet-based credit mode, credits are issued on a per packet basis. The advantage of this feature comes while sending large RDMA messages through switches that are short in memory. The first commit exposes QP creation flag and the HCA capability. The second commit adds support for a new DV QP creation flag. The last commit report packet based credit mode capability via the MLX5DV device capabilities. * branch 'mlx5-packet-credit-fc': IB/mlx5: Report packet based credit mode device capability IB/mlx5: Add packet based credit mode support net/mlx5: Expose packet based credit mode Signed-off-by: Jason Gunthorpe <[email protected]>
2018-12-07IB/mlx5: Add packet based credit mode supportDanit Goldberg1-2/+13
The device can support two credit modes, message based (default) and packet based. In order to enable packet based mode, the QP should be created with special flag that indicates this. This patch adds support for the new DV QP creation flag that can be used for RC QPs in order to change the credit mode. Signed-off-by: Danit Goldberg <[email protected]> Reviewed-by: Majd Dibbiny <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-12-04IB/mlx5: Allow XRC usage via verbs in DEVX contextYishai Hadas1-7/+5
Allows XRC usage from the verbs flow in a DEVX context. As XRCD is some shared kernel resource between processes it should be created with UID=0 to point on that. As a result once XRC QP/SRQ are created they must be used as well with UID=0 so that firmware will allow the XRCD usage. Signed-off-by: Yishai Hadas <[email protected]> Reviewed-by: Artemy Kovalyov <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-11-29IB/mlx5: Use fragmented QP's buffer for in-kernel usersGuy Levi1-166/+239
The current implementation of create QP requires contiguous memory, such a requirement is problematic once the memory is fragmented or the system is low in memory, it causes failures in dma_zalloc_coherent(). This patch takes advantage of the new mlx5_core API which allocates a fragmented buffer. This makes the QP creation much more resilient to memory fragmentation. Data-path code was adapted to the fact that WQEs can cross buffers. We also use the opportunity to fix some cosmetic legacy coding convention errors which were in the feature scope. Signed-off-by: Guy Levi <[email protected]> Reviewed-by: Majd Dibbiny <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-11-21IB/mlx5: Allow modify AV in DCI QP to RTRArtemy Kovalyov1-1/+1
This is required so the user can set the SL on the DC QP. Signed-off-by: Artemy Kovalyov <[email protected]> Reviewed-by: Yossi Itigin <[email protected]> Reviewed-by: Majd Dibbiny <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-11-21IB/mlx5: Fix XRC QP support after introducing extended atomicYonatan Cohen1-2/+1
Extended atomics are supported with RC and XRC QP types, but the commit citied in the Fixes line added an unneeded check to to_mlx5_access_flags. This broke XRC QPs. The following ib_atomic_bw invocation over XRC reproduces the issue: ib_atomic_bw -d mlx5_1 --connection=XRC --atomic_type=FETCH_AND_ADD It is safe to remove such checks because the QP type was already checked in ib_modify_qp_is_ok(), which was previously called from mlx5_ib_modify_qp. Fixes: a60109dc9a95 ("IB/mlx5: Add support for extended atomic operations") Signed-off-by: Yonatan Cohen <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-11-21RDMA/mlx5: Fix fence type for IB_WR_LOCAL_INV WRMajd Dibbiny1-9/+10
Currently, for IB_WR_LOCAL_INV WR, when the next fence is None, the current fence will be SMALL instead of Normal Fence. Without this patch krping doesn't work on CX-5 devices and throws following error: The error messages are from CX5 driver are: (from server side) [ 710.434014] mlx5_0:dump_cqe:278:(pid 2712): dump error cqe [ 710.434016] 00000000 00000000 00000000 00000000 [ 710.434016] 00000000 00000000 00000000 00000000 [ 710.434017] 00000000 00000000 00000000 00000000 [ 710.434018] 00000000 93003204 100000b8 000524d2 [ 710.434019] krping: cq completion failed with wr_id 0 status 4 opcode 128 vender_err 32 Fixed the logic to set the correct fence type. Fixes: 6e8484c5cf07 ("RDMA/mlx5: set UMR wqe fence according to HCA cap") Signed-off-by: Majd Dibbiny <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-10-17IB/mlx5: Add support for extended atomic operationsYonatan Cohen1-12/+84
Extended atomic operations cmp&swp and fetch&add is a Mellanox feature extending the standard atomic operation to use, varied operand sizes, as apposed to normal atomic operation that use an 8 byte operand only. Extended atomics allows masking the results and arguments. This patch configures QP to support extended atomic operation with the maximum size possible, as exposed by HCA capabilities. Signed-off-by: Yonatan Cohen <[email protected]> Reviewed-by: Guy Levi <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-10-17IB/mlx5: Allow scatter to CQE without global signaled WRsYonatan Cohen1-3/+11
Requester scatter to CQE is restricted to QPs configured to signal all WRs. This patch adds ability to enable scatter to cqe (force enable) in the requester without sig_all, for users who do not want all WRs signaled but rather just the ones whose data found in the CQE. Signed-off-by: Yonatan Cohen <[email protected]> Reviewed-by: Guy Levi <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-10-17IB/mlx5: Verify that driver supports user flagsYonatan Cohen1-0/+14
Flags sent down from user might not be supported by running driver. This might lead to unwanted bugs. To solve this, added macro to test for unsupported flags. Signed-off-by: Yonatan Cohen <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-10-17IB/mlx5: Support scatter to CQE for DC transport typeYonatan Cohen1-19/+52
Scatter to CQE is a HW offload that saves PCI writes by scattering the payload to the CQE. This patch extends already existing functionality to support DC transport type. Signed-off-by: Yonatan Cohen <[email protected]> Reviewed-by: Guy Levi <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-10-16RDMA/mlx5: Remove extraneous error checkGal Pressman1-2/+1
Remove double error check from create user RQ error flow. Fixes: 79b20a6c3014 ("IB/mlx5: Add receive Work Queue verbs") Signed-off-by: Gal Pressman <[email protected]> Reviewed-by: Majd Dibbiny <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-10-03RDMA: Remove unused parameter from ib_modify_qp_is_ok()Kamal Heib1-3/+2
The ll parameter is not used in ib_modify_qp_is_ok(), so remove it. Signed-off-by: Kamal Heib <[email protected]> Reviewed-by: Dennis Dalessandro <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-09-27IB/mlx5: Expose RAW QP device handles to user spaceYishai Hadas1-2/+36
Expose RAW QP device handles to user space by extending the UHW part of mlx5_ib_create_qp_resp. This data is returned only when DEVX context is used where it may be applicable. Signed-off-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-09-26RDMA/drivers: Use dev_err/dbg/etc instead of pr_* + ibdev->nameJason Gunthorpe1-4/+6
Kernel convention is that a driver for a subsystem will print using dev_* on the subsystem's struct device, or with dev_* on the physical device. Drivers should rarely use a pr_* function. Signed-off-by: Jason Gunthorpe <[email protected]>
2018-09-25IB/mlx5: Set uid as part of XRCD commandsYishai Hadas1-2/+6
Set uid as part of XRCD commands so that the firmware can manage the XRCD object in a secured way. That will enable using an XRCD that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-09-25IB/mlx5: Set uid as part of RQT commandsYishai Hadas1-2/+5
Set uid as part of RQT commands so that the firmware can manage the RQT object in a secured way. That will enable using an RQT that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-09-25IB/mlx5: Set uid as part of TIS commandsYishai Hadas1-10/+17
Set uid as part of TIS commands so that the firmware can manage the TIS object in a secured way. That will enable using a TIS that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-09-25IB/mlx5: Set uid as part of TIR commandsYishai Hadas1-9/+15
Set uid as part of TIR commands so that the firmware can manage the TIR object in a secured way. That will enable using a TIR that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-09-25IB/mlx5: Set uid as part of DCT commandsYishai Hadas1-0/+1
Set uid as part of DCT create command so that the firmware can manage the DCT object in a secured way. The uid for the destroy and drain commands are set by mlx5_core. That will enable using a DCT that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-09-25IB/mlx5: Set uid as part of SQ commandsYishai Hadas1-5/+7
Set uid as part of SQ commands so that the firmware can manage the SQ object in a secured way. The uid for the destroy command is set by mlx5_core. This will enable using an SQ that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-09-25IB/mlx5: Set uid as part of RQ commandsYishai Hadas1-6/+11
Set uid as part of RQ commands so that the firmware can manage the RQ object in a secured way. The uid for the destroy command is set by mlx5_core. This will enable using an RQ that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-09-25IB/mlx5: Set uid as part of QP creationYishai Hadas1-0/+1
Set uid as part of QP creation so that the firmware can manage the QP object in a secured way. The uid for the destroy and the modify commands is set by mlx5_core. This will enable using a QP that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-09-21Merge branch 'mlx5-vport-loopback' into rdma.getDoug Ledford1-20/+76
For dependencies, branch based on 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5 mcast/ucast loopback control enhancements from Leon Romanovsky: ==================== This is short series from Mark which extends handling of loopback traffic. Originally mlx5 IB dynamically enabled/disabled both unicast and multicast based on number of users. However RAW ethernet QPs need more granular access. ==================== Fixed failed automerge in mlx5_ib.h (minor context conflict issue) mlx5-vport-loopback branch: RDMA/mlx5: Enable vport loopback when user context or QP mandate RDMA/mlx5: Allow creating RAW ethernet QP with loopback support RDMA/mlx5: Refactor transport domain bookkeeping logic net/mlx5: Rename incorrect naming in IFC file Signed-off-by: Doug Ledford <[email protected]>
2018-09-21RDMA/mlx5: Enable vport loopback when user context or QP mandateMark Bloch1-7/+27
A user can create a QP which can accept loopback traffic, but that's not enough. We need to enable loopback on the vport as well. Currently vport loopback is enabled only when more than 1 users are using the IB device, update the logic to consider whatever a QP which supports loopback was created, if so enable vport loopback even if there is only a single user. Signed-off-by: Mark Bloch <[email protected]> Reviewed-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-09-21RDMA/mlx5: Allow creating RAW ethernet QP with loopback supportMark Bloch1-13/+49
Expose two new flags: MLX5_QP_FLAG_TIR_ALLOW_SELF_LB_UC MLX5_QP_FLAG_TIR_ALLOW_SELF_LB_MC Those flags can be used at creation time in order to allow a QP to be able to receive loopback traffic (unicast and multicast). We store the state in the QP to be used on the destroy path to indicate with which flags the QP was created with. Signed-off-by: Mark Bloch <[email protected]> Reviewed-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-09-22net/mlx5: Rename incorrect naming in IFC fileMark Bloch1-2/+2
Remove a trailing underscore from the multicast/unicast names. Signed-off-by: Mark Bloch <[email protected]> Reviewed-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2018-09-12IB/mlx5: Allow transition of DCI QP to resetMoni Shoua1-1/+3
The transition is allowed from any state and the atrribute mask must be IB_QP_STATE. Fixes: c32a4f296e1d ("IB/mlx5: Add support for DC Initiator QP") Signed-off-by: Moni Shoua <[email protected]> Reviewed-by: Artemy Kovalyov <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-09-06IB/mlx5: Don't hold spin lock while checking device stateParav Pandit1-14/+12
mdev->state device state is not protected by the QP for which WRs are being processed. Therefore, there is no need to hold spin lock while checking mdev state. Given that device fatal error is unlikely situation, wrap the condition check with unlikely(). Additionally, kernel QP1 is also a kernel ULP for which soft CQEs needs to be generated. Therefore, check for device fatal error before processing QP1 work requests. Fixes: 89ea94a7b6c4 ("IB/mlx5: Reset flow support for IB kernel ULPs") Signed-off-by: Parav Pandit <[email protected]> Reviewed-by: Daniel Jurgens <[email protected]> Reviewed-by: Maor Gottlieb <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-09-04IB/mlx5: Change TX affinity assignment in RoCE LAG modeMajd Dibbiny1-4/+33
In the current code, the TX affinity is per RoCE device, which can cause unfairness between different contexts. e.g. if we open two contexts, and each open 10 QPs concurrently, all of the QPs of the first context might end up on the first port instead of distributed on the two ports as expected To overcome this unfairness between processes, we maintain per device TX affinity, and per process TX affinity. The allocation algorithm is as follow: 1. Hold two tx_port_affinity atomic variables, one per RoCE device and one per ucontext. Both initialized to 0. 2. In mlx5_ib_alloc_ucontext do: 2.1. ucontext.tx_port_affinity = device.tx_port_affinity 2.2. device.tx_port_affinity += 1 3. In modify QP INIT2RST: 3.1. qp.tx_port_affinity = ucontext.tx_port_affinity % MLX5_PORT_NUM 3.2. ucontext.tx_port_affinity += 1 Signed-off-by: Majd Dibbiny <[email protected]> Reviewed-by: Moni Shoua <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-08-14IB/mlx5: Fix leaking stack memory to userspaceJason Gunthorpe1-1/+1
mlx5_ib_create_qp_resp was never initialized and only the first 4 bytes were written. Fixes: 41d902cb7c32 ("RDMA/mlx5: Fix definition of mlx5_ib_create_qp_resp") Cc: <[email protected]> Acked-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-08-08RDMA/mlx5: Fix shift overflow in mlx5_ib_create_wqLeon Romanovsky1-1/+3
[ 61.182439] UBSAN: Undefined behaviour in drivers/infiniband/hw/mlx5/qp.c:5366:34 [ 61.183673] shift exponent 4294967288 is too large for 32-bit type 'unsigned int' [ 61.185530] CPU: 0 PID: 639 Comm: qp Not tainted 4.18.0-rc1-00037-g4aa1d69a9c60-dirty #96 [ 61.186981] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014 [ 61.188315] Call Trace: [ 61.188661] dump_stack+0xc7/0x13b [ 61.190427] ubsan_epilogue+0x9/0x49 [ 61.190899] __ubsan_handle_shift_out_of_bounds+0x1ea/0x22f [ 61.197040] mlx5_ib_create_wq+0x1c99/0x1d50 [ 61.206632] ib_uverbs_ex_create_wq+0x499/0x820 [ 61.213892] ib_uverbs_write+0x77e/0xae0 [ 61.248018] vfs_write+0x121/0x3b0 [ 61.249831] ksys_write+0xa1/0x120 [ 61.254024] do_syscall_64+0x7c/0x2a0 [ 61.256178] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 61.259211] RIP: 0033:0x7f54bab70e99 [ 61.262125] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 [ 61.268678] RSP: 002b:00007ffe1541c318 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 61.271076] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f54bab70e99 [ 61.273795] RDX: 0000000000000070 RSI: 0000000020000240 RDI: 0000000000000003 [ 61.276982] RBP: 00007ffe1541c330 R08: 00000000200078e0 R09: 0000000000000002 [ 61.280035] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004005c0 [ 61.283279] R13: 00007ffe1541c420 R14: 0000000000000000 R15: 0000000000000000 Cc: <[email protected]> # 4.7 Fixes: 79b20a6c3014 ("IB/mlx5: Add receive Work Queue verbs") Cc: syzkaller <[email protected]> Reported-by: Noa Osherovich <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Kees Cook <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-07-30RDMA, core and ULPs: Declare ib_post_send() and ib_post_recv() arguments constBart Van Assche1-10/+11
Since neither ib_post_send() nor ib_post_recv() modify the data structure their second argument points at, declare that argument const. This change makes it necessary to declare the 'bad_wr' argument const too and also to modify all ULPs that call ib_post_send(), ib_post_recv() or ib_post_srq_recv(). This patch does not change any functionality but makes it possible for the compiler to verify whether the ib_post_(send|recv|srq_recv) really do not modify the posted work request. To make this possible, only one cast had to be introduce that casts away constness, namely in rpcrdma_post_recvs(). The only way I can think of to avoid that cast is to introduce an additional loop in that function or to change the data type of bad_wr from struct ib_recv_wr ** into int (an index that refers to an element in the work request list). However, both approaches would require even more extensive changes than this patch. Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Chuck Lever <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-07-30IB/mlx5, ib_post_send(), IB_WR_REG_SIG_MR: Do not modify the 'wr' argumentBart Van Assche1-12/+18
Since the next patch will constify the wr pointer, do not modify the data that pointer points at. Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Cc: Saeed Mahameed <[email protected]> Acked-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-07-30RDMA: Constify the argument of the work request conversion functionsBart Van Assche1-15/+16
When posting a send work request, the work request that is posted is not modified by any of the RDMA drivers. Make this explicit by constifying most ib_send_wr pointers in RDMA transport drivers. Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Reviewed-by: Steve Wise <[email protected]> Reviewed-by: Dennis Dalessandro <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-07-13RDMA/mlx5: Check that supplied blue flame index doesn't overflowLeon Romanovsky1-7/+8
User's supplied index is checked again total number of system pages, but this number already includes num_static_sys_pages, so addition of that value to supplied index causes to below error while trying to access sys_pages[]. BUG: KASAN: slab-out-of-bounds in bfregn_to_uar_index+0x34f/0x400 Read of size 4 at addr ffff880065561904 by task syz-executor446/314 CPU: 0 PID: 314 Comm: syz-executor446 Not tainted 4.18.0-rc1+ #256 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 Call Trace: dump_stack+0xef/0x17e print_address_description+0x83/0x3b0 kasan_report+0x18d/0x4d0 bfregn_to_uar_index+0x34f/0x400 create_user_qp+0x272/0x227d create_qp_common+0x32eb/0x43e0 mlx5_ib_create_qp+0x379/0x1ca0 create_qp.isra.5+0xc94/0x22d0 ib_uverbs_create_qp+0x21b/0x2a0 ib_uverbs_write+0xc2c/0x1010 vfs_write+0x1b0/0x550 ksys_write+0xc6/0x1a0 do_syscall_64+0xa7/0x590 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x433679 Code: fd ff 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b 91 fd ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fff2b3d8e48 EFLAGS: 00000217 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00000000004002f8 RCX: 0000000000433679 RDX: 0000000000000040 RSI: 0000000020000240 RDI: 0000000000000003 RBP: 00000000006d4018 R08: 00000000004002f8 R09: 00000000004002f8 R10: 00000000004002f8 R11: 0000000000000217 R12: 0000000000000000 R13: 000000000040cb00 R14: 000000000040cb90 R15: 0000000000000006 Allocated by task 314: kasan_kmalloc+0xa0/0xd0 __kmalloc+0x1a9/0x510 mlx5_ib_alloc_ucontext+0x966/0x2620 ib_uverbs_get_context+0x23f/0xa60 ib_uverbs_write+0xc2c/0x1010 __vfs_write+0x10d/0x720 vfs_write+0x1b0/0x550 ksys_write+0xc6/0x1a0 do_syscall_64+0xa7/0x590 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 1: __kasan_slab_free+0x12e/0x180 kfree+0x159/0x630 kvfree+0x37/0x50 single_release+0x8e/0xf0 __fput+0x2d8/0x900 task_work_run+0x102/0x1f0 exit_to_usermode_loop+0x159/0x1c0 do_syscall_64+0x408/0x590 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff880065561100 which belongs to the cache kmalloc-4096 of size 4096 The buggy address is located 2052 bytes inside of 4096-byte region [ffff880065561100, ffff880065562100) The buggy address belongs to the page: page:ffffea0001955800 count:1 mapcount:0 mapping:ffff88006c402480 index:0x0 compound_mapcount: 0 flags: 0x4000000000008100(slab|head) raw: 4000000000008100 ffffea0001a7c000 0000000200000002 ffff88006c402480 raw: 0000000000000000 0000000080070007 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff880065561800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff880065561880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff880065561900: 04 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff880065561980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff880065561a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc Cc: <[email protected]> # 4.15 Fixes: 1ee47ab3e8d8 ("IB/mlx5: Enable QP creation with a given blue flame index") Reported-by: Noa Osherovich <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-07-13RDMA/mlx5: Melt consecutive calls to alloc_bfreg() in one callLeon Romanovsky1-35/+12
There is no need for three consecutive calls to alloc_bfreg(). It can be implemented with one function. Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>