Age | Commit message (Collapse) | Author | Files | Lines |
|
Rename RVT AETH defines and export in rdma/ib_hdrs.h
Reviewed-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Don Hiatt <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Return usec from an index into ib_rvt_rnr_table.
Reviewed-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Don Hiatt <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
To move common code across target to rdmavt for code reuse.
Reviewed-by: Mike Marciniszyn <[email protected]>
Reviewed-by: Brian Welty <[email protected]>
Signed-off-by: Venkata Sandeep Dhanalakota <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Add rvt_rc_error() and rvt_comm_est() as shared functions in
rdmavt, moved from hfi1/qib logic.
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Brian Welty <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
rvt_create_qp() creates qp->ip only when a qp creation request comes from
userspace (udata is not NULL). If we exceed the number of available
queue pairs however, the error path always attempts to put a kref to this
structure. If the requestor is inside the kernel, this leads to a crash.
We fix this by checking that qp->ip is not NULL before caling kref_put().
Signed-off-by: Jim Foraker <[email protected]>
Acked-by: Dennis Dalessandro <[email protected]>
Acked-by: Jonathan Toppins <[email protected]>
Acked-by: Alex Estrin <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
This is for use by client drivers to drive
send completions into a CQ.
A new exported table allows for the mapping
of ib_wr_opcode into a ib_wc_opcode.
Reviewed-by: Ashutosh Dixit <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Corrected function name in comment from qib_ to rvt_.
Signed-off-by: Parav Pandit <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
This patch adds lockdep asserts in key code paths for
insuring lock correctness.
Reviewed-by: Ira Weiny <[email protected]>
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Add an rvt_qp_init() to initialize specific
common fields as the qp is created or reset.
The routine is shared by the rvt_reset_qp() and
the rvt_create_qp().
The intent is that lock dep assertions will only
appear in the rvt_reset_qp().
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The reset calldown is misplaced.
It should only be called in the code that actually
transitions the QP to reset.
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The __must_hold() is sufficent to correct the sparse
context imbalance inside a function.
Per Documentation/sparse.txt:
__must_hold - The specified lock is held on function entry and exit.
Fixes: Commit c0a67f6ba356 ("IB/rdmavt: Annotate rvt_reset_qp()")
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
This improves readability and hides the reference count
mechanism from the client drivers.
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The unwind logic for creating a user QP has a double vfree
of the non-shared receive queue when handling a "too many qps"
failure.
The code unwinds the mmmap info by decrementing a reference
count which will call rvt_release_mmap_info() which in turn
does the vfree() of the r_rq.wq. The unwind code then does
the same free.
Fix by guarding the vfree() with the same test that is done
in close and only do the vfree() if qp->ip is NULL.
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The use of the specific opcode test is redundant since
all ack entry users correctly manipulate the mr pointer
to selectively trigger the reference clearing.
The overly specific test hinders the use of implementation
specific operations.
The change needs to get rid of the union to insure that
an atomic value is not seen as an MR pointer.
Reviewed-by: Ashutosh Dixit <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Hanging has been observed while writing a file over NFSoRDMA. Dmesg on
the server contains messages like these:
[ 931.992501] svcrdma: Error -22 posting RDMA_READ
[ 952.076879] svcrdma: Error -22 posting RDMA_READ
[ 982.154127] svcrdma: Error -22 posting RDMA_READ
[ 1012.235884] svcrdma: Error -22 posting RDMA_READ
[ 1042.319194] svcrdma: Error -22 posting RDMA_READ
Here is why:
With the base memory management extension enabled, FRMR is used instead
of FMR. The xprtrdma server issues each RDMA read request as the following
bundle:
(1)IB_WR_REG_MR, signaled;
(2)IB_WR_RDMA_READ, signaled;
(3)IB_WR_LOCAL_INV, signaled & fencing.
These requests are signaled. In order to generate completion, the fast
register work request is processed by the hfi1 send engine after being
posted to the work queue, and the corresponding lkey is not valid until
the request is processed. However, the rdmavt driver validates lkey when
the RDMA read request is posted and thus it fails immediately with error
-EINVAL (-22).
This patch changes the work flow of local operations (fast register and
local invalidate) so that fast register work requests are always
processed immediately to ensure that the corresponding lkey is valid
when subsequent work requests are posted. Local invalidate requests are
processed immediately if fencing is not required and no previous local
invalidate request is pending.
To allow completion generation for signaled local operations that have
been processed before posting to the work queue, an internal send flag
RVT_SEND_COMPLETION_ONLY is added. The hfi1 send engine checks this flag
and only generates completion for such requests.
Reviewed-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Jianxin Xiong <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
This fix allows for support of in-kernel reserved operations
without impacting the ULP user.
The low level driver can register a non-zero value which
will be transparently added to the send queue size and hidden
from the ULP in every respect.
ULP post sends will never see a full queue due to a reserved
post send and reserved operations will never exceed that
registered value.
The s_avail will continue to track the ULP swqe availability
and the difference between the reserved value and the reserved
in use will track reserved availabity.
Reviewed-by: Ashutosh Dixit <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Some work requests are local operations, such as IB_WR_REG_MR and
IB_WR_LOCAL_INV. They differ from non-local operations in that:
(1) Local operations can be processed immediately without being posted
to the send queue if neither fencing nor completion generation is needed.
However, to ensure correct ordering, once a local operation is posted to
the work queue due to fencing or completion requiement, all subsequent
local operations must also be posted to the work queue until all the
local operations on the work queue have completed.
(2) Local operations don't send packets over the wire and thus don't
need (and shouldn't update) the packet sequence numbers.
Define a new a flag bit for the post send table to identify local
operations.
Add a new field to the QP structure to track the number of local
operations on the send queue to determine if direct processing of new
local operations should be enabled/disabled.
Reviewed-by: Mike Marciniszyn <[email protected]>
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Jianxin Xiong <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Change rvt_post_one_wr to use the new table mechanism for
post send.
Validate that each low level driver specifies the table.
Reviewed-by: Jianxin Xiong <[email protected]>
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Add flexibility for driver dependent operations in post send
because different drivers will have differing post send
operation support.
This includes data structure definitions to support a table
driven scheme along with the necessary validation routine
using the new table.
Reviewed-by: Ashutosh Dixit <[email protected]>
Reviewed-by: Jianxin Xiong <[email protected]>
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The current drivers return errors from this calldown
wrapped in an ERR_PTR().
The rdmavt code incorrectly tests for NULL.
The code is fixed to use IS_ERR() and change ret according
to the driver return value.
Cc: Stable <[email protected]> # 4.6+
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Since rvt_reset_qp already zero's out qp->s_ack_queue head and tail
pointers, there is no need to zero out qp->s_ack_queue itself.
Reviewed-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Ashutosh Dixit <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Correct calculation of the low order bits which should be unset
based on use of qos_shift parameter when assigning QPN.
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Brian Welty <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
This patch avoids that sparse reports the following warning:
rdmavt/qp.c:507:17: warning: context imbalance in 'rvt_reset_qp' - unexpected unlock
Signed-off-by: Bart Van Assche <[email protected]>
Cc: Mike Marciniszyn <[email protected]>
Cc: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
rdmavt allows the driver to specify the size of the ack queue, but
only uses it for the modify QP limit testing for setting the atomic
limit value.
The driver dependent size is now used to size the s_ack_queue ring
dynamicially.
Since the driver knows its size, the driver will use its define
for any ring size dependent code.
Reviewed-by: Mitko Haralanov <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The usage of the various vmalloc APIs do not consistently zero memory
when allocating the swqe. Insure zeroing variants are used.
Reviewed-by: Mitko Haralanov <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The awkward coding for setting the allowed_ops field
was tripping an smatch warning.
This patch uses the more appropriate defines from include/rdma
to avoid the issue.
As part of the patch remove a mask that was duplicated
in rdmavt include files and use that mask as appropriate.
Fixes: 8bea6b1cfe6f ("IB/rdmavt: Add create queue pair functionality")
Reported-by: Dan Carpenter <[email protected]>
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
call_send is used to determine whether to send immediately or schedule
a send for later. The current logic in rdmavt is inverted and has a
negative impact on the latency of the hfi1 and qib drivers. Fix this
regression by correctly calling send immediately when call_send is set.
Reviewed-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Jubin John <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Accordingly IB Spec post WR to receive queue must
complete with error if QP is in Error state.
Please refer to C10-42, C10-97.2.1
Reviewed-by: Mike Marciniszyn <[email protected]>
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Alex Estrin <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Tracking user/QP ownership is needed to debug issues with
user ULPs like OpenMPI.
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The non-rdamvt versions of qib and hfi1 allow for a differing
heuristic to override a schedule progress in favor of a direct
call the the progress routine.
This patch adds that for both drivers and rdmavt.
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Remove exported functions which are no longer required as the
functionality has moved into rdmavt. This also requires re-ordering some
of the functions since their prototype no longer appears in a header
file. Rather than add forward declarations it is just cleaner to
re-order some of the functions.
Reviewed-by: Jubin John <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
While hfi1 and qib were still supporting bits and pieces of core verbs
components there needed to be a way to convey if rdmavt should handle
allocation and initialize of resources like the queue pair table. Now
that all of this is moved into rdmavt there is no need for these flags.
They are no longer used in the drivers.
Reviewed-by: Ira Weiny <[email protected]>
Reviewed-by: Jubin John <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Add, remove, and otherwise clean up existing comments that are leftover
from the initial code postings of rdmavt. Many of the comments were added
to provide an idea on the direction we were thinking of going. Now that the
design is solidified make a pass over and clean everything up. Also add
details where lacking.
Ensure all non static functions have nano comments.
Reviewed-by: Jubin John <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
These trace and error print statements would help in debugging issues which
are caused due to messed up QP ring buffer pointers.
Reviewed-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Harish Chegondi <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
This patch adds an additional lock to reduce contention on the s_lock.
This lock is used in post_send() so that the post_send is not
serialized with the send engine and other send related processing.
To do this the s_next_psn is now maintained on post_send() while
post_send() related fields are moved to a new cache line. There is
an s_avail maintained for the post_send() to mitigate trading cache
lines with the send engine. The lock is released/acquired around
releasing the just built packet to the egress mechanism.
Reviewed-by: Jubin John <[email protected]>
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Dean Luick <[email protected]>
Signed-off-by: Harish Chegondi <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
A busy_jiffies variable is maintained and updated when rc qps are
created and deleted. busy_jiffies is a scaled value of the number
of rc qps in the device. busy_jiffies is incremented every rc qp
scaling interval. busy_jiffies is added to the rc timeout
in add_retry_timer and mod_retry_timer. The rc qp scaling interval
is selected based on extensive performance evaluation of targeted
workloads.
Reviewed-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Vennila Megavannan <[email protected]>
Signed-off-by: Jubin John <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Use the new RNR timer for hfi1.
For qib, this timer doesn't exist, so exploit driver
callbacks to use the new timer as appropriate.
Reviewed-by: Jubin John <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
In addition to removing the modify queue pair verb from hfi1 we also
remove ancillary functions which existed only for modify queue pair and
are also already present in hfi1.
Reviewed-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
alloc_qpn must use GFP and the hardware drivers should use it as well.
Reviewed-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
IB core uses 1 relative indexing for ports. All of our data structures
use 0 based indexing. Add an inline function that we can use whenever we
need to validate a legal value and try to convert a port number to a
port index at the entrance into rdmavt.
Try to follow the policy that when we are talking about a port from IB
core point of view we refer to it as a port number. When port is an
index into our arrays refer to it as a port index.
Reviewed-by: Mike Marciniszyn <[email protected]>
Reviewed-by: Harish Chegondi <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Change verbs memory allocations to the device numa node. This keeps memory
close to the device for optimal performance.
Reviewed-by: Dean Luick <[email protected]>
Reviewed-by: Mike Marciniszyn <[email protected]>
Reviewed-by: Ira Weiny <[email protected]>
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Mitko Haralanov <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Some hardware drivers requires additional checks on send WRs. Create an
optional call back to allow hardware drivers to reject a send WR.
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Fill in srq function stubs with code derived from hfi1 and qib.
Move necessary functions and data structure members as well.
Reviewed-by: Dennis Dalessandro <[email protected]>
Reviewed-by: Harish Chegondi <[email protected]>
Signed-off-by: Jubin John <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Drivers using rdmavt can rely on rvt_query_qp instead of defining their own
query_qp functions.
Reviewed-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Harish Chegondi <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Update all files added by rdmavt which do not yet have 2016 as the
copyright year.
Reviewed-by: Ira Weiny <[email protected]>
Reviewed-by: Harish Chegondi <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Low level drivers need to be able to check incoming attributes as well as be
able to adjust their private data on queue pair modification. Add 2 driver
callbacks, check_modify_qp and modify_qp, to facilitate this.
Signed-off-by: Ira Weiny <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
This patch adds in the multicast add and remove functions as well as the
ancillary infrastructure needed.
Reviewed-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
This patch adds the simple post receive verbs call to rdmavt. The actual
interrupt handling and packet processing is still done in the low level
driver.
Reviewed-by: Mike Marciniszyn <[email protected]>
Reviewed-by: Harish Chegondi <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
This patch adds in support the qp destroy verb call.
Reviewed-by: Ira Weiny <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Add modify qp and supporting functions.
Reviewed-by: Mike Marciniszyn <[email protected]>
Reviewed-by: Ira Weiny <[email protected]>
Signed-off-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|