Age | Commit message (Collapse) | Author | Files | Lines |
|
When xa_insert() fails, the acquired resource in qedr_create_qp should
also be freed. However, current implementation does not handle the error.
Fix this by adding a new goto label that calls qedr_free_qp_resources.
Fixes: 1212767e23bb ("qedr: Add wrapping generic structure for qpidr and adjust idr routines.")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Keita Suzuki <[email protected]>
Acked-by: Michal Kalderon <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Receive side delayed ack mode is needed only for certain area networks/
connections. Therefore disable it by default.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Potnuri Bharat Teja <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Leon Romanovsky says:
====================
This series from Alex extends software steering interface to support
devices with extra capability "sw_owner_2" which will replace existing
"sw_owner".
====================
Based on the mlx5-next branch at
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
due to dependencies.
* branch 'mlx5_sw_owner_v2:
RDMA/mlx5: Expose TIR and QP ICM address for sw_owner_v2 devices
RDMA/mlx5: Allow DM allocation for sw_owner_v2 enabled devices
RDMA/mlx5: Add sw_owner_v2 bit capability
|
|
Leon Romanovsky says:
====================
IBTA declares speed as 16 bits, but kernel stores it in u8. This series
fixes in-kernel declaration while keeping external interface intact.
====================
Based on the mlx5-next branch at
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
due to dependencies.
* branch 'mlx5_active_speed':
RDMA: Fix link active_speed size
RDMA/mlx5: Delete duplicated mlx5_ptys_width enum
net/mlx5: Refactor query port speed functions
|
|
According to the IB spec active_speed size should be u16 and not u8 as
before. Changing it to allow further extensions in offered speeds.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Aharon Landau <[email protected]>
Reviewed-by: Michael Guralnik <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Expose the ICM address to access TIR and QP, this will allow sw_owned_v2
devices to steer traffic to TIRs and QPs same as done with sw_owner
capability.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alex Vesker <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
sw_owner_v2 will replace sw_owner for future devices, this means that if
sw_owner_v2 is set sw_owner should be ignored and DM allocation is
required for sw_owner_v2 devices to function.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alex Vesker <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Move struct ib_rwq_ind_table allocation to ib_core.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Move allocation and destruction of memory windows under ib_core
responsibility and clean drivers to ensure that no updates to MW
ib_core structures are done in driver layer.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Combine two same enums to avoid duplication.
Signed-off-by: Aharon Landau <[email protected]>
Reviewed-by: Michael Guralnik <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
The functions mlx5_query_port_link_width_oper and
mlx5_query_port_ib_proto_oper are always called together, so combine them
to a new function called mlx5_query_port_oper to avoid duplication.
And while the mlx5i_get_port_settings is the same as
mlx5_query_port_oper therefore let's remove it.
According to the IB spec link_width_oper and ib_proto_oper should be u16
and not as written u8, so perform casting as a preparation to cross-RDMA
patch which will fix that type for all drivers in the RDMA subsystem.
Fixes: ada68c31ba9c ("net/mlx5: Introduce a new header file for physical port functions")
Signed-off-by: Aharon Landau <[email protected]>
Reviewed-by: Michael Guralnik <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
hw_context stores a pci_device pointer. Use the correct type instead of
void * and avoid typecasts.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Parav Pandit <[email protected]>
Acked-by: Shiraz Saleem <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Implement the XRC specific verbs. The additional QP type introduced new
logic to the rest of the verbs that now require distinguishing whether a
QP has an "RQ" or an "SQ" or both.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: Yuval Basson <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Pull rdma fixes from Jason Gunthorpe:
"A number of driver bug fixes and a few recent regressions:
- Several bug fixes for bnxt_re. Crashing, incorrect data reported,
and corruption on new HW
- Memory leak and crash in rxe
- Fix sysfs corruption in rxe if the netdev name is too long
- Fix a crash on error unwind in the new cq_pool code
- Fix kobject panics in rtrs by working device lifetime properly
- Fix a data corruption bug in iser target related to misaligned
buffers"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
IB/isert: Fix unaligned immediate-data handling
RDMA/rtrs-srv: Set .release function for rtrs srv device during device init
RDMA/bnxt_re: Remove set but not used variable 'qplib_ctx'
RDMA/core: Fix reported speed and width
RDMA/core: Fix unsafe linked list traversal after failing to allocate CQ
RDMA/bnxt_re: Remove the qp from list only if the qp destroy succeeds
RDMA/bnxt_re: Fix driver crash on unaligned PSN entry address
RDMA/bnxt_re: Restrict the max_gids to 256
RDMA/bnxt_re: Static NQ depth allocation
RDMA/bnxt_re: Fix the qp table indexing
RDMA/bnxt_re: Do not report transparent vlan from QP1
RDMA/mlx4: Read pkey table length instead of hardcoded value
RDMA/rxe: Fix panic when calling kmem_cache_create()
RDMA/rxe: Fix memleak in rxe_mem_init_user
RDMA/rxe: Fix the parent sysfs read when the interface has 15 chars
RDMA/rtrs-srv: Replace device_register with device_initialize and device_add
|
|
Alignment of parameters was off by one
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
commit 59e8970b3798 ("RDMA/qedr: Return max inline data in QP query
result") changed query_qp max_inline size to return the max roce inline
size. When iwarp was introduced, this should have been modified to return
the max inline size based on protocol. This size is cached in the device
attributes
Fixes: 69ad0e7fe845 ("RDMA/qedr: Add support for iWARP in user space")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Currently iWARP does not support mtu-change. Notify user when MTU changes
that reload of qedr is required for mtu to change. Display the correct
active mtu.
Fixes: f5b1b1775be6 ("RDMA/qedr: Add iWARP support in existing verbs")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
In iWARP, accept could be called after a QP is already destroyed. In this
case an error should be returned and not success.
Fixes: 82af6d19d8d9 ("RDMA/qedr: Fix synchronization methods and memory leaks in qedr")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
dev->attr.page_size_caps was used uninitialized when setting device
attributes
Fixes: ec72fce401c6 ("qedr: Add support for RoCE HW init")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Change the doorbell setting so that the maximum value between the last and
current value is set. This is to avoid doorbells being lost.
Fixes: a7efd7773e31 ("qedr: Add support for PD,PKEY and CQ verbs")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
The qedr_qp structure wasn't freed when the protocol was RoCE. kmemleak
output when running basic RoCE scenario.
unreferenced object 0xffff927ad7e22c00 (size 1024):
comm "ib_send_bw", pid 7082, jiffies 4384133693 (age 274.698s)
hex dump (first 32 bytes):
00 b0 cd a2 79 92 ff ff 00 3f a1 a2 79 92 ff ff ....y....?..y...
00 ee 5c dd 80 92 ff ff 00 f6 5c dd 80 92 ff ff ..\.......\.....
backtrace:
[<00000000b2ba0f35>] qedr_create_qp+0xb3/0x6c0 [qedr]
[<00000000e85a43dd>] ib_uverbs_handler_UVERBS_METHOD_QP_CREATE+0x555/0xad0 [ib_uverbs]
[<00000000fee4d029>] ib_uverbs_cmd_verbs+0xa5a/0xb80 [ib_uverbs]
[<000000005d622660>] ib_uverbs_ioctl+0xa4/0x110 [ib_uverbs]
[<00000000eb4cdc71>] ksys_ioctl+0x87/0xc0
[<00000000abe6b23a>] __x64_sys_ioctl+0x16/0x20
[<0000000046e7cef4>] do_syscall_64+0x4d/0x90
[<00000000c6948f76>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 1212767e23bb ("qedr: Add wrapping generic structure for qpidr and adjust idr routines.")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Michal Kalderon <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
This is always the same value as IOVA masked by the page size, just use
that clearer calculation directly.
It is unclear of ocrdma hardware can actually support a true fbo, if so it
could use a different algorithm to compute the best page size.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
zbva is always false, so fbo is never read.
A 'zero-based-virtual-address' is simply IOVA == 0, and the driver already
supports this.
Link: https://lore.kernel.org/r/[email protected]
Acked-by: Michal Kalderon <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
For the calls linked to mlx4_ib_umem_calc_optimal_mtt_size() use
ib_umem_num_dma_blocks() inside the function, it is just some weird static
default.
All other places are just using it with PAGE_SIZE, switch to
ib_umem_num_dma_blocks().
As this is the last call site, remove ib_umem_num_count().
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
This driver always uses PAGE_SIZE.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
This driver always uses a DMA array made up of PAGE_SIZE elements, so just
use ib_umem_num_dma_blocks().
Since rdma_for_each_dma_block() always iterates exactly
ib_umem_num_dma_blocks() there is no need for the early exit check in
build_user_pbes(), delete it.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
mtr_umem_page_count() does the same thing, replace it with the core code.
Also, ib_umem_find_best_pgsz() should always be called to check that the
umem meets the page_size requirement. If there is a limited set of
page_sizes that work it the pgsz_bitmap should be set to that set. 0 is a
failure and the umem cannot be used.
Lightly tidy the control flow to implement this flow properly.
Link: https://lore.kernel.org/r/[email protected]
Acked-by: Weihang Li <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
ib_umem_page_count() returns the number of 4k entries required for a DMA
map, but bnxt_re already computes a variable page size. The correct API to
determine the size of the page table array is ib_umem_num_dma_blocks().
Fix the overallocation of the page array in fill_umem_pbl_tbl() when
working with larger page sizes by using the right function. Lightly
re-organize this function to make it clearer.
Replace the other calls to ib_umem_num_pages().
Fixes: d85582517e91 ("RDMA/bnxt_re: Use core helpers to get aligned DMA address")
Link: https://lore.kernel.org/r/[email protected]
Acked-by: Selvin Xavier <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
The length of the list populated by qedr_populate_pbls() should be
calculated using ib_umem_num_dma_blocks() with the same size/shift passed
to qedr_populate_pbls().
Link: https://lore.kernel.org/r/[email protected]
Acked-by: Michal Kalderon <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
This loop is splitting the DMA SGL into pg_shift sized pages, use the core
code for this directly.
Link: https://lore.kernel.org/r/[email protected]
Acked-by: Michal Kalderon <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
If ib_umem_find_best_pgsz() returns > PAGE_SIZE then the equation here is
not correct. 'start' should be 'virt'. Change it to use the core code for
page_num and the canonical calculation of page_shift.
Fixes: eb52c0333f06 ("RDMA/i40iw: Use core helpers to get aligned DMA address within a supported page size")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
If ib_umem_find_best_pgsz() returns > PAGE_SIZE then the equation here is
not correct. 'start' should be 'virt'. Change it to use the core code for
page_num and the canonical calculation of page_shift.
Fixes: 40ddb3f02083 ("RDMA/efa: Use API to get contiguous memory blocks aligned to device supported page size")
Link: https://lore.kernel.org/r/[email protected]
Tested-by: Gal Pressman <[email protected]>
Acked-by: Gal Pressman <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
ib_umem_num_pages() should only be used by things working with the SGL in
CPU pages directly.
Drivers building DMA lists should use the new ib_num_dma_blocks() which
returns the number of blocks rdma_umem_for_each_block() will return.
To make this general for DMA drivers requires a different implementation.
Computing DMA block count based on umem->address only works if the
requested page size is < PAGE_SIZE and/or the IOVA == umem->address.
Instead the number of DMA pages should be computed in the IOVA address
space, not umem->address. Thus the IOVA has to be stored inside the umem
so it can be used for these calculations.
For now set it to umem->address by default and fix it up if
ib_umem_find_best_pgsz() was called. This allows drivers to be converted
to ib_umem_num_dma_blocks() safely.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Generally drivers should be using this core helper to split up the umem
into DMA pages.
These drivers are all probably wrong in some way to pass PAGE_SIZE in as
the HW page size. Either the driver doesn't support other page sizes and
it should use 4096, or the driver does support other page sizes and should
use ib_umem_find_best_pgsz() to select the best HW pages size of the HW
supported set.
The only case it could be correct is if the HW has a global setting for
PAGE_SIZE set at driver initialization time.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
This helper does the same as rdma_for_each_block(), except it works on a
umem. This simplifies most of the call sites.
Link: https://lore.kernel.org/r/[email protected]
Acked-by: Miguel Ojeda <[email protected]>
Acked-by: Shiraz Saleem <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Change counters to return failure like any other verbs destroy, however
this flow shouldn't return error at all.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Make this interface symmetrical to other destroy paths.
Fixes: a49b1dc7ae44 ("RDMA: Convert destroy_wq to be void")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Update XRCD destroy flow to allow command failure.
Fixes: 28ad5f65c314 ("RDMA: Move XRCD to be under ib_core responsibility")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Like any other verbs objects, CQ shouldn't fail during destroy, but
mlx5_ib didn't follow this contract with mixed IB verbs objects with
DEVX. Such mix causes to the situation where FW and kernel are fully
interdependent on the reference counting of each side.
Kernel verbs and drivers that don't have DEVX flows shouldn't fail.
Fixes: e39afe3d6dbd ("RDMA: Convert CQ allocations to be under core responsibility")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
In similar way to other IB objects, restore the ability to return error on
SRQ destroy. Strictly speaking, this change is not necessary, and provided
here to ensure a symmetrical interface like other destroy functions.
Fixes: 68e326dea1db ("RDMA: Handle SRQ allocations by IB/core")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
The HW release can fail and leave the system in limbo state, where SRQ is
removed from the table, but can't be destroyed later. In every reentry,
the initial xa_erase_irq() check will fail.
Rewrite the erase logic to keep index, but don't store the entry
itself. By doing it, we can safely reinsert entry back in the case of
destroy failure.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Like any other IB verbs objects, AH are refcounted by ib_core. The release
of those objects are controlled by ib_core with promise that AH destroy
can't fail.
Being SW object for now, this change makes dealloc_ah() to behave like any
other destroy IB flows.
Fixes: d345691471b4 ("RDMA: Handle AH allocations by IB/core")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
The IB verbs objects are counted by the kernel and ib_core ensures that
deallocate PD will success so it will be called once all other objects
that depends on PD will be released. This is achieved by managing various
reference counters on such objects.
The mlx5 driver didn't follow this standard flow when allowed DEVX objects
that are not managed by ib_core to be interleaved with the ones under
ib_core responsibility.
In such interleaved scenarios deallocate command can fail and ib_core will
leave uobject in internal DB and attempt to clean it later to free
resources anyway.
This change partially restores returned value from dealloc_pd() for all
drivers, but keeping in mind that non-DEVX devices and kernel verbs paths
shouldn't fail.
Fixes: 21a428a019c9 ("RDMA: Handle PD allocations by IB/core")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Some variables have been initialized when used. As a result, here removes
some unncessary initial assignment.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lijun Ou <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
drivers/infiniband/hw/bnxt_re/main.c:1012:25:
warning: variable ‘qplib_ctx’ set but not used [-Wunused-but-set-variable]
Fixes: f86b31c6a28f ("RDMA/bnxt_re: Static NQ depth allocation")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: YueHaibing <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
In preparation for unconditionally passing the struct tasklet_struct
pointer to all tasklet callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Romain Perier <[email protected]>
Signed-off-by: Allen Pais <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
In preparation for unconditionally passing the struct tasklet_struct
pointer to all tasklet callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Romain Perier <[email protected]>
Signed-off-by: Allen Pais <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
In preparation for unconditionally passing the struct tasklet_struct
pointer to all tasklet callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Romain Perier <[email protected]>
Signed-off-by: Allen Pais <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
In preparation for unconditionally passing the struct tasklet_struct
pointer to all tasklet callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Romain Perier <[email protected]>
Signed-off-by: Allen Pais <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
The SRQ can be destroyed right before mlx5_cmd_get_srq is called.
In such case the latter will return NULL instead of expected SRQ.
Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|