aboutsummaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw/mlx5
AgeCommit message (Collapse)AuthorFilesLines
2019-10-22IB/mlx5: Align usage of QP1 create flags with rest of mlx5 definesMichael Guralnik3-11/+6
There is little value in keeping separate function for one flag, provide it directly like any other mlx5 define. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael Guralnik <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-10-22IB/mlx5: Remove dead codeRan Rozenstein2-16/+0
mlx5_ib_dc_atomic_is_supported function is not used anywhere. Remove the dead code. Fixes: a60109dc9a95 ("IB/mlx5: Add support for extended atomic operations") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ran Rozenstein <[email protected]> Reviewed-by: Maor Gottlieb <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-10-22RDMA/nldev: Provide MR statisticsErez Alfasi3-0/+46
Add RDMA nldev netlink interface for dumping MR statistics information. Output example: $ ./ibv_rc_pingpong -o -P -s 500000000 local address: LID 0x0001, QPN 0x00008a, PSN 0xf81096, GID :: $ rdma stat show mr dev mlx5_0 mrn 2 page_faults 122071 page_invalidations 0 Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Erez Alfasi <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-10-22RDMA/mlx5: Return ODP type per MRErez Alfasi5-1/+55
Provide an ODP explicit/implicit type as part of 'rdma -dd resource show mr' dump. For example: $ rdma -dd resource show mr dev mlx5_0 mrn 1 rkey 0xa99a lkey 0xa99a mrlen 50000000 pdn 9 pid 7372 comm ibv_rc_pingpong drv_odp explicit For non-ODP MRs, we won't print "drv_odp ..." at all. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Erez Alfasi <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-10-22IB/mlx5: Introduce ODP diagnostic countersErez Alfasi2-0/+19
Introduce ODP diagnostic counters and count the following per MR within IB/mlx5 driver: 1) Page faults: Total number of faulted pages. 2) Page invalidations: Total number of pages invalidated by the OS during all invalidation events. The translations can be no longer valid due to either non-present pages or mapping changes. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Erez Alfasi <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-10-22Merge branch 'mlx5-rd-sgl' into rdma.git for-nextJason Gunthorpe1-0/+2
From Yamin Friedman: ==================== This series from Yamin implements long standing "TODO" existed in rw.c. It allows the driver to specify a cut-over point where it is faster to build a lkey MR rather than do a large SGL for RDMA READ operations. mlx5 HW gets a notable performane boost by switching to MRs. ==================== Based on the mlx5-next branch from git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux for dependencies * branch 'mlx5-rd-sgl': (3 commits) RDMA/mlx5: Add capability for max sge to get optimized performance RDMA/rw: Support threshold for registration vs scattering to local pages net/mlx5: Expose optimal performance scatter entries capability
2019-10-22RDMA/mlx5: Add capability for max sge to get optimized performanceYamin Friedman1-0/+2
Allows the IB device to provide a value of maximum scatter gather entries per RDMA READ. In certain cases it may be preferable for a device to perform UMR memory registration rather than have many scatter entries in a single RDMA READ. This provides a significant performance increase in devices capable of using different memory registration schemes based on the number of scatter gather entries. This general capability allows each device vendor to fine tune when it is better to use memory registration. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Yamin Friedman <[email protected]> Reviewed-by: Or Gerlitz <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-10-17RDMA/mlx5: Clear old rate limit when closing QPRafi Wiener1-3/+5
Before QP is closed it changes to ERROR state, when this happens the QP was left with old rate limit that was already removed from the table. Fixes: 7d29f349a4b9 ("IB/mlx5: Properly adjust rate limit on QP state transitions") Signed-off-by: Rafi Wiener <[email protected]> Signed-off-by: Oleg Kuporosov <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Doug Ledford <[email protected]>
2019-10-08IB/mlx5: Introduce and use mkey context setting helper routineParav Pandit1-18/+16
Introduce and use set_mkc_access_pd_addr_fields() which sets mkey context's access rights, PD, address fields. Thereby avoid the code duplication. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Parav Pandit <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-10-04RDMA/mlx5: Add missing synchronize_srcu() for MW casesJason Gunthorpe3-48/+32
While MR uses live as the SRCU 'update', the MW case uses the xarray directly, xa_erase() causes the MW to become inaccessible to the pagefault thread. Thus whenever a MW is removed from the xarray we must synchronize_srcu() before freeing it. This must be done before freeing the mkey as re-use of the mkey while the pagefault thread is using the stale mkey is undesirable. Add the missing synchronizes to MW and DEVX indirect mkey and delete the bogus protection against double destroy in mlx5_core_destroy_mkey() Fixes: 534fd7aac56a ("IB/mlx5: Manage indirection mkey upon DEVX flow for ODP") Fixes: 6aec21f6a832 ("IB/mlx5: Page faults handling infrastructure") Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Artemy Kovalyov <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-10-04RDMA/mlx5: Put live in the correct place for ODP MRsJason Gunthorpe3-38/+14
live is used to signal to the pagefault thread that the MR is initialized and ready for use. It should be after the umem is assigned and all other setup is completed. This prevents races (at least) of the form: CPU0 CPU1 mlx5_ib_alloc_implicit_mr() implicit_mr_alloc() live = 1 imr->umem = umem num_pending_prefetch_inc() if (live) atomic_inc(num_pending_prefetch) atomic_set(num_pending_prefetch,0) // Overwrites other thread's store Further, live is being used with SRCU as the 'update' in an acquire/release fashion, so it can not be read and written raw. Move all live = 1's to after MR initialization is completed and use smp_store_release/smp_load_acquire() for manipulating it. Add a missing live = 0 when an implicit MR child is deleted, before queuing work to do synchronize_srcu(). The barriers in update_odp_mr() were some broken attempt to create a acquire/release, but were not even applied consistently and missed the point, delete it as well. Fixes: 6aec21f6a832 ("IB/mlx5: Page faults handling infrastructure") Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Artemy Kovalyov <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-10-04RDMA/mlx5: Order num_pending_prefetch properly with synchronize_srcuJason Gunthorpe1-2/+3
During destroy setting live = 0 and then synchronize_srcu() prevents num_pending_prefetch from incrementing, and also, ensures that all work holding that count is queued on the WQ. Testing before causes races of the form: CPU0 CPU1 dereg_mr() mlx5_ib_advise_mr_prefetch() srcu_read_lock() num_pending_prefetch_inc() if (!live) live = 0 atomic_read() == 0 // skip flush_workqueue() atomic_inc() queue_work(); srcu_read_unlock() WARN_ON(atomic_read()) // Fails Swap the order so that the synchronize_srcu() prevents this. Fixes: a6bc3875f176 ("IB/mlx5: Protect against prefetch of invalid MR") Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Artemy Kovalyov <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-10-04RDMA/odp: Lift umem_mutex out of ib_umem_odp_unmap_dma_pages()Jason Gunthorpe1-4/+8
This fixes a race of the form: CPU0 CPU1 mlx5_ib_invalidate_range() mlx5_ib_invalidate_range() // This one actually makes npages == 0 ib_umem_odp_unmap_dma_pages() if (npages == 0 && !dying) // This one does nothing ib_umem_odp_unmap_dma_pages() if (npages == 0 && !dying) dying = 1; dying = 1; schedule_work(&umem_odp->work); // Double schedule of the same work schedule_work(&umem_odp->work); // BOOM npages and dying must be read and written under the umem_mutex lock. Since whenever ib_umem_odp_unmap_dma_pages() is called mlx5 must also call mlx5_ib_update_xlt, and both need to be done in the same locking region, hoist the lock out of unmap. This avoids an expensive double critical section in mlx5_ib_invalidate_range(). Fixes: 81713d3788d2 ("IB/mlx5: Add implicit MR support") Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Artemy Kovalyov <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-10-04RDMA/mlx5: Fix a race with mlx5_ib_update_xlt on an implicit MRJason Gunthorpe1-2/+32
mlx5_ib_update_xlt() must be protected against parallel free of the MR it is accessing, also it must be called single threaded while updating the HW. Otherwise we can have races of the form: CPU0 CPU1 mlx5_ib_update_xlt() mlx5_odp_populate_klm() odp_lookup() == NULL pklm = ZAP implicit_mr_get_data() implicit_mr_alloc() <update interval tree> mlx5_ib_update_xlt mlx5_odp_populate_klm() odp_lookup() != NULL pklm = VALID mlx5_ib_post_send_wait() mlx5_ib_post_send_wait() // Replaces VALID with ZAP This can be solved by putting both the SRCU and the umem_mutex lock around every call to mlx5_ib_update_xlt(). This ensures that the content of the interval tree relavent to mlx5_odp_populate_klm() (ie mr->parent == mr) will not change while it is running, and thus the posted WRs to update the KLM will always reflect the correct information. The race above will resolve by either having CPU1 wait till CPU0 completes the ZAP or CPU0 will run after the add and instead store VALID. The pagefault path adding children already holds the umem_mutex and SRCU, so the only missed lock is during MR destruction. Fixes: 81713d3788d2 ("IB/mlx5: Add implicit MR support") Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Artemy Kovalyov <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-10-04RDMA/mlx5: Do not allow rereg of a ODP MRJason Gunthorpe1-3/+3
This code is completely broken, the umem of a ODP MR simply cannot be discarded without a lot more locking, nor can an ODP mkey be blithely destroyed via destroy_mkey(). Fixes: 6aec21f6a832 ("IB/mlx5: Page faults handling infrastructure") Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Artemy Kovalyov <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-10-04IB/mlx5: Remove unnecessary else statementErez Alfasi1-2/+2
'else' is not generally useful after a break or return. Remove this unnecessary statement. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Erez Alfasi <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-10-04IB/mlx5: Remove unnecessary return statementErez Alfasi1-2/+0
There is no reason to call return at the end of function which returns void. Remove this unnecessary statement. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Erez Alfasi <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-10-04RDMA/mlx5: Group boolean parameters to take less spaceLeon Romanovsky1-4/+4
Clean the code to store all boolean parameters inside one variable. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-09-21Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds6-131/+146
Pull RDMA subsystem updates from Jason Gunthorpe: "This cycle mainly saw lots of bug fixes and clean up code across the core code and several drivers, few new functional changes were made. - Many cleanup and bug fixes for hns - Various small bug fixes and cleanups in hfi1, mlx5, usnic, qed, bnxt_re, efa - Share the query_port code between all the iWarp drivers - General rework and cleanup of the ODP MR umem code to fit better with the mmu notifier get/put scheme - Support rdma netlink in non init_net name spaces - mlx5 support for XRC devx and DC ODP" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (99 commits) RDMA: Fix double-free in srq creation error flow RDMA/efa: Fix incorrect error print IB/mlx5: Free mpi in mp_slave mode IB/mlx5: Use the original address for the page during free_pages RDMA/bnxt_re: Fix spelling mistake "missin_resp" -> "missing_resp" RDMA/hns: Package operations of rq inline buffer into separate functions RDMA/hns: Optimize cmd init and mode selection for hip08 IB/hfi1: Define variables as unsigned long to fix KASAN warning IB/{rdmavt, hfi1, qib}: Add a counter for credit waits IB/hfi1: Add traces for TID RDMA READ RDMA/siw: Relax from kmap_atomic() use in TX path IB/iser: Support up to 16MB data transfer in a single command RDMA/siw: Fix page address mapping in TX path RDMA: Fix goto target to release the allocated memory RDMA/usnic: Avoid overly large buffers on stack RDMA/odp: Add missing cast for 32 bit RDMA/hns: Use devm_platform_ioremap_resource() to simplify code Documentation/infiniband: update name of some functions RDMA/cma: Fix false error message RDMA/hns: Fix wrong assignment of qp_access_flags ...
2019-09-21Merge tag 'for-linus-hmm' of ↵Linus Torvalds4-78/+70
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull hmm updates from Jason Gunthorpe: "This is more cleanup and consolidation of the hmm APIs and the very strongly related mmu_notifier interfaces. Many places across the tree using these interfaces are touched in the process. Beyond that a cleanup to the page walker API and a few memremap related changes round out the series: - General improvement of hmm_range_fault() and related APIs, more documentation, bug fixes from testing, API simplification & consolidation, and unused API removal - Simplify the hmm related kconfigs to HMM_MIRROR and DEVICE_PRIVATE, and make them internal kconfig selects - Hoist a lot of code related to mmu notifier attachment out of drivers by using a refcount get/put attachment idiom and remove the convoluted mmu_notifier_unregister_no_release() and related APIs. - General API improvement for the migrate_vma API and revision of its only user in nouveau - Annotate mmu_notifiers with lockdep and sleeping region debugging Two series unrelated to HMM or mmu_notifiers came along due to dependencies: - Allow pagemap's memremap_pages family of APIs to work without providing a struct device - Make walk_page_range() and related use a constant structure for function pointers" * tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (75 commits) libnvdimm: Enable unit test infrastructure compile checks mm, notifier: Catch sleeping/blocking for !blockable kernel.h: Add non_block_start/end() drm/radeon: guard against calling an unpaired radeon_mn_unregister() csky: add missing brackets in a macro for tlb.h pagewalk: use lockdep_assert_held for locking validation pagewalk: separate function pointers from iterator data mm: split out a new pagewalk.h header from mm.h mm/mmu_notifiers: annotate with might_sleep() mm/mmu_notifiers: prime lockdep mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end mm/mmu_notifiers: remove the __mmu_notifier_invalidate_range_start/end exports mm/hmm: hmm_range_fault() infinite loop mm/hmm: hmm_range_fault() NULL pointer bug mm/hmm: fix hmm_range_fault()'s handling of swapped out pages mm/mmu_notifiers: remove unregister_no_release RDMA/odp: remove ib_ucontext from ib_umem RDMA/odp: use mmu_notifier_get/put for 'struct ib_ucontext_per_mm' RDMA/mlx5: Use odp instead of mr->umem in pagefault_mr RDMA/mlx5: Use ib_umem_start instead of umem.address ...
2019-09-16IB/mlx5: Free mpi in mp_slave modeDanit Goldberg1-0/+1
ib_add_slave_port() allocates a multiport struct but never frees it. Don't leak memory, free the allocated mpi struct during driver unload. Cc: <[email protected]> Fixes: 32f69e4be269 ("{net, IB}/mlx5: Manage port association for multiport RoCE") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Danit Goldberg <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-09-16IB/mlx5: Use the original address for the page during free_pagesDanit Goldberg1-4/+5
The removal of 'buffer' in the patch below caused free_page() to use a value that had been offset since the wqe pointer is adjusted while the routine runs. The current implementation of free_pages() rounds down to a pfn, discarding the adjustment, but this is not the right way to use the API. Preserve the initial value and use it for free_page(). Fixes: 0f51427bd097 ("RDMA/mlx5: Cleanup WQE page fault handler") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Danit Goldberg <[email protected]> Reviewed-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-09-13Merge tag 'v5.3-rc8' into rdma.git for-nextJason Gunthorpe5-20/+48
To resolve dependencies in following patches mlx5_ib.h conflict resolved by keeing both hunks Linux 5.3-rc8 Signed-off-by: Jason Gunthorpe <[email protected]>
2019-09-03net/mlx5: Add flow steering actions to fs_cmd shim layerMaor Gottlieb3-13/+20
Add flow steering actions: modify header and packet reformat to the fs_cmd shim layer. This allows each namespace to define possibly different functionality for alloc/dealloc action commands. Signed-off-by: Maor Gottlieb <[email protected]> Reviewed-by: Mark Bloch <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2019-09-02Merge branch 'mlx5-next' of ↵Saeed Mahameed4-208/+30
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Merge mlx5-next patches needed for upcoming mlx5 software steering. 1) Alex adds HW bits and definitions required for SW steering 2) Ariel moves device memory management to mlx5_core (From mlx5_ib) 3) Maor, Cleanups and fixups for eswitch mode and RoCE 4) Mark, Set only stag for match untagged packets Signed-off-by: Saeed Mahameed <[email protected]>
2019-09-01net/mlx5: Move device memory management to mlx5_coreAriel Levkovich4-208/+30
Move the device memory allocation and deallocation commands SW ICM memory to mlx5_core to expose this API for all mlx5_core users. This comes as preparation for supporting SW steering in kernel where it will be required to allocate and register device memory for direct rule insertion. In addition, an API to register this device memory for future remote access operations is introduced using the create_mkey commands. Signed-off-by: Ariel Levkovich <[email protected]> Reviewed-by: Mark Bloch <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2019-08-28Merge branch 'mlx5-next' of ↵Saeed Mahameed2-3/+4
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux mlx5 HW spec and bits updates: 1) Aya exposes IP-in-IP capability in mlx5_core. 2) Maxim exposes lag tx port affinity capabilities. 3) Moshe adds VNIC_ENV internal rq counter bits. 4) ODP capabilities for DC transport Misc updates: 5) Saeed, two compiler warnings cleanups 6) Add XRQ legacy commands opcodes 7) Use refcount_t for refcount 8) fix a -Wstringop-truncation warning
2019-08-28Merge branch 'mlx5-odp-dc' into rdma.git for-nextJason Gunthorpe1-48/+3
Michael Guralnik says: ==================== The series adds support for on-demand paging for DC transport. As DC is a mlx-only transport, the capabilities are exposed to the user using DEVX objects and later on through mlx5dv_query_device. ==================== Based on the mlx5-next branch from git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux for dependencies * branch 'mlx5-odp-dc': IB/mlx5: Add page fault handler for DC initiator WQE IB/mlx5: Remove check of FW capabilities in ODP page fault handling net/mlx5: Set ODP capabilities for DC transport to max
2019-08-28IB/mlx5: Add page fault handler for DC initiator WQEMichael Guralnik1-1/+2
Parsing DC initiator WQEs upon page fault requires skipping an address vector segment, as in UD WQEs. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael Guralnik <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-08-28IB/mlx5: Remove check of FW capabilities in ODP page fault handlingMichael Guralnik1-47/+1
As page fault handling is initiated by FW, there is no need to check that the ODP supports the operation and transport. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael Guralnik <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-08-26RDMA/mlx5: RDMA_RX flow type support for user applicationsMark Zhang3-2/+19
Currently user applications can only steer TCP/IP(NIC RX/RX) traffic. This patch adds RDMA_RX as a new flow type to allow the user to insert steering rules to control RDMA traffic. Two destinations are supported(but not set at the same time): devx flow table object and QP. Signed-off-by: Mark Zhang <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Doug Ledford <[email protected]>
2019-08-21RDMA/odp: use mmu_notifier_get/put for 'struct ib_ucontext_per_mm'Jason Gunthorpe1-5/+0
This is a significant simplification, no extra list is kept per FD, and the interval tree is now shared between all the ucontexts, reducing overhead if there are multiple ucontexts active. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jason Gunthorpe <[email protected]>
2019-08-21Merge branch 'odp_fixes' into rdma.git for-nextJason Gunthorpe5-114/+97
Jason Gunthorpe says: ==================== This is a collection of general cleanups for ODP to clarify some of the flows around umem creation and use of the interval tree. ==================== The branch is based on v5.3-rc5 due to dependencies * odp_fixes: RDMA/mlx5: Use odp instead of mr->umem in pagefault_mr RDMA/mlx5: Use ib_umem_start instead of umem.address RDMA/core: Make invalidate_range a device operation RDMA/odp: Use kvcalloc for the dma_list and page_list RDMA/odp: Check for overflow when computing the umem_odp end RDMA/odp: Provide ib_umem_odp_release() to undo the allocs RDMA/odp: Split creating a umem_odp from ib_umem_get RDMA/odp: Make the three ways to create a umem_odp clear RMDA/odp: Consolidate umem_odp initialization RDMA/odp: Make it clearer when a umem is an implicit ODP umem RDMA/odp: Iterate over the whole rbtree directly RDMA/odp: Use the common interval tree library instead of generic RDMA/mlx5: Fix MR npages calculation for IB_ACCESS_HUGETLB Signed-off-by: Jason Gunthorpe <[email protected]>
2019-08-21RDMA/mlx5: Use odp instead of mr->umem in pagefault_mrJason Gunthorpe1-6/+5
These are the same thing since mr always comes from odp->private. It is confusing to reference the same memory via two names. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-08-21RDMA/mlx5: Use ib_umem_start instead of umem.addressJason Gunthorpe1-3/+3
These are subtly different, the address is the original VA requested during umem_get, while ib_umem_start() is the version that is rounded to the proper page size, ie is the true start of the umem's dma map. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-08-21RDMA/core: Make invalidate_range a device operationMoni Shoua2-4/+1
The callback function 'invalidate_range' is implemented in a driver so the place for it is in the ib_device_ops structure and not in ib_ucontext. Signed-off-by: Moni Shoua <[email protected]> Reviewed-by: Guy Levi <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-08-21RDMA/odp: Provide ib_umem_odp_release() to undo the allocsJason Gunthorpe2-4/+4
Now that there are allocator APIs that return the ib_umem_odp directly it should be freed through a umem_odp free'er as well. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-08-21RDMA/odp: Split creating a umem_odp from ib_umem_getJason Gunthorpe2-21/+26
This is the last creation API that is overloaded for both, there is very little code sharing and a driver has to be specifically ready for a umem_odp to be created to use the odp version. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-08-21RDMA/odp: Make the three ways to create a umem_odp clearJason Gunthorpe1-12/+11
The three paths to build the umem_odps are kind of muddled, they are: - As a normal ib_mr umem - As a child in an implicit ODP umem tree - As the root of an implicit ODP umem tree Only the first two are actually umem's, the last is an abuse. The implicit case can only be triggered by explicit driver request, it should never be co-mingled with the normal case. While we are here, make sensible function names and add some comments to make this clearer. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-08-21RDMA/odp: Make it clearer when a umem is an implicit ODP umemJason Gunthorpe2-2/+2
Implicit ODP umems are special, they don't have any page lists, they don't exist in the interval tree and they are never DMA mapped. Instead of trying to guess this based on a zero length use an explicit flag. Further, do not allow non-implicit umems to be 0 size. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-08-21RDMA/odp: Iterate over the whole rbtree directlyJason Gunthorpe1-22/+19
Instead of intersecting a full interval, just iterate over every element directly. This is faster and clearer. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-08-20IB/mlx5: Block MR WR if UMR is not possibleMoni Shoua1-5/+19
Check conditions that are mandatory to post_send UMR WQEs. 1. Modifying page size. 2. Modifying remote atomic permissions if atomic access is required. If either condition is not fulfilled then fail to post_send() flow. Fixes: c8d75a980fab ("IB/mlx5: Respect new UMR capabilities") Signed-off-by: Moni Shoua <[email protected]> Reviewed-by: Guy Levi <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Doug Ledford <[email protected]>
2019-08-20IB/mlx5: Fix MR re-registration flow to use UMR properlyMoni Shoua1-1/+2
The UMR WQE in the MR re-registration flow requires that modify_atomic and modify_entity_size capabilities are enabled. Therefore, check that the these capabilities are present before going to umr flow and go through slow path if not. Fixes: c8d75a980fab ("IB/mlx5: Respect new UMR capabilities") Signed-off-by: Moni Shoua <[email protected]> Reviewed-by: Guy Levi <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Doug Ledford <[email protected]>
2019-08-20IB/mlx5: Report and handle ODP support properlyMoni Shoua2-11/+12
ODP depends on the several device capabilities, among them is the ability to send UMR WQEs with that modify atomic and entity size of the MR. Therefore, only if all conditions to send such a UMR WQE are met then driver can report that ODP is supported. Use this check of conditions in all places where driver needs to know about ODP support. Also, implicit ODP support depends on ability of driver to send UMR WQEs for an indirect mkey. Therefore, verify that all conditions to do so are met when reporting support. Fixes: c8d75a980fab ("IB/mlx5: Respect new UMR capabilities") Signed-off-by: Moni Shoua <[email protected]> Reviewed-by: Guy Levi <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Doug Ledford <[email protected]>
2019-08-20IB/mlx5: Consolidate use_umr checks into single functionMoni Shoua2-3/+15
Introduce helper function to unify various use_umr checks. Signed-off-by: Moni Shoua <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Doug Ledford <[email protected]>
2019-08-20RDMA/mlx5: Fix MR npages calculation for IB_ACCESS_HUGETLBJason Gunthorpe1-2/+3
When ODP is enabled with IB_ACCESS_HUGETLB then the required pages should be calculated based on the extent of the MR, which is rounded to the nearest huge page alignment. Fixes: d2183c6f1958 ("RDMA/umem: Move page_shift from ib_umem to ib_odp_umem") Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Doug Ledford <[email protected]>
2019-08-13RDMA/mlx5: Annotate lock dependency in bind/unbind slave portLeon Romanovsky1-2/+4
NULL-ing notifier_call is performed under protection of mlx5_ib_multiport_mutex lock. Such protection is not easily spotted and better to be guarded by lockdep annotation. Signed-off-by: Leon Romanovsky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Doug Ledford <[email protected]>
2019-08-13IB/mlx5: Expose XRQ legacy commands over the DEVX interfaceYishai Hadas1-0/+4
Expose missing XRQ legacy commands over the DEVX interface. Signed-off-by: Yishai Hadas <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Doug Ledford <[email protected]>
2019-08-13IB/mlx5: Add legacy events to DEVX listYishai Hadas1-0/+8
Add two events that were defined in the device specification but were not exposed in the driver list. Post this patch those events can be read over the DEVX events interface once be reported by the firmware. Signed-off-by: Yishai Hadas <[email protected]> Reviewed-by: Edward Srouji <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Doug Ledford <[email protected]>
2019-08-13Merge remote-tracking branch 'mlx5-next/mlx5-next' into wip/dl-for-nextDoug Ledford2-3/+4
Merging tip of mlx5-next in order to get changes related to adding XRQ support to the DEVX interface needed prior to the following two patches. Signed-off-by: Doug Ledford <[email protected]>