aboutsummaryrefslogtreecommitdiff
path: root/drivers/infiniband/sw
AgeCommit message (Collapse)AuthorFilesLines
2023-04-03RDMA/siw: Remove namespace check from siw_netdev_event()Tetsuo Handa1-3/+0
syzbot is reporting that siw_netdev_event(NETDEV_UNREGISTER) cannot destroy siw_device created after unshare(CLONE_NEWNET) due to net namespace check. It seems that this check was by error there and should be removed. Reported-by: syzbot <[email protected]> Link: https://syzkaller.appspot.com/bug?extid=5e70d01ee8985ae62a3b Suggested-by: Jason Gunthorpe <[email protected]> Suggested-by: Leon Romanovsky <[email protected]> Fixes: bdcf26bf9b3a ("rdma/siw: network and RDMA core interface") Signed-off-by: Tetsuo Handa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Bernard Metzler <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2023-03-30RDMA/rxe: Clean kzalloc failure pathsLeon Romanovsky2-23/+9
There is no need to print any debug messages after failure to allocate memory, because kernel will print OOM dumps anyway. Together with removal of these messages, remove useless goto jumps. Fixes: 5bf944f24129 ("RDMA/rxe: Add error messages") Reported-by: Dan Carpenter <[email protected]> Link: https://lore.kernel.org/all/[email protected] Link: https://lore.kernel.org/r/d3cedf723b84e73e8062a67b7489d33802bafba2.1680113597.git.leon@kernel.org Reviewed-by: Bob Pearson <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2023-03-29RDMA/rxe: Remove tasklet call from rxe_cq.cBob Pearson3-33/+3
Remove the tasklet call in rxe_cq.c and also the is_dying in the cq struct. There is no reason for the rxe driver to defer the call to the cq completion handler by scheduling a tasklet. rxe_cq_post() is not called in a hard irq context. The rxe driver currently is incorrect because the tasklet call is made without protecting the cq pointer with a reference from having the underlying memory freed before the deferred routine is called. Executing the comp_handler inline fixes this problem. Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Bob Pearson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Zhu Yanjun <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2023-03-24RDMA/rxe: Rewrite rxe_task.cBob Pearson2-56/+218
This patch is a major rewrite of the tasklet routines in rxe_task.c. The main motivation for this is the realization that the code violates the safety of the qp pointer by correct reference counting. When a tasklet is scheduled from a verbs API the calling thread has a valid reference to the qp and schedules the tasklet to run at a later time carrying a pointer to the qp. Once the calling code returns however the qp can be destroyed at any time. In order to correct this a reference to the qp must be taken when the task is scheduled and held until it finishes running. This is complicated by the tasklet library not alwys running a task that is scheduled depending on whether someone else has scheduled it. This patch moves the logic for deciding whether to run or schedule a task outside of do_task() and guarantees that there is only one copy of the task scheduled or running at a time. Secondly the separate flags controlling teardown and draining of the task are included in the task state machine and all references to the state are protected by spinlocks to avoid consistency and memory barrier issues. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bob Pearson <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-03-24RDMA/rxe: Make tasks schedule each otherBob Pearson2-6/+6
Replace rxe_run_task() by rxe_sched_task() when tasks call each other. These are not performance critical and mainly involve error paths but they run the risk of causing deadlocks. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ian Ziemba <[email protected]> Signed-off-by: Bob Pearson <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-03-24RDMA/rxe: Remove __rxe_do_task()Bob Pearson3-58/+17
The subroutine __rxe_do_task is not thread safe and it has no way to guarantee that the tasks, which are designed with the assumption that they are non-reentrant, are not reentered. All of its uses are non-performance critical. This patch replaces calls to __rxe_do_task with calls to rxe_sched_task. It also removes irrelevant or unneeded if tests. Instead of calling the task machinery a single call to the tasklet function (rxe_requester, etc.) is sufficient to draing the queues if task execution has been disabled or stopped. Together these changes allow the removal of __rxe_do_task. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ian Ziemba <[email protected]> Signed-off-by: Bob Pearson <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-03-24RDMA/rxe: Remove qp reference counting in tasksBob Pearson3-14/+0
Currently each of the three tasklets requester, completer and responder in the rxe driver take and release a reference to the qp argument at the beginning and end of the subroutines. The caller passing in the qp argument should be responsible for holding a reference to qp so these are not required. Further doing so breaks the qp cleanup code in rxe_qp_do_cleanup which calls these routines after all the references have been dropped so they cannot drain the packet and work request queues as intended. In fact if these routines are deferred by calling tasklet_schedule there is no guarantee that the calling code does have a qp reference. That is a bug in rxe_task.c which will be fixed later in this series. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bob Pearson <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-03-24RDMA/rxe: Cleanup error state handling in rxe_comp.cBob Pearson3-23/+61
Cleanup the handling of qp in the error state, reset state and during rxe_qp_do_cleanup. Make the same as rxe_resp.c Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ian Ziemba <[email protected]> Signed-off-by: Bob Pearson <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-03-24RDMA/rxe: Cleanup reset state handling in rxe_resp.cBob Pearson2-51/+57
Cleanup the handling of qp in the error state, reset state and during rxe_qp_do_cleanup. The error state does about the same thing as the others but has code spread all over. This patch combines them in a cleaner way. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ian Ziemba <[email protected]> Signed-off-by: Bob Pearson <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-03-24RDMA/rxe: Convert tasklet args to queue pairsBob Pearson6-18/+17
Originally is was thought that the tasklet machinery in rxe_task.c would be used in other applications but that has not happened for years. This patch replaces the 'void *arg' by struct 'rxe_qp *qp' in the parameters to the tasklet calls. This change will have no affect on performance but may make the code a little clearer. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bob Pearson <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-03-24RDMA/rxe: Add error messagesBob Pearson5-241/+609
This patch adds error and debug messages so that every interaction with rdma-core through a verbs API call or a completion error return will generate at least one error message backed up by debug messages with more detail. With dynamic debugging one can follow up after seeing an error message by turning on the appropriate debug messages. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bob Pearson <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-03-24RDMA/rxe: Extend dbg log messages to err and infoBob Pearson2-3/+47
Extend the dbg log messages (e.g. rxe_dbg_xxx) to include err and info types. rxe.c is modified to use these new log messages as examples. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bob Pearson <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-03-24RDMA/rxe: Change rxe_dbg to rxe_dbg_devBob Pearson9-24/+25
Replace the name rxe_dbg with rxe_dbg_dev which better matches the remaining rxe_dbg_xxx macros for debug messages with a rxe device parameter. Reuse the name rxe_dbg for debug messages which do not have a rxe device parameter. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bob Pearson <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-03-24RDMA/rxe: Replace exists by rxe in rxe.cBob Pearson1-6/+6
'exists' looks like a boolean. This patch replaces it by the normal name used for the rxe device, 'rxe', which should be a little less confusing. The second rxe_dbg() message is incorrect since rxe is known to be NULL and this will cause a seg fault if this message were ever sent. Replace it by pr_debug for the moment. Fixes: c6aba5ea0055 ("RDMA/rxe: Replace pr_xxx by rxe_dbg_xxx in rxe.c") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bob Pearson <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-03-14RDMA/rdmavt: Delete unnecessary NULL checkNatalia Petrova1-2/+0
There is no need to check 'rdi->qp_dev' for NULL. The field 'qp_dev' is created in rvt_register_device() which will fail if the 'qp_dev' allocation fails in rvt_driver_qp_init(). Overwise this pointer doesn't changed and passed to rvt_qp_exit() by the next step. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 0acb0cc7ecc1 ("IB/rdmavt: Initialize and teardown of qpn table") Signed-off-by: Natalia Petrova <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]>
2023-03-13IB/rdmavt: Fix target union member for rvt_post_one_wr()Kees Cook1-1/+1
The "cplen" result used by the memcpy() into struct rvt_swqe "wqe" may be sized to 80 for struct rvt_ud_wr (which is member "ud_wr", not "wr" which is only 40 bytes in size). Change the destination union member so the compiler can use the correct bounds check. struct rvt_swqe { union { struct ib_send_wr wr; /* don't use wr.sg_list */ struct rvt_ud_wr ud_wr; ... }; ... }; Silences false positive memcpy() run-time warning: memcpy: detected field-spanning write (size 80) of single field "&wqe->wr" at drivers/infiniband/sw/rdmavt/qp.c:2043 (size 40) Reported-by: Zhang Yi <[email protected]> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216561 Cc: Dennis Dalessandro <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Leon Romanovsky <[email protected]> Cc: [email protected] Signed-off-by: Kees Cook <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Dennis Dalessandro <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2023-03-13RDMA/siw: Fix potential page_array out of range accessDaniil Dulov1-1/+1
When seg is equal to MAX_ARRAY, the loop should break, otherwise it will result in out of range access. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: b9be6f18cf9e ("rdma/siw: transmit path") Signed-off-by: Daniil Dulov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]>
2023-02-24Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds10-585/+600
Pull rdma updates from Jason Gunthorpe: "Quite a small cycle this time, even with the rc8. I suppose everyone went to sleep over xmas. - Minor driver updates for hfi1, cxgb4, erdma, hns, irdma, mlx5, siw, mana - inline CQE support for hns - Have mlx5 display device error codes - Pinned DMABUF support for irdma - Continued rxe cleanups, particularly converting the MRs to use xarray - Improvements to what can be cached in the mlx5 mkey cache" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (61 commits) IB/mlx5: Extend debug control for CC parameters IB/hfi1: Fix sdma.h tx->num_descs off-by-one errors IB/hfi1: Fix math bugs in hfi1_can_pin_pages() RDMA/irdma: Add support for dmabuf pin memory regions RDMA/mlx5: Use query_special_contexts for mkeys net/mlx5e: Use query_special_contexts for mkeys net/mlx5: Change define name for 0x100 lkey value net/mlx5: Expose bits for querying special mkeys RDMA/rxe: Fix missing memory barriers in rxe_queue.h RDMA/mana_ib: Fix a bug when the PF indicates more entries for registering memory on first packet RDMA/rxe: Remove rxe_alloc() RDMA/cma: Distinguish between sockaddr_in and sockaddr_in6 by size Subject: RDMA/rxe: Handle zero length rdma iw_cxgb4: Fix potential NULL dereference in c4iw_fill_res_cm_id_entry() RDMA/mlx5: Use rdma_umem_for_each_dma_block() RDMA/umem: Remove unused 'work' member from struct ib_umem RDMA/irdma: Cap MSIX used to online CPUs + 1 RDMA/mlx5: Check reg_create() create for errors RDMA/restrack: Correct spelling RDMA/cxgb4: Fix potential null-ptr-deref in pass_establish() ...
2023-02-16RDMA/rxe: Fix missing memory barriers in rxe_queue.hBob Pearson2-52/+76
An earlier patch which introduced smp_load_acquire/smp_store_release into rxe_queue.h incorrectly assumed that surrounding spin-locks in rxe_verbs.c around queue updates for kernel ulps was sufficient to protect the passing of data through the queues between the ulp and the rxe tasklets. But this was incorrect. The typical sequence was ulp rxe requester tasklet ------------------------ --------------------- spin_lock_irqsave() wqe = queue_head(queue) if (!queue_full(q)) { if (!wqe) spin_unlock_irqrestore return; return -ENOMEM } <process wqe> wqe = queue_producer_addr(q) <fill in wqe> queue_advance_consumer(queue) queue_advance_producer(q) spin_unlock_irqrestore() queue_head() calls queue_empty() which calls smp_load_acquire() For user space apps queue_advance_producer() calls smp_store_release() so that there is a memory barrier between the producer and the consumer but for kernel ulps queue_advance_produce() just incremented the producer index because the lock function is a release function. But to work the barrier has to come between filling in the wqe and updating the producer index. This patch adds the missing barriers. It also changes the enum names for the ulp queue types to QUEUE_TYPE_FROM/TO_ULP instead of QUEUE_TYPE_TO/FROM_DRIVER which is very ambiguous. This bug is suspected as the cause of very rare lockups in a very high scale storage application. It is a bug in any case and should be corrected. Fixes: 0a67c46d2e99 ("RDMA/rxe: Protect user space index loads/stores") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bob Pearson <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-02-16RDMA/rxe: Remove rxe_alloc()Bob Pearson4-66/+44
Currently all the object types in the rxe driver are allocated in rdma-core except for MRs. By moving tha kzalloc() call outside of the pool code the rxe_alloc() subroutine can be eliminated and code checking for MR as a special case can be removed. This patch moves the kzalloc() and kfree_rcu() calls into the mr registration and destruction verbs. It removes that code from rxe_pool.c including the rxe_alloc() subroutine which is no longer used. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bob Pearson <[email protected]> Reviewed-by: Devesh Sharma <[email protected]> Reviewed-by: Devesh Sharma <[email protected]> Reviewed-by: Zhu Yanjun <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-02-16Subject: RDMA/rxe: Handle zero length rdmaBob Pearson2-15/+51
Currently the rxe driver does not handle all cases of zero length rdma operations correctly. The client does not have to provide an rkey for zero length RDMA read or write operations so the rkey provided may be invalid and should not be used to lookup an mr. This patch corrects the driver to ignore the provided rkey if the reth length is zero for read or write operations and make sure to set the mr to NULL. In read_reply() if length is zero rxe_recheck_mr() is not called. Warnings are added in the routines in rxe_mr.c to catch NULL MRs when the length is non-zero. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bob Pearson <[email protected]> Reviewed-by: Daisuke Matsuda <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-02-06RDMA/siw: Fix user page pinning accountingBernard Metzler1-11/+12
To avoid racing with other user memory reservations, immediately account full amount of pages to be pinned. Fixes: 2251334dcac9 ("rdma/siw: application buffer management") Reported-by: Jason Gunthorpe <[email protected]> Suggested-by: Alistair Popple <[email protected]> Reviewed-by: Alistair Popple <[email protected]> Signed-off-by: Bernard Metzler <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]>
2023-01-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-16/+16
Conflicts: drivers/net/ethernet/intel/ice/ice_main.c 418e53401e47 ("ice: move devlink port creation/deletion") 643ef23bd9dd ("ice: Introduce local var for readability") https://lore.kernel.org/all/[email protected]/ https://lore.kernel.org/all/[email protected]/ drivers/net/ethernet/engleder/tsnep_main.c 3d53aaef4332 ("tsnep: Fix TX queue stop/wake for multiple queues") 25faa6a4c5ca ("tsnep: Replace TX spin_lock with __netif_tx_lock") https://lore.kernel.org/all/[email protected]/ net/netfilter/nf_conntrack_proto_sctp.c 13bd9b31a969 ("Revert "netfilter: conntrack: add sctp DATA_SENT state"") a44b7651489f ("netfilter: conntrack: unify established states for SCTP paths") f71cb8f45d09 ("netfilter: conntrack: sctp: use nf log infrastructure for invalid packets") https://lore.kernel.org/all/[email protected]/ https://lore.kernel.org/all/[email protected]/ Signed-off-by: Jakub Kicinski <[email protected]>
2023-01-27RDMA/rxe: Replace rxe_map and rxe_phys_buf by xarrayBob Pearson3-302/+254
Replace struct rxe-phys_buf and struct rxe_map by struct xarray in rxe_verbs.h. This allows using rcu locking on reads for the memory maps stored in each mr. This is based off of a sketch of a patch from Jason Gunthorpe in the link below. Some changes were needed to make this work. It applies cleanly to the current for-next and passes the pyverbs, perftest and the same blktests test cases which run today. Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/linux-rdma/Y3gvZr6%[email protected]/ Co-developed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Bob Pearson <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-01-26RDMA/rxe: Cleanup page variables in rxe_mr.cBob Pearson2-20/+22
Cleanup usage of mr->page_shift and mr->page_mask and introduce an extractor for mr->ibmr.page_size. Normal usage in the kernel has page_mask masking out offset in page rather than masking out the page number. The rxe driver had reversed that which was confusing. Implicitly there can be a per mr page_size which was not uniformly supported. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bob Pearson <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-01-26RDMA-rxe: Isolate mr code from atomic_write_reply()Bob Pearson3-42/+59
Isolate mr specific code from atomic_write_reply() in rxe_resp.c into a subroutine rxe_mr_do_atomic_write() in rxe_mr.c. Check length for atomic write operation. Make iova_to_vaddr() static. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bob Pearson <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-01-26RDMA-rxe: Isolate mr code from atomic_reply()Bob Pearson4-100/+105
Isolate mr specific code from atomic_reply() in rxe_resp.c into a subroutine rxe_mr_do_atomic_op() in rxe_mr.c. Minor cleanups to rxe_check_range() and iova_to_vaddr(). Move enum resp_state to rxe.h Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bob Pearson <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-01-26RDMA/rxe: Move rxe_map_mr_sg to rxe_mr.cBob Pearson3-36/+38
Move rxe_map_mr_sg() to rxe_mr.c where it makes a little more sense. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bob Pearson <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-01-26RDMA/rxe: Cleanup mr_check_rangeBob Pearson1-3/+1
Remove blank lines and replace EFAULT by EINVAL when an invalid mr type is used. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bob Pearson <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-01-23net/sock: Introduce trace_sk_data_ready()Peilin Ye2-0/+8
As suggested by Cong, introduce a tracepoint for all ->sk_data_ready() callback implementations. For example: <...> iperf-609 [002] ..... 70.660425: sk_data_ready: family=2 protocol=6 func=sock_def_readable iperf-609 [002] ..... 70.660436: sk_data_ready: family=2 protocol=6 func=sock_def_readable <...> Suggested-by: Cong Wang <[email protected]> Signed-off-by: Peilin Ye <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-01-09RDMA/rxe: Prevent faulty rkey generationDaisuke Matsuda1-5/+5
If you create MRs more than 0x10000 times after loading the module, responder starts to reply NAKs for RDMA/Atomic operations because of rkey violation detected in check_rkey(). The root cause is that rkeys are incremented each time a new MR is created and the value overflows into the range reserved for MWs. This commit also increases the value of RXE_MAX_MW that has been limited unlike other parameters. Fixes: 0994a1bcd5f7 ("RDMA/rxe: Bump up default maximum values used via uverbs") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Daisuke Matsuda <[email protected]> Tested-by: Li Zhijian <[email protected]> Reviewed-by: Li Zhijian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-01-09RDMA/rxe: Fix inaccurate constants in rxe_type_infoDaisuke Matsuda1-11/+11
ibv_query_device() has reported incorrect device attributes, which are actually not used by the device. Make the constants correspond with the attributes shown to users. Fixes: 3ccffe8abf2f ("RDMA/rxe: Move max_elem into rxe_type_info") Fixes: 3225717f6dfa ("RDMA/rxe: Replace red-black trees by xarrays") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Daisuke Matsuda <[email protected]> Reviewed-by: Li Zhijian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-12-17Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds2-33/+41
Pull rdma fixes from Jason Gunthorpe: "Fix two build warnings on 32 bit platforms It seems the linux-next CI and 0-day bot are not testing enough 32 bit configurations, as soon as you merged the rdma pull request there were two instant reports of warnings on these sytems that I would have thought should have been covered by time in linux-next Anyhow, here are the fixes so people don't hit problems with -Werror" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/siw: Fix pointer cast warning RDMA/rxe: Fix compile warnings on 32-bit
2022-12-16RDMA/siw: Fix pointer cast warningArnd Bergmann1-1/+1
The previous build fix left a remaining issue in configurations with 64-bit dma_addr_t on 32-bit architectures: drivers/infiniband/sw/siw/siw_qp_tx.c: In function 'siw_get_pblpage': drivers/infiniband/sw/siw/siw_qp_tx.c:32:37: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast] 32 | return virt_to_page((void *)paddr); | ^ Use the same double cast here that the driver uses elsewhere to convert between dma_addr_t and void*. Fixes: 0d1b756acf60 ("RDMA/siw: Pass a pointer to virt_to_page()") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnd Bergmann <[email protected]> Acked-by: Bernard Metzler <[email protected]> Reviewed-by: Linus Walleij <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-12-15RDMA/rxe: Fix compile warnings on 32-bitJason Gunthorpe1-32/+40
Move the conditional code into a function, with two varients so it is harder to make these kinds of mistakes. drivers/infiniband/sw/rxe/rxe_resp.c: In function 'atomic_write_reply': drivers/infiniband/sw/rxe/rxe_resp.c:794:13: error: unused variable 'payload' [-Werror=unused-variable] 794 | int payload = payload_size(pkt); | ^~~~~~~ drivers/infiniband/sw/rxe/rxe_resp.c:793:24: error: unused variable 'mr' [-Werror=unused-variable] 793 | struct rxe_mr *mr = qp->resp.mr; | ^~ drivers/infiniband/sw/rxe/rxe_resp.c:791:19: error: unused variable 'dst' [-Werror=unused-variable] 791 | u64 src, *dst; | ^~~ drivers/infiniband/sw/rxe/rxe_resp.c:791:13: error: unused variable 'src' [-Werror=unused-variable] 791 | u64 src, *dst; Fixes: 034e285f8b99 ("RDMA/rxe: Make responder support atomic write on RC service") Link: https://lore.kernel.org/linux-rdma/[email protected]/ Reported-by: Guenter Roeck <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-12-14Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds25-361/+818
Pull rdma updates from Jason Gunthorpe: "Usual size of updates, a new driver, and most of the bulk focusing on rxe: - Usual typos, style, and language updates - Driver updates for mlx5, irdma, siw, rts, srp, hfi1, hns, erdma, mlx4, srp - Lots of RXE updates: * Improve reply error handling for bad MR operations * Code tidying * Debug printing uses common loggers * Remove half implemented RD related stuff * Support IBA's recently defined Atomic Write and Flush operations - erdma support for atomic operations - New driver 'mana' for Ethernet HW available in Azure VMs. This driver only supports DPDK" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (122 commits) IB/IPoIB: Fix queue count inconsistency for PKEY child interfaces RDMA: Add missed netdev_put() for the netdevice_tracker RDMA/rxe: Enable RDMA FLUSH capability for rxe device RDMA/cm: Make QP FLUSHABLE for supported device RDMA/rxe: Implement flush completion RDMA/rxe: Implement flush execution in responder side RDMA/rxe: Implement RC RDMA FLUSH service in requester side RDMA/rxe: Extend rxe packet format to support flush RDMA/rxe: Allow registering persistent flag for pmem MR only RDMA/rxe: Extend rxe user ABI to support flush RDMA: Extend RDMA kernel verbs ABI to support flush RDMA: Extend RDMA user ABI to support flush RDMA/rxe: Fix incorrect responder length checking RDMA/rxe: Fix oops with zero length reads RDMA/mlx5: Remove not-used IB_FLOW_SPEC_IB define RDMA/hns: Fix XRC caps on HIP08 RDMA/hns: Fix error code of CMD RDMA/hns: Fix page size cap from firmware RDMA/hns: Fix PBL page MTR find RDMA/hns: Fix AH attr queried by query_qp ...
2022-12-09RDMA/rxe: Enable RDMA FLUSH capability for rxe deviceLi Zhijian1-0/+2
Now we are ready to enable RDMA FLUSH capability for RXE. It can support Global Visibility and Persistence placement types. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Zhu Yanjun <[email protected]> Signed-off-by: Li Zhijian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-12-09RDMA/rxe: Implement flush completionLi Zhijian1-1/+3
Per IBA SPEC, FLUSH will ack in rdma read response with 0 length. Use IB_WC_FLUSH (aka IB_UVERBS_WC_FLUSH) code to tell userspace a FLUSH completion. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Zhu Yanjun <[email protected]> Signed-off-by: Li Zhijian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-12-09RDMA/rxe: Implement flush execution in responder sideLi Zhijian4-20/+183
Only the requested placement types that also registered in the destination memory region are acceptable. Otherwise, responder will also reply NAK "Remote Access Error" if it found a placement type violation. We will persist data via arch_wb_cache_pmem(), which could be architecture specific. This commit also adds 2 helpers to update qp.resp from the incoming packet. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Zhu Yanjun <[email protected]> Signed-off-by: Li Zhijian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-12-09RDMA/rxe: Implement RC RDMA FLUSH service in requester sideLi Zhijian1-1/+14
Implement FLUSH request operation in the requester. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Zhu Yanjun <[email protected]> Signed-off-by: Li Zhijian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-12-09RDMA/rxe: Extend rxe packet format to support flushLi Zhijian3-5/+73
Extend rxe opcode tables, headers, helper and constants to support flush operations. Refer to the IBA A19.4.1 for more FETH definition details Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Zhu Yanjun <[email protected]> Signed-off-by: Li Zhijian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-12-09RDMA/rxe: Allow registering persistent flag for pmem MR onlyLi Zhijian1-2/+20
Memory region could support at most 2 flush access flags: IB_ACCESS_FLUSH_PERSISTENT and IB_ACCESS_FLUSH_GLOBAL But we only allow user to register persistent flush flags to the pmem MR where it has the ability of persisting data across power cycles. So registering a persistent access flag to a non-pmem MR will be rejected. Link: https://lore.kernel.org/r/[email protected] CC: Dan Williams <[email protected]> Signed-off-by: Li Zhijian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-12-09RDMA/rxe: Fix incorrect responder length checkingBob Pearson1-26/+36
The code in rxe_resp.c at check_length() is incorrect as it compares pkt->opcode an 8 bit value against various mask bits which are all higher than 256 so nothing is ever reported. This patch rewrites this to compare against pkt->mask which is correct. However this now exposes another error. For UD send packets the value of the pmtu cannot be determined from qp->mtu. All that is required here is to later check if the payload fits into the posted receive buffer in that case. Fixes: 837a55847ead ("RDMA/rxe: Implement packet length validation on responder") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bob Pearson <[email protected]> Reviewed-by: Daisuke Matsuda <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-12-09RDMA/rxe: Fix oops with zero length readsDaisuke Matsuda1-3/+5
The commit 686d348476ee ("RDMA/rxe: Remove unnecessary mr testing") causes a kernel crash. If responder get a zero-byte RDMA Read request, qp->resp.mr is not set in check_rkey() (see IBA C9-88). The mr is NULL in this case, and a NULL pointer dereference occurs as shown below. BUG: kernel NULL pointer dereference, address: 0000000000000010 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD 0 P4D 0 Oops: 0002 [#1] PREEMPT SMP PTI CPU: 2 PID: 3622 Comm: python3 Kdump: loaded Not tainted 6.1.0-rc3+ #34 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: 0010:__rxe_put+0xc/0x60 [rdma_rxe] Code: cc cc cc 31 f6 e8 64 36 1b d3 41 b8 01 00 00 00 44 89 c0 c3 cc cc cc cc 41 89 c0 eb c1 90 0f 1f 44 00 00 41 54 b8 ff ff ff ff <f0> 0f c1 47 10 83 f8 01 74 11 45 31 e4 85 c0 7e 20 44 89 e0 41 5c RSP: 0018:ffffb27bc012ce78 EFLAGS: 00010246 RAX: 00000000ffffffff RBX: ffff9790857b0580 RCX: 0000000000000000 RDX: ffff979080fe145a RSI: 000055560e3e0000 RDI: 0000000000000000 RBP: ffff97909c7dd800 R08: 0000000000000001 R09: e7ce43d97f7bed0f R10: ffff97908b29c300 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: ffff97908b29c300 R15: 0000000000000000 FS: 00007f276f7bd740(0000) GS:ffff9792b5c80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 0000000114230002 CR4: 0000000000060ee0 Call Trace: <IRQ> read_reply+0xda/0x310 [rdma_rxe] rxe_responder+0x82d/0xe50 [rdma_rxe] do_task+0x84/0x170 [rdma_rxe] tasklet_action_common.constprop.0+0xa7/0x120 __do_softirq+0xcb/0x2ac do_softirq+0x63/0x90 </IRQ> Support a NULL mr during read_reply() Fixes: 686d348476ee ("RDMA/rxe: Remove unnecessary mr testing") Fixes: b5f9a01fae42 ("RDMA/rxe: Fix mr leak in RESPST_ERR_RNR") Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Daisuke Matsuda <[email protected]> Signed-off-by: Li Zhijian <[email protected]> Reviewed-by: Li Zhijian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-12-09Merge tag 'v6.1-rc8' into rdma.git for-nextJason Gunthorpe1-1/+3
For dependencies in following patches Signed-off-by: Jason Gunthorpe <[email protected]>
2022-12-01RDMA/rxe: Enable atomic write capability for rxe deviceXiao Yang1-0/+5
The capability shows that rxe device supports atomic write operation. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Xiao Yang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-12-01RDMA/rxe: Implement atomic write completionXiao Yang1-0/+4
Generate an atomic write completion when the atomic write request has been finished. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Xiao Yang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-12-01RDMA/rxe: Make responder support atomic write on RC serviceXiao Yang1-5/+79
Make responder process an atomic write request and send a read response on RC service. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Xiao Yang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-12-01RDMA/rxe: Make requester support atomic write on RC serviceXiao Yang1-2/+13
Make requester process and send an atomic write request on RC service. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Xiao Yang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-12-01RDMA/rxe: Extend rxe packet format to support atomic writeXiao Yang2-0/+21
Extend rxe_wr_opcode_info[] and rxe_opcode[] for new atomic write opcode. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Xiao Yang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>