aboutsummaryrefslogtreecommitdiff
path: root/include/rdma
AgeCommit message (Collapse)AuthorFilesLines
2023-02-12RDMA/umem: Remove unused 'work' member from struct ib_umemJason Gunthorpe1-1/+0
It is not used now. Fixes: b95df5e3e459 ("drivers/IB,core: reduce scope of mmap_sem") Signed-off-by: Jason Gunthorpe <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Devesh Sharma <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2023-02-07RDMA/restrack: Correct spellingDeming Wang1-2/+2
Fix spelling errors. Signed-off-by: Deming Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]>
2023-01-15RDMA/mlx: Calling qp event handler in workqueue contextMark Zhang1-1/+1
Move the call of qp event handler from atomic to workqueue context, so that the handler is able to block. This is needed by following patches. Signed-off-by: Mark Zhang <[email protected]> Reviewed-by: Patrisious Haddad <[email protected]> Link: https://lore.kernel.org/r/0cd17b8331e445f03942f4bb28d447f24ac5669d.1672821186.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <[email protected]>
2023-01-10RDMA/cma: Refactor the inbound/outbound path records process flowMark Zhang2-2/+1
Refactors based on comments [1] of the multiple path records support patchset: - Return failure if not able to set inbound/outbound PRs; - Simplify the flow when receiving the PRs from netlink channel: When a good PR response is received, unpack it and call the path_query callback directly. This saves two memory allocations; - Define RDMA_PRIMARY_PATH_MAX_REC_NUM in a proper place. [1] https://lore.kernel.org/linux-rdma/[email protected]/ Signed-off-by: Mark Zhang <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> #srp Link: https://lore.kernel.org/r/7610025d57342b8b6da0f19516c9612f9c3fdc37.1672819376.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <[email protected]>
2022-12-09RDMA: Extend RDMA kernel verbs ABI to support flushLi Zhijian2-1/+20
This commit extends the RDMA kernel verbs ABI to support the flush operation defined in IBA A19.4.1. These changes are backward compatible with the existing RDMA kernel verbs ABI. It makes device/HCA support new FLUSH attributes/capabilities, and it also makes memory region support new FLUSH access flags. Users can use ibv_reg_mr(3) to register flush access flags. Only the access flags also supported by device's capabilities can be registered successfully. Once registered successfully, it means the MR is flushable. Similarly, A flushable MR should also have one or both of GLOBAL_VISIBILITY and PERSISTENT attributes/capabilities like device/HCA. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Zhu Yanjun <[email protected]> Signed-off-by: Li Zhijian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-12-01RDMA: Extend RDMA kernel ABI to support atomic writeXiao Yang2-0/+5
1) Define new atomic write request/completion in kernel. 2) Define new atomic write capability in kernel. 3) Define new atomic write opcode for RC service in packet. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Xiao Yang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-11-28RDMA: Add netdevice_tracker to ib_device_set_netdev()Jason Gunthorpe1-0/+1
This will cause an informative backtrace to print if the user of ib_device_set_netdev() isn't careful about tearing down the ibdevice before its the netdevice parent is destroyed. Such as like this: unregister_netdevice: waiting for vlan0 to become free. Usage count = 2 leaked reference. ib_device_set_netdev+0x266/0x730 siw_newlink+0x4e0/0xfd0 nldev_newlink+0x35c/0x5c0 rdma_nl_rcv_msg+0x36d/0x690 rdma_nl_rcv+0x2ee/0x430 netlink_unicast+0x543/0x7f0 netlink_sendmsg+0x918/0xe20 sock_sendmsg+0xcf/0x120 ____sys_sendmsg+0x70d/0x8b0 ___sys_sendmsg+0x11d/0x1b0 __sys_sendmsg+0xfa/0x1d0 do_syscall_64+0x35/0xb0 entry_SYSCALL_64_after_hwframe+0x63/0xcd This will help debug the issues syzkaller is seeing. Signed-off-by: Jason Gunthorpe <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]>
2022-10-19RDMA/opa_vnic: fix spelling typo in commentJiangshan Yi1-1/+1
Fix spelling typo in comment. Reported-by: k2ci <[email protected]> Signed-off-by: Jiangshan Yi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]>
2022-10-19RDMA/core: return -EOPNOSUPP for ODP unsupported deviceLi Zhijian1-1/+1
ib_reg_mr(3) which is used to register a MR with specific access flags for specific HCA will set errno when something go wrong. So, here we should return the specific -EOPNOTSUPP when the being requested ODP access flag is unsupported by the HCA(such as RXE). Signed-off-by: Li Zhijian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Zhu Yanjun <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2022-09-27RDMA/core: Add UVERBS_ATTR_RAW_FDJason Gunthorpe1-0/+13
This uses the same passing protocol as UVERBS_ATTR_FD (eg len = 0 data_s64 = fd), except that the FD is not required to be a uverbs object and the core code does not covert the FD to an object handle automatically. Access to the int fd is provided by uverbs_get_raw_fd(). Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jason Gunthorpe <[email protected]>
2022-09-22RDMA/cm: Use DLID from inbound/outbound PathRecords as the datapath DLIDMark Zhang1-0/+2
In inter-subnet cases, when inbound/outbound PRs are available, outbound_PR.dlid is used as the requestor's datapath DLID and inbound_PR.dlid is used as the responder's DLID. The inbound_PR.dlid is passed to responder side with the "ConnectReq.Primary_Local_Port_LID" field. With this solution the PERMISSIVE_LID is no longer used in Primary Local LID field. Signed-off-by: Mark Zhang <[email protected]> Reviewed-by: Mark Bloch <[email protected]> Link: https://lore.kernel.org/r/b3f6cac685bce9dde37c610be82e2c19d9e51d9e.1662631201.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <[email protected]>
2022-09-22RDMA/cma: Multiple path records support with netlink channelMark Zhang2-1/+8
Support receiving inbound and outbound IB path records (along with GMP PathRecord) from user-space service through the RDMA netlink channel. The LIDs in these 3 PRs can be used in this way: 1. GMP PR: used as the standard local/remote LIDs; 2. DLID of outbound PR: Used as the "dlid" field for outbound traffic; 3. DLID of inbound PR: Used as the "dlid" field for outbound traffic in responder side. This is aimed to support adaptive routing. With current IB routing solution when a packet goes out it's assigned with a fixed DLID per target, meaning a fixed router will be used. The LIDs in inbound/outbound path records can be used to identify group of routers that allow communication with another subnet's entity. With them packets from an inter-subnet connection may travel through any router in the set to reach the target. As confirmed with Jason, when sending a netlink request, kernel uses LS_RESOLVE_PATH_USE_ALL so that the service knows kernel supports multiple PRs. Signed-off-by: Mark Zhang <[email protected]> Reviewed-by: Mark Bloch <[email protected]> Link: https://lore.kernel.org/r/2fa2b6c93c4c16c8915bac3cfc4f27be1d60519d.1662631201.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <[email protected]>
2022-09-22RDMA/core: Rename rdma_route.num_paths field to num_pri_alt_pathsMark Zhang1-1/+6
This fields means the total number of primary and alternative paths, i.e.,: 0 - No primary nor alternate path is available; 1 - Only primary path is available; 2 - Both primary and alternate path are available. Rename it to avoid confusion as with follow patches primary path will support multiple path records. Signed-off-by: Mark Zhang <[email protected]> Reviewed-by: Mark Bloch <[email protected]> Link: https://lore.kernel.org/r/cbe424de63a56207870d70c5edce7c68e45f429e.1662631201.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <[email protected]>
2022-08-30IB/cm: remove cm_id_priv->id.service_mask and service_mask parameter of ↵Mark Zhang1-1/+0
cm_init_listen() The service_mask is always ~cpu_to_be64(0), so the result is always a NOP when it is &'d with a service_id. Remove it for simplicity. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Zhang <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2022-08-30IB/cm: Remove the service_mask parameter from ib_cm_listen()Mark Zhang1-6/+1
Remove the service_mask parameter of ib_cm_listen(), as all callers use 0. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Zhang <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2022-08-21IB: move from strlcpy with unused retval to strscpyWolfram Sang1-1/+1
Follow the advice of the below link and prefer 'strscpy' in this subsystem. Conversion is 1:1 because the return value is not used. Generated by a coccinelle script. Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Wolfram Sang <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2022-08-06Merge tag 'dma-mapping-5.20-2022-08-06' of ↵Linus Torvalds1-0/+11
git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping updates from Christoph Hellwig: - convert arm32 to the common dma-direct code (Arnd Bergmann, Robin Murphy, Christoph Hellwig) - restructure the PCIe peer to peer mapping support (Logan Gunthorpe) - allow the IOMMU code to communicate an optional DMA mapping length and use that in scsi and libata (John Garry) - split the global swiotlb lock (Tianyu Lan) - various fixes and cleanup (Chao Gao, Dan Carpenter, Dongli Zhang, Lukas Bulwahn, Robin Murphy) * tag 'dma-mapping-5.20-2022-08-06' of git://git.infradead.org/users/hch/dma-mapping: (45 commits) swiotlb: fix passing local variable to debugfs_create_ulong() dma-mapping: reformat comment to suppress htmldoc warning PCI/P2PDMA: Remove pci_p2pdma_[un]map_sg() RDMA/rw: drop pci_p2pdma_[un]map_sg() RDMA/core: introduce ib_dma_pci_p2p_dma_supported() nvme-pci: convert to using dma_map_sgtable() nvme-pci: check DMA ops when indicating support for PCI P2PDMA iommu/dma: support PCI P2PDMA pages in dma-iommu map_sg iommu: Explicitly skip bus address marked segments in __iommu_map_sg() dma-mapping: add flags to dma_map_ops to indicate PCI P2PDMA support dma-direct: support PCI P2PDMA pages in dma-direct map_sg dma-mapping: allow EREMOTEIO return code for P2PDMA transfers PCI/P2PDMA: Introduce helpers for dma_map_sg implementations PCI/P2PDMA: Attempt to set map_type if it has not been set lib/scatterlist: add flag for indicating P2PDMA segments in an SGL swiotlb: clean up some coding style and minor issues dma-mapping: update comment after dmabounce removal scsi: sd: Add a comment about limiting max_sectors to shost optimal limit ata: libata-scsi: cap ata_device->max_sectors according to shost->max_sectors scsi: scsi_transport_sas: cap shost opt_sectors according to DMA optimal limit ...
2022-07-26RDMA/core: introduce ib_dma_pci_p2p_dma_supported()Logan Gunthorpe1-0/+11
Introduce the helper function ib_dma_pci_p2p_dma_supported() to check if a given ib_device can be used in P2PDMA transfers. This ensures the ib_device is not using virt_dma and also that the underlying dma_device supports P2PDMA. Use the new helper in nvme-rdma to replace the existing check for ib_uses_virt_dma(). Adding the dma_pci_p2pdma_supported() check allows switching away from pci_p2pdma_[un]map_sg(). Signed-off-by: Logan Gunthorpe <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Max Gurtovoy <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2022-07-22RDMA: Fix comment typoXin Gao1-1/+1
The double `get' is duplicated, remove one. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Xin Gao <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-06-16RDMA/core: Add a netevent notifier to cmaPatrisious Haddad1-0/+1
Add a netevent callback for cma, mainly to catch NETEVENT_NEIGH_UPDATE. Previously, when a system with failover MAC mechanism change its MAC address during a CM connection attempt, the RDMA-CM would take a lot of time till it disconnects and timesout due to the incorrect MAC address. Now when we get a NETEVENT_NEIGH_UPDATE we check if it is due to a failover MAC change and if so, we instantly destroy the CM and notify the user in order to spare the unnecessary waiting for the timeout. Link: https://lore.kernel.org/r/bb255c9e301cd50b905663b8e73f7f5133d0e4c5.1654601342.git.leonro@nvidia.com Signed-off-by: Patrisious Haddad <[email protected]> Reviewed-by: Mark Zhang <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2022-05-24RDMA/core: Fix typo in commentJulia Lawall1-1/+1
Spelling mistake (triple letters) in comment. Detected with the help of Coccinelle. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Julia Lawall <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-04-12Merge branch 'mlx5-next' of ↵Jason Gunthorpe1-8/+0
https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Leon Romanovsky says: ==================== Mellanox shared branch that includes: * Removal of FPGA TLS code https://lore.kernel.org/all/[email protected] Mellanox INNOVA TLS cards are EOL in May, 2018 [1]. As such, the code is unmaintained, untested and not in-use by any upstream/distro oriented customers. In order to reduce code complexity, drop the kernel code, clean build config options and delete useless kTLS vs. TLS separation. [1] https://network.nvidia.com/related-docs/eol/LCR-000286.pdf * Removal of FPGA IPsec code https://lore.kernel.org/all/[email protected] Together with FPGA TLS, the IPsec went to EOL state in the November of 2019 [1]. Exactly like FPGA TLS, no active customers exist for this upstream code and all the complexity around that area can be deleted. [2] https://network.nvidia.com/related-docs/eol/LCR-000535.pdf * Fix to undefined behavior from Borislav https://lore.kernel.org/all/[email protected] ==================== * 'mlx5-next' of https://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Remove not-implemented IPsec capabilities net/mlx5: Remove ipsec_ops function table net/mlx5: Reduce kconfig complexity while building crypto support net/mlx5: Move IPsec file to relevant directory net/mlx5: Remove not-needed IPsec config net/mlx5: Align flow steering allocation namespace to common style net/mlx5: Unify device IPsec capabilities check net/mlx5: Remove useless IPsec device checks net/mlx5: Remove ipsec vs. ipsec offload file separation RDMA/core: Delete IPsec flow action logic from the core RDMA/mlx5: Drop crypto flow steering API RDMA/mlx5: Delete never supported IPsec flow action net/mlx5: Remove FPGA ipsec specific statistics net/mlx5: Remove XFRM no_trailer flag net/mlx5: Remove not-used IDA field from IPsec struct net/mlx5: Delete metadata handling logic net/mlx5_fpga: Drop INNOVA IPsec support IB/mlx5: Fix undefined behavior due to shift overflowing the constant net/mlx5: Cleanup kTLS function names and their exposure net/mlx5: Remove tls vs. ktls separation as it is the same net/mlx5: Remove indirection in TLS build net/mlx5: Reliably return TLS device capabilities net/mlx5_fpga: Drop INNOVA TLS support Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Jason Gunthorpe <[email protected]>
2022-04-09RDMA/core: Delete IPsec flow action logic from the coreLeon Romanovsky1-8/+0
The removal of mlx5 flow steering logic, left the kernel without any RDMA drivers that implements flow action callbacks supplied by RDMA/core. Any user access to them caused to EOPNOTSUPP error, which can be achieved by simply removing ioctl implementation. Link: https://lore.kernel.org/r/a638e376314a2eb1c66f597c0bbeeab2e5de7faf.1649232994.git.leonro@nvidia.com Reviewed-by: Raed Salem <[email protected]> Acked-by: Jason Gunthorpe <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]>
2022-04-06RDMA: Split kernel-only global device caps from uverbs device capsJason Gunthorpe2-53/+38
Split out flags from ib_device::device_cap_flags that are only used internally to the kernel into kernel_cap_flags that is not part of the uapi. This limits the device_cap_flags to being the same bitmap that will be copied to userspace. This cleanly splits out the uverbs flags from the kernel flags to avoid confusion in the flags bitmap. Add some short comments describing which each of the kernel flags is connected to. Remove unused kernel flags. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Max Gurtovoy <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-04-04IB/uverbs: Move part of enum ib_device_cap_flags to uapiXiao Yang1-38/+66
1) Part of enum ib_device_cap_flags are used by ibv_query_device(3) or ibv_query_device_ex(3), so we define them in include/uapi/rdma/ib_user_verbs.h and only expose them to userspace. 2) Reformat enum ib_device_cap_flags by removing the indent before '='. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Xiao Yang <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-04-04IB/uverbs: Move enum ib_raw_packet_caps to uapiXiao Yang1-7/+11
This enum is used by ibv_query_device_ex(3) so it should be defined in include/uapi/rdma/ib_user_verbs.h. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Xiao Yang <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-02-25uaccess: remove CONFIG_SET_FSArnd Bergmann1-1/+1
There are no remaining callers of set_fs(), so CONFIG_SET_FS can be removed globally, along with the thread_info field and any references to it. This turns access_ok() into a cheaper check against TASK_SIZE_MAX. As CONFIG_SET_FS is now gone, drop all remaining references to set_fs()/get_fs(), mm_segment_t, user_addr_max() and uaccess_kernel(). Acked-by: Sam Ravnborg <[email protected]> # for sparc32 changes Acked-by: "Eric W. Biederman" <[email protected]> Tested-by: Sergey Matyukevich <[email protected]> # for arc changes Acked-by: Stafford Horne <[email protected]> # [openrisc, asm-generic] Acked-by: Dinh Nguyen <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
2022-01-07RDMA/core: Calculate UDP source port based on flow label or lqpn/rqpnZhu Yanjun1-0/+17
Calculate and set UDP source port based on the flow label. If flow label is not defined in GRH then calculate it based on lqpn/rqpn. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Zhu Yanjun <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-01-05IB/mlx5: Expose NDR speed through MADMaher Sanalla1-0/+1
Under MAD query port, Report NDR speed when NDR is supported in the port capability mask. Link: https://lore.kernel.org/r/a2ab630d2a634547db9b581faa9d65da2edb9d05.1639554831.git.leonro@nvidia.com Signed-off-by: Maher Sanalla <[email protected]> Reviewed-by: Michael Guralnik <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2022-01-05RDMA/mad: Delete duplicated init_query_mad functionsLeon Romanovsky1-1/+11
Several drivers used same function to initialize query MAD, so move that function to global header file. Link: https://lore.kernel.org/r/af6f35c590ff5ef56d0137351b8b295af0f7c13c.1641369858.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <[email protected]> Reviewed-by: Håkon Bugge <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-11-16RDMA/netlink: Add __maybe_unused to static inline in C fileLeon Romanovsky1-1/+1
Like other commits in the tree add __maybe_unused to a static inline in a C file because some clang compilers will complain about unused code: >> drivers/infiniband/core/nldev.c:2543:1: warning: unused function '__chk_RDMA_NL_NLDEV' MODULE_ALIAS_RDMA_NETLINK(RDMA_NL_NLDEV, 5); ^ Fixes: e3bf14bdc17a ("rdma: Autoload netlink client modules") Link: https://lore.kernel.org/r/4a8101919b765e01d7fde6f27fd572c958deeb4a.1636267207.git.leonro@nvidia.com Reported-by: kernel test robot <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-10-29RDMA/hns: Use the core code to manage the fixed mmap entriesChengchang Tang1-0/+9
Add a new implementation for mmap by using the new mmap entry API. This makes way for further use of the dynamic mmap allocator in this driver. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Chengchang Tang <[email protected]> Signed-off-by: Yixing Liu <[email protected]> Signed-off-by: Wenpeng Liang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-10-28RDMA/umem: Allow pinned dmabuf umem usageGal Pressman1-0/+11
Introduce ib_umem_dmabuf_get_pinned() which allows the driver to get a dmabuf umem which is pinned and does not require move_notify callback implementation. The returned umem is pinned and DMA mapped like standard cpu umems, and is released through ib_umem_release() (incl. unpinning and unmapping). Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Gal Pressman <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-10-13RDMA/core: Set sgtable nents when using ib_dma_virt_map_sg()Logan Gunthorpe1-1/+6
ib_dma_map_sgtable_attrs() should be mapping the sgls and setting nents but the ib_uses_virt_dma() path falls back to ib_dma_virt_map_sg() which will not set the nents in the sgtable. Check the return value (per the map_sg calling convention) and set sgt->nents appropriately on success. Fixes: 79fbd3e1241c ("RDMA: Use the sg_table directly and remove the opencoded version from umem") Link: https://lore.kernel.org/r/[email protected] Reported-by: Bart Van Assche <[email protected]> Signed-off-by: Logan Gunthorpe <[email protected]> Tested-by: Bart Van Assche <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-10-12RDMA/mlx5: Add modify_op_stat() supportAharon Landau1-0/+2
Add support for ib callback modify_op_stat() to add or remove an optional counter. When adding, a steering flow table is created with a rule that catches and counts all the matching packets. When removing, the table and flow counter are destroyed. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Aharon Landau <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Mark Zhang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-10-12RDMA/mlx5: Add steering support in optional flow countersAharon Landau1-0/+1
Adding steering infrastructure for adding and removing optional counter. This allows to add and remove the counters dynamically in order not to hurt performance. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Aharon Landau <[email protected]> Reviewed-by: Maor Gottlieb <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Mark Zhang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-10-12RDMA/counter: Add optional counter supportAharon Landau2-0/+15
An optional counter is a driver-specific counter that may be dynamically enabled/disabled. This enhancement allows drivers to expose counters which are, for example, mutually exclusive and cannot be enabled at the same time, counters that might degrades performance, optional debug counters, etc. Optional counters are marked with IB_STAT_FLAG_OPTIONAL flag. They are not exported in sysfs, and must be at the end of all stats, otherwise the attr->show() in sysfs would get wrong indexes for hwcounters that are behind optional counters. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Aharon Landau <[email protected]> Signed-off-by: Neta Ostrovsky <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Mark Zhang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-10-12RDMA/counter: Add an is_disabled field in struct rdma_hw_statsAharon Landau1-0/+3
Add a bitmap in rdma_hw_stat structure, with each bit indicates whether the corresponding counter is currently disabled or not. By default hwcounters are enabled. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Aharon Landau <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Mark Zhang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-10-12RDMA/core: Add a helper API rdma_free_hw_stats_structMark Zhang1-23/+4
Add a new API rdma_free_hw_stats_struct to pair with rdma_alloc_hw_stats_struct (which is also de-inlined). This will be useful when there are more alloc/free works in following patches. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Mark Zhang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-10-12RDMA/counter: Add a descriptor in struct rdma_hw_statsAharon Landau1-6/+15
Add a counter statistic descriptor structure in rdma_hw_stats. In addition to the counter name, more meta-information will be added. This code extension is needed for optional-counter support in the following patches. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Aharon Landau <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Mark Zhang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-08-30Merge branch 'sg_nents' into rdma.git for-nextJason Gunthorpe2-6/+33
From Maor Gottlieb ==================== Fix the use of nents and orig_nents in the sg table append helpers. The nents should be used by the DMA layer to store the number of DMA mapped sges, the orig_nents is the number of CPU sges. Since the sg append logic doesn't always create a SGL with exactly orig_nents entries store a total_nents as well to allow the table to be properly free'd and reorganize the freeing logic to share across all the use cases. ==================== Signed-off-by: Jason Gunthorpe <[email protected]> * 'sg_nents': RDMA: Use the sg_table directly and remove the opencoded version from umem lib/scatterlist: Fix wrong update of orig_nents lib/scatterlist: Provide a dedicated function to support table append
2021-08-24RDMA: Use the sg_table directly and remove the opencoded version from umemMaor Gottlieb2-7/+33
This allows using the normal sg_table APIs and makes all the code cleaner. Remove sgt, nents and nmapd from ib_umem. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Maor Gottlieb <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-08-24lib/scatterlist: Fix wrong update of orig_nentsMaor Gottlieb1-0/+1
orig_nents should represent the number of entries with pages, but __sg_alloc_table_from_pages sets orig_nents as the number of total entries in the table. This is wrong when the API is used for dynamic allocation where not all the table entries are mapped with pages. It wasn't observed until now, since RDMA umem who uses this API in the dynamic form doesn't use orig_nents implicit or explicit by the scatterlist APIs. Fix it by changing the append API to track the SG append table state and have an API to free the append table according to the total number of entries in the table. Now all APIs set orig_nents as number of enries with pages. Fixes: 07da1223ec93 ("lib/scatterlist: Add support in dynamic allocation of SG table from pages") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Maor Gottlieb <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-08-19RDMA/core/sa_query: Remove unused functionHåkon Bugge1-24/+0
ib_sa_service_rec_query() was introduced in kernel v2.6.13 by commit cbae32c56314 ("[PATCH] IB: Add Service Record support to SA client") in 2005. It was not used then and have never been used since. Removing it and related functions/structs. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Håkon Bugge <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-08-03RDMA/core: Reorganize create QP low-level functionsLeon Romanovsky1-4/+12
The low-level create QP function grew to be larger than any sensible inline function should be. The inline attribute is not really needed for that function and can be implemented as exported symbol. Link: https://lore.kernel.org/r/2c08709d86f876c3dfb77684357b2a939e570ca4.1628014762.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-08-03RDMA: Globally allocate and release QP memoryLeon Romanovsky1-5/+25
Convert QP object to follow IB/core general allocation scheme. That change allows us to make sure that restrack properly kref the memory. Link: https://lore.kernel.org/r/48e767124758aeecc433360ddd85eaa6325b34d9.1627040189.git.leonro@nvidia.com Reviewed-by: Gal Pressman <[email protected]> #efa Tested-by: Gal Pressman <[email protected]> Reviewed-by: Dennis Dalessandro <[email protected]> #rdma and core Tested-by: Dennis Dalessandro <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Tested-by: Tatyana Nikolova <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-08-03RDMA/rdmavt: Decouple QP and SGE lists allocationsLeon Romanovsky1-1/+1
The rdmavt QP has fields that are both needed for the control and data path. Such mixed declaration caused to the very specific allocation flow with kzalloc_node and SGE list embedded into the struct rvt_qp. This patch separates QP creation to two: regular memory allocation for the control path and specific code for the SGE list, while the access to the later is performed through derefenced pointer. Such pointer and its context are expected to be in the cache, so performance difference is expected to be negligible, if any exists. Link: https://lore.kernel.org/r/f66c1e20ccefba0db3c69c58ca9c897f062b4d1c.1627040189.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-06-21IB/core: Shuffle locks in ib_port_data to save memoryAnand Khoje1-1/+3
pahole shows two 4-byte holes in struct ib_port_data after pkey_list_lock and netdev_lock respectively. Shuffling the netdev_lock to be after pkey_list_lock, this shaves off eight bytes from the struct. Link: https://lore.kernel.org/r/[email protected] Suggested-by: Haakon Bugge <[email protected]> Signed-off-by: Anand Khoje <[email protected]> Reviewed-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-06-21RDMA/mlx5: Enable Relaxed Ordering by default for kernel ULPsAvihai Horon1-0/+8
Relaxed Ordering is a capability that can only benefit users that support it. All kernel ULPs should support Relaxed Ordering, as they are designed to read data only after observing the CQE and use the DMA API correctly. Hence, implicitly enable Relaxed Ordering by default for MR transfers in kernel ULPs. Link: https://lore.kernel.org/r/b7e820aab7402b8efa63605f4ea465831b3b1e5e.1623236426.git.leonro@nvidia.com Signed-off-by: Avihai Horon <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2021-06-16RDMA: Remove rdma_set_device_sysfs_group()Jason Gunthorpe1-23/+7
The driver's device group can be specified as part of the ops structure like the device's port group. No need for the complicated API. Link: https://lore.kernel.org/r/8964785a34fd3a29ff5b6693493f575b717e594d.1623427137.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>