|
If the probe succeeds, then auxiliary_get_drvdata() can't return a NULL
pointer.
So several NULL checks can be removed to simplify code.
Signed-off-by: Christophe JAILLET <[email protected]>
Link: https://patch.msgid.link/f02eb630734ee530315dce9f60b078f631ae93d0.1730477345.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
If bnxt_re_add_device() fails, 'en_info' still needs to be freed, as
already done in the .remove() function.
The commit in Fixes incorrectly removed this call, presumably because it
expected the .remove() function to be called anyway. But if the probe
fails, the remove function is not called.
There is no need to call bnxt_re_remove() as it was done before, kfree()
is enough.
Fixes: a5e099e0c464 ("RDMA/bnxt_re: Fix an error path in bnxt_re_add_device")
Signed-off-by: Christophe JAILLET <[email protected]>
Link: https://patch.msgid.link/9e48ff955ae55fc39a9eb1eb590d374539eab5ba.1730477345.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
There is a race between the CREQ tasklet and destroy qp when accessing the
qp-handle table. There is a chance of reading a valid qp-handle in the
CREQ tasklet handler while the QP is already moving ahead with the
destruction.
Fix this race by implementing a table lock to synchronize the access.
Fixes: f218d67ef004 ("RDMA/bnxt_re: Allow posting when QPs are in error")
Fixes: 84cf229f4001 ("RDMA/bnxt_re: Fix the qp table indexing")
Link: https://patch.msgid.link/r/[email protected]
Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Control path completion processing always runs in tasklet context. To
synchronize with the posting thread, there is no need to use the irq
variant of spin lock. Use spin_lock_bh instead.
Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://patch.msgid.link/r/[email protected]
Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
After the commit cited below, max_dest_rd_atomic and max_rd_atomic values
are rounded down to a power of 2, as opposed to the old behavior and the
mlx4 driver, where they are rounded up instead.
In order to stay consistent with older code and other drivers, revert to
using the fls-based rounding, which rounds up to the next power of 2.
Fixes: f18e26af6aba ("RDMA/mlx5: Convert modify QP to use MLX5_SET macros")
Link: https://patch.msgid.link/r/d85515d6ef21a2fa8ef4c8293dce9b58df8a6297.1728550179.git.leon@kernel.org
Signed-off-by: Patrisious Haddad <[email protected]>
Reviewed-by: Maher Sanalla <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
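
The difference between the two rounding directions in the entry above can be sketched in userspace C. This is only an illustration: fls_u32() is a hypothetical stand-in for the kernel's fls(), and the fls(v - 1) idiom for rounding up is an assumption about how such a conversion is typically written, not a copy of the driver code.

```c
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the kernel's fls():
 * returns the 1-based index of the highest set bit, or 0 when x == 0.
 */
static int fls_u32(uint32_t x)
{
	int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/* log2 of v rounded up to the next power of 2 (assumed fls(v - 1) idiom). */
static unsigned int log2_round_up(uint32_t v)
{
	return v ? (unsigned int)fls_u32(v - 1) : 0;
}

/* log2 of v rounded down to the previous power of 2. */
static unsigned int log2_round_down(uint32_t v)
{
	return v ? (unsigned int)fls_u32(v) - 1 : 0;
}
```

For a requested value of 5, rounding up yields log2 = 3 (8 entries), while rounding down yields log2 = 2 (4 entries), which is fewer than the caller asked for.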
|
|
Restore the missing functionality to dump vendor specific QP details,
which was mistakenly removed in the commit mentioned in Fixes line.
Fixes: 5cc34116ccec ("RDMA: Add dedicated QP resource tracker function")
Link: https://patch.msgid.link/r/ed9844829135cfdcac7d64285688195a5cd43f82.1728323026.git.leonro@nvidia.com
Reported-by: Dr. David Alan Gilbert <[email protected]>
Closes: https://lore.kernel.org/all/Zv_4qAxuC0dLmgXP@gallifrey
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
GID table length is reported by FW. The gid index which is passed to the
driver during modify_qp/create_ah is restricted by the sgid_index field of
struct ib_global_route. sgid_index is u8 and the max sgid possible is
256.
Each GID entry in HW will have 2 GID entries in the kernel gid table. So
we can support twice the gid table size reported by FW. Also, restrict the
max GID to 256.
Fixes: 847b97887ed4 ("RDMA/bnxt_re: Restrict the max_gids to 256")
Link: https://patch.msgid.link/r/[email protected]
Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
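
The sizing rule in the entry above (two kernel GID entries per HW entry, capped at 256) can be sketched as follows. The names MAX_SGID and effective_max_sgid() are hypothetical and used only for illustration; they are not taken from the driver.

```c
#include <stdint.h>

/* Hypothetical cap: sgid_index is a u8, so at most 256 distinct indexes. */
#define MAX_SGID 256u

/*
 * Each HW GID entry is assumed to back two kernel GID table entries,
 * so the usable table size is twice the FW-reported size, capped at 256.
 */
static uint32_t effective_max_sgid(uint32_t fw_gid_entries)
{
	uint64_t kernel_entries = 2ull * fw_gid_entries;

	return kernel_entries < MAX_SGID ? (uint32_t)kernel_entries : MAX_SGID;
}
```

With a FW-reported size of 64 this gives 128 entries; with 128 or more it saturates at 256.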
|
|
Avoid memory corruption while setting up Level-2 PBL pages for the non-MR
resources when num_pages > 256K.
There will be a single PDE page address (contiguous pages in the case of >
PAGE_SIZE), but the current logic assumes multiple pages, leading to
invalid memory access after 256K PBL entries in the PDE.
Fixes: 0c4dcd602817 ("RDMA/bnxt_re: Refactor hardware queue memory allocation")
Link: https://patch.msgid.link/r/[email protected]
Signed-off-by: Bhargava Chenna Marreddy <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Currently the CQ toggle value in the shared page (read by the userlib) is
updated as part of the cqn_handler. There is a potential race in which the
application rings the CQ ARM doorbell immediately and uses the old
toggle value.
Change the sequence of updating CQ toggle value to update in the
bnxt_qplib_service_nq function immediately after reading the toggle value
to be in sync with the HW updated value.
Fixes: e275919d9669 ("RDMA/bnxt_re: Share a page to expose per CQ info with userspace")
Link: https://patch.msgid.link/r/[email protected]
Signed-off-by: Chandramohan Akula <[email protected]>
Reviewed-by: Selvin Xavier <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
In bnxt_re_add_device(), when registering the netdev notifier fails, the
driver does not unregister the IB device in the error cleanup path. Also,
remove the duplicate cleanup in the error path of bnxt_re_probe.
Fixes: 94a9dc6ac8f7 ("RDMA/bnxt_re: Group all operations under add_device and remove_device")
Link: https://patch.msgid.link/r/[email protected]
Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Driver waits indefinitely for the fifo occupancy to go below a threshold
as soon as the pacing interrupt is received. This can cause soft lockup on
one of the processors, if the rate of DB is very high.
Add a loop count for FPGA and exit __wait_for_fifo_occupancy_below_th
if the loop takes too long. Pacing will continue until the
occupancy is below the threshold. This is ensured by the checks in
bnxt_re_pacing_timer_exp and further scheduling the work for pacing based
on the fifo occupancy.
Fixes: 2ad4e6303a6d ("RDMA/bnxt_re: Implement doorbell pacing algorithm")
Link: https://patch.msgid.link/r/[email protected]
Reviewed-by: Kalesh AP <[email protected]>
Reviewed-by: Chandramohan Akula <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
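
The idea of bounding the wait in the entry above can be sketched with a generic polling helper. Everything here is hypothetical (the callback type, the function names, and the draining_fifo() demo reader); it only shows the pattern of giving up after a fixed number of iterations instead of spinning indefinitely.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callback that reads the current FIFO occupancy. */
typedef uint32_t (*read_occupancy_fn)(void *ctx);

/*
 * Poll until the occupancy drops below the threshold, but never more than
 * max_loops times. Returns true if the threshold was reached, false if the
 * loop budget ran out (the caller is then expected to retry later, for
 * example from a timer or deferred work).
 */
static bool wait_for_occupancy_below(read_occupancy_fn read_occ, void *ctx,
				     uint32_t threshold, uint32_t max_loops)
{
	for (uint32_t i = 0; i < max_loops; i++) {
		if (read_occ(ctx) < threshold)
			return true;
	}
	return false;
}

/* Demo reader: reports the level stored in ctx, then drains it by one. */
static uint32_t draining_fifo(void *ctx)
{
	uint32_t *level = ctx;
	uint32_t now = *level;

	if (*level > 0)
		(*level)--;
	return now;
}
```

A FIFO that starts at level 10 and drains by one per poll passes a threshold of 5 well within 100 iterations, while a FIFO at level 1000 exhausts a budget of 10 iterations and the helper returns false rather than locking up the CPU.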
|
|
There is a possibility of a NULL pointer dereference in the failure path
of bnxt_re_add_device(). To address that, move the update of
"rdev->adev" to bnxt_re_dev_add().
Fixes: dee3da3422d5 ("RDMA/bnxt_re: Change aux driver data to en_info to hold more information")
Link: https://patch.msgid.link/r/[email protected]
Reported-by: Dan Carpenter <[email protected]>
Closes: https://lore.kernel.org/linux-rdma/CAH-L+nMCwymKGqf5pd8-FZNhxEkDD=kb6AoCaE6fAVi7b3e5Qw@mail.gmail.com/T/#t
Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
When the HWRM command fails, the driver currently returns -EFAULT (Bad
address). This does not look correct.
Modify it to return -EIO (I/O error).
Fixes: cc1ec769b87c ("RDMA/bnxt_re: Fixing the Control path command and response handling")
Fixes: 65288a22ddd8 ("RDMA/bnxt_re: use shadow qd while posting non blocking rcfw command")
Link: https://patch.msgid.link/r/[email protected]
Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Currently the driver is not getting the correct srq. Dereference only if qplib has
a valid srq.
Fixes: b02fd3f79ec3 ("RDMA/bnxt_re: Report async events and errors")
Link: https://patch.msgid.link/r/[email protected]
Reviewed-by: Saravanan Vajravel <[email protected]>
Reviewed-by: Chandramohan Akula <[email protected]>
Signed-off-by: Kashyap Desai <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Driver exports pacing stats only on GenP5 and P7 adapters. But while
parsing the pacing stats, driver has a check for "rdev->dbr_pacing". This
caused a trace when KASAN is enabled.
BUG: KASAN: slab-out-of-bounds in bnxt_re_get_hw_stats+0x2b6a/0x2e00 [bnxt_re]
Write of size 8 at addr ffff8885942a6340 by task modprobe/4809
Fixes: 8b6573ff3420 ("bnxt_re: Update the debug counters for doorbell pacing")
Link: https://patch.msgid.link/r/[email protected]
Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Older adapters don't support the MAX CQ WQEs reported by older FW. So
restrict the value reported to 1M always for older adapters.
Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://patch.msgid.link/r/[email protected]
Signed-off-by: Abhishek Mohapatra <[email protected]>
Reviewed-by: Chandramohan Akula <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
There is "accept*" misspelled as "accpet*" in the comments. Fix the
spelling.
Fixes: 146b9756f14c ("RDMA/irdma: Add connection manager")
Link: https://patch.msgid.link/r/[email protected]
Signed-off-by: Alexander Zubkov <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
ip_dev_find() always returns the real net_device address, whether traffic
is running on a vlan or on the real device. If traffic is over a vlan,
filling the endpoint structure with the real ndev and attempting to send a
connect request results in an RDMA_CM_EVENT_UNREACHABLE error. This patch
fixes the issue by using vlan_dev_real_dev().
Fixes: 830662f6f032 ("RDMA/cxgb4: Add support for active and passive open connection with IPv6 address")
Link: https://patch.msgid.link/r/[email protected]
Signed-off-by: Anumula Murali Mohan Reddy <[email protected]>
Signed-off-by: Potnuri Bharat Teja <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
max_sw_wqe used for static wqe mode should be same as the max_wqe.
Calculate the max_sw_wqe only for the variable WQE mode.
Fixes: de1d364c3815 ("RDMA/bnxt_re: Add support for Variable WQE in Genp7 adapters")
Link: https://patch.msgid.link/r/[email protected]
Signed-off-by: Selvin Xavier <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
__alloc_pbl() can return error when memory allocation fails.
Driver is not checking the status on one of the instances.
Fixes: 0c4dcd602817 ("RDMA/bnxt_re: Refactor hardware queue memory allocation")
Link: https://patch.msgid.link/r/[email protected]
Reviewed-by: Selvin Xavier <[email protected]>
Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
The driver uses an internal data structure to construct the WQE frame.
It declared the avid as u16, which can accommodate up to 64K AVs.
When the outstanding AVID crosses 64K, the driver truncates the AVID and
hence uses an incorrect AVID in the WR. This leads to WR failure
due to an invalid AV ID, and the QP is moved to the error state with the
reason set to 19 (INVALID AVID). When the RDMA CM path is used, this issue
hits QP1 and it is moved to the error state.
Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://patch.msgid.link/r/[email protected]
Reviewed-by: Selvin Xavier <[email protected]>
Reviewed-by: Chandramohan Akula <[email protected]>
Signed-off-by: Saravanan Vajravel <[email protected]>
Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
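
The truncation described in the entry above is ordinary C integer narrowing. A minimal sketch, using a hypothetical helper name rather than anything from the driver:

```c
#include <stdint.h>

/*
 * Storing an AV ID in a 16-bit field keeps only the low 16 bits,
 * so any ID of 65536 or above silently aliases a smaller ID.
 */
static uint16_t avid_as_u16(uint32_t avid)
{
	return (uint16_t)avid;
}
```

An AV ID of 65541 (64K + 5) is stored as 5, so the WR would reference a different, possibly invalid, AV.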
|
|
In bnxt_re_setup_chip_ctx() when bnxt_qplib_map_db_bar() fails
driver is not freeing the memory allocated for "rdev->chip_ctx".
Fixes: 0ac20faf5d83 ("RDMA/bnxt_re: Reorg the bar mapping")
Link: https://patch.msgid.link/r/[email protected]
Signed-off-by: Selvin Xavier <[email protected]>
Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
no_llseek had been defined to NULL two years ago, in commit 868941b14441
("fs: remove no_llseek")
To quote that commit,
  At -rc1 we'll need do a mechanical removal of no_llseek -
  git grep -l -w no_llseek | grep -v porting.rst | while read i; do
    sed -i '/\<no_llseek\>/d' $i
  done
  would do it.
Unfortunately, that hadn't been done. Linus, could you do that now, so
that we could finally put that thing to rest? All instances are of the
form
  .llseek = no_llseek,
so it's obviously safe.
Signed-off-by: Al Viro <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Variable en_dev is not effectively used, so delete it.
drivers/infiniband/hw/bnxt_re/main.c:1980:22: warning: variable ‘en_dev’ set but not used.
Reported-by: Abaci Robot <[email protected]>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=10867
Signed-off-by: Jiapeng Chong <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Use the correct field max_dest_rd_atomic instead of max_rd_atomic for the
error output.
Found by Linux Verification Center (linuxtesting.org) with Svace.
Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Vitaliy Shevtsov <[email protected]>
Link: https://lore.kernel.org/stable/20240916165817.14691-1-v.shevtsov%40maxima.ru
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
The lookup_atid() function can return NULL if the ATID is
invalid or does not exist in the identifier table, which
could lead to dereferencing a null pointer without a
check in the `act_establish()` and `act_open_rpl()` functions.
Add a NULL check to prevent null pointer dereferencing.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: cfdda9d76436 ("RDMA/cxgb4: Add driver for Chelsio T4 RNIC")
Signed-off-by: Mikhail Lobanov <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
There are several error cases where hns_roce_create_ah() returns
directly without jumping to sw stat path, thus leading to a problem
that the ah error counter does not increase.
Fixes: ee20cc17e9d8 ("RDMA/hns: Support DSCP")
Fixes: eb7854d63db5 ("RDMA/hns: Support SW stats with debugfs")
Signed-off-by: Junxian Huang <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
If the FW crashes, L2 driver gets notified and it notifies
the RoCE driver. Currently the driver doesn't re-initialize the
device. Add support for re-initializing the RoCE device.
The RoCE device is removed and re-attached in ulp_stop and
ulp_start respectively. The recovery logic expects the RoCE
driver to be registered with the L2 driver while it's being removed.
So the driver avoids unregistering with L2 driver in the
recovery path.
Signed-off-by: Chandramohan Akula <[email protected]>
Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Adding and removing the device needs to be handled from multiple contexts
when Firmware error recovery is supported. So group all the add and remove
operations into the add_device and remove_device functions.
Signed-off-by: Chandramohan Akula <[email protected]>
Reviewed-by: Kalesh AP <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
While registering with the L2 for ULP operations, use the
aux device pointer as the handle. Aux device has
the data bnxt_re_en_dev_info, which is used to
store required information for the bnxt_re_suspend
and bnxt_re_resume functions.
Signed-off-by: Chandramohan Akula <[email protected]>
Reviewed-by: Kalesh AP <[email protected]>
Reviewed-by: Kashyap Desai <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
rdev will be destroyed and recreated during the FW error
recovery scenarios. So to keep the state, if any, use an
en_info structure which gets created/freed based on auxiliary
device initialization/de-initialization.
Signed-off-by: Chandramohan Akula <[email protected]>
Reviewed-by: Kashyap Desai <[email protected]>
Reviewed-by: Kalesh AP <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
The IB layer provides a common interface to store and get net
devices associated to an IB device port (ib_device_set_netdev()
and ib_device_get_netdev()).
Previously, mlx5_ib stored and managed the associated net devices
internally.
Replace internal net device management in mlx5_ib with
ib_device_set_netdev() when attaching/detaching a net device and
ib_device_get_netdev() when retrieving the net device.
Export ib_device_get_netdev().
For mlx5 representors/PFs/VFs and lag creation we replace the netdev
assignments with the IB set/get netdev functions.
In active-backup mode lag the active slave net device is stored in the
lag itself. To ensure the net device stored in a lag bond IB device is
the active slave we implement the following:
- mlx5_core: when modifying the slave of a bond we send the internal driver event
MLX5_DRIVER_EVENT_ACTIVE_BACKUP_LAG_CHANGE_LOWERSTATE.
- mlx5_ib: when catching the event call ib_device_set_netdev()
This patch also ensures the correct IB events are sent in switchdev lag.
While at it, when in multiport eswitch mode, only a single IB device is
created for all ports. The said IB device will receive all netdev events
of its VFs once loaded, thus to avoid overwriting the mapping of PF IB
device to PF netdev, ignore NETDEV_REGISTER events if the ib device has
already been mapped to a netdev.
Signed-off-by: Chiara Meiohas <[email protected]>
Signed-off-by: Michael Guralnik <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
phys_port_cnt of the IB device must be initialized before calling
ib_device_set_netdev().
Previously, phys_port_cnt was initialized in the mlx5_ib init function.
Remove this initialization to allow setting it separately, providing
the flexibility to call ib_device_set_netdev before registering the
IB device.
Signed-off-by: Chiara Meiohas <[email protected]>
Signed-off-by: Michael Guralnik <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Report the upper device's state as the RDMA port state only in RoCE LAG or
switchdev LAG.
Fixes: 27f9e0ccb6da ("net/mlx5: Lag, Add single RDMA device in multiport mode")
Signed-off-by: Mark Bloch <[email protected]>
Signed-off-by: Michael Guralnik <[email protected]>
Link: https://patch.msgid.link/[email protected]
Reviewed-by: Kalesh AP <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Check if RoCE LAG is active before calling the LAG layer for netdev.
This clarifies if LAG is active. No behavior changes with this patch.
Signed-off-by: Mark Bloch <[email protected]>
Signed-off-by: Michael Guralnik <[email protected]>
Link: https://patch.msgid.link/[email protected]
Reviewed-by: Kalesh AP <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Consider also the query_vuid cap before enabling the data_direct
functionality.
This may prevent a syndrome from the FW in case the query_vuid command
is not supported (e.g. migratable VF).
Signed-off-by: Yishai Hadas <[email protected]>
Reviewed-by: Gal Shalom <[email protected]>
Link: https://patch.msgid.link/274c4f6f1ac0b1078243dd296695a49dbe58e7d1.1725907637.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Implicit MRs in ODP memory scheme require allocating a private null mkey
and assigning the mkey and va differently in the KSM mkey.
The page faults are received on the null mkey so we also add storing the
null mkey in the odp_mkey xarray.
Signed-off-by: Michael Guralnik <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
The memory scheme page fault event is a new approach to handling page faults
on mkeys using the on-demand-paging feature.
The major shift in handling the page fault in this scheme is that the HW
is taking responsibility for parsing the faulted mkey instead of the
previous approach where the driver would read and parse the wqes and
query the mkeys to get to the direct mkey that we need to handle.
Therefore, the event we get from FW in this scheme will contain the
direct mkey and address we need to handle and require much less work
from driver.
Additionally, to optimize performance, the FW can generate the event on
a memory area that is larger than the faulted memory operation
requires, to 'prefetch' memory that is around it and will likely be
used soon.
Unlike previous types of page fault, the memory page scheme fault does
not always require a resume command after handling the page fault as the FW
can post multiple events on same mkey and will set the 'last' flag only on
the page fault that requires the resume command.
Signed-off-by: Michael Guralnik <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Split the search for the ODP mkey when handling an rdma type page fault to
a helper function, later to be used in other page fault types.
Signed-off-by: Michael Guralnik <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
The new memory scheme page faults request that the driver fetch
additional pages beyond the faulted memory access.
This is done in order to prefetch pages before and after the area that
got the page fault, assuming this will reduce the total amount of page
faults.
The driver should ensure it handles only the pages that are within the
umem range.
Signed-off-by: Michael Guralnik <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
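
The requirement in the entry above, that a widened prefetch request must not reach outside the umem, amounts to intersecting two byte ranges. The sketch below uses hypothetical names (struct byte_range, clamp_to_umem()) and is not the driver's implementation.

```c
#include <stdint.h>

/* Hypothetical description of a byte range: start address and length. */
struct byte_range {
	uint64_t start;
	uint64_t len;
};

/*
 * Intersect the requested (possibly prefetch-widened) range with the umem
 * range. Returns a zero-length range when the two do not overlap.
 * Address wrap-around is not handled in this sketch.
 */
static struct byte_range clamp_to_umem(uint64_t req_start, uint64_t req_len,
				       uint64_t umem_start, uint64_t umem_len)
{
	uint64_t req_end = req_start + req_len;
	uint64_t umem_end = umem_start + umem_len;
	uint64_t start = req_start > umem_start ? req_start : umem_start;
	uint64_t end = req_end < umem_end ? req_end : umem_end;
	struct byte_range out = { start, end > start ? end - start : 0 };

	return out;
}
```

A request covering 0x0..0x4000 against a umem covering 0x1000..0x3000 is trimmed to start 0x1000 with length 0x2000.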
|
|
Add new fields to support the new memory scheme page fault and extend
the token field to u64 as in the new scheme the token is 48 bit.
Signed-off-by: Michael Guralnik <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Expose IFC bits to support the new memory scheme on demand paging.
Change the macro reading odp capabilities to be able to read from the
new IFC layout and align the code in upper layers to be compiled.
Signed-off-by: Michael Guralnik <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Protect the usage of the 6th bit with the relevant capability to ensure
we are using the new page sizes with FW that supports the bit extension.
Signed-off-by: Michael Guralnik <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Fix sparse warnings: restricted __le16 degrades to integer.
Fixes: 5a87279591a1 ("RDMA/hns: Support hns HW stats")
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Signed-off-by: Junxian Huang <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
The definition of qib_rc_rnr_retry() has been removed since
commit b4238e70579c ("IB/qib: Use new rdmavt timers"). Also, the definition
of mr_rcu_callback() has been removed since commit 7c2e11fe2dbe ("IB/qib:
Remove qp and mr functionality from qib"). So, let's remove the unused
declarations.
Signed-off-by: Zhang Zekun <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
When allocating MTT hem, for each hop level of each hem that is being
allocated, the driver iterates the hem list to find out whether the
bt page has been allocated in this hop level. If not, allocate a new
one and splice it to the list. The time complexity is O(n^2) in worst
cases.
Currently the allocation for-loop uses 'unit' as the step size. This
actually has taken into account the reuse of last-hop-level MTT bt
pages by multiple buffer pages. Thus pages of last hop level will
never have been allocated, so there is no need to iterate the hem list
in last hop level.
Removing this unnecessary iteration can reduce the time complexity to
O(n).
Fixes: 38389eaa4db1 ("RDMA/hns: Add mtr support for mixed multihop addressing")
Signed-off-by: Junxian Huang <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
The 1bit-ECC recovery address read from HW only contains bits 64:12, so
it should always be left-shifted by a fixed 12 bits when used.
Currently, the driver shifts the address left by PAGE_SHIFT when it is
used, which is wrong on an OS with a non-4K page size.
Fixes: 2de949abd6a5 ("RDMA/hns: Recover 1bit-ECC error of RAM on chip")
Signed-off-by: Chengchang Tang <[email protected]>
Signed-off-by: Junxian Huang <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
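
Why PAGE_SHIFT is the wrong shift in the entry above can be seen with a small sketch. ECC_ADDR_SHIFT and both helper names are hypothetical; the page shift is passed as a parameter so that a 64K-page configuration can be simulated in userspace.

```c
#include <stdint.h>

/* Hypothetical name: the HW value is the address with its low 12 bits dropped. */
#define ECC_ADDR_SHIFT 12

/* Correct reconstruction: always shift by the fixed HW granularity. */
static uint64_t ecc_addr_fixed(uint64_t hw_val)
{
	return hw_val << ECC_ADDR_SHIFT;
}

/* Page-size-dependent reconstruction, correct only when page_shift == 12. */
static uint64_t ecc_addr_page_shift(uint64_t hw_val, unsigned int page_shift)
{
	return hw_val << page_shift;
}
```

With 4K pages (shift 12) both helpers agree; with 64K pages (shift 16) the second one returns an address 16 times too large.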
|
|
In abnormal interrupt handler, a PF reset will be triggered even if
the device is a VF. It should be a VF reset.
Fixes: 2b9acb9a97fe ("RDMA/hns: Add the process of AEQ overflow for hip08")
Signed-off-by: Junxian Huang <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Fix misuse of spin_lock_irq()/spin_unlock_irq() when
spin_lock_irqsave()/spin_unlock_irqrestore() is held.
This was discovered through the lock debugging, and the corresponding
log is as follows:
raw_local_irq_restore() called with IRQs enabled
WARNING: CPU: 96 PID: 2074 at kernel/locking/irqflag-debug.c:10 warn_bogus_irq_restore+0x30/0x40
...
Call trace:
warn_bogus_irq_restore+0x30/0x40
_raw_spin_unlock_irqrestore+0x84/0xc8
add_qp_to_list+0x11c/0x148 [hns_roce_hw_v2]
hns_roce_create_qp_common.constprop.0+0x240/0x780 [hns_roce_hw_v2]
hns_roce_create_qp+0x98/0x160 [hns_roce_hw_v2]
create_qp+0x138/0x258
ib_create_qp_kernel+0x50/0xe8
create_mad_qp+0xa8/0x128
ib_mad_port_open+0x218/0x448
ib_mad_init_device+0x70/0x1f8
add_client_context+0xfc/0x220
enable_device_and_get+0xd0/0x140
ib_register_device.part.0+0xf4/0x1c8
ib_register_device+0x34/0x50
hns_roce_register_device+0x174/0x3d0 [hns_roce_hw_v2]
hns_roce_init+0xfc/0x2c0 [hns_roce_hw_v2]
__hns_roce_hw_v2_init_instance+0x7c/0x1d0 [hns_roce_hw_v2]
hns_roce_hw_v2_init_instance+0x9c/0x180 [hns_roce_hw_v2]
Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver")
Signed-off-by: Chengchang Tang <[email protected]>
Signed-off-by: Junxian Huang <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
The max value of 'unit' and 'hop_num' is 2^24 and 2, so the value of
'step' may exceed the range of u32. Change the type of 'step' to u64.
Fixes: 38389eaa4db1 ("RDMA/hns: Add mtr support for mixed multihop addressing")
Signed-off-by: wenglianfa <[email protected]>
Signed-off-by: Junxian Huang <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
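
The overflow in the entry above can be reproduced in userspace. The sketch assumes that 'step' is built by multiplying by 'unit' once per hop level; that model and the helper names are assumptions made for illustration, not code taken from the driver.

```c
#include <stdint.h>

/* step accumulated in 32 bits: wraps modulo 2^32 once it grows too large. */
static uint32_t step_u32(uint32_t unit, int hop_num)
{
	uint32_t step = 1;

	for (int i = 0; i < hop_num; i++)
		step *= unit;
	return step;
}

/* Same computation in 64 bits: 2^24 * 2^24 = 2^48 fits comfortably. */
static uint64_t step_u64(uint32_t unit, int hop_num)
{
	uint64_t step = 1;

	for (int i = 0; i < hop_num; i++)
		step *= unit;
	return step;
}
```

With unit = 2^24 and hop_num = 2, the 32-bit version wraps to 0 while the 64-bit version yields 2^48.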
|