Age | Commit message (Collapse) | Author | Files | Lines |
|
-Wflex-array-member-not-at-end is coming in GCC-14, and we are getting
ready to enable it globally.
There are currently a couple of objects (`alloc_head` and `bundle`) in
`struct bundle_priv` that contain a couple of flexible structures:
struct bundle_priv {
/* Must be first */
struct bundle_alloc_head alloc_head;
...
/*
* Must be last. bundle ends in a flex array which overlaps
* internal_buffer.
*/
struct uverbs_attr_bundle bundle;
u64 internal_buffer[32];
};
So, in order to avoid ending up with a couple of flexible-array members
in the middle of a struct, we use the `struct_group_tagged()` helper to
separate the flexible array from the rest of the members in the flexible
structures:
struct uverbs_attr_bundle {
struct_group_tagged(uverbs_attr_bundle_hdr, hdr,
... the rest of the members
);
struct uverbs_attr attrs[];
};
With the change described above, we now declare objects of the type of
the tagged struct without embedding flexible arrays in the middle of
another struct:
struct bundle_priv {
/* Must be first */
struct bundle_alloc_head_hdr alloc_head;
...
struct uverbs_attr_bundle_hdr bundle;
u64 internal_buffer[32];
};
We also use `container_of()` whenever we need to retrieve a pointer
to the flexible structures.
Notice that the `bundle_size` computed in `uapi_compute_bundle_size()`
remains the same.
So, with these changes, fix the following warnings:
drivers/infiniband/core/uverbs_ioctl.c:45:34: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
45 | struct bundle_alloc_head alloc_head;
| ^~~~~~~~~~
drivers/infiniband/core/uverbs_ioctl.c:67:35: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
67 | struct uverbs_attr_bundle bundle;
| ^~~~~~
Signed-off-by: Gustavo A. R. Silva <[email protected]>
Link: https://lore.kernel.org/r/ZeIgeZ5Sb0IZTOyt@neat
Reviewed-by: Kees Cook <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
granularity
Currently, congestion control algorithm is statically configured in
FW, and all QPs use the same algorithm(except UD which has a fixed
configuration of DCQCN). This is not flexible enough.
Support userspace configuring congestion control algorithm with QP
granularity while creating QPs. If the algorithm is not specified in
userspace, use the default one.
Signed-off-by: Junxian Huang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
We want to be able to run rtnl_fill_ifinfo() under RCU protection
instead of RTNL in the future.
This patch prepares dev_get_iflink() and nla_put_iflink()
to run either with RTNL or RCU held.
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
simple_recursive_removal() drops the pinning references to all positives
in subtree. For the cases when its argument has been kept alive by
the pinning alone that's exactly the right thing to do, but here
the argument comes from dcache lookup, that needs to be balanced by
explicit dput().
Fixes: e41d237818598 "qib_fs: switch to simple_recursive_removal()"
Fucked-up-by: Al Viro <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
strnlen() may return 0 (e.g. for "\0\n" string), it's better to
check the result of strnlen() before using 'len - 1' expression
for the 'buf' array index.
Detected using the static analysis tool - Svace.
Fixes: dc3b66a0ce70 ("RDMA/rtrs-clt: Add a minimum latency multipath policy")
Signed-off-by: Alexey Kodanev <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Acked-by: Jack Wang <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
When a struct containing a flexible array is included in another struct,
and there is a member after the struct-with-flex-array, there is a
possibility of memory overlap. These cases must be audited [1]. See:
struct inner {
...
int flex[];
};
struct outer {
...
struct inner header;
int overlap;
...
};
This is the scenario for all the "struct *_filter" structures that are
included in the following "struct ib_flow_spec_*" structures:
struct ib_flow_spec_eth
struct ib_flow_spec_ib
struct ib_flow_spec_ipv4
struct ib_flow_spec_ipv6
struct ib_flow_spec_tcp_udp
struct ib_flow_spec_tunnel
struct ib_flow_spec_esp
struct ib_flow_spec_gre
struct ib_flow_spec_mpls
The pattern is like the one shown below:
struct *_filter {
...
u8 real_sz[];
};
struct ib_flow_spec_* {
...
struct *_filter val;
struct *_filter mask;
};
In this case, the trailing flexible array "real_sz" is never allocated
and is only used to calculate the size of the structures. Here the use
of the "offsetof" helper can be changed by the "sizeof" operator because
the goal is to get the size of these structures. Therefore, the trailing
flexible arrays can also be removed.
However, due to the trailing padding that can be induced in structs it
is possible that the:
offsetof(struct *_filter, real_sz) != sizeof(struct *_filter)
This situation happens with the "struct ib_flow_ipv6_filter" and to
avoid it the "__packed" macro is used in this structure. But now, the
"sizeof(struct ib_flow_ipv6_filter)" has changed. This is not a problem
since this size is not used in the code.
The situation now is that "sizeof(struct ib_flow_spec_ipv6)" has also
changed (this struct contains the struct ib_flow_ipv6_filter). This is
also not a problem since it is only used to set the size of the "union
ib_flow_spec", which can store all the "ib_flow_spec_*" structures.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Erick Archer <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
The mad_client will be initialized in enable_device_and_get(), while the
devices_rwsem will be downgraded to a read semaphore. There is a window
that leads to the failed initialization for cm_client, since it can not
get matched mad port from ib_mad_port_list, and the matched mad port will
be added to the list after that.
mad_client | cm_client
------------------|--------------------------------------------------------
ib_register_device|
enable_device_and_get
down_write(&devices_rwsem)
xa_set_mark(&devices, DEVICE_REGISTERED)
downgrade_write(&devices_rwsem)
|
|ib_cm_init
|ib_register_client(&cm_client)
|down_read(&devices_rwsem)
|xa_for_each_marked (&devices, DEVICE_REGISTERED)
|add_client_context
|cm_add_one
|ib_register_mad_agent
|ib_get_mad_port
|__ib_get_mad_port
|list_for_each_entry(entry, &ib_mad_port_list, port_list)
|return NULL
|up_read(&devices_rwsem)
|
add_client_context|
ib_mad_init_device|
ib_mad_port_open |
list_add_tail(&port_priv->port_list, &ib_mad_port_list)
up_read(&devices_rwsem)
|
Fix it by using down_write(&devices_rwsem) in ib_register_client().
Fixes: d0899892edd0 ("RDMA/device: Provide APIs from the core code to help unregistration")
Link: https://lore.kernel.org/r/[email protected]
Suggested-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Shifeng Li <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Commit 27c5fd271d8b ("RDMA/hns: The UD mode can only be configured
with DCQCN") adds a check of congest control alorithm for UD. But
that patch causes a problem: hr_dev->caps.congest_type is global,
used by all QPs, so modifying this field to DCQCN for UD QPs causes
other QPs unable to use any other algorithm except DCQCN.
Revert the modification in commit 27c5fd271d8b ("RDMA/hns: The UD
mode can only be configured with DCQCN"). Add a new field cong_type
to struct hns_roce_qp and configure DCQCN for UD QPs.
Fixes: 27c5fd271d8b ("RDMA/hns: The UD mode can only be configured with DCQCN")
Fixes: f91696f2f053 ("RDMA/hns: Support congestion control type selection according to the FW")
Signed-off-by: Luoyouming <[email protected]>
Signed-off-by: Junxian Huang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
clang-16 notices that srpt_qp_event() gets called through an incompatible
pointer here:
drivers/infiniband/ulp/srpt/ib_srpt.c:1815:5: error: cast from 'void (*)(struct ib_event *, struct srpt_rdma_ch *)' to 'void (*)(struct ib_event *, void *)' converts to incompatible function type [-Werror,-Wcast-function-type-strict]
1815 | = (void(*)(struct ib_event *, void*))srpt_qp_event;
Change srpt_qp_event() to use the correct prototype and adjust the
argument inside of it.
Fixes: a42d985bd5b2 ("ib_srpt: Initial SRP Target merge for v3.3-rc1")
Signed-off-by: Arnd Bergmann <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Bart Van Assche <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Avoid the following warning by making sure to free the allocated
resources in case that qedr_init_user_queue() fail.
-----------[ cut here ]-----------
WARNING: CPU: 0 PID: 143192 at drivers/infiniband/core/rdma_core.c:874 uverbs_destroy_ufile_hw+0xcf/0xf0 [ib_uverbs]
Modules linked in: tls target_core_user uio target_core_pscsi target_core_file target_core_iblock ib_srpt ib_srp scsi_transport_srp nfsd nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs 8021q garp mrp stp llc ext4 mbcache jbd2 opa_vnic ib_umad ib_ipoib sunrpc rdma_ucm ib_isert iscsi_target_mod target_core_mod ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_cm hfi1 intel_rapl_msr intel_rapl_common mgag200 qedr sb_edac drm_shmem_helper rdmavt x86_pkg_temp_thermal drm_kms_helper intel_powerclamp ib_uverbs coretemp i2c_algo_bit kvm_intel dell_wmi_descriptor ipmi_ssif sparse_keymap kvm ib_core rfkill syscopyarea sysfillrect video sysimgblt irqbypass ipmi_si ipmi_devintf fb_sys_fops rapl iTCO_wdt mxm_wmi iTCO_vendor_support intel_cstate pcspkr dcdbas intel_uncore ipmi_msghandler lpc_ich acpi_power_meter mei_me mei fuse drm xfs libcrc32c qede sd_mod ahci libahci t10_pi sg crct10dif_pclmul crc32_pclmul crc32c_intel qed libata tg3
ghash_clmulni_intel megaraid_sas crc8 wmi [last unloaded: ib_srpt]
CPU: 0 PID: 143192 Comm: fi_rdm_tagged_p Kdump: loaded Not tainted 5.14.0-408.el9.x86_64 #1
Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 2.14.0 01/25/2022
RIP: 0010:uverbs_destroy_ufile_hw+0xcf/0xf0 [ib_uverbs]
Code: 5d 41 5c 41 5d 41 5e e9 0f 26 1b dd 48 89 df e8 67 6a ff ff 49 8b 86 10 01 00 00 48 85 c0 74 9c 4c 89 e7 e8 83 c0 cb dd eb 92 <0f> 0b eb be 0f 0b be 04 00 00 00 48 89 df e8 8e f5 ff ff e9 6d ff
RSP: 0018:ffffb7c6cadfbc60 EFLAGS: 00010286
RAX: ffff8f0889ee3f60 RBX: ffff8f088c1a5200 RCX: 00000000802a0016
RDX: 00000000802a0017 RSI: 0000000000000001 RDI: ffff8f0880042600
RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8f11fffd5000 R11: 0000000000039000 R12: ffff8f0d5b36cd80
R13: ffff8f088c1a5250 R14: ffff8f1206d91000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8f11d7c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000147069200e20 CR3: 00000001c7210002 CR4: 00000000001706f0
Call Trace:
<TASK>
? show_trace_log_lvl+0x1c4/0x2df
? show_trace_log_lvl+0x1c4/0x2df
? ib_uverbs_close+0x1f/0xb0 [ib_uverbs]
? uverbs_destroy_ufile_hw+0xcf/0xf0 [ib_uverbs]
? __warn+0x81/0x110
? uverbs_destroy_ufile_hw+0xcf/0xf0 [ib_uverbs]
? report_bug+0x10a/0x140
? handle_bug+0x3c/0x70
? exc_invalid_op+0x14/0x70
? asm_exc_invalid_op+0x16/0x20
? uverbs_destroy_ufile_hw+0xcf/0xf0 [ib_uverbs]
ib_uverbs_close+0x1f/0xb0 [ib_uverbs]
__fput+0x94/0x250
task_work_run+0x5c/0x90
do_exit+0x270/0x4a0
do_group_exit+0x2d/0x90
get_signal+0x87c/0x8c0
arch_do_signal_or_restart+0x25/0x100
? ib_uverbs_ioctl+0xc2/0x110 [ib_uverbs]
exit_to_user_mode_loop+0x9c/0x130
exit_to_user_mode_prepare+0xb6/0x100
syscall_exit_to_user_mode+0x12/0x40
do_syscall_64+0x69/0x90
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x22/0x40
? do_syscall_64+0x69/0x90
? syscall_exit_work+0x103/0x130
? syscall_exit_to_user_mode+0x22/0x40
? do_syscall_64+0x69/0x90
? do_syscall_64+0x69/0x90
? common_interrupt+0x43/0xa0
entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x1470abe3ec6b
Code: Unable to access opcode bytes at RIP 0x1470abe3ec41.
RSP: 002b:00007fff13ce9108 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: fffffffffffffffc RBX: 00007fff13ce9218 RCX: 00001470abe3ec6b
RDX: 00007fff13ce9200 RSI: 00000000c0181b01 RDI: 0000000000000004
RBP: 00007fff13ce91e0 R08: 0000558d9655da10 R09: 0000558d9655dd00
R10: 00007fff13ce95c0 R11: 0000000000000246 R12: 00007fff13ce9358
R13: 0000000000000013 R14: 0000558d9655db50 R15: 00007fff13ce9470
</TASK>
--[ end trace 888a9b92e04c5c97 ]--
Fixes: df15856132bc ("RDMA/qedr: restructure functions that create/destroy QPs")
Signed-off-by: Kamal Heib <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Make loading ib_srpt with this parameter set work. The current behavior is
that setting that parameter while loading the ib_srpt kernel module
triggers the following kernel crash:
BUG: kernel NULL pointer dereference, address: 0000000000000000
Call Trace:
<TASK>
parse_one+0x18c/0x1d0
parse_args+0xe1/0x230
load_module+0x8de/0xa60
init_module_from_file+0x8b/0xd0
idempotent_init_module+0x181/0x240
__x64_sys_finit_module+0x5a/0xb0
do_syscall_64+0x5f/0xe0
entry_SYSCALL_64_after_hwframe+0x6e/0x76
Cc: LiHonggang <[email protected]>
Reported-by: LiHonggang <[email protected]>
Fixes: a42d985bd5b2 ("ib_srpt: Initial SRP Target merge for v3.3-rc1")
Signed-off-by: Bart Van Assche <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
This one is not needed since commit 954afc5a8fd8 ("RDMA/rxe:
Use members of generic struct in rxe_mr").
Signed-off-by: Guoqing Jiang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Zhu Yanjun <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Upon rare occasions, KASAN reports a use-after-free Write
in srpt_refresh_port().
This seems to be because an event handler is registered before the
srpt device is fully setup and a race condition upon error may leave a
partially setup event handler in place.
Instead, only register the event handler after srpt device initialization
is complete.
Fixes: a42d985bd5b2 ("ib_srpt: Initial SRP Target merge for v3.3-rc1")
Signed-off-by: William Kucharski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Bart Van Assche <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Unfortunately the commit `fd8958efe877` introduced another error
causing the `descs` array to overflow. This reults in further crashes
easily reproducible by `sendmsg` system call.
[ 1080.836473] general protection fault, probably for non-canonical address 0x400300015528b00a: 0000 [#1] PREEMPT SMP PTI
[ 1080.869326] RIP: 0010:hfi1_ipoib_build_ib_tx_headers.constprop.0+0xe1/0x2b0 [hfi1]
--
[ 1080.974535] Call Trace:
[ 1080.976990] <TASK>
[ 1081.021929] hfi1_ipoib_send_dma_common+0x7a/0x2e0 [hfi1]
[ 1081.027364] hfi1_ipoib_send_dma_list+0x62/0x270 [hfi1]
[ 1081.032633] hfi1_ipoib_send+0x112/0x300 [hfi1]
[ 1081.042001] ipoib_start_xmit+0x2a9/0x2d0 [ib_ipoib]
[ 1081.046978] dev_hard_start_xmit+0xc4/0x210
--
[ 1081.148347] __sys_sendmsg+0x59/0xa0
crash> ipoib_txreq 0xffff9cfeba229f00
struct ipoib_txreq {
txreq = {
list = {
next = 0xffff9cfeba229f00,
prev = 0xffff9cfeba229f00
},
descp = 0xffff9cfeba229f40,
coalesce_buf = 0x0,
wait = 0xffff9cfea4e69a48,
complete = 0xffffffffc0fe0760 <hfi1_ipoib_sdma_complete>,
packet_len = 0x46d,
tlen = 0x0,
num_desc = 0x0,
desc_limit = 0x6,
next_descq_idx = 0x45c,
coalesce_idx = 0x0,
flags = 0x0,
descs = {{
qw = {0x8024000120dffb00, 0x4} # SDMA_DESC0_FIRST_DESC_FLAG (bit 63)
}, {
qw = { 0x3800014231b108, 0x4}
}, {
qw = { 0x310000e4ee0fcf0, 0x8}
}, {
qw = { 0x3000012e9f8000, 0x8}
}, {
qw = { 0x59000dfb9d0000, 0x8}
}, {
qw = { 0x78000e02e40000, 0x8}
}}
},
sdma_hdr = 0x400300015528b000, <<< invalid pointer in the tx request structure
sdma_status = 0x0, SDMA_DESC0_LAST_DESC_FLAG (bit 62)
complete = 0x0,
priv = 0x0,
txq = 0xffff9cfea4e69880,
skb = 0xffff9d099809f400
}
If an SDMA send consists of exactly 6 descriptors and requires dword
padding (in the 7th descriptor), the sdma_txreq descriptor array is not
properly expanded and the packet will overflow into the container
structure. This results in a panic when the send completion runs. The
exact panic varies depending on what elements of the container structure
get corrupted. The fix is to use the correct expression in
_pad_sdma_tx_descs() to test the need to expand the descriptor array.
With this patch the crashes are no longer reproducible and the machine is
stable.
Fixes: fd8958efe877 ("IB/hfi1: Fix sdma.h tx->num_descs off-by-one errors")
Cc: [email protected]
Reported-by: Mats Kronberg <[email protected]>
Tested-by: Mats Kronberg <[email protected]>
Signed-off-by: Daniel Vacek <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Remove the unneeded assignment of the qp_num which is already
set in irdma_create_qp().
Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Mustafa Ismail <[email protected]>
Signed-off-by: Shiraz Saleem <[email protected]>
Signed-off-by: Sindhu Devale <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Add IRDMA_AE_LLP_TOO_MANY_RNRS to the list of AE's processed as an
abnormal asyncronous event.
Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Mustafa Ismail <[email protected]>
Signed-off-by: Shiraz Saleem <[email protected]>
Signed-off-by: Sindhu Devale <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
The CQ shadow read threshold is currently not set for GEN 2. This could
cause an invalid CQ overflow condition, so remove the GEN check that
exclused GEN 1.
Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Mustafa Ismail <[email protected]>
Signed-off-by: Shiraz Saleem <[email protected]>
Signed-off-by: Sindhu Devale <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Validate that max_send_wr and max_recv_wr is within the
supported range.
Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Change-Id: I2fc8b10292b641fddd20b36986a9dae90a93f4be
Signed-off-by: Shiraz Saleem <[email protected]>
Signed-off-by: Sindhu Devale <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
KASAN testing revealed the following issue assocated with freeing an IRQ.
[50006.466686] Call Trace:
[50006.466691] <IRQ>
[50006.489538] dump_stack+0x5c/0x80
[50006.493475] print_address_description.constprop.6+0x1a/0x150
[50006.499872] ? irdma_sc_process_ceq+0x483/0x790 [irdma]
[50006.505742] ? irdma_sc_process_ceq+0x483/0x790 [irdma]
[50006.511644] kasan_report.cold.11+0x7f/0x118
[50006.516572] ? irdma_sc_process_ceq+0x483/0x790 [irdma]
[50006.522473] irdma_sc_process_ceq+0x483/0x790 [irdma]
[50006.528232] irdma_process_ceq+0xb2/0x400 [irdma]
[50006.533601] ? irdma_hw_flush_wqes_callback+0x370/0x370 [irdma]
[50006.540298] irdma_ceq_dpc+0x44/0x100 [irdma]
[50006.545306] tasklet_action_common.isra.14+0x148/0x2c0
[50006.551096] __do_softirq+0x1d0/0xaf8
[50006.555396] irq_exit_rcu+0x219/0x260
[50006.559670] irq_exit+0xa/0x20
[50006.563320] smp_apic_timer_interrupt+0x1bf/0x690
[50006.568645] apic_timer_interrupt+0xf/0x20
[50006.573341] </IRQ>
The issue is that a tasklet could be pending on another core racing
the delete of the irq.
Fix by insuring any scheduled tasklet is killed after deleting the
irq.
Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions")
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Shiraz Saleem <[email protected]>
Signed-off-by: Sindhu Devale <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
When creating EQs we take into consideration the max number of EQs the
device reported it can support and the number of available CPUs. There
are situations where the number of EQs the device reported it can
support and the PCI configuration of MSI-X is different, take it in
account as well when creating EQs.
Also request at least 1 MSI-X vector for the management queue and allow
the kernel to return a number of vectors in a range between 1 and the
max supported MSI-X vectors according to the PCI config.
Reviewed-by: Michael Margolin <[email protected]>
Reviewed-by: Yonatan Goldhirsh <[email protected]>
Signed-off-by: Yonatan Nachum <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Relax DEVX access upon modify commands to be UVERBS_ACCESS_READ.
The kernel doesn't need to protect what firmware protects, or what
causes no damage to anyone but the user.
As firmware needs to protect itself from parallel access to the same
object, don't block parallel modify/query commands on the same object in
the kernel side.
This change will allow user space application to run parallel updates to
different entries in the same bulk object.
Tested-by: Tamar Mashiah <[email protected]>
Signed-off-by: Yishai Hadas <[email protected]>
Reviewed-by: Michael Guralnik <[email protected]>
Link: https://lore.kernel.org/r/7407d5ed35dc427c1097699e12b49c01e1073406.1706433934.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
supported
debugfs entries for RRoCE general CC parameters must be exposed only when
they are supported, otherwise when accessing them there may be a syndrome
error in kernel log, for example:
$ cat /sys/kernel/debug/mlx5/0000:08:00.1/cc_params/rtt_resp_dscp
cat: '/sys/kernel/debug/mlx5/0000:08:00.1/cc_params/rtt_resp_dscp': Invalid argument
$ dmesg
mlx5_core 0000:08:00.1: mlx5_cmd_out_err:805:(pid 1253): QUERY_CONG_PARAMS(0x824) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x325a82), err(-22)
Fixes: 66fb1d5df6ac ("IB/mlx5: Extend debug control for CC parameters")
Reviewed-by: Edward Srouji <[email protected]>
Signed-off-by: Mark Zhang <[email protected]>
Link: https://lore.kernel.org/r/e7ade70bad52b7468bdb1de4d41d5fad70c8b71c.1706433934.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
------------[ cut here ]------------
memcpy: detected field-spanning write (size 56) of single field "eseg->inline_hdr.start" at /var/lib/dkms/mlnx-ofed-kernel/5.8/build/drivers/infiniband/hw/mlx5/wr.c:131 (size 2)
WARNING: CPU: 0 PID: 293779 at /var/lib/dkms/mlnx-ofed-kernel/5.8/build/drivers/infiniband/hw/mlx5/wr.c:131 mlx5_ib_post_send+0x191b/0x1a60 [mlx5_ib]
Modules linked in: 8021q garp mrp stp llc rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_ib(OE) ib_uverbs(OE) ib_core(OE) mlx5_core(OE) pci_hyperv_intf mlxdevm(OE) mlx_compat(OE) tls mlxfw(OE) psample nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables libcrc32c nfnetlink mst_pciconf(OE) knem(OE) vfio_pci vfio_pci_core vfio_iommu_type1 vfio iommufd irqbypass cuse nfsv3 nfs fscache netfs xfrm_user xfrm_algo ipmi_devintf ipmi_msghandler binfmt_misc crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 snd_pcsp aesni_intel crypto_simd cryptd snd_pcm snd_timer joydev snd soundcore input_leds serio_raw evbug nfsd auth_rpcgss nfs_acl lockd grace sch_fq_codel sunrpc drm efi_pstore ip_tables x_tables autofs4 psmouse virtio_net net_failover failover floppy
[last unloaded: mlx_compat(OE)]
CPU: 0 PID: 293779 Comm: ssh Tainted: G OE 6.2.0-32-generic #32~22.04.1-Ubuntu
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:mlx5_ib_post_send+0x191b/0x1a60 [mlx5_ib]
Code: 0c 01 00 a8 01 75 25 48 8b 75 a0 b9 02 00 00 00 48 c7 c2 10 5b fd c0 48 c7 c7 80 5b fd c0 c6 05 57 0c 03 00 01 e8 95 4d 93 da <0f> 0b 44 8b 4d b0 4c 8b 45 c8 48 8b 4d c0 e9 49 fb ff ff 41 0f b7
RSP: 0018:ffffb5b48478b570 EFLAGS: 00010046
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffb5b48478b628 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffb5b48478b5e8
R13: ffff963a3c609b5e R14: ffff9639c3fbd800 R15: ffffb5b480475a80
FS: 00007fc03b444c80(0000) GS:ffff963a3dc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000556f46bdf000 CR3: 0000000006ac6003 CR4: 00000000003706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
? show_regs+0x72/0x90
? mlx5_ib_post_send+0x191b/0x1a60 [mlx5_ib]
? __warn+0x8d/0x160
? mlx5_ib_post_send+0x191b/0x1a60 [mlx5_ib]
? report_bug+0x1bb/0x1d0
? handle_bug+0x46/0x90
? exc_invalid_op+0x19/0x80
? asm_exc_invalid_op+0x1b/0x20
? mlx5_ib_post_send+0x191b/0x1a60 [mlx5_ib]
mlx5_ib_post_send_nodrain+0xb/0x20 [mlx5_ib]
ipoib_send+0x2ec/0x770 [ib_ipoib]
ipoib_start_xmit+0x5a0/0x770 [ib_ipoib]
dev_hard_start_xmit+0x8e/0x1e0
? validate_xmit_skb_list+0x4d/0x80
sch_direct_xmit+0x116/0x3a0
__dev_xmit_skb+0x1fd/0x580
__dev_queue_xmit+0x284/0x6b0
? _raw_spin_unlock_irq+0xe/0x50
? __flush_work.isra.0+0x20d/0x370
? push_pseudo_header+0x17/0x40 [ib_ipoib]
neigh_connected_output+0xcd/0x110
ip_finish_output2+0x179/0x480
? __smp_call_single_queue+0x61/0xa0
__ip_finish_output+0xc3/0x190
ip_finish_output+0x2e/0xf0
ip_output+0x78/0x110
? __pfx_ip_finish_output+0x10/0x10
ip_local_out+0x64/0x70
__ip_queue_xmit+0x18a/0x460
ip_queue_xmit+0x15/0x30
__tcp_transmit_skb+0x914/0x9c0
tcp_write_xmit+0x334/0x8d0
tcp_push_one+0x3c/0x60
tcp_sendmsg_locked+0x2e1/0xac0
tcp_sendmsg+0x2d/0x50
inet_sendmsg+0x43/0x90
sock_sendmsg+0x68/0x80
sock_write_iter+0x93/0x100
vfs_write+0x326/0x3c0
ksys_write+0xbd/0xf0
? do_syscall_64+0x69/0x90
__x64_sys_write+0x19/0x30
do_syscall_64+0x59/0x90
? do_user_addr_fault+0x1d0/0x640
? exit_to_user_mode_prepare+0x3b/0xd0
? irqentry_exit_to_user_mode+0x9/0x20
? irqentry_exit+0x43/0x50
? exc_page_fault+0x92/0x1b0
entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7fc03ad14a37
Code: 10 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
RSP: 002b:00007ffdf8697fe8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000008024 RCX: 00007fc03ad14a37
RDX: 0000000000008024 RSI: 0000556f46bd8270 RDI: 0000000000000003
RBP: 0000556f46bb1800 R08: 0000000000007fe3 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000002
R13: 0000556f46bc66b0 R14: 000000000000000a R15: 0000556f46bb2f50
</TASK>
---[ end trace 0000000000000000 ]---
Link: https://lore.kernel.org/r/8228ad34bd1a25047586270f7b1fb4ddcd046282.1706433934.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
mlx5_ib_copy_pas() doesn't exist anymore.
Signed-off-by: Alexey Dobriyan <[email protected]>
Link: https://lore.kernel.org/r/a2cb861e-d11e-4567-8a73-73763d1dc199@p183
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
c4iw_ep_redirect() doesn't exist.
Signed-off-by: Alexey Dobriyan <[email protected]>
Link: https://lore.kernel.org/r/a67881e7-469b-43d9-9973-ad7579eb2c4b@p183
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Before populating the response, driver has to check the status
of HWRM command.
Fixes: 37cb11acf1f7 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters")
Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
SRQ resize is not supported in the driver. But driver is not
returning error from bnxt_re_modify_srq() for SRQ resize.
Fixes: 37cb11acf1f7 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters")
Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Older adapters required an unconditional fence for
non-wire memory operations. Newer adapters doesn't require
this and therefore, disabling the unconditional fence.
Fixes: 1801d87b3598 ("RDMA/bnxt_re: Support new 5760X P7 devices")
Signed-off-by: Kashyap Desai <[email protected]>
Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
After the cited commit, there is no possibility that this check
can return true. Remove it.
Fixes: a43c26fa2e6c ("RDMA/bnxt_re: Remove the sriov config callback")
Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Limit the usage of fence MR to adapters older than Gen P5 products.
Fixes: 1801d87b3598 ("RDMA/bnxt_re: Support new 5760X P7 devices")
Signed-off-by: Kashyap Desai <[email protected]>
Signed-off-by: Bhargava Chenna Marreddy <[email protected]>
Signed-off-by: Kalesh AP <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Use a helper function to install callbacks to CQs.
This patch removes code repetition.
Signed-off-by: Konstantin Taranov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Use a helper function to access netdevs using a port number.
This patch removes code repetitions as well as removes the need
to explicitly use gdma_dev, which was error-prone.
Signed-off-by: Konstantin Taranov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Use a helper function to access gdma_context from mana_ib_dev.
This patch removes code repetitions as well as removes the need
to explicitly use gdma_dev, which was error-prone.
Signed-off-by: Konstantin Taranov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
When dma_alloc_coherent fails to allocate dd->cr_base[i].va,
init_credit_return should deallocate dd->cr_base and
dd->cr_base[i] that allocated before. Or those resources
would be never freed and a memleak is triggered.
Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Signed-off-by: Zhipeng Lu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Acked-by: Dennis Dalessandro <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
'struct hns_roce_hem' is used to refer to the last level of
dma buffer managed by the hw, pointed by a single BA(base
address) in the previous level of BT(base table), so the dma
buffer in 'struct hns_roce_hem' must be contiguous.
Right now the size of dma buffer in 'struct hns_roce_hem' is
decided by mhop->buf_chunk_size in get_hem_table_config(),
which ensure the mhop->buf_chunk_size is power of two of
PAGE_SIZE, so there will be only one contiguous dma buffer
allocated in hns_roce_alloc_hem(), which means hem->chunk_list
and chunk->mem for linking multi dma buffers is unnecessary.
This patch removes the hem->chunk_list and chunk->mem and other
related macro and function accordingly.
Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Junxian Huang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
In the current implementation, a fixed addressing level is used for
PBL. But in fact, the necessary addressing level is related to page
size and the size of MR.
This patch calculates the addressing level according to page size
and the size of MR, and uses the addressing level to configure the
PBL.
Signed-off-by: Chengchang Tang <[email protected]>
Signed-off-by: Junxian Huang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
In the current implementation, a fixed page size is used to
configure the umem PBL, which is not flexible enough and is
not conducive to the performance of the HW. Find a best page
size to get better performance.
Signed-off-by: Chengchang Tang <[email protected]>
Signed-off-by: Junxian Huang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
MTR memory allocation do not depend on allocation of mtt.
This patch moves the allocation of mtr before mtt in preparation for
the following optimization.
Signed-off-by: Chengchang Tang <[email protected]>
Signed-off-by: Junxian Huang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
page_shift and page_cnt is only used in mtr_map_bufs(). And these
parameter could be calculated indepedently.
Strip the computation of page_shift and page_cnt from mtr_init_buf_cfg(),
reducing the number of parameters of it. This helps reducing coupling
between mtr_init_buf_cfg() and mtr_map_bufs().
Signed-off-by: Chengchang Tang <[email protected]>
Signed-off-by: Junxian Huang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
hns_roce_mtr_find() is a collection of multiple functions, and the
return value is also difficult to understand, which is not conducive
to modification and maintenance.
Separate the function of obtaining MTR root BA from this function.
And some adjustments has been made to improve readability.
Signed-off-by: Chengchang Tang <[email protected]>
Signed-off-by: Junxian Huang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Utilize the %pe print specifier to get the symbolic error name as a
string (i.e "-ENOMEM") in the log message instead of the error code to
increase its readability.
This change was suggested in
https://lore.kernel.org/all/[email protected]/
Signed-off-by: Christian Heusel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Dan Carpenter <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
commit 9ac01f434a1e ("RDMA/rxe: Extend dbg log messages to err and info")
newly added this info. But it did only show null device when
the rdma_rxe is being loaded because dev_name(rxe->ib_dev->dev)
has not yet been assigned at the moment:
"(null): rxe_set_mtu: Set mtu to 1024"
Remove it to silent this message, check the mtu from it backend link
instead if needed.
CC: Bob Pearson <[email protected]>
Signed-off-by: Li Zhijian <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Zhu Yanjun <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Previously rxe_{dbg,info,err}() macros are appended built-in newline,
but some users will add redundant newline sometimes. So remove the
built-in newline for these macros.
In terms of rxe_{dbg,info,err}_xxx() macros, because they don't have
built-in newline, append newline when using them.
CC: Daisuke Matsuda <[email protected]>
Reviewed-by: Daisuke Matsuda <[email protected]>
Acked-by: Zhu Yanjun <[email protected]>
Signed-off-by: Li Zhijian <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Fix spelling mistakes as reported by codespell.
Fix kernel-doc warnings.
Signed-off-by: Randy Dunlap <[email protected]>
Cc: Dennis Dalessandro <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Leon Romanovsky <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Pull rdma updates from Jason Gunthorpe:
"Small cycle, with some typical driver updates:
- General code tidying in siw, hfi1, idrdma, usnic, hns rtrs and
bnxt_re
- Many small siw cleanups without an overeaching theme
- Debugfs stats for hns
- Fix a TX queue timeout in IPoIB and missed locking of the mcast
list
- Support more features of P7 devices in bnxt_re including a new work
submission protocol
- CQ interrupts for MANA
- netlink stats for erdma
- EFA multipath PCI support
- Fix Incorrect MR invalidation in iser"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (66 commits)
RDMA/bnxt_re: Fix error code in bnxt_re_create_cq()
RDMA/efa: Add EFA query MR support
IB/iser: Prevent invalidating wrong MR
RDMA/erdma: Add hardware statistics support
RDMA/erdma: Introduce dma pool for hardware responses of CMDQ requests
IB/iser: iscsi_iser.h: fix kernel-doc warning and spellos
RDMA/mana_ib: Add CQ interrupt support for RAW QP
RDMA/mana_ib: query device capabilities
RDMA/mana_ib: register RDMA device with GDMA
RDMA/bnxt_re: Fix the sparse warnings
RDMA/bnxt_re: Fix the offset for GenP7 adapters for user applications
RDMA/bnxt_re: Share a page to expose per CQ info with userspace
RDMA/bnxt_re: Add UAPI to share a page with user space
IB/ipoib: Fix mcast list locking
RDMA/mlx5: Expose register c0 for RDMA device
net/mlx5: E-Switch, expose eswitch manager vport
net/mlx5: Manage ICM type of SW encap
RDMA/mlx5: Support handling of SW encap ICM area
net/mlx5: Introduce indirect-sw-encap ICM properties
RDMA/bnxt_re: Adds MSN table capability for Gen P7 adapters
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"CPU features:
- Remove ARM64_HAS_NO_HW_PREFETCH copy_page() optimisation for ye
olde Thunder-X machines
- Avoid mapping KPTI trampoline when it is not required
- Make CPU capability API more robust during early initialisation
Early idreg overrides:
- Remove dependencies on core kernel helpers from the early
command-line parsing logic in preparation for moving this code
before the kernel is mapped
FPsimd:
- Restore kernel-mode fpsimd context lazily, allowing us to run
fpsimd code sequences in the kernel with pre-emption enabled
KBuild:
- Install 'vmlinuz.efi' when CONFIG_EFI_ZBOOT=y
- Makefile cleanups
LPA2 prep:
- Preparatory work for enabling the 'LPA2' extension, which will
introduce 52-bit virtual and physical addressing even with 4KiB
pages (including for KVM guests).
Misc:
- Remove dead code and fix a typo
MM:
- Pass NUMA node information for IRQ stack allocations
Perf:
- Add perf support for the Synopsys DesignWare PCIe PMU
- Add support for event counting thresholds (FEAT_PMUv3_TH)
introduced in Armv8.8
- Add support for i.MX8DXL SoCs to the IMX DDR PMU driver.
- Minor PMU driver fixes and optimisations
RIP VPIPT:
- Remove what support we had for the obsolete VPIPT I-cache policy
Selftests:
- Improvements to the SVE and SME selftests
Stacktrace:
- Refactor kernel unwind logic so that it can used by BPF unwinding
and, eventually, reliable backtracing
Sysregs:
- Update a bunch of register definitions based on the latest XML drop
from Arm"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (87 commits)
kselftest/arm64: Don't probe the current VL for unsupported vector types
efi/libstub: zboot: do not use $(shell ...) in cmd_copy_and_pad
arm64: properly install vmlinuz.efi
arm64/sysreg: Add missing system instruction definitions for FGT
arm64/sysreg: Add missing system register definitions for FGT
arm64/sysreg: Add missing ExtTrcBuff field definition to ID_AA64DFR0_EL1
arm64/sysreg: Add missing Pauth_LR field definitions to ID_AA64ISAR1_EL1
arm64: memory: remove duplicated include
arm: perf: Fix ARCH=arm build with GCC
arm64: Align boot cpucap handling with system cpucap handling
arm64: Cleanup system cpucap handling
MAINTAINERS: add maintainers for DesignWare PCIe PMU driver
drivers/perf: add DesignWare PCIe PMU driver
PCI: Move pci_clear_and_set_dword() helper to PCI header
PCI: Add Alibaba Vendor ID to linux/pci_ids.h
docs: perf: Add description for Synopsys DesignWare PCIe PMU driver
arm64: irq: set the correct node for shadow call stack
Revert "perf/arm_dmc620: Remove duplicate format attribute #defines"
arm64: fpsimd: Implement lazy restore for kernel mode FPSIMD
arm64: fpsimd: Preserve/restore kernel mode NEON at context switch
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
"This contains the usual miscellaneous features, cleanups, and fixes
for vfs and individual fses.
Features:
- Add Jan Kara as VFS reviewer
- Show correct device and inode numbers in proc/<pid>/maps for vma
files on stacked filesystems. This is now easily doable thanks to
the backing file work from the last cycles. This comes with
selftests
Cleanups:
- Remove a redundant might_sleep() from wait_on_inode()
- Initialize pointer with NULL, not 0
- Clarify comment on access_override_creds()
- Rework and simplify eventfd_signal() and eventfd_signal_mask()
helpers
- Process aio completions in batches to avoid needless wakeups
- Completely decouple struct mnt_idmap from namespaces. We now only
keep the actual idmapping around and don't stash references to
namespaces
- Reformat maintainer entries to indicate that a given subsystem
belongs to fs/
- Simplify fput() for files that were never opened
- Get rid of various pointless file helpers
- Rename various file helpers
- Rename struct file members after SLAB_TYPESAFE_BY_RCU switch from
last cycle
- Make relatime_need_update() return bool
- Use GFP_KERNEL instead of GFP_USER when allocating superblocks
- Replace deprecated ida_simple_*() calls with their current ida_*()
counterparts
Fixes:
- Fix comments on user namespace id mapping helpers. They aren't
kernel doc comments so they shouldn't be using /**
- s/Retuns/Returns/g in various places
- Add missing parameter documentation on can_move_mount_beneath()
- Rename i_mapping->private_data to i_mapping->i_private_data
- Fix a false-positive lockdep warning in pipe_write() for watch
queues
- Improve __fget_files_rcu() code generation to improve performance
- Only notify writer that pipe resizing has finished after setting
pipe->max_usage otherwise writers are never notified that the pipe
has been resized and hang
- Fix some kernel docs in hfsplus
- s/passs/pass/g in various places
- Fix kernel docs in ntfs
- Fix kcalloc() arguments order reported by gcc 14
- Fix uninitialized value in reiserfs"
* tag 'vfs-6.8.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (36 commits)
reiserfs: fix uninit-value in comp_keys
watch_queue: fix kcalloc() arguments order
ntfs: dir.c: fix kernel-doc function parameter warnings
fs: fix doc comment typo fs tree wide
selftests/overlayfs: verify device and inode numbers in /proc/pid/maps
fs/proc: show correct device and inode numbers in /proc/pid/maps
eventfd: Remove usage of the deprecated ida_simple_xx() API
fs: super: use GFP_KERNEL instead of GFP_USER for super block allocation
fs/hfsplus: wrapper.c: fix kernel-doc warnings
fs: add Jan Kara as reviewer
fs/inode: Make relatime_need_update return bool
pipe: wakeup wr_wait after setting max_usage
file: remove __receive_fd()
file: stop exposing receive_fd_user()
fs: replace f_rcuhead with f_task_work
file: remove pointless wrapper
file: s/close_fd_get_file()/file_close_fd()/g
Improve __fget_files_rcu() code generation (and thus __fget_light())
file: massage cleanup of files that failed to open
fs/pipe: Fix lockdep false-positive in watchqueue pipe_write()
...
|
|
Return -ENOMEM if get_zeroed_page() fails. Don't return success.
Fixes: e275919d9669 ("RDMA/bnxt_re: Share a page to expose per CQ info with userspace")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Dan Carpenter <[email protected]>
Acked-by: Selvin Xavier <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Add EFA driver uapi definitions and register a new query MR method that
currently returns the physical interconnects the device is using to
reach the MR. Update admin definitions and efa-abi accordingly.
Reviewed-by: Anas Mousa <[email protected]>
Reviewed-by: Firas Jahjah <[email protected]>
Signed-off-by: Michael Margolin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
The iser_reg_resources structure has two pointers to MR but only one
mr_valid field. The implementation assumes that we use only *sig_mr when
pi_enable is true. Otherwise, we use only *mr. However, it is only
sometimes correct. Read commands without protection information occur even
when pi_enble is true. For example, the following SCSI commands have a
Data-In buffer but never have protection information: READ CAPACITY (16),
INQUIRY, MODE SENSE(6), MAINTENANCE IN. So, we use
*sig_mr for some SCSI commands and *mr for the other SCSI commands.
In most cases, it works fine because the remote invalidation is applied.
However, there are two cases when the remote invalidation is not
applicable.
1. Small write commands when all data is sent as an immediate.
2. The target does not support the remote invalidation feature.
The lazy invalidation is used if the remote invalidation is impossible.
Since, at the lazy invalidation, we always invalidate the MR we want to
use, the wrong MR may be invalidated.
To fix the issue, we need a field per MR that indicates the MR needs
invalidation. Since the ib_mr structure already has such a field, let's
use ib_mr.need_inval instead of iser_reg_resources.mr_valid.
Fixes: b76a439982f8 ("IB/iser: Use IB_WR_REG_MR_INTEGRITY for PI handover")
Link: https://lore.kernel.org/r/[email protected]
Acked-by: Max Gurtovoy <[email protected]>
Signed-off-by: Sergey Gorenko <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|