|
svc_rdma_post_recv() allocates pages for receive buffers on-demand.
It uses GFP_KERNEL so the allocator tries hard, and may sleep. But
I'm about to add a call to svc_rdma_post_recv() from a function
that may not sleep.
Since all svc_rdma_post_recv() call sites can tolerate its failure,
allow it to fail if the page allocator returns nothing. Longer term,
receive buffers, being a finite resource per-connection, should be
pre-allocated and re-used.
Signed-off-by: Chuck Lever <[email protected]>
Acked-by: Bruce Fields <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Clean up.
Signed-off-by: Chuck Lever <[email protected]>
Acked-by: Bruce Fields <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
To ensure this allocation cannot fail and will not sleep,
pre-allocate the req_map structures per-connection.
Signed-off-by: Chuck Lever <[email protected]>
Acked-by: Bruce Fields <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
When the maximum payload size of NFS READ and WRITE was increased
by commit cc9a903d915c ("svcrdma: Change maximum server payload back
to RPCSVC_MAXPAYLOAD"), the size of struct svc_rdma_op_ctxt
increased to over 6KB (on x86_64). That makes allocating one of
these from a kmem_cache more likely to fail in situations when
system memory is exhausted.
Since I'm about to add a caller where this allocation must always
work _and_ it cannot sleep, pre-allocate ctxts for each connection.
Another motivation for this change is that NFSv4.x servers are
required by specification not to drop NFS requests. Pre-allocating
memory resources reduces the likelihood of a drop.
Signed-off-by: Chuck Lever <[email protected]>
Acked-by: Bruce Fields <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
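The pre-allocation idea in the message above can be illustrated with a small userspace sketch. It is not the svcrdma code: struct ctxt, struct ctxt_pool and the pool_* helpers are hypothetical names, locking is omitted, and the pool is assumed to be sized for the most contexts a connection can have outstanding. All memory is obtained once at setup, where failing or sleeping is acceptable, so the hot path never calls the allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-connection context object. */
struct ctxt {
    struct ctxt *next;      /* free-list link */
    char payload[64];
};

/* Hypothetical per-connection pool; all memory is obtained up front. */
struct ctxt_pool {
    struct ctxt *slab;      /* one allocation backing every ctxt */
    struct ctxt *free;      /* head of the free list */
};

/* Setup may fail or block: it runs once, when the connection is accepted. */
int pool_init(struct ctxt_pool *p, size_t count)
{
    p->slab = calloc(count, sizeof(*p->slab));
    if (!p->slab)
        return -1;
    p->free = NULL;
    for (size_t i = 0; i < count; i++) {
        p->slab[i].next = p->free;
        p->free = &p->slab[i];
    }
    return 0;
}

/* Hot path: no allocator call, so nothing here can sleep. */
struct ctxt *pool_get(struct ctxt_pool *p)
{
    struct ctxt *c = p->free;

    if (c)
        p->free = c->next;
    return c;               /* NULL only if the pool was sized too small */
}

/* Return a ctxt to circulation. */
void pool_put(struct ctxt_pool *p, struct ctxt *c)
{
    c->next = p->free;
    p->free = c;
}

void pool_destroy(struct ctxt_pool *p)
{
    free(p->slab);
    p->slab = NULL;
    p->free = NULL;
}
```

A caller would run pool_init() once per connection, then pair every pool_get() with a pool_put() on each completion path.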
|
|
Be sure the completed ctxt is put in every path.
The xprt enqueue can take a while, so put the completed ctxt back
in circulation _before_ enqueuing the xprt.
Remove/disable debugging.
Signed-off-by: Chuck Lever <[email protected]>
Acked-by: Bruce Fields <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
kzalloc is used here, so setting the atomic fields to zero is
unnecessary. sc_ord is set again in handle_connect_req. The other
fields are re-initialized in svc_rdma_accept().
Signed-off-by: Chuck Lever <[email protected]>
Acked-by: Bruce Fields <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Previously, IPV6_DEFAULT_HOPLIMIT was used as the hop limit value for
RoCE. Fix that by taking the hop limit values from ip4_dst_hoplimit and
ip6_dst_hoplimit instead.
Signed-off-by: Matan Barak <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
rdma_addr_find_dmac_by_grh resolves dmac, vlan_id and if_index, and a
downstream patch will also add hop_limit as an output parameter,
so rename it to rdma_addr_find_l2_eth_by_grh.
Signed-off-by: Matan Barak <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
ib_send_cm_drep() calls cm_enter_timewait() while holding a spinlock
that can be locked from inside an interrupt handler. Hence do not
enable interrupts inside cm_enter_timewait() if called with interrupts
disabled.
This patch fixes e.g. the following deadlock:
Acked-by: Erez Shitrit <[email protected]>
=================================
[ INFO: inconsistent lock state ]
4.4.0-rc7+ #1 Tainted: G E
---------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
swapper/8/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
(&(&cm_id_priv->lock)->rlock){?.+...}, at: [<ffffffffa036eec4>] cm_establish+0x74/0x1b0 [ib_cm]
{HARDIRQ-ON-W} state was registered at:
[<ffffffff810a3c11>] mark_held_locks+0x71/0x90
[<ffffffff810a3e87>] trace_hardirqs_on_caller+0xa7/0x1c0
[<ffffffff810a3fad>] trace_hardirqs_on+0xd/0x10
[<ffffffff8151c40b>] _raw_spin_unlock_irq+0x2b/0x40
[<ffffffffa036ea8e>] cm_enter_timewait+0xae/0x100 [ib_cm]
[<ffffffffa036ff76>] ib_send_cm_drep+0xb6/0x190 [ib_cm]
[<ffffffffa052ed08>] srp_cm_handler+0x128/0x1a0 [ib_srp]
[<ffffffffa0370340>] cm_process_work+0x20/0xf0 [ib_cm]
[<ffffffffa0371335>] cm_dreq_handler+0x135/0x2c0 [ib_cm]
[<ffffffffa03733c5>] cm_work_handler+0x75/0xd0 [ib_cm]
[<ffffffff8107184d>] process_one_work+0x1bd/0x460
[<ffffffff81073148>] worker_thread+0x118/0x420
[<ffffffff81078454>] kthread+0xe4/0x100
[<ffffffff8151cbbf>] ret_from_fork+0x3f/0x70
irq event stamp: 1672286
hardirqs last enabled at (1672283): [<ffffffff81408ec0>] poll_idle+0x10/0x80
hardirqs last disabled at (1672284): [<ffffffff8151d304>] common_interrupt+0x84/0x89
softirqs last enabled at (1672286): [<ffffffff8105b4dc>] _local_bh_enable+0x1c/0x50
softirqs last disabled at (1672285): [<ffffffff8105b697>] irq_enter+0x47/0x70
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&(&cm_id_priv->lock)->rlock);
<Interrupt>
lock(&(&cm_id_priv->lock)->rlock);
*** DEADLOCK ***
no locks held by swapper/8/0.
stack backtrace:
CPU: 8 PID: 0 Comm: swapper/8 Tainted: G E 4.4.0-rc7+ #1
Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.0.2 11/17/2014
ffff88045af5e950 ffff88046e503a88 ffffffff81251c1b 0000000000000007
0000000000000006 0000000000000003 ffff88045af5ddc0 ffff88046e503ad8
ffffffff810a32f4 0000000000000000 0000000000000000 0000000000000001
Call Trace:
<IRQ> [<ffffffff81251c1b>] dump_stack+0x4f/0x74
[<ffffffff810a32f4>] print_usage_bug+0x184/0x190
[<ffffffff810a36e2>] mark_lock_irq+0xf2/0x290
[<ffffffff810a3995>] mark_lock+0x115/0x1b0
[<ffffffff810a3b8c>] mark_irqflags+0x15c/0x170
[<ffffffff810a4fef>] __lock_acquire+0x1ef/0x560
[<ffffffff810a53c2>] lock_acquire+0x62/0x80
[<ffffffff8151bd33>] _raw_spin_lock_irqsave+0x43/0x60
[<ffffffffa036eec4>] cm_establish+0x74/0x1b0 [ib_cm]
[<ffffffffa036f031>] ib_cm_notify+0x31/0x100 [ib_cm]
[<ffffffffa0637f24>] srpt_qp_event+0x54/0xd0 [ib_srpt]
[<ffffffffa0196052>] mlx4_ib_qp_event+0x72/0xc0 [mlx4_ib]
[<ffffffffa00775b9>] mlx4_qp_event+0x69/0xd0 [mlx4_core]
[<ffffffffa006000e>] mlx4_eq_int+0x51e/0xd50 [mlx4_core]
[<ffffffffa006084f>] mlx4_msi_x_interrupt+0xf/0x20 [mlx4_core]
[<ffffffff810b67b0>] handle_irq_event_percpu+0x40/0x110
[<ffffffff810b68bf>] handle_irq_event+0x3f/0x70
[<ffffffff810ba7f9>] handle_edge_irq+0x79/0x120
[<ffffffff81007f3d>] handle_irq+0x5d/0x130
[<ffffffff810071fd>] do_IRQ+0x6d/0x130
[<ffffffff8151d309>] common_interrupt+0x89/0x89
<EOI> [<ffffffff8140895f>] cpuidle_enter_state+0xcf/0x200
[<ffffffff81408aa2>] cpuidle_enter+0x12/0x20
[<ffffffff810990d6>] call_cpuidle+0x36/0x60
[<ffffffff81099163>] cpuidle_idle_call+0x63/0x110
[<ffffffff8109930a>] cpu_idle_loop+0xfa/0x130
[<ffffffff8109934e>] cpu_startup_entry+0xe/0x10
[<ffffffff8103c443>] start_secondary+0x83/0x90
Fixes: commit be4b499323bf ("IB/cm: Do not queue work to a device that's going away")
Signed-off-by: Bart Van Assche <[email protected]>
Cc: Erez Shitrit <[email protected]>
Cc: stable <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
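The hazard described above can be shown with a toy model of one CPU's interrupt-enable flag. This is a userspace sketch, not the ib_cm code: the model_* helpers are hypothetical and only mimic how the _irq and _irqsave/_irqrestore lock variants treat the flag. An inner helper that ends with the unconditional "enable" variant turns interrupts back on while the caller still holds its lock; the save/restore variant does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU's interrupt-enable flag; not kernel code. */
static bool irqs_enabled = true;

void model_lock_irq(void)   { irqs_enabled = false; }
void model_unlock_irq(void) { irqs_enabled = true; }   /* unconditional enable */

bool model_lock_irqsave(void)
{
    bool flags = irqs_enabled;      /* remember the caller's state */

    irqs_enabled = false;
    return flags;
}

void model_unlock_irqrestore(bool flags)
{
    irqs_enabled = flags;           /* restore, do not blindly enable */
}

/* Inner helper using the _irq pair: leaves interrupts enabled on exit. */
void inner_irq(void)
{
    model_lock_irq();
    model_unlock_irq();
}

/* Inner helper using save/restore: leaves the flag as it found it. */
void inner_irqsave(void)
{
    bool flags = model_lock_irqsave();

    model_unlock_irqrestore(flags);
}

/*
 * Outer caller holds its own lock with interrupts disabled, then calls the
 * inner helper. Returns true if interrupts were enabled while the outer
 * lock was still held, which is the window in which an interrupt handler
 * taking the same lock would deadlock.
 */
bool outer_leaks_irqs(void (*inner)(void))
{
    bool flags = model_lock_irqsave();
    bool leaked;

    inner();
    leaked = irqs_enabled;
    model_unlock_irqrestore(flags);
    return leaked;
}
```

In this model outer_leaks_irqs(inner_irq) reports the unsafe window and outer_leaks_irqs(inner_irqsave) does not.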
|
|
Avoid triggering the following kernel crash when processing
an RDMA completion:
BUG: unable to handle kernel paging request at 0000000100000198
IP: [<ffffffff810a4ea2>] __lock_acquire+0xa2/0x560
Call Trace:
[<ffffffff810a53c2>] lock_acquire+0x62/0x80
[<ffffffff8151bd33>] _raw_spin_lock_irqsave+0x43/0x60
[<ffffffffa04fd437>] srpt_rdma_read_done+0x57/0x120 [ib_srpt]
[<ffffffffa0144dd3>] __ib_process_cq+0x43/0xc0 [ib_core]
[<ffffffffa0145115>] ib_cq_poll_work+0x25/0x70 [ib_core]
[<ffffffff8107184d>] process_one_work+0x1bd/0x460
[<ffffffff81073148>] worker_thread+0x118/0x420
[<ffffffff81078454>] kthread+0xe4/0x100
[<ffffffff8151cbbf>] ret_from_fork+0x3f/0x70
Fixes: commit 59fae4deaad3 ("IB/srpt: chain RDMA READ/WRITE requests").
Signed-off-by: Bart Van Assche <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The IRQ_POLL_F_SCHED bit is set as long as polling is ongoing.
This means that irq_poll_sched() must proceed if this bit has
not yet been set.
Fixes: commit ea51190c0315 ("irq_poll: fold irq_poll_sched_prep into irq_poll_sched").
Signed-off-by: Bart Van Assche <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
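The rule stated above (schedule only if the flag was not yet set) can be sketched in userspace with C11 atomics. The names below are hypothetical stand-ins, not the irq_poll API: POLL_F_SCHED plays the role of IRQ_POLL_F_SCHED, and a counter replaces the real queueing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define POLL_F_SCHED (1u << 0)  /* hypothetical stand-in for IRQ_POLL_F_SCHED */

struct poller {
    atomic_uint state;
    unsigned int queued;        /* how many times the poller was queued */
};

void poller_init(struct poller *p)
{
    atomic_init(&p->state, 0u);
    p->queued = 0;
}

/* Atomically set the bit and report whether it was already set. */
bool test_and_set_sched(struct poller *p)
{
    return (atomic_fetch_or(&p->state, POLL_F_SCHED) & POLL_F_SCHED) != 0;
}

/* Proceed only when the bit was previously clear. */
void poll_sched(struct poller *p)
{
    if (test_and_set_sched(p))
        return;                 /* polling is already ongoing */
    p->queued++;
}

/* Polling finished: clear the bit so the next sched call can proceed. */
void poll_complete(struct poller *p)
{
    atomic_fetch_and(&p->state, ~POLL_F_SCHED);
}
```

Inverting the test in poll_sched() would queue nothing on the first call, which is the failure mode the message describes.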
|
|
Sparse complains about a dereference before check. Fix this by
moving the check before the dereference.
Fixes: 200298326b27 ('IB/core: Validate route when we init ah')
Signed-off-by: Matan Barak <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
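A minimal illustration of the pattern, with a hypothetical struct route that is not taken from the patch: the NULL check has to come before the first use of the pointer, otherwise the check is dead code and the dereference can fault.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical object; only the check-then-dereference order matters. */
struct route {
    int hop_limit;
};

/* Check first, dereference second. */
int route_hop_limit(const struct route *r)
{
    if (!r)
        return -EINVAL;
    return r->hop_limit;
}
```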
|
|
When the write_gid function needs to do a sleepable operation, it unlocks
table->rwlock and then relocks it. Sparse complains about context
imbalance.
This is safe as write_gid is always called with table->rwlock held.
write_gid protects from simultaneous writes to this GID entry
by setting the GID_TABLE_ENTRY_INVALID flag.
Fixes: 9c584f049596 ('IB/core: Change per-entry lock in RoCE GID table to
one lock')
Signed-off-by: Matan Barak <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Port number is not part of ClassPortInfo attribute but is
still needed as a parameter when invoking process_mad.
To properly handle this attribute, port_num is added as a
parameter to get_counter_table and get_perf_mad was changed
not to store port_num in the attribute itself when it's
querying the ClassPortInfo attribute.
This handles the issue pointed out by Matan Barak <[email protected]>
Fixes: 145d9c541032 ('IB/core: Display extended counter set if available')
Signed-off-by: Hal Rosenstock <[email protected]>
Acked-by: Matan Barak <[email protected]>
Acked-by: Ira Weiny <[email protected]>
Reviewed-by: Christoph Lameter <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Detected this by building the IB core with W=1. See also patch
"IB core: Fix ib_sg_to_pages()" (commit 8f5ba10ed40a).
Signed-off-by: Bart Van Assche <[email protected]>
Cc: Sagi Grimberg <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Acked-by: Sagi Grimberg <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Fix static checker warning:
drivers/infiniband/hw/mlx5/main.c:149 mlx5_query_port_roce()
warn: passing casted pointer '&props->qkey_viol_cntr' to
'mlx5_query_nic_vport_qkey_viol_cntr()' 32 vs 16.
Fixes: 3f89a643eb29 ("IB/mlx5: Extend query_device/port to support RoCE")
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
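The class of bug behind a "32 vs 16" warning can be sketched as follows. The struct and the query function are hypothetical stand-ins, not the mlx5 code. Passing a cast pointer to a 32-bit field into a function that writes 16 bits leaves the other half of the field untouched (and on big-endian machines fills the wrong half). A common remedy is to read into a correctly sized local and widen by value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical property block with a 32-bit counter field. */
struct port_props {
    uint32_t qkey_viol_cntr;
};

/* Hypothetical query that reports a 16-bit counter. */
int query_qkey_viol_cntr(uint16_t *out)
{
    *out = 0x1234;
    return 0;
}

/* Read into a 16-bit local, then assign; no pointer cast is involved. */
int fill_props(struct port_props *props)
{
    uint16_t cntr;
    int err = query_qkey_viol_cntr(&cntr);

    if (err)
        return err;
    props->qkey_viol_cntr = cntr;   /* widened by value, upper bits zeroed */
    return 0;
}
```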
|
|
Remove the local workqueue to process mad completions and use the CQ API
instead.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Hal Rosenstock <[email protected]>
Reviewed-by: Ira Weiny <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Stop abusing wr_id and just pass the parameter explicitly.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Hal Rosenstock <[email protected]>
Reviewed-by: Ira Weiny <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Use eth_zero_addr to assign the zero address to the given address
array instead of calling memset with a fill value of zero.
Signed-off-by: Lucas Tanure <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Fix the following sparse warning:
drivers/infiniband/hw/mlx5/main.c:1061:29: warning: symbol 'pfn' shadows
an earlier one
drivers/infiniband/hw/mlx5/main.c:1030:21: originally declared here
Fixes: d69e3bcf7976 ('IB/mlx5: Mmap the HCA's core clock register to user-space')
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The macro mlx4_foreach_non_ib_transport_port() is not used anywhere. Remove it.
Fixes: aa9a2d51a3e7 ("mlx4: Activate RoCE/SRIOV")
Signed-off-by: Moni Shoua <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
In commit dbf727de7440 ("IB/core: Use GID table in AH creation and dmac
resolution") we copy source mac to mlx4_ah from the attributes of
gid at ib_ah_attr.grh.sgid_index. Now we can use it.
Signed-off-by: Moni Shoua <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The hop limit value wasn't copied from the attributes when the ah was
created. This may cause packets for unconnected services to be dropped
by routers when the endpoints are not in the same subnet.
Fixes: fa417f7b520e ("IB/mlx4: Add support for IBoE")
Signed-off-by: Matan Barak <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Maximum number of EQE capacity per CQ was mistakenly exposed
as CQE. Fix that.
Fixes: 938fe83c8dcb ("net/mlx5_core: New device capabilities handling")
Signed-off-by: Leon Romanovsky <[email protected]>
Cc: <[email protected]>
Reviewed-by: Sagi Grimberg <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The h/w is designed in such a way that, if you do anything IPv6
related, a valid clip entry must be there. So take a clip reference
before creating IPv6 listening servers, and then, if we fail to
create the server, release the clip entry.
Signed-off-by: Hariprasad Shenai <[email protected]>
Acked-by: Steve Wise <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Hariprasad Shenai <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Commit c5dfb000b904 ("iw_cxgb4: Pass qid range to user space driver")
from Dec 11, 2015, leads to the following static checker warning:
drivers/infiniband/hw/cxgb4/device.c:857 c4iw_rdev_open()
warn: variable dereferenced before check 'rdev->status_page'
Also, the ocqp pool was not deallocated in the error path when the
status page allocation failed. Fix that too.
Fixes: c5dfb000b904 ("iw_cxgb4: Pass qid range to user space driver")
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Hariprasad Shenai <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The issue here is that there is a cut and paste bug. When we allocate
cma_dev_group->default_ports_group we use "sizeof(*cma_dev_group->ports)"
instead of "sizeof(*cma_dev_group->default_ports_group)".
We're bumping up against the 80 character limit so I introduced a new
local pointer "ports_group" to get around that.
Fixes: 045959db65c6 ('IB/cma: Add configfs for rdma_cm')
Signed-off-by: Dan Carpenter <[email protected]>
Acked-by: Matan Barak <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
nes_reg_phys_mr() returns ERR_PTRs on error. It doesn't return NULL.
This bug has been there for a while, but we recently changed from
calling a function pointer to calling nes_reg_phys_mr() directly so now
Smatch is able to detect the bug.
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
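For readers unfamiliar with the convention, here is a userspace re-creation of the error-pointer helpers, modelled on the kernel's include/linux/err.h, plus a hypothetical reg_mr() and caller. An error pointer is never NULL, so a NULL test silently accepts it; the caller has to use IS_ERR() and PTR_ERR().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace re-creation of the kernel helpers, for illustration only. */
#define MAX_ERRNO 4095

void *ERR_PTR(long error)
{
    return (void *)error;
}

long PTR_ERR(const void *ptr)
{
    return (long)ptr;
}

int IS_ERR(const void *ptr)
{
    /* The top MAX_ERRNO addresses are reserved for encoded errnos. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical registration routine that reports failure via ERR_PTR. */
static int dummy_mr;

void *reg_mr(int fail)
{
    return fail ? ERR_PTR(-ENOMEM) : (void *)&dummy_mr;
}

/* Correct caller: tests IS_ERR(), never "!mr". */
int setup_mr(int fail)
{
    void *mr = reg_mr(fail);

    if (IS_ERR(mr))
        return (int)PTR_ERR(mr);
    return 0;
}
```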
|
|
The current QP creation code is problematic when ipoib is used to support
NFS and NFS needs to do I/O for paging purposes. In that case, the
GFP_KERNEL allocation in qib_qp.c causes a deadlock in tight memory
situations.
This fix adds support for creating a queue pair with the GFP_NOIO flag,
for connected mode only, so that queue pair creation fails cleanly in
those situations.
Cc: <[email protected]> # 3.16+
Reviewed-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Vinit Agnihotri <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The attribute ID was declared as an int while the value should really be
a big-endian 16-bit value.
Fixes: 35c4cbb17811 ("IB/core: Create get_perf_mad function in sysfs.c")
Reported-by: Bart Van Assche <[email protected]>
Signed-off-by: Ira Weiny <[email protected]>
Reviewed-by: Bart Van Assche <[email protected]>
Reviewed-by: Christoph Lameter <[email protected]>
Reviewed-by: Hal Rosenstock <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
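Why the type matters can be shown with a userspace sketch that uses htons()/ntohs() in place of the kernel's cpu_to_be16()/be16_to_cpu(). The helper names and the value 0x0012 are illustrative assumptions. Keeping the field in a 16-bit big-endian type makes the wire layout independent of host byte order.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store an already big-endian 16-bit attribute ID into a wire buffer. */
void put_attr_id(uint8_t *wire, uint16_t attr_id_be)
{
    memcpy(wire, &attr_id_be, sizeof(attr_id_be));
}

/* Read it back and convert to host byte order. */
uint16_t get_attr_id(const uint8_t *wire)
{
    uint16_t be;

    memcpy(&be, wire, sizeof(be));
    return ntohs(be);
}
```

Callers convert once, at the boundary, for example put_attr_id(buf, htons(0x0012)), and the most significant byte always goes out first.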
|
|
Recently Doug Ledford reported a deadlock happening
between ocrdma-load sequence and NetworkManager service
issuing "open" on be2net interface.
The deadlock happens when any be2net hook (e.g. open/close) is called
in parallel to insmod ocrdma.ko.
A. be2net is sending administrative open/close event to ocrdma holding
device_list_mutex. It does this from ndo_open/ndo_stop hooks of be2net.
So sequence of locks is rtnl_lock---> device_list lock
B. When new ocrdma roce device gets registered, infiniband stack now
takes rtnl_lock in ib_register_device() in GID initialization routines.
So sequence of locks in this path is device_list lock ---> rtnl_lock.
This improper locking sequence causes deadlock.
In order to resolve the above deadlock condition, ocrdma introduced a
patch to stop listening to administrative open/close events generated from
be2net driver. It now depends on link-state-change async-event generated from
CNA. This change leaves behind dead code which used to generate administrative
open/close events. This patch cleans up all that dead code from be2net.
Reported-by: Doug Ledford <[email protected]>
CC: Sathya Perla <[email protected]>
Signed-off-by: Padmanabh Ratnakar <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Signed-off-by: Devesh Sharma <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
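The lock inversion between paths A and B can be replayed deterministically with a toy model. This is not the be2net or ocrdma code: toy_lock and its helpers are hypothetical, and a failed trylock stands in for "would block forever". The sketch only illustrates why opposite acquisition orders are unsafe; it does not reproduce the fix, which removes the need for one of the locks.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock: a failed trylock stands in for a context that would block. */
struct toy_lock {
    bool held;
};

bool toy_trylock(struct toy_lock *l)
{
    if (l->held)
        return false;
    l->held = true;
    return true;
}

void toy_unlock(struct toy_lock *l)
{
    l->held = false;
}

/* Path A takes rtnl then device_list; path B takes them in reverse. */
bool abba_deadlocks(void)
{
    struct toy_lock rtnl = { false }, device_list = { false };
    bool a_first = toy_trylock(&rtnl);
    bool b_first = toy_trylock(&device_list);
    bool a_second = toy_trylock(&device_list);  /* held by B */
    bool b_second = toy_trylock(&rtnl);         /* held by A */

    /* Each side holds one lock and waits for the other: nobody can move. */
    return a_first && b_first && !a_second && !b_second;
}

/* Both paths take rtnl first: B waits while holding nothing, A finishes. */
bool same_order_deadlocks(void)
{
    struct toy_lock rtnl = { false }, device_list = { false };
    bool a_first = toy_trylock(&rtnl);
    bool b_first = toy_trylock(&rtnl);          /* B waits here, lock-free */
    bool a_second = toy_trylock(&device_list);  /* A is not blocked */
    bool a_stuck = a_first && !a_second;

    toy_unlock(&device_list);
    toy_unlock(&rtnl);
    return a_stuck || b_first;
}
```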
|
|
Recently Doug Ledford reported a deadlock happening
between ocrdma-load sequence and NetworkManager service
issuing "open" on be2net interface.
The deadlock happens when any be2net hook (e.g. open/close) is called
in parallel to insmod ocrdma.ko.
A. be2net is sending administrative open/close event to ocrdma holding
device_list_mutex. It does this from ndo_open/ndo_stop hooks of be2net.
So sequence of locks is rtnl_lock---> device_list lock
B. When new ocrdma roce device gets registered, infiniband stack now
takes rtnl_lock in ib_register_device() in GID initialization routines.
So sequence of locks in this path is device_list lock ---> rtnl_lock.
This improper locking sequence causes deadlock.
With this patch we stop using administrative open and close events
injected by be2net driver. These events were used to dispatch PORT_ACTIVE
and PORT_ERROR events to the IB-stack. This patch implements logic
to receive async-link-events generated from CNA whenever link-state-change
is detected. From now on, these async-events will be used to dispatch
PORT_ACTIVE and PORT_ERROR events to IB-stack.
Depending on async-events from CNA removes the need to hold device-list-mutex
and thus breaks the busy-wait scenario.
Reported-by: Doug Ledford <[email protected]>
CC: Sathya Perla <[email protected]>
Signed-off-by: Padmanabh Ratnakar <[email protected]>
Signed-off-by: Selvin Xavier <[email protected]>
Signed-off-by: Devesh Sharma <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Dispatch only port event to IB stack when port state changes.
Don't explicitly modify qps to error. Let application listen to
port events on async event queue or let QP fail with retry-exceeded
completion error.
Signed-off-by: Padmanabh Ratnakar <[email protected]>
Signed-off-by: Devesh Sharma <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
vlan-id is wrongly set to 0 when PFC is enabled.
Set the vlan-id configured by the user in the QP parameters.
In case a vlan interface is not used, flash a warning to the
user to configure vlan, and assign vlan-id 0 in the qp params.
Fixes: dbf727de7440 ('IB/core: Use GID table in AH creation and dmac resolution')
Cc: Matan Barak <[email protected]>
Signed-off-by: Devesh Sharma <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
cma_validate_port wrongly assumed that Ethernet devices are RoCE
devices and thus their ndev should be matched in the GID table.
This broke the iWarp support. Fix that by matching the ndev only if
we work on a RoCE port.
Cc: <[email protected]> # 4.4.x-
Fixes: abae1b71dd37 ('IB/cma: cma_validate_port should verify the port
and netdevice')
Reported-by: Hariprasad Shenai <[email protected]>
Tested-by: Hariprasad Shenai <[email protected]>
Signed-off-by: Matan Barak <[email protected]>
Reviewed-by: Steve Wise <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The code produces the following trace:
[1750924.419007] general protection fault: 0000 [#3] SMP
[1750924.420364] Modules linked in: nfnetlink autofs4 rpcsec_gss_krb5 nfsv4
dcdbas rfcomm bnep bluetooth nfsd auth_rpcgss nfs_acl dm_multipath nfs lockd
scsi_dh sunrpc fscache radeon ttm drm_kms_helper drm serio_raw parport_pc
ppdev i2c_algo_bit lpc_ich ipmi_si ib_mthca ib_qib dca lp parport ib_ipoib
mac_hid ib_cm i3000_edac ib_sa ib_uverbs edac_core ib_umad ib_mad ib_core
ib_addr tg3 ptp dm_mirror dm_region_hash dm_log psmouse pps_core
[1750924.420364] CPU: 1 PID: 8401 Comm: python Tainted: G D
3.13.0-39-generic #66-Ubuntu
[1750924.420364] Hardware name: Dell Computer Corporation PowerEdge
860/0XM089, BIOS A04 07/24/2007
[1750924.420364] task: ffff8800366a9800 ti: ffff88007af1c000 task.ti:
ffff88007af1c000
[1750924.420364] RIP: 0010:[<ffffffffa0131d51>] [<ffffffffa0131d51>]
qib_mcast_qp_free+0x11/0x50 [ib_qib]
[1750924.420364] RSP: 0018:ffff88007af1dd70 EFLAGS: 00010246
[1750924.420364] RAX: 0000000000000001 RBX: ffff88007b822688 RCX:
000000000000000f
[1750924.420364] RDX: ffff88007b822688 RSI: ffff8800366c15a0 RDI:
6764697200000000
[1750924.420364] RBP: ffff88007af1dd78 R08: 0000000000000001 R09:
0000000000000000
[1750924.420364] R10: 0000000000000011 R11: 0000000000000246 R12:
ffff88007baa1d98
[1750924.420364] R13: ffff88003ecab000 R14: ffff88007b822660 R15:
0000000000000000
[1750924.420364] FS: 00007ffff7fd8740(0000) GS:ffff88007fc80000(0000)
knlGS:0000000000000000
[1750924.420364] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1750924.420364] CR2: 00007ffff597c750 CR3: 000000006860b000 CR4:
00000000000007e0
[1750924.420364] Stack:
[1750924.420364] ffff88007b822688 ffff88007af1ddf0 ffffffffa0132429
000000007af1de20
[1750924.420364] ffff88007baa1dc8 ffff88007baa0000 ffff88007af1de70
ffffffffa00cb313
[1750924.420364] 00007fffffffde88 0000000000000000 0000000000000008
ffff88003ecab000
[1750924.420364] Call Trace:
[1750924.420364] [<ffffffffa0132429>] qib_multicast_detach+0x1e9/0x350
[ib_qib]
[1750924.568035] [<ffffffffa00cb313>] ? ib_uverbs_modify_qp+0x323/0x3d0
[ib_uverbs]
[1750924.568035] [<ffffffffa0092d61>] ib_detach_mcast+0x31/0x50 [ib_core]
[1750924.568035] [<ffffffffa00cc213>] ib_uverbs_detach_mcast+0x93/0x170
[ib_uverbs]
[1750924.568035] [<ffffffffa00c61f6>] ib_uverbs_write+0xc6/0x2c0 [ib_uverbs]
[1750924.568035] [<ffffffff81312e68>] ? apparmor_file_permission+0x18/0x20
[1750924.568035] [<ffffffff812d4cd3>] ? security_file_permission+0x23/0xa0
[1750924.568035] [<ffffffff811bd214>] vfs_write+0xb4/0x1f0
[1750924.568035] [<ffffffff811bdc49>] SyS_write+0x49/0xa0
[1750924.568035] [<ffffffff8172f7ed>] system_call_fastpath+0x1a/0x1f
[1750924.568035] Code: 66 2e 0f 1f 84 00 00 00 00 00 31 c0 5d c3 66 2e 0f 1f
84 00 00 00 00 00 66 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 8b 7f 10
<f0> ff 8f 40 01 00 00 74 0e 48 89 df e8 8e f8 06 e1 5b 5d c3 0f
[1750924.568035] RIP [<ffffffffa0131d51>] qib_mcast_qp_free+0x11/0x50
[ib_qib]
[1750924.568035] RSP <ffff88007af1dd70>
[1750924.650439] ---[ end trace 73d5d4b3f8ad4851 ]
The fix is to note the qib_mcast_qp that was found. If none is found, then
return EINVAL indicating the error.
Cc: <[email protected]>
Reviewed-by: Dennis Dalessandro <[email protected]>
Reported-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Mike Marciniszyn <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
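The "note what was found, otherwise return EINVAL" pattern can be sketched on a plain singly linked list. The types and the function are hypothetical and much simpler than the qib multicast structures. The point is that the code after the loop only touches the entry if the search actually found one.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical list node standing in for an attached multicast QP. */
struct mcast_qp {
    int qpn;
    struct mcast_qp *next;
};

/*
 * Unlink the node with the given qpn. The match is recorded in "found";
 * if nothing matched, report -EINVAL instead of acting on a stale pointer.
 */
int mcast_detach(struct mcast_qp **head, int qpn, struct mcast_qp **removed)
{
    struct mcast_qp *found = NULL;

    for (struct mcast_qp **pp = head; *pp; pp = &(*pp)->next) {
        if ((*pp)->qpn == qpn) {
            found = *pp;
            *pp = found->next;
            break;
        }
    }
    if (!found)
        return -EINVAL;

    found->next = NULL;
    *removed = found;
    return 0;
}
```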
|
|
ipoib_mcast_restart_task calls ipoib_mcast_remove_list with the
parameter mcast->dev. That mcast is a temporary (used as an iterator)
variable that may be uninitialized.
There is no need to send the variable dev to the function, as each mcast
has its dev as a member in the mcast struct.
This causes the following panic:
RIP: 0010: ipoib_mcast_leave+0x6d/0xf0 [ib_ipoib]
RSP: 0018: EFLAGS: 00010246
RAX: f0201 RBX: 24e00 RCX: 00000
....
....
Stack:
Call Trace:
ipoib_mcast_remove_list+0x3a/0x70 [ib_ipoib]
ipoib_mcast_restart_task+0x3bb/0x520 [ib_ipoib]
process_one_work+0x164/0x470
worker_thread+0x11d/0x420
...
Fixes: 5a0e81f6f483 ('IB/IPoIB: factor out common multicast list removal code')
Signed-off-by: Erez Shitrit <[email protected]>
Reported-by: Doron Tsur <[email protected]>
Reviewed-by: Christoph Lameter <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
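A reduced sketch of the interface change, using hypothetical types rather than the ipoib structures: the removal helper takes only the list and reads the device from each entry, so the caller never has to dereference its loop iterator after the loop, where it may not point at a valid entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device and multicast entry; each entry knows its device. */
struct toy_dev {
    int leaves;             /* number of groups left on this device */
};

struct toy_mcast {
    struct toy_dev *dev;
    struct toy_mcast *next;
};

/* Takes only the list; the device comes from each entry itself. */
int mcast_remove_list(struct toy_mcast *list)
{
    int removed = 0;

    for (struct toy_mcast *m = list; m; m = m->next) {
        m->dev->leaves++;
        removed++;
    }
    return removed;
}
```

With this signature an empty list is harmless, because no device pointer is needed to make the call.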
|
|
Declare that we support remote invalidation in case we are:
1. using fastreg method
2. always registering memory
Detect the invalidated rkey from the work completion info so we
won't invalidate it locally. The spec mandates that we must not rely
on the target remotely invalidating our rkey, so we must check it upon
a receive (scsi response) completion.
Signed-off-by: Jenny Derzhavetz <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
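The receive-side check described above can be sketched as follows. The structs, the flag and the function are hypothetical stand-ins for the verbs work completion and the iser registration descriptor, and treating an unexpected rkey as an error is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define WC_WITH_INVALIDATE (1u << 0)    /* stand-in for the completion flag */

/* Hypothetical, reduced work completion. */
struct toy_wc {
    unsigned int flags;
    uint32_t invalidate_rkey;
};

/* Hypothetical registration state for one task. */
struct toy_mr {
    uint32_t rkey;
    bool mr_valid;          /* true while the registration is still valid */
};

/*
 * On a response completion: if the peer invalidated our rkey, mark the
 * registration as no longer valid so no local invalidate is posted.
 * If the peer did not, the registration stays valid and the local
 * invalidate is still required.
 */
int check_remote_inv(const struct toy_wc *wc, struct toy_mr *mr)
{
    if (!(wc->flags & WC_WITH_INVALIDATE))
        return 0;
    if (wc->invalidate_rkey != mr->rkey)
        return -EINVAL;     /* assumption: an unknown rkey is an error */
    mr->mr_valid = false;
    return 0;
}
```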
|
|
When we enable remote invalidate support we won't want to perform
local invalidates at the same time we do today, but we still need
to get new rkeys. So, decouple the rkey update from the local
invalidate and tie it to memory reg instead.
Signed-off-by: Sagi Grimberg <[email protected]>
Signed-off-by: Jenny Derzhavetz <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
We'll use remote invalidate, according to negotiation result
during connection establishment. If the initiator declared that
it supports the remote invalidate exception and the local HCA
supports IB_DEVICE_MEM_MGT_EXTENSIONS then the target will
use IB_WR_SEND_WITH_INV with the correct rkey for the response.
Signed-off-by: Jenny Derzhavetz <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
iser target does not support zero based virtual addresses and
send with invalidate, so it should declare that it doesn't.
Signed-off-by: Jenny Derzhavetz <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
We don't need iser_proto.h anymore, remove it and
move (non-protocol) declarations to ib_isert.h
Signed-off-by: Sagi Grimberg <[email protected]>
Signed-off-by: Jenny Derzhavetz <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The iser RDMA_CM negotiation protocol is shared by
the initiator and the target, so have a shared header
for the defines and structure. Move relevant items from
the initiator and target headers.
Signed-off-by: Sagi Grimberg <[email protected]>
Signed-off-by: Jenny Derzhavetz <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
This parameter is described as "is mr valid indicator".
In other words, it indicates whether memory registration
is valid or not. So intuitive values would be:
mr_valid=True, when memory registration is valid and
mr_valid=False otherwise.
Signed-off-by: Jenny Derzhavetz <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
When all the task data is sent as immediate data, we are
allowed to use the local_dma_lkey as it is not sent to
the wire.
Signed-off-by: Jenny Derzhavetz <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
We have in iser iser_sg_to_page_vec which has exactly
the same role as ib_sg_to_pages. Customize the page_vec
to hold a fake MR so we can reuse ib_sg_to_pages.
Signed-off-by: Sagi Grimberg <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
Destroy workqueue on transport register error, also
release kmem cache on workqueue allocation error.
Signed-off-by: Roi Dayan <[email protected]>
Signed-off-by: Sagi Grimberg <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
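The unwinding order implied by the message can be sketched with the usual goto-label idiom. Everything here is hypothetical: the three steps stand for creating the kmem cache, allocating the workqueue and registering the transport, and fail_at only exists to exercise each error path.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical record of which resources are currently live. */
struct init_state {
    int cache;
    int wq;
    int registered;
};

/* Hypothetical setup step that fails on request. */
int do_step(int should_fail)
{
    return should_fail ? -ENOMEM : 0;
}

/* Each failure releases exactly what was set up before it, in reverse. */
int init_sketch(struct init_state *s, int fail_at)
{
    int err;

    err = do_step(fail_at == 1);        /* create the cache */
    if (err)
        return err;
    s->cache = 1;

    err = do_step(fail_at == 2);        /* allocate the workqueue */
    if (err)
        goto err_cache;
    s->wq = 1;

    err = do_step(fail_at == 3);        /* register the transport */
    if (err)
        goto err_wq;
    s->registered = 1;
    return 0;

err_wq:
    s->wq = 0;                          /* destroy the workqueue */
err_cache:
    s->cache = 0;                       /* release the cache */
    return err;
}
```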
|
|
This mmu_notifier_ops structure is never modified, so declare it as
const, like the other mmu_notifier_ops structures.
Done with the help of Coccinelle.
Signed-off-by: Julia Lawall <[email protected]>
Reviewed-by: Haggai Eran <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|
|
The iser_reg_ops structures are never modified, so declare them as const.
Done with the help of Coccinelle.
Signed-off-by: Julia Lawall <[email protected]>
Acked-by: Sagi Grimberg <[email protected]>
Signed-off-by: Doug Ledford <[email protected]>
|