Age | Commit message | Author | Files | Lines
2024-01-07 | svcrdma: Move svc_rdma_read_info::ri_pageno to struct svc_rdma_recv_ctxt | Chuck Lever | 2 | -12/+10
Further clean up: move the page index field into svc_rdma_recv_ctxt. Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | svcrdma: Start moving fields out of struct svc_rdma_read_info | Chuck Lever | 2 | -31/+30
Since the request's svc_rdma_recv_ctxt will stay around for the duration of the RDMA Read operation, the contents of struct svc_rdma_read_info can reside in the request's svc_rdma_recv_ctxt rather than being allocated separately. This will eventually save a call to kmalloc() in a hot path. Start this clean-up by moving the Read chunk's svc_rdma_chunk_ctxt. Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | svcrdma: Move struct svc_rdma_chunk_ctxt to svc_rdma.h | Chuck Lever | 2 | -18/+15
Prepare for nestling these into the send and recv ctxts so they no longer have to be allocated dynamically. Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | svcrdma: Remove the svc_rdma_chunk_ctxt::cc_rdma field | Chuck Lever | 1 | -2/+0
In every instance, the pointer address in that field is now available by other means. Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | svcrdma: Pass a pointer to the transport to svc_rdma_cc_release() | Chuck Lever | 1 | -6/+7
Enable the eventual removal of the svc_rdma_chunk_ctxt::cc_rdma field. Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | svcrdma: Explicitly pass the transport to svc_rdma_post_chunk_ctxt() | Chuck Lever | 1 | -5/+5
Enable the eventual removal of the svc_rdma_chunk_ctxt::cc_rdma field. Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | svcrdma: Explicitly pass the transport into Read chunk I/O paths | Chuck Lever | 1 | -22/+36
Enable the eventual removal of the svc_rdma_chunk_ctxt::cc_rdma field. Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | svcrdma: Explicitly pass the transport into Write chunk I/O paths | Chuck Lever | 1 | -1/+4
Enable the eventual removal of the svc_rdma_chunk_ctxt::cc_rdma field. Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | svcrdma: Acquire the svcxprt_rdma pointer from the CQ context | Chuck Lever | 1 | -2/+3
Enable the removal of the svc_rdma_chunk_ctxt::cc_rdma field in a subsequent patch. Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | svcrdma: Reduce size of struct svc_rdma_rw_ctxt | Chuck Lever | 1 | -4/+8
SG_CHUNK_SIZE is 128, making struct svc_rdma_rw_ctxt + the first SGL array more than 4200 bytes in length, pushing the memory allocation well into order 1. Even so, the RDMA rw core doesn't seem to use more than max_send_sge entries in that array (typically 32 or less), so that is all wasted space. Signed-off-by: Chuck Lever <[email protected]>
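The size argument in this commit can be checked with a little arithmetic. The following is only a user-space sketch: the 4096-byte page, the 32-byte scatterlist entry, and the 128-byte header are assumptions chosen for illustration (real values depend on architecture and kernel configuration), and the toy_ names are hypothetical rather than kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only; real sizes depend on arch and config. */
#define TOY_PAGE_SIZE   4096u  /* assumed 4 KiB pages */
#define TOY_SG_ENTRY    32u    /* assumed size of one scatterlist entry */
#define TOY_CTXT_HEADER 128u   /* assumed size of the non-SGL part of the ctxt */

/* Smallest order such that (TOY_PAGE_SIZE << order) >= size. */
static unsigned int toy_alloc_order(size_t size)
{
	unsigned int order = 0;
	size_t span = TOY_PAGE_SIZE;

	assert(size > 0);
	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/* Total bytes for a ctxt that embeds nents first-chunk SGL entries. */
static size_t toy_ctxt_size(unsigned int nents)
{
	return TOY_CTXT_HEADER + (size_t)nents * TOY_SG_ENTRY;
}
```

Under these assumptions, 128 embedded entries give 4224 bytes, which no longer fits in a single page and so needs an order-1 allocation, while 32 entries give 1152 bytes, which fits in order 0.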
2024-01-07 | svcrdma: Update some svcrdma DMA-related tracepoints | Chuck Lever | 2 | -15/+16
A send/recv_ctxt already records transport-related information in the cq.id, thus there is no need to record the IP addresses of the transport endpoints. Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | svcrdma: DMA error tracepoints should report completion IDs | Chuck Lever | 2 | -41/+42
Update the DMA error flow tracepoints to report the completion ID of the failing context. This ties the wait/failure to a particular operation or request, which is more useful than knowing only the failing transport. Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | svcrdma: SQ error tracepoints should report completion IDs | Chuck Lever | 3 | -26/+35
Update the Send Queue's error flow tracepoints to report the completion ID of the waiting or failing context. This ties the wait/failure to a particular operation or request, which is a little more useful than knowing only the transport that is about to close. Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | rpcrdma: Introduce a simple cid tracepoint class | Chuck Lever | 5 | -71/+30
De-duplicate some code, making it easier to add new tracepoints that report only a completion ID. Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | svcrdma: Add lockdep class keys for transport locks | Chuck Lever | 1 | -0/+6
Two svcrdma-related transport locks can become quite contended. Collate their use and make them easy to find in /proc/lock_stat for better observability. Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | svcrdma: Clean up locking | Chuck Lever | 1 | -2/+2
There's no need to protect llist_entry() with a spin lock. Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | svcrdma: Add an async version of svc_rdma_write_info_free() | Chuck Lever | 1 | -1/+11
DMA unmapping can take quite some time, so it should not be handled in a single-threaded completion handler. Defer releasing write_info structs to the recently-added workqueue. With this patch, DMA unmapping can be handled in parallel, and it does not cause head-of-queue blocking of Write completions. Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | svcrdma: Add an async version of svc_rdma_send_ctxt_put() | Chuck Lever | 2 | -9/+27
DMA unmapping can take quite some time, so it should not be handled in a single-threaded completion handler. Defer releasing send_ctxts to the recently-added workqueue. With this patch, DMA unmapping can be handled in parallel, and it does not cause head-of-queue blocking of Send completions. Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | svcrdma: Add a utility workqueue to svcrdma | Chuck Lever | 3 | -8/+26
To handle work in the background, set up an UNBOUND workqueue for svcrdma. Subsequent patches will make use of it. Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | svcrdma: Pre-allocate svc_rdma_recv_ctxt objects | Chuck Lever | 1 | -11/+21
The original reason for allocating svc_rdma_recv_ctxt objects during Receive completion was to ensure the objects were allocated on the NUMA node closest to the underlying IB device. Since commit c5d68d25bd6b ("svcrdma: Clean up allocation of svc_rdma_recv_ctxt"), however, the device's favored node is explicitly passed to the memory allocator. To enable switching Receive completion to soft IRQ context, move memory allocation out of completion handling, since it can be costly, and it can sleep. A limited number of objects is now allocated at "accept" time. Signed-off-by: Chuck Lever <[email protected]>
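The idea of allocating a limited number of objects at "accept" time, so that the completion path never calls the allocator, can be sketched in user space as follows. This is a single-threaded illustration with hypothetical toy_ names and malloc() standing in for the kernel allocator; the actual svcrdma code uses a lockless list and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a receive context. */
struct toy_ctxt {
	struct toy_ctxt *next;
};

/* Hypothetical free list owned by one transport. */
struct toy_pool {
	struct toy_ctxt *free;
};

/* Return an object to the free list; never allocates. */
static void toy_pool_put(struct toy_pool *pool, struct toy_ctxt *ctxt)
{
	ctxt->next = pool->free;
	pool->free = ctxt;
}

/* Setup path (may block): allocate up to count objects; returns how many. */
static unsigned int toy_pool_fill(struct toy_pool *pool, unsigned int count)
{
	unsigned int done;

	for (done = 0; done < count; done++) {
		struct toy_ctxt *ctxt = malloc(sizeof(*ctxt));

		if (!ctxt)
			break;
		toy_pool_put(pool, ctxt);
	}
	return done;
}

/* Hot path: take a pre-allocated object, or NULL if the pool is empty. */
static struct toy_ctxt *toy_pool_get(struct toy_pool *pool)
{
	struct toy_ctxt *ctxt = pool->free;

	if (ctxt)
		pool->free = ctxt->next;
	return ctxt;
}

/* Teardown: free whatever is still on the list. */
static void toy_pool_destroy(struct toy_pool *pool)
{
	struct toy_ctxt *ctxt;

	while ((ctxt = toy_pool_get(pool)) != NULL)
		free(ctxt);
}
```

The point of the split is that only toy_pool_fill() touches the allocator, so a caller running in a context that must not sleep can restrict itself to toy_pool_get() and toy_pool_put().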
2024-01-07 | svcrdma: Eliminate allocation of recv_ctxt objects in backchannel | Chuck Lever | 3 | -22/+23
The svc_rdma_recv_ctxt free list uses a lockless list to avoid the need for a spin lock in the fast path. llist_del_first(), which is used by svc_rdma_recv_ctxt_get(), requires serialization, however, when there are multiple list producers that are unserialized. I mistakenly thought there was only one caller of svc_rdma_recv_ctxt_get() (svc_rdma_refresh_recvs()), thus explicit serialization would not be necessary. But there is another caller: svc_rdma_bc_sendto(), and these two are not serialized against each other. I haven't seen ill effects that I could directly ascribe to a lack of serialization. It's just an observation based on code audit. When DMA-mapping before sending a Reply, the passed-in struct svc_rdma_recv_ctxt is used only for its write and reply PCLs. These are currently always empty in the backchannel case. So, instead of passing a full svc_rdma_recv_ctxt object to svc_rdma_map_reply_msg(), let's pass in just the Write and Reply PCLs. This change makes it unnecessary for the backchannel to acquire a dummy svc_rdma_recv_ctxt object when sending an RPC Call. The need for svc_rdma_recv_ctxt free list serialization is now completely avoided. Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | NFSv4, NFSD: move enum nfs_cb_opnum4 to include/linux/nfs4.h | ChenXiaoSong | 3 | -44/+23
The callback operations enum is defined in both the client and the server; move it to a common header file. Signed-off-by: ChenXiaoSong <[email protected]> Acked-by: Anna Schumaker <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | nfsd: remove unnecessary NULL check | Dan Carpenter | 1 | -1/+1
We check "state" for NULL on the previous line so it can't be NULL here. No need to check again. Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/r/[email protected]/ Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | SUNRPC: Remove RQ_SPLICE_OK | Chuck Lever | 4 | -15/+0
This flag is no longer used. Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | NFSD: Modify NFSv4 to use nfsd_read_splice_ok() | Chuck Lever | 3 | -7/+14
Avoid the use of an atomic bitop, and prepare for adding a run-time switch for using splice reads. Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | NFSD: Replace RQ_SPLICE_OK in nfsd_read() | Chuck Lever | 2 | -1/+26
RQ_SPLICE_OK is a bit of a layering violation. Also, a subsequent patch is going to provide a mechanism for always disabling splice reads. Splicing is an issue only for NFS READs, so refactor nfsd_read() to check the auth type directly instead of relying on an rq_flag setting. The new helper will be added into the NFSv4 read path in a subsequent patch. Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | SUNRPC: Add a server-side API for retrieving an RPC's pseudoflavor | Chuck Lever | 3 | -1/+28
NFSD will use this new API to determine whether nfsd_splice_read is safe to use. This avoids the need to add a dependency to NFSD for CONFIG_SUNRPC_GSS. Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | NFSD: Document lack of f_pos_lock in nfsd_readdir() | Chuck Lever | 1 | -3/+17
Al Viro notes that normal system calls hold f_pos_lock when calling ->iterate_shared and ->llseek; however nfsd_readdir() does not take that mutex when calling these methods. It should be safe however because the struct file acquired by nfsd_readdir() is not visible to other threads. Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | NFSD: Remove nfsd_drc_gc() tracepoint | Chuck Lever | 2 | -27/+1
This trace point was for debugging the DRC's garbage collection. In the field it's just noise. Reviewed-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | NFSD: Make the file_delayed_close workqueue UNBOUND | Chuck Lever | 1 | -1/+1
workqueue: nfsd_file_delayed_close [nfsd] hogged CPU for >13333us 8 times, consider switching to WQ_UNBOUND
There's no harm in closing a cached file descriptor on another core. Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | NFSD: use read_seqbegin() rather than read_seqbegin_or_lock() | Oleg Nesterov | 1 | -4/+3
The usage of read_seqbegin_or_lock() in nfsd_copy_write_verifier() is wrong. "seq" is always even and thus "or_lock" has no effect, this code can never take ->writeverf_lock for writing. I guess this is fine, nfsd_copy_write_verifier() just copies 8 bytes and nfsd_reset_write_verifier() is supposed to be very rare operation so we do not need the adaptive locking in this case. Yet the code looks wrong and sub-optimal, it can use read_seqbegin() without changing the behaviour. [ cel: Note also that it eliminates this Sparse warning: fs/nfsd/nfssvc.c:360:6: warning: context imbalance in 'nfsd_copy_write_verifier' - different lock contexts for basic block ] Signed-off-by: Oleg Nesterov <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Reviewed-by: NeilBrown <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
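The plain begin/retry reader loop that this commit switches to can be modelled in user space as shown here. This is a single-threaded sketch with hypothetical toy_ names: it omits the memory barriers and the spinning that the kernel's read_seqbegin() and read_seqretry() perform, and it only illustrates why an 8-byte copy needs nothing more than a sequence check.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical single-threaded model of a sequence lock; not the kernel API. */
struct toy_seqlock {
	unsigned int sequence;   /* odd while a writer is in progress */
};

struct toy_verifier_state {
	struct toy_seqlock lock;
	unsigned char verf[8];
};

/* Sample the sequence count at the start of a read section. */
static unsigned int toy_read_begin(const struct toy_seqlock *sl)
{
	/* A real reader would wait here until the count is even. */
	assert((sl->sequence & 1u) == 0);
	return sl->sequence;
}

/* True if a writer ran since toy_read_begin() returned start. */
static bool toy_read_retry(const struct toy_seqlock *sl, unsigned int start)
{
	return sl->sequence != start;
}

/* Rare writer: bump the count before and after the update. */
static void toy_reset_verifier(struct toy_verifier_state *st,
			       const unsigned char new_verf[8])
{
	st->lock.sequence++;            /* now odd: write in progress */
	memcpy(st->verf, new_verf, 8);
	st->lock.sequence++;            /* even again: write complete */
}

/* Reader: copy the 8 bytes and repeat if a writer interfered. */
static void toy_copy_verifier(struct toy_verifier_state *st,
			      unsigned char out[8])
{
	unsigned int seq;

	do {
		seq = toy_read_begin(&st->lock);
		memcpy(out, st->verf, 8);
	} while (toy_read_retry(&st->lock, seq));
}
```

Because the reader never takes the lock, the loop has a single lock context on every path, which is consistent with the Sparse warning going away.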
2024-01-07 | nfsd: new Kconfig option for legacy client tracking | Jeff Layton | 3 | -34/+85
We've had a number of attempts at different NFSv4 client tracking methods over the years, but now nfsdcld has emerged as the clear winner since the others (recoverydir and the usermodehelper upcall) are problematic. As a case in point, the recoverydir backend uses MD5 hashes to encode long form clientid strings, which means that nfsd repeatedly gets dinged on FIPS audits, since MD5 isn't considered secure. Its use of MD5 is not cryptographically significant, so there is no danger there, but allowing us to compile that out allows us to sidestep the issue entirely. As a prelude to eventually removing support for these client tracking methods, add a new Kconfig option that enables them. Mark it deprecated and make it default to N. Acked-by: NeilBrown <[email protected]> Signed-off-by: Jeff Layton <[email protected]> Signed-off-by: Chuck Lever <[email protected]>
2024-01-07 | ipvlan: Remove usage of the deprecated ida_simple_xx() API | Christophe JAILLET | 1 | -6/+6
ida_alloc() and ida_free() should be preferred to the deprecated ida_simple_get() and ida_simple_remove(). This is less verbose. Note that the upper bound of ida_alloc_range() is inclusive while the one of ida_simple_get() was exclusive. So calls have been updated accordingly. Signed-off-by: Christophe JAILLET <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
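The inclusive versus exclusive upper bound mentioned above is easy to get wrong by one. The sketch below is a user-space toy, not the kernel IDA: the toy_ names are hypothetical, failure is reported as -1 rather than a kernel error code, and only the bound convention is modelled.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_IDA_MAX_IDS 64u

/* Hypothetical toy ID allocator; it only mirrors the bound conventions. */
struct toy_ida {
	bool used[TOY_IDA_MAX_IDS];
};

/* Inclusive upper bound, as with ida_alloc_range(): ID in [min, max] or -1. */
static int toy_alloc_range(struct toy_ida *ida, unsigned int min,
			   unsigned int max)
{
	unsigned int id;

	assert(max < TOY_IDA_MAX_IDS);
	for (id = min; id <= max; id++) {
		if (!ida->used[id]) {
			ida->used[id] = true;
			return (int)id;
		}
	}
	return -1;
}

/* Exclusive upper bound, as with the old ida_simple_get(): ID in [start, end). */
static int toy_simple_get(struct toy_ida *ida, unsigned int start,
			  unsigned int end)
{
	assert(end > start);
	return toy_alloc_range(ida, start, end - 1);
}
```

In this model a converted call site has to pass end - 1 as the new maximum; passing the old end unchanged would silently admit one extra ID.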
2024-01-07 | ipvlan: Fix a typo in a comment | Christophe JAILLET | 1 | -1/+1
s/diffentiate/differentiate/ Signed-off-by: Christophe JAILLET <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-07 | smb: client: stop revalidating reparse points unnecessarily | Paulo Alcantara | 3 | -81/+57
Query dir responses don't provide enough information on reparse points such as major/minor numbers and symlink targets other than reparse tags, however we don't need to unconditionally revalidate them only because they are reparse points. Instead, revalidate them only when their ctime or reparse tag has changed. For instance, Windows Server updates ctime of reparse points when their data have changed. Signed-off-by: Paulo Alcantara (SUSE) <[email protected]> Signed-off-by: Steve French <[email protected]>
2024-01-07 | cifs: Pass unbyteswapped eof value into SMB2_set_eof() | David Howells | 3 | -25/+20
Change SMB2_set_eof() to take eof in CPU order rather than as __le64, and pass it directly rather than by pointer. This moves the conversion down into SMB2_set_eof() rather than all of its callers and means we don't need to undo it for the traceline. Signed-off-by: David Howells <[email protected]> cc: Jeff Layton <[email protected]> cc: [email protected] Signed-off-by: Steve French <[email protected]>
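The calling convention described here, where the callee receives a host-order value by value and performs the little-endian conversion once, might look like this in a user-space sketch. All toy_ names and the buffer layout are hypothetical; the kernel itself would use its own byte-order helpers rather than the manual shift loop shown.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wire buffer holding a 64-bit little-endian end-of-file value. */
struct toy_eof_info {
	unsigned char end_of_file[8];
};

/* Encode a host-order value as little-endian bytes on any host. */
static void toy_put_le64(unsigned char out[8], uint64_t value)
{
	int i;

	for (i = 0; i < 8; i++)
		out[i] = (unsigned char)(value >> (8 * i));
}

/*
 * The callee takes eof in host order, by value, and converts it once.
 * It returns the host-order value to show that nothing has to be
 * converted back when the value is later needed for tracing.
 */
static uint64_t toy_set_eof(struct toy_eof_info *info, uint64_t eof)
{
	toy_put_le64(info->end_of_file, eof);
	return eof;
}
```

With this shape, callers never hold a byte-swapped copy, so there is one conversion site instead of one per caller.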
2024-01-07 | smb3: Improve exception handling in allocate_mr_list() | Markus Elfring | 1 | -2/+2
The kfree() function was called in one case by the allocate_mr_list() function during error handling even if the passed variable contained a null pointer. This issue was detected by using the Coccinelle software. Thus use another label. Signed-off-by: Markus Elfring <[email protected]> Signed-off-by: Steve French <[email protected]>
2024-01-07 | cifs: fix in logging in cifs_chan_update_iface | Shyam Prasad N | 1 | -6/+8
Recently, cifs_chan_update_iface was modified to not remove an iface if a suitable replacement was not found. With that, there were two conditionals that were exactly the same. This change removes the extra condition check. Also fix a log statement in the same function so that it prints the correct message. Signed-off-by: Shyam Prasad N <[email protected]> Signed-off-by: Steve French <[email protected]>
2024-01-07 | smb: client: handle special files and symlinks in SMB3 POSIX | Paulo Alcantara | 1 | -21/+29
Parse reparse points in SMB3 posix query info as they will be supported and required by the new specification. Signed-off-by: Paulo Alcantara (SUSE) <[email protected]> Signed-off-by: Steve French <[email protected]>
2024-01-07 | smb: client: cleanup smb2_query_reparse_point() | Paulo Alcantara | 3 | -139/+39
Use smb2_compound_op() with SMB2_OP_GET_REPARSE to get reparse point. Signed-off-by: Paulo Alcantara (SUSE) <[email protected]> Signed-off-by: Steve French <[email protected]>
2024-01-07 | smb: client: allow creating symlinks via reparse points | Paulo Alcantara | 3 | -5/+86
Add support for creating symlinks via IO_REPARSE_TAG_SYMLINK reparse points in SMB2+. These are fully supported by most SMB servers and documented in MS-FSCC. They also have the advantage of requiring fewer roundtrips, as their symlink targets can be parsed directly from CREATE responses on STATUS_STOPPED_ON_SYMLINK errors. Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Paulo Alcantara (SUSE) <[email protected]> Signed-off-by: Steve French <[email protected]>
2024-01-07 | smb: client: fix hardlinking of reparse points | Paulo Alcantara | 6 | -27/+43
The client was sending an SMB2_CREATE request without setting the OPEN_REPARSE_POINT flag, thus failing the entire hardlink operation. Fix this by setting OPEN_REPARSE_POINT in the create options for the SMB2_CREATE request when the source inode is a reparse point. Signed-off-by: Paulo Alcantara (SUSE) <[email protected]> Signed-off-by: Steve French <[email protected]>
2024-01-07 | smb: client: fix renaming of reparse points | Paulo Alcantara | 6 | -31/+55
The client was sending an SMB2_CREATE request without setting the OPEN_REPARSE_POINT flag, thus failing the entire rename operation. Fix this by setting OPEN_REPARSE_POINT in the create options for the SMB2_CREATE request when the source inode is a reparse point. Signed-off-by: Paulo Alcantara (SUSE) <[email protected]> Signed-off-by: Steve French <[email protected]>
2024-01-07 | smb: client: optimise reparse point querying | Paulo Alcantara | 6 | -31/+119
Reduce number of roundtrips to server when querying reparse points in ->query_path_info() by sending a single compound request of create+get_reparse+get_info+close. Signed-off-by: Paulo Alcantara (SUSE) <[email protected]> Signed-off-by: Steve French <[email protected]>
2024-01-07 | smb: client: allow creating special files via reparse points | Paulo Alcantara | 10 | -60/+256
Add support for creating special files (e.g. char/block devices, sockets, fifos) via NFS reparse points on SMB2+, which are fully supported by most SMB servers and documented in MS-FSCC. smb2_get_reparse_inode() creates the file with a corresponding reparse point buffer set in @iov through a single roundtrip to the server. Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Paulo Alcantara (SUSE) <[email protected]> Signed-off-by: Steve French <[email protected]>
2024-01-07 | smb: client: extend smb2_compound_op() to accept more commands | Paulo Alcantara | 2 | -384/+402
Make smb2_compound_op() accept up to MAX_COMPOUND(5) commands to be sent over a single compounded request. This will allow next commits to read and write reparse files through a single roundtrip to the server. Signed-off-by: Paulo Alcantara (SUSE) <[email protected]> Signed-off-by: Steve French <[email protected]>
2024-01-07 | smb: client: Fix minor whitespace errors and warnings | Pierre Mariani | 1 | -8/+17
Fixes no-op checkpatch errors and warnings. Signed-off-by: Pierre Mariani <[email protected]> Signed-off-by: Steve French <[email protected]>
2024-01-07 | Linux 6.7 | Linus Torvalds | 1 | -1/+1
2024-01-07 | net/sched: Remove ipt action tests | Jamal Hadi Salim | 1 | -243/+0
Commit ba24ea129126 ("net/sched: Retire ipt action") removed the ipt action but not the testcases. This patch removes the outstanding tdc tests. Signed-off-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-01-07 | Merge branch 'stmmac-per-dma-channel-interrupt' | David S. Miller | 9 | -40/+90
Swee Leong Ching says:
====================
net: stmmac: Enable Per DMA Channel interrupt
Add Per DMA Channel interrupt feature for DWXGMAC IP.
Patchset (link below) contains per DMA channel interrupt, but it was achieved.
https://lore.kernel.org/lkml/20230821203328.GA2197059- [email protected]/t/#m849b529a642e1bff89c05a07efc25d6a94c8bfb4
Some of the changes in this patchset are based on reviewer comment on patchset mentioned beforehand.
====================
Signed-off-by: David S. Miller <[email protected]>