aboutsummaryrefslogtreecommitdiff
path: root/fs/nfsd/nfssvc.c
AgeCommit message (Collapse)AuthorFilesLines
2024-10-11Merge tag 'nfs-for-6.12-2' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds1-2/+2
Pull NFS client fixes from Anna Schumaker: "Localio Bugfixes: - remove duplicated include in localio.c - fix race in NFS calls to nfsd_file_put_local() and nfsd_serv_put() - fix Kconfig for NFS_COMMON_LOCALIO_SUPPORT - fix nfsd_file tracepoints to handle NULL rqstp pointers Other Bugfixes: - fix program selection loop in svc_process_common - fix integer overflow in decode_rc_list() - prevent NULL-pointer dereference in nfs42_complete_copies() - fix CB_RECALL performance issues when using a large number of delegations" * tag 'nfs-for-6.12-2' of git://git.linux-nfs.org/projects/anna/linux-nfs: NFS: remove revoked delegation from server's delegation list nfsd/localio: fix nfsd_file tracepoints to handle NULL rqstp nfs_common: fix Kconfig for NFS_COMMON_LOCALIO_SUPPORT nfs_common: fix race in NFS calls to nfsd_file_put_local() and nfsd_serv_put() NFSv4: Prevent NULL-pointer dereference in nfs42_complete_copies() SUNRPC: Fix integer overflow in decode_rc_list() sunrpc: fix prog selection loop in svc_process_common nfs: Remove duplicated include in localio.c
2024-10-10Merge tag 'nfsd-6.12-1' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Fix NFSD bring-up / shutdown - Fix a UAF when releasing a stateid * tag 'nfsd-6.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: fix possible badness in FREE_STATEID nfsd: nfsd_destroy_serv() must call svc_destroy() even if nfsd_startup_net() failed NFSD: Mark filecache "down" if init fails
2024-10-03nfs_common: fix race in NFS calls to nfsd_file_put_local() and nfsd_serv_put()Mike Snitzer1-2/+2
Add nfs_to_nfsd_file_put_local() interface to fix race with nfsd module unload. Similarly, use RCU around nfs_open_local_fh()'s error path call to nfs_to->nfsd_serv_put(). Holding RCU ensures that NFS will safely _call and return_ from its nfs_to calls into the NFSD functions nfsd_file_put_local() and nfsd_serv_put(). Otherwise, if RCU isn't used then there is a narrow window when NFS's reference for the nfsd_file and nfsd_serv are dropped and the NFSD module could be unloaded, which could result in a crash from the return instruction for either nfs_to->nfsd_file_put_local() or nfs_to->nfsd_serv_put(). Reported-by: NeilBrown <neilb@suse.de> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23nfsd: implement server support for NFS_LOCALIO_PROGRAMMike Snitzer1-1/+22
The LOCALIO auxiliary RPC protocol consists of a single "UUID_IS_LOCAL" RPC method that allows the Linux NFS client to verify the local Linux NFS server can see the nonce (single-use UUID) the client generated and made available in nfs_common. The server expects this protocol to use the same transport as NFS and NFSACL for its RPCs. This protocol isn't part of an IETF standard, nor does it need to be considering it is Linux-to-Linux auxiliary RPC protocol that amounts to an implementation detail. The UUID_IS_LOCAL method encodes the client generated uuid_t in terms of the fixed UUID_SIZE (16 bytes). The fixed size opaque encode and decode XDR methods are used instead of the less efficient variable sized methods. The RPC program number for the NFS_LOCALIO_PROGRAM is 400122 (as assigned by IANA, see https://www.iana.org/assignments/rpc-program-numbers/ ): Linux Kernel Organization 400122 nfslocalio Signed-off-by: Mike Snitzer <snitzer@kernel.org> [neilb: factored out and simplified single localio protocol] Co-developed-by: NeilBrown <neilb@suse.de> Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23nfs_common: prepare for the NFS client to use nfsd_file for LOCALIOMike Snitzer1-0/+1
The next commit will introduce nfsd_open_local_fh() which returns an nfsd_file structure. This commit exposes LOCALIO's required NFSD symbols to the NFS client: - Make nfsd_open_local_fh() symbol and other required NFSD symbols available to NFS in a global 'nfs_to' nfsd_localio_operations struct (global access suggested by Trond, nfsd_localio_operations suggested by NeilBrown). The next commit will also introduce nfsd_localio_ops_init() that init_nfsd() will call to initialize 'nfs_to'. - Introduce nfsd_file_file() that provides access to nfsd_file's backing file. Keeps nfsd_file structure opaque to NFS client (as suggested by Jeff Layton). - Introduce nfsd_file_put_local() that will put the reference to the nfsd_file's associated nn->nfsd_serv and then put the reference to the nfsd_file (as suggested by NeilBrown). Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com> # nfs_to Suggested-by: NeilBrown <neilb@suse.de> # nfsd_localio_operations Suggested-by: Jeff Layton <jlayton@kernel.org> # nfsd_file_file Signed-off-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23SUNRPC: replace program list with program arrayNeilBrown1-19/+19
A service created with svc_create_pooled() can be given a linked list of programs and all of these will be served. Using a linked list makes it cumbersome when there are several programs that can be optionally selected with CONFIG settings. After this patch is applied, API consumers must use only svc_create_pooled() when creating an RPC service that listens for more than one RPC program. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Acked-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23nfsd: add nfsd_serv_try_get and nfsd_serv_putMike Snitzer1-0/+43
Introduce nfsd_serv_try_get and nfsd_serv_put and update the nfsd code to prevent nfsd_destroy_serv from destroying nn->nfsd_serv until any caller of nfsd_serv_try_get releases their reference using nfsd_serv_put. A percpu_ref is used to implement the interlock between nfsd_destroy_serv and any caller of nfsd_serv_try_get. This interlock is needed to properly wait for the completion of client initiated localio calls to nfsd (that are _not_ in the context of nfsd). Signed-off-by: Mike Snitzer <snitzer@kernel.org> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2024-09-23nfsd: nfsd_destroy_serv() must call svc_destroy() even if nfsd_startup_net() ↵NeilBrown1-3/+3
failed If nfsd_startup_net() fails and so ->nfsd_net_up is false, nfsd_destroy_serv() doesn't currently call svc_destroy(). It should. Fixes: 1e3577a4521e ("SUNRPC: discard sv_refcnt, and svc_get/svc_put") Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-09-20nfsd: use nfsd_v4client() in nfsd_breaker_owns_lease()NeilBrown1-2/+4
nfsd_breaker_owns_lease() currently open-codes the same test that nfsd_v4client() performs. With this patch we use nfsd_v4client() instead. Also as i_am_nfsd() is only used in combination with kthread_data(), replace it with nfsd_current_rqst() which combines the two and returns a valid svc_rqst, or NULL. The test for NULL is moved into nfsd_v4client() for code clarity. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-09-20sunrpc: allow svc threads to fail initialisation cleanlyNeilBrown1-6/+3
If an svc thread needs to perform some initialisation that might fail, it has no good way to handle the failure. Before the thread can exit it must call svc_exit_thread(), but that requires the service mutex to be held. The thread cannot simply take the mutex as that could deadlock if there is a concurrent attempt to shut down all threads (which is unlikely, but not impossible). nfsd currently call svc_exit_thread() unprotected in the unlikely event that unshare_fs_struct() fails. We can clean this up by introducing svc_thread_init_status() by which an svc thread can report whether initialisation has succeeded. If it has, it continues normally into the action loop. If it has not, svc_thread_init_status() immediately aborts the thread. svc_start_kthread() waits for either of these to happen, and calls svc_exit_thread() (under the mutex) if the thread aborted. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-09-20sunrpc: change sp_nrthreads from atomic_t to unsigned int.NeilBrown1-1/+1
sp_nrthreads is only ever accessed under the service mutex nlmsvc_mutex nfs_callback_mutex nfsd_mutex so these is no need for it to be an atomic_t. The fact that all code using it is single-threaded means that we can simplify svc_pool_victim and remove the temporary elevation of sp_nrthreads. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-09-20nfsd: don't allocate the versions array.NeilBrown1-83/+19
Instead of using kmalloc to allocate an array for storing active version info, just declare an array to the max size - it is only 5 or so. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-09-01nfsd: move nfsd_pool_stats_open into nfsctl.cNeilBrown1-7/+0
nfsd_pool_stats_open() is used in nfsctl.c, so move it there. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-07-08nfsd: allow passing in array of thread counts via netlinkJeff Layton1-1/+11
Now that nfsd_svc can handle an array of thread counts, fix up the netlink threads interface to construct one from the netlink call and pass it through so we can start a pooled server the same way we would start a normal one. Note that any unspecified values in the array are considered zeroes, so it's possible to shut down a pooled server by passing in a short array that has only zeros, or even an empty array. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-07-08nfsd: make nfsd_svc take an array of thread countsJeff Layton1-21/+33
Now that the refcounting is fixed, rework nfsd_svc to use the same thread setup as the pool_threads interface. Have it take an array of thread counts instead of just a single value, and pass that from the netlink threads set interface. Since the new netlink interface doesn't have the same restriction as pool_threads, move the guard against shutting down all threads to write_pool_threads. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-06-25nfsd: initialise nfsd_info.mutex early.NeilBrown1-1/+0
nfsd_info.mutex can be dereferenced by svc_pool_stats_start() immediately after the new netns is created. Currently this can trigger an oops. Move the initialisation earlier before it can possibly be dereferenced. Fixes: 7b207ccd9833 ("svc: don't hold reference for poolstats, only mutex.") Reported-by: Sourabh Jain <sourabhjain@linux.ibm.com> Closes: https://lore.kernel.org/all/c2e9f6de-1ec4-4d3a-b18d-d5a6ec0814a0@linux.ibm.com/ Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-05-06NFSD: add write_version to netlink commandLorenzo Bianconi1-2/+1
Introduce write_version netlink command through a "declarative" interface. This patch introduces a change in behavior since for version-set userspace is expected to provide a NFS major/minor version list it wants to enable while all the other ones will be disabled. (procfs write_version command implements imperative interface where the admin writes +3/-3 to enable/disable a single version. Reviewed-by: Jeff Layton <jlayton@kernel.org> Tested-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-05-06NFSD: allow callers to pass in scope string to nfsd_svcJeff Layton1-2/+2
Currently admins set this by using unshare to create a new uts namespace, and then resetting the hostname. With the new netlink interface we can just pass this in directly. Prepare nfsd_svc for this change. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-05-06NFSD: move nfsd_mutex handling into nfsd_svc callersJeff Layton1-2/+2
Currently nfsd_svc holds the nfsd_mutex over the whole function. For some of the later netlink patches though, we want to do some other things to the server before starting it. Move the mutex handling into the callers. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-01nfsd: make svc_stat per-network namespace instead of globalJosef Bacik1-1/+1
The final bit of stats that is global is the rpc svc_stat. Move this into the nfsd_net struct and use that everywhere instead of the global struct. Remove the unused global struct. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-01nfsd: remove nfsd_stats, make th_cnt a global counterJosef Bacik1-2/+3
This is the last global stat, take it out of the nfsd_stats struct and make it a global part of nfsd, report it the same as always. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-01sunrpc: remove ->pg_stats from svc_programJosef Bacik1-1/+0
Now that this isn't used anywhere, remove it. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-01sunrpc: pass in the sv_stats struct through svc_create_pooledJosef Bacik1-1/+2
Since only one service actually reports the rpc stats there's not much of a reason to have a pointer to it in the svc_program struct. Adjust the svc_create_pooled function to take the sv_stats as an argument and pass the struct through there as desired instead of getting it from the svc_program->pg_stats. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-01nfsd: stop setting ->pg_stats for unused statsJosef Bacik1-5/+0
A lot of places are setting a blank svc_stats in ->pg_stats and never utilizing these stats. Remove all of these extra structs as we're not reporting these stats anywhere. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-03-01nfsd: Don't leave work of closing files to a work queueNeilBrown1-0/+2
The work of closing a file can have non-trivial cost. Doing it in a separate work queue thread means that cost isn't imposed on the nfsd threads and an imbalance can be created. This can result in files being queued for the work queue more quickly that the work queue can process them, resulting in unbounded growth of the queue and memory exhaustion. To avoid this work imbalance that exhausts memory, this patch moves all closing of files into the nfsd threads. This means that when the work imposes a cost, that cost appears where it would be expected - in the work of the nfsd thread. A subsequent patch will ensure the final __fput() is called in the same (nfsd) thread which calls filp_close(). Files opened for NFSv3 are never explicitly closed by the client and are kept open by the server in the "filecache", which responds to memory pressure, is garbage collected even when there is no pressure, and sometimes closes files when there is particular need such as for rename. These files currently have filp_close() called in a dedicated work queue, so their __fput() can have no effect on nfsd threads. This patch discards the work queue and instead has each nfsd thread call flip_close() on as many as 8 files from the filecache each time it acts on a client request (or finds there are no pending client requests). If there are more to be closed, more threads are woken. This spreads the work of __fput() over multiple threads and imposes any cost on those threads. The number 8 is somewhat arbitrary. It needs to be greater than 1 to ensure that files are closed more quickly than they can be added to the cache. It needs to be small enough to limit the per-request delays that will be imposed on clients when all threads are busy closing files. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07nfsd: rename nfsd_last_thread() to nfsd_destroy_serv()NeilBrown1-4/+8
As this function now destroys the svc_serv, this is a better name. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07SUNRPC: discard sv_refcnt, and svc_get/svc_putNeilBrown1-22/+4
sv_refcnt is no longer useful. lockd and nfs-cb only ever have the svc active when there are a non-zero number of threads, so sv_refcnt mirrors sv_nrthreads. nfsd also keeps the svc active between when a socket is added and when the first thread is started, but we don't really need a refcount for that. We can simply not destroy the svc while there are any permanent sockets attached. So remove sv_refcnt and the get/put functions. Instead of a final call to svc_put(), call svc_destroy() instead. This is changed to also store NULL in the passed-in pointer to make it easier to avoid use-after-free situations. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07svc: don't hold reference for poolstats, only mutex.NeilBrown1-22/+2
A future patch will remove refcounting on svc_serv as it is of little use. It is currently used to keep the svc around while the pool_stats file is open. Change this to get the pointer, protected by the mutex, only in seq_start, and the release the mutex in seq_stop. This means that if the nfsd server is stopped and restarted while the pool_stats file it open, then some pool stats info could be from the first instance and some from the second. This might appear odd, but is unlikely to be a problem in practice. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-01-07NFSD: use read_seqbegin() rather than read_seqbegin_or_lock()Oleg Nesterov1-4/+3
The usage of read_seqbegin_or_lock() in nfsd_copy_write_verifier() is wrong. "seq" is always even and thus "or_lock" has no effect, this code can never take ->writeverf_lock for writing. I guess this is fine, nfsd_copy_write_verifier() just copies 8 bytes and nfsd_reset_write_verifier() is supposed to be very rare operation so we do not need the adaptive locking in this case. Yet the code looks wrong and sub-optimal, it can use read_seqbegin() without changing the behaviour. [ cel: Note also that it eliminates this Sparse warning: fs/nfsd/nfssvc.c:360:6: warning: context imbalance in 'nfsd_copy_write_verifier' - different lock contexts for basic block ] Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-12-20Merge tag 'nfsd-6.7-2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Address a few recently-introduced issues * tag 'nfsd-6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: SUNRPC: Revert 5f7fc5d69f6e92ec0b38774c387f5cf7812c5806 NFSD: Revert 738401a9bd1ac34ccd5723d69640a4adbb1a4bc0 NFSD: Revert 6c41d9a9bd0298002805758216a9c44e38a8500d nfsd: hold nfsd_mutex across entire netlink operation nfsd: call nfsd_last_thread() before final nfsd_put()
2023-12-15cred: get rid of CONFIG_DEBUG_CREDENTIALSJens Axboe1-1/+0
This code is rarely (never?) enabled by distros, and it hasn't caught anything in decades. Let's kill off this legacy debug code. Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-12-15nfsd: call nfsd_last_thread() before final nfsd_put()NeilBrown1-1/+1
If write_ports_addfd or write_ports_addxprt fail, they call nfsd_put() without calling nfsd_last_thread(). This leaves nn->nfsd_serv pointing to a structure that has been freed. So remove 'static' from nfsd_last_thread() and call it when the nfsd_serv is about to be destroyed. Fixes: ec52361df99b ("SUNRPC: stop using ->sv_nrthreads as a refcount") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-11-17NFSD: Fix checksum mismatches in the duplicate reply cacheChuck Lever1-1/+9
nfsd_cache_csum() currently assumes that the server's RPC layer has been advancing rq_arg.head[0].iov_base as it decodes an incoming request, because that's the way it used to work. On entry, it expects that buf->head[0].iov_base points to the start of the NFS header, and excludes the already-decoded RPC header. These days however, head[0].iov_base now points to the start of the RPC header during all processing. It no longer points at the NFS Call header when execution arrives at nfsd_cache_csum(). In a retransmitted RPC the XID and the NFS header are supposed to be the same as the original message, but the contents of the retransmitted RPC header can be different. For example, for krb5, the GSS sequence number will be different between the two. Thus if the RPC header is always included in the DRC checksum computation, the checksum of the retransmitted message might not match the checksum of the original message, even though the NFS part of these messages is identical. The result is that, even if a matching XID is found in the DRC, the checksum mismatch causes the server to execute the retransmitted RPC transaction again. Reviewed-by: Jeff Layton <jlayton@kernel.org> Tested-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-11-17NFSD: Fix "start of NFS reply" pointer passed to nfsd_cache_update()Chuck Lever1-1/+3
The "statp + 1" pointer that is passed to nfsd_cache_update() is supposed to point to the start of the egress NFS Reply header. In fact, it does point there for AUTH_SYS and RPCSEC_GSS_KRB5 requests. But both krb5i and krb5p add fields between the RPC header's accept_stat field and the start of the NFS Reply header. In those cases, "statp + 1" points at the extra fields instead of the Reply. The result is that nfsd_cache_update() caches what looks to the client like garbage. A connection break can occur for a number of reasons, but the most common reason when using krb5i/p is a GSS sequence number window underrun. When an underrun is detected, the server is obliged to drop the RPC and the connection to force a retransmit with a fresh GSS sequence number. The client presents the same XID, it hits in the server's DRC, and the server returns the garbage cache entry. The "statp + 1" argument has been used since the oldest changeset in the kernel history repo, so it has been in nfsd_dispatch() literally since before history began. The problem arose only when the server-side GSS implementation was added twenty years ago. Reviewed-by: Jeff Layton <jlayton@kernel.org> Tested-by: Jeff Layton <jlayton@kernel.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-10-16NFSD: simplify error paths in nfsd_svc()NeilBrown1-10/+4
The error paths in nfsd_svc() are needlessly complex and can result in a final call to svc_put() without nfsd_last_thread() being called. This results in the listening sockets not being closed properly. The per-netns setup provided by nfsd_startup_new() and removed by nfsd_shutdown_net() is needed precisely when there are running threads. So we don't need nfsd_up_before. We don't need to know if it *was* up. We only need to know if any threads are left. If none are, then we must call nfsd_shutdown_net(). But we don't need to do that explicitly as nfsd_last_thread() does that for us. So simply call nfsd_last_thread() before the last svc_put() if there are no running threads. That will always do the right thing. Also discard: pr_info("nfsd: last server has exited, flushing export cache\n"); It may not be true if an attempt to start the first server failed, and it isn't particularly helpful and it simply reports normal behaviour. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-10-16NFSD: add rpc_status netlink supportLorenzo Bianconi1-0/+15
Introduce rpc_status netlink support for NFSD in order to dump pending RPC requests debugging information from userspace. Closes: https://bugzilla.linux-nfs.org/show_bug.cgi?id=366 Tested-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-10-16SUNRPC: change sp_nrthreads to atomic_tNeilBrown1-6/+5
Using an atomic_t avoids the need to take a spinlock (which can soon be removed). Choosing a thread to kill needs to be careful as we cannot set the "die now" bit atomically with the test on the count. Instead we temporarily increase the count. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-10-16SUNRPC: change how svc threads are asked to exit.NeilBrown1-1/+1
svc threads are currently stopped using kthread_stop(). This requires identifying a specific thread. However we don't care which thread stops, just as long as one does. So instead, set a flag in the svc_pool to say that a thread needs to die, and have each thread check this flag instead of calling kthread_should_stop(). The first thread to find and clear this flag then moves towards exiting. This removes an explicit dependency on sp_all_threads which will make a future patch simpler. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-09-12NFSD: fix possible oops when nfsd/pool_stats is closed.NeilBrown1-2/+3
If /proc/fs/nfsd/pool_stats is open when the last nfsd thread exits, then when the file is closed a NULL pointer is dereferenced. This is because nfsd_pool_stats_release() assumes that the pointer to the svc_serv cannot become NULL while a reference is held. This used to be the case but a recent patch split nfsd_last_thread() out from nfsd_put(), and clearing the pointer is done in nfsd_last_thread(). This is easily reproduced by running rpc.nfsd 8 ; ( rpc.nfsd 0;true) < /proc/fs/nfsd/pool_stats Fortunately nfsd_pool_stats_release() has easy access to the svc_serv pointer, and so can call svc_put() on it directly. Fixes: 9f28a971ee9f ("nfsd: separate nfsd_last_thread() from nfsd_put()") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-08-29SUNRPC: remove timeout arg from svc_recv()NeilBrown1-1/+1
Most svc threads have no interest in a timeout. nfsd sets it to 1 hour, but this is a wart of no significance. lockd uses the timeout so that it can call nlmsvc_retry_blocked(). It also sometimes calls svc_wake_up() to ensure this is called. So change lockd to be consistent and always use svc_wake_up() to trigger nlmsvc_retry_blocked() - using a timer instead of a timeout to svc_recv(). And change svc_recv() to not take a timeout arg. This makes the sp_threads_timedout counter always zero. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-08-29SUNRPC: change svc_recv() to return void.NeilBrown1-11/+2
svc_recv() currently returns a 0 on success or one of two errors: - -EAGAIN means no message was successfully received - -EINTR means the thread has been told to stop Previously nfsd would stop as the result of a signal as well as following kthread_stop(). In that case the difference was useful: EINTR means stop unconditionally. EAGAIN means stop if kthread_should_stop(), continue otherwise. Now threads only exit when kthread_should_stop() so we don't need the distinction. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-08-29SUNRPC: call svc_process() from svc_recv().NeilBrown1-2/+1
All callers of svc_recv() go on to call svc_process() on success. Simplify callers by having svc_recv() do that for them. This loses one call to validate_process_creds() in nfsd. That was debugging code added 14 years ago. I don't think we need to keep it. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-08-29nfsd: separate nfsd_last_thread() from nfsd_put()NeilBrown1-33/+19
Now that the last nfsd thread is stopped by an explicit act of calling svc_set_num_threads() with a count of zero, we only have a limited number of places that can happen, and don't need to call nfsd_last_thread() in nfsd_put() So separate that out and call it at the two places where the number of threads is set to zero. Move the clearing of ->nfsd_serv and the call to svc_xprt_destroy_all() into nfsd_last_thread(), as they are really part of the same action. nfsd_put() is now a thin wrapper around svc_put(), so make it a static inline. nfsd_put() cannot be called after nfsd_last_thread(), so in a couple of places we have to use svc_put() instead. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-08-29nfsd: Simplify code around svc_exit_thread() call in nfsd()NeilBrown1-23/+0
Previously a thread could exit asynchronously (due to a signal) so some care was needed to hold nfsd_mutex over the last svc_put() call. Now a thread can only exit when svc_set_num_threads() is called, and this is always called under nfsd_mutex. So no care is needed. Not only is the mutex held when a thread exits now, but the svc refcount is elevated, so the svc_put() in svc_exit_thread() will never be a final put, so the mutex isn't even needed at this point in the code. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-08-29nfsd: don't allow nfsd threads to be signalled.NeilBrown1-12/+0
The original implementation of nfsd used signals to stop threads during shutdown. In Linux 2.3.46pre5 nfsd gained the ability to shutdown threads internally it if was asked to run "0" threads. After this user-space transitioned to using "rpc.nfsd 0" to stop nfsd and sending signals to threads was no longer an important part of the API. In commit 3ebdbe5203a8 ("SUNRPC: discard svo_setup and rename svc_set_num_threads_sync()") (v5.17-rc1~75^2~41) we finally removed the use of signals for stopping threads, using kthread_stop() instead. This patch makes the "obvious" next step and removes the ability to signal nfsd threads - or any svc threads. nfsd stops allowing signals and we don't check for their delivery any more. This will allow for some simplification in later patches. A change worth noting is in nfsd4_ssc_setup_dul(). There was previously a signal_pending() check which would only succeed when the thread was being shut down. It should really have tested kthread_should_stop() as well. Now it just does the latter, not the former. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-08-29NFSD: Rename struct svc_cacherepChuck Lever1-1/+1
The svc_ prefix is identified with the SunRPC layer. Although the duplicate reply cache caches RPC replies, it is only for the NFS protocol. Rename the struct to better reflect its purpose. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-08-29NFSD: Remove svc_rqst::rq_cacherepChuck Lever1-4/+6
Over time I'd like to see NFS-specific fields moved out of struct svc_rqst, which is an RPC layer object. These fields are layering violations. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-06-18NFSD: Distinguish per-net namespace initializationChuck Lever1-0/+5
I find the naming of nfsd_init_net() and nfsd_startup_net() to be confusingly similar. Rename the namespace initialization and tear- down ops and add comments to distinguish their separate purposes. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20NFSD: copy the whole verifier in nfsd_copy_write_verifierChuck Lever1-1/+1
Currently, we're only memcpy'ing the first __be32. Ensure we copy into both words. Fixes: 91d2e9b56cf5 ("NFSD: Clean up the nfsd_net::nfssvc_boot field") Reported-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2023-02-20nfsd: move reply cache initialization into nfsd startupJeff Layton1-1/+9
There's no need to start the reply cache before nfsd is up and running, and doing so means that we register a shrinker for every net namespace instead of just the ones where nfsd is running. Move it to the per-net nfsd startup instead. Reported-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>