Age  Commit message  Author  Files  Lines
2020-03-28NFS: Remove unused FLUSH_SYNC support in nfs_initiate_pgio()Trond Myklebust1-12/+3
If the FLUSH_SYNC flag is set, nfs_initiate_pgio() will currently wait for completion, and then return the status of the I/O operation. What we actually want to report in nfs_pageio_doio() is whether or not the RPC call was launched successfully, whereas the actual I/O status is intended to be handled in the reply callbacks. Since FLUSH_SYNC is never set by any of the callers anyway, let's just remove that code altogether. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-27pNFS/flexfiles: Specify the layout segment range in LAYOUTGETTrond Myklebust1-4/+4
Move from requesting only full file layout segments, to requesting layout segments that match our I/O size. This means the server is still free to return a full file layout, but we will no longer error out if it does not. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-27pNFS/flexfiles: remove requirement for whole file layoutsTrond Myklebust1-21/+0
Remove the requirement that the server always sends whole file layouts. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-27pNFS/flexfiles: Check the layout segment range before doing I/OTrond Myklebust3-3/+13
When starting to read or write with a layout segment, check that the range matches our request. Signed-off-by: Trond Myklebust <[email protected]>
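The range check described above can be sketched in user-space C. This is only an illustration, not the kernel code: the names `demo_range`, `demo_range_end()` and `demo_segment_covers()` are hypothetical, and the sketch assumes that a length of UINT64_MAX stands for "to end of file".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical byte range; a length of UINT64_MAX is assumed to mean "to EOF". */
struct demo_range {
	uint64_t offset;
	uint64_t length;
};

/* End offset of a range, saturating at UINT64_MAX instead of overflowing. */
static uint64_t demo_range_end(const struct demo_range *r)
{
	if (r->length > UINT64_MAX - r->offset)
		return UINT64_MAX;
	return r->offset + r->length;
}

/* Return true if the layout segment fully covers the I/O request. */
static bool demo_segment_covers(const struct demo_range *seg,
				const struct demo_range *req)
{
	return seg->offset <= req->offset &&
	       demo_range_end(req) <= demo_range_end(seg);
}
```

A caller would run such a check before issuing the read or write, and fall back to requesting a new layout segment (or to I/O through the MDS) when it fails.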
2020-03-27pNFS/flexfile: Don't merge layout segments if the mirrors don't matchTrond Myklebust1-0/+19
Check that the number of mirrors, and the mirror information matches before deciding to merge layout segments in pNFS/flexfiles. Signed-off-by: Trond Myklebust <[email protected]>
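As a rough illustration of the merge precondition, the following user-space sketch compares the mirror count and then each mirror in turn. The types `demo_mirror` and `demo_segment` and the idea of identifying a mirror by a short string are assumptions made for the example; the real flexfiles structures carry more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror descriptor: just an opaque data-server identifier. */
struct demo_mirror {
	char ds_id[16];
};

/* Hypothetical layout segment: a counted array of mirrors. */
struct demo_segment {
	size_t mirror_count;
	const struct demo_mirror *mirrors;
};

/* Two segments may be merged only if count and every mirror match. */
static bool demo_mirrors_match(const struct demo_segment *a,
			       const struct demo_segment *b)
{
	size_t i;

	if (a->mirror_count != b->mirror_count)
		return false;
	for (i = 0; i < a->mirror_count; i++) {
		if (strncmp(a->mirrors[i].ds_id, b->mirrors[i].ds_id,
			    sizeof(a->mirrors[i].ds_id)) != 0)
			return false;
	}
	return true;
}
```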
2020-03-27NFS/pNFS: Fix pnfs_layout_mark_request_commit() invalid layout segment handlingTrond Myklebust1-16/+12
Fix up pnfs_layout_mark_request_commit() to always reschedule the write if the layout segment is invalid. Also minor cleanup. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-27NFS/pNFS: Simplify bucket layout segment reference countingTrond Myklebust2-21/+21
Signed-off-by: Trond Myklebust <[email protected]>
2020-03-27NFS/pNFS: Clean up pNFS commit operationsTrond Myklebust6-71/+98
Move the pNFS commit related operations into a separate structure that can be carried by the pnfs_ds_commit_info. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-27NFS: Remove bucket array from struct pnfs_ds_commit_infoTrond Myklebust6-185/+1
Remove the unused bucket array in struct pnfs_ds_commit_info. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-27NFS/pNFS: Add a helper pnfs_generic_search_commit_reqs()Trond Myklebust3-31/+54
Lift filelayout_search_commit_reqs() into the generic pnfs/nfs code, and add support for commit arrays. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-27pNFS: Enable per-layout segment commit structuresTrond Myklebust4-6/+117
Enable adding and lookup of per-layout segment commits in filelayout and flexfilelayout. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-27pNFS: Add infrastructure for cleaning up per-layout commit structuresTrond Myklebust7-4/+121
Ensure that both the file and flexfiles layout types clean up when freeing the layout segments. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-27NFS/pNFS: Support commit arrays in nfs_clear_pnfs_ds_commit_verifiers()Trond Myklebust1-3/+16
Add support for scanning the full list of per-layout segment commit arrays to nfs_clear_pnfs_ds_commit_verifiers(). Signed-off-by: Trond Myklebust <[email protected]>
2020-03-27NFS: Fix O_DIRECT commit verifier handlingTrond Myklebust3-124/+22
Instead of trying to save the commit verifiers and checking them against previous writes, adopt the same strategy as for buffered writes, of just checking the verifiers at commit time. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-27NFS: commit errors should be fatalTrond Myklebust1-2/+30
Fix the O_DIRECT code to avoid retries if the COMMIT fails with a fatal error. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-27NFS/pNFS: Allow O_DIRECT to release the DS commitinfoTrond Myklebust2-0/+17
Add a pNFS callback to allow the O_DIRECT code to release the DS commitinfo when freeing the dreq. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-27pNFS: Support per-layout segment commits in pnfs_generic_commit_pagelist()Trond Myklebust1-0/+16
Add support for scanning the full list of per-layout segment commit arrays to pnfs_generic_commit_pagelist(). Signed-off-by: Trond Myklebust <[email protected]>
2020-03-27pNFS: Support per-layout segment commits in pnfs_generic_recover_commit_reqs()Trond Myklebust1-8/+33
Add support for scanning the full list of per-layout segment commit arrays to pnfs_generic_recover_commit_reqs(). Signed-off-by: Trond Myklebust <[email protected]>
2020-03-27NFSv4/pNFS: Scan the full list of commit arrays when committingTrond Myklebust1-12/+40
Add support for scanning the full list of per-layout segment commit arrays to pnfs_generic_scan_commit_lists() Signed-off-by: Trond Myklebust <[email protected]>
2020-03-27NFSv4/pnfs: Support a list of commit arrays in struct pnfs_ds_commit_infoTrond Myklebust5-1/+18
When we have multiple layout segments with different lists of mirrored data, we need to track the commits on a per layout segment basis. This patch adds a list to support this tracking in struct pnfs_ds_commit_info. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-26pNFS: Add a helper to allocate the array of bucketsTrond Myklebust3-3/+46
Signed-off-by: Trond Myklebust <[email protected]>
2020-03-26NFS/pNFS: Refactor pnfs_generic_commit_pagelist()Trond Myklebust2-101/+76
Refactor pnfs_generic_commit_pagelist() to simplify the conversion to layout segment based commit lists. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-26pNFS/flexfiles: Simplify allocation of the mirror arrayTrond Myklebust2-17/+6
Just allocate the array at the end of the layout segment structure, instead of allocating it as a separate array of pointers. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-26SUNRPC: fix krb5p mount to provide large enough buffer in rq_rcvsizeOlga Kornievskaia1-1/+2
Commit 2c94b8eca1a2 ("SUNRPC: Use au_rslack when computing reply buffer size") changed how "req->rq_rcvsize" is calculated. It used to use the au_cslack value, which was nice and large, and was changed to use the au_rslack value, which turns out to be too small.

Since 5.1, a v3 mount with sec=krb5p fails against an Ontap server because the client's receive buffer is too small.

For gss krb5p, we need to account for the mic token in the verifier, and the wrap token in the wrap token.

RFC 4121 defines:

mic token

    Octet no   Name        Description
    --------------------------------------------------------------
    0..1       TOK_ID      Identification field. Tokens emitted by
                           GSS_GetMIC() contain the hex value 04 04
                           expressed in big-endian order in this field.
    2          Flags       Attributes field, as described in section
                           4.2.2.
    3..7       Filler      Contains five octets of hex value FF.
    8..15      SND_SEQ     Sequence number field in clear text,
                           expressed in big-endian order.
    16..last   SGN_CKSUM   Checksum of the "to-be-signed" data and
                           octet 0..15, as described in section 4.2.4.

That's 16 bytes (GSS_KRB5_TOK_HDR_LEN) + chksum.

wrap token

    Octet no   Name        Description
    --------------------------------------------------------------
    0..1       TOK_ID      Identification field. Tokens emitted by
                           GSS_Wrap() contain the hex value 05 04
                           expressed in big-endian order in this field.
    2          Flags       Attributes field, as described in section
                           4.2.2.
    3          Filler      Contains the hex value FF.
    4..5       EC          Contains the "extra count" field, in big-
                           endian order as described in section 4.2.3.
    6..7       RRC         Contains the "right rotation count" in big-
                           endian order, as described in section 4.2.5.
    8..15      SND_SEQ     Sequence number field in clear text,
                           expressed in big-endian order.
    16..last   Data        Encrypted data for Wrap tokens with
                           confidentiality, or plaintext data followed
                           by the checksum for Wrap tokens without
                           confidentiality, as described in section
                           4.2.4.

Also 16 bytes of header (GSS_KRB5_TOK_HDR_LEN), encrypted data, and cksum (plus other things like padding).

RFC 3961 defines known cksum sizes:

    Checksum type         sumtype   checksum   section or
                          value     size       reference
    ---------------------------------------------------------------------
    CRC32                 1         4          6.1.3
    rsa-md4               2         16         6.1.2
    rsa-md4-des           3         24         6.2.5
    des-mac               4         16         6.2.7
    des-mac-k             5         8          6.2.8
    rsa-md4-des-k         6         16         6.2.6
    rsa-md5               7         16         6.1.1
    rsa-md5-des           8         24         6.2.4
    rsa-md5-des3          9         24         ??
    sha1 (unkeyed)        10        20         ??
    hmac-sha1-des3-kd     12        20         6.3
    hmac-sha1-des3        13        20         ??
    sha1 (unkeyed)        14        20         ??
    hmac-sha1-96-aes128   15        20         [KRB5-AES]
    hmac-sha1-96-aes256   16        20         [KRB5-AES]
    [reserved]            0x8003    ?          [GSS-KRB5]

The Linux kernel now mainly supports types 15 and 16, so the max cksum size is 20 bytes (GSS_KRB5_MAX_CKSUM_LEN).

Re-use the already existing define of GSS_KRB5_MAX_SLACK_NEEDED that's used for encoding the gss_wrap tokens (the same tokens are used in the reply).

Fixes: 2c94b8eca1a2 ("SUNRPC: Use au_rslack when computing reply buffer size")
Signed-off-by: Olga Kornievskaia <[email protected]>
Reviewed-by: Chuck Lever <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
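The size arithmetic in this message can be written out as a small sketch. The `DEMO_*` constants and helper names are hypothetical stand-ins that reuse only the two numbers quoted above (16-byte token header, 20-byte maximum checksum). This is not the definition of GSS_KRB5_MAX_SLACK_NEEDED, which has to account for further overhead such as padding.

```c
#include <assert.h>
#include <stddef.h>

/* Sizes quoted in the commit message above (names are hypothetical). */
#define DEMO_TOK_HDR_LEN   16	/* RFC 4121 token header, octets 0..15 */
#define DEMO_MAX_CKSUM_LEN 20	/* largest checksum for types 15 and 16 */

/* Upper bound on a MIC token: header plus the largest checksum. */
static size_t demo_mic_token_max(void)
{
	return DEMO_TOK_HDR_LEN + DEMO_MAX_CKSUM_LEN;
}

/*
 * Lower bound on a Wrap token carrying payload_len bytes: header, data
 * and checksum. Padding and other overhead are deliberately ignored,
 * so a real receive buffer needs more slack than this.
 */
static size_t demo_wrap_token_min(size_t payload_len)
{
	return DEMO_TOK_HDR_LEN + payload_len + DEMO_MAX_CKSUM_LEN;
}
```

Under these assumptions a MIC token needs up to 36 bytes, and a Wrap token adds at least 36 bytes on top of its payload, which shows why a reply buffer sized without this overhead comes up short.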
2020-03-25NFS: Don't specify NFS version in "UDP not supported" errorPetr Vorel1-2/+2
UDP was originally disabled in 6da1a034362f for NFSv4. Later in b24ee6c64ca7 UDP is by default disabled by NFS_DISABLE_UDP_SUPPORT=y for all NFS versions. Therefore remove v4 from error message. Fixes: b24ee6c64ca7 ("NFS: allow deprecation of NFS UDP protocol") Signed-off-by: Petr Vorel <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2020-03-25nfsroot: set tcp as the default transport protocolLiwei Song1-1/+1
UDP is disabled by default in commit b24ee6c64ca7 ("NFS: allow deprecation of NFS UDP protocol"), but the default mount option is still udp. Change it to tcp to avoid the "Unsupported transport protocol udp" error if no protocol is specified when mounting NFS. Fixes: b24ee6c64ca7 ("NFS: allow deprecation of NFS UDP protocol") Signed-off-by: Liwei Song <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2020-03-22NFS: direct.c: Fix memory leak of dreq when nfs_get_lock_context failsMisono Tomohiro1-0/+2
When dreq is allocated by nfs_direct_req_alloc(), dreq->kref is initialized to 2. Therefore we need to call nfs_direct_req_release() twice to release the allocated dreq. Usually it is called in nfs_file_direct_{read, write}() and nfs_direct_complete(). However, the current code only calls nfs_direct_req_release() once if nfs_get_lock_context() fails in nfs_file_direct_{read, write}(), so that case results in a memory leak. Fix this by adding the missing call. Signed-off-by: Misono Tomohiro <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
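The leak pattern is easy to reproduce in a user-space sketch. Everything here is hypothetical (`demo_req`, `demo_submit()`, a plain integer instead of a kref); it only models an object that starts with two references, one dropped by the submitter and one by the completion path, and an early error path that must drop both.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical request object with a plain reference count. */
struct demo_req {
	int refcount;
};

static int demo_live_reqs;	/* allocated but not yet freed requests */

static struct demo_req *demo_req_alloc(void)
{
	struct demo_req *req = malloc(sizeof(*req));

	if (!req)
		return NULL;
	req->refcount = 2;	/* one for the submitter, one for completion */
	demo_live_reqs++;
	return req;
}

static void demo_req_release(struct demo_req *req)
{
	if (--req->refcount == 0) {
		free(req);
		demo_live_reqs--;
	}
}

/* Submit a request; setup_fails simulates an error before any I/O starts. */
static int demo_submit(int setup_fails)
{
	struct demo_req *req = demo_req_alloc();

	if (!req)
		return -1;
	if (setup_fails) {
		/* Completion will never run, so drop its reference here too. */
		demo_req_release(req);
		demo_req_release(req);
		return -1;
	}
	demo_req_release(req);	/* stand-in for the completion callback */
	demo_req_release(req);	/* submitter's own reference */
	return 0;
}
```

Removing the first release in the error branch leaves `demo_live_reqs` at 1 after a failed submit, which corresponds to the leak the patch fixes.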
2020-03-17nfs: Fix up documentation in nfs_follow_referral() and nfs_do_submount()Trond Myklebust2-5/+2
Fallout from the mount patches. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-16SUNRPC: Trim stack utilization in the wrap and unwrap pathsChuck Lever1-6/+8
By preventing compiler inlining of the integrity and privacy helpers, stack utilization for the common case (authentication only) goes way down. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2020-03-16SUNRPC: Remove xdr_buf_read_mic()Chuck Lever2-56/+0
Clean up: this function is no longer used. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Benjamin Coddington <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2020-03-16sunrpc: Fix gss_unwrap_resp_integ() againChuck Lever1-19/+58
xdr_buf_read_mic() tries to find unused contiguous space in a received xdr_buf in order to linearize the checksum for the call to gss_verify_mic. However, the corner cases in this code are numerous and we seem to keep missing them. I've just hit yet another buffer overrun related to it.

This overrun is at the end of xdr_buf_read_mic():

	if (buf->tail[0].iov_len != 0)
		mic->data = buf->tail[0].iov_base + buf->tail[0].iov_len;
	else
		mic->data = buf->head[0].iov_base + buf->head[0].iov_len;
	__read_bytes_from_xdr_buf(&subbuf, mic->data, mic->len);
	return 0;

This logic assumes the transport has set the length of the tail based on the size of the received message. base + len is then supposed to be off the end of the message but still within the actual buffer.

In fact, the length of the tail is set by the upper layer when the Call is encoded so that the end of the tail is actually the end of the allocated buffer itself. This causes the logic above to set mic->data to point past the end of the receive buffer.

The "mic->data = head" arm of this if statement is no less fragile.

As near as I can tell, this has been a problem forever. I'm not sure that minimizing au_rslack recently changed this pathology much.

So instead, let's use a more straightforward approach: kmalloc a separate buffer to linearize the checksum. This is similar to how gss_validate() currently works.

Coming back to this code, I had some trouble understanding what was going on. So I've cleaned up the variable naming and added a few comments that point back to the XDR definition in RFC 2203 to help guide future spelunkers, including myself.

As an added clean up, the functionality that was in xdr_buf_read_mic() is folded directly into gss_unwrap_resp_integ(), as that is its only caller.

Signed-off-by: Chuck Lever <[email protected]>
Reviewed-by: Benjamin Coddington <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
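The "allocate a separate buffer and linearize into it" approach can be illustrated outside the kernel. The sketch below is an assumption-laden stand-in: `demo_seg` replaces the head/pages/tail layout of struct xdr_buf with a simple array of segments, and malloc() replaces kmalloc(). It copies a byte range that may straddle segments into one contiguous allocation instead of borrowing spare room in the receive buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical segment of a scattered receive buffer. */
struct demo_seg {
	const unsigned char *base;
	size_t len;
};

/*
 * Copy len bytes starting at logical offset into a newly allocated
 * contiguous buffer. Returns NULL on allocation failure or if the
 * requested range runs past the end of the segments. Caller frees.
 */
static unsigned char *demo_linearize(const struct demo_seg *segs, size_t nsegs,
				     size_t offset, size_t len)
{
	unsigned char *out, *p;
	size_t i, remaining = len;

	out = malloc(len ? len : 1);
	if (!out)
		return NULL;
	p = out;
	for (i = 0; i < nsegs && remaining > 0; i++) {
		size_t avail, n;

		if (offset >= segs[i].len) {
			offset -= segs[i].len;	/* range starts in a later segment */
			continue;
		}
		avail = segs[i].len - offset;
		n = avail < remaining ? avail : remaining;
		memcpy(p, segs[i].base + offset, n);
		p += n;
		remaining -= n;
		offset = 0;
	}
	if (remaining > 0) {
		free(out);
		return NULL;
	}
	return out;
}
```

Because the destination is sized from the requested length alone, no assumption about unused space behind the tail or head is needed, which is the fragility the commit message describes.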
2020-03-16nfs: Replace zero-length array with flexible-array memberGustavo A. R. Silva2-2/+2
The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99:

	struct foo {
		int stuff;
		struct boo array[];
	};

By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on.

Also, notice that dynamic memory allocations won't be affected by this change:

"Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1]

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
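For readers unfamiliar with the idiom, here is a minimal user-space sketch of allocating a structure that ends in a flexible array member. The types `demo_item` and `demo_list` and the helper `demo_list_alloc()` are made up for the example; kernel code would use its own allocators and size helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct demo_item {
	int value;
};

struct demo_list {
	size_t count;
	struct demo_item items[];	/* flexible array member: must be last */
};

/* Allocate a zeroed list with room for count trailing items. */
static struct demo_list *demo_list_alloc(size_t count)
{
	struct demo_list *l;

	/* Refuse counts whose total size would overflow size_t. */
	if (count > (SIZE_MAX - sizeof(*l)) / sizeof(l->items[0]))
		return NULL;
	l = calloc(1, sizeof(*l) + count * sizeof(l->items[0]));
	if (l)
		l->count = count;
	return l;
}
```

The allocation size is written as header plus count times element size, so it is the same whether the trailing member is declared `[0]` or `[]`, consistent with the message's point that dynamic allocations are unaffected.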
2020-03-16NFSv4.2: error out when relink swapfileMurphy Zhou1-0/+3
This fixes xfstests generic/356 failure on NFSv4.2. Signed-off-by: Murphy Zhou <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2020-03-16NFS:remove redundant call to nfs_do_accessZhouyi Zhou1-8/+1
In function nfs_permission:

1. The rcu_read_lock() and rcu_read_unlock() around nfs_do_access() are unnecessary, because the RCU-critical data structure is already protected in the subsidiary function nfs_access_get_cached_rcu(). No other data structure in nfs_do_access() needs rcu_read_lock().

2. Calling nfs_do_access() once is enough, because:

   2-1. When mask has the MAY_NOT_BLOCK bit set, the second call to nfs_do_access() will not happen.

   2-2. When mask does not have the MAY_NOT_BLOCK bit set, the second call to nfs_do_access() happens if res == -ECHILD, which means the first nfs_do_access() returned after the statement if (!may_block). The second call goes through the same procedure again, except that it continues the work after if (!may_block). That work can be performed by a single call to nfs_do_access() without mangling the mask flag.

Tested on x86_64.

Signed-off-by: Zhouyi Zhou <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
2020-03-16SUNRPC: remove redundant assignments to variable statusColin Ian King1-1/+1
The variable status is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2020-03-16NFSv4: Add support for CB_RECALL_ANY for flexfiles layoutsTrond Myklebust7-18/+186
When we receive a CB_RECALL_ANY that asks us to return flexfiles layouts, we iterate through all the layouts and look at whether or not there are active open file descriptors that might need them for I/O. If there are no such descriptors, we return the layouts. Signed-off-by: Trond Myklebust <[email protected]>
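The selection rule described here (return a layout only when no open file descriptor may still need it) can be modelled with a short sketch. The `demo_layout` type and `demo_recall_any()` function are hypothetical; the real code walks kernel layout headers and issues LAYOUTRETURN rather than setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-file layout state. */
struct demo_layout {
	int open_fds;	/* active open file descriptors that may do I/O */
	bool returned;	/* set once the layout has been handed back */
};

/*
 * Walk all layouts and return those with no active opens.
 * Gives back the number of layouts returned by this call.
 */
static size_t demo_recall_any(struct demo_layout *layouts, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		if (layouts[i].returned || layouts[i].open_fds > 0)
			continue;	/* already gone, or still needed for I/O */
		layouts[i].returned = true;
		count++;
	}
	return count;
}
```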
2020-03-16NFSv4: Clean up nfs_delegation_reap_expired()Trond Myklebust1-43/+40
Convert to use nfs_client_for_each_server() for efficiency. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-16NFSv4: Clean up nfs_delegation_reap_unclaimed()Trond Myklebust1-39/+37
Convert nfs_delegation_reap_unclaimed() to use nfs_client_for_each_server() for efficiency. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-16NFSv4: Clean up nfs_client_return_marked_delegations()Trond Myklebust1-69/+60
Convert it to use the nfs_client_for_each_server() helper, and make it more efficient by skipping delegations for inodes we know are in the process of being freed. Also improve the efficiency of the cursor by skipping delegations that are being freed. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-16NFS: Add a helper nfs_client_for_each_server()Trond Myklebust2-1/+38
Add a helper nfs_client_for_each_server() to iterate through all the filesystems that are attached to a struct nfs_client, and apply a function to all the active ones. Signed-off-by: Trond Myklebust <[email protected]>
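The shape of such a helper, an iterator that applies a callback to every active entry and stops on the first error, might look like the following user-space sketch. All names (`demo_client`, `demo_server`, `demo_client_for_each_server()`) are illustrative assumptions; the kernel version also has to deal with locking and reference counting, which are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical filesystem instance attached to a client. */
struct demo_server {
	bool active;
	struct demo_server *next;
};

/* Hypothetical client owning a singly linked list of servers. */
struct demo_client {
	struct demo_server *servers;
};

typedef int (*demo_server_fn)(struct demo_server *server, void *data);

/*
 * Apply fn to every active server of clp. Stops and returns the
 * callback's value as soon as it reports a negative error.
 */
static int demo_client_for_each_server(struct demo_client *clp,
				       demo_server_fn fn, void *data)
{
	struct demo_server *server;
	int ret;

	for (server = clp->servers; server; server = server->next) {
		if (!server->active)
			continue;
		ret = fn(server, data);
		if (ret < 0)
			return ret;
	}
	return 0;
}

/* Example callback: count the servers visited. */
static int demo_count_server(struct demo_server *server, void *data)
{
	(void)server;
	(*(int *)data)++;
	return 0;
}
```

Centralising the walk in one helper lets callers such as the delegation reapers supply only the per-server work.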
2020-03-16NFSv4/pnfs: Clean up nfs_layout_find_inode()Trond Myklebust1-31/+21
Now that we can rely on just the rcu_read_lock(), remove the clp->cl_lock and clean up. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-16NFSv4: Ensure layout headers are RCU safeTrond Myklebust5-11/+13
Signed-off-by: Trond Myklebust <[email protected]>
2020-03-16NFSv4/pnfs: Return valid stateids in nfs_layout_find_inode_by_stateid()Trond Myklebust1-0/+2
Make sure to test the stateid for validity so that we catch instances where the server may have been reusing stateids in nfs_layout_find_inode_by_stateid(). Fixes: 7b410d9ce460 ("pNFS: Delay getting the layout header in CB_LAYOUTRECALL handlers") Signed-off-by: Trond Myklebust <[email protected]>
2020-03-16pNFS/flexfiles: Report DELAY and GRACE errors from the DS to the serverTrond Myklebust1-9/+11
Ensure that if the DS is returning too many DELAY and GRACE errors, we also report that to the MDS through the layouterror mechanism. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-16NFS: Limit the size of the access cache by defaultTrond Myklebust1-1/+1
Currently, we have no real limit on the access cache size (we set it to ULONG_MAX). That can lead to credentials getting pinned for a very long time on lots of files if you have a system with a lot of memory. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-16NFS: Avoid referencing the cred twice in async rename/unlinkTrond Myklebust1-2/+2
In both async rename and unlink, we take a reference to the cred in the call arguments. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-16NFSv4: Avoid unnecessary credential references in layoutgetTrond Myklebust2-3/+2
Layoutget is just using the credential attached to the open context. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-16NFSv4: Avoid referencing the cred unnecessarily during NFSv4 I/OTrond Myklebust1-5/+5
Avoid unnecessary references to the cred when we have already referenced it through the open context or the open owner. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-16NFS: Assume cred is pinned by open context in I/O requestsTrond Myklebust2-2/+2
In read/write/commit, we should be able to assume that the cred is pinned by the open context. Signed-off-by: Trond Myklebust <[email protected]>
2020-03-16SUNRPC: Don't take a reference to the cred on synchronous tasksTrond Myklebust1-0/+3
If the RPC call is synchronous, assume the cred is already pinned by the caller. Signed-off-by: Trond Myklebust <[email protected]>