aboutsummaryrefslogtreecommitdiff
path: root/fs/nfs
AgeCommit message (Collapse)AuthorFilesLines
2015-09-01NFSv4.1/flexfiles: Fix freeing of mirrorsTrond Myklebust1-12/+2
Mirrors are now shared objects, so we should not be freeing them directly inside ff_layout_free_lseg(). We should already be doing the right thing in _ff_layout_free_lseg(), so just let it handle things. Also ensure that ff_layout_free_mirror() frees the RPC credential if it is set. Fixes: 28a0d72c6867 ("Add refcounting to struct nfs4_ff_layout_mirror") Signed-off-by: Trond Myklebust <[email protected]>
2015-08-31NFSv4.1/pNFS: Don't request a minimal read layout beyond the end of fileTrond Myklebust1-0/+9
If we have a read layout, then sanity check the minimal layout length so that it does not extend beyond the end of file. Signed-off-by: Trond Myklebust <[email protected]>
2015-08-31NFSv4.1/pnfs: Handle LAYOUTGET return values correctlyTrond Myklebust1-1/+14
According to RFC5661 section 18.43.3, if the server cannot satisfy the loga_minlength argument to LAYOUTGET, there are 2 cases: 1) If loga_minlength == 0, it returns NFS4ERR_LAYOUTTRYLATER 2) If loga_minlength != 0, it returns NFS4ERR_BADLAYOUT Signed-off-by: Trond Myklebust <[email protected]>
2015-08-31NFSv4.1/pnfs: Don't ask for a read layout for an empty file.Trond Myklebust1-0/+3
Signed-off-by: Trond Myklebust <[email protected]>
2015-08-30NFSv4.1: Fix a protocol issue with CLOSE stateidsTrond Myklebust1-5/+10
According to RFC5661 Section 18.2.4, CLOSE is supposed to return the zero stateid. This means that nfs_clear_open_stateid_locked() cannot assume that the result stateid will always match the 'other' field of the existing open stateid when trying to determine a race with a parallel OPEN. Instead, we look at the argument, and check for matches. Cc: [email protected] # v4.0+ Signed-off-by: Trond Myklebust <[email protected]>
2015-08-30NFSv4.1/flexfiles: Don't mark the entire deviceid as bad for file errorsTrond Myklebust1-8/+16
If the file was fenced and/or has been deleted on the DS, then we want to retry pNFS after a layoutreturn with error report. If the server cannot fix the problem, then we rely on it to tell us so in the response to the LAYOUTGET. Signed-off-by: Trond Myklebust <[email protected]>
2015-08-27NFSv4.1/pnfs: Ensure layoutreturn reserves space for the opaque payloadTrond Myklebust1-1/+2
The "FIXME" is outdated. Flexfiles does add a payload. Signed-off-by: Trond Myklebust <[email protected]>
2015-08-27NFSv4.1/flexfiles: Fix a protocol error in layoutreturnTrond Myklebust1-2/+5
According to the flexfiles protocol, the layoutreturn should specify an array of errors in the following format: struct ff_ioerr4 { offset4 ffie_offset; length4 ffie_length; stateid4 ffie_stateid; device_error4 ffie_errors<>; }; This patch fixes up the code to ensure that our ffie_errors is indeed encoded as an array (albeit with only a single entry). Reported-by: Tom Haynes <[email protected]> Cc: [email protected] Signed-off-by: Trond Myklebust <[email protected]>
2015-08-27NFS: Send attributes in OPEN request for NFS4_CREATE_EXCLUSIVE4_1Kinglong Mee2-12/+32
Client sends a SETATTR request after OPEN for updating attributes. For create file with S_ISGID is set, the S_ISGID in SETATTR will be ignored at nfs server as chmod of no PERMISSION. v3, same as v2. Signed-off-by: Kinglong Mee <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2015-08-27NFS: Get suppattr_exclcreat when getting server capabilitiesKinglong Mee2-6/+34
Create file with attributs as NFS4_CREATE_EXCLUSIVE4_1 mode depends on suppattr_exclcreat attribut. v3, same as v2. Signed-off-by: Kinglong Mee <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2015-08-27NFS: Make opened as optional argument in _nfs4_do_openKinglong Mee2-5/+3
Check opened, only update it when non-NULL. It's not needs define an unused value for the opened when calling _nfs4_do_open. v3, same as v2. Signed-off-by: Kinglong Mee <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2015-08-27NFS: Check size by inode_newsize_ok in nfs_setattrKinglong Mee1-8/+10
Set rlimit for NFS's files is useless right now. For local process's rlimit, it should be checked by nfs client. The same, CIFS also call inode_change_ok checking rlimit at its client in cifs_setattr_nounix() and cifs_setattr_unix(). v3, fix bad using of error Signed-off-by: Kinglong Mee <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2015-08-27NFSv4.1/pNFS: pnfs_mark_matching_lsegs_return must notify of layout returnTrond Myklebust1-0/+2
It's not sufficient to just mark the layout segment for layout return. We also need to set the NFS_LAYOUT_RETURN_BEFORE_CLOSE flag in the layout header. Signed-off-by: Trond Myklebust <[email protected]>
2015-08-25nfs42: remove unused declarationPeng Tao1-2/+0
Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Peng Tao <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2015-08-25nfs42: decode_layoutstats does not need res parameterPeng Tao1-3/+2
Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Peng Tao <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2015-08-25NFSv4.1/flexfiles: Allow coalescing of new layout segments and existing onesTrond Myklebust2-0/+76
In order to ensure atomicity of updates, we merge the old layout segments into the new ones, and then invalidate the old ones. Also ensure that we order the list of layout segments so that RO segments are preferred over RW. Signed-off-by: Trond Myklebust <[email protected]>
2015-08-25NFSv4.1/pnfs: Allow pNFS device drivers to customise layout segment insertionTrond Myklebust2-9/+61
This is needed in order to allow merging of contiguous layout segments, and also to correct the ordering of layouts for those device drivers that don't necessarily want to place the read-write layouts first. Signed-off-by: Trond Myklebust <[email protected]>
2015-08-25NFSv4.1/pnfs: Add sanity check for the layout range returned by the serverTrond Myklebust1-1/+24
Signed-off-by: Trond Myklebust <[email protected]>
2015-08-25NFSv4.1/pnfs Improve the packing of struct pnfs_layout_hdrTrond Myklebust1-3/+3
Eliminate a couple of holes in the structure, and move the 2 atomics into the same cacheline. Signed-off-by: Trond Myklebust <[email protected]>
2015-08-25NFSv4.1/flexfile: ff_layout_remove_mirror can be statickbuild test robot1-1/+1
Signed-off-by: Fengguang Wu <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2015-08-25NFSv4.2/pnfs: Make the layoutstats timer configurableTrond Myklebust3-1/+11
Allow advanced users to set the layoutstats timer in order to lengthen or shorten the period between layoutstat transmissions to the server. Signed-off-by: Trond Myklebust <[email protected]>
2015-08-25NFSv4.1/flexfile: Ensure uniqueness of mirrors across layout segmentsTrond Myklebust2-29/+99
Keep the full list of mirrors in the struct nfs4_ff_layout_mirror so that they can be shared among the layout segments that use them. Also ensure that we send out only one copy of the layoutstats per mirror. Signed-off-by: Trond Myklebust <[email protected]>
2015-08-25NFSv4.1/flexfiles: Remove mirror backpointer to lseg.Trond Myklebust2-14/+12
When we start sharing mirrors between several lsegs, we won't be able to keep it. Signed-off-by: Trond Myklebust <[email protected]>
2015-08-25NFSv4.1/flexfiles: Add refcounting to struct nfs4_ff_layout_mirrorTrond Myklebust2-9/+28
We do want to share mirrors between layout segments, so add a refcount to enable that. Signed-off-by: Trond Myklebust <[email protected]>
2015-08-25NFS41/flexfiles: zero out DS write wccPeng Tao1-0/+2
We do not want to update inode attributes with DS values. Cc: [email protected] # v4.0+ Signed-off-by: Peng Tao <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2015-08-25NFS41: remove NFS_LAYOUT_ROC flagPeng Tao2-6/+2
If we return delegation before closing, we fail to do roc check during close because NFS_LAYOUT_ROC is cleared by delegreturn and it causes layouts to be still hanging around after delegreturn + close, which is a voilation against protocol. Signed-off-by: Peng Tao <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2015-08-25NFSv4: Add a tracepoint for CB_LAYOUTRECALLTrond Myklebust2-1/+3
Only support for single file layoutrecall for now. Signed-off-by: Trond Myklebust <[email protected]>
2015-08-25NFSv4: Add a tracepoint for CB_GETATTRTrond Myklebust2-1/+64
Signed-off-by: Trond Myklebust <[email protected]>
2015-08-25NFSv4.1/pnfs: Add a tracepoint for return-on-close eventsTrond Myklebust2-0/+2
Allow tracing of return-on-close. Signed-off-by: Trond Myklebust <[email protected]>
2015-08-25NFSv4: Force a post-op attribute update when holding a delegationTrond Myklebust1-0/+7
If the ctime or mtime or change attribute have changed because of an operation we initiated, we should make sure that we force an attribute update. However we do not want to mark the page cache for revalidation. Signed-off-by: Trond Myklebust <[email protected]> Cc: [email protected] # v4.0+
2015-08-20NFSv4.1/pnfs Ensure flexfiles reports all connection related errorsTrond Myklebust1-13/+35
Make sure that we also handle RPC level connection and protocol negotiation errors. Reported-by: Tom Haynes <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2015-08-20NFSv4.1/pnfs: Ensure the flexfiles layoutstats timers are consistentTrond Myklebust1-27/+24
We want to ensure that the stopwatches for the busy timer and the aggregate timer are consistent. This means that they need to use the same start/stop times. Signed-off-by: Trond Myklebust <[email protected]>
2015-08-20NFS41: fix list splice typePeng Tao1-1/+1
We want to move commiting pages to pages list instead. Otherwise it causes pnfs small writes crash like: [34560.037692] BUG: unable to handle kernel NULL pointer dereference at 0000000000000068 [34560.038557] IP: [<ffffffffa05423d6>] nfs_init_commit+0x26/0x130 [nfs] [34560.039400] PGD 69f5a067 PUD 69f59067 PMD 0 [34560.040207] Oops: 0000 [#1] SMP [34560.041014] Modules linked in: nfsv3(OE) nfs_layout_flexfiles(OE) nfsv4(OE) nfs(OE) fscache(E) rpcsec_gss_krb5(E) xt_addrtype(E) xt_conntrack(E) ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E) iptable_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) nf_nat_ipv4(E) iptable_filter(E) ip_tables(E) x_tables(E) nf_nat(E) nf_conntrack(E) bridge(E) stp(E) llc(E) dm_thin_pool(E) dm_persistent_data(E) dm_bio_prison(E) dm_bufio(E) ppdev(E) vmw_balloon(E) coretemp(E) crc32_pclmul(E) ghash_clmulni_intel(E) aesni_intel(E) aes_x86_64(E) glue_helper(E) lrw(E) gf128mul(E) ablk_helper(E) cryptd(E) psmouse(E) serio_raw(E) vmw_vmci(E) i2c_piix4(E) shpchp(E) parport_pc(E) parport(E) mac_hid(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) xfs(E) libcrc32c(E) hid_generic(E) usbhid(E) hid(E) e1000(E) mptspi(E) [34560.045106] mptscsih(E) mptbase(E) vmwgfx(E) drm_kms_helper(E) ttm(E) drm(E) autofs4(E) [last unloaded: fscache] [34560.045897] CPU: 0 PID: 130543 Comm: bash Tainted: G OE 4.2.0-rc5-dp-00057-gf993a93 #11 [34560.046699] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014 [34560.047525] task: ffff880031b0a980 ti: ffff880045fec000 task.ti: ffff880045fec000 [34560.048264] RIP: 0010:[<ffffffffa05423d6>] [<ffffffffa05423d6>] nfs_init_commit+0x26/0x130 [nfs] [34560.049000] RSP: 0018:ffff880045fefc18 EFLAGS: 00010246 [34560.049717] RAX: 0000000000000000 RBX: ffff8800208fbc80 RCX: ffff880045fefd50 [34560.050396] RDX: ffff880031c19ec0 RSI: ffff880045fefc88 RDI: ffff8800208fbc80 [34560.051041] RBP: ffff880045fefc28 R08: ffff8800208fbe68 R09: ffff880045fefc88 [34560.051666] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880045fefc78 [34560.052247] R13: ffff880045fefc88 R14: ffff880045fefa90 R15: ffff880045fefd50 [34560.052825] FS: 00007fa02d58c740(0000) GS:ffff88006d600000(0000) knlGS:0000000000000000 [34560.053410] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [34560.053992] CR2: 0000000000000068 CR3: 000000003b37a000 CR4: 00000000001406f0 [34560.054615] Stack: [34560.055200] ffff8800208fbc80 ffff8800208fbc80 ffff880045fefcc8 ffffffffa05c1a5b [34560.055800] ffff880045fefcc8 ffff880045fefd50 0000000045fefcb8 ffff880045fefd40 [34560.056418] ffff8800420608e0 ffffffffa04f3910 0000000100000001 ffff880045fefd50 [34560.057013] Call Trace: [34560.057672] [<ffffffffa05c1a5b>] pnfs_generic_commit_pagelist+0x1cb/0x300 [nfsv4] [34560.058277] [<ffffffffa04f3910>] ? ff_layout_commit_pagelist+0x20/0x20 [nfs_layout_flexfiles] [34560.058907] [<ffffffffa04f3905>] ff_layout_commit_pagelist+0x15/0x20 [nfs_layout_flexfiles] [34560.059557] [<ffffffffa0543fc1>] nfs_generic_commit_list+0xb1/0xf0 [nfs] [34560.060214] [<ffffffffa0543e47>] ? nfs_scan_commit+0x37/0xa0 [nfs] [34560.060825] [<ffffffffa0544081>] nfs_commit_inode+0x81/0x150 [nfs] [34560.061432] [<ffffffffa05443ae>] nfs_wb_all+0x1ae/0x400 [nfs] [34560.062035] [<ffffffffa05380ad>] nfs_getattr+0x33d/0x510 [nfs] [34560.062630] [<ffffffff8122499c>] vfs_getattr_nosec+0x2c/0x40 [34560.063223] [<ffffffff81224a66>] vfs_getattr+0x26/0x30 [34560.063818] [<ffffffff81224b35>] vfs_fstatat+0x65/0xa0 [34560.064413] [<ffffffff81224f3f>] SYSC_newstat+0x1f/0x40 [34560.065016] [<ffffffff8102b176>] ? do_audit_syscall_entry+0x66/0x70 [34560.065626] [<ffffffff8102c773>] ? syscall_trace_enter_phase1+0x113/0x170 [34560.066245] [<ffffffff81003017>] ? trace_hardirqs_on_thunk+0x17/0x19 [34560.066868] [<ffffffff812251ae>] SyS_newstat+0xe/0x10 [34560.067533] [<ffffffff817a5df2>] entry_SYSCALL_64_fastpath+0x16/0x7a [34560.068173] Code: 0f 1f 44 00 00 0f 1f 44 00 00 55 4c 8d 87 e8 01 00 00 48 89 e5 53 48 89 fb 48 83 ec 08 4c 8b 0e 49 8b 41 18 4c 39 ce 48 8b 40 40 <4c> 8b 50 68 74 24 48 8b 87 e8 01 00 00 48 8b 7e 08 4d 89 41 08 [34560.069609] RIP [<ffffffffa05423d6>] nfs_init_commit+0x26/0x130 [nfs] [34560.070295] RSP <ffff880045fefc18> [34560.071008] CR2: 0000000000000068 [34560.073207] ---[ end trace f85f873260977406 ]--- [fixes 27571297a7e(pNFS: Tighten up locking around DS commit buckets)] Signed-off-by: Peng Tao <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2015-08-19NFSv4: Enable delegated opens even when reboot recovery is pendingTrond Myklebust1-8/+19
Unlike the previous attempt, this takes into account the fact that we may be calling it from the recovery thread itself. Detect this by looking at what kind of open we're doing, and checking the state of the NFS_DELEGATION_NEED_RECLAIM if it turns out we're doing a reboot reclaim-type open. Cc: Olga Kornievskaia <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2015-08-19pNFS: Fix an unused variable warning in pnfs_roc_get_barrierTrond Myklebust1-2/+0
Signed-off-by: Trond Myklebust <[email protected]>
2015-08-19NFS41/flexfiles: update inode after write finishesPeng Tao1-0/+3
Otherwise we break fstest case tests/read_write/mctime.t Does files layout need the same fix as well? Cc: [email protected] # v4.0+ Cc: Anna Schumaker <[email protected]> Signed-off-by: Peng Tao <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2015-08-19NFS41: make sure sending LAYOUTRETURN before close if marked soPeng Tao1-23/+28
If layout is marked by NFS_LAYOUT_RETURN_BEFORE_CLOSE, we should always send LAYOUTRETURN before close, and we don't need to do ROC drain if we do send LAYOUTRETURN. Signed-off-by: Peng Tao <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2015-08-19Revert "NFSv4: Remove incorrect check in can_open_delegated()"Trond Myklebust1-0/+2
This reverts commit 4e379d36c050b0117b5d10048be63a44f5036115. This commit opens up a race between the recovery code and the open code. Reported-by: Olga Kornievskaia <[email protected]> Cc: [email protected] # v4.0+ Signed-off-by: Trond Myklebust <[email protected]>
2015-08-18NFSv4.1/pnfs: Play safe w.r.t. close() races when return-on-close is setTrond Myklebust1-5/+5
If we have an OPEN_DOWNGRADE and CLOSE race with one another, we want to ensure that the layout is forgotten by the client, so that we start afresh with a new layoutget. Signed-off-by: Trond Myklebust <[email protected]>
2015-08-18NFSv4.1/pnfs: Fix a close/delegreturn hang when return-on-close is setTrond Myklebust3-35/+8
The helper pnfs_roc() has already verified that we have no delegations, and no further open files, hence no outstanding I/O and it has marked all the return-on-close lsegs as being invalid. Furthermore, it sets the NFS_LAYOUT_RETURN bit, thus serialising the close/delegreturn with all future layoutget calls on this inode. The checks in pnfs_roc_drain() for valid layout segments are therefore redundant: those cannot exist until another layoutget completes. The other check for whether or not NFS_LAYOUT_RETURN is set, actually causes a hang, since we already know that we hold that flag. To fix, we therefore strip out all the functionality in pnfs_roc_drain() except the retrieval of the barrier state, and then rename the function accordingly. Reported-by: Christoph Hellwig <[email protected]> Fixes: 5c4a79fb2b1c ("Don't prevent layoutgets when doing return-on-close") Signed-off-by: Trond Myklebust <[email protected]>
2015-08-17NFS: Don't fsync twice for O_SYNC/IS_SYNC filesTrond Myklebust1-5/+3
generic_file_write_iter() will already do an fsync on our behalf if the file descriptor is O_SYNC or the file is marked as IS_SYNC. Signed-off-by: Trond Myklebust <[email protected]>
2015-08-17NFS: Don't let the ctime override attribute barriers.Trond Myklebust1-8/+0
Chuck reports seeing cases where a GETATTR that happens to race with an asynchronous WRITE is overriding the file size, despite the attribute barrier being set by the writeback code. The culprit turns out to be the check in nfs_ctime_need_update(), which sees that the ctime is newer than the cached ctime, and assumes that it is safe to override the attribute barrier. This patch removes that override, and ensures that attribute barriers are always respected. Reported-by: Chuck Lever <[email protected]> Fixes: a08a8cd375db9 ("NFS: Add attribute update barriers to NFS writebacks") Cc: [email protected] # v4.0+ Signed-off-by: Trond Myklebust <[email protected]>
2015-08-17Merge branch 'layoutfixes'Trond Myklebust2-48/+44
* layoutfixes: NFSv4.1/pnfs: Remove redundant wakeup in pnfs_send_layoutreturn() NFSv4.1/pnfs: Remove redundant check in pnfs_layoutgets_blocked() NFSv4.1/pnfs: Remove redundant lo->plh_block_lgets in layoutreturn NFSv4.1/pnfs: Don't prevent layoutgets when doing return-on-close NFSv4.1/pnfs: Fix serialisation of layout return and layoutget NFSv4.1/pnfs: Remove redundant checks in pnfs_layoutgets_blocked() pNFS: Tighten up locking around DS commit buckets
2015-08-17Merge branch 'bugfixes'Trond Myklebust2-16/+21
* bugfixes: SUNRPC: Fix a thinko in xs_connect() NFSv4.1/pNFS: Fix borken function _same_data_server_addrs_locked() NFS: nfs_set_pgio_error sometimes misses errors
2015-08-17Merge tag 'nfs-rdma-for-4.3' of git://git.linux-nfs.org/projects/anna/nfs-rdmaTrond Myklebust2-1/+4
NFS: NFS over RDMA Client Side Changes These patches improve both client performance and scalability, most notably by increasing the maixmum allowed rsize and wsize and by increasing the number of RDMA "credits". There are also several bugfixes, such as correcting how WRITE compounds are encoded and fixing large NFS symlink operations. Signed-off-by: Anna Schumaker <[email protected]>
2015-08-17NFS: Remove nfs_release()Anna Schumaker2-8/+3
And call nfs_file_clear_open_context() directly. This makes it obvious that nfs_file_release() will always return 0. Signed-off-by: Anna Schumaker <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2015-08-17NFS: Rename nfs_commit_unstable_pages() to nfs_write_inode()Anna Schumaker1-6/+1
All nfs_write_inode() does is pass its arguments to nfs_commit_unstable_pages(). Let's cut out the middle man and have nfs_write_pages() do the work directly. Signed-off-by: Anna Schumaker <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2015-08-17NFS: Remove nfs41_server_notify_{target|highest}_slotid_update()Anna Schumaker4-16/+4
All these functions do is call nfs41_ping_server() without adding anything. Let's remove them and give nfs41_ping_server() a better name instead. Signed-off-by: Anna Schumaker <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2015-08-17NFS: Combine nfs_idmap_{init|quit}() and nfs_idmap_{init|quit}_keyring()Anna Schumaker1-12/+2
The idmap_init() and idmap_quit() functions only exist to call the _keyring() version. Let's just call the keyring() functions directly. Signed-off-by: Anna Schumaker <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2015-08-17NFS: Use RPC functions for matching sockaddrsAnna Schumaker3-119/+3
They already exist and do the exact same thing. Let's save ourselves several lines of code! Signed-off-by: Anna Schumaker <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>