aboutsummaryrefslogtreecommitdiff
path: root/Documentation/filesystems
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/filesystems')
-rw-r--r--Documentation/filesystems/Locking4
-rw-r--r--Documentation/filesystems/autofs-mount-control.txt6
-rw-r--r--Documentation/filesystems/autofs.txt66
-rw-r--r--Documentation/filesystems/debugfs.txt16
-rw-r--r--Documentation/filesystems/mount_api.txt367
-rw-r--r--Documentation/filesystems/porting35
-rw-r--r--Documentation/filesystems/vfs.txt8
7 files changed, 300 insertions, 202 deletions
diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index efea228ccd8a..dac435575384 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -52,7 +52,7 @@ prototypes:
int (*rename) (struct inode *, struct dentry *,
struct inode *, struct dentry *, unsigned int);
int (*readlink) (struct dentry *, char __user *,int);
- const char *(*get_link) (struct dentry *, struct inode *, void **);
+ const char *(*get_link) (struct dentry *, struct inode *, struct delayed_call *);
void (*truncate) (struct inode *);
int (*permission) (struct inode *, int, unsigned int);
int (*get_acl)(struct inode *, int);
@@ -118,6 +118,7 @@ set: exclusive
--------------------------- super_operations ---------------------------
prototypes:
struct inode *(*alloc_inode)(struct super_block *sb);
+ void (*free_inode)(struct inode *);
void (*destroy_inode)(struct inode *);
void (*dirty_inode) (struct inode *, int flags);
int (*write_inode) (struct inode *, struct writeback_control *wbc);
@@ -139,6 +140,7 @@ locking rules:
All may block [not true, see below]
s_umount
alloc_inode:
+free_inode: called from RCU callback
destroy_inode:
dirty_inode:
write_inode:
diff --git a/Documentation/filesystems/autofs-mount-control.txt b/Documentation/filesystems/autofs-mount-control.txt
index 45edad6933cc..acc02fc57993 100644
--- a/Documentation/filesystems/autofs-mount-control.txt
+++ b/Documentation/filesystems/autofs-mount-control.txt
@@ -354,8 +354,10 @@ this ioctl is called until no further expire candidates are found.
The call requires an initialized struct autofs_dev_ioctl with the
ioctlfd field set to the descriptor obtained from the open call. In
-addition an immediate expire, independent of the mount timeout, can be
-requested by setting the how field of struct args_expire to 1. If no
+addition an immediate expire that's independent of the mount timeout,
+and a forced expire that's independent of whether the mount is busy,
+can be requested by setting the how field of struct args_expire to
+AUTOFS_EXP_IMMEDIATE or AUTOFS_EXP_FORCED, respectively . If no
expire candidates can be found the ioctl returns -1 with errno set to
EAGAIN.
diff --git a/Documentation/filesystems/autofs.txt b/Documentation/filesystems/autofs.txt
index 373ad25852d3..3af38c7fd26d 100644
--- a/Documentation/filesystems/autofs.txt
+++ b/Documentation/filesystems/autofs.txt
@@ -116,7 +116,7 @@ that purpose there is another flag.
**DCACHE_MANAGE_TRANSIT**
If a dentry has DCACHE_MANAGE_TRANSIT set then two very different but
-related behaviors are invoked, both using the `d_op->d_manage()`
+related behaviours are invoked, both using the `d_op->d_manage()`
dentry operation.
Firstly, before checking to see if any filesystem is mounted on the
@@ -193,8 +193,8 @@ VFS remain in RCU-walk mode, but can only tell it to get out of
RCU-walk mode by returning `-ECHILD`.
So `d_manage()`, when called with `rcu_walk` set, should either return
--ECHILD if there is any reason to believe it is unsafe to end the
-mounted filesystem, and otherwise should return 0.
+-ECHILD if there is any reason to believe it is unsafe to enter the
+mounted filesystem, otherwise it should return 0.
autofs will return `-ECHILD` if an expiry of the filesystem has been
initiated or is being considered, otherwise it returns 0.
@@ -210,7 +210,7 @@ mounts that were created by `d_automount()` returning a filesystem to be
mounted. As autofs doesn't return such a filesystem but leaves the
mounting to the automount daemon, it must involve the automount daemon
in unmounting as well. This also means that autofs has more control
-of expiry.
+over expiry.
The VFS also supports "expiry" of mounts using the MNT_EXPIRE flag to
the `umount` system call. Unmounting with MNT_EXPIRE will fail unless
@@ -225,7 +225,7 @@ unmount any filesystems mounted on the autofs filesystem or remove any
symbolic links or empty directories any time it likes. If the unmount
or removal is successful the filesystem will be returned to the state
it was before the mount or creation, so that any access of the name
-will trigger normal auto-mount processing. In particlar, `rmdir` and
+will trigger normal auto-mount processing. In particular, `rmdir` and
`unlink` do not leave negative entries in the dcache as a normal
filesystem would, so an attempt to access a recently-removed object is
passed to autofs for handling.
@@ -240,11 +240,18 @@ Normally the daemon only wants to remove entries which haven't been
used for a while. For this purpose autofs maintains a "`last_used`"
time stamp on each directory or symlink. For symlinks it genuinely
does record the last time the symlink was "used" or followed to find
-out where it points to. For directories the field is a slight
-misnomer. It actually records the last time that autofs checked if
-the directory or one of its descendents was busy and found that it
-was. This is just as useful and doesn't require updating the field so
-often.
+out where it points to. For directories the field is used slightly
+differently. The field is updated at mount time and during expire
+checks if it is found to be in use (ie. open file descriptor or
+process working directory) and during path walks. The update done
+during path walks prevents frequent expire and immediate mount of
+frequently accessed automounts. But in the case where a GUI continually
+access or an application frequently scans an autofs directory tree
+there can be an accumulation of mounts that aren't actually being
+used. To cater for this case the "`strictexpire`" autofs mount option
+can be used to avoid the "`last_used`" update on path walk thereby
+preventing this apparent inability to expire mounts that aren't
+really in use.
The daemon is able to ask autofs if anything is due to be expired,
using an `ioctl` as discussed later. For a *direct* mount, autofs
@@ -255,8 +262,12 @@ up.
There is an option with indirect mounts to consider each of the leaves
that has been mounted on instead of considering the top-level names.
-This is intended for compatability with version 4 of autofs and should
-be considered as deprecated.
+This was originally intended for compatibility with version 4 of autofs
+and should be considered as deprecated for Sun Format automount maps.
+However, it may be used again for amd format mount maps (which are
+generally indirect maps) because the amd automounter allows for the
+setting of an expire timeout for individual mounts. But there are
+some difficulties in making the needed changes for this.
When autofs considers a directory it checks the `last_used` time and
compares it with the "timeout" value set when the filesystem was
@@ -273,7 +284,7 @@ mounts. If it finds something in the root directory to expire it will
return the name of that thing. Once a name has been returned the
automount daemon needs to unmount any filesystems mounted below the
name normally. As described above, this is unsafe for non-toplevel
-mounts in a version-5 autofs. For this reason the current `automountd`
+mounts in a version-5 autofs. For this reason the current `automount(8)`
does not use this ioctl.
The second mechanism uses either the **AUTOFS_DEV_IOCTL_EXPIRE_CMD** or
@@ -345,7 +356,7 @@ The `wait_queue_token` is a unique number which can identify a
particular request to be acknowledged. When a message is sent over
the pipe the affected dentry is marked as either "active" or
"expiring" and other accesses to it block until the message is
-acknowledged using one of the ioctls below and the relevant
+acknowledged using one of the ioctls below with the relevant
`wait_queue_token`.
Communicating with autofs: root directory ioctls
@@ -367,15 +378,14 @@ The available ioctl commands are:
This mode is also entered if a write to the pipe fails.
- **AUTOFS_IOC_PROTOVER**: This returns the protocol version in use.
- **AUTOFS_IOC_PROTOSUBVER**: Returns the protocol sub-version which
- is really a version number for the implementation. It is
- currently 2.
+ is really a version number for the implementation.
- **AUTOFS_IOC_SETTIMEOUT**: This passes a pointer to an unsigned
long. The value is used to set the timeout for expiry, and
the current timeout value is stored back through the pointer.
- **AUTOFS_IOC_ASKUMOUNT**: Returns, in the pointed-to `int`, 1 if
the filesystem could be unmounted. This is only a hint as
the situation could change at any instant. This call can be
- use to avoid a more expensive full unmount attempt.
+ used to avoid a more expensive full unmount attempt.
- **AUTOFS_IOC_EXPIRE**: as described above, this asks if there is
anything suitable to expire. A pointer to a packet:
@@ -400,6 +410,11 @@ The available ioctl commands are:
**AUTOFS_EXP_IMMEDIATE** causes `last_used` time to be ignored
and objects are expired if the are not in use.
+ **AUTOFS_EXP_FORCED** causes the in use status to be ignored
+ and objects are expired ieven if they are in use. This assumes
+ that the daemon has requested this because it is capable of
+ performing the umount.
+
**AUTOFS_EXP_LEAVES** will select a leaf rather than a top-level
name to expire. This is only safe when *maxproto* is 4.
@@ -415,7 +430,7 @@ which can be used to communicate directly with the autofs filesystem.
It requires CAP_SYS_ADMIN for access.
The `ioctl`s that can be used on this device are described in a separate
-document `autofs-mount-control.txt`, and are summarized briefly here.
+document `autofs-mount-control.txt`, and are summarised briefly here.
Each ioctl is passed a pointer to an `autofs_dev_ioctl` structure:
struct autofs_dev_ioctl {
@@ -511,6 +526,21 @@ directories.
Catatonic mode can only be left via the
**AUTOFS_DEV_IOCTL_OPENMOUNT_CMD** ioctl on the `/dev/autofs`.
+The "ignore" mount option
+-------------------------
+
+The "ignore" mount option can be used to provide a generic indicator
+to applications that the mount entry should be ignored when displaying
+mount information.
+
+In other OSes that provide autofs and that provide a mount list to user
+space based on the kernel mount list a no-op mount option ("ignore" is
+the one use on the most common OSes) is allowed so that autofs file
+system users can optionally use it.
+
+This is intended to be used by user space programs to exclude autofs
+mounts from consideration when reading the mounts list.
+
autofs, name spaces, and shared mounts
--------------------------------------
diff --git a/Documentation/filesystems/debugfs.txt b/Documentation/filesystems/debugfs.txt
index 4f45f71149cb..4a0a9c3f4af6 100644
--- a/Documentation/filesystems/debugfs.txt
+++ b/Documentation/filesystems/debugfs.txt
@@ -31,10 +31,10 @@ This call, if successful, will make a directory called name underneath the
indicated parent directory. If parent is NULL, the directory will be
created in the debugfs root. On success, the return value is a struct
dentry pointer which can be used to create files in the directory (and to
-clean it up at the end). A NULL return value indicates that something went
-wrong. If ERR_PTR(-ENODEV) is returned, that is an indication that the
-kernel has been built without debugfs support and none of the functions
-described below will work.
+clean it up at the end). An ERR_PTR(-ERROR) return value indicates that
+something went wrong. If ERR_PTR(-ENODEV) is returned, that is an
+indication that the kernel has been built without debugfs support and none
+of the functions described below will work.
The most general way to create a file within a debugfs directory is with:
@@ -48,8 +48,9 @@ should hold the file, data will be stored in the i_private field of the
resulting inode structure, and fops is a set of file operations which
implement the file's behavior. At a minimum, the read() and/or write()
operations should be provided; others can be included as needed. Again,
-the return value will be a dentry pointer to the created file, NULL for
-error, or ERR_PTR(-ENODEV) if debugfs support is missing.
+the return value will be a dentry pointer to the created file,
+ERR_PTR(-ERROR) on error, or ERR_PTR(-ENODEV) if debugfs support is
+missing.
Create a file with an initial size, the following function can be used
instead:
@@ -214,7 +215,8 @@ can be removed with:
void debugfs_remove(struct dentry *dentry);
-The dentry value can be NULL, in which case nothing will be removed.
+The dentry value can be NULL or an error value, in which case nothing will
+be removed.
Once upon a time, debugfs users were required to remember the dentry
pointer for every debugfs file they created so that all files could be
diff --git a/Documentation/filesystems/mount_api.txt b/Documentation/filesystems/mount_api.txt
index 944d1965e917..00ff0cfccfa7 100644
--- a/Documentation/filesystems/mount_api.txt
+++ b/Documentation/filesystems/mount_api.txt
@@ -12,11 +12,13 @@ CONTENTS
(4) Filesystem context security.
- (5) VFS filesystem context operations.
+ (5) VFS filesystem context API.
- (6) Parameter description.
+ (6) Superblock creation helpers.
- (7) Parameter helper functions.
+ (7) Parameter description.
+
+ (8) Parameter helper functions.
========
@@ -41,12 +43,15 @@ The creation of new mounts is now to be done in a multistep process:
(7) Destroy the context.
-To support this, the file_system_type struct gains a new field:
+To support this, the file_system_type struct gains two new fields:
int (*init_fs_context)(struct fs_context *fc);
+ const struct fs_parameter_description *parameters;
-which is invoked to set up the filesystem-specific parts of a filesystem
-context, including the additional space.
+The first is invoked to set up the filesystem-specific parts of a filesystem
+context, including the additional space, and the second points to the
+parameter description for validation at registration time and querying by a
+future system call.
Note that security initialisation is done *after* the filesystem is called so
that the namespaces may be adjusted first.
@@ -73,9 +78,9 @@ context. This is represented by the fs_context structure:
void *s_fs_info;
unsigned int sb_flags;
unsigned int sb_flags_mask;
+ unsigned int s_iflags;
+ unsigned int lsm_flags;
enum fs_context_purpose purpose:8;
- bool sloppy:1;
- bool silent:1;
...
};
@@ -141,6 +146,10 @@ The fs_context fields are as follows:
Which bits SB_* flags are to be set/cleared in super_block::s_flags.
+ (*) unsigned int s_iflags
+
+ These will be bitwise-OR'd with s->s_iflags when a superblock is created.
+
(*) enum fs_context_purpose
This indicates the purpose for which the context is intended. The
@@ -150,17 +159,6 @@ The fs_context fields are as follows:
FS_CONTEXT_FOR_SUBMOUNT -- New automatic submount of extant mount
FS_CONTEXT_FOR_RECONFIGURE -- Change an existing mount
- (*) bool sloppy
- (*) bool silent
-
- These are set if the sloppy or silent mount options are given.
-
- [NOTE] sloppy is probably unnecessary when userspace passes over one
- option at a time since the error can just be ignored if userspace deems it
- to be unimportant.
-
- [NOTE] silent is probably redundant with sb_flags & SB_SILENT.
-
The mount context is created by calling vfs_new_fs_context() or
vfs_dup_fs_context() and is destroyed with put_fs_context(). Note that the
structure is not refcounted.
@@ -342,28 +340,47 @@ number of operations used by the new mount code for this purpose:
It should return 0 on success or a negative error code on failure.
-=================================
-VFS FILESYSTEM CONTEXT OPERATIONS
-=================================
+==========================
+VFS FILESYSTEM CONTEXT API
+==========================
-There are four operations for creating a filesystem context and
-one for destroying a context:
+There are four operations for creating a filesystem context and one for
+destroying a context:
- (*) struct fs_context *vfs_new_fs_context(struct file_system_type *fs_type,
- struct dentry *reference,
- unsigned int sb_flags,
- unsigned int sb_flags_mask,
- enum fs_context_purpose purpose);
+ (*) struct fs_context *fs_context_for_mount(
+ struct file_system_type *fs_type,
+ unsigned int sb_flags);
- Create a filesystem context for a given filesystem type and purpose. This
- allocates the filesystem context, sets the superblock flags, initialises
- the security and calls fs_type->init_fs_context() to initialise the
- filesystem private data.
+ Allocate a filesystem context for the purpose of setting up a new mount,
+ whether that be with a new superblock or sharing an existing one. This
+ sets the superblock flags, initialises the security and calls
+ fs_type->init_fs_context() to initialise the filesystem private data.
- reference can be NULL or it may indicate the root dentry of a superblock
- that is going to be reconfigured (FS_CONTEXT_FOR_RECONFIGURE) or
- the automount point that triggered a submount (FS_CONTEXT_FOR_SUBMOUNT).
- This is provided as a source of namespace information.
+ fs_type specifies the filesystem type that will manage the context and
+ sb_flags presets the superblock flags stored therein.
+
+ (*) struct fs_context *fs_context_for_reconfigure(
+ struct dentry *dentry,
+ unsigned int sb_flags,
+ unsigned int sb_flags_mask);
+
+ Allocate a filesystem context for the purpose of reconfiguring an
+ existing superblock. dentry provides a reference to the superblock to be
+ configured. sb_flags and sb_flags_mask indicate which superblock flags
+ need changing and to what.
+
+ (*) struct fs_context *fs_context_for_submount(
+ struct file_system_type *fs_type,
+ struct dentry *reference);
+
+ Allocate a filesystem context for the purpose of creating a new mount for
+ an automount point or other derived superblock. fs_type specifies the
+ filesystem type that will manage the context and the reference dentry
+ supplies the parameters. Namespaces are propagated from the reference
+ dentry's superblock also.
+
+ Note that it's not a requirement that the reference dentry be of the same
+ filesystem type as fs_type.
(*) struct fs_context *vfs_dup_fs_context(struct fs_context *src_fc);
@@ -390,20 +407,6 @@ context pointer or a negative error code.
For the remaining operations, if an error occurs, a negative error code will be
returned.
- (*) int vfs_get_tree(struct fs_context *fc);
-
- Get or create the mountable root and superblock, using the parameters in
- the filesystem context to select/configure the superblock. This invokes
- the ->validate() op and then the ->get_tree() op.
-
- [NOTE] ->validate() could perhaps be rolled into ->get_tree() and
- ->reconfigure().
-
- (*) struct vfsmount *vfs_create_mount(struct fs_context *fc);
-
- Create a mount given the parameters in the specified filesystem context.
- Note that this does not attach the mount to anything.
-
(*) int vfs_parse_fs_param(struct fs_context *fc,
struct fs_parameter *param);
@@ -432,17 +435,80 @@ returned.
clear the pointer, but then becomes responsible for disposing of the
object.
- (*) int vfs_parse_fs_string(struct fs_context *fc, char *key,
+ (*) int vfs_parse_fs_string(struct fs_context *fc, const char *key,
const char *value, size_t v_size);
- A wrapper around vfs_parse_fs_param() that just passes a constant string.
+ A wrapper around vfs_parse_fs_param() that copies the value string it is
+ passed.
(*) int generic_parse_monolithic(struct fs_context *fc, void *data);
Parse a sys_mount() data page, assuming the form to be a text list
consisting of key[=val] options separated by commas. Each item in the
list is passed to vfs_mount_option(). This is the default when the
- ->parse_monolithic() operation is NULL.
+ ->parse_monolithic() method is NULL.
+
+ (*) int vfs_get_tree(struct fs_context *fc);
+
+ Get or create the mountable root and superblock, using the parameters in
+ the filesystem context to select/configure the superblock. This invokes
+ the ->get_tree() method.
+
+ (*) struct vfsmount *vfs_create_mount(struct fs_context *fc);
+
+ Create a mount given the parameters in the specified filesystem context.
+ Note that this does not attach the mount to anything.
+
+
+===========================
+SUPERBLOCK CREATION HELPERS
+===========================
+
+A number of VFS helpers are available for use by filesystems for the creation
+or looking up of superblocks.
+
+ (*) struct super_block *
+ sget_fc(struct fs_context *fc,
+ int (*test)(struct super_block *sb, struct fs_context *fc),
+ int (*set)(struct super_block *sb, struct fs_context *fc));
+
+ This is the core routine. If test is non-NULL, it searches for an
+ existing superblock matching the criteria held in the fs_context, using
+ the test function to match them. If no match is found, a new superblock
+ is created and the set function is called to set it up.
+
+ Prior to the set function being called, fc->s_fs_info will be transferred
+ to sb->s_fs_info - and fc->s_fs_info will be cleared if set returns
+ success (ie. 0).
+
+The following helpers all wrap sget_fc():
+
+ (*) int vfs_get_super(struct fs_context *fc,
+ enum vfs_get_super_keying keying,
+ int (*fill_super)(struct super_block *sb,
+ struct fs_context *fc))
+
+ This creates/looks up a deviceless superblock. The keying indicates how
+ many superblocks of this type may exist and in what manner they may be
+ shared:
+
+ (1) vfs_get_single_super
+
+ Only one such superblock may exist in the system. Any further
+ attempt to get a new superblock gets this one (and any parameter
+ differences are ignored).
+
+ (2) vfs_get_keyed_super
+
+ Multiple superblocks of this type may exist and they're keyed on
+ their s_fs_info pointer (for example this may refer to a
+ namespace).
+
+ (3) vfs_get_independent_super
+
+ Multiple independent superblocks of this type may exist. This
+ function never matches an existing one and always creates a new
+ one.
=====================
@@ -454,35 +520,22 @@ There's a core description struct that links everything together:
struct fs_parameter_description {
const char name[16];
- u8 nr_params;
- u8 nr_alt_keys;
- u8 nr_enums;
- bool ignore_unknown;
- bool no_source;
- const char *const *keys;
- const struct constant_table *alt_keys;
const struct fs_parameter_spec *specs;
const struct fs_parameter_enum *enums;
};
For example:
- enum afs_param {
+ enum {
Opt_autocell,
Opt_bar,
Opt_dyn,
Opt_foo,
Opt_source,
- nr__afs_params
};
static const struct fs_parameter_description afs_fs_parameters = {
.name = "kAFS",
- .nr_params = nr__afs_params,
- .nr_alt_keys = ARRAY_SIZE(afs_param_alt_keys),
- .nr_enums = ARRAY_SIZE(afs_param_enums),
- .keys = afs_param_keys,
- .alt_keys = afs_param_alt_keys,
.specs = afs_param_specs,
.enums = afs_param_enums,
};
@@ -494,28 +547,24 @@ The members are as follows:
The name to be used in error messages generated by the parse helper
functions.
- (2) u8 nr_params;
-
- The number of discrete parameter identifiers. This indicates the number
- of elements in the ->types[] array and also limits the values that may be
- used in the values that the ->keys[] array maps to.
-
- It is expected that, for example, two parameters that are related, say
- "acl" and "noacl" with have the same ID, but will be flagged to indicate
- that one is the inverse of the other. The value can then be picked out
- from the parse result.
+ (2) const struct fs_parameter_specification *specs;
- (3) const struct fs_parameter_specification *specs;
+ Table of parameter specifications, terminated with a null entry, where the
+ entries are of type:
- Table of parameter specifications, where the entries are of type:
-
- struct fs_parameter_type {
- enum fs_parameter_spec type:8;
- u8 flags;
+ struct fs_parameter_spec {
+ const char *name;
+ u8 opt;
+ enum fs_parameter_type type:8;
+ unsigned short flags;
};
- and the parameter identifier is the index to the array. 'type' indicates
- the desired value type and must be one of:
+ The 'name' field is a string to match exactly to the parameter key (no
+ wildcards, patterns and no case-independence) and 'opt' is the value that
+ will be returned by the fs_parser() function in the case of a successful
+ match.
+
+ The 'type' field indicates the desired value type and must be one of:
TYPE NAME EXPECTED VALUE RESULT IN
======================= ======================= =====================
@@ -525,85 +574,65 @@ The members are as follows:
fs_param_is_u32_octal 32-bit octal int result->uint_32
fs_param_is_u32_hex 32-bit hex int result->uint_32
fs_param_is_s32 32-bit signed int result->int_32
+ fs_param_is_u64 64-bit unsigned int result->uint_64
fs_param_is_enum Enum value name result->uint_32
fs_param_is_string Arbitrary string param->string
fs_param_is_blob Binary blob param->blob
fs_param_is_blockdev Blockdev path * Needs lookup
fs_param_is_path Path * Needs lookup
- fs_param_is_fd File descriptor param->file
-
- And each parameter can be qualified with 'flags':
-
- fs_param_v_optional The value is optional
- fs_param_neg_with_no If key name is prefixed with "no", it is false
- fs_param_neg_with_empty If value is "", it is false
- fs_param_deprecated The parameter is deprecated.
-
- For example:
-
- static const struct fs_parameter_spec afs_param_specs[nr__afs_params] = {
- [Opt_autocell] = { fs_param_is flag },
- [Opt_bar] = { fs_param_is_enum },
- [Opt_dyn] = { fs_param_is flag },
- [Opt_foo] = { fs_param_is_bool, fs_param_neg_with_no },
- [Opt_source] = { fs_param_is_string },
- };
+ fs_param_is_fd File descriptor result->int_32
Note that if the value is of fs_param_is_bool type, fs_parse() will try
to match any string value against "0", "1", "no", "yes", "false", "true".
- [!] NOTE that the table must be sorted according to primary key name so
- that ->keys[] is also sorted.
-
- (4) const char *const *keys;
-
- Table of primary key names for the parameters. There must be one entry
- per defined parameter. The table is optional if ->nr_params is 0. The
- table is just an array of names e.g.:
+ Each parameter can also be qualified with 'flags':
- static const char *const afs_param_keys[nr__afs_params] = {
- [Opt_autocell] = "autocell",
- [Opt_bar] = "bar",
- [Opt_dyn] = "dyn",
- [Opt_foo] = "foo",
- [Opt_source] = "source",
- };
-
- [!] NOTE that the table must be sorted such that the table can be searched
- with bsearch() using strcmp(). This means that the Opt_* values must
- correspond to the entries in this table.
-
- (5) const struct constant_table *alt_keys;
- u8 nr_alt_keys;
-
- Table of additional key names and their mappings to parameter ID plus the
- number of elements in the table. This is optional. The table is just an
- array of { name, integer } pairs, e.g.:
+ fs_param_v_optional The value is optional
+ fs_param_neg_with_no result->negated set if key is prefixed with "no"
+ fs_param_neg_with_empty result->negated set if value is ""
+ fs_param_deprecated The parameter is deprecated.
- static const struct constant_table afs_param_keys[] = {
- { "baz", Opt_bar },
- { "dynamic", Opt_dyn },
+ These are wrapped with a number of convenience wrappers:
+
+ MACRO SPECIFIES
+ ======================= ===============================================
+ fsparam_flag() fs_param_is_flag
+ fsparam_flag_no() fs_param_is_flag, fs_param_neg_with_no
+ fsparam_bool() fs_param_is_bool
+ fsparam_u32() fs_param_is_u32
+ fsparam_u32oct() fs_param_is_u32_octal
+ fsparam_u32hex() fs_param_is_u32_hex
+ fsparam_s32() fs_param_is_s32
+ fsparam_u64() fs_param_is_u64
+ fsparam_enum() fs_param_is_enum
+ fsparam_string() fs_param_is_string
+ fsparam_blob() fs_param_is_blob
+ fsparam_bdev() fs_param_is_blockdev
+ fsparam_path() fs_param_is_path
+ fsparam_fd() fs_param_is_fd
+
+ all of which take two arguments, name string and option number - for
+ example:
+
+ static const struct fs_parameter_spec afs_param_specs[] = {
+ fsparam_flag ("autocell", Opt_autocell),
+ fsparam_flag ("dyn", Opt_dyn),
+ fsparam_string ("source", Opt_source),
+ fsparam_flag_no ("foo", Opt_foo),
+ {}
};
- [!] NOTE that the table must be sorted such that strcmp() can be used with
- bsearch() to search the entries.
-
- The parameter ID can also be fs_param_key_removed to indicate that a
- deprecated parameter has been removed and that an error will be given.
- This differs from fs_param_deprecated where the parameter may still have
- an effect.
-
- Further, the behaviour of the parameter may differ when an alternate name
- is used (for instance with NFS, "v3", "v4.2", etc. are alternate names).
+ An addition macro, __fsparam() is provided that takes an additional pair
+ of arguments to specify the type and the flags for anything that doesn't
+ match one of the above macros.
(6) const struct fs_parameter_enum *enums;
- u8 nr_enums;
- Table of enum value names to integer mappings and the number of elements
- stored therein. This is of type:
+ Table of enum value names to integer mappings, terminated with a null
+ entry. This is of type:
struct fs_parameter_enum {
- u8 param_id;
+ u8 opt;
char name[14];
u8 value;
};
@@ -621,11 +650,6 @@ The members are as follows:
try to look the value up in the enum table and the result will be stored
in the parse result.
- (7) bool no_source;
-
- If this is set, fs_parse() will ignore any "source" parameter and not
- pass it to the filesystem.
-
The parser should be pointed to by the parser pointer in the file_system_type
struct as this will provide validation on registration (if
CONFIG_VALIDATE_FS_PARSER=y) and will allow the description to be queried from
@@ -650,9 +674,8 @@ process the parameters it is given.
int value;
};
- and it must be sorted such that it can be searched using bsearch() using
- strcmp(). If a match is found, the corresponding value is returned. If a
- match isn't found, the not_found value is returned instead.
+ If a match is found, the corresponding value is returned. If a match
+ isn't found, the not_found value is returned instead.
(*) bool validate_constant_table(const struct constant_table *tbl,
size_t tbl_size,
@@ -665,36 +688,36 @@ process the parameters it is given.
should just be set to lie inside the low-to-high range.
If all is good, true is returned. If the table is invalid, errors are
- logged to dmesg, the stack is dumped and false is returned.
+ logged to dmesg and false is returned.
+
+ (*) bool fs_validate_description(const struct fs_parameter_description *desc);
+
+ This performs some validation checks on a parameter description. It
+ returns true if the description is good and false if it is not. It will
+ log errors to dmesg if validation fails.
(*) int fs_parse(struct fs_context *fc,
- const struct fs_param_parser *parser,
+ const struct fs_parameter_description *desc,
struct fs_parameter *param,
- struct fs_param_parse_result *result);
+ struct fs_parse_result *result);
This is the main interpreter of parameters. It uses the parameter
- description (parser) to look up the name of the parameter to use and to
- convert that to a parameter ID (stored in result->key).
+ description to look up a parameter by key name and to convert that to an
+ option number (which it returns).
If successful, and if the parameter type indicates the result is a
boolean, integer or enum type, the value is converted by this function and
- the result stored in result->{boolean,int_32,uint_32}.
+ the result stored in result->{boolean,int_32,uint_32,uint_64}.
If a match isn't initially made, the key is prefixed with "no" and no
value is present then an attempt will be made to look up the key with the
prefix removed. If this matches a parameter for which the type has flag
- fs_param_neg_with_no set, then a match will be made and the value will be
- set to false/0/NULL.
-
- If the parameter is successfully matched and, optionally, parsed
- correctly, 1 is returned. If the parameter isn't matched and
- parser->ignore_unknown is set, then 0 is returned. Otherwise -EINVAL is
- returned.
-
- (*) bool fs_validate_description(const struct fs_parameter_description *desc);
+ fs_param_neg_with_no set, then a match will be made and result->negated
+ will be set to true.
- This is validates the parameter description. It returns true if the
- description is good and false if it is not.
+ If the parameter isn't matched, -ENOPARAM will be returned; if the
+ parameter is matched, but the value is erroneous, -EINVAL will be
+ returned; otherwise the parameter's option number will be returned.
(*) int fs_lookup_param(struct fs_context *fc,
struct fs_parameter *value,
diff --git a/Documentation/filesystems/porting b/Documentation/filesystems/porting
index cf43bc4dbf31..3bd1148d8bb6 100644
--- a/Documentation/filesystems/porting
+++ b/Documentation/filesystems/porting
@@ -638,3 +638,38 @@ in your dentry operations instead.
inode to d_splice_alias() will also do the right thing (equivalent of
d_add(dentry, NULL); return NULL;), so that kind of special cases
also doesn't need a separate treatment.
+--
+[strongly recommended]
+ take the RCU-delayed parts of ->destroy_inode() into a new method -
+ ->free_inode(). If ->destroy_inode() becomes empty - all the better,
+ just get rid of it. Synchronous work (e.g. the stuff that can't
+ be done from an RCU callback, or any WARN_ON() where we want the
+ stack trace) *might* be movable to ->evict_inode(); however,
+ that goes only for the things that are not needed to balance something
+ done by ->alloc_inode(). IOW, if it's cleaning up the stuff that
+ might have accumulated over the life of in-core inode, ->evict_inode()
+ might be a fit.
+
+ Rules for inode destruction:
+ * if ->destroy_inode() is non-NULL, it gets called
+ * if ->free_inode() is non-NULL, it gets scheduled by call_rcu()
+ * combination of NULL ->destroy_inode and NULL ->free_inode is
+ treated as NULL/free_inode_nonrcu, to preserve the compatibility.
+
+ Note that the callback (be it via ->free_inode() or explicit call_rcu()
+ in ->destroy_inode()) is *NOT* ordered wrt superblock destruction;
+ as the matter of fact, the superblock and all associated structures
+ might be already gone. The filesystem driver is guaranteed to be still
+ there, but that's it. Freeing memory in the callback is fine; doing
+ more than that is possible, but requires a lot of care and is best
+ avoided.
+--
+[mandatory]
+ DCACHE_RCUACCESS is gone; having an RCU delay on dentry freeing is the
+ default. DCACHE_NORCU opts out, and only d_alloc_pseudo() has any
+ business doing so.
+--
+[mandatory]
+ d_alloc_pseudo() is internal-only; uses outside of alloc_file_pseudo() are
+ very suspect (and won't work in modules). Such uses are very likely to
+ be misspelled d_alloc_anon().
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 761c6fd24a53..57fc576b1f3e 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -3,8 +3,6 @@
Original author: Richard Gooch <[email protected]>
- Last updated on June 24, 2007.
-
Copyright (C) 1999 Richard Gooch
Copyright (C) 2005 Pekka Enberg
@@ -465,6 +463,12 @@ otherwise noted.
argument. If request can't be handled without leaving RCU mode,
have it return ERR_PTR(-ECHILD).
+ If the filesystem stores the symlink target in ->i_link, the
+ VFS may use it directly without calling ->get_link(); however,
+ ->get_link() must still be provided. ->i_link must not be
+ freed until after an RCU grace period. Writing to ->i_link
+ post-iget() time requires a 'release' memory barrier.
+
readlink: this is now just an override for use by readlink(2) for the
cases when ->get_link uses nd_jump_link() or object is not in
fact a symlink. Normally filesystems should only implement