diff options
author | Linus Torvalds <[email protected]> | 2022-01-12 13:45:12 -0800 |
---|---|---|
committer | Linus Torvalds <[email protected]> | 2022-01-12 13:45:12 -0800 |
commit | 8834147f9505661859ce44549bf601e2a06bba7c (patch) | |
tree | d8f1086c626c77fceb100bd2fc5ea011e1212070 /Documentation/filesystems/caching/fscache.rst | |
parent | 8975f8974888b3cd25aa8cf9eba24edbb9230bb2 (diff) | |
parent | d7bdba1c81f7e7bad12c7c7ce55afa3c7b0821ef (diff) |
Merge tag 'fscache-rewrite-20220111' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull fscache rewrite from David Howells:
"This is a set of patches that rewrites the fscache driver and the
cachefiles driver, significantly simplifying the code compared to
what's upstream, removing the complex operation scheduling and object
state machine in favour of something much smaller and simpler.
The series is structured such that the first few patches disable
fscache use by the network filesystems using it, remove the cachefiles
driver entirely and as much of the fscache driver as can be got away
with without causing build failures in the network filesystems.
The patches after that recreate fscache and then cachefiles,
attempting to add the pieces in a logical order. Finally, the
filesystems are reenabled and then the very last patch changes the
documentation.
[!] Note: I have dropped the cifs patch for the moment, leaving local
caching in cifs disabled. I've been having trouble getting that
working. I think I have it done, but it needs more testing (there
seem to be some test failures occurring with v5.16 also from
xfstests), so I propose deferring that patch to the end of the
merge window.
WHY REWRITE?
============
Fscache's operation scheduling API was intended to handle sequencing
of cache operations, which were all required (where possible) to run
asynchronously in parallel with the operations being done by the
network filesystem, whilst allowing the cache to be brought online and
offline and to interrupt service for invalidation.
With the advent of the tmpfile capacity in the VFS, however, an
opportunity arises to do invalidation much more simply, without having
to wait for I/O that's actually in progress: Cachefiles can simply
create a tmpfile, cut over the file pointer for the backing object
attached to a cookie and abandon the in-progress I/O, dismissing it
upon completion.
Future work here would involve using Omar Sandoval's vfs_link() with
AT_LINK_REPLACE[1] to allow an extant file to be displaced by a new
hard link from a tmpfile as currently I have to unlink the old file
first.
These patches can also simplify the object state handling as I/O
operations to the cache don't all have to be brought to a stop in
order to invalidate a file. To that end, and with an eye on to writing
a new backing cache model in the future, I've taken the opportunity to
simplify the indexing structure.
I've separated the index cookie concept from the file cookie concept
by C type now. The former is now called a "volume cookie" (struct
fscache_volume) and there is a container of file cookies. There are
then just the two levels. All the index cookie levels are collapsed
into a single volume cookie, and this has a single printable string as
a key. For instance, an AFS volume would have a key of something like
"afs,example.com,1000555", combining the filesystem name, cell name
and volume ID. This is freeform, but must not have '/' chars in it.
I've also eliminated all pointers back from fscache into the network
filesystem. This required the duplication of a little bit of data in
the cookie (cookie key, coherency data and file size), but it's not
actually that much. This gets rid of problems with making sure we keep
netfs data structures around so that the cache can access them.
These patches mean that most of the code that was in the drivers
before is simply gone and those drivers are now almost entirely new
code. That being the case, there doesn't seem any particular reason to
try and maintain bisectability across it. Further, there has to be a
point in the middle where things are cut over as there's a single
point everything has to go through (ie. /dev/cachefiles) and it can't
be in use by two drivers at once.
ISSUES YET OUTSTANDING
======================
There are some issues still outstanding, unaddressed by this patchset,
that will need fixing in future patchsets, but that don't stop this
series from being usable:
(1) The cachefiles driver needs to stop using the backing filesystem's
metadata to store information about what parts of the cache are
populated. This is not reliable with modern extent-based
filesystems.
Fixing this is deferred to a separate patchset as it involves
negotiation with the network filesystem and the VM as to how much
data to download to fulfil a read - which brings me on to (2)...
(2) NFS (and CIFS with the dropped patch) do not take account of how
the cache would like I/O to be structured to meet its granularity
requirements. Previously, the cache used page granularity, which
was fine as the network filesystems also dealt in page
granularity, and the backing filesystem (ext4, xfs or whatever)
did whatever it did out of sight. However, we now have folios to
deal with and the cache will now have to store its own metadata to
track its contents.
The change I'm looking at making for cachefiles is to store
content bitmaps in one or more xattrs and making a bit in the map
correspond to something like a 256KiB block. However, the size of
an xattr and the fact that they have to be read/updated in one go
means that I'm looking at covering 1GiB of data per 512-byte map
and storing each map in an xattr. Cachefiles has the potential to
grow into a fully fledged filesystem of its very own if I'm not
careful.
However, I'm also looking at changing things even more radically
and going to a different model of how the cache is arranged and
managed - one that's more akin to the way, say, openafs does
things - which brings me on to (3)...
(3) The way cachefilesd does culling is very inefficient for large
caches and it would be better to move it into the kernel if I can
as cachefilesd has to keep asking the kernel if it can cull a
file. Changing the way the backend works would allow this to be
addressed.
BITS THAT MAY BE CONTROVERSIAL
==============================
There are some bits I've added that may be controversial:
(1) I've provided a flag, S_KERNEL_FILE, that cachefiles uses to check
if a files is already being used by some other kernel service
(e.g. a duplicate cachefiles cache in the same directory) and
reject it if it is. This isn't entirely necessary, but it helps
prevent accidental data corruption.
I don't want to use S_SWAPFILE as that has other effects, but
quite possibly swapon() should set S_KERNEL_FILE too.
Note that it doesn't prevent userspace from interfering, though
perhaps it should. (I have made it prevent a marked directory from
being rmdir-able).
(2) Cachefiles wants to keep the backing file for a cookie open whilst
we might need to write to it from network filesystem writeback.
The problem is that the network filesystem unuses its cookie when
its file is closed, and so we have nothing pinning the cachefiles
file open and it will get closed automatically after a short time
to avoid EMFILE/ENFILE problems.
Reopening the cache file, however, is a problem if this is being
done due to writeback triggered by exit(). Some filesystems will
oops if we try to open a file in that context because they want to
access current->fs or suchlike.
To get around this, I added the following:
(A) An inode flag, I_PINNING_FSCACHE_WB, to be set on a network
filesystem inode to indicate that we have a usage count on the
cookie caching that inode.
(B) A flag in struct writeback_control, unpinned_fscache_wb, that
is set when __writeback_single_inode() clears the last dirty
page from i_pages - at which point it clears
I_PINNING_FSCACHE_WB and sets this flag.
This has to be done here so that clearing I_PINNING_FSCACHE_WB
can be done atomically with the check of PAGECACHE_TAG_DIRTY
that clears I_DIRTY_PAGES.
(C) A function, fscache_set_page_dirty(), which if it is not set,
sets I_PINNING_FSCACHE_WB and calls fscache_use_cookie() to
pin the cache resources.
(D) A function, fscache_unpin_writeback(), to be called by
->write_inode() to unuse the cookie.
(E) A function, fscache_clear_inode_writeback(), to be called when
the inode is evicted, before clear_inode() is called. This
cleans up any lingering I_PINNING_FSCACHE_WB.
The network filesystem can then use these tools to make sure that
fscache_write_to_cache() can write locally modified data to the
cache as well as to the server.
For the future, I'm working on write helpers for netfs lib that
should allow this facility to be removed by keeping track of the
dirty regions separately - but that's incomplete at the moment and
is also going to be affected by folios, one way or another, since
it deals with pages"
Link: https://lore.kernel.org/all/[email protected]/
Tested-by: Dominique Martinet <[email protected]> # 9p
Tested-by: [email protected] # afs
Tested-by: Jeff Layton <[email protected]> # ceph
Tested-by: Dave Wysochanski <[email protected]> # nfs
Tested-by: Daire Byrne <[email protected]> # nfs
* tag 'fscache-rewrite-20220111' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: (67 commits)
9p, afs, ceph, nfs: Use current_is_kswapd() rather than gfpflags_allow_blocking()
fscache: Add a tracepoint for cookie use/unuse
fscache: Rewrite documentation
ceph: add fscache writeback support
ceph: conversion to new fscache API
nfs: Implement cache I/O by accessing the cache directly
nfs: Convert to new fscache volume/cookie API
9p: Copy local writes to the cache when writing to the server
9p: Use fscache indexing rewrite and reenable caching
afs: Skip truncation on the server of data we haven't written yet
afs: Copy local writes to the cache when writing to the server
afs: Convert afs to use the new fscache API
fscache, cachefiles: Display stat of culling events
fscache, cachefiles: Display stats of no-space events
cachefiles: Allow cachefiles to actually function
fscache, cachefiles: Store the volume coherency data
cachefiles: Implement the I/O routines
cachefiles: Implement cookie resize for truncate
cachefiles: Implement begin and end I/O operation
cachefiles: Implement backing file wrangling
...
Diffstat (limited to 'Documentation/filesystems/caching/fscache.rst')
-rw-r--r-- | Documentation/filesystems/caching/fscache.rst | 525 |
1 files changed, 154 insertions, 371 deletions
diff --git a/Documentation/filesystems/caching/fscache.rst b/Documentation/filesystems/caching/fscache.rst index 70de86922b6a..a74d7b052dc1 100644 --- a/Documentation/filesystems/caching/fscache.rst +++ b/Documentation/filesystems/caching/fscache.rst @@ -10,25 +10,25 @@ Overview This facility is a general purpose cache for network filesystems, though it could be used for caching other things such as ISO9660 filesystems too. -FS-Cache mediates between cache backends (such as CacheFS) and network +FS-Cache mediates between cache backends (such as CacheFiles) and network filesystems:: +---------+ - | | +--------------+ - | NFS |--+ | | - | | | +-->| CacheFS | - +---------+ | +----------+ | | /dev/hda5 | - | | | | +--------------+ - +---------+ +-->| | | - | | | |--+ - | AFS |----->| FS-Cache | - | | | |--+ - +---------+ +-->| | | - | | | | +--------------+ - +---------+ | +----------+ | | | - | | | +-->| CacheFiles | - | ISOFS |--+ | /var/cache | - | | +--------------+ + | | +--------------+ + | NFS |--+ | | + | | | +-->| CacheFS | + +---------+ | +----------+ | | /dev/hda5 | + | | | | +--------------+ + +---------+ +-------------->| | | + | | +-------+ | |--+ + | AFS |----->| | | FS-Cache | + | | | netfs |-->| |--+ + +---------+ +-->| lib | | | | + | | | | | | +--------------+ + +---------+ | +-------+ +----------+ | | | + | | | +-->| CacheFiles | + | 9P |--+ | /var/cache | + | | +--------------+ +---------+ Or to look at it another way, FS-Cache is a module that provides a caching @@ -84,101 +84,62 @@ then serving the pages out of that cache rather than the netfs inode because: one-off access of a small portion of it (such as might be done with the "file" program). -It instead serves the cache out in PAGE_SIZE chunks as and when requested by -the netfs('s) using it. +It instead serves the cache out in chunks as and when requested by the netfs +using it. FS-Cache provides the following facilities: - (1) More than one cache can be used at once. Caches can be selected + * More than one cache can be used at once. Caches can be selected explicitly by use of tags. - (2) Caches can be added / removed at any time. + * Caches can be added / removed at any time, even whilst being accessed. - (3) The netfs is provided with an interface that allows either party to + * The netfs is provided with an interface that allows either party to withdraw caching facilities from a file (required for (2)). - (4) The interface to the netfs returns as few errors as possible, preferring + * The interface to the netfs returns as few errors as possible, preferring rather to let the netfs remain oblivious. - (5) Cookies are used to represent indices, files and other objects to the - netfs. The simplest cookie is just a NULL pointer - indicating nothing - cached there. - - (6) The netfs is allowed to propose - dynamically - any index hierarchy it - desires, though it must be aware that the index search function is - recursive, stack space is limited, and indices can only be children of - indices. - - (7) Data I/O is done direct to and from the netfs's pages. The netfs - indicates that page A is at index B of the data-file represented by cookie - C, and that it should be read or written. The cache backend may or may - not start I/O on that page, but if it does, a netfs callback will be - invoked to indicate completion. The I/O may be either synchronous or - asynchronous. - - (8) Cookies can be "retired" upon release. At this point FS-Cache will mark - them as obsolete and the index hierarchy rooted at that point will get - recycled. - - (9) The netfs provides a "match" function for index searches. In addition to - saying whether a match was made or not, this can also specify that an - entry should be updated or deleted. - -(10) As much as possible is done asynchronously. - - -FS-Cache maintains a virtual indexing tree in which all indices, files, objects -and pages are kept. Bits of this tree may actually reside in one or more -caches:: - - FSDEF - | - +------------------------------------+ - | | - NFS AFS - | | - +--------------------------+ +-----------+ - | | | | - homedir mirror afs.org redhat.com - | | | - +------------+ +---------------+ +----------+ - | | | | | | - 00001 00002 00007 00125 vol00001 vol00002 - | | | | | - +---+---+ +-----+ +---+ +------+------+ +-----+----+ - | | | | | | | | | | | | | - PG0 PG1 PG2 PG0 XATTR PG0 PG1 DIRENT DIRENT DIRENT R/W R/O Bak - | | - PG0 +-------+ - | | - 00001 00003 - | - +---+---+ - | | | - PG0 PG1 PG2 - -In the example above, you can see two netfs's being backed: NFS and AFS. These -have different index hierarchies: - - * The NFS primary index contains per-server indices. Each server index is - indexed by NFS file handles to get data file objects. Each data file - objects can have an array of pages, but may also have further child - objects, such as extended attributes and directory entries. Extended - attribute objects themselves have page-array contents. - - * The AFS primary index contains per-cell indices. Each cell index contains - per-logical-volume indices. Each of volume index contains up to three - indices for the read-write, read-only and backup mirrors of those volumes. - Each of these contains vnode data file objects, each of which contains an - array of pages. - -The very top index is the FS-Cache master index in which individual netfs's -have entries. - -Any index object may reside in more than one cache, provided it only has index -children. Any index with non-index object children will be assumed to only -reside in one cache. + * There are three types of cookie: cache, volume and data file cookies. + Cache cookies represent the cache as a whole and are not normally visible + to the netfs; the netfs gets a volume cookie to represent a collection of + files (typically something that a netfs would get for a superblock); and + data file cookies are used to cache data (something that would be got for + an inode). + + * Volumes are matched using a key. This is a printable string that is used + to encode all the information that might be needed to distinguish one + superblock, say, from another. This would be a compound of things like + cell name or server address, volume name or share path. It must be a + valid pathname. + + * Cookies are matched using a key. This is a binary blob and is used to + represent the object within a volume (so the volume key need not form + part of the blob). This might include things like an inode number and + uniquifier or a file handle. + + * Cookie resources are set up and pinned by marking the cookie in-use. + This prevents the backing resources from being culled. Timed garbage + collection is employed to eliminate cookies that haven't been used for a + short while, thereby reducing resource overload. This is intended to be + used when a file is opened or closed. + + A cookie can be marked in-use multiple times simultaneously; each mark + must be unused. + + * Begin/end access functions are provided to delay cache withdrawal for the + duration of an operation and prevent structs from being freed whilst + we're looking at them. + + * Data I/O is done by asynchronous DIO to/from a buffer described by the + netfs using an iov_iter. + + * An invalidation facility is available to discard data from the cache and + to deal with I/O that's in progress that is accessing old data. + + * Cookies can be "retired" upon release, thereby causing the object to be + removed from the cache. The netfs API to FS-Cache can be found in: @@ -189,11 +150,6 @@ The cache backend API to FS-Cache can be found in: Documentation/filesystems/caching/backend-api.rst -A description of the internal representations and object state machine can be -found in: - - Documentation/filesystems/caching/object.rst - Statistical Information ======================= @@ -201,333 +157,162 @@ Statistical Information If FS-Cache is compiled with the following options enabled:: CONFIG_FSCACHE_STATS=y - CONFIG_FSCACHE_HISTOGRAM=y -then it will gather certain statistics and display them through a number of -proc files. +then it will gather certain statistics and display them through: -/proc/fs/fscache/stats ----------------------- + /proc/fs/fscache/stats - This shows counts of a number of events that can happen in FS-Cache: +This shows counts of a number of events that can happen in FS-Cache: +--------------+-------+-------------------------------------------------------+ |CLASS |EVENT |MEANING | +==============+=======+=======================================================+ -|Cookies |idx=N |Number of index cookies allocated | -+ +-------+-------------------------------------------------------+ -| |dat=N |Number of data storage cookies allocated | +|Cookies |n=N |Number of data storage cookies allocated | + +-------+-------------------------------------------------------+ -| |spc=N |Number of special cookies allocated | -+--------------+-------+-------------------------------------------------------+ -|Objects |alc=N |Number of objects allocated | -+ +-------+-------------------------------------------------------+ -| |nal=N |Number of object allocation failures | +| |v=N |Number of volume index cookies allocated | + +-------+-------------------------------------------------------+ -| |avl=N |Number of objects that reached the available state | -+ +-------+-------------------------------------------------------+ -| |ded=N |Number of objects that reached the dead state | -+--------------+-------+-------------------------------------------------------+ -|ChkAux |non=N |Number of objects that didn't have a coherency check | +| |vcol=N |Number of volume index key collisions | + +-------+-------------------------------------------------------+ -| |ok=N |Number of objects that passed a coherency check | -+ +-------+-------------------------------------------------------+ -| |upd=N |Number of objects that needed a coherency data update | -+ +-------+-------------------------------------------------------+ -| |obs=N |Number of objects that were declared obsolete | -+--------------+-------+-------------------------------------------------------+ -|Pages |mrk=N |Number of pages marked as being cached | -| |unc=N |Number of uncache page requests seen | +| |voom=N |Number of OOM events when allocating volume cookies | +--------------+-------+-------------------------------------------------------+ |Acquire |n=N |Number of acquire cookie requests seen | + +-------+-------------------------------------------------------+ -| |nul=N |Number of acq reqs given a NULL parent | -+ +-------+-------------------------------------------------------+ -| |noc=N |Number of acq reqs rejected due to no cache available | -+ +-------+-------------------------------------------------------+ | |ok=N |Number of acq reqs succeeded | + +-------+-------------------------------------------------------+ -| |nbf=N |Number of acq reqs rejected due to error | -+ +-------+-------------------------------------------------------+ | |oom=N |Number of acq reqs failed on ENOMEM | +--------------+-------+-------------------------------------------------------+ -|Lookups |n=N |Number of lookup calls made on cache backends | +|LRU |n=N |Number of cookies currently on the LRU | + +-------+-------------------------------------------------------+ -| |neg=N |Number of negative lookups made | +| |exp=N |Number of cookies expired off of the LRU | + +-------+-------------------------------------------------------+ -| |pos=N |Number of positive lookups made | +| |rmv=N |Number of cookies removed from the LRU | + +-------+-------------------------------------------------------+ -| |crt=N |Number of objects created by lookup | +| |drp=N |Number of LRU'd cookies relinquished/withdrawn | + +-------+-------------------------------------------------------+ -| |tmo=N |Number of lookups timed out and requeued | +| |at=N |Time till next LRU cull (jiffies) | ++--------------+-------+-------------------------------------------------------+ +|Invals |n=N |Number of invalidations | +--------------+-------+-------------------------------------------------------+ |Updates |n=N |Number of update cookie requests seen | + +-------+-------------------------------------------------------+ -| |nul=N |Number of upd reqs given a NULL parent | +| |rsz=N |Number of resize requests | + +-------+-------------------------------------------------------+ -| |run=N |Number of upd reqs granted CPU time | +| |rsn=N |Number of skipped resize requests | +--------------+-------+-------------------------------------------------------+ |Relinqs |n=N |Number of relinquish cookie requests seen | + +-------+-------------------------------------------------------+ -| |nul=N |Number of rlq reqs given a NULL parent | +| |rtr=N |Number of rlq reqs with retire=true | + +-------+-------------------------------------------------------+ -| |wcr=N |Number of rlq reqs waited on completion of creation | +| |drop=N |Number of cookies no longer blocking re-acquisition | +--------------+-------+-------------------------------------------------------+ -|AttrChg |n=N |Number of attribute changed requests seen | -+ +-------+-------------------------------------------------------+ -| |ok=N |Number of attr changed requests queued | -+ +-------+-------------------------------------------------------+ -| |nbf=N |Number of attr changed rejected -ENOBUFS | +|NoSpace |nwr=N |Number of write requests refused due to lack of space | + +-------+-------------------------------------------------------+ -| |oom=N |Number of attr changed failed -ENOMEM | +| |ncr=N |Number of create requests refused due to lack of space | + +-------+-------------------------------------------------------+ -| |run=N |Number of attr changed ops given CPU time | +| |cull=N |Number of objects culled to make space | +--------------+-------+-------------------------------------------------------+ -|Allocs |n=N |Number of allocation requests seen | +|IO |rd=N |Number of read operations in the cache | + +-------+-------------------------------------------------------+ -| |ok=N |Number of successful alloc reqs | -+ +-------+-------------------------------------------------------+ -| |wt=N |Number of alloc reqs that waited on lookup completion | -+ +-------+-------------------------------------------------------+ -| |nbf=N |Number of alloc reqs rejected -ENOBUFS | -+ +-------+-------------------------------------------------------+ -| |int=N |Number of alloc reqs aborted -ERESTARTSYS | -+ +-------+-------------------------------------------------------+ -| |ops=N |Number of alloc reqs submitted | -+ +-------+-------------------------------------------------------+ -| |owt=N |Number of alloc reqs waited for CPU time | -+ +-------+-------------------------------------------------------+ -| |abt=N |Number of alloc reqs aborted due to object death | -+--------------+-------+-------------------------------------------------------+ -|Retrvls |n=N |Number of retrieval (read) requests seen | -+ +-------+-------------------------------------------------------+ -| |ok=N |Number of successful retr reqs | -+ +-------+-------------------------------------------------------+ -| |wt=N |Number of retr reqs that waited on lookup completion | -+ +-------+-------------------------------------------------------+ -| |nod=N |Number of retr reqs returned -ENODATA | -+ +-------+-------------------------------------------------------+ -| |nbf=N |Number of retr reqs rejected -ENOBUFS | -+ +-------+-------------------------------------------------------+ -| |int=N |Number of retr reqs aborted -ERESTARTSYS | -+ +-------+-------------------------------------------------------+ -| |oom=N |Number of retr reqs failed -ENOMEM | -+ +-------+-------------------------------------------------------+ -| |ops=N |Number of retr reqs submitted | -+ +-------+-------------------------------------------------------+ -| |owt=N |Number of retr reqs waited for CPU time | -+ +-------+-------------------------------------------------------+ -| |abt=N |Number of retr reqs aborted due to object death | -+--------------+-------+-------------------------------------------------------+ -|Stores |n=N |Number of storage (write) requests seen | -+ +-------+-------------------------------------------------------+ -| |ok=N |Number of successful store reqs | -+ +-------+-------------------------------------------------------+ -| |agn=N |Number of store reqs on a page already pending storage | -+ +-------+-------------------------------------------------------+ -| |nbf=N |Number of store reqs rejected -ENOBUFS | -+ +-------+-------------------------------------------------------+ -| |oom=N |Number of store reqs failed -ENOMEM | -+ +-------+-------------------------------------------------------+ -| |ops=N |Number of store reqs submitted | -+ +-------+-------------------------------------------------------+ -| |run=N |Number of store reqs granted CPU time | -+ +-------+-------------------------------------------------------+ -| |pgs=N |Number of pages given store req processing time | -+ +-------+-------------------------------------------------------+ -| |rxd=N |Number of store reqs deleted from tracking tree | -+ +-------+-------------------------------------------------------+ -| |olm=N |Number of store reqs over store limit | -+--------------+-------+-------------------------------------------------------+ -|VmScan |nos=N |Number of release reqs against pages with no | -| | |pending store | -+ +-------+-------------------------------------------------------+ -| |gon=N |Number of release reqs against pages stored by | -| | |time lock granted | -+ +-------+-------------------------------------------------------+ -| |bsy=N |Number of release reqs ignored due to in-progress store| -+ +-------+-------------------------------------------------------+ -| |can=N |Number of page stores cancelled due to release req | -+--------------+-------+-------------------------------------------------------+ -|Ops |pend=N |Number of times async ops added to pending queues | -+ +-------+-------------------------------------------------------+ -| |run=N |Number of times async ops given CPU time | -+ +-------+-------------------------------------------------------+ -| |enq=N |Number of times async ops queued for processing | -+ +-------+-------------------------------------------------------+ -| |can=N |Number of async ops cancelled | -+ +-------+-------------------------------------------------------+ -| |rej=N |Number of async ops rejected due to object | -| | |lookup/create failure | -+ +-------+-------------------------------------------------------+ -| |ini=N |Number of async ops initialised | -+ +-------+-------------------------------------------------------+ -| |dfr=N |Number of async ops queued for deferred release | -+ +-------+-------------------------------------------------------+ -| |rel=N |Number of async ops released | -| | |(should equal ini=N when idle) | -+ +-------+-------------------------------------------------------+ -| |gc=N |Number of deferred-release async ops garbage collected | -+--------------+-------+-------------------------------------------------------+ -|CacheOp |alo=N |Number of in-progress alloc_object() cache ops | -+ +-------+-------------------------------------------------------+ -| |luo=N |Number of in-progress lookup_object() cache ops | -+ +-------+-------------------------------------------------------+ -| |luc=N |Number of in-progress lookup_complete() cache ops | -+ +-------+-------------------------------------------------------+ -| |gro=N |Number of in-progress grab_object() cache ops | -+ +-------+-------------------------------------------------------+ -| |upo=N |Number of in-progress update_object() cache ops | -+ +-------+-------------------------------------------------------+ -| |dro=N |Number of in-progress drop_object() cache ops | -+ +-------+-------------------------------------------------------+ -| |pto=N |Number of in-progress put_object() cache ops | -+ +-------+-------------------------------------------------------+ -| |syn=N |Number of in-progress sync_cache() cache ops | -+ +-------+-------------------------------------------------------+ -| |atc=N |Number of in-progress attr_changed() cache ops | -+ +-------+-------------------------------------------------------+ -| |rap=N |Number of in-progress read_or_alloc_page() cache ops | -+ +-------+-------------------------------------------------------+ -| |ras=N |Number of in-progress read_or_alloc_pages() cache ops | -+ +-------+-------------------------------------------------------+ -| |alp=N |Number of in-progress allocate_page() cache ops | -+ +-------+-------------------------------------------------------+ -| |als=N |Number of in-progress allocate_pages() cache ops | -+ +-------+-------------------------------------------------------+ -| |wrp=N |Number of in-progress write_page() cache ops | -+ +-------+-------------------------------------------------------+ -| |ucp=N |Number of in-progress uncache_page() cache ops | -+ +-------+-------------------------------------------------------+ -| |dsp=N |Number of in-progress dissociate_pages() cache ops | -+--------------+-------+-------------------------------------------------------+ -|CacheEv |nsp=N |Number of object lookups/creations rejected due to | -| | |lack of space | -+ +-------+-------------------------------------------------------+ -| |stl=N |Number of stale objects deleted | -+ +-------+-------------------------------------------------------+ -| |rtr=N |Number of objects retired when relinquished | -+ +-------+-------------------------------------------------------+ -| |cul=N |Number of objects culled | +| |wr=N |Number of write operations in the cache | +--------------+-------+-------------------------------------------------------+ +Netfslib will also add some stats counters of its own. -/proc/fs/fscache/histogram --------------------------- +Cache List +========== - :: +FS-Cache provides a list of cache cookies: - cat /proc/fs/fscache/histogram - JIFS SECS OBJ INST OP RUNS OBJ RUNS RETRV DLY RETRIEVLS - ===== ===== ========= ========= ========= ========= ========= + /proc/fs/fscache/cookies - This shows the breakdown of the number of times each amount of time - between 0 jiffies and HZ-1 jiffies a variety of tasks took to run. The - columns are as follows: +This will look something like:: - ========= ======================================================= - COLUMN TIME MEASUREMENT - ========= ======================================================= - OBJ INST Length of time to instantiate an object - OP RUNS Length of time a call to process an operation took - OBJ RUNS Length of time a call to process an object event took - RETRV DLY Time between an requesting a read and lookup completing - RETRIEVLS Time between beginning and end of a retrieval - ========= ======================================================= + # cat /proc/fs/fscache/caches + CACHE REF VOLS OBJS ACCES S NAME + ======== ===== ===== ===== ===== = =============== + 00000001 2 1 2123 1 A default - Each row shows the number of events that took a particular range of times. - Each step is 1 jiffy in size. The JIFS column indicates the particular - jiffy range covered, and the SECS field the equivalent number of seconds. +where the columns are: + ======= =============================================================== + COLUMN DESCRIPTION + ======= =============================================================== + CACHE Cache cookie debug ID (also appears in traces) + REF Number of references on the cache cookie + VOLS Number of volumes cookies in this cache + OBJS Number of cache objects in use + ACCES Number of accesses pinning the cache + S State + NAME Name of the cache. + ======= =============================================================== + +The state can be (-) Inactive, (P)reparing, (A)ctive, (E)rror or (W)ithdrawing. -Object List +Volume List =========== -If CONFIG_FSCACHE_OBJECT_LIST is enabled, the FS-Cache facility will maintain a -list of all the objects currently allocated and allow them to be viewed -through:: +FS-Cache provides a list of volume cookies: - /proc/fs/fscache/objects + /proc/fs/fscache/volumes This will look something like:: - [root@andromeda ~]# head /proc/fs/fscache/objects - OBJECT PARENT STAT CHLDN OPS OOP IPR EX READS EM EV F S | NETFS_COOKIE_DEF TY FL NETFS_DATA OBJECT_KEY, AUX_DATA - ======== ======== ==== ===== === === === == ===== == == = = | ================ == == ================ ================ - 17e4b 2 ACTV 0 0 0 0 0 0 7b 4 0 0 | NFS.fh DT 0 ffff88001dd82820 010006017edcf8bbc93b43298fdfbe71e50b57b13a172c0117f38472, e567634700000000000000000000000063f2404a000000000000000000000000c9030000000000000000000063f2404a - 1693a 2 ACTV 0 0 0 0 0 0 7b 4 0 0 | NFS.fh DT 0 ffff88002db23380 010006017edcf8bbc93b43298fdfbe71e50b57b1e0162c01a2df0ea6, 420ebc4a000000000000000000000000420ebc4a0000000000000000000000000e1801000000000000000000420ebc4a + VOLUME REF nCOOK ACC FL CACHE KEY + ======== ===== ===== === == =============== ================ + 00000001 55 54 1 00 default afs,example.com,100058 -where the first set of columns before the '|' describe the object: +where the columns are: ======= =============================================================== COLUMN DESCRIPTION ======= =============================================================== - OBJECT Object debugging ID (appears as OBJ%x in some debug messages) - PARENT Debugging ID of parent object - STAT Object state - CHLDN Number of child objects of this object - OPS Number of outstanding operations on this object - OOP Number of outstanding child object management operations - IPR - EX Number of outstanding exclusive operations - READS Number of outstanding read operations - EM Object's event mask - EV Events raised on this object - F Object flags - S Object work item busy state mask (1:pending 2:running) + VOLUME The volume cookie debug ID (also appears in traces) + REF Number of references on the volume cookie + nCOOK Number of cookies in the volume + ACC Number of accesses pinning the cache + FL Flags on the volume cookie + CACHE Name of the cache or "-" + KEY The indexing key for the volume ======= =============================================================== -and the second set of columns describe the object's cookie, if present: - - ================ ====================================================== - COLUMN DESCRIPTION - ================ ====================================================== - NETFS_COOKIE_DEF Name of netfs cookie definition - TY Cookie type (IX - index, DT - data, hex - special) - FL Cookie flags - NETFS_DATA Netfs private data stored in the cookie - OBJECT_KEY Object key } 1 column, with separating comma - AUX_DATA Object aux data } presence may be configured - ================ ====================================================== - -The data shown may be filtered by attaching the a key to an appropriate keyring -before viewing the file. Something like:: - - keyctl add user fscache:objlist <restrictions> @s - -where <restrictions> are a selection of the following letters: - == ========================================================= - K Show hexdump of object key (don't show if not given) - A Show hexdump of object aux data (don't show if not given) - == ========================================================= +Cookie List +=========== -and the following paired letters: +FS-Cache provides a list of cookies: - == ========================================================= - C Show objects that have a cookie - c Show objects that don't have a cookie - B Show objects that are busy - b Show objects that aren't busy - W Show objects that have pending writes - w Show objects that don't have pending writes - R Show objects that have outstanding reads - r Show objects that don't have outstanding reads - S Show objects that have work queued - s Show objects that don't have work queued - == ========================================================= + /proc/fs/fscache/cookies -If neither side of a letter pair is given, then both are implied. For example: +This will look something like:: - keyctl add user fscache:objlist KB @s + # head /proc/fs/fscache/cookies + COOKIE VOLUME REF ACT ACC S FL DEF + ======== ======== === === === = == ================ + 00000435 00000001 1 0 -1 - 08 0000000201d080070000000000000000, 0000000000000000 + 00000436 00000001 1 0 -1 - 00 0000005601d080080000000000000000, 0000000000000051 + 00000437 00000001 1 0 -1 - 08 00023b3001d0823f0000000000000000, 0000000000000000 + 00000438 00000001 1 0 -1 - 08 0000005801d0807b0000000000000000, 0000000000000000 + 00000439 00000001 1 0 -1 - 08 00023b3201d080a10000000000000000, 0000000000000000 + 0000043a 00000001 1 0 -1 - 08 00023b3401d080a30000000000000000, 0000000000000000 + 0000043b 00000001 1 0 -1 - 08 00023b3601d080b30000000000000000, 0000000000000000 + 0000043c 00000001 1 0 -1 - 08 00023b3801d080b40000000000000000, 0000000000000000 -shows objects that are busy, and lists their object keys, but does not dump -their auxiliary data. It also implies "CcWwRrSs", but as 'B' is given, 'b' is -not implied. +where the columns are: -By default all objects and all fields will be shown. + ======= =============================================================== + COLUMN DESCRIPTION + ======= =============================================================== + COOKIE The cookie debug ID (also appears in traces) + VOLUME The parent volume cookie debug ID + REF Number of references on the volume cookie + ACT Number of times the cookie is marked for in use + ACC Number of access pins in the cookie + S State of the cookie + FL Flags on the cookie + DEF Key, auxiliary data + ======= =============================================================== Debugging @@ -549,10 +334,8 @@ This is a bitmask of debugging streams to enable: 3 8 Cookie management Function entry trace 4 16 Function exit trace 5 32 General - 6 64 Page handling Function entry trace - 7 128 Function exit trace - 8 256 General - 9 512 Operation management Function entry trace + 6-8 (Not used) + 9 512 I/O operation management Function entry trace 10 1024 Function exit trace 11 2048 General ======= ======= =============================== ======================= @@ -560,6 +343,6 @@ This is a bitmask of debugging streams to enable: The appropriate set of values should be OR'd together and the result written to the control file. For example:: - echo $((1|8|64)) >/sys/module/fscache/parameters/debug + echo $((1|8|512)) >/sys/module/fscache/parameters/debug will turn on all function entry debugging. |