path: root/fs/xfs/libxfs
Age | Commit message | Author | Files | Lines
2016-07-22 | Merge branch 'xfs-4.8-misc-fixes-4' into for-next | Dave Chinner | 1 | -30/+29
2016-07-22 | libxfs: directory node splitting does not have an extra block | Dave Chinner | 1 | -30/+29
xfsprogs source commit 4280e59dcbc4cd8e01585efe788a68eb378048e8

xfs_da3_split() has to handle all three versions of the directory/attribute btree structure. The attr tree is v1, the dir tree is v2 or v3. The main difference between the v1 and v2/3 trees is the way tree nodes are split - in the v1 tree we can require a double split to occur because the object to be inserted may be larger than the space made by splitting a leaf. In this case we need to do a double split - one to split the full leaf, then another to allocate an empty leaf block in the correct location for the new entry. This does not happen with dir (v2/v3) formats as the objects being inserted are always guaranteed to fit into the new space in the split blocks.

Indeed, for directories there *may* be an extra block on this buffer pointer. However, it's guaranteed not to be a leaf block (i.e. a directory data block) - the directory code only ever places hash index or free space blocks in this pointer (as a cursor of sorts), and so to use it as a directory data block will immediately corrupt the directory.

The problem is that the code assumes that there may be extra blocks that we need to link into the tree once we've split the root, but this is not true for either dir or attr trees, because the extra attr block is always consumed by the last node split before we split the root. Hence the linking in an extra block is always wrong at the root split level, and this manifests itself in repair as a directory corruption in a repaired directory, leaving the directory rebuild incomplete.

This is a dir v2 zero-day bug - it was in the initial dir v2 commit that was made back in February 1998.

Fix this by ensuring the linking of the blocks after the root split never tries to make use of the extra blocks that may be held in the cursor. They are held there for other purposes and should never be touched by the root splitting code.
Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-07-20 | Merge branch 'xfs-4.8-dir2-sf-fixes' into for-next | Dave Chinner | 3 | -75/+37
2016-07-20 | Merge branch 'xfs-4.8-misc-fixes-3' into for-next | Dave Chinner | 2 | -29/+45
2016-07-20 | xfs: remove __arch_pack | Christoph Hellwig | 1 | -1/+1
Instead we always declare struct xfs_dir2_sf_hdr as packed. That's the expected layout, and while most major architectures do the packing by default the new structure size and offset checker showed that not only the ARM old ABI got this wrong, but various minor embedded architectures did as well. [Verified that no code change on x86-64 results from this change] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-07-20 | xfs: kill xfs_dir2_inou_t | Christoph Hellwig | 3 | -57/+26
And use an array of unsigned char values directly to avoid problems with architectures that pad the size of structures. This also gets rid of the xfs_dir2_ino4_t and xfs_dir2_ino8_t types, and introduces new constants for the size of 4 and 8 bytes as well as the size difference between the two. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
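The idea of reading on-disk inode numbers from plain byte arrays can be sketched in userspace C as follows. This is only an illustration: `sf_get_ino`, `sf_ino_saving` and the `SF_INO*` constants are hypothetical names chosen for this sketch, not the kernel's identifiers. It relies on XFS storing its on-disk metadata in big-endian byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for this sketch, not the kernel's names. */
#define SF_INO4_SIZE 4
#define SF_INO8_SIZE 8
#define SF_INO_DIFF  (SF_INO8_SIZE - SF_INO4_SIZE)

/*
 * Decode a big-endian inode number stored as raw bytes.
 * Reading byte by byte needs no wrapper struct, so there is nothing for
 * the compiler to pad and no alignment requirement on `from`.
 */
static uint64_t sf_get_ino(const uint8_t *from, int is_8byte)
{
	size_t len = is_8byte ? SF_INO8_SIZE : SF_INO4_SIZE;
	uint64_t ino = 0;

	assert(from != NULL);
	for (size_t i = 0; i < len; i++)
		ino = (ino << 8) | from[i];
	return ino;
}

/* Bytes saved per stored inode number when the 4-byte form is sufficient. */
static size_t sf_ino_saving(int is_8byte)
{
	return is_8byte ? 0 : SF_INO_DIFF;
}
```

Because the field is addressed as bytes, its size is exactly 4 or 8 on every ABI, which is the property the wrapper structures could not guarantee.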
2016-07-20 | xfs: kill xfs_dir2_sf_off_t | Christoph Hellwig | 2 | -17/+10
Just use an array of two unsigned chars directly to avoid problems with architectures that pad the size of structures. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-07-20 | xfs: remove the magic numbers in xfs_btree_block-related len macros | Hou Tao | 1 | -25/+41
Replace the magic numbers with offsetof(...) and sizeof(...), and add two extra checks to xfs_check_ondisk_structs(). [dchinner: renamed header structures to be more descriptive] Signed-off-by: Hou Tao <houtao1@huawei.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
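A minimal sketch of deriving header lengths from the structure layout instead of hard-coding them. The structure below is a simplified stand-in, not the real xfs_btree_block, and every name in it is hypothetical. It uses C11 `static_assert` where the kernel would use its own build-time checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sibling-pointer headers: short (32-bit) and long (64-bit). */
struct blk_hdr_short {
	uint32_t leftsib;
	uint32_t rightsib;
};

struct blk_hdr_long {
	uint64_t leftsib;
	uint64_t rightsib;
};

/* Simplified block header; NOT the real on-disk structure. */
struct btree_block {
	uint32_t magic;
	uint16_t level;
	uint16_t numrecs;
	union {
		struct blk_hdr_short s;
		struct blk_hdr_long  l;
	} u;
};

/* Lengths follow the layout, so they stay correct if fields change. */
#define BLOCK_SHORT_LEN \
	(offsetof(struct btree_block, u) + sizeof(struct blk_hdr_short))
#define BLOCK_LONG_LEN \
	(offsetof(struct btree_block, u) + sizeof(struct blk_hdr_long))

/* Build-time checks in the spirit of an on-disk structure size checker. */
static_assert(BLOCK_SHORT_LEN == 16, "short-form header length changed");
static_assert(BLOCK_LONG_LEN == 24, "long-form header length changed");
```

If a field is later added or resized, the macros follow the new layout automatically and the assertions fail the build until the expected sizes are reviewed.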
2016-07-20 | xfs: indentation fix in xfs_btree_get_iroot() | Kaho Ng | 1 | -4/+4
The indentation in this function is different from the other functions. Convert the spaces to tabs to improve readability. Signed-off-by: Kaho Ng <ngkaho1234@gmail.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-06-21 | Merge branch 'xfs-4.8-misc-fixes-2' into for-next | Dave Chinner | 8 | -105/+116
2016-06-21 | xfs: refactor btree maxlevels computation | Darrick J. Wong | 4 | -27/+28
Create a common function to calculate the maximum height of a per-AG btree. This will eventually be used by the rmapbt and refcountbt code to calculate appropriate maxlevels values for each. This is important because the verifiers and the transaction block reservations depend on accurate estimates of how many blocks are needed to satisfy a btree split. We were mistakenly using the max bnobt height for all the btrees, which creates a dangerous situation since the larger records and keys in an rmapbt make it very possible that the rmapbt will be taller than the bnobt and so we can run out of transaction block reservation. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
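The height calculation described above can be sketched as follows. This is an assumption about the general shape of such a helper, not a copy of the kernel function: `btree_max_height`, `leaf_min` and `node_min` are hypothetical names. The worst case assumes every block holds only its minimum number of records or pointers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Worst-case height of a btree that must index `nrecs` records when
 * every leaf holds only `leaf_min` records and every interior node
 * holds only `node_min` pointers (the minimum fill allowed).
 */
static unsigned int btree_max_height(uint64_t nrecs,
				     unsigned int leaf_min,
				     unsigned int node_min)
{
	uint64_t blocks;
	unsigned int level = 1;

	/* node_min == 1 would never reduce the block count. */
	assert(leaf_min > 0 && node_min > 1);

	/* Number of leaf blocks needed, rounded up. */
	blocks = (nrecs + leaf_min - 1) / leaf_min;

	/* Add one level for each layer of nodes needed to index the layer below. */
	while (blocks > 1) {
		blocks = (blocks + node_min - 1) / node_min;
		level++;
	}
	return level;
}
```

With the same record count, a smaller minimum fill gives a taller tree: 1000 records at a minimum of 10 per block need 3 levels, but at a minimum of 4 per block they need 5. That is why sizing every btree from the bnobt limits can under-reserve for a tree with larger records.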
2016-06-21 | xfs: convert list of extents to free into a regular list | Darrick J. Wong | 2 | -34/+19
In struct xfs_bmap_free, convert the open-coded free extent list to a regular list, then use list_sort to sort it prior to processing. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-06-21 | xfs: separate freelist fixing into a separate helper | Dave Chinner | 2 | -30/+56
Break up xfs_free_extent() into a helper that fixes the freelist. This helper will be used subsequently to ensure the freelist is fixed up during deferred rmap processing. [darrick: refactor to put this at the head of the patchset] Signed-off-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-06-21 | xfs: rearrange xfs_bmap_add_free parameters | Darrick J. Wong | 4 | -14/+13
This is already in xfsprogs' libxfs, so port it to the kernel. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-06-01 | xfs: define XFS_IOC_FREEZE even if FIFREEZE is defined | Christoph Hellwig | 1 | -6/+2
And the same for XFS_IOC_THAW. Just because we now have a common version of the ioctl we still need to provide the old name for it for anyone using those. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-06-01 | xfs: make several functions static | Eric Sandeen | 4 | -12/+2
Al Viro noticed that xfs_lock_inodes should be static, and that led to ... a few more. These are just the easy ones, others require moving functions higher in source files, so that's not done here to keep this review simple. Signed-off-by: Eric Sandeen <sandeen@sandeen.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-05-26 | Merge tag 'xfs-for-linus-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs | Linus Torvalds | 8 | -201/+103
Pull xfs updates from Dave Chinner:

"A pretty average collection of fixes, cleanups and improvements in this request.

Summary:
- fixes for mount line parsing, sparse warnings, read-only compat feature remount behaviour
- allow fast path symlink lookups for inline symlinks.
- attribute listing cleanups
- writeback goes direct to bios rather than indirecting through bufferheads
- transaction allocation cleanup
- optimised kmem_realloc
- added configurable error handling for metadata write errors, changed default error handling behaviour from "retry forever" to "retry until unmount then fail"
- fixed several inode cluster writeback lookup vs reclaim race conditions
- fixed inode cluster writeback checking wrong inode after lookup
- fixed bugs where struct xfs_inode freeing wasn't actually RCU safe
- cleaned up inode reclaim tagging"

* tag 'xfs-for-linus-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (39 commits)
  xfs: fix warning in xfs_finish_page_writeback for non-debug builds
  xfs: move reclaim tagging functions
  xfs: simplify inode reclaim tagging interfaces
  xfs: rename variables in xfs_iflush_cluster for clarity
  xfs: xfs_iflush_cluster has range issues
  xfs: mark reclaimed inodes invalid earlier
  xfs: xfs_inode_free() isn't RCU safe
  xfs: optimise xfs_iext_destroy
  xfs: skip stale inodes in xfs_iflush_cluster
  xfs: fix inode validity check in xfs_iflush_cluster
  xfs: xfs_iflush_cluster fails to abort on error
  xfs: remove xfs_fs_evict_inode()
  xfs: add "fail at unmount" error handling configuration
  xfs: add configuration handlers for specific errors
  xfs: add configuration of error failure speed
  xfs: introduce table-based init for error behaviors
  xfs: add configurable error support to metadata buffers
  xfs: introduce metadata IO error class
  xfs: configurable error behavior via sysfs
  xfs: buffer ->bi_end_io function requires irq-safe lock
  ...
2016-05-20 | Merge branch 'xfs-4.7-inode-reclaim' into for-next | Dave Chinner | 1 | -8/+19
2016-05-20 | Merge branch 'xfs-4.7-misc-fixes' into for-next | Dave Chinner | 1 | -7/+3
2016-05-20 | Merge branch 'xfs-4.7-optimise-inline-symlinks' into for-next | Dave Chinner | 3 | -24/+48
2016-05-18 | xfs: optimise xfs_iext_destroy | Alex Lyakas | 1 | -8/+19
When unmounting XFS, we call:

  xfs_inode_free => xfs_idestroy_fork => xfs_iext_destroy

This goes over the whole indirection array and calls xfs_iext_irec_remove for each one of the erps (from the last one to the first one). As a result, we keep shrinking (reallocating actually) the indirection array until we shrink out all of its elements. When we have files with huge numbers of extents, umount takes 30-80 sec, depending on the number of files that XFS loaded and the number of indirection entries of each file. The unmount stack looks like:

  [<ffffffffc0b6d200>] xfs_iext_realloc_indirect+0x40/0x60 [xfs]
  [<ffffffffc0b6cd8e>] xfs_iext_irec_remove+0xee/0xf0 [xfs]
  [<ffffffffc0b6cdcd>] xfs_iext_destroy+0x3d/0xb0 [xfs]
  [<ffffffffc0b6cef6>] xfs_idestroy_fork+0xb6/0xf0 [xfs]
  [<ffffffffc0b87002>] xfs_inode_free+0xb2/0xc0 [xfs]
  [<ffffffffc0b87260>] xfs_reclaim_inode+0x250/0x340 [xfs]
  [<ffffffffc0b87583>] xfs_reclaim_inodes_ag+0x233/0x370 [xfs]
  [<ffffffffc0b8823d>] xfs_reclaim_inodes+0x1d/0x20 [xfs]
  [<ffffffffc0b96feb>] xfs_unmountfs+0x7b/0x1a0 [xfs]
  [<ffffffffc0b98e4d>] xfs_fs_put_super+0x2d/0x70 [xfs]
  [<ffffffff811e9e36>] generic_shutdown_super+0x76/0x100
  [<ffffffff811ea207>] kill_block_super+0x27/0x70
  [<ffffffff811ea519>] deactivate_locked_super+0x49/0x60
  [<ffffffff811eaaee>] deactivate_super+0x4e/0x70
  [<ffffffff81207593>] cleanup_mnt+0x43/0x90
  [<ffffffff81207632>] __cleanup_mnt+0x12/0x20
  [<ffffffff8108f8e7>] task_work_run+0xa7/0xe0
  [<ffffffff81014ff7>] do_notify_resume+0x97/0xb0
  [<ffffffff81717c6f>] int_signal+0x12/0x17

Further, this reallocation prevents us from freeing the extent list from a RCU callback as allocation can block. Hence if the extent list is in indirect format, optimise the freeing of the extent list to only use kmem_free calls by freeing entire extent buffer pages at a time, rather than extent by extent.
[dchinner: simplified freeing loop based on Christoph's suggestion] Signed-off-by: Alex Lyakas <alex@zadarastorage.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
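The freeing strategy described in this commit can be sketched in userspace C roughly as follows. The structures and the `ext_fork_destroy` helper are hypothetical stand-ins for the in-core fork and its indirection array, with `free()` standing in for kmem_free. The point is that teardown walks the array once and never reallocates it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one indirection record and its extent buffer. */
struct ext_irec {
	void *er_extbuf;
};

/* Hypothetical stand-in for a fork in indirect extent format. */
struct ext_fork {
	struct ext_irec *irec;   /* indirection array */
	size_t nlists;           /* number of entries in irec */
};

/*
 * Free every extent buffer, then the indirection array itself.
 * No allocation happens here, so nothing in this path can block on
 * memory. Returns the number of extent buffers that were freed.
 */
static size_t ext_fork_destroy(struct ext_fork *f)
{
	size_t freed = 0;

	assert(f != NULL);
	for (size_t i = 0; i < f->nlists; i++) {
		if (f->irec[i].er_extbuf) {
			free(f->irec[i].er_extbuf);
			freed++;
		}
	}
	free(f->irec);
	f->irec = NULL;
	f->nlists = 0;
	return freed;
}
```

Compared with removing one record at a time and shrinking the array after each removal, this does one free per buffer plus one for the array, so the cost is linear in the number of records.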
2016-04-06 | xfs: improve kmem_realloc | Christoph Hellwig | 1 | -7/+3
Use krealloc to implement our realloc function. This helps to avoid new allocations if we are still in the slab bucket. At least for the bmap btree root that's actually the common case. This also allows removing the now unused oldsize argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-04-06 | xfs: remove transaction types | Christoph Hellwig | 2 | -97/+5
These aren't used for CIL-style logging and can be dropped. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-04-06 | xfs: better xfs_trans_alloc interface | Christoph Hellwig | 4 | -65/+28
Merge xfs_trans_reserve and xfs_trans_alloc into a single function call that returns a transaction with all the required log and block reservations, and which allows passing transaction flags directly to avoid the cumbersome _xfs_trans_alloc interface. While we're at it we also get rid of the transaction type argument that has been superfluous since we stopped supporting the non-CIL logging mode. The guts of it will be removed in another patch. [dchinner: fixed transaction leak in error path in xfs_setattr_nonsize] Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-04-06 | xfs: optimize inline symlinks | Christoph Hellwig | 1 | -4/+18
By overallocating the in-core inode fork data buffer and zero terminating the link target in xfs_init_local_fork we can avoid the memory allocation in ->follow_link. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
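A minimal userspace sketch of the overallocate-and-terminate idea. `init_local_fork` here is a hypothetical helper using malloc, not the kernel's xfs_init_local_fork, and the `is_symlink` flag is an assumption standing in for the inode mode check.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy `size` bytes of inline fork data into a fresh buffer.
 * For symlinks, allocate one extra byte and NUL-terminate the copy so
 * the buffer can later be handed out as a C string without another
 * allocation. Returns NULL on allocation failure or for an empty
 * non-symlink fork.
 */
static char *init_local_fork(const void *data, size_t size, int is_symlink)
{
	size_t mem_size = size + (is_symlink ? 1 : 0);
	char *buf;

	if (mem_size == 0)
		return NULL;
	assert(size == 0 || data != NULL);

	buf = malloc(mem_size);
	if (!buf)
		return NULL;

	if (size)
		memcpy(buf, data, size);
	if (is_symlink)
		buf[size] = '\0';   /* the on-disk target has no terminator */
	return buf;
}
```

The one extra byte per inline symlink is the price paid for making every later lookup allocation-free.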
2016-04-06 | xfs: factor out a helper to initialize a local format inode fork | Christoph Hellwig | 3 | -24/+34
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-04-04 | mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros | Kirill A. Shutemov | 1 | -2/+2
PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time ago with the promise that one day it would be possible to implement the page cache with bigger chunks than PAGE_SIZE. This promise never materialized, and it is unlikely that it ever will.

We have many places where PAGE_CACHE_SIZE is assumed to be equal to PAGE_SIZE. And it's a constant source of confusion on whether a PAGE_CACHE_* or PAGE_* constant should be used in a particular case, especially on the border between fs and mm. Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much breakage to be doable.

Let's stop pretending that pages in the page cache are special. They are not.

The changes are pretty straight-forward:
 - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
 - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
 - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
 - page_cache_get() -> get_page();
 - page_cache_release() -> put_page();

This patch contains automated changes generated with coccinelle using the script below. For some reason, coccinelle doesn't patch header files. I've called spatch for them manually. The only adjustment after coccinelle is a revert of the changes to the PAGE_CACHE_ALIGN definition: we are going to drop it later. There are a few places in the code where coccinelle didn't reach. I'll fix them manually in a separate patch. Comments and documentation will also be addressed in a separate patch.

  virtual patch

  @@
  expression E;
  @@
  - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
  + E

  @@
  expression E;
  @@
  - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
  + E

  @@
  @@
  - PAGE_CACHE_SHIFT
  + PAGE_SHIFT

  @@
  @@
  - PAGE_CACHE_SIZE
  + PAGE_SIZE

  @@
  @@
  - PAGE_CACHE_MASK
  + PAGE_MASK

  @@
  expression E;
  @@
  - PAGE_CACHE_ALIGN(E)
  + PAGE_ALIGN(E)

  @@
  expression E;
  @@
  - page_cache_get(E)
  + get_page(E)

  @@
  expression E;
  @@
  - page_cache_release(E)
  + put_page(E)

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-15 | Merge branch 'xfs-misc-fixes-4.6-4' into for-next | Dave Chinner | 2 | -44/+117
2016-03-15 | xfs: always set rvalp in xfs_dir2_node_trim_free | Christoph Hellwig | 1 | -1/+3
xfs_dir2_node_trim_free can return without setting the rvalp argument pointer. Initialize it to 0 at the beginning of the function and only update it to 1 if we succeeded in trimming a freespace block. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
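The pattern is small enough to show in a few lines. `trim_free` and its `can_trim` parameter are hypothetical, chosen only to illustrate an out-parameter that is defined on every return path.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns 0 on success. *rvalp is always written: 0 if nothing was
 * trimmed, 1 if a block was trimmed. Setting the default first means
 * every early return leaves the caller with a defined value.
 */
static int trim_free(int can_trim, int *rvalp)
{
	assert(rvalp != NULL);
	*rvalp = 0;

	if (!can_trim)
		return 0;        /* early exit, *rvalp already valid */

	/* ... the actual trimming work would happen here ... */
	*rvalp = 1;
	return 0;
}
```

A caller can then test the result without having to pre-initialize its own variable.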
2016-03-15 | xfs: borrow indirect blocks from freed extent when available | Brian Foster | 1 | -10/+36
xfs_bmap_del_extent() handles extent removal from the in-core and on-disk extent lists. When removing a delalloc range, it updates the indirect block reservation appropriately based on the removal. It currently enforces that the new indirect block reservation is less than or equal to the original. This is normally the case in all situations except for in certain cases when the removed range creates a hole in a single delalloc extent, thus splitting a single delalloc extent in two. It is possible with small enough extents to split an indlen==1 extent into two such slightly smaller extents. This leaves one extent with 0 indirect blocks and leads to assert failures in other areas (e.g., xfs_bunmapi() if the extent happens to be removed). Update the indlen distribution code to steal blocks from the deleted extent, if necessary, to satisfy the worst case total indirect reservation for the new extents. This is safe as the caller does not update the fdblocks counters until the extent is removed. Blocks stolen in this manner simply remain accounted as allocated, having ownership transferred from the data extent to an indirect reservation. As a precaution, fall back to the original reservation algorithm if the new indlen requirement is not met and warn if we end up with extents without any reservation at all to detect this more easily in the future. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
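The distribution logic described above might look roughly like the sketch below. It is an illustration under stated assumptions, not the kernel's implementation: `split_indlen` and its parameters are hypothetical names. `ores` is the original reservation, `*indlen1` and `*indlen2` are the worst-case requirements of the two new extents, and `avail` is how many blocks may be taken from the range being removed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Split an indirect block reservation between two new extents.
 * Returns the number of blocks stolen from the removed range; on
 * return *indlen1 + *indlen2 == ores + stolen whenever the original
 * requirement exceeded ores.
 */
static uint64_t split_indlen(uint64_t ores, uint64_t *indlen1,
			     uint64_t *indlen2, uint64_t avail)
{
	uint64_t len1 = *indlen1;
	uint64_t len2 = *indlen2;
	uint64_t nres = len1 + len2;   /* blocks still to be covered by ores */
	uint64_t stolen = 0;

	/* Step 1: cover the shortfall with blocks from the removed range. */
	if (nres > ores) {
		stolen = nres - ores;
		if (stolen > avail)
			stolen = avail;
		nres -= stolen;
	}

	/* Step 2 (fallback): shave the two reservations alternately. */
	while (nres > ores) {
		if (len1) {
			len1--;
			nres--;
		}
		if (nres == ores)
			break;
		if (len2) {
			len2--;
			nres--;
		}
	}

	*indlen1 = len1;
	*indlen2 = len2;
	return stolen;
}
```

For the case in the commit message, an original reservation of 1 split into two extents that each need 1: with one stealable block both extents keep a reservation of 1, while with none available the fallback leaves one extent at 0, which is the situation the warning is meant to catch.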
2016-03-15 | xfs: refactor delalloc indlen reservation split into helper | Brian Foster | 1 | -19/+54
The delayed allocation indirect reservation splitting code is not sufficient in some cases where a delalloc extent is split in two. In preparation for enhancements to this code, refactor the current indlen distribution algorithm into a new helper function. [dchinner: rename temp, temp2 variables] Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-03-15 | xfs: update freeblocks counter after extent deletion | Brian Foster | 1 | -24/+34
xfs_bunmapi() currently updates the fdblocks counter, unreserves quota, etc. before the extent is deleted by xfs_bmap_del_extent(). The function has problems dividing up the indirect reserved blocks for scenarios where a single delalloc extent is split in two. Particularly, there aren't always enough blocks reserved for multiple extents in a single extent reservation. The solution to this problem is to allow the extent removal code to steal from the deleted extent to meet indirect reservation requirements. Move the block of code in xfs_bunmapi() that updates the fdblocks counter to after the call to xfs_bmap_del_extent() to allow the codepath to update the extent record before the free blocks are accounted. Also, reshuffle the code slightly so the delalloc accounting occurs near the xfs_bmap_del_extent() call to provide context for the comments. Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-03-09 | Merge branch 'xfs-misc-fixes-4.6-3' into for-next | Dave Chinner | 2 | -6/+3
2016-03-09 | xfs: remove impossible condition | Luis de Bethencourt | 1 | -4/+1
bp_release is set to 0 just before the breakpoint of the for loop before the conditional check (in line 458). The other breakpoint is a goto that skips the dead code. Addresses-Coverity-Id: 102338 Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-03-07 | Merge branch 'xfs-misc-fixes-4.6-2' into for-next | Dave Chinner | 2 | -5/+5
2016-03-07 | Merge branch 'xfs-gut-icdinode-4.6' into for-next | Dave Chinner | 7 | -81/+166
2016-03-07 | Merge branch 'xfs-misc-fixes-4.6' into for-next | Dave Chinner | 8 | -39/+42
2016-03-07 | Merge branch 'xfs-get-next-dquot-4.6' into for-next | Dave Chinner | 1 | -1/+2
2016-03-07 | xfs: fix computation of inode btree maxlevels | Darrick J. Wong | 1 | -2/+2
Commit 88740da18[1] introduced a function to compute the maximum height of the inode btree back in 1994. Back then, apparently, the freespace and inode btrees shared the same geometry; however, it has long since been the case that the inode and freespace btrees have different record and key sizes. Therefore, we must use m_inobt_mnr if we want a correct calculation/log reservation/etc. (Yes, this bug has been around for 21 years and ten months.) (Yes, I was in middle school when this bug was committed.) [1] http://oss.sgi.com/cgi-bin/gitweb.cgi?p=archive/xfs-import.git;a=commitdiff;h=88740da18ddd9d7ba3ebaa9502fefc6ef2fd19cd Historical-research-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-03-02 | xfs: remove xfs_trans_get_block_res | Christoph Hellwig | 2 | -5/+5
Just use the t_blk_res field directly instead of obfuscating the reference by a macro. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-02-09 | xfs: mode di_mode to vfs inode | Dave Chinner | 5 | -15/+14
Move the di_mode value from the xfs_icdinode to the VFS inode, reducing the xfs_icdinode by another 2 bytes and collapsing another 2 byte hole in the structure. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-02-09 | xfs: move di_changecount to VFS inode | Dave Chinner | 2 | -3/+2
We can store the di_changecount in the i_version field of the VFS inode and remove another 8 bytes from the xfs_icdinode. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-02-09 | xfs: move inode generation count to VFS inode | Dave Chinner | 2 | -5/+4
Pull another 4 bytes out of the xfs_icdinode. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-02-09 | xfs: use vfs inode nlink field everywhere | Dave Chinner | 2 | -4/+3
The VFS tracks the inode nlink just like the xfs_icdinode. We can remove the variable from the icdinode and use the VFS inode variable everywhere, reducing the size of the xfs_icdinode by a further 4 bytes. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-02-09 | xfs: move v1 inode conversion to xfs_inode_from_disk | Dave Chinner | 3 | -24/+21
So that we don't have to carry a di_onlink variable around anymore, move the inode conversion from v1 inode format to v2 inode format into xfs_inode_from_disk(). This means we can remove the di_onlink field from the struct xfs_icdinode. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-02-09 | xfs: cull unnecessary icdinode fields | Dave Chinner | 2 | -45/+19
Now that the struct xfs_icdinode is not directly related to the on-disk format, we can cull things in it we really don't need to store:
- magic number never changes
- padding is not necessary
- next_unlinked is never used
- inode number is redundant
- uuid is redundant
- lsn is accessed directly from dinode
- inode CRC is only accessed directly from dinode

Hence we can remove these from the struct xfs_icdinode and redirect the code that uses them to the xfs_dinode appropriately. This reduces the size of the struct icdinode from 152 bytes to 88 bytes, and removes a fair chunk of unnecessary code, too.

Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-02-09 | xfs: remove timestamps from incore inode | Dave Chinner | 3 | -18/+84
The struct xfs_inode has two copies of the current timestamps in it, one in the vfs inode and one in the struct xfs_icdinode. Now that we no longer log the struct xfs_icdinode directly, we don't need to keep the timestamps in this structure. Instead we can copy them straight out of the VFS inode when formatting the inode log item or the on-disk inode. This reduces the struct xfs_inode in size by 24 bytes. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-02-09 | xfs: introduce inode log format object | Dave Chinner | 3 | -12/+64
We currently carry around and log an entire inode core in the struct xfs_inode. A lot of the information in the inode core is duplicated in the VFS inode, but we cannot remove this duplication of information because the inode core is logged directly in xfs_inode_item_format(). Add a new function xfs_inode_item_format_core() that copies the inode core data into a struct xfs_icdinode that is pulled directly from the log vector buffer. This means we no longer directly copy the inode core, but copy the structures one member at a time. This will be slightly less efficient than copying, but will allow us to remove duplicate and unnecessary items from the struct xfs_inode. To enable us to do this, call the new structure an xfs_log_dinode, so that we know it's different to the physical xfs_dinode and the in-core xfs_icdinode. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-02-09 | xfs: RT bitmap and summary buffers need verifiers | Dave Chinner | 2 | -1/+27
Buffers without verifiers issue runtime warnings on XFS. We don't have anything we can actually verify in the RT buffers (no CRCs, no magic numbers, etc), but we still need verifiers to avoid the warnings. Add a set of dummy verifier operations for the realtime buffers and apply them in the appropriate places. Signed-off-by: Dave Chinner <dchinner@redhat.com> Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-02-09 | xfs: RT bitmap and summary buffers are not typed | Dave Chinner | 2 | -0/+5
When logging buffers, we attach a type to them that follows the buffer all the way into the log and is used to identify the buffer contents in log recovery. Both the realtime summary buffers and the bitmap buffers do not have types defined or set, so when we try to log them we see an assert failure:

  XFS: Assertion failed: (bip->bli_flags & XFS_BLI_STALE) || (xfs_blft_from_flags(&bip->__bli_format) > XFS_BLFT_UNKNOWN_BUF && xfs_blft_from_flags(&bip->__bli_format) < XFS_BLFT_MAX_BUF), file: fs/xfs/xfs_buf_item.c, line: 294

Fix this by adding buffer log format types for these buffers, and add identification support into log recovery for them. Only build the log recovery support if CONFIG_XFS_RT=y - we can't get into log recovery for real time filesystems if support is not built into the kernel, and this avoids potential build problems.

Signed-off-by: Dave Chinner <dchinner@redhat.com> Tested-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>