Age | Commit message (Collapse) | Author | Files | Lines |
|
We already set rc to this return code further down in the function but
we can set it earlier in order to suppress a smash warning.
Also fix a false positive for Coverity. The reason this is a false positive is
that this happens during umount after all files and directories have been closed
but mosetting on ->on_list to suppress the warning.
Reported-by: Dan carpenter <[email protected]>
Reported-by: coverity-bot <[email protected]>
Addresses-Coverity-ID: 1525256 ("Concurrent data access violations")
Fixes: a350d6e73f5e ("cifs: enable caching of directories for which a lease is held")
Signed-off-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD().
Using list_move() instead of list_del() and list_add().
Reviewed-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Yang Yingliang <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
If stardup the symlink target failed, should free the xid,
otherwise the xid will be leaked.
Fixes: 76894f3e2f71 ("cifs: improve symlink handling for smb2+")
Reviewed-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Zhang Xiaoxu <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
Before return, should free the xid, otherwise, the
xid will be leaked.
Fixes: d70e9fa55884 ("cifs: try opening channels after mounting")
Reviewed-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Zhang Xiaoxu <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
If not flock, before return -ENOLCK, should free the xid,
otherwise, the xid will be leaked.
Fixes: d0677992d2af ("cifs: add support for flock")
Reviewed-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Zhang Xiaoxu <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
If the file is used by swap, before return -EOPNOTSUPP, should
free the xid, otherwise, the xid will be leaked.
Fixes: 4e8aea30f775 ("smb3: enable swap on SMB3 mounts")
Reviewed-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Zhang Xiaoxu <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
If the cifs already shutdown, we should free the xid before return,
otherwise, the xid will be leaked.
Fixes: 087f757b0129 ("cifs: add shutdown support")
Reviewed-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Zhang Xiaoxu <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
s_inodes is superblock-specific resource, which should be
protected by sb's specific lock s_inode_list_lock.
Link: https://lore.kernel.org/r/TYCP286MB23238380DE3B74874E8D78ABCA299@TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM
Fixes: 7d41963759fe ("erofs: Support sharing cookies in the same domain")
Reviewed-by: Yue Hu <[email protected]>
Reviewed-by: Jia Zhu <[email protected]>
Reviewed-by: Jingbo Xu <[email protected]>
Signed-off-by: Dawei Li <[email protected]>
Signed-off-by: Gao Xiang <[email protected]>
|
|
Partial decompression should be checked after updating length.
It's a new regression when introducing multi-reference pclusters.
Fixes: 2bfab9c0edac ("erofs: record the longest decompressed size in this round")
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Gao Xiang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
If other duplicated copies exist in one decompression shot, should
leave the old page as is rather than replace it with the new duplicated
one. Otherwise, the following cold path to deal with duplicated copies
will use the invalid bvec. It impacts compressed data deduplication.
Also, shift the onlinepage EIO bit to avoid touching the signed bit.
Fixes: 267f2492c8f7 ("erofs: introduce multi-reference pclusters (fully-referenced)")
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Gao Xiang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Note that we are still accessing 'h_idata_size' and 'h_fragmentoff'
after calling erofs_put_metabuf(), that is not correct. Fix it.
Fixes: ab92184ff8f1 ("erofs: add on-disk compressed tail-packing inline support")
Fixes: b15b2e307c3a ("erofs: support on-disk compressed fragments data")
Signed-off-by: Yue Hu <[email protected]>
Reviewed-by: Gao Xiang <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Gao Xiang <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull more random number generator updates from Jason Donenfeld:
"This time with some large scale treewide cleanups.
The intent of this pull is to clean up the way callers fetch random
integers. The current rules for doing this right are:
- If you want a secure or an insecure random u64, use get_random_u64()
- If you want a secure or an insecure random u32, use get_random_u32()
The old function prandom_u32() has been deprecated for a while
now and is just a wrapper around get_random_u32(). Same for
get_random_int().
- If you want a secure or an insecure random u16, use get_random_u16()
- If you want a secure or an insecure random u8, use get_random_u8()
- If you want secure or insecure random bytes, use get_random_bytes().
The old function prandom_bytes() has been deprecated for a while
now and has long been a wrapper around get_random_bytes()
- If you want a non-uniform random u32, u16, or u8 bounded by a
certain open interval maximum, use prandom_u32_max()
I say "non-uniform", because it doesn't do any rejection sampling
or divisions. Hence, it stays within the prandom_*() namespace, not
the get_random_*() namespace.
I'm currently investigating a "uniform" function for 6.2. We'll see
what comes of that.
By applying these rules uniformly, we get several benefits:
- By using prandom_u32_max() with an upper-bound that the compiler
can prove at compile-time is ≤65536 or ≤256, internally
get_random_u16() or get_random_u8() is used, which wastes fewer
batched random bytes, and hence has higher throughput.
- By using prandom_u32_max() instead of %, when the upper-bound is
not a constant, division is still avoided, because
prandom_u32_max() uses a faster multiplication-based trick instead.
- By using get_random_u16() or get_random_u8() in cases where the
return value is intended to indeed be a u16 or a u8, we waste fewer
batched random bytes, and hence have higher throughput.
This series was originally done by hand while I was on an airplane
without Internet. Later, Kees and I worked on retroactively figuring
out what could be done with Coccinelle and what had to be done
manually, and then we split things up based on that.
So while this touches a lot of files, the actual amount of code that's
hand fiddled is comfortably small"
* tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
prandom: remove unused functions
treewide: use get_random_bytes() when possible
treewide: use get_random_u32() when possible
treewide: use get_random_{u8,u16}() when possible, part 2
treewide: use get_random_{u8,u16}() when possible, part 1
treewide: use prandom_u32_max() when possible, part 2
treewide: use prandom_u32_max() when possible, part 1
|
|
git://git.samba.org/sfrench/cifs-2.6
Pull more cifs updates from Steve French:
- fix a regression in guest mounts to old servers
- improvements to directory leasing (caching directory entries safely
beyond the root directory)
- symlink improvement (reducing roundtrips needed to process symlinks)
- an lseek fix (to problem where some dir entries could be skipped)
- improved ioctl for returning more detailed information on directory
change notifications
- clarify multichannel interface query warning
- cleanup fix (for better aligning buffers using ALIGN and round_up)
- a compounding fix
- fix some uninitialized variable bugs found by Coverity and the kernel
test robot
* tag '6.1-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
smb3: improve SMB3 change notification support
cifs: lease key is uninitialized in two additional functions when smb1
cifs: lease key is uninitialized in smb1 paths
smb3: must initialize two ACL struct fields to zero
cifs: fix double-fault crash during ntlmssp
cifs: fix static checker warning
cifs: use ALIGN() and round_up() macros
cifs: find and use the dentry for cached non-root directories also
cifs: enable caching of directories for which a lease is held
cifs: prevent copying past input buffer boundaries
cifs: fix uninitialised var in smb2_compound_op()
cifs: improve symlink handling for smb2+
smb3: clarify multichannel warning
cifs: fix regression in very old smb1 mounts
cifs: fix skipping to incorrect offset in emit_cached_dirents
|
|
Change notification is a commonly supported feature by most servers,
but the current ioctl to request notification when a directory is
changed does not return the information about what changed
(even though it is returned by the server in the SMB3 change
notify response), it simply returns when there is a change.
This ioctl improves upon CIFS_IOC_NOTIFY by returning the notify
information structure which includes the name of the file(s) that
changed and why. See MS-SMB2 2.2.35 for details on the individual
filter flags and the file_notify_information structure returned.
To use this simply pass in the following (with enough space
to fit at least one file_notify_information structure)
struct __attribute__((__packed__)) smb3_notify {
uint32_t completion_filter;
bool watch_tree;
uint32_t data_len;
uint8_t data[];
} __packed;
using CIFS_IOC_NOTIFY_INFO 0xc009cf0b
or equivalently _IOWR(CIFS_IOCTL_MAGIC, 11, struct smb3_notify_info)
The ioctl will block until the server detects a change to that
directory or its subdirectories (if watch_tree is set).
Acked-by: Paulo Alcantara (SUSE) <[email protected]>
Acked-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
cifs_open and _cifsFileInfo_put also end up with lease_key uninitialized
in smb1 mounts. It is cleaner to set lease key to zero in these
places where leases are not supported (smb1 can not return lease keys
so the field was uninitialized).
Addresses-Coverity: 1514207 ("Uninitialized scalar variable")
Addresses-Coverity: 1514331 ("Uninitialized scalar variable")
Reviewed-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
It is cleaner to set lease key to zero in the places where leases are not
supported (smb1 can not return lease keys so the field was uninitialized).
Addresses-Coverity: 1513994 ("Uninitialized scalar variable")
Reviewed-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
Coverity spotted that we were not initalizing Stbz1 and Stbz2 to
zero in create_sd_buf.
Addresses-Coverity: 1513848 ("Uninitialized scalar variable")
Cc: <[email protected]>
Reviewed-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
The crash occurred because we were calling memzero_explicit() on an
already freed sess_data::iov[1] (ntlmsspblob) in sess_free_buffer().
Fix this by not calling memzero_explicit() on sess_data::iov[1] as
it's already by handled by callers.
Fixes: a4e430c8c8ba ("cifs: replace kfree() with kfree_sensitive() for sensitive data")
Reviewed-by: Enzo Matsumiya <[email protected]>
Signed-off-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs
Pull UBI and UBIFS updates from Richard Weinberger:
"UBI:
- Use bitmap API to allocate bitmaps
- New attach mode, disable_fm, to attach without fastmap
- Fixes for various typos in comments
UBIFS:
- Fix for a deadlock when setting xattrs for encrypted file
- Fix for an assertion failures when truncating encrypted files
- Fixes for various typos in comments"
* tag 'for-linus-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
ubi: fastmap: Add fastmap control support for 'UBI_IOCATT' ioctl
ubi: fastmap: Use the bitmap API to allocate bitmaps
ubifs: Fix AA deadlock when setting xattr for encrypted file
ubifs: Fix UBIFS ro fail due to truncate in the encrypted directory
mtd: ubi: drop unexpected word 'a' in comments
ubi: block: Fix typos in comments
ubi: fastmap: Fix typo in comments
ubi: Fix repeated words in comments
ubi: ubi-media.h: Fix comment typo
ubi: block: Remove in vain semicolon
ubifs: Fix ubifs_check_dir_empty() kernel-doc comment
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux
Pull UML updates from Richard Weinberger:
- Move to strscpy()
- Improve panic notifiers
- Fix NR_CPUS usage
- Fixes for various comments
- Fixes for virtio driver
* tag 'for-linus-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux:
uml: Remove the initialization of statics to 0
um: Do not initialise statics to 0.
um: Fix comment typo
um: Improve panic notifiers consistency and ordering
um: remove unused reactivate_chan() declaration
um: mmaper: add __exit annotations to module exit funcs
um: virt-pci: add __init/__exit annotations to module init/exit funcs
hostfs: move from strlcpy with unused retval to strscpy
um: move from strlcpy with unused retval to strscpy
um: increase default virtual physical memory to 64 MiB
UM: cpuinfo: Fix a warning for CONFIG_CPUMASK_OFFSTACK
um: read multiple msg from virtio slave request fd
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more MM updates from Andrew Morton:
- fix a race which causes page refcounting errors in ZONE_DEVICE pages
(Alistair Popple)
- fix userfaultfd test harness instability (Peter Xu)
- various other patches in MM, mainly fixes
* tag 'mm-stable-2022-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (29 commits)
highmem: fix kmap_to_page() for kmap_local_page() addresses
mm/page_alloc: fix incorrect PGFREE and PGALLOC for high-order page
mm/selftest: uffd: explain the write missing fault check
mm/hugetlb: use hugetlb_pte_stable in migration race check
mm/hugetlb: fix race condition of uffd missing/minor handling
zram: always expose rw_page
LoongArch: update local TLB if PTE entry exists
mm: use update_mmu_tlb() on the second thread
kasan: fix array-bounds warnings in tests
hmm-tests: add test for migrate_device_range()
nouveau/dmem: evict device private memory during release
nouveau/dmem: refactor nouveau_dmem_fault_copy_one()
mm/migrate_device.c: add migrate_device_range()
mm/migrate_device.c: refactor migrate_vma and migrate_deivce_coherent_page()
mm/memremap.c: take a pgmap reference on page allocation
mm: free device private pages have zero refcount
mm/memory.c: fix race when faulting a device private page
mm/damon: use damon_sz_region() in appropriate place
mm/damon: move sz_damon_region to damon_sz_region
lib/test_meminit: add checks for the allocation functions
...
|
|
Remove unnecessary NULL check of oparam->cifs_sb when parsing symlink
error response as it's already set by all smb2_open_file() callers and
deferenced earlier.
This fixes below report:
fs/cifs/smb2file.c:126 smb2_open_file()
warn: variable dereferenced before check 'oparms->cifs_sb' (see line 112)
Link: https://lore.kernel.org/r/Y0kt42j2tdpYakRu@kili
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
Pull ceph updates from Ilya Dryomov:
"A quiet round this time: several assorted filesystem fixes, the most
noteworthy one being some additional wakeups in cap handling code, and
a messenger cleanup"
* tag 'ceph-for-6.1-rc1' of https://github.com/ceph/ceph-client:
ceph: remove Sage's git tree from documentation
ceph: fix incorrectly showing the .snap size for stat
ceph: fail the open_by_handle_at() if the dentry is being unlinked
ceph: increment i_version when doing a setattr with caps
ceph: Use kcalloc for allocating multiple elements
ceph: no need to wait for transition RDCACHE|RD -> RD
ceph: fail the request if the peer MDS doesn't support getvxattr op
ceph: wake up the waiters if any new caps comes
libceph: drop last_piece flag from ceph_msg_data_cursor
|
|
Pull NFS client updates from Anna Schumaker:
"New Features:
- Add NFSv4.2 xattr tracepoints
- Replace xprtiod WQ in rpcrdma
- Flexfiles cancels I/O on layout recall or revoke
Bugfixes and Cleanups:
- Directly use ida_alloc() / ida_free()
- Don't open-code max_t()
- Prefer using strscpy over strlcpy
- Remove unused forward declarations
- Always return layout states on flexfiles layout return
- Have LISTXATTR treat NFS4ERR_NOXATTR as an empty reply instead of
error
- Allow more xprtrdma memory allocations to fail without triggering a
reclaim
- Various other xprtrdma clean ups
- Fix rpc_killall_tasks() races"
* tag 'nfs-for-6.1-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (27 commits)
NFSv4/flexfiles: Cancel I/O if the layout is recalled or revoked
SUNRPC: Add API to force the client to disconnect
SUNRPC: Add a helper to allow pNFS drivers to selectively cancel RPC calls
SUNRPC: Fix races with rpc_killall_tasks()
xprtrdma: Fix uninitialized variable
xprtrdma: Prevent memory allocations from driving a reclaim
xprtrdma: Memory allocation should be allowed to fail during connect
xprtrdma: MR-related memory allocation should be allowed to fail
xprtrdma: Clean up synopsis of rpcrdma_regbuf_alloc()
xprtrdma: Clean up synopsis of rpcrdma_req_create()
svcrdma: Clean up RPCRDMA_DEF_GFP
SUNRPC: Replace the use of the xprtiod WQ in rpcrdma
NFSv4.2: Add a tracepoint for listxattr
NFSv4.2: Add tracepoints for getxattr, setxattr, and removexattr
NFSv4.2: Move TRACE_DEFINE_ENUM(NFS4_CONTENT_*) under CONFIG_NFS_V4_2
NFSv4.2: Add special handling for LISTXATTR receiving NFS4ERR_NOXATTR
nfs: remove nfs_wait_atomic_killable() and nfs_write_prepare() declaration
NFSv4: remove nfs4_renewd_prepare_shutdown() declaration
fs/nfs/pnfs_nfs.c: fix spelling typo and syntax error in comment
NFSv4/pNFS: Always return layout stats on layout return for flexfiles
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux
Pull orangefs update from Mike Marshall:
"Change iterate to iterate_shared"
* tag 'for-linus-6.1-ofs1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
Orangefs: change iterate to iterate_shared
|
|
This is a conditional tracepoint. Call it every time, not just when
nfs_permission fails.
Signed-off-by: Jeff Layton <[email protected]>
Signed-off-by: Chuck Lever <[email protected]>
|
|
Improve code readability by using existing macros:
Replace hardcoded alignment computations (e.g. (len + 7) & ~0x7) by
ALIGN()/IS_ALIGNED() macros.
Also replace (DIV_ROUND_UP(len, 8) * 8) with ALIGN(len, 8), which, if
not optimized by the compiler, has the overhead of a multiplication
and a division. Do the same for roundup() by replacing it by round_up()
(division-less version, but requires the multiple to be a power of 2,
which is always the case for us).
And remove some unnecessary checks where !IS_ALIGNED() would fit, but
calling round_up() directly is fine as it's a no-op if the value is
already aligned.
Signed-off-by: Enzo Matsumiya <[email protected]>
Reviewed-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
This allows us to use cached attributes for the entries in a cached
directory for as long as a lease is held on the directory itself.
Previously we have always allowed "used cached attributes for 1 second"
but this extends this to the lifetime of the lease as well as making the
caching safer.
Signed-off-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
This expands the directory caching to now cache an open handle for all
directories (up to a maximum) and not just the root directory.
In this patch, locking and refcounting is intended to work as so:
The main function to get a reference to a cached handle is
find_or_create_cached_dir() called from open_cached_dir()
These functions are protected under the cfid_list_lock spin-lock
to make sure we do not race creating new references for cached dirs
with deletion of expired ones.
An successful open_cached_dir() will take out 2 references to the cfid if
this was the very first and successful call to open the directory and
it acquired a lease from the server.
One reference is for the lease and the other is for the cfid that we
return. The is lease reference is tracked by cfid->has_lease.
If the directory already has a handle with an active lease, then we just
take out one new reference for the cfid and return it.
It can happen that we have a thread that tries to open a cached directory
where we have a cfid already but we do not, yet, have a working lease. In
this case we will just return NULL, and this the caller will fall back to
the case when no handle was available.
In this model the total number of references we have on a cfid is
1 for while the handle is open and we have a lease, and one additional
reference for each open instance of a cfid.
Once we get a lease break (cached_dir_lease_break()) we remove the
cfid from the list under the spinlock. This prevents any new threads to
use it, and we also call smb2_cached_lease_break() via the work_queue
in order to drop the reference we got for the lease (we drop it outside
of the spin-lock.)
Anytime a thread calls close_cached_dir() we also drop a reference to the
cfid.
When the last reference to the cfid is released smb2_close_cached_fid()
will be invoked which will drop the reference ot the dentry we held for
this cfid and it will also, if we the handle is open/has a lease
also call SMB2_close() to close the handle on the server.
Two events require special handling:
invalidate_all_cached_dirs() this function is called from SMB2_tdis()
and cifs_mark_open_files_invalid().
In both cases the tcon is either gone already or will be shortly so
we do not need to actually close the handles. They will be dropped
server side as part of the tcon dropping.
But we have to be careful about a potential race with a concurrent
lease break so we need to take out additional refences to avoid the
cfid from being freed while we are still referencing it.
free_cached_dirs() which is called from tconInfoFree().
This is called quite late in the umount process so there should no longer
be any open handles or files and we can just free all the remaining data.
Signed-off-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
Prevent copying past @data buffer in smb2_validate_and_copy_iov() as
the output buffer in @iov might be potentially bigger and thus copying
more bytes than requested in @minbufsize.
Signed-off-by: Paulo Alcantara (SUSE) <[email protected]>
Reviewed-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
Fix uninitialised variable @idata when calling smb2_compound_op() with
SMB2_OP_POSIX_QUERY_INFO.
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Paulo Alcantara (SUSE) <[email protected]>
Reviewed-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
When creating inode for symlink, the client used to send below
requests to fill it in:
* create+query_info+close (STATUS_STOPPED_ON_SYMLINK)
* create(+reparse_flag)+query_info+close (set file attrs)
* create+ioctl(get_reparse)+close (query reparse tag)
and then for every access to the symlink dentry, the ->link() method
would send another:
* create+ioctl(get_reparse)+close (parse symlink)
So, in order to improve:
(i) Get rid of unnecessary roundtrips and then resolve symlinks as
follows:
* create+query_info+close (STATUS_STOPPED_ON_SYMLINK +
parse symlink + get reparse tag)
* create(+reparse_flag)+query_info+close (set file attrs)
(ii) Set the resolved symlink target directly in inode->i_link and
use simple_get_link() for ->link() to simply return it.
Signed-off-by: Paulo Alcantara (SUSE) <[email protected]>
Reviewed-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
When server does not return network interfaces, clarify the
message to indicate that "multichannel not available" not just
that "empty network interface returned by server ..."
Suggested-by: Tom Talpey <[email protected]>
Reviewed-by: Bharath SM <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
BZ: 215375
Fixes: 76a3c92ec9e0 ("cifs: remove support for NTLM and weaker authentication algorithms")
Reviewed-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
The recent change of page_cache_ra_unbounded() arguments was buggy in the
two callers, causing us to readahead the wrong pages. Move the definition
of ractl down to after the index is set correctly. This affected
performance on configurations that use fs-verity.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 73bb49da50cd ("mm/readahead: make page_cache_ra_unbounded take a readahead_control")
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Reported-by: Jintao Yin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc hotfixes from Andrew Morton:
"Five hotfixes - three for nilfs2, two for MM. For are cc:stable, one
is not"
* tag 'mm-hotfixes-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
nilfs2: fix leak of nilfs_root in case of writer thread creation failure
nilfs2: fix NULL pointer dereference at nilfs_bmap_lookup_at_level()
nilfs2: fix use-after-free bug of struct nilfs_root
mm/damon/core: initialize damon_target->list in damon_new_target()
mm/hugetlb: fix races when looking up a CONT-PTE/PMD size hugetlb page
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- hfs and hfsplus kmap API modernization (Fabio Francesco)
- make crash-kexec work properly when invoked from an NMI-time panic
(Valentin Schneider)
- ntfs bugfixes (Hawkins Jiawei)
- improve IPC msg scalability by replacing atomic_t's with percpu
counters (Jiebin Sun)
- nilfs2 cleanups (Minghao Chi)
- lots of other single patches all over the tree!
* tag 'mm-nonmm-stable-2022-10-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits)
include/linux/entry-common.h: remove has_signal comment of arch_do_signal_or_restart() prototype
proc: test how it holds up with mapping'less process
mailmap: update Frank Rowand email address
ia64: mca: use strscpy() is more robust and safer
init/Kconfig: fix unmet direct dependencies
ia64: update config files
nilfs2: replace WARN_ONs by nilfs_error for checkpoint acquisition failure
fork: remove duplicate included header files
init/main.c: remove unnecessary (void*) conversions
proc: mark more files as permanent
nilfs2: remove the unneeded result variable
nilfs2: delete unnecessary checks before brelse()
checkpatch: warn for non-standard fixes tag style
usr/gen_init_cpio.c: remove unnecessary -1 values from int file
ipc/msg: mitigate the lock contention with percpu counter
percpu: add percpu_counter_add_local and percpu_counter_sub_local
fs/ocfs2: fix repeated words in comments
relay: use kvcalloc to alloc page array in relay_alloc_page_array
proc: make config PROC_CHILDREN depend on PROC_FS
fs: uninline inode_maybe_inc_iversion()
...
|
|
If nilfs_attach_log_writer() failed to create a log writer thread, it
frees a data structure of the log writer without any cleanup. After
commit e912a5b66837 ("nilfs2: use root object to get ifile"), this causes
a leak of struct nilfs_root, which started to leak an ifile metadata inode
and a kobject on that struct.
In addition, if the kernel is booted with panic_on_warn, the above
ifile metadata inode leak will cause the following panic when the
nilfs2 kernel module is removed:
kmem_cache_destroy nilfs2_inode_cache: Slab cache still has objects when
called from nilfs_destroy_cachep+0x16/0x3a [nilfs2]
WARNING: CPU: 8 PID: 1464 at mm/slab_common.c:494 kmem_cache_destroy+0x138/0x140
...
RIP: 0010:kmem_cache_destroy+0x138/0x140
Code: 00 20 00 00 e8 a9 55 d8 ff e9 76 ff ff ff 48 8b 53 60 48 c7 c6 20 70 65 86 48 c7 c7 d8 69 9c 86 48 8b 4c 24 28 e8 ef 71 c7 00 <0f> 0b e9 53 ff ff ff c3 48 81 ff ff 0f 00 00 77 03 31 c0 c3 53 48
...
Call Trace:
<TASK>
? nilfs_palloc_freev.cold.24+0x58/0x58 [nilfs2]
nilfs_destroy_cachep+0x16/0x3a [nilfs2]
exit_nilfs_fs+0xa/0x1b [nilfs2]
__x64_sys_delete_module+0x1d9/0x3a0
? __sanitizer_cov_trace_pc+0x1a/0x50
? syscall_trace_enter.isra.19+0x119/0x190
do_syscall_64+0x34/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
...
</TASK>
Kernel panic - not syncing: panic_on_warn set ...
This patch fixes these issues by calling nilfs_detach_log_writer() cleanup
function if spawning the log writer thread fails.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: e912a5b66837 ("nilfs2: use root object to get ifile")
Signed-off-by: Ryusuke Konishi <[email protected]>
Reported-by: [email protected]
Tested-by: Ryusuke Konishi <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
If the i_mode field in inode of metadata files is corrupted on disk, it
can cause the initialization of bmap structure, which should have been
called from nilfs_read_inode_common(), not to be called. This causes a
lockdep warning followed by a NULL pointer dereference at
nilfs_bmap_lookup_at_level().
This patch fixes these issues by adding a missing sanitiy check for the
i_mode field of metadata file's inode.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ryusuke Konishi <[email protected]>
Reported-by: [email protected]
Reported-by: Tetsuo Handa <[email protected]>
Tested-by: Ryusuke Konishi <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
If the beginning of the inode bitmap area is corrupted on disk, an inode
with the same inode number as the root inode can be allocated and fail
soon after. In this case, the subsequent call to nilfs_clear_inode() on
that bogus root inode will wrongly decrement the reference counter of
struct nilfs_root, and this will erroneously free struct nilfs_root,
causing kernel oopses.
This fixes the problem by changing nilfs_new_inode() to skip reserved
inode numbers while repairing the inode bitmap.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ryusuke Konishi <[email protected]>
Reported-by: [email protected]
Reported-by: Khalid Masum <[email protected]>
Tested-by: Ryusuke Konishi <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
If creation or finalization of a checkpoint fails due to anomalies in the
checkpoint metadata on disk, a kernel warning is generated.
This patch replaces the WARN_ONs by nilfs_error, so that a kernel, booted
with panic_on_warn, does not panic. A nilfs_error is appropriate here to
handle the abnormal filesystem condition.
This also replaces the detected error codes with an I/O error so that
neither of the internal error codes is returned to callers.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ryusuke Konishi <[email protected]>
Reported-by: [email protected]
Signed-off-by: Andrew Morton <[email protected]>
|
|
The prandom_bytes() function has been a deprecated inline wrapper around
get_random_bytes() for several releases now, and compiles down to the
exact same code. Replace the deprecated wrapper with a direct call to
the real function. This was done as a basic find and replace.
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Yury Norov <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]> # powerpc
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Jason A. Donenfeld <[email protected]>
|
|
The prandom_u32() function has been a deprecated inline wrapper around
get_random_u32() for several releases now, and compiles down to the
exact same code. Replace the deprecated wrapper with a direct call to
the real function. The same also applies to get_random_int(), which is
just a wrapper around get_random_u32(). This was done as a basic find
and replace.
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Yury Norov <[email protected]>
Reviewed-by: Jan Kara <[email protected]> # for ext4
Acked-by: Toke Høiland-Jørgensen <[email protected]> # for sch_cake
Acked-by: Chuck Lever <[email protected]> # for nfsd
Acked-by: Jakub Kicinski <[email protected]>
Acked-by: Mika Westerberg <[email protected]> # for thunderbolt
Acked-by: Darrick J. Wong <[email protected]> # for xfs
Acked-by: Helge Deller <[email protected]> # for parisc
Acked-by: Heiko Carstens <[email protected]> # for s390
Signed-off-by: Jason A. Donenfeld <[email protected]>
|
|
Rather than incurring a division or requesting too many random bytes for
the given range, use the prandom_u32_max() function, which only takes
the minimum required bytes from the RNG and avoids divisions. This was
done by hand, covering things that coccinelle could not do on its own.
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Yury Norov <[email protected]>
Reviewed-by: Jan Kara <[email protected]> # for ext2, ext4, and sbitmap
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Jason A. Donenfeld <[email protected]>
|
|
Rather than incurring a division or requesting too many random bytes for
the given range, use the prandom_u32_max() function, which only takes
the minimum required bytes from the RNG and avoids divisions. This was
done mechanically with this coccinelle script:
@basic@
expression E;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
typedef u64;
@@
(
- ((T)get_random_u32() % (E))
+ prandom_u32_max(E)
|
- ((T)get_random_u32() & ((E) - 1))
+ prandom_u32_max(E * XXX_MAKE_SURE_E_IS_POW2)
|
- ((u64)(E) * get_random_u32() >> 32)
+ prandom_u32_max(E)
|
- ((T)get_random_u32() & ~PAGE_MASK)
+ prandom_u32_max(PAGE_SIZE)
)
@multi_line@
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
identifier RAND;
expression E;
@@
- RAND = get_random_u32();
... when != RAND
- RAND %= (E);
+ RAND = prandom_u32_max(E);
// Find a potential literal
@literal_mask@
expression LITERAL;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
position p;
@@
((T)get_random_u32()@p & (LITERAL))
// Add one to the literal.
@script:python add_one@
literal << literal_mask.LITERAL;
RESULT;
@@
value = None
if literal.startswith('0x'):
value = int(literal, 16)
elif literal[0] in '123456789':
value = int(literal, 10)
if value is None:
print("I don't know how to handle %s" % (literal))
cocci.include_match(False)
elif value == 2**32 - 1 or value == 2**31 - 1 or value == 2**24 - 1 or value == 2**16 - 1 or value == 2**8 - 1:
print("Skipping 0x%x for cleanup elsewhere" % (value))
cocci.include_match(False)
elif value & (value + 1) != 0:
print("Skipping 0x%x because it's not a power of two minus one" % (value))
cocci.include_match(False)
elif literal.startswith('0x'):
coccinelle.RESULT = cocci.make_expr("0x%x" % (value + 1))
else:
coccinelle.RESULT = cocci.make_expr("%d" % (value + 1))
// Replace the literal mask with the calculated result.
@plus_one@
expression literal_mask.LITERAL;
position literal_mask.p;
expression add_one.RESULT;
identifier FUNC;
@@
- (FUNC()@p & (LITERAL))
+ prandom_u32_max(RESULT)
@collapse_ret@
type T;
identifier VAR;
expression E;
@@
{
- T VAR;
- VAR = (E);
- return VAR;
+ return E;
}
@drop_var@
type T;
identifier VAR;
@@
{
- T VAR;
... when != VAR
}
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Yury Norov <[email protected]>
Reviewed-by: KP Singh <[email protected]>
Reviewed-by: Jan Kara <[email protected]> # for ext4 and sbitmap
Reviewed-by: Christoph Böhmwalder <[email protected]> # for drbd
Acked-by: Jakub Kicinski <[email protected]>
Acked-by: Heiko Carstens <[email protected]> # for s390
Acked-by: Ulf Hansson <[email protected]> # for mmc
Acked-by: Darrick J. Wong <[email protected]> # for xfs
Signed-off-by: Jason A. Donenfeld <[email protected]>
|
|
When application has done lseek() to a different offset on a directory fd
we skipped one entry too many before we start emitting directory entries
from the cache.
We need to also make sure that when we are starting to emit directory
entries from the cache, the ->pos sequence might have holes and skip
some indices.
Signed-off-by: Ronnie Sahlberg <[email protected]>
Reviewed-by: Tom Talpey <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
syzbot is reporting UAF read at register_shrinker_prepared() [1], for
commit 7746b32f467b3813 ("NFSD: add shrinker to reap courtesy clients on
low memory condition") missed that nfsd4_leases_net_shutdown() from
nfsd_exit_net() is called only when nfsd_init_net() succeeded.
If nfsd_init_net() fails due to nfsd_reply_cache_init() failure,
register_shrinker() from nfsd4_init_leases_net() has to be undone
before nfsd_init_net() returns.
Link: https://syzkaller.appspot.com/bug?extid=ff796f04613b4c84ad89 [1]
Reported-by: syzbot <[email protected]>
Signed-off-by: Tetsuo Handa <[email protected]>
Fixes: 7746b32f467b3813 ("NFSD: add shrinker to reap courtesy clients on low memory condition")
Reviewed-by: Jeff Layton <[email protected]>
Signed-off-by: Chuck Lever <[email protected]>
|
|
The path cache used during fiemap used to determine the sharedness of
extent buffers in a path from a leaf containing a file extent item
pointing to our data extent up to the root node of the tree, is meant to
be used for a single path. Having a single path is by far the most common
case, and therefore worth to optimize for, but it's possible to actually
have multiple paths because we have 2 or more leaves.
If we have multiple leaves, the 'level' variable keeps getting incremented
in each iteration of the while loop at btrfs_is_data_extent_shared(),
which means we will treat the second leaf in the 'tmp' ulist as a level 1
node, and so forth. In the worst case this can lead to getting a level
greater than or equals to BTRFS_MAX_LEVEL (8), which will trigger a
WARN_ON_ONCE() in the functions to lookup from or store in the path cache
(lookup_backref_shared_cache() and store_backref_shared_cache()). If the
current level never goes beyond 8, due to shared nodes in the paths and
a fs tree height smaller than 8, it can still result in incorrectly
marking one leaf as shared because some other leaf is shared and is stored
one level below that other leaf, as when storing a true sharedness value
in the cache results in updating the sharedness to true of all entries in
the cache below the current level.
Having multiple leaves happens in a case like the following:
- We have a file extent item point to data extent at bytenr X, for
a file range [0, 1M[ for example;
- At this moment we have an extent data ref for the extent, with
an offset of 0 and a count of 1;
- A write into the middle of the extent happens, file range [64K, 128K)
so the file extent item is split into two (at btrfs_drop_extents()):
1) One for file range [0, 64K), with a length (num_bytes field) of
64K and an extent offset of 0;
2) Another one for file range [128K, 1M), with a length of 896K
(1M - 128K) and an extent offset of 128K.
- At this moment the two file extent items are located in the same
leaf;
- A new file extent item for the range [64K, 128K), pointing to a new
data extent, is inserted in the leaf. This results in a leaf split
and now those two file extent items pointing to data extent X end
up located in different leaves;
- Once delayed refs are run, we still have a single extent data ref
item for our data extent at bytenr X, for offset 0, but now with a
count of 2 instead of 1;
- So during fiemap, at btrfs_is_data_extent_shared(), after we call
find_parent_nodes() for the data extent, we get two leaves, since
we have two file extent items point to data extent at bytenr X that
are located in two different leaves.
So skip the use of the path cache when we get more than one leaf.
Fixes: 12a824dc67a61e ("btrfs: speedup checking for extent sharedness during fiemap")
Signed-off-by: Filipe Manana <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
During backref walking, when processing a delayed reference with a type of
BTRFS_TREE_BLOCK_REF_KEY, we have two bugs there:
1) We are accessing the delayed references extent_op, and its key, without
the protection of the delayed ref head's lock;
2) If there's no extent op for the delayed ref head, we end up with an
uninitialized key in the stack, variable 'tmp_op_key', and then pass
it to add_indirect_ref(), which adds the reference to the indirect
refs rb tree.
This is wrong, because indirect references should have a NULL key
when we don't have access to the key, and in that case they should be
added to the indirect_missing_keys rb tree and not to the indirect rb
tree.
This means that if have BTRFS_TREE_BLOCK_REF_KEY delayed ref resulting
from freeing an extent buffer, therefore with a count of -1, it will
not cancel out the corresponding reference we have in the extent tree
(with a count of 1), since both references end up in different rb
trees.
When using fiemap, where we often need to check if extents are shared
through shared subtrees resulting from snapshots, it means we can
incorrectly report an extent as shared when it's no longer shared.
However this is temporary because after the transaction is committed
the extent is no longer reported as shared, as running the delayed
reference results in deleting the tree block reference from the extent
tree.
Outside the fiemap context, the result is unpredictable, as the key was
not initialized but it's used when navigating the rb trees to insert
and search for references (prelim_ref_compare()), and we expect all
references in the indirect rb tree to have valid keys.
The following reproducer triggers the second bug:
$ cat test.sh
#!/bin/bash
DEV=/dev/sdj
MNT=/mnt/sdj
mkfs.btrfs -f $DEV
mount -o compress $DEV $MNT
# With a compressed 128M file we get a tree height of 2 (level 1 root).
xfs_io -f -c "pwrite -b 1M 0 128M" $MNT/foo
btrfs subvolume snapshot $MNT $MNT/snap
# Fiemap should output 0x2008 in the flags column.
# 0x2000 means shared extent
# 0x8 means encoded extent (because it's compressed)
echo
echo "fiemap after snapshot, range [120M, 120M + 128K):"
xfs_io -c "fiemap -v 120M 128K" $MNT/foo
echo
# Overwrite one extent and fsync to flush delalloc and COW a new path
# in the snapshot's tree.
#
# After this we have a BTRFS_DROP_DELAYED_REF delayed ref of type
# BTRFS_TREE_BLOCK_REF_KEY with a count of -1 for every COWed extent
# buffer in the path.
#
# In the extent tree we have inline references of type
# BTRFS_TREE_BLOCK_REF_KEY, with a count of 1, for the same extent
# buffers, so they should cancel each other, and the extent buffers in
# the fs tree should no longer be considered as shared.
#
echo "Overwriting file range [120M, 120M + 128K)..."
xfs_io -c "pwrite -b 128K 120M 128K" $MNT/snap/foo
xfs_io -c "fsync" $MNT/snap/foo
# Fiemap should output 0x8 in the flags column. The extent in the range
# [120M, 120M + 128K) is no longer shared, it's now exclusive to the fs
# tree.
echo
echo "fiemap after overwrite range [120M, 120M + 128K):"
xfs_io -c "fiemap -v 120M 128K" $MNT/foo
echo
umount $MNT
Running it before this patch:
$ ./test.sh
(...)
wrote 134217728/134217728 bytes at offset 0
128 MiB, 128 ops; 0.1152 sec (1.085 GiB/sec and 1110.5809 ops/sec)
Create a snapshot of '/mnt/sdj' in '/mnt/sdj/snap'
fiemap after snapshot, range [120M, 120M + 128K):
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [245760..246015]: 34304..34559 256 0x2008
Overwriting file range [120M, 120M + 128K)...
wrote 131072/131072 bytes at offset 125829120
128 KiB, 1 ops; 0.0001 sec (683.060 MiB/sec and 5464.4809 ops/sec)
fiemap after overwrite range [120M, 120M + 128K):
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [245760..246015]: 34304..34559 256 0x2008
The extent in the range [120M, 120M + 128K) is still reported as shared
(0x2000 bit set) after overwriting that range and flushing delalloc, which
is not correct - an entire path was COWed in the snapshot's tree and the
extent is now only referenced by the original fs tree.
Running it after this patch:
$ ./test.sh
(...)
wrote 134217728/134217728 bytes at offset 0
128 MiB, 128 ops; 0.1198 sec (1.043 GiB/sec and 1068.2067 ops/sec)
Create a snapshot of '/mnt/sdj' in '/mnt/sdj/snap'
fiemap after snapshot, range [120M, 120M + 128K):
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [245760..246015]: 34304..34559 256 0x2008
Overwriting file range [120M, 120M + 128K)...
wrote 131072/131072 bytes at offset 125829120
128 KiB, 1 ops; 0.0001 sec (694.444 MiB/sec and 5555.5556 ops/sec)
fiemap after overwrite range [120M, 120M + 128K):
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [245760..246015]: 34304..34559 256 0x8
Now the extent is not reported as shared anymore.
So fix this by passing a NULL key pointer to add_indirect_ref() when
processing a delayed reference for a tree block if there's no extent op
for our delayed ref head with a defined key. Also access the extent op
only after locking the delayed ref head's lock.
The reproducer will be converted later to a test case for fstests.
Fixes: 86d5f994425252 ("btrfs: convert prelimary reference tracking to use rbtrees")
Fixes: a6dbceafb915e8 ("btrfs: Remove unused op_key var from add_delayed_refs")
Signed-off-by: Filipe Manana <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
When processing delayed data references during backref walking and we are
using a share context (we are being called through fiemap), whenever we
find a delayed data reference for an inode different from the one we are
interested in, then we immediately exit and consider the data extent as
shared. This is wrong, because:
1) This might be a DROP reference that will cancel out a reference in the
extent tree;
2) Even if it's an ADD reference, it may be followed by a DROP reference
that cancels it out.
In either case we should not exit immediately.
Fix this by never exiting when we find a delayed data reference for
another inode - instead add the reference and if it does not cancel out
other delayed reference, we will exit early when we call
extent_is_shared() after processing all delayed references. If we find
a drop reference, then signal the code that processes references from
the extent tree (add_inline_refs() and add_keyed_refs()) to not exit
immediately if it finds there a reference for another inode, since we
have delayed drop references that may cancel it out. In this later case
we exit once we don't have references in the rb trees that cancel out
each other and have two references for different inodes.
Example reproducer for case 1):
$ cat test-1.sh
#!/bin/bash
DEV=/dev/sdj
MNT=/mnt/sdj
mkfs.btrfs -f $DEV
mount $DEV $MNT
xfs_io -f -c "pwrite 0 64K" $MNT/foo
cp --reflink=always $MNT/foo $MNT/bar
echo
echo "fiemap after cloning:"
xfs_io -c "fiemap -v" $MNT/foo
rm -f $MNT/bar
echo
echo "fiemap after removing file bar:"
xfs_io -c "fiemap -v" $MNT/foo
umount $MNT
Running it before this patch, the extent is still listed as shared, it has
the flag 0x2000 (FIEMAP_EXTENT_SHARED) set:
$ ./test-1.sh
fiemap after cloning:
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 26624..26751 128 0x2001
fiemap after removing file bar:
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 26624..26751 128 0x2001
Example reproducer for case 2):
$ cat test-2.sh
#!/bin/bash
DEV=/dev/sdj
MNT=/mnt/sdj
mkfs.btrfs -f $DEV
mount $DEV $MNT
xfs_io -f -c "pwrite 0 64K" $MNT/foo
cp --reflink=always $MNT/foo $MNT/bar
# Flush delayed references to the extent tree and commit current
# transaction.
sync
echo
echo "fiemap after cloning:"
xfs_io -c "fiemap -v" $MNT/foo
rm -f $MNT/bar
echo
echo "fiemap after removing file bar:"
xfs_io -c "fiemap -v" $MNT/foo
umount $MNT
Running it before this patch, the extent is still listed as shared, it has
the flag 0x2000 (FIEMAP_EXTENT_SHARED) set:
$ ./test-2.sh
fiemap after cloning:
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 26624..26751 128 0x2001
fiemap after removing file bar:
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 26624..26751 128 0x2001
After this patch, after deleting bar in both tests, the extent is not
reported with the 0x2000 flag anymore, it gets only the flag 0x1
(which is FIEMAP_EXTENT_LAST):
$ ./test-1.sh
fiemap after cloning:
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 26624..26751 128 0x2001
fiemap after removing file bar:
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 26624..26751 128 0x1
$ ./test-2.sh
fiemap after cloning:
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 26624..26751 128 0x2001
fiemap after removing file bar:
/mnt/sdj/foo:
EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS
0: [0..127]: 26624..26751 128 0x1
These tests will later be converted to a test case for fstests.
Fixes: dc046b10c8b7d4 ("Btrfs: make fiemap not blow when you have lots of snapshots")
Signed-off-by: Filipe Manana <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|