Age | Commit message | Author | Files | Lines
2016-07-18 | f2fs: avoid memory allocation failure due to a long length | Jaegeuk Kim | 1 | -18/+28
We need to avoid ENOMEM due to an unexpectedly long length. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-15 | f2fs: reset default idle interval value | Chao Yu | 1 | -1/+1
The default idle interval is 2 minutes, but most of the time, even after the screen is shut down, there are still operations within that 2-minute interval, and GC's sleep time is about 30 to 60 seconds, so the GC thread gets almost no chance to collect garbage. Change the default idle interval from 2 minutes to 5 seconds to fix this. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-15 | f2fs: use blk_plug in all the possible paths | Jaegeuk Kim | 6 | -2/+29
This patch reverts 19a5f5e2ef37 (f2fs: drop any block plugging), and adds blk_plug in write paths additionally. The main reason is that blk_start_plug can be used to wake up from low-power mode before submitting further bios. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-15 | f2fs: fix to avoid data update racing between GC and DIO | Chao Yu | 4 | -1/+28
Data in a file can be operated on by GC and DIO simultaneously, so we can hit the races below.

For write case:
Thread A                                Thread B
- generic_file_direct_write
 - invalidate_inode_pages2_range
 - f2fs_direct_IO
  - do_blockdev_direct_IO
   - do_direct_IO
    - get_more_blocks
                                        - f2fs_gc
                                         - do_garbage_collect
                                          - gc_data_segment
                                           - move_data_page
                                            - do_write_data_page
                                            migrate data block to new block address
   - dio_bio_submit
   update user data to old block address

For read case:
Thread A                                Thread B
- generic_file_direct_write
 - invalidate_inode_pages2_range
 - f2fs_direct_IO
  - do_blockdev_direct_IO
   - do_direct_IO
    - get_more_blocks
                                        - f2fs_balance_fs
                                         - f2fs_gc
                                          - do_garbage_collect
                                           - gc_data_segment
                                            - move_data_page
                                             - do_write_data_page
                                             migrate data block to new block address
                                         - write_checkpoint
                                          - do_checkpoint
                                           - clear_prefree_segments
                                            - f2fs_issue_discard
                                            discard old block address
   - dio_bio_submit
   update user buffer from obsolete block address

To fix this, DIO and GC should exclude each other when they operate on the same file. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
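The exclusion described above can be modeled in user space. This is only a sketch: the FileBlocks class, its per-file lock, and the dictionary standing in for the device are hypothetical illustrations and are not the kernel implementation. The point is that the block lookup and the write submission happen under the same lock that GC takes before migrating the block, so a write can never land on a stale address.

```python
import threading


class FileBlocks:
    """Toy model of one file with a single data block (hypothetical)."""

    def __init__(self, addr):
        self.lock = threading.Lock()  # hypothetical per-file DIO/GC lock
        self.addr = addr              # current block address of the data
        self.device = {}              # block address -> stored data

    def dio_write(self, data):
        # Lookup and submission are done under one lock, so GC cannot
        # migrate the block between the two steps.
        with self.lock:
            target = self.addr
            self.device[target] = data
            return target

    def gc_move(self, new_addr):
        # GC migrates the block while holding the same lock.
        with self.lock:
            old_addr = self.addr
            self.device[new_addr] = self.device.pop(old_addr, None)
            self.addr = new_addr
```

Without the lock, a writer could read the old address, lose the CPU while GC migrates the block, and then submit its data to the obsolete address, which is the write case shown in the message.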
2016-07-15 | f2fs: add maximum prefree segments | Jaegeuk Kim | 2 | -0/+4
In 1TB storage, we need to admit 22841 prefree segments, which can consume too many segments. This patch caps prefree segments at 8GB in that case. Signed-off-by: Jaegeuk Kim <[email protected]>
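The arithmetic behind these numbers can be checked with a short sketch. It assumes 2 MiB segments (the usual f2fs segment size, not stated in the message); the function names are made up for illustration.

```python
SEGMENT_SIZE_MIB = 2  # assumption: 2 MiB per segment


def prefree_size_gib(segments, segment_size_mib=SEGMENT_SIZE_MIB):
    """Space held by the given number of prefree segments, in GiB."""
    return segments * segment_size_mib / 1024


def capped_prefree_segments(wanted, cap_gib=8, segment_size_mib=SEGMENT_SIZE_MIB):
    """Limit the prefree segment count to the number that fits in cap_gib."""
    cap_segments = cap_gib * 1024 // segment_size_mib
    return min(wanted, cap_segments)
```

Under that assumption, 22841 prefree segments hold roughly 44.6 GiB, while an 8 GiB cap corresponds to 4096 segments.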
2016-07-15 | f2fs: disable extent_cache for fcollapse/finsert inodes | Jaegeuk Kim | 3 | -0/+19
This reduces the elapsed time to do xfstests/generic/017.
Before: 458 s
After: 390 s
Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-15 | f2fs: refactor __exchange_data_block for speed up | Jaegeuk Kim | 2 | -66/+171
This reduces the elapsed time to do xfstests/generic/017.
Before: 715 s
After: 458 s
Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-15 | f2fs: fix ERR_PTR returned by bio | Jaegeuk Kim | 1 | -1/+3
This is to fix wrong error pointer handling flow reported by Dan. Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-08 | f2fs: avoid mark_inode_dirty | Jaegeuk Kim | 10 | -44/+62
Let's check inode's dirtiness before calling mark_inode_dirty. Signed-off-by: Jaegeuk Kim <[email protected]>
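The pattern of checking a flag before taking an expensive path can be sketched as follows. The Inode class and its method names are hypothetical stand-ins, not the f2fs or VFS API; the counter only exists to show how often the costly call runs.

```python
class Inode:
    """Toy inode that tracks whether it is already dirty (hypothetical)."""

    def __init__(self):
        self.dirty = False
        self.mark_calls = 0  # how many times the expensive path ran

    def _mark_inode_dirty(self):
        # Stand-in for the costly call that should be avoided when possible.
        self.mark_calls += 1
        self.dirty = True

    def mark_dirty_if_needed(self):
        # Cheap flag check first; only a clean inode takes the costly path.
        if self.dirty:
            return False
        self._mark_inode_dirty()
        return True
```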
2016-07-08 | f2fs: move i_size_write in f2fs_write_end | Jaegeuk Kim | 1 | -1/+1
We don't need to do i_size_write under page lock. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-08 | f2fs: fix to avoid redundant discard during fstrim | Chao Yu | 1 | -2/+3
With the test steps below, f2fs issues redundant discards during fstrim. The reason is that we issue discards both for prefree segments and for the consecutive freed region the user wants to trim, and the regions they cover partly overlap. Change this so that no discards are issued for prefree segments inside the trimmed range.

1. mount -t f2fs -o discard /dev/zram0 /mnt/f2fs
2. fstrim -o 0 -l 3221225472 -m 2097152 -v /mnt/f2fs/
3. dd if=/dev/zero of=/mnt/f2fs/a bs=2M count=1
4. dd if=/dev/zero of=/mnt/f2fs/b bs=1M count=1
5. sync
6. rm /mnt/f2fs/a /mnt/f2fs/b
7. fstrim -o 0 -l 3221225472 -m 2097152 -v /mnt/f2fs/

Before:
<...>-5428 [001] ...1 9511.052125: f2fs_issue_discard: dev = (251,0), blkstart = 0x2200, blklen = 0x200
<...>-5428 [001] ...1 9511.052787: f2fs_issue_discard: dev = (251,0), blkstart = 0x2200, blklen = 0x300

After:
<...>-6764 [000] ...1 9720.382504: f2fs_issue_discard: dev = (251,0), blkstart = 0x2200, blklen = 0x300

Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
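The redundancy in the "Before" trace can be quantified with a small helper. This is an illustrative sketch with a made-up function name; it only does interval arithmetic on the blkstart and blklen values printed in the trace.

```python
def overlap_blocks(start_a, len_a, start_b, len_b):
    """Number of blocks covered by both [start, start + len) ranges."""
    lo = max(start_a, start_b)
    hi = min(start_a + len_a, start_b + len_b)
    return max(0, hi - lo)
```

For the two discards in the "Before" trace, overlap_blocks(0x2200, 0x200, 0x2200, 0x300) gives 0x200, meaning the whole first discard is covered again by the second one.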
2016-07-08 | f2fs: avoid mismatching block range for discard | Yunlei He | 1 | -0/+4
This patch skips discarding a block range that is smaller than trim_minlen and cannot be merged with a neighbour. Signed-off-by: Yunlei He <[email protected]> Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
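One way to picture this rule is the sketch below. It is a simplified user-space model and not the kernel logic: plan_discards is a hypothetical name, ranges are (start, length) pairs in blocks, and the input ranges are assumed not to overlap.

```python
def plan_discards(ranges, trim_minlen):
    """Merge adjacent free ranges, then drop those shorter than trim_minlen."""
    merged = []
    for start, length in sorted(ranges):
        if merged and merged[-1][0] + merged[-1][1] == start:
            # This range begins exactly where the previous one ends: merge.
            prev_start, prev_len = merged[-1]
            merged[-1] = (prev_start, prev_len + length)
        else:
            merged.append((start, length))
    # Ranges still below the minimum after merging are skipped.
    return [(s, n) for s, n in merged if n >= trim_minlen]
```

With trim_minlen = 512, two adjacent 256-block ranges are merged and kept, while an isolated 100-block range is skipped.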
2016-07-08 | f2fs: fix incorrect f_bfree calculation in ->statfs | Chao Yu | 1 | -1/+1
As the manual describes, f_bfree indicates the total free blocks in the fs. In f2fs, it includes two parts: visible free blocks and over-provision blocks. This patch corrects the calculation.
fsblkcnt_t f_bfree; /* free blocks in fs */
Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
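The two-part sum described above can be written out as a sketch. The function and parameter names are hypothetical, and treating f_bavail as the visible free blocks only is an assumption made for illustration rather than something the message states.

```python
def statfs_free_blocks(user_block_count, valid_user_blocks, ovp_blocks):
    """Return (f_bfree, f_bavail) for a toy filesystem."""
    visible_free = user_block_count - valid_user_blocks
    # f_bfree counts every free block: visible free plus over-provision.
    f_bfree = visible_free + ovp_blocks
    # Assumption: unprivileged users can only use the visible free blocks.
    f_bavail = visible_free
    return f_bfree, f_bavail
```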
2016-07-08 | f2fs: use percpu_rw_semaphore | Jaegeuk Kim | 3 | -30/+35
This patch replaces rw_semaphore with percpu_rw_semaphore for sbi->cp_rwsem and nm_i->nat_tree_lock. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-08 | f2fs: skip to check the block address of node page | Jaegeuk Kim | 1 | -3/+3
If the node page is up-to-date, it should be alive. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-08 | f2fs: shrink critical region in spin_lock | Jaegeuk Kim | 1 | -15/+8
This patch shrinks the critical region in spin_lock. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-08 | f2fs: call SetPageUptodate if needed | Jaegeuk Kim | 5 | -15/+30
SetPageUptodate() issues a memory barrier, resulting in performance degradation. Let's avoid that. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-08 | f2fs: introduce f2fs_set_page_dirty_nobuffer | Jaegeuk Kim | 4 | -3/+35
This patch adds f2fs_set_page_dirty_nobuffer() copied from __set_page_dirty_buffer. When appending 4KB blocks in f2fs on pmem with multiple cores, this improves the overall performance. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-08 | f2fs: remove unnecessary goto statement | Tiezhu Yang | 1 | -2/+2
When base_addr is NULL, there is no need to call kzfree, it should return -ENOMEM directly. Additionally, it is better to initialize variable 'error' with 0. Signed-off-by: Tiezhu Yang <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-08 | f2fs: add nodiscard mount option | Chao Yu | 2 | -1/+7
This patch adds 'nodiscard' mount option. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-08 | f2fs: fix to redirty page if fail to gc data page | Chao Yu | 1 | -1/+12
If we fail to move data page during foreground GC, we should give another chance to writeback that page which was set dirty previously by writer. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-08 | f2fs: fix to detect truncation prior rather than EIO during read | Chao Yu | 3 | -13/+13
In a synchronized read, after sending out the read request, the reader tries to lock the page in order to wait for the device to finish the read and unlock the page. Meanwhile, a truncater can race with the reader. So after the reader gets the page lock, it should check the page's mapping to detect whether someone has already truncated the page; the reader can then retry if a truncation happened. Otherwise the read can fail due to the previous condition check. Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-08 | f2fs: fix to avoid reading out encrypted data in page cache | Chao Yu | 1 | -43/+47
For an encrypted inode, if the user overwrites data of the inode, f2fs reads the encrypted data into the page cache and then decrypts it. However, a reader can race with the overwriter and see encrypted data that the overwriter has not decrypted yet. Fix it by moving the decryption work to the background and keeping the page non-uptodate until the data is decrypted.

Thread A                                Thread B
- f2fs_file_write_iter
 - __generic_file_write_iter
  - generic_perform_write
   - f2fs_write_begin
    - f2fs_submit_page_bio
                                        - generic_file_read_iter
                                         - do_generic_file_read
                                          - lock_page_killable
                                          - unlock_page
                                          - copy_page_to_iter
                                          hit the encrypted data in updated page
    - lock_page
    - fscrypt_decrypt_page

Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-06 | f2fs: avoid latency-critical readahead of node pages | Jaegeuk Kim | 2 | -2/+2
f2fs_map_blocks is performance-critical, so let's avoid any latency from reading ahead node pages. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-06 | f2fs: avoid writing node/metapages during writes | Jaegeuk Kim | 1 | -3/+3
Let's keep more node/meta pages in run time. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-06 | f2fs: produce more nids and reduce readahead nats | Jaegeuk Kim | 6 | -8/+18
Readahead nat pages are likely to be reclaimed quickly, so it is better to gather more free nids in advance. Also, let's keep as many free nids as possible. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-06 | f2fs: detect host-managed SMR by feature flag | Jaegeuk Kim | 4 | -8/+35
If mkfs.f2fs gives a feature flag for host-managed SMR, we can set mode=lfs by default. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-06 | f2fs: call update_inode_page for orphan inodes | Jaegeuk Kim | 6 | -25/+9
Let's store orphan inode pages right away. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-07-06 | f2fs: report error for f2fs_parent_dir | Jaegeuk Kim | 1 | -6/+9
If there is no dentry, we can report its error correctly. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-06-15 | f2fs: find parent dentry correctly | Sheng Yong | 3 | -35/+2
If the dotdot directory is corrupted, its slot may be occupied by another file. In this case, dentry[1] is not the parent directory, and rename and cross-rename will update the inode in dentry[1] incorrectly. This patch finds the dotdot dentry by name. Signed-off-by: Sheng Yong <[email protected]> [Jaegeuk Kim: remove wrong bug_on] Signed-off-by: Jaegeuk Kim <[email protected]>
2016-06-13 | f2fs: fix deadlock in add_link failure | Jaegeuk Kim | 1 | -3/+0
mkdir                                   sync_dirty_inode
- init_inode_metadata
 - lock_page(node)
 - make_empty_dir
                                        - filemap_fdatawrite()
                                         - do_writepages
                                          - lock_page(data)
                                          - write_page(data)
                                           - lock_page(node)
 - f2fs_init_acl
  - error
 - truncate_inode_pages
  - lock_page(data)

So, we don't need to truncate data pages in this error case, which will be done by f2fs_evict_inode. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-06-13 | f2fs: introduce mode=lfs mount option | Jaegeuk Kim | 9 | -4/+74
This mount option forcibly enables the original log-structured filesystem behavior, so there should be no random writes to the main area. In particular, this supports host-managed SMR devices. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-06-08 | f2fs: skip clean segment for gc | Jaegeuk Kim | 1 | -0/+4
If a segment in a section is clean or prefreed, we don't need to get its summary and do gc. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-06-08 | f2fs: drop any block plugging | Jaegeuk Kim | 4 | -22/+11
In f2fs, we don't need to keep block plugging for NODE and DATA writes, since we already merged bios as much as possible. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-06-08 | f2fs: avoid reverse IO order for NODE and DATA | Jaegeuk Kim | 3 | -0/+9
There is a data race between allocate_data_block() and f2fs_submit_page_mbio(), which incurs unnecessary reversed bio submission. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-06-08 | f2fs: set mapping error for EIO | Jaegeuk Kim | 1 | -1/+1
If EIO occurred, we need to set all the mapping to avoid any further IOs. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-06-07 | f2fs: control not to exceed # of cached nat entries | Jaegeuk Kim | 3 | -0/+16
This is to avoid cache entry management overhead including radix tree. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-06-07 | f2fs: fix wrong percentage | Jaegeuk Kim | 1 | -1/+1
This should be 1%, 10MB / 1GB. Signed-off-by: Jaegeuk Kim <[email protected]>
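The ratio quoted in the message is easy to verify. The helper below is a trivial illustration with a made-up name, and it assumes 1 GB is counted as 1024 MB.

```python
def percent_of(part_mb, whole_mb):
    """Express part_mb as a percentage of whole_mb."""
    return 100.0 * part_mb / whole_mb
```

percent_of(10, 1024) is a little under 1, which rounds to the 1% stated in the message.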
2016-06-07 | f2fs: avoid data race between FI_DIRTY_INODE flag and update_inode | Jaegeuk Kim | 1 | -1/+2
FI_DIRTY_INODE flag is not covered by inode page lock, so it can be unset at any time like below.

Thread #1                               Thread #2
- lock_page(ipage)
- update i_fields
                                        - update i_size/i_blocks/and so on
                                        - set FI_DIRTY_INODE
- reset FI_DIRTY_INODE
- set_page_dirty(ipage)

In this case, we can lose the latest i_field information. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-06-07 | f2fs: remove obsolete parameter in f2fs_truncate | Jaegeuk Kim | 4 | -6/+6
We don't need lock parameter, which is always true. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-06-07 | f2fs: avoid wrong count on dirty inodes | Jaegeuk Kim | 1 | -2/+2
The number should be covered by spin_lock. Otherwise we can see wrong count in f2fs_stat. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-06-07 | f2fs: remove deprecated parameter | Jaegeuk Kim | 3 | -6/+5
Remove deprecated parameter. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-06-02 | f2fs: handle writepage correctly | Jaegeuk Kim | 1 | -30/+14
Previously, f2fs_write_data_pages() called __f2fs_writepage(), which calls f2fs_write_data_page(). If f2fs_write_data_page() returns AOP_WRITEPAGE_ACTIVATE, __f2fs_writepage() calls mapping_set_error(). But this should not happen every time, since f2fs_write_data_page() sometimes skips writing pages without any error. For example, volatile_write() gives EIO all the time, as Shuoran Liu pointed out. Reported-by: Shuoran Liu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2016-06-02 | f2fs: return error of f2fs_lookup | Jaegeuk Kim | 2 | -2/+5
Now we can report an error to f2fs_lookup given by f2fs_find_entry. Suggested-by: He YunLei <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2016-06-02 | f2fs: return the errno to the caller to avoid using a wrong page | Yunlong Song | 1 | -5/+10
Commit aaf9607516ed38825268515ef4d773289a44f429 ("f2fs: check node page contents all the time") pointed out that "sometimes it was reported that its contents was missing", so it checks the page's mapping and contents. When "nid != nid_of_node(page)", ERR_PTR(-EIO) is returned to the caller. However, commit e1c51b9f1df2f9efc2ec11488717e40cd12015f9 ("f2fs: clean up node page updating flow") moved the "nid != nid_of_node(page)" test into "f2fs_bug_on(sbi, nid != nid_of_node(page))". With F2FS_CHECK_FS off, this returns a wrong page to the caller when "sometimes it was reported that its contents was missing" happens. This patch restores checking the node page contents all the time, and returns the errno so that the caller knows something is wrong and avoids using the page. This patch also moves f2fs_bug_on to its proper location. Signed-off-by: Yunlong Song <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
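The difference between a debug-only assertion and an always-on check can be illustrated with a toy model. The read_node_page function and the dictionary-based pages below are hypothetical; they only mimic the idea of comparing the requested nid with the nid stored in the page and returning a negative errno instead of handing back a mismatched page.

```python
import errno


def read_node_page(pages, nid):
    """Return (page, 0) on success, or (None, -EIO) if the page is not nid's."""
    page = pages[nid]
    if page["nid"] != nid:
        # Always checked, so the caller never receives a wrong page,
        # regardless of whether debug assertions are compiled in.
        return None, -errno.EIO
    return page, 0
```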
2016-06-02 | f2fs: remove two steps to flush dirty data pages | Jaegeuk Kim | 1 | -10/+1
If there is no cold page, we don't need to do a loop to flush dirty data pages.

On /dev/pmem0,
1. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048 conv=fsync
   Before : 1.1 GB/s
   After  : 1.2 GB/s
2. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048
   Before : 2.2 GB/s
   After  : 2.3 GB/s

Signed-off-by: Jaegeuk Kim <[email protected]>
2016-06-02 | f2fs: do not skip writing data pages | Jaegeuk Kim | 2 | -9/+6
For data pages, let's try to flush as much as possible in background.

On /dev/pmem0,
1. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048 conv=fsync
   Before : 800 MB/s
   After  : 1.1 GB/s
2. dd if=/dev/zero of=/mnt/test/testfile bs=1M count=2048
   Before : 1.3 GB/s
   After  : 2.2 GB/s

Signed-off-by: Jaegeuk Kim <[email protected]>
2016-06-02 | f2fs: inject to produce some orphan inodes | Jaegeuk Kim | 3 | -0/+9
Signed-off-by: Jaegeuk Kim <[email protected]>
2016-06-02 | f2fs: propagate error given by f2fs_find_entry | Jaegeuk Kim | 3 | -8/+24
If we get ENOMEM or EIO in f2fs_find_entry, we should stop right away. Otherwise, for example, we can get duplicate directory entry by ->chash and ->clevel. Signed-off-by: Jaegeuk Kim <[email protected]>
2016-06-02 | f2fs: remove writepages lock | Jaegeuk Kim | 3 | -9/+0
This patch removes writepages lock. We can improve multi-threading performance.

tiobench, 32 threads, 4KB write per fsync on SSD
Before: 25.88 MB/s
After: 28.03 MB/s

Signed-off-by: Jaegeuk Kim <[email protected]>