|
Currently the pagewalker splits all THP pages on any clear_refs request. That is
not necessary: we can handle this at the PMD level.
One side effect is that soft dirty will potentially see more dirty memory,
since we will mark the whole THP page as dirty at once.
Sanity checked with the CRIU test suite. More testing is required.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Naoya Horiguchi <[email protected]>
Reviewed-by: Cyrill Gorcunov <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
walk_page_range() silently skips any vma that has VM_PFNMAP set, which leads to
undesirable behaviour for the caller of walk_page_range(). For
example, in pagemap_read(), when no callbacks are called against a VM_PFNMAP
vma, pagemap_read() may prepare pagemap data for the next virtual address
range at the wrong index. That could confuse and/or break userspace
applications.
This patch avoids this misbehavior caused by vma(VM_PFNMAP) as follows
(a sketch of the callback shapes appears below):
- for pagemap_read(), which has its own ->pte_hole(), call the ->pte_hole()
over vma(VM_PFNMAP),
- for clear_refs and queue_pages, which have their own ->test_walk,
just return 1 and skip vma(VM_PFNMAP). This is not a problem because
they are not interested in hole regions,
- for other callers, just skip the vma(VM_PFNMAP) as the default behavior.
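A minimal sketch of the two callback shapes described above, assuming the
mm_walk API used by this series (the core walker fills in walk->vma); the
function names are illustrative, not the actual hunks of this patch:

	/* Skip VM_PFNMAP vmas entirely (clear_refs/queue_pages style). */
	static int example_test_walk(unsigned long start, unsigned long end,
				     struct mm_walk *walk)
	{
		if (walk->vma->vm_flags & VM_PFNMAP)
			return 1;	/* skip this vma, keep walking */
		return 0;		/* walk this vma normally */
	}

	/* pagemap style: report VM_PFNMAP ranges through ->pte_hole so the
	 * output index stays in sync with the virtual address. */
	static int example_pte_hole(unsigned long start, unsigned long end,
				    struct mm_walk *walk)
	{
		/* emit "hole" entries for [start, end) here */
		return 0;
	}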
Signed-off-by: Naoya Horiguchi <[email protected]>
Signed-off-by: Shiraz Hashim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
pagewalk.c can handle the vma by itself, so we don't have to pass the vma via
walk->private. And show_numa_map() walks pages on a per-vma basis, so using
walk_page_vma() is preferable.
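For illustration, a rough sketch of the per-vma calling pattern this enables,
assuming the walk_page_vma() helper added by this series; the callback and
function names are made up:

	static int example_pmd(pmd_t *pmd, unsigned long addr,
			       unsigned long end, struct mm_walk *walk)
	{
		struct vm_area_struct *vma = walk->vma;	/* set by the core walker */

		/* accumulate per-vma statistics for [addr, end) here */
		return 0;
	}

	static int example_show_vma(struct vm_area_struct *vma)
	{
		struct mm_walk walk = {
			.pmd_entry = example_pmd,
			.mm = vma->vm_mm,
		};

		/* walk exactly this vma; no walk.private needed for the vma */
		return walk_page_vma(vma, &walk);
	}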
Signed-off-by: Naoya Horiguchi <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Just doing s/gather_hugetbl_stats/gather_hugetlb_stats/g; this makes the code
grep-friendly.
Signed-off-by: Naoya Horiguchi <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The page table walker has the information about the current vma in mm_walk, so we
no longer have to call find_vma() in each pagemap_(pte|hugetlb)_range() call.
Currently pagemap_pte_range() does the vma loop itself, so this
patch removes many lines of code.
The NULL-vma check is omitted because we assume that we never run these
callbacks on any address outside a vma. And even if that assumption were broken,
the NULL pointer dereference would be detected, so we would get enough information
for debugging.
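An illustrative before/after of the lookup being removed (not the literal
diff); the point is that the walker core now supplies the vma:

	/* before: each callback looked the vma up for every invocation */
	struct vm_area_struct *vma = find_vma(walk->mm, addr);

	/* after: the walker core has already resolved and checked it */
	struct vm_area_struct *vma = walk->vma;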
Signed-off-by: Naoya Horiguchi <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
clear_refs_write() has some prechecks to determine whether we really want to walk
over a given vma. Now we have a test_walk() callback to filter vmas, so let's
use it.
Signed-off-by: Naoya Horiguchi <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
pagewalk.c can handle the vma by itself, so we don't have to pass the vma via
walk->private. And show_smap() walks pages on a per-vma basis, so using
walk_page_vma() is preferable.
Signed-off-by: Naoya Horiguchi <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Cyrill Gorcunov <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Lockless access to pte in pagemap_pte_range() might race with page
migration and trigger BUG_ON(!PageLocked()) in migration_entry_to_page():
CPU A (pagemap)                         CPU B (migration)

                                        lock_page()
                                        try_to_unmap(page, TTU_MIGRATION...)
                                            make_migration_entry()
                                            set_pte_at()
<read *pte>
pte_to_pagemap_entry()
                                        remove_migration_ptes()
                                        unlock_page()
    if (is_migration_entry())
        migration_entry_to_page()
            BUG_ON(!PageLocked(page))
Also, a lockless read might be non-atomic if the pte is larger than the word size.
Other pte walkers (smaps, numa_maps, clear_refs) already lock ptes.
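A hedged sketch of the locking pattern the fix applies, assuming a
pagemap-style pmd walker on the mm_walk API used elsewhere in this log; the
names are illustrative, not the exact patch:

	static int example_pte_range(pmd_t *pmd, unsigned long addr,
				     unsigned long end, struct mm_walk *walk)
	{
		struct vm_area_struct *vma = walk->vma;
		spinlock_t *ptl;
		pte_t *pte;

		/* take the pte lock so migration cannot rewrite entries under us */
		pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
		for (; addr < end; pte++, addr += PAGE_SIZE) {
			pte_t ptent = *pte;	/* stable while ptl is held */

			/* ... translate ptent into a pagemap entry here ... */
		}
		pte_unmap_unlock(pte - 1, ptl);
		return 0;
	}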
Fixes: 052fb0d635df ("proc: report file/anon bit in /proc/pid/pagemap")
Signed-off-by: Konstantin Khlebnikov <[email protected]>
Reported-by: Andrey Ryabinin <[email protected]>
Reviewed-by: Cyrill Gorcunov <[email protected]>
Acked-by: Naoya Horiguchi <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: <[email protected]> [3.5+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The helper account_page_redirty() fixes the dirty page counters for redirtied
pages. This patch calls it after dirtying, which prevents temporary
underflows of the dirtied-page counters on the zone/bdi and in current->nr_dirtied.
Signed-off-by: Konstantin Khlebnikov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Dave noticed that an unprivileged process can allocate a significant amount of
memory -- >500 MiB on x86_64 -- and stay unnoticed by the oom-killer and the
memory cgroup. The trick is to allocate a lot of PMD page tables. The Linux
kernel doesn't account PMD tables to the process, only PTE tables.
The test case below uses a few tricks to allocate a lot of PMD page tables
while keeping VmRSS and VmPTE low. The oom_score for the process will be 0.
	#include <errno.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <unistd.h>
	#include <sys/mman.h>
	#include <sys/prctl.h>

	#define PUD_SIZE (1UL << 30)
	#define PMD_SIZE (1UL << 21)

	#define NR_PUD 130000

	int main(void)
	{
		char *addr = NULL;
		unsigned long i;

		/* keep faults from using THP so each touch builds page tables */
		prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0);
		for (i = 0; i < NR_PUD ; i++) {
			addr = mmap(addr + PUD_SIZE, PUD_SIZE, PROT_WRITE|PROT_READ,
					MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
			if (addr == MAP_FAILED) {
				perror("mmap");
				break;
			}
			*addr = 'x';	/* fault in one page: allocates a PMD table */
			/* drop the PTE table and the page, keep the PMD table */
			munmap(addr, PMD_SIZE);
			addr = mmap(addr, PMD_SIZE, PROT_WRITE|PROT_READ,
					MAP_ANONYMOUS|MAP_PRIVATE|MAP_FIXED, -1, 0);
			if (addr == MAP_FAILED)
				perror("re-mmap"), exit(1);
		}
		/* one 4096-byte PMD table is left behind per iteration */
		printf("PID %d consumed %lu KiB in PMD page tables\n",
				getpid(), i * 4096 >> 10);
		return pause();
	}
The patch addresses the issue by accounting PMD tables to the process the
same way we account PTE tables.
The main places where PMD tables are accounted are __pmd_alloc() and
free_pmd_range(). But there are a few corner cases:
- HugeTLB can share PMD page tables. The patch handles this by accounting
the table to all processes that share it.
- x86 PAE pre-allocates a few PMD tables on fork.
- Architectures with FIRST_USER_ADDRESS > 0. We need to adjust the sanity
check on exit(2).
Accounting only happens on configurations where the PMD page table level is
present (PMD is not folded). As with nr_ptes, we use a per-mm counter. The
counter value is used to calculate the baseline for the badness score by
the oom-killer.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Reported-by: Dave Hansen <[email protected]>
Cc: Hugh Dickins <[email protected]>
Reviewed-by: Cyrill Gorcunov <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Cc: David Rientjes <[email protected]>
Tested-by: Sedat Dilek <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Add a KPF_ZERO_PAGE flag for the zero page, so that userspace processes can
detect the zero page in /proc/kpageflags, and then do memory analysis more
accurately.
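A minimal userspace sketch of how an analysis tool might consult the new bit;
it assumes KPF_ZERO_PAGE is bit 24, as defined in
include/uapi/linux/kernel-page-flags.h of this era, takes a page frame number
on the command line, and needs privileges to read /proc/kpageflags:

	#include <fcntl.h>
	#include <stdint.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <unistd.h>

	#define KPF_ZERO_PAGE 24	/* assumed value, see kernel-page-flags.h */

	int main(int argc, char **argv)
	{
		uint64_t pfn, flags;
		int fd;

		if (argc != 2)
			return 1;
		pfn = strtoull(argv[1], NULL, 0);

		fd = open("/proc/kpageflags", O_RDONLY);
		if (fd < 0 || pread(fd, &flags, sizeof(flags),
				    pfn * sizeof(flags)) != (ssize_t)sizeof(flags)) {
			perror("kpageflags");
			return 1;
		}
		printf("pfn %llu %s the zero page\n", (unsigned long long)pfn,
		       (flags >> KPF_ZERO_PAGE) & 1 ? "is" : "is not");
		close(fd);
		return 0;
	}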
Signed-off-by: Yalin Wang <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Konstantin Khlebnikov <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
An rwlock can provide better concurrency when there are many more readers than
writers, because readers can hold the rwlock simultaneously.
But for the segmap_lock rwlock in struct free_segmap_info, there is only one
reader, 'mount', via the call path below:
->f2fs_fill_super
->build_segment_manager
->build_dirty_segmap
->init_dirty_segmap
->find_next_inuse
read_lock
...
read_unlock
Since the concurrency cannot be improved when there is no other reader of
this lock, we do not need the rwlock_t type for segmap_lock; let's replace it
with spinlock_t.
Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
This patch fixes the following test.
This causes:
attempt to access beyond end of device
sdb2: rw=16384, want=14413962000, limit=16777216
The reason is:
- f2fs_write_begin
- f2fs_convert_inline_inode returns -ENOSPC
- f2fs_write_failed
- truncate_blocks
- truncate_partial_data_page
- find_data_page
- get_dnode_of_data returns wrong data index retrieved from inline_data
- f2fs_submit_page_bio(wrong data index)
- submit_bio(wrong data index)
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
Instead of using variable length arrays, this patch preallocates memory for
them.
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
This patch resolves the following warnings.
include/trace/events/f2fs.h:150:1: warning: expression using sizeof bool
include/trace/events/f2fs.h:180:1: warning: expression using sizeof bool
include/trace/events/f2fs.h:990:1: warning: expression using sizeof bool
include/trace/events/f2fs.h:990:1: warning: expression using sizeof bool
include/trace/events/f2fs.h:150:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1)
include/trace/events/f2fs.h:180:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1)
include/trace/events/f2fs.h:990:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1)
include/trace/events/f2fs.h:990:1: warning: odd constant _Bool cast (ffffffffffffffff becomes 1)
fs/f2fs/checkpoint.c:27:19: warning: symbol 'inode_entry_slab' was not declared. Should it be static?
fs/f2fs/checkpoint.c:577:15: warning: cast to restricted __le32
fs/f2fs/checkpoint.c:592:15: warning: cast to restricted __le32
fs/f2fs/trace.c:19:1: warning: symbol 'pids' was not declared. Should it be static?
fs/f2fs/trace.c:21:21: warning: symbol 'last_io' was not declared. Should it be static?
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
This patch adds preallocation for data blocks to prepare f2fs_direct_IO.
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
This patch adds two macros for converting between byte and block offsets.
Currently, f2fs only supports 4KB blocks, so use the default size for now.
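A plausible shape for such macros, assuming a fixed 4KB block size; the names
(F2FS_BYTES_TO_BLK and friends) are given for illustration:

	#define F2FS_BLKSIZE_BITS		12	/* 4KB blocks only, for now */
	#define F2FS_BYTES_TO_BLK(bytes)	((bytes) >> F2FS_BLKSIZE_BITS)
	#define F2FS_BLK_TO_BYTES(blk)		((blk) << F2FS_BLKSIZE_BITS)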
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
This patch fixes the wrong handling of the buffer_new flag in get_block.
If f2fs allocates new blocks and maps the buffer_head, it needs to set buffer_new
on the bh_result.
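Roughly, the expected pattern inside a get_block implementation looks like the
sketch below; the function name and locals are illustrative:

	static int example_get_block(struct inode *inode, sector_t iblock,
				     struct buffer_head *bh_result, int create)
	{
		sector_t pblock = 0;
		bool newly_allocated = false;

		/* ... look up or allocate the block, setting pblock and
		 * newly_allocated accordingly ... */

		map_bh(bh_result, inode->i_sb, pblock);
		if (newly_allocated)
			set_buffer_new(bh_result);	/* the missing piece */
		return 0;
	}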
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
In get_node_page, if the page is up-to-date, we assumed that the page was not
reclaimed at all.
But sometimes it was reported that its contents were missing.
So, just to be sure, let's check its mapping and contents.
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
xfstest generic/285 complains about our issue when lseeking a huge file.
Here is the detail output of generic/285:
"./check -f2fs tests/generic/285
Ran: generic/285
Failures: generic/285
Failed 1 of 1 tests
10. Test a huge file for offset overflow
10.01 SEEK_HOLE expected 65536 or 8589934592, got 65536. succ
10.02 SEEK_HOLE expected 65536 or 8589934592, got 65536. succ
10.03 SEEK_DATA expected 0 or 0, got 0. succ
10.04 SEEK_DATA expected 1 or 1, got 1. succ
10.05 SEEK_HOLE expected 8589934592 or 8589934592, got 0. FAIL
10.06 SEEK_DATA expected 8589869056 or 8589869056, got 8589869056. succ
10.07 SEEK_DATA expected 8589869057 or 8589869057, got 8589869057. succ
10.08 SEEK_DATA expected 8589869056 or 8589869056, got 4294901760. FAIL"
The reason for this issue is:
We calculate the current offset by left-shifting the page offset by
PAGE_CACHE_SHIFT bits, but the page offset has type unsigned long, whose size
is 4 bytes on a 32-bit machine.
So if the page offset is bigger than (2^32 / pagesize - 1), the result of the
left shift will overflow.
Let's fix this issue by casting the page offset to the type of the current offset:
loff_t.
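A worked example of the overflow, matching the 10.08 case above (a fragment,
assuming a 32-bit build with 4KB pages where PAGE_CACHE_SHIFT is 12): byte
offset 8589869056 is 0x1FFFF0000, i.e. page offset 0x1FFFF0; shifting that
left by 12 in 32-bit unsigned long arithmetic wraps to 0xFFFF0000, which is
exactly the 4294901760 reported by generic/285, while widening to loff_t first
keeps the full value:

	unsigned long pgofs = 0x1FFFF0;	/* page index of byte 8589869056 */

	/* 32-bit shift wraps: result is 0xFFFF0000 (4294901760) */
	loff_t wrong = pgofs << PAGE_CACHE_SHIFT;

	/* widen first, then shift: result is 0x1FFFF0000 (8589869056) */
	loff_t fixed = (loff_t)pgofs << PAGE_CACHE_SHIFT;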
Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
In commit a78186ebe516 ("f2fs: use highmem for directory pages"), we set
__GFP_HIGHMEM in the dir mapping's gfp flags in f2fs_iget, so high memory
could be used for the pages of existing dirs.
But we forgot to set the flag for newly created dirs; for this reason, the pages
of newly created dirs could not be allocated from high memory. Fix it.
Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
This patch introduces a batched trimming feature, which splits the discard
commands it submits.
This is to avoid long latencies due to huge trim commands.
If fstrim were triggered on the range from 0 to the end of the device, we would
have to hold all the checkpoint-related mutexes, resulting in very long latency.
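A rough sketch of the batching idea, assuming a per-batch segment budget; the
constant and function names here are illustrative, not the actual f2fs symbols:

	/* trim [start_seg, end_seg] in bounded chunks so the checkpoint
	 * mutexes are only held for one batch at a time */
	static void example_batched_trim(unsigned int start_seg,
					 unsigned int end_seg)
	{
		const unsigned int batch = 4096;	/* segments per batch */
		unsigned int cur, last;

		for (cur = start_seg; cur <= end_seg; cur = last + 1) {
			last = min(cur + batch - 1, end_seg);
			/* take the checkpoint mutexes, issue discards for
			 * [cur, last], then drop the mutexes */
		}
	}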
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
This patch merges the ->{invalidate,release}page functions for meta/node/data pages.
After this, duplicated code can be removed.
Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
This patch adds the # of writeback pages in stat info.
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
If PagePrivate is removed by releasepage, f2fs loses count of its dirty pages.
E.g., try_to_release_page will not release a page when the page is dirty,
but our releasepage removes PagePrivate anyway.
[<ffffffff81188d75>] try_to_release_page+0x35/0x50
[<ffffffff811996f9>] invalidate_inode_pages2_range+0x2f9/0x3b0
[<ffffffffa02a7f54>] ? truncate_blocks+0x384/0x4d0 [f2fs]
[<ffffffffa02b7583>] ? f2fs_direct_IO+0x283/0x290 [f2fs]
[<ffffffffa02b7fb0>] ? get_data_block_fiemap+0x20/0x20 [f2fs]
[<ffffffff8118aa53>] generic_file_direct_write+0x163/0x170
[<ffffffff8118ad06>] __generic_file_write_iter+0x2a6/0x350
[<ffffffff8118adef>] generic_file_write_iter+0x3f/0xb0
[<ffffffff81203081>] new_sync_write+0x81/0xb0
[<ffffffff81203837>] vfs_write+0xb7/0x1f0
[<ffffffff81204459>] SyS_write+0x49/0xb0
[<ffffffff817c286d>] system_call_fastpath+0x16/0x1b
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
If the device is read-only, we should not proceed with data recovery.
But if the previous checkpoint was done by a normal clean shutdown, it's safe to
proceed with the recovery, since there will be no data to be recovered.
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
This patch adds a FASTBOOT flag to the checkpoint, as follows.
- CP_UMOUNT_FLAG is set when the system is umounted.
- CP_FASTBOOT_FLAG is set when an intermediate checkpoint carrying node summaries
was done.
So, if you see CP_UMOUNT_FLAG in the checkpoint, the system was umounted cleanly.
Otherwise, if there was a sudden power-off, you may see CP_FASTBOOT_FLAG or nothing.
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
Do not change any partition when f2fs is changed to readonly mode.
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
This patch adds a mount option, norecovery, which is mostly the same as
disable_roll_forward. The only difference is that norecovery must be used
together with the read-only mount option.
This can be used when the user wants to check whether f2fs is mountable or not
without any recovery process (e.g., xfstests/200).
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
If a wrong mount option was requested, f2fs tries fill_super again.
But during the next trial, f2fs has no valid mount options, since
parse_options deleted all the separators in the original string.
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
Currently, there are several variables with Boolean type, as below:
	struct f2fs_sb_info {
	...
		int s_dirty;
		bool need_fsck;
		bool s_closing;
	...
		bool por_doing;
	...
	}
This has some issues:
1. some space in f2fs_sb_info is wasted, because the compiler aligns the fields
that follow the Boolean variables.
2. if we keep adding new flags to f2fs_sb_info, the structure will get messy.
So in this patch, we try to:
1. switch s_dirty to a Boolean variable, since it only has the two states 0/1.
2. merge the s_dirty/need_fsck/s_closing/por_doing variables into s_flag.
3. introduce an enum type which can indicate the different states of the sbi.
4. use the newly introduced universal interfaces is_sbi_flag_set/{set,clear}_sbi_flag
to operate on the sbi flags (sketched below).
After that, the above issues are fixed.
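A minimal sketch of what such an enum plus flag helpers could look like; this
illustrates the approach and is not necessarily the exact f2fs definition:

	enum sbi_flag {
		SBI_IS_DIRTY,		/* dirty flag for checkpoint */
		SBI_NEED_FSCK,		/* need fsck.f2fs to fix */
		SBI_IS_CLOSE,		/* specify unmounting */
		SBI_POR_DOING,		/* recovery is doing or not */
	};

	static inline bool is_sbi_flag_set(struct f2fs_sb_info *sbi, int type)
	{
		return sbi->s_flag & (1 << type);
	}

	static inline void set_sbi_flag(struct f2fs_sb_info *sbi, int type)
	{
		sbi->s_flag |= (1 << type);
	}

	static inline void clear_sbi_flag(struct f2fs_sb_info *sbi, int type)
	{
		sbi->s_flag &= ~(1 << type);
	}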
Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
Use the pointer parameter @wait to pass the result in {in,de}crease_sleep_time,
as a cleanup.
Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
1. make truncate_inline_data static;
2. remove the parameter @from of truncate_inline_data, as callers only pass zero.
Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
Introduced by commit a6dda0e63e97122ce9e0ba04367e37cca28315fa
("f2fs: use generic posix ACL infrastructure").
When testing default ACLs on a recent kernel (3.19.0-rc5), we get:
user::rwx
group::r-x
other::r-x
default:user::rwx
default:group::r-x
default:group:root:rwx
default:mask::rwx
default:other::r-x
]# getfacl testdir/
user::rwx
group::rwx
// missing an acl "group:root:rwx" inherited from parent
other::r-x
default:user::rwx
default:group::r-x
default:group:root:rwx
default:mask::rwx
default:other::r-x
Signed-off-by: Kinglong Mee <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
No modification of functionality; just clean up the code with f2fs_radix_tree_insert.
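For context, a plausible shape for such a wrapper, assuming it simply retries
the insertion until it succeeds; this is a sketch, not necessarily the exact
f2fs helper:

	static inline void f2fs_radix_tree_insert(struct radix_tree_root *root,
						  unsigned long index, void *item)
	{
		/* retry until the insertion succeeds, yielding the CPU
		 * between attempts */
		while (radix_tree_insert(root, index, item))
			cond_resched();
	}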
Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
In this patch we add the FS_IOC_GETVERSION ioctl for getting i_generation from
the inode; after that, users can list a file's generation number by using "lsattr -v".
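The handler for this ioctl is typically a one-liner; a sketch of the expected
shape, with an illustrative function name:

	static int example_ioc_getversion(struct file *filp, unsigned long arg)
	{
		struct inode *inode = file_inode(filp);

		return put_user(inode->i_generation, (int __user *)arg);
	}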
Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
During recovery, no xattr blocks should be found, since they are
written into the cold log, not the warm node chain.
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
We will encounter a dead loop in the scenario below:
1. the page count for the F2FS_DIRTY_META type is increased in the following path:
->recover_fsync_data
->recover_data
->do_recover_data
->recover_data_page
->change_curseg
->write_sum_page
->set_page_dirty
2. recover_data() fails.
3. meta pages are invalidated in truncate_inode_pages_final without decreasing
the page count.
4. dead loop in sync_meta_pages, as the page count will always be non-zero.
Soft lockup message:
NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s!
[<c1129a37>] pagevec_lookup_tag+0x27/0x30
[<f0e774c7>] sync_meta_pages+0x87/0x160 [f2fs]
[<f0e86dd9>] recover_fsync_data+0xeb9/0xf10 [f2fs]
[<f0e75398>] f2fs_fill_super+0x888/0x980 [f2fs]
[<c11733ca>] mount_bdev+0x16a/0x1a0
[<f0e7180f>] f2fs_mount+0x1f/0x30 [f2fs]
[<c1173da6>] mount_fs+0x36/0x170
[<c118b6f5>] vfs_kern_mount+0x55/0xe0
[<c118d63f>] do_mount+0x1df/0x9f0
[<c118e110>] SyS_mount+0x70/0xb0
[<c15a0c48>] sysenter_do_call+0x12/0x12
To avoid the page count leak, let's add ->invalidatepage and ->releasepage to
f2fs_meta_aops, as f2fs_node_aops does, to release the meta page count correctly.
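A sketch of what the resulting address_space_operations change amounts to; the
field layout is abridged and the helper names assume the earlier patch in this
series that merged the ->{invalidate,release}page implementations:

	const struct address_space_operations f2fs_meta_aops = {
		.writepage	= f2fs_write_meta_page,
		.writepages	= f2fs_write_meta_pages,
		.set_page_dirty	= f2fs_set_meta_page_dirty,
		.invalidatepage	= f2fs_invalidate_page,	/* drops dirty accounting */
		.releasepage	= f2fs_release_page,	/* clears PagePrivate */
	};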
Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
If the previous checkpoint was done without the CP_UMOUNT flag, we need to do a
checkpoint with CP_UMOUNT for the next fast boot.
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
This patch makes sure a checkpoint with the umount flag is triggered when kill_sb
is called.
In kill_sb, f2fs_sync_fs is finally called, but at this point, f2fs can't do a
checkpoint with CP_UMOUNT.
After that, f2fs_put_super does not do a checkpoint, since the filesystem is not dirty.
So, this patch adds a flag to indicate that f2fs_sync_fs is called during umount.
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
This patch adds missing memory usages to the stats, and breaks them down in detail.
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
The memory footprint statistics shown in debugfs are not calculated
correctly. Fix that in this patch.
Signed-off-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
If cp_error is set, we should avoid all infinite loops.
In f2fs_sync_file, there is such a hole, and this patch fixes it.
Signed-off-by: Jaegeuk Kim <[email protected]>
|
|
For added overflow protection...
Signed-off-by: Trond Myklebust <[email protected]>
|
|
If the call to decode_rc_list() fails due to a memory allocation error,
then we need to truncate the array size to ensure that we only call
kfree() on those pointers that were actually allocated.
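The general cleanup pattern being enforced, as a hedged sketch with generic
names (not the NFS callback XDR code itself): when allocation fails at index i,
record i as the array size so that the error path only frees pointers that exist:

	/* allocate up to n entries; on failure only the first i were allocated */
	static int example_decode_list(void **entries, unsigned int n,
				       unsigned int *nr_decoded)
	{
		unsigned int i;

		for (i = 0; i < n; i++) {
			entries[i] = kmalloc(64, GFP_KERNEL);
			if (!entries[i]) {
				/* truncate the recorded size so the caller's
				 * kfree() loop only touches valid pointers */
				*nr_decoded = i;
				return -ENOMEM;
			}
		}
		*nr_decoded = n;
		return 0;
	}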
Reported-by: David Ramos <[email protected]>
Fixes: 4aece6a19cf7f ("nfs41: cb_sequence xdr implementation")
Cc: [email protected]
Signed-off-by: Trond Myklebust <[email protected]>
|
|
Pull networking updates from David Miller:
1) More iov_iter conversion work from Al Viro.
[ The "crypto: switch af_alg_make_sg() to iov_iter" commit was
wrong, and this pull actually adds an extra commit on top of the
branch I'm pulling to fix that up, so that the pre-merge state is
ok. - Linus ]
2) Various optimizations to the ipv4 forwarding information base trie
lookup implementation. From Alexander Duyck.
3) Remove sock_iocb altogether, from Christoph Hellwig.
4) Allow congestion control algorithm selection via routing metrics.
From Daniel Borkmann.
5) Make ipv4 uncached route list per-cpu, from Eric Dumazet.
6) Handle rfs hash collisions more gracefully, also from Eric Dumazet.
7) Add xmit_more support to r8169, e1000, and e1000e drivers. From
Florian Westphal.
8) Transparent Ethernet Bridging support for GRO, from Jesse Gross.
9) Add BPF packet actions to packet scheduler, from Jiri Pirko.
10) Add support for unique flow IDs to openvswitch, from Joe Stringer.
11) New NetCP ethernet driver, from Muralidharan Karicheri and Wingman
Kwok.
12) More sanely handle out-of-window dupacks, which can result in
serious ACK storms. From Neal Cardwell.
13) Various rhashtable bug fixes and enhancements, from Herbert Xu,
Patrick McHardy, and Thomas Graf.
14) Support xmit_more in be2net, from Sathya Perla.
15) Group Policy extensions for vxlan, from Thomas Graf.
16) Remove Checksum Offload support for vxlan, from Tom Herbert.
17) Like ipv4, support lockless transmit over ipv6 UDP sockets. From
Vlad Yasevich.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1494+1 commits)
crypto: fix af_alg_make_sg() conversion to iov_iter
ipv4: Namespecify TCP PMTU mechanism
i40e: Fix for stats init function call in Rx setup
tcp: don't include Fast Open option in SYN-ACK on pure SYN-data
openvswitch: Only set TUNNEL_VXLAN_OPT if VXLAN-GBP metadata is set
ipv6: Make __ipv6_select_ident static
ipv6: Fix fragment id assignment on LE arches.
bridge: Fix inability to add non-vlan fdb entry
net: Mellanox: Delete unnecessary checks before the function call "vunmap"
cxgb4: Add support in cxgb4 to get expansion rom version via ethtool
ethtool: rename reserved1 memeber in ethtool_drvinfo for expansion ROM version
net: dsa: Remove redundant phy_attach()
IB/mlx4: Reset flow support for IB kernel ULPs
IB/mlx4: Always use the correct port for mirrored multicast attachments
net/bonding: Fix potential bad memory access during bonding events
tipc: remove tipc_snprintf
tipc: nl compat add noop and remove legacy nl framework
tipc: convert legacy nl stats show to nl compat
tipc: convert legacy nl net id get to nl compat
tipc: convert legacy nl net id set to nl compat
...
|
|
Merge misc updates from Andrew Morton:
"Bite-sized chunks this time, to avoid the MTA ratelimiting woes.
- fs/notify updates
- ocfs2
- some of MM"
That laconic "some MM" is mainly the removal of remap_file_pages(),
which is a big simplification of the VM, and which gets rid of a *lot*
of random cruft and special cases because we no longer support the
non-linear mappings that it used.
From a user interface perspective, nothing has changed, because the
remap_file_pages() syscall still exists, it's just done by emulating the
old behavior by creating a lot of individual small mappings instead of
one non-linear one.
The emulation is slower than the old "native" non-linear mappings, but
nobody really uses or cares about remap_file_pages(), and simplifying
the VM is a big advantage.
* emailed patches from Andrew Morton <[email protected]>: (78 commits)
memcg: zap memcg_slab_caches and memcg_slab_mutex
memcg: zap memcg_name argument of memcg_create_kmem_cache
memcg: zap __memcg_{charge,uncharge}_slab
mm/page_alloc.c: place zone_id check before VM_BUG_ON_PAGE check
mm: hugetlb: fix type of hugetlb_treat_as_movable variable
mm, hugetlb: remove unnecessary lower bound on sysctl handlers"?
mm: memory: merge shared-writable dirtying branches in do_wp_page()
mm: memory: remove ->vm_file check on shared writable vmas
xtensa: drop _PAGE_FILE and pte_file()-related helpers
x86: drop _PAGE_FILE and pte_file()-related helpers
unicore32: drop pte_file()-related helpers
um: drop _PAGE_FILE and pte_file()-related helpers
tile: drop pte_file()-related helpers
sparc: drop pte_file()-related helpers
sh: drop _PAGE_FILE and pte_file()-related helpers
score: drop _PAGE_FILE and pte_file()-related helpers
s390: drop pte_file()-related helpers
parisc: drop _PAGE_FILE and pte_file()-related helpers
openrisc: drop _PAGE_FILE and pte_file()-related helpers
nios2: drop _PAGE_FILE and pte_file()-related helpers
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw
Pull gfs2 updates from Steven Whitehouse:
"This time we have mostly clean ups. There is a bug fix for a NULL
dereference relating to ACLs, and another which improves (but does not
fix entirely) an allocation fall-back code path. The other three
patches are small clean ups"
* tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw:
GFS2: Fix crash during ACL deletion in acl max entry check in gfs2_set_acl()
GFS2: use __vmalloc GFP_NOFS for fs-related allocations.
GFS2: Eliminate a nonsense goto
GFS2: fix sprintf format specifier
GFS2: Eliminate __gfs2_glock_remove_from_lru
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs
Pull xfs update from Dave Chinner:
"This update contains:
- RENAME_EXCHANGE support
- Rework of the superblock logging infrastructure
- Rework of the XFS_IOCTL_SETXATTR implementation
* enables use inside user namespaces
* fixes inconsistencies setting extent size hints
- fixes for missing buffer type annotations used in log recovery
- more consolidation of libxfs headers
- preparation patches for block based PNFS support
- miscellaneous bug fixes and cleanups"
* tag 'xfs-for-linus-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs: (37 commits)
xfs: only trace buffer items if they exist
xfs: report proper f_files in statfs if we overshoot imaxpct
xfs: fix panic_mask documentation
xfs: xfs_ioctl_setattr_check_projid can be static
xfs: growfs should use synchronous transactions
xfs: fix behaviour of XFS_IOC_FSSETXATTR on directories
xfs: factor projid hint checking out of xfs_ioctl_setattr
xfs: factor extsize hint checking out of xfs_ioctl_setattr
xfs: XFS_IOCTL_SETXATTR can run in user namespaces
xfs: kill xfs_ioctl_setattr behaviour mask
xfs: disaggregate xfs_ioctl_setattr
xfs: factor out xfs_ioctl_setattr transaciton preamble
xfs: separate xflags from xfs_ioctl_setattr
xfs: FSX_NONBLOCK is not used
xfs: don't allocate an ioend for direct I/O completions
xfs: change kmem_free to use generic kvfree()
xfs: factor out a xfs_update_prealloc_flags() helper
xfs: remove incorrect error negation in attr_multi ioctl
xfs: set superblock buffer type correctly
xfs: set buf types when converting extent formats
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota interface unification and misc cleanups from Jan Kara:
"The first part of the series unifying XFS and VFS quota interfaces.
This part unifies turning quotas on and off so quota-tools and
xfs_quota can be used to manage any filesystem. This is useful so
that userspace doesn't have to distinguish which filesystem it is
working with. As a result we can then easily reuse tests for project
quotas in XFS for ext4.
This also contains minor cleanups and fixes for udf, isofs, and ext3"
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: (23 commits)
udf: remove bool assignment to 0/1
udf: use bool for done
quota: Store maximum space limit in bytes
quota: Remove quota_on_meta callback
ocfs2: Use generic helpers for quotaon and quotaoff
ext4: Use generic helpers for quotaon and quotaoff
quota: Add ->quota_{enable,disable} callbacks for VFS quotas
quota: Wire up ->quota_{enable,disable} callbacks into Q_QUOTA{ON,OFF}
quota: Split ->set_xstate callback into two
xfs: Remove some pointless quota checks
xfs: Remove some useless flags tests
xfs: Remove useless test
quota: Verify flags passed to Q_SETINFO
quota: Cleanup flags definitions
ocfs2: Move OLQF_CLEAN flag out of generic quota flags
quota: Don't store flags for v2 quota format
jbd: drop jbd_ENOSYS debug
udf: destroy sbi mutex in put_super
udf: Check length of extended attributes and allocation descriptors
udf: Remove repeated loads blocksize
...
|