Age | Commit message (Collapse) | Author | Files | Lines |
|
We often need to find whether the last buffer is anon or not, and
currently it's rather clumsy:
check if ->iov_offset is non-zero (i.e. that pipe is not empty)
if so, get the corresponding pipe_buffer and check its ->ops
if it's &default_pipe_buf_ops, we have an anon buffer.
Let's replace the use of ->iov_offset (which is nowhere near similar to
its role for other flavours) with signed field (->last_offset), with
the following rules:
empty, no buffers occupied: 0
anon, with bytes up to N-1 filled: N
zero-copy, with bytes up to N-1 filled: -N
That way abs(i->last_offset) is equal to what used to be in i->iov_offset
and empty vs. anon vs. zero-copy can be distinguished by the sign of
i->last_offset.
Checks for "should we extend the last buffer or should we start
a new one?" become easier to follow that way.
Note that most of the operations can only be done in a sane
state - i.e. when the pipe has nothing past the current position of
iterator. About the only thing that could be done outside of that
state is iov_iter_advance(), which transitions to the sane state by
truncating the pipe. There are only two cases where we leave the
sane state:
1) iov_iter_get_pages()/iov_iter_get_pages_alloc(). Will be
dealt with later, when we make get_pages advancing - the callers are
actually happier that way.
2) iov_iter copied, then something is put into the copy. Since
they share the underlying pipe, the original gets behind. When we
decide that we are done with the copy (original is not usable until then)
we advance the original. direct_io used to be done that way; nowadays
it operates on the original and we do iov_iter_revert() to discard
the excessive data. At the moment there's nothing in the kernel that
could do that to ITER_PIPE iterators, so this reason for insane state
is theoretical right now.
Signed-off-by: Al Viro <[email protected]>
|
|
Fold pipe_truncate() into it, clean up. We can release buffers
in the same loop where we walk backwards to the iterator beginning
looking for the place where the new position will be.
Signed-off-by: Al Viro <[email protected]>
|
|
instead of setting ->iov_offset for new position and calling
pipe_truncate() to adjust ->len of the last buffer and discard
everything after it, adjust ->len at the same time we set ->iov_offset
and use pipe_discard_from() to deal with buffers past that.
Signed-off-by: Al Viro <[email protected]>
|
|
it's only used to get to the partial buffer we can add to,
and that's always the last one, i.e. pipe->head - 1.
Signed-off-by: Al Viro <[email protected]>
|
|
Expand the only remaining call of push_pipe() (in
__pipe_get_pages()), combine it with the page-collecting loop there.
Note that the only reason it's not a loop doing append_pipe() is
that append_pipe() is advancing, while iov_iter_get_pages() is not.
As soon as it switches to saner semantics, this thing will switch
to using append_pipe().
Signed-off-by: Al Viro <[email protected]>
|
|
New helper: append_pipe(). Extends the last buffer if possible,
allocates a new one otherwise. Returns page and offset in it
on success, NULL on failure. iov_iter is advanced past the
data we've got.
Use that instead of push_pipe() in copy-to-pipe primitives;
they get simpler that way. Handling of short copy (in "mc" one)
is done simply by iov_iter_revert() - iov_iter is in consistent
state after that one, so we can use that.
[Fix for braino caught by Liu Xinpeng <[email protected]> folded in]
[another braino fix, this time in copy_pipe_to_iter() and pipe_zero();
caught by testcase from Hugh Dickins]
Signed-off-by: Al Viro <[email protected]>
|
|
There are only two kinds of pipe_buffer in the area used by ITER_PIPE.
1) anonymous - copy_to_iter() et.al. end up creating those and copying
data there. They have zero ->offset, and their ->ops points to
default_pipe_page_ops.
2) zero-copy ones - those come from copy_page_to_iter(), and page
comes from caller. ->offset is also caller-supplied - it might be
non-zero. ->ops points to page_cache_pipe_buf_ops.
Move creation and insertion of those into helpers - push_anon(pipe, size)
and push_page(pipe, page, offset, size) resp., separating them from
the "could we avoid creating a new buffer by merging with the current
head?" logics.
Acked-by: Jeff Layton <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
pipe_buffer instances of a pipe are organized as a ring buffer,
with power-of-2 size. Indices are kept *not* reduced modulo ring
size, so the buffer refered to by index N is
pipe->bufs[N & (pipe->ring_size - 1)].
Ring size can change over the lifetime of a pipe, but not while
the pipe is locked. So for any iov_iter primitives it's a constant.
Original conversion of pipes to this layout went overboard trying
to microoptimize that - calculating pipe->ring_size - 1, storing
it in a local variable and using through the function. In some
cases it might be warranted, but most of the times it only
obfuscates what's going on in there.
Introduce a helper (pipe_buf(pipe, N)) that would encapsulate
that and use it in the obvious cases. More will follow...
Reviewed-by: Jeff Layton <[email protected]>
Reviewed-by: Christian Brauner (Microsoft) <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
Use pipe_discard_from() explicitly in generic_file_read_iter(); don't bother
with rather non-obvious use of iov_iter_advance() in there.
Reviewed-by: Jeff Layton <[email protected]>
Reviewed-by: Christian Brauner (Microsoft) <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
Reviewed-by: Christian Brauner (Microsoft) <[email protected]>
Signed-off-by: Al Viro <[email protected]>
|
|
Equivalent of single-segment iovec. Initialized by iov_iter_ubuf(),
checked for by iter_is_ubuf(), otherwise behaves like ITER_IOVEC
ones.
We are going to expose the things like ->write_iter() et.al. to those
in subsequent commits.
New predicate (user_backed_iter()) that is true for ITER_IOVEC and
ITER_UBUF; places like direct-IO handling should use that for
checking that pages we modify after getting them from iov_iter_get_pages()
would need to be dirtied.
DO NOT assume that replacing iter_is_iovec() with user_backed_iter()
will solve all problems - there's code that uses iter_is_iovec() to
decide how to poke around in iov_iter guts and for that the predicate
replacement obviously won't suffice.
Signed-off-by: Al Viro <[email protected]>
|
|
What happens if a thread is preempted after mapping pages with
kmap_local_page() was questioned recently.[1]
Commit f3ba3c710ac5 ("mm/highmem: Provide kmap_local*") from Thomas
Gleixner explains clearly that on context switch, the maps of an outgoing
task are removed and the map of the incoming task are restored and that
kmap_local_page() can be invoked from both preemptible and atomic
contexts.[2]
Therefore, for the purpose to make it clearer that users can call
kmap_local_page() from contexts that allow preemption, rework a couple of
sentences and add further information in highmem.rst.
[1] https://lore.kernel.org/lkml/5303077.Sb9uPGUboI@opensuse/
[2] https://lore.kernel.org/all/[email protected]/
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Fabio M. De Francesco <[email protected]>
Suggested-by: Ira Weiny <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Peter Collingbourne <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
kmap_local_page() should always be preferred in place of kmap() and
kmap_atomic(). "Only use when really necessary." is not consistent with
the Documentation/mm/highmem.rst and these kdocs it embeds.
Therefore, delete the above-mentioned sentence from kdocs.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Fabio M. De Francesco <[email protected]>
Suggested-by: Ira Weiny <[email protected]>
Reviewed-by: Ira Weiny <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Peter Collingbourne <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The reasoning for converting kmap() to kmap_local_page() was questioned
recently.[1]
There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) kmap() also requires global TLB invalidation when
its pool wraps and it might block when the mapping space is fully utilized
until a slot becomes available.
Warn users to avoid the use of kmap() and instead use kmap_local_page(),
by designing their code to map pages in the same context the mapping will
be used.
[1] https://lore.kernel.org/lkml/1891319.taCxCBeP46@opensuse/
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Fabio M. De Francesco <[email protected]>
Suggested-by: Ira Weiny <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Peter Collingbourne <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Users of kmap_local_page() must be absolutely sure to not hand kernel
virtual address obtained calling kmap_local_page() on highmem pages to
other contexts because those pointers are thread local, therefore, they
are no longer valid across different contexts.
Extend the documentation of kmap_local_page() to warn users about the
above-mentioned potential invalid use of pointers returned by
kmap_local_page().
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Fabio M. De Francesco <[email protected]>
Suggested-by: Ira Weiny <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Peter Collingbourne <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
There is no need to kmap*() pages which are guaranteed to come from
ZONE_NORMAL (or lower). Linux has currently several call sites of
kmap{,_atomic,_local_page}() on pages which are clearly known which can't
come from ZONE_HIGHMEM.
Therefore, add a paragraph to highmem.rst, to explain better that a plain
page_address() may be used for getting the address of pages which cannot
come from ZONE_HIGHMEM, although it is always safe to use
kmap_local_page() / kunmap_local() also on those pages.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Fabio M. De Francesco <[email protected]>
Suggested-by: Ira Weiny <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Peter Collingbourne <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
In a recent thread about converting kmap() to kmap_local_page(), the
safety of calling kmap_local_page() was questioned.[1]
"any context" should probably be enough detail for users who want to know
whether or not kmap_local_page() can be called from interrupts. However,
Linux still has kmap_atomic() which might make users think they must use
the latter in interrupts.
Add "including interrupts" for better clarity.
[1] https://lore.kernel.org/lkml/3187836.aeNJFYEL58@opensuse/
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Fabio M. De Francesco <[email protected]>
Suggested-by: Ira Weiny <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Peter Collingbourne <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Patch series "highmem: Extend kmap_local_page() documentation", v2.
The Highmem interface is evolving and the current documentation does not
reflect the intended uses of each of the calls. Furthermore, after a
recent series of reworks, the differences of the calls can still be
confusing and may lead to the expanded use of calls which are deprecated.
This series is the second round of changes towards an enhanced
documentation of the Highmem's interface; at this stage the patches are
only focused to kmap_local_page().
In addition it also contains some minor clean ups.
This patch (of 7):
In the kdocs of kmap_local_page(), the description of @page starts after
several unnecessary spaces.
Therefore, remove those spaces.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Fabio M. De Francesco <[email protected]>
Suggested-by: Ira Weiny <[email protected]>
Reviewed-by: Ira Weiny <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Peter Collingbourne <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Now error handling code is prepared, so remove the blocking code and
enable memory error handling on 1GB hugepage.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Naoya Horiguchi <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Liu Shixin <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Currently if memory_failure() (modified to remove blocking code with
subsequent patch) is called on a page in some 1GB hugepage, memory error
handling fails and the raw error page gets into leaked state. The impact
is small in production systems (just leaked single 4kB page), but this
limits the testability because unpoison doesn't work for it. We can no
longer create 1GB hugepage on the 1GB physical address range with such
leaked pages, that's not useful when testing on small systems.
When a hwpoison page in a 1GB hugepage is handled, it's caught by the
PageHWPoison check in free_pages_prepare() because the 1GB hugepage is
broken down into raw error pages before coming to this point:
if (unlikely(PageHWPoison(page)) && !order) {
...
return false;
}
Then, the page is not sent to buddy and the page refcount is left 0.
Originally this check is supposed to work when the error page is freed
from page_handle_poison() (that is called from soft-offline), but now we
are opening another path to call it, so the callers of
__page_handle_poison() need to handle the case by considering the return
value 0 as success. Then page refcount for hwpoison is properly
incremented so unpoison works.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Naoya Horiguchi <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Liu Shixin <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
__page_handle_poison() returns bool that shows whether
take_page_off_buddy() has passed or not now. But we will want to
distinguish another case of "dissolve has passed but taking off failed" by
its return value. So change the type of the return value. No functional
change.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Naoya Horiguchi <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Liu Shixin <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
If memory_failure() fails to grab page refcount on a hugetlb page because
it's busy, it returns without setting PG_hwpoison on it. This not only
loses a chance of error containment, but breaks the rule that
action_result() should be called only when memory_failure() do any of
handling work (even if that's just setting PG_hwpoison). This
inconsistency could harm code maintainability.
So set PG_hwpoison and call hugetlb_set_page_hwpoison() for such a case.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 405ce051236c ("mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb()")
Signed-off-by: Naoya Horiguchi <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Liu Shixin <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Raw error info list needs to be removed when hwpoisoned hugetlb is
unpoisoned. And unpoison handler needs to know how many errors there are
in the target hugepage. So add them.
HPageVmemmapOptimized(hpage) and HPageRawHwpUnreliable(hpage)) sometimes
can't be unpoisoned, so skip them.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Naoya Horiguchi <[email protected]>
Reported-by: kernel test robot <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Liu Shixin <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
When handling memory error on a hugetlb page, the error handler tries to
dissolve and turn it into 4kB pages. If it's successfully dissolved,
PageHWPoison flag is moved to the raw error page, so that's all right.
However, dissolve sometimes fails, then the error page is left as
hwpoisoned hugepage. It's useful if we can retry to dissolve it to save
healthy pages, but that's not possible now because the information about
where the raw error pages is lost.
Use the private field of a few tail pages to keep that information. The
code path of shrinking hugepage pool uses this info to try delayed
dissolve. In order to remember multiple errors in a hugepage, a
singly-linked list originated from SUBPAGE_INDEX_HWPOISON-th tail page is
constructed. Only simple operations (adding an entry or clearing all) are
required and the list is assumed not to be very long, so this simple data
structure should be enough.
If we failed to save raw error info, the hwpoison hugepage has errors on
unknown subpage, then this new saving mechanism does not work any more, so
disable saving new raw error info and freeing hwpoison hugepages.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Naoya Horiguchi <[email protected]>
Reported-by: kernel test robot <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Liu Shixin <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
follow_pud_mask() does not support non-present pud entry now. As long as
I tested on x86_64 server, follow_pud_mask() still simply returns
no_page_table() for non-present_pud_entry() due to pud_bad(), so no severe
user-visible effect should happen. But generally we should call
follow_huge_pud() for non-present pud entry for 1GB hugetlb page.
Update pud_huge() and follow_huge_pud() to handle non-present pud entries.
The changes are similar to previous works for pud entries commit
e66f17ff7177 ("mm/hugetlb: take page table lock in follow_huge_pmd()") and
commit cbef8478bee5 ("mm/hugetlb: pmd_huge() returns true for non-present
hugepage").
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Naoya Horiguchi <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Liu Shixin <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
return_unused_surplus_pages()
Patch series "mm, hwpoison: enable 1GB hugepage support", v7.
This patch (of 8):
I found a weird state of 1GB hugepage pool, caused by the following
procedure:
- run a process reserving all free 1GB hugepages,
- shrink free 1GB hugepage pool to zero (i.e. writing 0 to
/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages), then
- kill the reserving process.
, then all the hugepages are free *and* surplus at the same time.
$ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
3
$ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/free_hugepages
3
$ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/resv_hugepages
0
$ cat /sys/kernel/mm/hugepages/hugepages-1048576kB/surplus_hugepages
3
This state is resolved by reserving and allocating the pages then freeing
them again, so this seems not to result in serious problem. But it's a
little surprising (shrinking pool suddenly fails).
This behavior is caused by hstate_is_gigantic() check in
return_unused_surplus_pages(). This was introduced so long ago in 2008 by
commit aa888a74977a ("hugetlb: support larger than MAX_ORDER"), and at
that time the gigantic pages were not supposed to be allocated/freed at
run-time. Now kernel can support runtime allocation/free, so let's check
gigantic_page_runtime_supported() together.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Naoya Horiguchi <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Liu Shixin <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: kernel test robot <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
There is already a macro PTRS_PER_PTE to represent the number of page
table entries, just use it.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Muchun Song <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Xiongchun Duan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
All the comments which explains how HVO works are moved to
vmemmap_dedup.rst since
commit 4917f55b4ef9 ("mm/sparse-vmemmap: improve memory savings for compound devmaps")
except some comments above page_fixed_fake_head(). This commit moves
those comments to vmemmap_dedup.rst and improve vmemmap_dedup.rst as well.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Muchun Song <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Xiongchun Duan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
There is a discussion about the name of hugetlb_vmemmap_alloc/free in
thread [1]. The suggestion suggested by David is rename "alloc/free" to
"optimize/restore" to make functionalities clearer to users, "optimize"
means the function will optimize vmemmap pages, while "restore" means
restoring its vmemmap pages discared before. This commit does this.
Another discussion is the confusion RESERVE_VMEMMAP_NR isn't used
explicitly for vmemmap_addr but implicitly for vmemmap_end in
hugetlb_vmemmap_alloc/free. David suggested we can compute what
hugetlb_vmemmap_init() does now at runtime. We do not need to worry for
the overhead of computing at runtime since the calculation is simple
enough and those functions are not in a hot path. This commit has the
following improvements:
1) The function suffixed name ("optimize/restore") is more expressive.
2) The logic becomes less weird in hugetlb_vmemmap_optimize/restore().
3) The hugetlb_vmemmap_init() does not need to be exported anymore.
4) A ->optimize_vmemmap_pages field in struct hstate is killed.
5) There is only one place where checks is_power_of_2(sizeof(struct
page)) instead of two places.
6) Add more comments for hugetlb_vmemmap_optimize/restore().
7) For external users, hugetlb_optimize_vmemmap_pages() is used for
detecting if the HugeTLB's vmemmap pages is optimizable originally.
In this commit, it is killed and we introduce a new helper
hugetlb_vmemmap_optimizable() to replace it. The name is more
expressive.
Link: https://lore.kernel.org/all/[email protected]/ [1]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Muchun Song <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Xiongchun Duan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
After the following commit:
78f39084b41d ("mm: hugetlb_vmemmap: add hugetlb_optimize_vmemmap sysctl")
There is no order requirement between the parameter of
"hugetlb_free_vmemmap" and "hugepages" since we have removed the check of
whether HVO is enabled from hugetlb_vmemmap_init(). Therefore we can
safely replace early_param() with core_param() to simplify the code.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Muchun Song <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Xiongchun Duan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
When I first introduced vmemmap manipulation functions related to HugeTLB,
I thought those functions may be reused by other modules (e.g. using
similar approach to optimize vmemmap pages, unfortunately, the DAX used
the same approach but does not use those functions). After two years, we
didn't see any other users. So move those functions to hugetlb_vmemmap.c.
Code movement without any functional change.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Muchun Song <[email protected]>
Reviewed-by: Oscar Salvador <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Xiongchun Duan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
It it inconvenient to mention the feature of optimizing vmemmap pages
associated with HugeTLB pages when communicating with others since there
is no specific or abbreviated name for it when it is first introduced.
Let us give it a name HVO (HugeTLB Vmemmap Optimization) from now.
This commit also updates the document about "hugetlb_free_vmemmap" by the
way discussed in thread [1].
Link: https://lore.kernel.org/all/[email protected]/ [1]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Muchun Song <[email protected]>
Reviewed-by: Oscar Salvador <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Xiongchun Duan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
We hold an another reference to hugetlb_optimize_vmemmap_key when making
vmemmap_optimize_mode on, because we use static_key to tell memory_hotplug
that memory_hotplug.memmap_on_memory should be overridden. However, this
rule has gone when we have introduced PageVmemmapSelfHosted. Therefore,
we could simplify vmemmap_optimize_mode handling by not holding an another
reference to hugetlb_optimize_vmemmap_key. This also means that we not
incur the extra page_fixed_fake_head checks if there are no vmemmap
optinmized hugetlb pages after this change.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Muchun Song <[email protected]>
Reviewed-by: Oscar Salvador <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Xiongchun Duan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Patch series "Simplify hugetlb vmemmap and improve its readability", v2.
This series aims to simplify hugetlb vmemmap and improve its readability.
This patch (of 8):
The name hugetlb_optimize_vmemmap_enabled() a bit confusing as it tests
two conditions (enabled and pages in use). Instead of coming up to an
appropriate name, we could just delete it. There is already a discussion
about deleting it in thread [1].
There is only one user of hugetlb_optimize_vmemmap_enabled() outside of
hugetlb_vmemmap, that is flush_dcache_page() in arch/arm64/mm/flush.c.
However, it does not need to call hugetlb_optimize_vmemmap_enabled() in
flush_dcache_page() since HugeTLB pages are always fully mapped and only
head page will be set PG_dcache_clean meaning only head page's flag may
need to be cleared (see commit cf5a501d985b). So it is easy to remove
hugetlb_optimize_vmemmap_enabled().
Link: https://lore.kernel.org/all/[email protected]/ [1]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Muchun Song <[email protected]>
Reviewed-by: Oscar Salvador <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Xiongchun Duan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux
Pull remoteproc updates from Bjorn Andersson:
"This introduces support for the remoteproc on Mediatek MT8188, and
enables caches for MT8186 SCP. It adds support for PRU cores found on
the TI K3 AM62x SoCs.
It moves the recovery work after a firmware crash to an unbound
workqueue, to allow recovery to happen in parallel.
A new DMA API is introduced to release dma_mem for a device.
It adds support a panic handler for the Qualcomm modem remoteproc,
with the goal of having caches flushed in memory dumps for post-mortem
debugging and it introduces a mechanism to wait for the modem firmware
on SM8450 to decrypt part of its memory for post-mortem debugging.
Qualcomm sysmon is restricted to only inform remote processors about
peers that are actually running, to avoid a race where Linux tries to
notify a recovering remote processor about its peers new state. A
mechanism for waiting for the sysmon connection to be established is
also introduced, to avoid out-of-sync updates for rapidly restarting
remote processors.
A number of Devicetree binding cleanups and conversions to YAML are
introduced, to facilitate Devicetree validation. Lastly it introduces
a number of smaller fixes and cleanups in the core and a few different
drivers"
* tag 'rproc-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: (42 commits)
remoteproc: qcom_q6v5_pas: Do not fail if regulators are not found
drivers/remoteproc: fix repeated words in comments
remoteproc: Directly use ida_alloc()/free()
remoteproc: Use unbounded workqueue for recovery work
remoteproc: using pm_runtime_resume_and_get instead of pm_runtime_get_sync
remoteproc: qcom_q6v5_pas: Deal silently with optional px and cx regulators
remoteproc: sysmon: Send sysmon state only for running rprocs
remoteproc: sysmon: Wait for SSCTL service to come up
remoteproc: qcom: q6v5: Set q6 state to offline on receiving wdog irq
remoteproc: qcom: pas: Check if coredump is enabled
remoteproc: qcom: pas: Mark devices as wakeup capable
remoteproc: qcom: pas: Mark va as io memory
remoteproc: qcom: pas: Add decrypt shutdown support for modem
remoteproc: qcom: q6v5-mss: add powerdomains to MSM8996 config
remoteproc: qcom_q6v5: Introduce panic handler for MSS
remoteproc: qcom_q6v5_mss: Update MBA log info
remoteproc: qcom: correct kerneldoc
remoteproc: qcom_q6v5_mss: map/unmap metadata region before/after use
remoteproc: qcom: using pm_runtime_resume_and_get to simplify the code
remoteproc: mediatek: Support MT8188 SCP
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux
Pull rpmsg updates from Bjorn Andersson:
"This contains fixes and cleanups in the rpmsg core, Qualcomm SMD and
GLINK drivers, a circular lock dependency in the Mediatek driver and
a possible race condition in the rpmsg_char driver is resolved"
* tag 'rpmsg-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
rpmsg: convert sysfs snprintf to sysfs_emit
rpmsg: qcom_smd: Fix refcount leak in qcom_smd_parse_edge
rpmsg: qcom: correct kerneldoc
rpmsg: qcom: glink: remove unused name
rpmsg: qcom: glink: replace strncpy() with strscpy_pad()
rpmsg: Strcpy is not safe, use strscpy_pad() instead
rpmsg: Fix possible refcount leak in rpmsg_register_device_override()
rpmsg: Fix parameter naming for announce_create/destroy ops
rpmsg: mtk_rpmsg: Fix circular locking dependency
rpmsg: char: Add mutex protection for rpmsg_eptdev_open()
|
|
git://www.linux-watchdog.org/linux-watchdog
Pull watchdog updates from Wim Van Sebroeck:
- add RTL9310 support
- sp805_wdt: add arm cmsdk apb wdt support
- Remove #ifdef guards for PM related functions for several watchdog
device drivers
- pm8916_wdt reboot improvements
- Several other fixes and improvements
* tag 'linux-watchdog-5.20-rc1' of git://www.linux-watchdog.org/linux-watchdog: (24 commits)
watchdog: armada_37xx_wdt: check the return value of devm_ioremap() in armada_37xx_wdt_probe()
watchdog: dw_wdt: Fix comment typo
watchdog: Fix comment typo
dt-bindings: watchdog: Add fsl,scu-wdt yaml file
watchdog:Fix typo in comment
watchdog: pm8916_wdt: Handle watchdog enabled by bootloader
watchdog: pm8916_wdt: Report reboot reason
watchdog: pm8916_wdt: Avoid read of write-only PET register
watchdog: wdat_wdt: Remove #ifdef guards for PM related functions
watchdog: tegra_wdt: Remove #ifdef guards for PM related functions
watchdog: st_lpc_wdt: Remove #ifdef guards for PM related functions
watchdog: sama5d4_wdt: Remove #ifdef guards for PM related functions
watchdog: s3c2410_wdt: Remove #ifdef guards for PM related functions
watchdog: mtk_wdt: Remove #ifdef guards for PM related functions
watchdog: dw_wdt: Remove #ifdef guards for PM related functions
watchdog: bcm7038_wdt: Remove #ifdef guards for PM related functions
watchdog: realtek-otto: add RTL9310 support
dt-bindings: watchdog: realtek,otto-wdt: add RTL9310
watchdog: sp805_wdt: add arm cmsdk apb wdt support
watchdog: sp5100_tco: Fix a memory leak of EFCH MMIO resource
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These are ARM cpufreq updates and operating performance points (OPP)
updates plus one cpuidle update adding a new trace point.
Specifics:
- Fix return error code in mtk_cpu_dvfs_info_init (Yang Yingliang).
- Minor cleanups and support for new boards for Qcom cpufreq drivers
(Bryan O'Donoghue, Konrad Dybcio, Pierre Gondois, and Yicong Yang).
- Fix sparse warnings for Tegra cpufreq driver (Viresh Kumar).
- Make dev_pm_opp_set_regulators() accept NULL terminated list
(Viresh Kumar).
- Add dev_pm_opp_set_config() and friends and migrate other users and
helpers to using them (Viresh Kumar).
- Add support for multiple clocks for a device (Viresh Kumar and
Krzysztof Kozlowski).
- Configure resources before adding OPP table for Venus (Stanimir
Varbanov).
- Keep reference count up for opp->np and opp_table->np while they
are still in use (Liang He).
- Minor OPP cleanups (Viresh Kumar and Yang Li).
- Add a trace event for cpuidle to track missed (too deep or too
shallow) wakeups (Kajetan Puchalski)"
* tag 'pm-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (55 commits)
cpuidle: Add cpu_idle_miss trace event
venus: pm_helpers: Fix warning in OPP during probe
OPP: Don't drop opp->np reference while it is still in use
OPP: Don't drop opp_table->np reference while it is still in use
cpufreq: tegra194: Staticize struct tegra_cpufreq_soc instances
dt-bindings: cpufreq: cpufreq-qcom-hw: Add SM6375 compatible
dt-bindings: opp: Add msm8939 to the compatible list
dt-bindings: opp: Add missing compat devices
dt-bindings: opp: opp-v2-kryo-cpu: Fix example binding checks
cpufreq: Change order of online() CB and policy->cpus modification
cpufreq: qcom-hw: Remove deprecated irq_set_affinity_hint() call
cpufreq: qcom-hw: Disable LMH irq when disabling policy
cpufreq: qcom-hw: Reset cancel_throttle when policy is re-enabled
cpufreq: qcom-cpufreq-hw: use HZ_PER_KHZ macro in units.h
cpufreq: mediatek: fix error return code in mtk_cpu_dvfs_info_init()
OPP: Remove dev{m}_pm_opp_of_add_table_noclk()
PM / devfreq: tegra30: Register config_clks helper
OPP: Allow config_clks helper for single clk case
OPP: Provide a simple implementation to configure multiple clocks
OPP: Assert clk_count == 1 for single clk helpers
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more thermal control updates from Rafael Wysocki:
"These fix an error code path issue leading to a NULL pointer
dereference, drop Kconfig dependencies that are not needed any more
after recent changes, add CPU IDs for new chips to a driver and fix up
the tmon utility.
Specifics:
- Fix NULL pointer dereference in the thermal sysfs interface that
results from an error code path mishandling (Rafael Wysocki).
- Drop COMPILE_TEST dependency that's not needed any more from two
thermal Kconfig entries (Jean Delvare).
- Make the Intel TCC cooling driver support Alder Lake-N and Raptor
Lake-P (Sumeet Pawnikar).
- Fix possible path truncations in the tmon utility (Florian
Fainelli)"
* tag 'thermal-5.20-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
tools/thermal: Fix possible path truncations
thermal: Drop obsolete dependency on COMPILE_TEST
thermal: sysfs: Fix cooling_device_stats_setup() error code path
thermal: intel: Add TCC cooling support for Alder Lake-N and Raptor Lake-P
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull sysctl updates from Luis Chamberlain:
"There isn't much for 6.0 for sysctl stuff, most of the stuff went
through the networking subsystem (Kuniyuki Iwashima's trove of fixes
using READ_ONCE/WRITE_ONCE helpers) as most of the issues there have
been identified on networking side. So it is good we don't have much
updates as we would have ended up with tons of conflicts. I rebased my
delta just now to your tree so to avoid conflicts with that stuff.
This merge request is just minor fluff cleanups then. Perhaps for 6.1
kernel/sysctl.c will get more love than this release"
* tag 'sysctl-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
kernel/sysctl.c: Remove trailing white space
kernel/sysctl.c: Clean up indentation, replace spaces with tab.
sysctl: Merge adjacent CONFIG_TREE_RCU blocks
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull module updates from Luis Chamberlain:
"For the 6.0 merge window the modules code shifts to cleanup and minor
fixes effort. This becomes much easier to do and review now due to the
code split to its own directory from effort on the last kernel
release. I expect to see more of this with time and as we expand on
test coverage in the future. The cleanups and fixes come from usual
suspects such as Christophe Leroy and Aaron Tomlin but there are also
some other contributors.
One particular minor fix worth mentioning is from Helge Deller, where
he spotted a *forever* incorrect natural alignment on both ELF section
header tables:
* .altinstructions
* __bug_table sections
A lot of back and forth went on in trying to determine the ill effects
of this misalignment being present for years and it has been
determined there should be no real ill effects unless you have a buggy
exception handler. Helge actually hit one of these buggy exception
handlers on parisc which is how he ended up spotting this issue. When
implemented correctly these paths with incorrect misalignment would
just mean a performance penalty, but given that we are dealing with
alternatives on modules and with the __bug_table (where info regardign
BUG()/WARN() file/line information associated with it is stored) this
really shouldn't be a big deal.
The only other change with mentioning is the kmap() with
kmap_local_page() and my only concern with that was on what is done
after preemption, but the virtual addresses are restored after
preemption. This is only used on module decompression.
This all has sit on linux-next for a while except the kmap stuff which
has been there for 3 weeks"
* tag 'modules-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
module: Replace kmap() with kmap_local_page()
module: Show the last unloaded module's taint flag(s)
module: Use strscpy() for last_unloaded_module
module: Modify module_flags() to accept show_state argument
module: Move module's Kconfig items in kernel/module/
MAINTAINERS: Update file list for module maintainers
module: Use vzalloc() instead of vmalloc()/memset(0)
modules: Ensure natural alignment for .altinstructions and __bug_table sections
module: Increase readability of module_kallsyms_lookup_name()
module: Fix ERRORs reported by checkpatch.pl
module: Add support for default value for module async_probe
|
|
NFS unlink() (and rename over existing target) must determine if the
file is open, and must perform a "silly rename" instead of an unlink (or
before rename) if it is. Otherwise the client might hold a file open
which has been removed on the server.
Consequently if it determines that the file isn't open, it must block
any subsequent opens until the unlink/rename has been completed on the
server.
This is currently achieved by unhashing the dentry. This forces any
open attempt to the slow-path for lookup which will block on i_rwsem on
the directory until the unlink/rename completes. A future patch will
change the VFS to only get a shared lock on i_rwsem for unlink, so this
will no longer work.
Instead we introduce an explicit interlock. A special value is stored
in dentry->d_fsdata while the unlink/rename is running and
->d_revalidate blocks while that value is present. When ->d_revalidate
unblocks, the dentry will be invalid. This closes the race
without requiring exclusion on i_rwsem.
d_fsdata is already used in two different ways.
1/ an IS_ROOT directory dentry might have a "devname" stored in
d_fsdata. Such a dentry doesn't have a name and so cannot be the
target of unlink or rename. For safety we check if an old devname
is still stored, and remove it if it is.
2/ a dentry with DCACHE_NFSFS_RENAMED set will have a 'struct
nfs_unlinkdata' stored in d_fsdata. While this is set maydelete()
will fail, so an unlink or rename will never proceed on such
a dentry.
Neither of these can be in effect when a dentry is the target of unlink
or rename. So we can expect d_fsdata to be NULL, and store a special
value ((void*)1) which is given the name NFS_FSDATA_BLOCKED to indicate
that any lookup will be blocked.
The d_count() is incremented under d_lock() when a lookup finds the
dentry, so we check d_count() is low, and set NFS_FSDATA_BLOCKED under
the same lock to avoid any races.
Signed-off-by: NeilBrown <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds
Pull LED updates from Pavel Machek:
"A new driver for bcm63138, is31fl319x updates, fixups for multicolor.
The clevo-mail driver got disabled, it needs an API fix"
* tag 'leds-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds: (23 commits)
leds: is31fl319x: use simple i2c probe function
leds: is31fl319x: Fix devm vs. non-devm ordering
leds: is31fl319x: Make use of dev_err_probe()
leds: is31fl319x: Make use of device properties
leds: is31fl319x: Cleanup formatting and dev_dbg calls
leds: is31fl319x: Add support for is31fl319{0,1,3} chips
leds: is31fl319x: Move chipset-specific values in chipdef struct
leds: is31fl319x: Use non-wildcard names for vars, structs and defines
leds: is31fl319x: Add missing si-en compatibles
dt-bindings: leds: pwm-multicolor: document max-brigthness
leds: turris-omnia: convert to use dev_groups
leds: leds-bcm63138: get rid of LED_OFF
leds: add help info about BCM63138 module name
dt-bindings: leds: leds-bcm63138: unify full stops in descriptions
dt-bindings: leds: lp50xx: fix LED children names
dt-bindings: leds: class-multicolor: reference class directly in multi-led node
leds: bcm63138: add support for BCM63138 controller
dt-bindings: leds: add Broadcom's BCM63138 controller
leds: clevo-mail: Mark as broken pending interface fix
leds: pwm-multicolor: Support active-low LEDs
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty / serial driver updates from Greg KH:
"Here is the big set of tty and serial driver changes for 6.0-rc1.
It was delayed from last week as I wanted to make sure the last commit
here got some good testing in linux-next and elsewhere as it seemed to
show up only late in testing for some reason.
Nothing major here, just lots of cleanups from Jiri and Ilpo to make
the tty core cleaner (Jiri) and the rs485 code simpler to use (Ilpo).
Also included in here is the obligatory n_gsm updates from Daniel
Starke and lots of tiny driver updates and minor fixes and tweaks for
other smaller serial drivers.
All of these have been in linux-next for a while with no reported
problems"
* tag 'tty-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (186 commits)
tty: serial: qcom-geni-serial: Fix %lu -> %u in print statements
tty: amiserial: Fix comment typo
tty: serial: document uart_get_console()
tty: serial: serial_core, reformat kernel-doc for functions
Documentation: serial: link uart_ops properly
Documentation: serial: move GPIO kernel-doc to the functions
Documentation: serial: dedup kernel-doc for uart functions
Documentation: serial: move uart_ops documentation to the struct
dt-bindings: serial: snps-dw-apb-uart: Document Rockchip RV1126
serial: mvebu-uart: uart2 error bits clearing
tty: serial: fsl_lpuart: correct the count of break characters
serial: stm32: make info structs static to avoid sparse warnings
serial: fsl_lpuart: zero out parity bit in CS7 mode
tty: serial: qcom-geni-serial: Fix get_clk_div_rate() which otherwise could return a sub-optimal clock rate.
serial: 8250_bcm2835aux: Add missing clk_disable_unprepare()
tty: vt: initialize unicode screen buffer
serial: remove VR41XX serial driver
serial: 8250: lpc18xx: Remove redundant sanity check for RS485 flags
serial: 8250_dwlib: remove redundant sanity check for RS485 flags
dt_bindings: rs485: Correct delay values
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this cycle, we mainly fixed some corner cases that manipulate a
per-file compression flag inappropriately. And, we found f2fs counted
valid blocks in a section incorrectly when zone capacity is set, and
thus, fixed it with additional sysfs entry to check it easily.
Lastly, this series includes several patches with respect to the new
atomic write support such as a couple of bug fixes and re-adding
atomic_write_abort support that we removed by mistake in the previous
release.
Enhancements:
- add sysfs entries to understand atomic write operations and zone
capacity
- introduce memory mode to get a hint for low-memory devices
- adjust the waiting time of foreground GC
- decompress clusters under softirq to avoid non-deterministic
latency
- do not skip updating inode when retrying to flush node page
- enforce single zone capacity
Bug fixes:
- set the compression/no-compression flags correctly
- revive F2FS_IOC_ABORT_VOLATILE_WRITE
- check inline_data during compressed inode conversion
- understand zone capacity when calculating valid block count
As usual, the series includes several minor clean-ups and sanity
checks"
* tag 'f2fs-for-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (29 commits)
f2fs: use onstack pages instead of pvec
f2fs: intorduce f2fs_all_cluster_page_ready
f2fs: clean up f2fs_abort_atomic_write()
f2fs: handle decompress only post processing in softirq
f2fs: do not allow to decompress files have FI_COMPRESS_RELEASED
f2fs: do not set compression bit if kernel doesn't support
f2fs: remove device type check for direct IO
f2fs: fix null-ptr-deref in f2fs_get_dnode_of_data
f2fs: revive F2FS_IOC_ABORT_VOLATILE_WRITE
f2fs: fix to do sanity check on segment type in build_sit_entries()
f2fs: obsolete unused MAX_DISCARD_BLOCKS
f2fs: fix to avoid use f2fs_bug_on() in f2fs_new_node_page()
f2fs: fix to remove F2FS_COMPR_FL and tag F2FS_NOCOMP_FL at the same time
f2fs: introduce sysfs atomic write statistics
f2fs: don't bother wait_ms by foreground gc
f2fs: invalidate meta pages only for post_read required inode
f2fs: allow compression of files without blocks
f2fs: fix to check inline_data during compressed inode conversion
f2fs: Delete f2fs_copy_page() and replace with memcpy_page()
f2fs: fix to invalidate META_MAPPING before DIO write
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse updates from Miklos Szeredi:
- Fix an issue with reusing the bdi in case of block based filesystems
- Allow root (in init namespace) to access fuse filesystems in user
namespaces if expicitly enabled with a module param
- Misc fixes
* tag 'fuse-update-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: retire block-device-based superblock on force unmount
vfs: function to prevent re-use of block-device-based superblocks
virtio_fs: Modify format for virtio_fs_direct_access
virtiofs: delete unused parameter for virtio_fs_cleanup_vqs
fuse: Add module param for CAP_SYS_ADMIN access bypassing allow_other
fuse: Remove the control interface for virtio-fs
fuse: ioctl: translate ENOSYS
fuse: limit nsec
fuse: avoid unnecessary spinlock bump
fuse: fix deadlock between atomic O_TRUNC and page invalidation
fuse: write inode in fuse_release()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs update from Miklos Szeredi:
"Just a small update"
* tag 'ovl-update-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: fix spelling mistakes
ovl: drop WARN_ON() dentry is NULL in ovl_encode_fh()
ovl: improve ovl_get_acl() if POSIX ACL support is off
ovl: fix some kernel-doc comments
ovl: warn if trusted xattr creation fails
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
Pull exfat updates from Namjae Jeon:
- fix the error code of rename syscall
- cleanup and suppress the superfluous error messages
- remove duplicate directory entry update
- add exfat git tree in MAINTAINERS
* tag 'exfat-for-5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
MAINTAINERS: Add Namjae's exfat git tree
exfat: Drop superfluous new line for error messages
exfat: Downgrade ENAMETOOLONG error message to debug messages
exfat: Expand exfat_err() and co directly to pr_*() macro
exfat: Define NLS_NAME_* as bit flags explicitly
exfat: Return ENAMETOOLONG consistently for oversized paths
exfat: remove duplicate write inode for extending dir/file
exfat: remove duplicate write inode for truncating file
exfat: reuse __exfat_write_inode() to update directory entry
|
|
If something manages to set the maximum file size to MAX_OFFSET+1, this
can cause the xfs and ext4 filesystems at least to become corrupt.
Ordinarily, the kernel protects against userspace trying this by
checking the value early in the truncate() and ftruncate() system calls
calls - but there are at least two places that this check is bypassed:
(1) Cachefiles will round up the EOF of the backing file to DIO block
size so as to allow DIO on the final block - but this might push
the offset negative. It then calls notify_change(), but this
inadvertently bypasses the checking. This can be triggered if
someone puts an 8EiB-1 file on a server for someone else to try and
access by, say, nfs.
(2) ksmbd doesn't check the value it is given in set_end_of_file_info()
and then calls vfs_truncate() directly - which also bypasses the
check.
In both cases, it is potentially possible for a network filesystem to
cause a disk filesystem to be corrupted: cachefiles in the client's
cache filesystem; ksmbd in the server's filesystem.
nfsd is okay as it checks the value, but we can then remove this check
too.
Fix this by adding a check to inode_newsize_ok(), as called from
setattr_prepare(), thereby catching the issue as filesystems set up to
perform the truncate with minimal opportunity for bypassing the new
check.
Fixes: 1f08c925e7a3 ("cachefiles: Implement backing file wrangling")
Fixes: f44158485826 ("cifsd: add file operations")
Signed-off-by: David Howells <[email protected]>
Reported-by: Jeff Layton <[email protected]>
Tested-by: Jeff Layton <[email protected]>
Reviewed-by: Namjae Jeon <[email protected]>
Cc: [email protected]
Acked-by: Alexander Viro <[email protected]>
cc: Steve French <[email protected]>
cc: Hyunchul Lee <[email protected]>
cc: Chuck Lever <[email protected]>
cc: Dave Wysochanski <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Merge additional changes in the thermal core and thermal tools updates
for 5.20-rc1:
- Fix NULL poiter dereference in the thermal sysfs interface that
results from an error code path mishandling (Rafael Wysocki).
- Drop COMPILE_TEST dependency that's not needed any more from two
thermal Kconfig entries (Jean Delvare).
- Fix possible path truncations in the tmon utility (Florian Fainelli).
* thermal-core:
thermal: Drop obsolete dependency on COMPILE_TEST
thermal: sysfs: Fix cooling_device_stats_setup() error code path
* thermal-tools:
tools/thermal: Fix possible path truncations
|