Age | Commit message (Collapse) | Author | Files | Lines |
|
Commit 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
changed the behavior when deflation happens automatically. Instead of
deflating when called by the OOM handler, the shrinker is used.
However, the balloon is not simply some other slab cache that should be
shrunk when under memory pressure. The shrinker does not have a concept
of priorities yet, so this behavior cannot be configured. Eventually once
that is in place, we might want to switch back after doing proper testing.
There was a report that this results in undesired side effects when
inflating the balloon to shrink the page cache. [1]
"When inflating the balloon against page cache (i.e. no free memory
remains) vmscan.c will both shrink page cache, but also invoke the
shrinkers -- including the balloon's shrinker. So the balloon
driver allocates memory which requires reclaim, vmscan gets this
memory by shrinking the balloon, and then the driver adds the
memory back to the balloon. Basically a busy no-op."
The name "deflate on OOM" makes it pretty clear when deflation should
happen - after other approaches to reclaim memory failed, not while
reclaiming. This allows to minimize the footprint of a guest - memory
will only be taken out of the balloon when really needed.
Keep using the shrinker for VIRTIO_BALLOON_F_FREE_PAGE_HINT, because
this has no such side effects. Always register the shrinker with
VIRTIO_BALLOON_F_FREE_PAGE_HINT now. We are always allowed to reuse free
pages that are still to be processed by the guest. The hypervisor takes
care of identifying and resolving possible races between processing a
hinting request and the guest reusing a page.
In contrast to pre commit 71994620bb25 ("virtio_balloon: replace oom
notifier with shrinker"), don't add a module parameter to configure the
number of pages to deflate on OOM. Can be re-added if really needed.
Also, pay attention that leak_balloon() returns the number of 4k pages -
convert it properly in virtio_balloon_oom_notify().
Testing done by Tyler for future reference:
Test setup: VM with 16 CPU, 64GB RAM. Running Debian 10. We have a 42
GB file full of random bytes that we continually cat to /dev/null.
This fills the page cache as the file is read. Meanwhile, we trigger
the balloon to inflate, with a target size of 53 GB. This setup causes
the balloon inflation to pressure the page cache as the page cache is
also trying to grow. Afterwards we shrink the balloon back to zero (so
total deflate == total inflate).
Without this patch (kernel 4.19.0-5):
Inflation never reaches the target until we stop the "cat file >
/dev/null" process. Total inflation time was 542 seconds. The longest
period that made no net forward progress was 315 seconds.
Result of "grep balloon /proc/vmstat" after the test:
balloon_inflate 154828377
balloon_deflate 154828377
With this patch (kernel 5.6.0-rc4+):
Total inflation duration was 63 seconds. No deflate-queue activity
occurs when pressuring the page-cache.
Result of "grep balloon /proc/vmstat" after the test:
balloon_inflate 12968539
balloon_deflate 12968539
Conclusion: This patch fixes the issue. In the test it reduced
inflate/deflate activity by 12x, and reduced inflation time by 8.6x.
But more importantly, if we hadn't killed the "cat file > /dev/null"
process then, without the patch, the inflation process would never reach
the target.
[1] https://www.spinics.net/lists/linux-virtualization/msg40863.html
Link: http://lkml.kernel.org/r/[email protected]
Fixes: 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
Signed-off-by: David Hildenbrand <[email protected]>
Reported-by: Tyler Sanderson <[email protected]>
Tested-by: Tyler Sanderson <[email protected]>
Acked-by: David Rientjes <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Cc: Wei Wang <[email protected]>
Cc: Alexander Duyck <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Add documentation for free page reporting. Currently the only consumer is
virtio-balloon, however it is possible that other drivers might make use
of this so it is best to add a bit of documetation explaining at a high
level how to use the API.
Signed-off-by: Alexander Duyck <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Luiz Capitulino <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nitesh Narayan Lal <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Wei Wang <[email protected]>
Cc: Yang Zhang <[email protected]>
Cc: wei qi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
In order to keep ourselves from reporting pages that are just going to be
reused again in the case of heavy churn we can put a limit on how many
total pages we will process per pass. Doing this will allow the worker
thread to go into idle much more quickly so that we avoid competing with
other threads that might be allocating or freeing pages.
The logic added here will limit the worker thread to no more than one
sixteenth of the total free pages in a given area per list. Once that
limit is reached it will update the state so that at the end of the pass
we will reschedule the worker to try again in 2 seconds when the memory
churn has hopefully settled down.
Again this optimization doesn't show much of a benefit in the standard
case as the memory churn is minmal. However with page allocator shuffling
enabled the gain is quite noticeable. Below are the results with a THP
enabled version of the will-it-scale page_fault1 test showing the
improvement in iterations for 16 processes or threads.
Without:
tasks processes processes_idle threads threads_idle
16 8283274.75 0.17 5594261.00 38.15
With:
tasks processes processes_idle threads threads_idle
16 8767010.50 0.21 5791312.75 36.98
Signed-off-by: Alexander Duyck <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Luiz Capitulino <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nitesh Narayan Lal <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Wei Wang <[email protected]>
Cc: Yang Zhang <[email protected]>
Cc: wei qi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Rather than walking over the same pages again and again to get to the
pages that have yet to be reported we can save ourselves a significant
amount of time by simply rotating the list so that when we have a full
list of reported pages the head of the list is pointing to the next
non-reported page. Doing this should save us some significant time when
processing each free list.
This doesn't gain us much in the standard case as all of the non-reported
pages should be near the top of the list already. However in the case of
page shuffling this results in a noticeable improvement. Below are the
will-it-scale page_fault1 w/ THP numbers for 16 tasks with and without
this patch.
Without:
tasks processes processes_idle threads threads_idle
16 8093776.25 0.17 5393242.00 38.20
With:
tasks processes processes_idle threads threads_idle
16 8283274.75 0.17 5594261.00 38.15
Signed-off-by: Alexander Duyck <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Luiz Capitulino <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nitesh Narayan Lal <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Wei Wang <[email protected]>
Cc: Yang Zhang <[email protected]>
Cc: wei qi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Add support for the page reporting feature provided by virtio-balloon.
Reporting differs from the regular balloon functionality in that is is
much less durable than a standard memory balloon. Instead of creating a
list of pages that cannot be accessed the pages are only inaccessible
while they are being indicated to the virtio interface. Once the
interface has acknowledged them they are placed back into their respective
free lists and are once again accessible by the guest system.
Unlike a standard balloon we don't inflate and deflate the pages. Instead
we perform the reporting, and once the reporting is completed it is
assumed that the page has been dropped from the guest and will be faulted
back in the next time the page is accessed.
Signed-off-by: Alexander Duyck <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Luiz Capitulino <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nitesh Narayan Lal <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Wei Wang <[email protected]>
Cc: Yang Zhang <[email protected]>
Cc: wei qi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Currently the page poisoning setting wasn't being enabled unless free page
hinting was enabled. However we will need the page poisoning tracking
logic as well for free page reporting. As such pull it out and make it a
separate bit of config in the probe function.
In addition we need to add support for the more recent init_on_free
feature which expects a behavior similar to page poisoning in that we
expect the page to be pre-zeroed.
Signed-off-by: Alexander Duyck <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Luiz Capitulino <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nitesh Narayan Lal <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Wei Wang <[email protected]>
Cc: Yang Zhang <[email protected]>
Cc: wei qi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
In order to pave the way for free page reporting in virtualized
environments we will need a way to get pages out of the free lists and
identify those pages after they have been returned. To accomplish this,
this patch adds the concept of a Reported Buddy, which is essentially
meant to just be the Uptodate flag used in conjunction with the Buddy page
type.
To prevent the reported pages from leaking outside of the buddy lists I
added a check to clear the PageReported bit in the del_page_from_free_list
function. As a result any reported page that is split, merged, or
allocated will have the flag cleared prior to the PageBuddy value being
cleared.
The process for reporting pages is fairly simple. Once we free a page
that meets the minimum order for page reporting we will schedule a worker
thread to start 2s or more in the future. That worker thread will begin
working from the lowest supported page reporting order up to MAX_ORDER - 1
pulling unreported pages from the free list and storing them in the
scatterlist.
When processing each individual free list it is necessary for the worker
thread to release the zone lock when it needs to stop and report the full
scatterlist of pages. To reduce the work of the next iteration the worker
thread will rotate the free list so that the first unreported page in the
free list becomes the first entry in the list.
It will then call a reporting function providing information on how many
entries are in the scatterlist. Once the function completes it will
return the pages to the free area from which they were allocated and start
over pulling more pages from the free areas until there are no longer
enough pages to report on to keep the worker busy, or we have processed as
many pages as were contained in the free area when we started processing
the list.
The worker thread will work in a round-robin fashion making its way though
each zone requesting reporting, and through each reportable free list
within that zone. Once all free areas within the zone have been processed
it will check to see if there have been any requests for reporting while
it was processing. If so it will reschedule the worker thread to start up
again in roughly 2s and exit.
Signed-off-by: Alexander Duyck <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Luiz Capitulino <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nitesh Narayan Lal <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Wei Wang <[email protected]>
Cc: Yang Zhang <[email protected]>
Cc: wei qi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
There are cases where we would benefit from avoiding having to go through
the allocation and free cycle to return an isolated page.
Examples for this might include page poisoning in which we isolate a page
and then put it back in the free list without ever having actually
allocated it.
This will enable us to also avoid notifiers for the future free page
reporting which will need to avoid retriggering page reporting when
returning pages that have been reported on.
Signed-off-by: Alexander Duyck <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Luiz Capitulino <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nitesh Narayan Lal <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Wei Wang <[email protected]>
Cc: Yang Zhang <[email protected]>
Cc: wei qi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
In order to enable the use of the zone from the list manipulator functions
I will need access to the zone pointer. As it turns out most of the
accessors were always just being directly passed &zone->free_area[order]
anyway so it would make sense to just fold that into the function itself
and pass the zone and order as arguments instead of the free area.
In order to be able to reference the zone we need to move the declaration
of the functions down so that we have the zone defined before we define
the list manipulation functions. Since the functions are only used in the
file mm/page_alloc.c we can just move them there to reduce noise in the
header.
Signed-off-by: Alexander Duyck <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Pankaj Gupta <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Luiz Capitulino <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nitesh Narayan Lal <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Wei Wang <[email protected]>
Cc: Yang Zhang <[email protected]>
Cc: wei qi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "mm / virtio: Provide support for free page reporting", v17.
This series provides an asynchronous means of reporting free guest pages
to a hypervisor so that the memory associated with those pages can be
dropped and reused by other processes and/or guests on the host. Using
this it is possible to avoid unnecessary I/O to disk and greatly improve
performance in the case of memory overcommit on the host.
When enabled we will be performing a scan of free memory every 2 seconds
while pages of sufficiently high order are being freed. In each pass at
least one sixteenth of each free list will be reported. By doing this we
avoid racing against other threads that may be causing a high amount of
memory churn.
The lowest page order currently scanned when reporting pages is
pageblock_order so that this feature will not interfere with the use of
Transparent Huge Pages in the case of virtualization.
Currently this is only in use by virtio-balloon however there is the hope
that at some point in the future other hypervisors might be able to make
use of it. In the virtio-balloon/QEMU implementation the hypervisor is
currently using MADV_DONTNEED to indicate to the host kernel that the page
is currently free. It will be zeroed and faulted back into the guest the
next time the page is accessed.
To track if a page is reported or not the Uptodate flag was repurposed and
used as a Reported flag for Buddy pages. We walk though the free list
isolating pages and adding them to the scatterlist until we either
encounter the end of the list or have processed at least one sixteenth of
the pages that were listed in nr_free prior to us starting. If we fill
the scatterlist before we reach the end of the list we rotate the list so
that the first unreported page we encounter is moved to the head of the
list as that is where we will resume after we have freed the reported
pages back into the tail of the list.
Below are the results from various benchmarks. I primarily focused on two
tests. The first is the will-it-scale/page_fault2 test, and the other is
a modified version of will-it-scale/page_fault1 that was enabled to use
THP. I did this as it allows for better visibility into different parts
of the memory subsystem. The guest is running with 32G for RAM on one
node of a E5-2630 v3. The host has had some features such as CPU turbo
disabled in the BIOS.
Test page_fault1 (THP) page_fault2
Name tasks Process Iter STDEV Process Iter STDEV
Baseline 1 1012402.50 0.14% 361855.25 0.81%
16 8827457.25 0.09% 3282347.00 0.34%
Patches Applied 1 1007897.00 0.23% 361887.00 0.26%
16 8784741.75 0.39% 3240669.25 0.48%
Patches Enabled 1 1010227.50 0.39% 359749.25 0.56%
16 8756219.00 0.24% 3226608.75 0.97%
Patches Enabled 1 1050982.00 4.26% 357966.25 0.14%
page shuffle 16 8672601.25 0.49% 3223177.75 0.40%
Patches enabled 1 1003238.00 0.22% 360211.00 0.22%
shuffle w/ RFC 16 8767010.50 0.32% 3199874.00 0.71%
The results above are for a baseline with a linux-next-20191219 kernel,
that kernel with this patch set applied but page reporting disabled in
virtio-balloon, the patches applied and page reporting fully enabled, the
patches enabled with page shuffling enabled, and the patches applied with
page shuffling enabled and an RFC patch that makes used of MADV_FREE in
QEMU. These results include the deviation seen between the average value
reported here versus the high and/or low value. I observed that during
the test memory usage for the first three tests never dropped whereas with
the patches fully enabled the VM would drop to using only a few GB of the
host's memory when switching from memhog to page fault tests.
Any of the overhead visible with this patch set enabled seems due to page
faults caused by accessing the reported pages and the host zeroing the
page before giving it back to the guest. This overhead is much more
visible when using THP than with standard 4K pages. In addition page
shuffling seemed to increase the amount of faults generated due to an
increase in memory churn. The overehad is reduced when using MADV_FREE as
we can avoid the extra zeroing of the pages when they are reintroduced to
the host, as can be seen when the RFC is applied with shuffling enabled.
The overall guest size is kept fairly small to only a few GB while the
test is running. If the host memory were oversubscribed this patch set
should result in a performance improvement as swapping memory in the host
can be avoided.
A brief history on the background of free page reporting can be found at:
https://lore.kernel.org/lkml/[email protected]/
This patch (of 9):
Move the head/tail adding logic out of the shuffle code and into the
__free_one_page function since ultimately that is where it is really
needed anyway. By doing this we should be able to reduce the overhead and
can consolidate all of the list addition bits in one spot.
Signed-off-by: Alexander Duyck <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Cc: Yang Zhang <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Cc: Konrad Rzeszutek Wilk <[email protected]>
Cc: Nitesh Narayan Lal <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Luiz Capitulino <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Wei Wang <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: wei qi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Some comments for MADV_FREE is revised and added to help people understand
the MADV_FREE code, especially the page flag, PG_swapbacked. This makes
page_is_file_cache() isn't consistent with its comments. So the function
is renamed to page_is_file_lru() to make them consistent again. All these
are put in one patch as one logical change.
Suggested-by: David Hildenbrand <[email protected]>
Suggested-by: Johannes Weiner <[email protected]>
Suggested-by: David Rientjes <[email protected]>
Signed-off-by: "Huang, Ying" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Acked-by: David Rientjes <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Pankaj Gupta <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Rik van Riel <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This updates get_user_pages()'s argument in ksm_test_exit()'s comment
Signed-off-by: Li Chen <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Commit e496cf3d7821 ("thp: introduce CONFIG_TRANSPARENT_HUGE_PAGECACHE")
notes that it should be reverted when the PowerPC problem was fixed. The
commit fixing the PowerPC problem (953c66c2b22a) did not revert the
commit; instead setting CONFIG_TRANSPARENT_HUGE_PAGECACHE to the same as
CONFIG_TRANSPARENT_HUGEPAGE. Checking with Kirill and Aneesh, this was an
oversight, so remove the Kconfig symbol and undo the work of commit
e496cf3d7821.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
If THP is disabled, find_subpage() can become a no-op by using
hpage_nr_pages() instead of compound_nr(). hpage_nr_pages() embeds a
check for PageTail, so we can drop the check here.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The thp_fault_fallback and thp_file_fallback vmstats are incremented if
either the hugepage allocation fails through the page allocator or the
hugepage charge fails through mem cgroup.
This patch leaves this field untouched but adds two new fields,
thp_{fault,file}_fallback_charge, which is incremented only when the mem
cgroup charge fails.
This distinguishes between attempted hugepage allocations that fail due to
fragmentation (or low memory conditions) and those that fail due to mem
cgroup limits. That can be used to determine the impact of fragmentation
on the system by excluding faults that failed due to memcg usage.
Signed-off-by: David Rientjes <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Yang Shi <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Jeremy Cline <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The existing thp_fault_fallback indicates when thp attempts to allocate a
hugepage but fails, or if the hugepage cannot be charged to the mem cgroup
hierarchy.
Extend this to shmem as well. Adds a new thp_file_fallback to complement
thp_file_alloc that gets incremented when a hugepage is attempted to be
allocated but fails, or if it cannot be charged to the mem cgroup
hierarchy.
Additionally, remove the check for CONFIG_TRANSPARENT_HUGE_PAGECACHE from
shmem_alloc_hugepage() since it is only called with this configuration
option.
Signed-off-by: David Rientjes <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Yang Shi <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Jeremy Cline <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Currently the migration code doesn't migrate PG_readahead flag.
Theoretically this would incur slight performance loss as the application
might have to ramp its readahead back up again. Even though such problem
happens, it might be hidden by something else since migration is typically
triggered by compaction and NUMA balancing, any of which should be more
noticeable.
Migrate the flag after end_page_writeback() since it may clear PG_reclaim
flag, which is the same bit as PG_readahead, for the new page.
[[email protected]: tweak comment]
Signed-off-by: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mel Gorman <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
It can currently happen that we store the status of a page twice:
* Once we detect that it is already on the target node
* Once we moved a bunch of pages, and a page that's already on the
target node is contained in the current interval.
Let's simplify the code and always call do_move_pages_to_node() in case we
did not queue a page for migration. Note that pages that are already on
the target node are not added to the pagelist and are, therefore, ignored
by do_move_pages_to_node() - there is no functional change.
The status of such a page is now only stored once.
[[email protected] rephrase changelog]
Signed-off-by: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
When pagelist is empty, it is not necessary to do the move and store.
Also it consolidate the empty list check in one place.
Signed-off-by: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Usually, do_move_pages_to_node() and store_status() are used in
combination. We have three similar call sites.
Let's provide a wrapper for both function calls -
move_pages_and_store_status - to make the calling code easier to maintain
and fix (as noted by Yang Shi, the return value handling of
do_move_pages_to_node() has a flaw).
[[email protected] rephrase changelog]
Signed-off-by: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "cleanup on do_pages_move()", v5.
The logic in do_pages_move() is a little mess for audience to read and has
some potential error on handling the return value. Especially there are
three calls on do_move_pages_to_node() and store_status() with almost the
same form.
This patch set tries to make the code a little friendly for audience by
consolidate the calls.
This patch (of 4):
At this point, we always have i >= start. If i == start, store_status()
will return 0. So we can drop the check for i > start.
[[email protected] rephrase changelog]
Signed-off-by: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
allocations
While it might be really clear to MM developers that gfp reclaim modifiers
are applicable only to sleepable allocations (those with
__GFP_DIRECT_RECLAIM) it seems that actual users of the API are not always
sure. Make it explicit that they are not applicable for GFP_NOWAIT or
GFP_ATOMIC allocations which are the most commonly used non-sleepable
allocation masks.
Signed-off-by: Michal Hocko <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Joel Fernandes (Google) <[email protected]>
Acked-by: Paul E. McKenney <[email protected]>
Acked-by: David Rientjes <[email protected]>
Cc: Neil Brown <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
There is a typo in comment, fix it.
"exeeds" -> "exceeds"
Signed-off-by: Qiujun Huang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
It is unlikely that an inaccessible VMA without required permission flags
will get a page fault. Hence lets just append unlikely() directive to
such checks in order to improve performance while also standardizing it
across various platforms.
Signed-off-by: Anshuman Khandual <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Guo Ren <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Mike Rapoport <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This replaces all remaining open encodings with vma_is_anonymous().
Signed-off-by: Anshuman Khandual <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Vlastimil Babka <[email protected]
Cc: Alexander Viro <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Guo Ren <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nick Piggin <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This replaces all remaining open encodings with is_vm_hugetlb_page().
Signed-off-by: Anshuman Khandual <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: Nick Piggin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Guo Ren <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Lets move vma_is_accessible() helper to include/linux/mm.h which makes it
available for general use. While here, this replaces all remaining open
encodings for VMA access check with vma_is_accessible().
Signed-off-by: Anshuman Khandual <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Geert Uytterhoeven <[email protected]>
Acked-by: Guo Ren <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: Guo Ren <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Nick Piggin <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Will Deacon <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "mm/vma: Use all available wrappers when possible", v2.
Apart from adding a VMA flag readable name for trace purpose, this series
does some open encoding replacements with availabe VMA specific wrappers.
This skips VM_HUGETLB check in vma_migratable() as its already being done
with another patch (https://patchwork.kernel.org/patch/11347831/) which is
yet to be merged.
This patch (of 4):
This just adds the missing readable name for VM_SYNC.
Signed-off-by: Anshuman Khandual <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Guo Ren <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nick Piggin <[email protected]>
Cc: Paul Burton <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Set ->vm_next and ->vm_prev to NULL to prevent potential misuse from the
new duplicated vma.
Currently, only in fork path there are misuse for handling anon_vma. No
other bugs been revealed with this patch applied.
Signed-off-by: Li Xinhai <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Rik van Riel <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This reverts commit 4e4a9eb921332b9d1 ("mm/rmap.c: reuse mergeable
anon_vma as parent when fork").
In dup_mmap(), anon_vma_fork() is called for attaching anon_vma and
parameter 'tmp' (i.e., the new vma of child) has same ->vm_next and
->vm_prev as its parent vma. That causes the anon_vma used by parent been
mistakenly shared by child (In anon_vma_clone(), the code added by that
commit will do this reuse work).
Besides this issue, the design of reusing anon_vma from vma which has gone
through fork should be avoided ([1]). So, this patch reverts that commit
and maintains the consistent logic of reusing anon_vma for
fork/split/merge vma.
Reusing anon_vma within the process is fine. But if a vma has gone
through fork(), then that vma's anon_vma should not be shared with its
neighbor vma. As explained in [1], when vma gone through fork(), the
check for list_is_singular(vma->anon_vma_chain) will be false, and
don't share anon_vma.
With current issue, one example can clarify more. Parent process do
below two steps:
1. p_vma_1 is created and p_anon_vma_1 is prepared;
2. p_vma_2 is created and share p_anon_vma_1; (this is allowed,
becaues p_vma_1 didn't gothrough fork()); parent process do fork():
3. c_vma_1 is dup from p_vma_1, and has its own c_anon_vma_1
prepared; at this point, c_vma_1->anon_vma_chain has two items, one
for p_anon_vma_1 and one for c_anon_vma_1;
4. c_vma_2 is dup from p_vma_2, it is not allowed to share
c_anon_vma_1, because
c_vma_1->anon_vma_chain has two items.
[1] commit d0e9fe1758f2 ("Simplify and comment on anon_vma re-use for
anon_vma_prepare()") explains the test of "list_is_singular()".
Fixes: 4e4a9eb92133 ("mm/rmap.c: reuse mergeable anon_vma as parent when fork")
Signed-off-by: Li Xinhai <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Rik van Riel <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "mm: Fix misuse of parent anon_vma in dup_mmap path".
This patchset fixes the misuse of parenet anon_vma, which mainly caused by
child vma's vm_next and vm_prev are left same as its parent after
duplicate vma. Finally, code reached parent vma's neighbor by referring
pointer of child vma and executed wrong logic.
The first two patches fix relevant issues, and the third patch sets
vm_next and vm_prev to NULL when duplicate vma to prevent potential misuse
in future.
Effects of the first bug is that causes rmap code to check both parent and
child's page table, although a page couldn't be mapped by both parent and
child, because child vma has WIPEONFORK so all pages mapped by child are
'new' and not relevant to parent.
Effects of the second bug is that the relationship of anon_vma of parent
and child are totallyconvoluted. It would cause 'son', 'grandson', ...,
etc, to share 'parent' anon_vma, which disobey the design rule of reusing
anon_vma (the rule to be followed is that reusing should among vma of same
process, and vma should not gone through fork).
So, both issues should cause unnecessary rmap walking and have unexpected
complexity.
These two issues would not be directly visible, I used debugging code to
check the anon_vma pointers of parent and child when inspecting the
suspicious implementation of issue #2, then find the problem.
This patch (of 3):
In dup_mmap(), anon_vma_prepare() is called for vma has VM_WIPEONFORK, and
parameter 'tmp' (i.e., the new vma of child) has same ->vm_next and
->vm_prev as its parent vma. That allows anon_vma used by parent been
mistakenly shared by child (find_mergeable_anon_vma() will do this reuse
work).
Besides this issue, call anon_vma_prepare() should be avoided because we
don't copy page for this vma. Preparing anon_vma will be handled during
fault.
Fixes: d2cd9ede6e19 ("mm,fork: introduce MADV_WIPEONFORK")
Signed-off-by: Li Xinhai <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Johannes Weiner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The root of the hierarchy cannot have high set, so we will never reclaim
based on it. This makes that clearer and avoids another entry.
Signed-off-by: Chris Down <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Cc: Tejun Heo <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Michal Hocko <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Signed-off-by: Sergey Vidishev <[email protected]>
Signed-off-by: Daniel Lezcano <[email protected]>
Link: https://lore.kernel.org/r/2374188.AZIXMmL6Zy@sergeyv-box
|
|
When CONFIG_DEVFREQ_THERMAL is disabled all functions except
of_devfreq_cooling_register_power() were already inlined. Also inline
the last function to avoid compile errors when multiple drivers call
of_devfreq_cooling_register_power() when CONFIG_DEVFREQ_THERMAL is not
set. Compilation failed with the following message:
multiple definition of `of_devfreq_cooling_register_power'
(which then lists all usages of of_devfreq_cooling_register_power())
Thomas Zimmermann reported this problem [0] on a kernel config with
CONFIG_DRM_LIMA={m,y}, CONFIG_DRM_PANFROST={m,y} and
CONFIG_DEVFREQ_THERMAL=n after both, the lima and panfrost drivers
gained devfreq cooling support.
[0] https://www.spinics.net/lists/dri-devel/msg252825.html
Fixes: a76caf55e5b356 ("thermal: Add devfreq cooling")
Cc: [email protected]
Reported-by: Thomas Zimmermann <[email protected]>
Signed-off-by: Martin Blumenstingl <[email protected]>
Tested-by: Thomas Zimmermann <[email protected]>
Signed-off-by: Daniel Lezcano <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Signed-off-by: Ben Skeggs <[email protected]>
|
|
Signed-off-by: Ben Skeggs <[email protected]>
|
|
Signed-off-by: Ben Skeggs <[email protected]>
|
|
Certain boards with GP107/GP108 chipsets hang (often, but randomly) for
unknown reasons during GR initialisation.
The first tell-tale symptom of this issue is:
nouveau 0000:01:00.0: bus: MMIO read of 00000000 FAULT at 409800 [ TIMEOUT ]
appearing in dmesg, likely followed by many other failures being logged.
Karol found this WAR for the issue a while back, but efforts to isolate
the root cause and proper fix have not yielded success so far. I've
modified the original patch to include a few more details, limit it to
GP107/GP108 by default, and added a config option to override this choice.
Signed-off-by: Ben Skeggs <[email protected]>
Reviewed-by: Karol Herbst <[email protected]>
|
|
certain intel bridges
Fixes the infamous 'runtime PM' bug many users are facing on Laptops with
Nvidia Pascal GPUs by skipping said PCI power state changes on the GPU.
Depending on the used kernel there might be messages like those in demsg:
"nouveau 0000:01:00.0: Refused to change power state, currently in D3"
"nouveau 0000:01:00.0: can't change power state from D3cold to D0 (config
space inaccessible)"
followed by backtraces of kernel crashes or timeouts within nouveau.
It's still unkown why this issue exists, but this is a reliable workaround
and solves a very annoying issue for user having to choose between a
crashing kernel or higher power consumption of their Laptops.
Signed-off-by: Karol Herbst <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Lyude Paul <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Mika Westerberg <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=205623
Signed-off-by: Ben Skeggs <[email protected]>
|
|
When nouveau processes GPU faults, it checks to see if the fault address
falls within the "unmanaged" range which is reserved for fixed allocations
instead of addresses chosen by the core mm code. If start is greater than
or equal to svmm->unmanaged.limit, then limit will also be greater than
svmm->unmanaged.limit which is greater than svmm->unmanaged.start and the
start = max_t(u64, start, svmm->unmanaged.limit) will change nothing.
Just remove the useless lines of code.
Signed-off-by: Ralph Campbell <[email protected]>
Signed-off-by: Ben Skeggs <[email protected]>
|
|
When migrating system memory to GPU memory, check that SVM has been
enabled. Even though most errors can be ignored since migration is
a performance optimization, return an error because this is a violation
of the API.
Signed-off-by: Ralph Campbell <[email protected]>
Signed-off-by: Ben Skeggs <[email protected]>
|
|
find_vma_intersection(mm, start, end) only guarantees that end is greater
than or equal to vma->vm_start but doesn't guarantee that start is
greater than or equal to vma->vm_start. The calculation for the
intersecting range in nouveau_svmm_bind() isn't accounting for this and
can call migrate_vma_setup() with a starting address less than
vma->vm_start. This results in migrate_vma_setup() returning -EINVAL for
the range instead of nouveau skipping that part of the range and migrating
the rest.
Signed-off-by: Ralph Campbell <[email protected]>
Signed-off-by: Ben Skeggs <[email protected]>
|
|
As there is no need to check for the return value of debugfs_create_file
and drm_debugfs_create_files, remove unnecessary checks and error
handling in nouveau_drm_debugfs_init.
Signed-off-by: Wambui Karuga <[email protected]>
Signed-off-by: Ben Skeggs <[email protected]>
|
|
Signed-off-by: Ben Skeggs <[email protected]>
|
|
Prepare input updates for 5.7 merge window.
|
|
The warning message when a led is renamed due to name collition can fail
to show proper original name if init_data is used. Eg:
[ 9.073996] leds-gpio a0040000.leds_0: Led (null) renamed to red_led_1 due to name collision
Fixes: bb4e9af0348d ("leds: core: Add support for composing LED class device names")
Signed-off-by: Ricardo Ribalda Delgado <[email protected]>
Acked-by: Jacek Anaszewski <[email protected]>
Signed-off-by: Pavel Machek <[email protected]>
|
|
Group LED functions according to functionality, and add some
explaining comments.
Signed-off-by: Pavel Machek <[email protected]>
|
|
Sort Makefile entries to reduce risk of rejects.
Signed-off-by: Pavel Machek <[email protected]>
|
|
Warn about old defines that probably should not be used.
Signed-off-by: Pavel Machek <[email protected]>
|
|
Make label "white:power" to be consistent with dt-bindings/leds/common.h .
Signed-off-by: Pavel Machek <[email protected]>
|