aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2019-01-09doc: trace: fix reference to cpuidle documentation fileOtto Sabart1-1/+1
Old cpuidle/sysfs.txt file was replaced in aa5eee355b46. So, refer to an updated file. Fixes: aa5eee355b46 (Documentation: admin-guide: PM: Add cpuidle document) Signed-off-by: Otto Sabart <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2019-01-09drm/bridge: tc358767: use DP connector if no panel setTomi Valkeinen1-1/+2
tc358767 driver sets the connector type always to eDP. This patch sets the type to DP if there is no panel defined, which implies that there's a DP connector on the board. Signed-off-by: Tomi Valkeinen <[email protected]> Reviewed-by: Andrzej Hajda <[email protected]> Signed-off-by: Andrzej Hajda <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-01-09drm/bridge: tc358767: fix output H/V syncsTomi Valkeinen1-1/+5
The H and V syncs of the DP output are always set to active high. This patch fixes the syncs by configuring them according to the videomode. Signed-off-by: Tomi Valkeinen <[email protected]> Reviewed-by: Andrzej Hajda <[email protected]> Signed-off-by: Andrzej Hajda <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-01-09drm/bridge: tc358767: reject modes which require too much BWTomi Valkeinen1-0/+10
The current driver accepts any videomode with pclk < 154MHz. This is not correct, as with 1 lane and/or 1.62Mbps speed not all videomodes can be supported. Add code to reject modes that require more bandwidth that is available. Signed-off-by: Tomi Valkeinen <[email protected]> Reviewed-by: Andrzej Hajda <[email protected]> Signed-off-by: Andrzej Hajda <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-01-09drm/bridge: tc358767: fix initial DP0/1_SRCCTRL valueTomi Valkeinen1-6/+5
Initially DP0_SRCCTRL is set to a static value which includes DP0_SRCCTRL_LANES_2 and DP0_SRCCTRL_BW27, even when only 1 lane of 1.62Gbps speed is used. DP1_SRCCTRL is configured to a magic number. This patch changes the configuration as follows: Configure DP0_SRCCTRL by using tc_srcctrl() which provides the correct value. DP1_SRCCTRL needs two bits to be set to the same value as DP0_SRCCTRL: SSCG and BW27. All other bits can be zero. Signed-off-by: Tomi Valkeinen <[email protected]> Reviewed-by: Andrzej Hajda <[email protected]> Signed-off-by: Andrzej Hajda <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-01-09drm/bridge: tc358767: fix single lane configurationTomi Valkeinen1-2/+8
PHY_2LANE bit is always set in DP_PHY_CTRL, breaking 1 lane use. Set PHY_2LANE only when 2 lanes are used. Signed-off-by: Tomi Valkeinen <[email protected]> Reviewed-by: Andrzej Hajda <[email protected]> Signed-off-by: Andrzej Hajda <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-01-09drm/bridge: tc358767: add defines for DP1_SRCCTRL & PHY_2LANETomi Valkeinen1-3/+7
DP1_SRCCTRL register and PHY_2LANE field did not have matching defines. Add these. Signed-off-by: Tomi Valkeinen <[email protected]> Reviewed-by: Andrzej Hajda <[email protected]> Signed-off-by: Andrzej Hajda <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-01-09drm/bridge: tc358767: add bus flagsTomi Valkeinen1-0/+4
tc358767 driver does not set DRM bus_flags, even if it does configures the polarity settings into its registers. This means that the DPI source can't configure the polarities correctly. Add sync flags accordingly. Signed-off-by: Tomi Valkeinen <[email protected]> Reviewed-by: Andrzej Hajda <[email protected]> Signed-off-by: Andrzej Hajda <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-01-09x86, modpost: Replace last remnants of RETPOLINE with CONFIG_RETPOLINEWANG Chao4-4/+4
Commit 4cd24de3a098 ("x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support") replaced the RETPOLINE define with CONFIG_RETPOLINE checks. Remove the remaining pieces. [ bp: Massage commit message. ] Fixes: 4cd24de3a098 ("x86/retpoline: Make CONFIG_RETPOLINE depend on compiler support") Signed-off-by: WANG Chao <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Zhenzhong Duan <[email protected]> Reviewed-by: Masahiro Yamada <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: David Woodhouse <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Jessica Yu <[email protected]> Cc: Jiri Kosina <[email protected]> Cc: Kees Cook <[email protected]> Cc: Konrad Rzeszutek Wilk <[email protected]> Cc: Luc Van Oostenryck <[email protected]> Cc: Michal Marek <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tim Chen <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: [email protected] Cc: [email protected] Cc: stable <[email protected]> Cc: x86-ml <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-01-09x86/cache: Rename config option to CONFIG_X86_RESCTRLBorislav Petkov6-8/+8
CONFIG_RESCTRL is too generic. The final goal is to have a generic option called like this which is selected by the arch-specific ones CONFIG_X86_RESCTRL and CONFIG_ARM64_RESCTRL. The generic one will cover the resctrl filesystem and other generic and shared bits of functionality. Signed-off-by: Borislav Petkov <[email protected]> Suggested-by: Ingo Molnar <[email protected]> Requested-by: Linus Torvalds <[email protected]> Cc: Babu Moger <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: James Morse <[email protected]> Cc: Reinette Chatre <[email protected]> Cc: Tony Luck <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected]
2019-01-09ALSA: hda/realtek - Disable headset Mic VREF for headset mode of ALC225Kailang Yang1-1/+15
Disable Headset Mic VREF for headset mode of ALC225. This will be controlled by coef bits of headset mode functions. [ Fixed a compile warning and code simplification -- tiwai ] Signed-off-by: Kailang Yang <[email protected]> Cc: <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2019-01-09ALSA: hda/realtek - Add unplug function into unplug state of Headset Mode ↵Kailang Yang1-0/+1
for ALC225 Forgot to add unplug function to unplug state of headset mode for ALC225. Signed-off-by: Kailang Yang <[email protected]> Cc: <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2019-01-09Merge tag 'perf-core-for-mingo-5.0-20190108' of ↵Ingo Molnar21-250/+238
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/core fixes and improvements from Arnaldo Carvalho de Melo: perf top: Arnaldo Carvalho de Melo: - Lift restriction on using callchains without "sym" in --sort perf trace: Arnaldo Carvalho de Melo: - Fix ')' placement in "interrupted" syscall lines. - Fix alignment for [continued] lines. perf tests: Florian Fainelli: - Add a test for the ARM 32-bit [vectors] page. tools lib traceevent: Tzvetomir Stoyanov: - Introduce new libtracevent API: tep_override_comm(). - Initialize host_bigendian at tep_handle allocation. - More namespacing changes. - Remove superfluous APIs. tools headers uapi: Arnaldo Carvalho de Melo: . Update linux/{fs,vhost}.h, grab a copy o linux/mount.h, where the MS_ mount flags were moved. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2019-01-09drm/i915/gvt: Fix workload request allocation before request addZhenyu Wang2-22/+43
In commit 6bb2a2af8b1b ("drm/i915/gvt: Fix crash after request->hw_context change"), forgot to handle workload scan path in ELSP handler case which was to optimize scanning earlier instead of in gvt submission thread, so request alloc and add was splitting then which is against right process. This trys to do a partial revert of that commit which still has workload request alloc helper and make sure shadow state population is handled after request alloc for target state buffer. v3: Fix missed workload status setting in request alloc error path v2: Fix dispatch workload err path that should add request after alloc anyway. Fixes: 6bb2a2af8b1b ("drm/i915/gvt: Fix crash after request->hw_context change") Cc: Bin Yang <[email protected]> Cc: Chris Wilson <[email protected]> Tested-by: Bin Yang <[email protected]> Reviewed-by: Xiaolin Zhang <[email protected]> Signed-off-by: Zhenyu Wang <[email protected]>
2019-01-08block: clarify documentation for blk_{start|finish}_plugJeff Moyer1-0/+19
There was some confusion about what these functions did. Make it clear that this is a hint for upper layers to pass to the block layer, and that it does not guarantee that I/O will not be submitted between a start and finish plug. Reported-by: "Darrick J. Wong" <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Ming Lei <[email protected]> Signed-off-by: Jeff Moyer <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-01-08Merge branch 'akpm' (patches from Andrew)Linus Torvalds21-212/+289
Merge misc fixes from Andrew Morton: "14 fixes" * emailed patches from Andrew Morton <[email protected]>: mm, page_alloc: do not wake kswapd with zone lock held hugetlbfs: revert "use i_mmap_rwsem for more pmd sharing synchronization" hugetlbfs: revert "Use i_mmap_rwsem to fix page fault/truncate race" mm: page_mapped: don't assume compound page is huge or THP mm/memory.c: initialise mmu_notifier_range correctly tools/vm/page_owner: use page_owner_sort in the use example kasan: fix krealloc handling for tag-based mode kasan: make tag based mode work with CONFIG_HARDENED_USERCOPY kasan, arm64: use ARCH_SLAB_MINALIGN instead of manual aligning mm, memcg: fix reclaim deadlock with writeback mm/usercopy.c: no check page span for stack objects slab: alien caches must not be initialized if the allocation of the alien cache failed fork, memcg: fix cached_stacks case zram: idle writeback fixes and cleanup
2019-01-08arch/openrisc: Fix issues with access_ok()Stafford Horne1-2/+6
The commit 594cc251fdd0 ("make 'user_access_begin()' do 'access_ok()'") exposed incorrect implementations of access_ok() macro in several architectures. This change fixes 2 issues found in OpenRISC. OpenRISC was not properly using parenthesis for arguments and also using arguments twice. This patch fixes those 2 issues. I test booted this patch with v5.0-rc1 on qemu and it's working fine. Cc: Guenter Roeck <[email protected]> Cc: Linus Torvalds <[email protected]> Reported-by: Linus Torvalds <[email protected]> Signed-off-by: Stafford Horne <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-01-08mm, page_alloc: do not wake kswapd with zone lock heldMel Gorman2-1/+13
syzbot reported the following regression in the latest merge window and it was confirmed by Qian Cai that a similar bug was visible from a different context. ====================================================== WARNING: possible circular locking dependency detected 4.20.0+ #297 Not tainted ------------------------------------------------------ syz-executor0/8529 is trying to acquire lock: 000000005e7fb829 (&pgdat->kswapd_wait){....}, at: __wake_up_common_lock+0x19e/0x330 kernel/sched/wait.c:120 but task is already holding lock: 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: spin_lock include/linux/spinlock.h:329 [inline] 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue_bulk mm/page_alloc.c:2548 [inline] 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: __rmqueue_pcplist mm/page_alloc.c:3021 [inline] 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue_pcplist mm/page_alloc.c:3050 [inline] 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: rmqueue mm/page_alloc.c:3072 [inline] 000000009bb7bae0 (&(&zone->lock)->rlock){-.-.}, at: get_page_from_freelist+0x1bae/0x52a0 mm/page_alloc.c:3491 It appears to be a false positive in that the only way the lock ordering should be inverted is if kswapd is waking itself and the wakeup allocates debugging objects which should already be allocated if it's kswapd doing the waking. Nevertheless, the possibility exists and so it's best to avoid the problem. This patch flags a zone as needing a kswapd using the, surprisingly, unused zone flag field. The flag is read without the lock held to do the wakeup. It's possible that the flag setting context is not the same as the flag clearing context or for small races to occur. However, each race possibility is harmless and there is no visible degredation in fragmentation treatment. While zone->flag could have continued to be unused, there is potential for moving some existing fields into the flags field instead. Particularly read-mostly ones like zone->initialized and zone->contiguous. Link: http://lkml.kernel.org/r/[email protected] Fixes: 1c30844d2dfe ("mm: reclaim small amounts of memory when an external fragmentation event occurs") Reported-by: [email protected] Signed-off-by: Mel Gorman <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Tested-by: Qian Cai <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-01-08hugetlbfs: revert "use i_mmap_rwsem for more pmd sharing synchronization"Mike Kravetz5-88/+20
This reverts b43a9990055958e70347c56f90ea2ae32c67334c The reverted commit caused issues with migration and poisoning of anon huge pages. The LTP move_pages12 test will cause an "unable to handle kernel NULL pointer" BUG would occur with stack similar to: RIP: 0010:down_write+0x1b/0x40 Call Trace: migrate_pages+0x81f/0xb90 __ia32_compat_sys_migrate_pages+0x190/0x190 do_move_pages_to_node.isra.53.part.54+0x2a/0x50 kernel_move_pages+0x566/0x7b0 __x64_sys_move_pages+0x24/0x30 do_syscall_64+0x5b/0x180 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The purpose of the reverted patch was to fix some long existing races with huge pmd sharing. It used i_mmap_rwsem for this purpose with the idea that this could also be used to address truncate/page fault races with another patch. Further analysis has determined that i_mmap_rwsem can not be used to address all these hugetlbfs synchronization issues. Therefore, revert this patch while working an another approach to the underlying issues. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Kravetz <[email protected]> Reported-by: Jan Stancek <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: "Aneesh Kumar K . V" <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: "Kirill A . Shutemov" <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Prakash Sangappa <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-01-08hugetlbfs: revert "Use i_mmap_rwsem to fix page fault/truncate race"Mike Kravetz2-38/+44
This reverts c86aa7bbfd5568ba8a82d3635d8f7b8a8e06fe54 The reverted commit caused ABBA deadlocks when file migration raced with file eviction for specific hugetlbfs files. This was discovered with a modified version of the LTP move_pages12 test. The purpose of the reverted patch was to close a long existing race between hugetlbfs file truncation and page faults. After more analysis of the patch and impacted code, it was determined that i_mmap_rwsem can not be used for all required synchronization. Therefore, revert this patch while working an another approach to the underlying issue. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Kravetz <[email protected]> Reported-by: Jan Stancek <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: "Aneesh Kumar K . V" <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: "Kirill A . Shutemov" <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Prakash Sangappa <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-01-08mm: page_mapped: don't assume compound page is huge or THPJan Stancek1-1/+1
LTP proc01 testcase has been observed to rarely trigger crashes on arm64: page_mapped+0x78/0xb4 stable_page_flags+0x27c/0x338 kpageflags_read+0xfc/0x164 proc_reg_read+0x7c/0xb8 __vfs_read+0x58/0x178 vfs_read+0x90/0x14c SyS_read+0x60/0xc0 The issue is that page_mapped() assumes that if compound page is not huge, then it must be THP. But if this is 'normal' compound page (COMPOUND_PAGE_DTOR), then following loop can keep running (for HPAGE_PMD_NR iterations) until it tries to read from memory that isn't mapped and triggers a panic: for (i = 0; i < hpage_nr_pages(page); i++) { if (atomic_read(&page[i]._mapcount) >= 0) return true; } I could replicate this on x86 (v4.20-rc4-98-g60b548237fed) only with a custom kernel module [1] which: - allocates compound page (PAGEC) of order 1 - allocates 2 normal pages (COPY), which are initialized to 0xff (to satisfy _mapcount >= 0) - 2 PAGEC page structs are copied to address of first COPY page - second page of COPY is marked as not present - call to page_mapped(COPY) now triggers fault on access to 2nd COPY page at offset 0x30 (_mapcount) [1] https://github.com/jstancek/reproducers/blob/master/kernel/page_mapped_crash/repro.c Fix the loop to iterate for "1 << compound_order" pages. Kirrill said "IIRC, sound subsystem can producuce custom mapped compound pages". Link: http://lkml.kernel.org/r/c440d69879e34209feba21e12d236d06bc0a25db.1543577156.git.jstancek@redhat.com Fixes: e1534ae95004 ("mm: differentiate page_mapped() from page_mapcount() for compound pages") Signed-off-by: Jan Stancek <[email protected]> Debugged-by: Laszlo Ersek <[email protected]> Suggested-by: "Kirill A. Shutemov" <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Andrea Arcangeli <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-01-08mm/memory.c: initialise mmu_notifier_range correctlyMatthew Wilcox1-2/+2
One of the paths in follow_pte_pmd() initialised the mmu_notifier_range incorrectly. Link: http://lkml.kernel.org/r/[email protected] Fixes: ac46d4f3c432 ("mm/mmu_notifier: use structure for invalidate_range_start/end calls v2") Signed-off-by: Matthew Wilcox <[email protected]> Tested-by: Dave Chinner <[email protected]> Reviewed-by: Jérôme Glisse <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jan Kara <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-01-08tools/vm/page_owner: use page_owner_sort in the use exampleMiles Chen1-1/+3
The example in comment does not useable because the output binary is named "page_owner_sort", not "sort". Also add a reference to Documentation/vm/page_owner.rst Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Miles Chen <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Joonsoo Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-01-08kasan: fix krealloc handling for tag-based modeAndrey Konovalov1-20/+43
Right now tag-based KASAN can retag the memory that is reallocated via krealloc and return a differently tagged pointer even if the same slab object gets used and no reallocated technically happens. There are a few issues with this approach. One is that krealloc callers can't rely on comparing the return value with the passed argument to check whether reallocation happened. Another is that if a caller knows that no reallocation happened, that it can access object memory through the old pointer, which leads to false positives. Look at nf_ct_ext_add() to see an example. Fix this by keeping the same tag if the memory don't actually gets reallocated during krealloc. Link: http://lkml.kernel.org/r/bb2a71d17ed072bcc528cbee46fcbd71a6da3be4.1546540962.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Vincenzo Frascino <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-01-08kasan: make tag based mode work with CONFIG_HARDENED_USERCOPYAndrey Konovalov1-0/+2
With CONFIG_HARDENED_USERCOPY enabled __check_heap_object() compares and then subtracts a potentially tagged pointer with a non-tagged address of the page that this pointer belongs to, which leads to unexpected behavior. Untag the pointer in __check_heap_object() before doing any of these operations. Link: http://lkml.kernel.org/r/7e756a298d514c4482f52aea6151db34818d395d.1546540962.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Vincenzo Frascino <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-01-08kasan, arm64: use ARCH_SLAB_MINALIGN instead of manual aligningAndrey Konovalov2-2/+6
Instead of changing cache->align to be aligned to KASAN_SHADOW_SCALE_SIZE in kasan_cache_create() we can reuse the ARCH_SLAB_MINALIGN macro. Link: http://lkml.kernel.org/r/52ddd881916bcc153a9924c154daacde78522227.1546540962.git.andreyknvl@google.com Signed-off-by: Andrey Konovalov <[email protected]> Suggested-by: Vincenzo Frascino <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Vincenzo Frascino <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-01-08mm, memcg: fix reclaim deadlock with writebackMichal Hocko1-0/+22
Liu Bo has experienced a deadlock between memcg (legacy) reclaim and the ext4 writeback task1: wait_on_page_bit+0x82/0xa0 shrink_page_list+0x907/0x960 shrink_inactive_list+0x2c7/0x680 shrink_node_memcg+0x404/0x830 shrink_node+0xd8/0x300 do_try_to_free_pages+0x10d/0x330 try_to_free_mem_cgroup_pages+0xd5/0x1b0 try_charge+0x14d/0x720 memcg_kmem_charge_memcg+0x3c/0xa0 memcg_kmem_charge+0x7e/0xd0 __alloc_pages_nodemask+0x178/0x260 alloc_pages_current+0x95/0x140 pte_alloc_one+0x17/0x40 __pte_alloc+0x1e/0x110 alloc_set_pte+0x5fe/0xc20 do_fault+0x103/0x970 handle_mm_fault+0x61e/0xd10 __do_page_fault+0x252/0x4d0 do_page_fault+0x30/0x80 page_fault+0x28/0x30 task2: __lock_page+0x86/0xa0 mpage_prepare_extent_to_map+0x2e7/0x310 [ext4] ext4_writepages+0x479/0xd60 do_writepages+0x1e/0x30 __writeback_single_inode+0x45/0x320 writeback_sb_inodes+0x272/0x600 __writeback_inodes_wb+0x92/0xc0 wb_writeback+0x268/0x300 wb_workfn+0xb4/0x390 process_one_work+0x189/0x420 worker_thread+0x4e/0x4b0 kthread+0xe6/0x100 ret_from_fork+0x41/0x50 He adds "task1 is waiting for the PageWriteback bit of the page that task2 has collected in mpd->io_submit->io_bio, and tasks2 is waiting for the LOCKED bit the page which tasks1 has locked" More precisely task1 is handling a page fault and it has a page locked while it charges a new page table to a memcg. That in turn hits a memory limit reclaim and the memcg reclaim for legacy controller is waiting on the writeback but that is never going to finish because the writeback itself is waiting for the page locked in the #PF path. So this is essentially ABBA deadlock: lock_page(A) SetPageWriteback(A) unlock_page(A) lock_page(B) lock_page(B) pte_alloc_pne shrink_page_list wait_on_page_writeback(A) SetPageWriteback(B) unlock_page(B) # flush A, B to clear the writeback This accumulating of more pages to flush is used by several filesystems to generate a more optimal IO patterns. Waiting for the writeback in legacy memcg controller is a workaround for pre-mature OOM killer invocations because there is no dirty IO throttling available for the controller. There is no easy way around that unfortunately. Therefore fix this specific issue by pre-allocating the page table outside of the page lock. We have that handy infrastructure for that already so simply reuse the fault-around pattern which already does this. There are probably other hidden __GFP_ACCOUNT | GFP_KERNEL allocations from under a fs page locked but they should be really rare. I am not aware of a better solution unfortunately. [[email protected]: fix mm/memory.c:__do_fault()] [[email protected]: coding-style fixes] [[email protected]: enhance comment, per Johannes] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Fixes: c3b94f44fcb0 ("memcg: further prevent OOM with too many dirty pages") Signed-off-by: Michal Hocko <[email protected]> Reported-by: Liu Bo <[email protected]> Debugged-by: Liu Bo <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Acked-by: Johannes Weiner <[email protected]> Reviewed-by: Liu Bo <[email protected]> Cc: Jan Kara <[email protected]> Cc: Dave Chinner <[email protected]> Cc: Theodore Ts'o <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-01-08mm/usercopy.c: no check page span for stack objectsQian Cai1-4/+5
It is easy to trigger this with CONFIG_HARDENED_USERCOPY_PAGESPAN=y, usercopy: Kernel memory overwrite attempt detected to spans multiple pages (offset 0, size 23)! kernel BUG at mm/usercopy.c:102! For example, print_worker_info char name[WQ_NAME_LEN] = { }; char desc[WORKER_DESC_LEN] = { }; probe_kernel_read(name, wq->name, sizeof(name) - 1); probe_kernel_read(desc, worker->desc, sizeof(desc) - 1); __copy_from_user_inatomic check_object_size check_heap_object check_page_span This is because on-stack variables could cross PAGE_SIZE boundary, and failed this check, if (likely(((unsigned long)ptr & (unsigned long)PAGE_MASK) == ((unsigned long)end & (unsigned long)PAGE_MASK))) ptr = FFFF889007D7EFF8 end = FFFF889007D7F00E Hence, fix it by checking if it is a stack object first. [[email protected]: improve comments after reorder] Link: http://lkml.kernel.org/r/20190103165151.GA32845@beast Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Qian Cai <[email protected]> Signed-off-by: Kees Cook <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-01-08slab: alien caches must not be initialized if the allocation of the alien ↵Christoph Lameter1-2/+4
cache failed Callers of __alloc_alien() check for NULL. We must do the same check in __alloc_alien_cache to avoid NULL pointer dereferences on allocation failures. Link: http://lkml.kernel.org/r/010001680f42f192-82b4e12e-1565-4ee0-ae1f-1e98974906aa-000000@email.amazonses.com Fixes: 49dfc304ba241 ("slab: use the lock on alien_cache, instead of the lock on array_cache") Fixes: c8522a3a5832b ("Slab: introduce alloc_alien") Signed-off-by: Christoph Lameter <[email protected]> Reported-by: [email protected] Reviewed-by: Andrew Morton <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-01-08fork, memcg: fix cached_stacks caseShakeel Butt1-0/+1
Commit 5eed6f1dff87 ("fork,memcg: fix crash in free_thread_stack on memcg charge fail") fixes a crash caused due to failed memcg charge of the kernel stack. However the fix misses the cached_stacks case which this patch fixes. So, the same crash can happen if the memcg charge of a cached stack is failed. Link: http://lkml.kernel.org/r/[email protected] Fixes: 5eed6f1dff87 ("fork,memcg: fix crash in free_thread_stack on memcg charge fail") Signed-off-by: Shakeel Butt <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: Rik van Riel <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Tejun Heo <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-01-08zram: idle writeback fixes and cleanupMinchan Kim4-55/+125
This patch includes some fixes and cleanup for idle-page writeback. 1. writeback_limit interface Now writeback_limit interface is rather conusing. For example, once writeback limit budget is exausted, admin can see 0 from /sys/block/zramX/writeback_limit which is same semantic with disable writeback_limit at this moment. IOW, admin cannot tell that zero came from disable writeback limit or exausted writeback limit. To make the interface clear, let's sepatate enable of writeback limit to another knob - /sys/block/zram0/writeback_limit_enable * before: while true : # to re-enable writeback limit once previous one is used up echo 0 > /sys/block/zram0/writeback_limit echo $((200<<20)) > /sys/block/zram0/writeback_limit .. .. # used up the writeback limit budget * new # To enable writeback limit, from the beginning, admin should # enable it. echo $((200<<20)) > /sys/block/zram0/writeback_limit echo 1 > /sys/block/zram/0/writeback_limit_enable while true : echo $((200<<20)) > /sys/block/zram0/writeback_limit .. .. # used up the writeback limit budget It's much strightforward. 2. fix condition check idle/huge writeback mode check The mode in writeback_store is not bit opeartion any more so no need to use bit operations. Furthermore, current condition check is broken in that it does writeback every pages regardless of huge/idle. 3. clean up idle_store No need to use goto. [[email protected]: missed spin_lock_init] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Minchan Kim <[email protected]> Suggested-by: John Dias <[email protected]> Cc: Sergey Senozhatsky <[email protected]> Cc: John Dias <[email protected]> Cc: Srinivas Paladugu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-01-08arm64: defconfig: enable modules for amlogic s400 sound cardJerome Brunet1-0/+4
Compile the necessary drivers as modules, including codecs, for the s400 sound card. Signed-off-by: Jerome Brunet <[email protected]> Signed-off-by: Kevin Hilman <[email protected]>
2019-01-08drm/dp_mst: Add __must_check to drm_dp_mst_topology_mgr_resume()Lyude Paul1-1/+2
Since I've had to fix two cases of drivers not checking the return code from this function, let's make the compiler complain so this doesn't come up again in the future. Changes since v1: * Remove unneeded __must_check in function declaration - danvet Signed-off-by: Lyude Paul <[email protected]> Cc: Jerry Zuo <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Reviewed-by: Harry Wentland <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-01-08drm/amdgpu: Don't fail resume process if resuming atomic state failsLyude Paul1-3/+2
This is an ugly one unfortunately. Currently, all DRM drivers supporting atomic modesetting will save the state that userspace had set before suspending, then attempt to restore that state on resume. This probably worked very well at one point, like many other things, until DP MST came into the picture. While it's easy to restore state on normal display connectors that were disconnected during suspend regardless of their state post-resume, this can't really be done with MST because of the fact that setting up a downstream sink requires performing sideband transactions between the source and the MST hub, sending out the ACT packets, etc. Because of this, there isn't really a guarantee that we can restore the atomic state we had before suspend once we've resumed. This sucks pretty bad, but so far I haven't run into any compositors that this actually causes serious issues with. Most compositors will notice the hotplug we send afterwards, and then reprobe state. Since nouveau and i915 also don't fail the suspend/resume process due to failing to restore the atomic state, let's make amdgpu match this behavior. Better to resume the GPU properly, then to stop the process half way because of a potentially unavoidable atomic commit failure. Eventually, we'll have a real fix for this problem on the DRM level. But we've got some more important low-hanging fruit to deal with first. Signed-off-by: Lyude Paul <[email protected]> Reviewed-by: Harry Wentland <[email protected]> Cc: Jerry Zuo <[email protected]> Cc: <[email protected]> # v4.15+ Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-01-08drm/amdgpu: Don't ignore rc from drm_dp_mst_topology_mgr_resume()Lyude Paul1-9/+23
drm_dp_mst_topology_mgr_resume() returns whether or not it managed to find the topology in question after a suspend resume cycle, and the driver is supposed to check this value and disable MST accordingly if it's gone-in addition to sending a hotplug in order to notify userspace that something changed during suspend. Currently, amdgpu just makes the mistake of ignoring the return code from drm_dp_mst_topology_mgr_resume() which means that if a topology was removed in suspend, amdgpu never notices and assumes it's still connected which leads to all sorts of problems. So, fix this by actually checking the rc from drm_dp_mst_topology_mgr_resume(). Also, reformat the rest of the function while we're at it to fix the over-indenting. Signed-off-by: Lyude Paul <[email protected]> Reviewed-by: Harry Wentland <[email protected]> Cc: Jerry Zuo <[email protected]> Cc: <[email protected]> # v4.15+ Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-01-08ALSA: usb-audio: fix CM6206 register definitionsAmadeusz Sławiński1-1/+1
fix typo after a recent commit causing headphones to have no sound Fixes: ad43d528a7ac (ALSA: usb-audio: Define registers for CM6206) Signed-off-by: Amadeusz Sławiński <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2019-01-08drm/amdgpu: validate user GEM object sizeYu Zhao1-0/+8
When creating frame buffer, userspace may request to attach to a previously allocated GEM object that is smaller than what GPU requires. Validation must be done to prevent out-of-bound DMA, otherwise it could be exploited to reveal sensitive data. This fix is not done in a common code path because individual driver might have different requirement. Cc: [email protected] # v4.2+ Reviewed-by: Michel Dänzer <[email protected]> Signed-off-by: Yu Zhao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-01-08drm/amdgpu: validate user pitch alignmentYu Zhao1-0/+10
Userspace may request pitch alignment that is not supported by GPU. Some requests 32, but GPU ignores it and uses default 64 when cpp is 4. If GEM object is allocated based on the smaller alignment, GPU DMA will go out of bound. Cc: [email protected] # v4.2+ Reviewed-by: Michel Dänzer <[email protected]> Signed-off-by: Yu Zhao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-01-08drm/amd/powerplay: drop the unnecessary uclk hard min settingEvan Quan1-7/+0
Since soft min setting is enough. Hard min setting is redundant. Reported-by: Likun Gao <[email protected]> Signed-off-by: Evan Quan <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Likun Gao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-01-08drm/amd/powerplay: avoid possible buffer overflowEvan Quan1-0/+14
Make sure the clock level enforced is within the allowed ranges. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Likun Gao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-01-08drm/amd/powerplay: create pp_od_clk_voltage device file under OD supportEvan Quan1-8/+14
Since pp_od_clk_voltage device file is for OD related sysfs operations. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-01-08drm/amd/powerplay: update OD support flag for SKU with no OD capabilitiesEvan Quan1-0/+3
For those ASICs with no overdrive capabilities, the OD support flag will be reset. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-01-08fork: record start_time lateDavid Herrmann1-2/+11
This changes the fork(2) syscall to record the process start_time after initializing the basic task structure but still before making the new process visible to user-space. Technically, we could record the start_time anytime during fork(2). But this might lead to scenarios where a start_time is recorded long before a process becomes visible to user-space. For instance, with userfaultfd(2) and TLS, user-space can delay the execution of fork(2) for an indefinite amount of time (and will, if this causes network access, or similar). By recording the start_time late, it much closer reflects the point in time where the process becomes live and can be observed by other processes. Lastly, this makes it much harder for user-space to predict and control the start_time they get assigned. Previously, user-space could fork a process and stall it in copy_thread_tls() before its pid is allocated, but after its start_time is recorded. This can be misused to later-on cycle through PIDs and resume the stalled fork(2) yielding a process that has the same pid and start_time as a process that existed before. This can be used to circumvent security systems that identify processes by their pid+start_time combination. Even though user-space was always aware that start_time recording is flaky (but several projects are known to still rely on start_time-based identification), changing the start_time to be recorded late will help mitigate existing attacks and make it much harder for user-space to control the start_time a process gets assigned. Reported-by: Jann Horn <[email protected]> Signed-off-by: Tom Gundersen <[email protected]> Signed-off-by: David Herrmann <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-01-08tools include uapi: Sync linux/vhost.h with the kernel sourcesArnaldo Carvalho de Melo1-111/+2
To get the changes in: 4b86713236e4 ("vhost: split structs into a separate header file") Silencing this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/vhost.h' differs from latest version at 'include/uapi/linux/vhost.h' diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h Those didn't touch things used in tools, i.e. the following continues working: $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh static const char *vhost_virtio_ioctl_cmds[] = { [0x00] = "SET_FEATURES", [0x01] = "SET_OWNER", [0x02] = "RESET_OWNER", [0x03] = "SET_MEM_TABLE", [0x04] = "SET_LOG_BASE", [0x07] = "SET_LOG_FD", [0x10] = "SET_VRING_NUM", [0x11] = "SET_VRING_ADDR", [0x12] = "SET_VRING_BASE", [0x13] = "SET_VRING_ENDIAN", [0x14] = "GET_VRING_ENDIAN", [0x20] = "SET_VRING_KICK", [0x21] = "SET_VRING_CALL", [0x22] = "SET_VRING_ERR", [0x23] = "SET_VRING_BUSYLOOP_TIMEOUT", [0x24] = "GET_VRING_BUSYLOOP_TIMEOUT", [0x25] = "SET_BACKEND_FEATURES", [0x30] = "NET_SET_BACKEND", [0x40] = "SCSI_SET_ENDPOINT", [0x41] = "SCSI_CLEAR_ENDPOINT", [0x42] = "SCSI_GET_ABI_VERSION", [0x43] = "SCSI_SET_EVENTS_MISSED", [0x44] = "SCSI_GET_EVENTS_MISSED", [0x60] = "VSOCK_SET_GUEST_CID", [0x61] = "VSOCK_SET_RUNNING", }; static const char *vhost_virtio_ioctl_read_cmds[] = { [0x00] = "GET_FEATURES", [0x12] = "GET_VRING_BASE", [0x26] = "GET_BACKEND_FEATURES", }; $ At some point in the eBPFication of perf, using something like: # perf trace -e ioctl(cmd=VHOST_VRING*) Will setup a BPF filter right at the raw_syscalls:sys_enter tracepoint, i.e. filtering at the origin. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-08tools include uapi: Sync linux/fs.h copy with the kernel sourcesArnaldo Carvalho de Melo1-52/+8
To get the changes in: e262e32d6bde ("vfs: Suppress MS_* flag defs within the kernel unless explicitly enabled") That made the mount flags string table generator to switch to using mount.h instead. This silences the following perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/fs.h' differs from latest version at 'include/uapi/linux/fs.h' diff -u tools/include/uapi/linux/fs.h include/uapi/linux/fs.h Cc: Adrian Hunter <[email protected]> Cc: David Howells <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-08perf beauty: Switch from using uapi/linux/fs.h to uapi/linux/mount.hArnaldo Carvalho de Melo1-2/+2
As now we'll update our fs.h copy and what tools/perf/trace/beauty/mount_flags.sh needs just got moved to mount.h, use that instead. Cc: Adrian Hunter <[email protected]> Cc: David Howells <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-08tools include uapi: Grab a copy of linux/mount.hArnaldo Carvalho de Melo2-0/+59
We were using a copy of uapi/linux/fs.h to create the mount syscall 'flags' string table to use in 'perf trace', to convert from the number obtained via the raw_syscalls:sys_enter into a string, using tools/perf/trace/beauty/mount_flags.sh, but in e262e32d6bde ("vfs: Suppress MS_* flag defs within the kernel unless explicitly enabled") those defines got moved to linux/mount.h, so grab a copy of mount.h too. Keep the uapi/linux/fs.h as we'll use it for the SEEK_ constants. Cc: Adrian Hunter <[email protected]> Cc: David Howells <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-08vfio/type1: Fix unmap overflow off-by-oneAlex Williamson1-1/+1
The below referenced commit adds a test for integer overflow, but in doing so prevents the unmap ioctl from ever including the last page of the address space. Subtract one to compare to the last address of the unmap to avoid the overflow and wrap-around. Fixes: 71a7d3d78e3c ("vfio/type1: silence integer overflow warning") Link: https://bugzilla.redhat.com/show_bug.cgi?id=1662291 Cc: [email protected] # v4.15+ Reported-by: Pei Zhang <[email protected]> Debugged-by: Peter Xu <[email protected]> Reviewed-by: Dan Carpenter <[email protected]> Reviewed-by: Peter Xu <[email protected]> Tested-by: Peter Xu <[email protected]> Reviewed-by: Cornelia Huck <[email protected]> Signed-off-by: Alex Williamson <[email protected]>
2019-01-08perf top: Lift restriction on using callchains without "sym" in --sortArnaldo Carvalho de Melo1-6/+1
This restriction is not present in 'perf report' and since 'perf top' uses the same hists browser, remove it from it as well. With this we create per event buckets with callchain trees, so that # perf top --sort dso -g --no-children Bucketizes samples by DSO and below it shows the callchains leading to functions in this DSO. Try also: # perf top -e sched:*switch -g --no-children To see the callchains leading to sched switches, pressing 'E' to expand all one can quickly see the most common scheduler switches and what leads to them, for instance, calls to IO, futexes, etc. Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-08tools lib traceevent: Remove tep_data_event_from_type() APITzvetomir Stoyanov2-13/+0
In order to make libtraceevent into a proper library, its API should be straightforward. After discussion with Steven Rostedt, we decided to remove the tep_data_event_from_type() API and to replace it with tep_find_event(), as it does the same. Signed-off-by: Tzvetomir Stoyanov <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt (VMware) <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>