aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2022-08-30drm/amd/display: Fix black flash when switching from ODM2to1 to ODMBypassVladimir Stempen1-1/+2
[Why] On secondary display hotplug we switch primary stream from ODM2to1 to ODMBypass mode. Current logic will trigger disabling front end for this stream. [How] We need to check if prev_odm_pipe is equal to NULL in order to disable dangling planes in this scenario. Reviewed-by: Ariel Bernstein <[email protected]> Acked-by: Brian Chang <[email protected]> Signed-off-by: Vladimir Stempen <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-30drm/amd/display: Fix check for stream and planeEthan Wellenreiter1-1/+1
[WHY] Function wasn't returning false when it had a no stream [HOW] Made it return false when it had no stream. Reviewed-by: Alvin Lee <[email protected]> Acked-by: Brian Chang <[email protected]> Signed-off-by: Ethan Wellenreiter <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-30drm/amd/display: Re-initialize viewport after pipe mergeEthan Wellenreiter1-0/+9
[Why] Pipes get merged in preparation for SubVP but if they don't get used, and are in ODM or some other multi pipe config, it would calculate the voltage level with a viewport of just one pipe from when they were split resulting in too low of a voltage level. [How] Made it so that the viewport and other timing settings get rebuilt and re- initialized after the pipe merge, before calculating the voltage level so it would calculate it correctly. Reviewed-by: Alvin Lee <[email protected]> Reviewed-by: Jun Lei <[email protected]> Acked-by: Brian Chang <[email protected]> Signed-off-by: Ethan Wellenreiter <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-30drm/amd/display: Use correct plane for CAB cursor size allocationAurabindo Pillai1-11/+21
[Why&How] plane and stream variables used for cursor size allocation calculation were stale from previous iteration. Redo the iteration to find the correct cursor plane for the calculation. Reviewed-by: Alvin Lee <[email protected]> Acked-by: Brian Chang <[email protected]> Signed-off-by: Aurabindo Pillai <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-30drm/amdgpu: ensure no PCIe peer access for CPU XGMI iolinksAlex Sierra1-1/+2
[Why] Devices with CPU XGMI iolink do not support PCIe peer access. Signed-off-by: Alex Sierra <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-30drm/amd/pm: bump SMU 13.0.0 driver_if header versionEvan Quan2-2/+2
To suppress the warning about version mismatch with the latest 78.54.0 PMFW. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Kenneth Feng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-30drm/amd/pm: use vbios carried pptable for all SMU13.0.7 SKUsEvan Quan1-13/+22
For those SMU13.0.7 unsecure SKUs, the vbios carried pptable is ready to go. Use that one instead of hardcoded softpptable. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Kenneth Feng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-30drm/amd/pm: use vbios carried pptable for those supported SKUsEvan Quan3-24/+77
For some SMU13.0.0 SKUs, the vbios carried pptable is ready to go. Use that one instead of hardcoded softpptable. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Kenneth Feng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-30drm/amd/display: fix wrong register accessCharlene Liu2-0/+4
[why] fw version check was for release branch. for staging, it has a chance to enter wrong code path. Reviewed-by: Hansen Dsouza <[email protected]> Acked-by: Brian Chang <[email protected]> Signed-off-by: Charlene Liu <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-30drm/amd/display: use actual cursor size instead of max for CAB allocationAurabindo Pillai1-1/+14
[Why&How] When calculating allocation for cursor size, get the real cursor through the HUBP instead of using the maximum cursor size for more optimal allocation Reviewed-by: Alvin Lee <[email protected]> Reviewed-by: Nicholas Kazlauskas <[email protected]> Acked-by: Brian Chang <[email protected]> Signed-off-by: Aurabindo Pillai <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-30drm/amd/display: disable display fresh from MALL on an edge case for DCN321Aurabindo Pillai1-16/+27
[Why&How] When using a 4k monitor when cursor caching is not supported due to framebuffer being on an uncacheable address, enabling display refresh from MALL would trigger corruption if SS is enabled. Prevent entering SS if we are on the edge case and cursor caching is not possible. Do this only if cursor size larger than a 64x64@4bpp. Pull the cursor size calculation out of if condition since cursor address may not be set on all platforms Reviewed-by: Alvin Lee <[email protected]> Reviewed-by: Nicholas Kazlauskas <[email protected]> Acked-by: Brian Chang <[email protected]> Signed-off-by: Aurabindo Pillai <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-30drm/amd/display: Fix CAB cursor size allocation for DCN32/321Aurabindo Pillai1-1/+1
For calculating cursor size allocation, surface size was used, resulting in over allocation Reviewed-by: Alvin Lee <[email protected]> Reviewed-by: Nicholas Kazlauskas <[email protected]> Acked-by: Brian Chang <[email protected]> Signed-off-by: Aurabindo Pillai <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-30drm/amd/display: Missing HPO instance addedLeo Chen1-0/+1
[Why & How] Number of encoder is set to 4 but only 3 instances are created. Reviewed-by: Charlene Liu <[email protected]> Acked-by: Brian Chang <[email protected]> Signed-off-by: Leo Chen <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-30drm/amd/display: set dig fifo read start level to 7 before dig fifo resetWang Fudong1-0/+5
[Why] DIG_FIFO_ERROR = 1 caused mst daisy chain 2nd monitor black. [How] We need to set dig fifo read start level = 7 before dig fifo reset during dig fifo enable according to hardware designer's suggestion. If it is zero, it will cause underflow or overflow and DIG_FIFO_ERROR = 1. Reviewed-by: Alvin Lee <[email protected]> Reviewed-by: Aric Cyr <[email protected]> Acked-by: Brian Chang <[email protected]> Signed-off-by: Wang Fudong <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-30drm/amdgpu: Fix use-after-free in amdgpu_cs_ioctlYuBiao Wang1-1/+4
[Why] In amdgpu_cs_ioctl, amdgpu_job_free could be performed ealier if there is -ERESTARTSYS error. In this case, job->hw_fence could be not initialized yet. Putting hw_fence during amdgpu_job_free could lead to a use-after-free warning. [How] Check if drm_sched_job_init is performed before job_free by checking s_fence. v2: Check hw_fence.ops instead since it could be NULL if fence is not initialized. Reverse the condition since !=NULL check is discouraged in kernel. Signed-off-by: YuBiao Wang <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-30drm/amd/display: Fix OTG H timing reset for dcn314Duncan Ma1-1/+2
[Why] When ODM is enabled, H timing control register reset to 0. Div mode manual field get overwritten causing no display on certain modes for dcn314. [How] Use REG_UPDATE instead of REG_SET to set div_mode field. Reviewed-by: Charlene Liu <[email protected]> Reviewed-by: Nicholas Kazlauskas <[email protected]> Acked-by: Brian Chang <[email protected]> Signed-off-by: Duncan Ma <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-30drm/amd/display: Fix DCN32 DPSTREAMCLK_CNTL programmingGeorge Shen2-6/+6
[Why] Each index in the DPSTREAMCLK_CNTL register phyiscally maps 1-to-1 with HPO stream encoder instance. On the other hand, each index in DTBCLK_P_CNTL physically maps 1-to-1 with OTG instance. Current DCN32 DPSTREAMCLK_CLK programing assumes that OTG instance always maps 1-to-1 with HPO stream encoder instance. This is not always guaranteed and can result in blackscreen. [How] Program the correct dpstreamclk instance with the correct dtbclk_p source. Reviewed-by: Ariel Bernstein <[email protected]> Acked-by: Brian Chang <[email protected]> Signed-off-by: George Shen <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-30drm/amdgpu: Update mes_v11_api_def.hGraham Sider2-1/+3
New GFX11 MES FW adds the trap_en bit. For now hardcode to 1 (traps enabled). Signed-off-by: Graham Sider <[email protected]> Acked-by: Felix Kuehling <[email protected]> Reviewed-by: Harish Kasiviswanathan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-30drm/amdgpu: disable FRU access on special SIENNA CICHLID cardGuchun Chen1-2/+7
Below driver load error will be printed, not friendly to end user. amdgpu: ATOM BIOS: 113-D603GLXE-077 [drm] FRU: Failed to get size field [drm:amdgpu_fru_get_product_info [amdgpu]] *ERROR* Failed to read FRU Manufacturer, ret:-5 Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Kent Russell <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-28Linux 6.0-rc3Linus Torvalds1-1/+1
2022-08-28Merge tag 'mm-hotfixes-stable-2022-08-28' of ↵Linus Torvalds21-60/+108
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more hotfixes from Andrew Morton: "Seventeen hotfixes. Mostly memory management things. Ten patches are cc:stable, addressing pre-6.0 issues" * tag 'mm-hotfixes-stable-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: .mailmap: update Luca Ceresoli's e-mail address mm/mprotect: only reference swap pfn page if type match squashfs: don't call kmalloc in decompressors mm/damon/dbgfs: avoid duplicate context directory creation mailmap: update email address for Colin King asm-generic: sections: refactor memory_intersects bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown Revert "memcg: cleanup racy sum avoidance code" mm/zsmalloc: do not attempt to free IS_ERR handle binder_alloc: add missing mmap_lock calls when using the VMA mm: re-allow pinning of zero pfns (again) vmcoreinfo: add kallsyms_num_syms symbol mailmap: update Guilherme G. Piccoli's email addresses writeback: avoid use-after-free after removing device shmem: update folio if shmem_replace_page() updates the page mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pte
2022-08-28Merge tag 'bitmap-6.0-rc3' of github.com:/norov/linuxLinus Torvalds4-24/+38
Pull bitmap fixes from Yury Norov: "Fix the reported issues, and implements the suggested improvements, for the version of the cpumask tests [1] that was merged with commit c41e8866c28c ("lib/test: introduce cpumask KUnit test suite"). These changes include fixes for the tests, and better alignment with the KUnit style guidelines" * tag 'bitmap-6.0-rc3' of github.com:/norov/linux: lib/cpumask_kunit: add tests file to MAINTAINERS lib/cpumask_kunit: log mask contents lib/test_cpumask: follow KUnit style guidelines lib/test_cpumask: fix cpu_possible_mask last test lib/test_cpumask: drop cpu_possible_mask full test
2022-08-28.mailmap: update Luca Ceresoli's e-mail addressLuca Ceresoli1-0/+1
My Bootlin address is preferred from now on. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Luca Ceresoli <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Atish Patra <[email protected]> Cc: Hans Verkuil <[email protected]> Cc: Thomas Petazzoni <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-28mm/mprotect: only reference swap pfn page if type matchPeter Xu1-1/+2
Yu Zhao reported a bug after the commit "mm/swap: Add swp_offset_pfn() to fetch PFN from swap entry" added a check in swp_offset_pfn() for swap type [1]: kernel BUG at include/linux/swapops.h:117! CPU: 46 PID: 5245 Comm: EventManager_De Tainted: G S O L 6.0.0-dbg-DEV #2 RIP: 0010:pfn_swap_entry_to_page+0x72/0xf0 Code: c6 48 8b 36 48 83 fe ff 74 53 48 01 d1 48 83 c1 08 48 8b 09 f6 c1 01 75 7b 66 90 48 89 c1 48 8b 09 f6 c1 01 74 74 5d c3 eb 9e <0f> 0b 48 ba ff ff ff ff 03 00 00 00 eb ae a9 ff 0f 00 00 75 13 48 RSP: 0018:ffffa59e73fabb80 EFLAGS: 00010282 RAX: 00000000ffffffe8 RBX: 0c00000000000000 RCX: ffffcd5440000000 RDX: 1ffffffffff7a80a RSI: 0000000000000000 RDI: 0c0000000000042b RBP: ffffa59e73fabb80 R08: ffff9965ca6e8bb8 R09: 0000000000000000 R10: ffffffffa5a2f62d R11: 0000030b372e9fff R12: ffff997b79db5738 R13: 000000000000042b R14: 0c0000000000042b R15: 1ffffffffff7a80a FS: 00007f549d1bb700(0000) GS:ffff99d3cf680000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000440d035b3180 CR3: 0000002243176004 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> change_pte_range+0x36e/0x880 change_p4d_range+0x2e8/0x670 change_protection_range+0x14e/0x2c0 mprotect_fixup+0x1ee/0x330 do_mprotect_pkey+0x34c/0x440 __x64_sys_mprotect+0x1d/0x30 It triggers because pfn_swap_entry_to_page() could be called upon e.g. a genuine swap entry. Fix it by only calling it when it's a write migration entry where the page* is used. [1] https://lore.kernel.org/lkml/CAOUHufaVC2Za-p8m0aiHw6YkheDcrO-C3wRGixwDS32VTS+k1w@mail.gmail.com/ Link: https://lkml.kernel.org/r/[email protected] Fixes: 6c287605fd56 ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive") Signed-off-by: Peter Xu <[email protected]> Reported-by: Yu Zhao <[email protected]> Tested-by: Yu Zhao <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Cc: "Huang, Ying" <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-28squashfs: don't call kmalloc in decompressorsPhillip Lougher4-21/+22
The decompressors may be called while in an atomic section. So move the kmalloc() out of this path, and into the "page actor" init function. This fixes a regression introduced by commit f268eedddf35 ("squashfs: extend "page actor" to handle missing pages") Link: https://lkml.kernel.org/r/[email protected] Fixes: f268eedddf35 ("squashfs: extend "page actor" to handle missing pages") Reported-by: Chris Murphy <[email protected]> Signed-off-by: Phillip Lougher <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-28mm/damon/dbgfs: avoid duplicate context directory creationBadari Pulavarty1-0/+3
When user tries to create a DAMON context via the DAMON debugfs interface with a name of an already existing context, the context directory creation fails but a new context is created and added in the internal data structure, due to absence of the directory creation success check. As a result, memory could leak and DAMON cannot be turned on. An example test case is as below: # cd /sys/kernel/debug/damon/ # echo "off" > monitor_on # echo paddr > target_ids # echo "abc" > mk_context # echo "abc" > mk_context # echo $$ > abc/target_ids # echo "on" > monitor_on <<< fails Return value of 'debugfs_create_dir()' is expected to be ignored in general, but this is an exceptional case as DAMON feature is depending on the debugfs functionality and it has the potential duplicate name issue. This commit therefore fixes the issue by checking the directory creation failure and immediately return the error in the case. Link: https://lkml.kernel.org/r/[email protected] Fixes: 75c1c2b53c78 ("mm/damon/dbgfs: support multiple contexts") Signed-off-by: Badari Pulavarty <[email protected]> Signed-off-by: SeongJae Park <[email protected]> Cc: <[email protected]> [ 5.15.x] Signed-off-by: Andrew Morton <[email protected]>
2022-08-28mailmap: update email address for Colin KingColin Ian King1-2/+1
Colin King is working on kernel janitorial fixes in his spare time and using his Intel email is confusing. Use his gmail account as the default email address. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-28asm-generic: sections: refactor memory_intersectsQuanyang Wang1-2/+5
There are two problems with the current code of memory_intersects: First, it doesn't check whether the region (begin, end) falls inside the region (virt, vend), that is (virt < begin && vend > end). The second problem is if vend is equal to begin, it will return true but this is wrong since vend (virt + size) is not the last address of the memory region but (virt + size -1) is. The wrong determination will trigger the misreporting when the function check_for_illegal_area calls memory_intersects to check if the dma region intersects with stext region. The misreporting is as below (stext is at 0x80100000): WARNING: CPU: 0 PID: 77 at kernel/dma/debug.c:1073 check_for_illegal_area+0x130/0x168 DMA-API: chipidea-usb2 e0002000.usb: device driver maps memory from kernel text or rodata [addr=800f0000] [len=65536] Modules linked in: CPU: 1 PID: 77 Comm: usb-storage Not tainted 5.19.0-yocto-standard #5 Hardware name: Xilinx Zynq Platform unwind_backtrace from show_stack+0x18/0x1c show_stack from dump_stack_lvl+0x58/0x70 dump_stack_lvl from __warn+0xb0/0x198 __warn from warn_slowpath_fmt+0x80/0xb4 warn_slowpath_fmt from check_for_illegal_area+0x130/0x168 check_for_illegal_area from debug_dma_map_sg+0x94/0x368 debug_dma_map_sg from __dma_map_sg_attrs+0x114/0x128 __dma_map_sg_attrs from dma_map_sg_attrs+0x18/0x24 dma_map_sg_attrs from usb_hcd_map_urb_for_dma+0x250/0x3b4 usb_hcd_map_urb_for_dma from usb_hcd_submit_urb+0x194/0x214 usb_hcd_submit_urb from usb_sg_wait+0xa4/0x118 usb_sg_wait from usb_stor_bulk_transfer_sglist+0xa0/0xec usb_stor_bulk_transfer_sglist from usb_stor_bulk_srb+0x38/0x70 usb_stor_bulk_srb from usb_stor_Bulk_transport+0x150/0x360 usb_stor_Bulk_transport from usb_stor_invoke_transport+0x38/0x440 usb_stor_invoke_transport from usb_stor_control_thread+0x1e0/0x238 usb_stor_control_thread from kthread+0xf8/0x104 kthread from ret_from_fork+0x14/0x2c Refactor memory_intersects to fix the two problems above. Before the 1d7db834a027e ("dma-debug: use memory_intersects() directly"), memory_intersects is called only by printk_late_init: printk_late_init -> init_section_intersects ->memory_intersects. There were few places where memory_intersects was called. When commit 1d7db834a027e ("dma-debug: use memory_intersects() directly") was merged and CONFIG_DMA_API_DEBUG is enabled, the DMA subsystem uses it to check for an illegal area and the calltrace above is triggered. [[email protected]: fix nearby comment typo] Link: https://lkml.kernel.org/r/[email protected] Fixes: 979559362516 ("asm/sections: add helpers to check for section data") Signed-off-by: Quanyang Wang <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Thierry Reding <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-28bootmem: remove the vmemmap pages from kmemleak in put_page_bootmemLiu Shixin1-0/+2
The vmemmap pages is marked by kmemleak when allocated from memblock. Remove it from kmemleak when freeing the page. Otherwise, when we reuse the page, kmemleak may report such an error and then stop working. kmemleak: Cannot insert 0xffff98fb6eab3d40 into the object search tree (overlaps existing) kmemleak: Kernel memory leak detector disabled kmemleak: Object 0xffff98fb6be00000 (size 335544320): kmemleak: comm "swapper", pid 0, jiffies 4294892296 kmemleak: min_count = 0 kmemleak: count = 0 kmemleak: flags = 0x1 kmemleak: checksum = 0 kmemleak: backtrace: Link: https://lkml.kernel.org/r/[email protected] Fixes: f41f2ed43ca5 (mm: hugetlb: free the vmemmap pages associated with each HugeTLB page) Signed-off-by: Liu Shixin <[email protected]> Reviewed-by: Muchun Song <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-28ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdownHeming Zhao2-5/+6
After commit 0737e01de9c4 ("ocfs2: ocfs2_mount_volume does cleanup job before return error"), any procedure after ocfs2_dlm_init() fails will trigger crash when calling ocfs2_dlm_shutdown(). ie: On local mount mode, no dlm resource is initialized. If ocfs2_mount_volume() fails in ocfs2_find_slot(), error handling will call ocfs2_dlm_shutdown(), then does dlm resource cleanup job, which will trigger kernel crash. This solution should bypass uninitialized resources in ocfs2_dlm_shutdown(). Link: https://lkml.kernel.org/r/[email protected] Fixes: 0737e01de9c4 ("ocfs2: ocfs2_mount_volume does cleanup job before return error") Signed-off-by: Heming Zhao <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-28Revert "memcg: cleanup racy sum avoidance code"Shakeel Butt1-2/+13
This reverts commit 96e51ccf1af33e82f429a0d6baebba29c6448d0f. Recently we started running the kernel with rstat infrastructure on production traffic and begin to see negative memcg stats values. Particularly the 'sock' stat is the one which we observed having negative value. $ grep "sock " /mnt/memory/job/memory.stat sock 253952 total_sock 18446744073708724224 Re-run after couple of seconds $ grep "sock " /mnt/memory/job/memory.stat sock 253952 total_sock 53248 For now we are only seeing this issue on large machines (256 CPUs) and only with 'sock' stat. I think the networking stack increase the stat on one cpu and decrease it on another cpu much more often. So, this negative sock is due to rstat flusher flushing the stats on the CPU that has seen the decrement of sock but missed the CPU that has increments. A typical race condition. For easy stable backport, revert is the most simple solution. For long term solution, I am thinking of two directions. First is just reduce the race window by optimizing the rstat flusher. Second is if the reader sees a negative stat value, force flush and restart the stat collection. Basically retry but limited. Link: https://lkml.kernel.org/r/[email protected] Fixes: 96e51ccf1af33e8 ("memcg: cleanup racy sum avoidance code") Signed-off-by: Shakeel Butt <[email protected]> Cc: "Michal Koutný" <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Roman Gushchin <[email protected]> Cc: Muchun Song <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Yosry Ahmed <[email protected]> Cc: Greg Thelen <[email protected]> Cc: <[email protected]> [5.15] Signed-off-by: Andrew Morton <[email protected]>
2022-08-28mm/zsmalloc: do not attempt to free IS_ERR handleSergey Senozhatsky1-1/+1
zsmalloc() now returns ERR_PTR values as handles, which zram accidentally can pass to zs_free(). Another bad scenario is when zcomp_compress() fails - handle has default -ENOMEM value, and zs_free() will try to free that "pointer value". Add the missing check and make sure that zs_free() bails out when ERR_PTR() is passed to it. Link: https://lkml.kernel.org/r/[email protected] Fixes: c7e6f17b52e9 ("zsmalloc: zs_malloc: return ERR_PTR on failure") Signed-off-by: Sergey Senozhatsky <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Nitin Gupta <[email protected]>, Signed-off-by: Andrew Morton <[email protected]>
2022-08-28binder_alloc: add missing mmap_lock calls when using the VMALiam Howlett1-10/+21
Take the mmap_read_lock() when using the VMA in binder_alloc_print_pages() and when checking for a VMA in binder_alloc_new_buf_locked(). It is worth noting binder_alloc_new_buf_locked() drops the VMA read lock after it verifies a VMA exists, but may be taken again deeper in the call stack, if necessary. Link: https://lkml.kernel.org/r/[email protected] Fixes: a43cfc87caaf (android: binder: stop saving a pointer to the VMA) Signed-off-by: Liam R. Howlett <[email protected]> Reported-by: Ondrej Mosnacek <[email protected]> Reported-by: <[email protected]> Acked-by: Carlos Llamas <[email protected]> Tested-by: Ondrej Mosnacek <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Christian Brauner (Microsoft) <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Hridya Valsaraju <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Martijn Coenen <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Todd Kjos <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: "Arve Hjønnevåg" <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-28mm: re-allow pinning of zero pfns (again)Alex Williamson1-3/+10
The below referenced commit makes the same error as 1c563432588d ("mm: fix is_pinnable_page against a cma page"), re-interpreting the logic to exclude pinning of the zero page, which breaks device assignment with vfio. To avoid further subtle mistakes, split the logic into discrete tests. [[email protected]: simplify comment, per John] Link: https://lkml.kernel.org/r/166015037385.760108.16881097713975517242.stgit@omen Link: https://lore.kernel.org/all/165490039431.944052.12458624139225785964.stgit@omen Fixes: f25cbb7a95a2 ("mm: add zone device coherent type memory support") Signed-off-by: Alex Williamson <[email protected]> Suggested-by: Matthew Wilcox <[email protected]> Suggested-by: Felix Kuehling <[email protected]> Tested-by: Slawomir Laba <[email protected]> Reviewed-by: John Hubbard <[email protected]> Cc: Alex Sierra <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Alistair Popple <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-28vmcoreinfo: add kallsyms_num_syms symbolStephen Brennan1-0/+1
The rest of the kallsyms symbols are useless without knowing the number of symbols in the table. In an earlier patch, I somehow dropped the kallsyms_num_syms symbol, so add it back in. Link: https://lkml.kernel.org/r/[email protected] Fixes: 5fd8fea935a1 ("vmcoreinfo: include kallsyms symbols") Signed-off-by: Stephen Brennan <[email protected]> Cc: Baoquan He <[email protected]> Cc: Dave Young <[email protected]> Cc: Vivek Goyal <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-28mailmap: update Guilherme G. Piccoli's email addressesGuilherme G. Piccoli1-0/+2
Both @canonical and @ibm email addresses are invalid now; use my personal address instead. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Guilherme G. Piccoli <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-28writeback: avoid use-after-free after removing deviceKhazhismel Kumykov3-12/+16
When a disk is removed, bdi_unregister gets called to stop further writeback and wait for associated delayed work to complete. However, wb_inode_writeback_end() may schedule bandwidth estimation dwork after this has completed, which can result in the timer attempting to access the just freed bdi_writeback. Fix this by checking if the bdi_writeback is alive, similar to when scheduling writeback work. Since this requires wb->work_lock, and wb_inode_writeback_end() may get called from interrupt, switch wb->work_lock to an irqsafe lock. Link: https://lkml.kernel.org/r/[email protected] Fixes: 45a2966fd641 ("writeback: fix bandwidth estimate for spiky workload") Signed-off-by: Khazhismel Kumykov <[email protected]> Reviewed-by: Jan Kara <[email protected]> Cc: Michael Stapelberg <[email protected]> Cc: Wu Fengguang <[email protected]> Cc: Alexander Viro <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-28shmem: update folio if shmem_replace_page() updates the pageMatthew Wilcox (Oracle)1-0/+1
If we allocate a new page, we need to make sure that our folio matches that new page. If we do end up in this code path, we store the wrong page in the shmem inode's page cache, and I would rather imagine that data corruption ensues. This will be solved by changing shmem_replace_page() to shmem_replace_folio(), but this is the minimal fix. Link: https://lkml.kernel.org/r/[email protected] Fixes: da08e9b79323 ("mm/shmem: convert shmem_swapin_page() to shmem_swapin_folio()") Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: William Kucharski <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-28mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pteMiaohe Lin1-1/+1
In MCOPY_ATOMIC_CONTINUE case with a non-shared VMA, pages in the page cache are installed in the ptes. But hugepage_add_new_anon_rmap is called for them mistakenly because they're not vm_shared. This will corrupt the page->mapping used by page cache code. Link: https://lkml.kernel.org/r/[email protected] Fixes: f619147104c8 ("userfaultfd: add UFFDIO_CONTINUE ioctl") Signed-off-by: Miaohe Lin <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Cc: Axel Rasmussen <[email protected]> Cc: Peter Xu <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-28Merge tag 'for-6.0-rc3-tag' of ↵Linus Torvalds11-70/+79
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "Fixes: - check that subvolume is writable when changing xattrs from security namespace - fix memory leak in device lookup helper - update generation of hole file extent item when merging holes - fix space cache corruption and potential double allocations; this is a rare bug but can be serious once it happens, stable backports and analysis tool will be provided - fix error handling when deleting root references - fix crash due to assert when attempting to cancel suspended device replace, add message what to do if mount fails due to missing replace item Regressions: - don't merge pages into bio if their page offset is not contiguous - don't allow large NOWAIT direct reads, this could lead to short reads eg. in io_uring" * tag 'for-6.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: add info when mount fails due to stale replace target btrfs: replace: drop assert for suspended replace btrfs: fix silent failure when deleting root reference btrfs: fix space cache corruption and potential double allocations btrfs: don't allow large NOWAIT direct reads btrfs: don't merge pages into bio if their page offset is not contiguous btrfs: update generation of hole file extent item when merging holes btrfs: fix possible memory leak in btrfs_get_dev_args_from_path() btrfs: check if root is readonly while setting security xattr
2022-08-28Merge tag '6.0-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds6-67/+70
Pull cfis fixes from Steve French: - two locking fixes (zero range, punch hole) - DFS 9 fix (padding), affecting some servers - three minor cleanup changes * tag '6.0-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: Add helper function to check smb1+ server cifs: Use help macro to get the mid header size cifs: Use help macro to get the header preamble size cifs: skip extra NULL byte in filenames smb3: missing inode locks in punch hole smb3: missing inode locks in zero range
2022-08-28Merge tag 'x86-urgent-2022-08-28' of ↵Linus Torvalds15-88/+183
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 fixes from Ingo Molnar: - Fix PAT on Xen, which caused i915 driver failures - Fix compat INT 80 entry crash on Xen PV guests - Fix 'MMIO Stale Data' mitigation status reporting on older Intel CPUs - Fix RSB stuffing regressions - Fix ORC unwinding on ftrace trampolines - Add Intel Raptor Lake CPU model number - Fix (work around) a SEV-SNP bootloader bug providing bogus values in boot_params->cc_blob_address, by ignoring the value on !SEV-SNP bootups. - Fix SEV-SNP early boot failure - Fix the objtool list of noreturn functions and annotate snp_abort(), which bug confused objtool on gcc-12. - Fix the documentation for retbleed * tag 'x86-urgent-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Documentation/ABI: Mention retbleed vulnerability info file for sysfs x86/sev: Mark snp_abort() noreturn x86/sev: Don't use cc_platform_has() for early SEV-SNP calls x86/boot: Don't propagate uninitialized boot_params->cc_blob_address x86/cpu: Add new Raptor Lake CPU model number x86/unwind/orc: Unwind ftrace trampolines with correct ORC entry x86/nospec: Fix i386 RSB stuffing x86/nospec: Unwreck the RSB stuffing x86/bugs: Add "unknown" reporting for MMIO Stale Data x86/entry: Fix entry_INT80_compat for Xen PV guests x86/PAT: Have pat_enabled() properly reflect state when running on Xen
2022-08-28Merge tag 'perf-urgent-2022-08-28' of ↵Linus Torvalds4-7/+36
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 perf fixes from Ingo Molnar: "Misc fixes: an Arch-LBR fix, a PEBS enumeration fix, an Intel DS fix, PEBS constraints fix on Alder Lake CPUs and an Intel uncore PMU fix" * tag 'perf-urgent-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/uncore: Fix broken read_counter() for SNB IMC PMU perf/x86/intel: Fix pebs event constraints for ADL perf/x86/intel/ds: Fix precise store latency handling perf/x86/core: Set pebs_capable and PMU_FL_PEBS_ALL for the Baseline perf/x86/lbr: Enable the branch type for the Arch LBR by default
2022-08-28Merge tag 'perf-tools-fixes-for-v6.0-2022-08-27' of ↵Linus Torvalds8-32/+61
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fixup setup of weak groups when using 'perf stat --repeat', add a 'perf test' for it. - Fix memory leaks in 'perf sched record' detected with -fsanitize=address. - Fix build when PYTHON_CONFIG is user supplied. - Capitalize topdown metrics' names in 'perf stat', so that the output, sometimes parsed, matches the Intel SDM docs. - Make sure the documentation for the save_type filter about Intel systems with Arch LBR support (12th-Gen+ client or 4th-Gen Xeon+ server) reflects recent related kernel changes. - Fix 'perf record' man page formatting of description of support to hybrid systems. - Update arm64´s KVM header from the kernel sources. * tag 'perf-tools-fixes-for-v6.0-2022-08-27' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf stat: Capitalize topdown metrics' names perf docs: Update the documentation for the save_type filter perf sched: Fix memory leaks in __cmd_record detected with -fsanitize=address perf record: Fix manpage formatting of description of support to hybrid systems perf test: Stat test for repeat with a weak group perf stat: Clear evsel->reset_group for each stat run tools kvm headers arm64: Update KVM header from the kernel sources perf python: Fix build when PYTHON_CONFIG is user supplied
2022-08-27Merge tag 'thermal-6.0-rc3' of ↵Linus Torvalds3-4/+7
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control fixes from Rafael Wysocki: "Fix two issues introduced recently and one driver problem leading to a NULL pointer dereference in some cases. Specifics: - Add missing EXPORT_SYMBOL_GPL in the thermal core and add back the required 'trips' property to the thermal zone DT bindings (Daniel Lezcano) - Prevent the int340x_thermal driver from crashing when a package with a buffer of 0 length is returned by an ACPI control method evaluated by it (Lee, Chun-Yi)" * tag 'thermal-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal/int340x_thermal: handle data_vault when the value is ZERO_SIZE_PTR dt-bindings: thermal: Fix missing required property thermal/core: Add missing EXPORT_SYMBOL_GPL
2022-08-27Merge tag 'pm-6.0-rc3' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "Make __resolve_freq() check the presence of the frequency table instead of checking whether or not the ->target_index() callback is implemented by the driver, because that need not be the case when __resolve_freq() is used (Lukasz Luba)" * tag 'pm-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: check only freq_table in __resolve_freq()
2022-08-27Merge tag 'acpi-6.0-rc3' of ↵Linus Torvalds2-7/+6
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix issues introduced by recent changes related to the handling of ACPI device properties and a coding mistake in the exit path of the ACPI processor driver. Specifics: - Prevent acpi_thermal_cpufreq_exit() from attempting to remove the same frequency QoS request multiple times (Riwen Lu) - Fix type detection for integer ACPI device properties (Stefan Binding) - Avoid emitting false-positive warnings when processing ACPI device properties and drop the useless default case from the acpi_copy_property_array_uint() macro (Sakari Ailus)" * tag 'acpi-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: property: Remove default association from integer maximum values ACPI: property: Ignore already existing data node tags ACPI: property: Fix type detection of unified integer reading functions ACPI: processor: Remove freq Qos request for all CPUs
2022-08-27Merge tag 's390-6.0-2' of ↵Linus Torvalds2-7/+19
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - Fix double free of guarded storage and runtime instrumentation control blocks on fork() failure - Fix triggering write fault when VMA does not allow VM_WRITE * tag 's390-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/mm: do not trigger write fault when vma does not allow VM_WRITE s390: fix double free of GS and RI CBs on fork() failure
2022-08-27Merge tag 'for-linus-6.0-rc3-tag' of ↵Linus Torvalds4-13/+13
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - two minor cleanups - a fix of the xen/privcmd driver avoiding a possible NULL dereference in an error case * tag 'for-linus-6.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/privcmd: fix error exit of privcmd_ioctl_dm_op() xen: move from strlcpy with unused retval to strscpy xen: x86: remove setting the obsolete config XEN_MAX_DOMAIN_MEMORY
2022-08-27Merge tag 'audit-pr-20220826' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit fix from Paul Moore: "Another small audit patch, this time to fix a bug where the return codes were not properly set before the audit filters were run, potentially resulting in missed audit records" * tag 'audit-pr-20220826' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: move audit_return_fixup before the filters