aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-05-20xfs: fix deadlock retry tracepoint argumentsDarrick J. Wong1-1/+3
sc->ip is the inode that's being scrubbed, which means that it's not set for scrub types that don't involve inodes. If one of those scrubbers (e.g. inode btrees) returns EDEADLOCK, we'll trip over the null pointer. Fix that by reporting either the file being examined or the file that was used to call scrub. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]>
2021-05-20xfs: retry allocations when locality-based search failsDarrick J. Wong1-1/+14
If a realtime allocation fails because we can't find a sufficiently large free extent satisfying locality rules, relax the locality rules and try again. This reduces the occurrence of short writes to realtime files when the write size is large and the free space is fragmented. This was originally discovered by running generic/186 with the realtime reflink patchset and a 128k cow extent size hint, but the short write symptoms can manifest with a 128k extent size hint and no reflink, so apply the fix now. Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Allison Henderson <[email protected]>
2021-05-21powerpc/64s/syscall: Fix ptrace syscall info with scv syscallsNicholas Piggin2-35/+52
The scv implementation missed updating syscall return value and error value get/set functions to deal with the changed register ABI. This broke ptrace PTRACE_GET_SYSCALL_INFO as well as some kernel auditing and tracing functions. Fix. tools/testing/selftests/ptrace/get_syscall_info now passes when scv is used. Fixes: 7fa95f9adaee ("powerpc/64s: system call support for scv/rfscv instructions") Cc: [email protected] # v5.9+ Reported-by: "Dmitry V. Levin" <[email protected]> Signed-off-by: Nicholas Piggin <[email protected]> Reviewed-by: Dmitry V. Levin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-05-21powerpc/64s/syscall: Use pt_regs.trap to distinguish syscall ABI difference ↵Nicholas Piggin2-9/+28
between sc and scv syscalls The sc and scv 0 system calls have different ABI conventions, and ptracers need to know which system call type is being used if they want to look at the syscall registers. Document that pt_regs.trap can be used for this, and fix one in-tree user to work with scv 0 syscalls. Fixes: 7fa95f9adaee ("powerpc/64s: system call support for scv/rfscv instructions") Cc: [email protected] # v5.9+ Reported-by: "Dmitry V. Levin" <[email protected]> Suggested-by: "Dmitry V. Levin" <[email protected]> Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-05-20block: fix a race between del_gendisk and BLKRRPARTGulam Mohamed1-0/+3
When BLKRRPART is called concurrently with del_gendisk, the partitions rescan can create a stale partition that will never be be cleaned up. Fix this by checking the the disk is up before rescanning partitions while under bd_mutex. Signed-off-by: Gulam Mohamed <[email protected]> [hch: split from a larger patch] Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-05-20block: prevent block device lookups at the beginning of del_gendiskChristoph Hellwig3-22/+6
As an artifact of how gendisk lookup used to work in earlier kernels, GENHD_FL_UP is only cleared very late in del_gendisk, and a global lock is used to prevent opens from succeeding while del_gendisk is tearing down the gendisk. Switch to clearing the flag early and under bd_mutex so that callers can use bd_mutex to stabilize the flag, which removes the need for the global mutex. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Ming Lei <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
2021-05-20Merge tag 'nvme-5.13-2021-05-20' of git://git.infradead.org/nvme into block-5.13Jens Axboe4-4/+19
Pull NVMe fixes from Christoph: "nvme fixes for Linux 5.13: - nvme-tcp corruption and timeout fixes (Sagi Grimberg, Keith Busch) - nvme-fc teardown fix (James Smart) - nvmet/nvme-loop memory leak fixes (Wu Bo)" * tag 'nvme-5.13-2021-05-20' of git://git.infradead.org/nvme: nvme-fc: clear q_live at beginning of association teardown nvme-tcp: rerun io_work if req_list is not empty nvme-tcp: fix possible use-after-completion nvme-loop: fix memory leak in nvme_loop_create_ctrl() nvmet: fix memory leak in nvmet_alloc_ctrl()
2021-05-20drm/i915/gt: fix typo issuewengjianfeng1-2/+2
Change 'freqency' to 'frequency'. Signed-off-by: wengjianfeng <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Signed-off-by: Matthew Auld <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-05-20btrfs: zoned: fix parallel compressed writesJohannes Thumshirn1-4/+38
When multiple processes write data to the same block group on a compressed zoned filesystem, the underlying device could report I/O errors and data corruption is possible. This happens because on a zoned file system, compressed data writes where sent to the device via a REQ_OP_WRITE instead of a REQ_OP_ZONE_APPEND operation. But with REQ_OP_WRITE and parallel submission it cannot be guaranteed that the data is always submitted aligned to the underlying zone's write pointer. The change to using REQ_OP_ZONE_APPEND instead of REQ_OP_WRITE on a zoned filesystem is non intrusive on a regular file system or when submitting to a conventional zone on a zoned filesystem, as it is guarded by btrfs_use_zone_append. Reported-by: David Sterba <[email protected]> Fixes: 9d294a685fbc ("btrfs: zoned: enable to mount ZONED incompat flag") CC: [email protected] # 5.12.x: e380adfc213a13: btrfs: zoned: pass start block to btrfs_use_zone_append CC: [email protected] # 5.12.x Signed-off-by: Johannes Thumshirn <[email protected]> Signed-off-by: David Sterba <[email protected]>
2021-05-20btrfs: zoned: pass start block to btrfs_use_zone_appendJohannes Thumshirn4-7/+6
btrfs_use_zone_append only needs the passed in extent_map's block_start member, so there's no need to pass in the full extent map. This also enables the use of btrfs_use_zone_append in places where we only have a start byte but no extent_map. Signed-off-by: Johannes Thumshirn <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2021-05-20io_uring: fortify tctx/io_wq cleanupPavel Begunkov1-4/+4
We don't want anyone poking into tctx->io_wq awhile it's being destroyed by io_wq_put_and_exit(), and even though it shouldn't even happen, if buggy would be preferable to get a NULL-deref instead of subtle delayed failure or UAF. Signed-off-by: Pavel Begunkov <[email protected]> Link: https://lore.kernel.org/r/827b021de17926fd807610b3e53a5a5fa8530856.1621513214.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <[email protected]>
2021-05-20platform/x86: touchscreen_dmi: Add info for the Chuwi Hi10 Pro (CWI529) tabletHans de Goede1-0/+35
Add touchscreen info for the Chuwi Hi10 Pro (CWI529) tablet. This includes info for getting the firmware directly from the UEFI, so that the user does not need to manually install the firmware in /lib/firmware/silead. This change will make the touchscreen on these devices work OOTB, without requiring any manual setup. Signed-off-by: Hans de Goede <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-05-20dma-buf: fix unintended pin/unpin warningsChristian König1-5/+5
DMA-buf internal users call the pin/unpin functions without having a dynamic attachment. Avoid the warning and backtrace in the logs. Signed-off-by: Christian König <[email protected]> Bugs: https://gitlab.freedesktop.org/drm/intel/-/issues/3481 Fixes: c545781e1c55 ("dma-buf: doc polish for pin/unpin") Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> CC: [email protected] Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-05-20drm/bridge: anx7625: Synchronously run runtime suspend.Pi-Hsun Shih1-2/+2
Originally when using pm_runtime_put, there's a chance that the runtime suspend hook will be run after the following anx7625_bridge_mode_set call, resulting in the display_timing_valid field to be cleared, and the following power on fail. Change all pm_runtime_put to pm_runtime_put_sync, so all power off operations are guaranteed to be done after the call returns. Fixes: 60487584a79a ("drm/bridge: anx7625: refactor power control to use runtime PM framework") Signed-off-by: Pi-Hsun Shih <[email protected]> Tested-by: Tzung-Bi Shih <[email protected]> Reviewed-by: Robert Foss <[email protected]> Signed-off-by: Robert Foss <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-05-20powerpc: Fix early setup to make early_ioremap() workAlexey Kardashevskiy1-2/+2
The immediate problem is that after commit 0bd3f9e953bd ("powerpc/legacy_serial: Use early_ioremap()") the kernel silently reboots on some systems. The reason is that early_ioremap() returns broken addresses as it uses slot_virt[] array which initialized with offsets from FIXADDR_TOP == IOREMAP_END+FIXADDR_SIZE == KERN_IO_END - FIXADDR_SIZ + FIXADDR_SIZE == __kernel_io_end which is 0 when early_ioremap_setup() is called. __kernel_io_end is initialized little bit later in early_init_mmu(). This fixes the initialization by swapping early_ioremap_setup() and early_init_mmu(). Fixes: 265c3491c4bc ("powerpc: Add support for GENERIC_EARLY_IOREMAP") Signed-off-by: Alexey Kardashevskiy <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> [mpe: Drop unrelated cleanup & cleanup change log] Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-05-19drm/amdgpu: Unmap all MMIO mappingsAndrey Grodzovsky3-11/+23
Access to those must be prevented post pci_remove v6: Drop BOs list, unampping VRAM BAR is enough. v8: Add condition of xgmi.connected_to_cpu to MTTR handling and remove MTTR handling from the old place. Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-05-19drm/amdgpu: Verify DMA opearations from device are doneAndrey Grodzovsky1-0/+6
In case device remove is just simualted by sysfs then verify device doesn't keep doing DMA to the released memory after pci_remove is done. Signed-off-by: Andrey Grodzovsky <[email protected]> Acked-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-05-19drm/amd/display: Remove superfluous drm_mode_config_cleanupAndrey Grodzovsky1-1/+0
It's already being released by DRM core through devm Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Rodrigo Siqueira <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-05-19drm/sched: Avoid data corruptionsAndrey Grodzovsky1-0/+5
Wait for all dependencies of a job to complete before killing it to avoid data corruptions. Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-05-19drm/scheduler: Fix hang when sched_entity releasedAndrey Grodzovsky2-1/+26
Problem: If scheduler is already stopped by the time sched_entity is released and entity's job_queue not empty I encountred a hang in drm_sched_entity_flush. This is because drm_sched_entity_is_idle never becomes false. Fix: In drm_sched_fini detach all sched_entities from the scheduler's run queues. This will satisfy drm_sched_entity_is_idle. Also wakeup all those processes stuck in sched_entity flushing as the scheduler main thread which wakes them up is stopped by now. v2: Reverse order of drm_sched_rq_remove_entity and marking s_entity as stopped to prevent reinserion back to rq due to race. v3: Drop drm_sched_rq_remove_entity, only modify entity->stopped and check for it in drm_sched_entity_is_idle Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-05-19drm/amdgpu: Fix hang on device removal.Andrey Grodzovsky1-6/+10
If removing while commands in flight you cannot wait to flush the HW fences on a ring since the device is gone. Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-05-19drm/amdgpu: Prevent any job recoveries after device is unplugged.Andrey Grodzovsky1-3/+16
Return DRM_TASK_STATUS_ENODEV back to the scheduler when device is not present so they timeout timer will not be rearmed. v5: Update to match updated return values in enum drm_gpu_sched_stat Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-05-19drm/sched: Make timeout timer rearm conditional.Andrey Grodzovsky1-4/+7
We don't want to rearm the timer if driver hook reports that the device is gone. v5: Update drm_gpu_sched_stat values in code. Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-05-19drm/amdgpu: Guard against write accesses after device removalAndrey Grodzovsky13-96/+168
This should prevent writing to memory or IO ranges possibly already allocated for other uses after our device is removed. v5: Protect more places wher memcopy_to/form_io takes place Protect IB submissions v6: Switch to !drm_dev_enter instead of scoping entire code with brackets. v7: Drop guard of HW ring commands emission protection since they are in GART and not in MMIO. Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-05-19drm/amdgpu: Convert driver sysfs attributes to static attributesAndrey Grodzovsky4-32/+37
This allows to remove explicit creation and destruction of those attrs and by this avoids warnings on device finalizing post physical device extraction. v5: Use newly added pci_driver.dev_groups directly Signed-off-by: Andrey Grodzovsky <[email protected]> Acked-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-05-19PCI: Add support for dev_groups to struct pci_driverAndrey Grodzovsky2-0/+4
This helps converting PCI drivers sysfs attributes to static. Analogous to' commit b71b283e3d6d ("USB: add support for dev_groups to struct usb_driver")' Signed-off-by: Andrey Grodzovsky <[email protected]> Suggested-by: Greg Kroah-Hartman <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-05-19drm/amdgpu: Remap all page faults to per process dummy page.Andrey Grodzovsky1-5/+16
On device removal reroute all CPU mappings to dummy page per drm_file instance or imported GEM object. v4: Update for modified ttm_bo_vm_dummy_page Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-05-19drm/amdgpu: Handle IOMMU enabled case.Andrey Grodzovsky18-40/+13
Problem: Handle all DMA IOMMU group related dependencies before the group is removed. Those manifest themself in that when IOMMU enabled DMA map/unmap is dependent on the presence of IOMMU group the device belongs to but, this group is released once the device is removed from PCI topology. Fix: Expedite all such unmap operations to pci remove driver callback. v5: Drop IOMMU notifier and switch to lockless call to ttm_tt_unpopulate v6: Drop the BO unamp list v7: Drop amdgpu_gart_fini In amdgpu_ih_ring_fini do uncinditional check (!ih->ring) to avoid freeing uniniitalized rings. Call amdgpu_ih_ring_fini unconditionally. v8: Add deatiled explanation Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-05-19drm/amdgpu: Add early fini callbackAndrey Grodzovsky6-24/+54
Use it to call disply code dependent on device->drv_data before it's set to NULL on device unplug v5: Move HW finilization into this callback to prevent MMIO accesses post cpi remove. v7: Split kfd suspend from device exit to expdite HW related stuff to amdgpu_pci_remove v8: Squash previous KFD commit into this commit to avoid compile break. Signed-off-by: Andrey Grodzovsky <[email protected]> Acked-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-05-19drm/amdgpu: Split amdgpu_device_fini into early and lateAndrey Grodzovsky17-36/+79
Some of the stuff in amdgpu_device_fini such as HW interrupts disable and pending fences finilization must be done right away on pci_remove while most of the stuff which relates to finilizing and releasing driver data structures can be kept until drm_driver.release hook is called, i.e. when the last device reference is dropped. v4: Change functions prefix early->hw and late->sw Signed-off-by: Andrey Grodzovsky <[email protected]> Acked-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-05-19drm/ttm: Remap all page faults to per process dummy page.Andrey Grodzovsky2-1/+55
On device removal reroute all CPU mappings to dummy page. v3: Remove loop to find DRM file and instead access it by vma->vm_file->private_data. Move dummy page installation into a separate function. v4: Map the entire BOs VA space into on demand allocated dummy page on the first fault for that BO. v5: Remove duplicate return. v6: Polish ttm_bo_vm_dummy_page, remove superfluous code. Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-05-20Merge tag 'drm-misc-next-2021-05-17' of ↵Dave Airlie17-32/+108
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.14: UAPI Changes: Cross-subsystem Changes: Core Changes: * aperture: Fix unlocking on errors * legacy: Fix some doc comments Driver Changes: * drm/amdgpu: Free resource on fence usage query; Fix fence calculation; * drm/bridge: Lt9611: Add missing MODULE_DEVICE_TABLE * drm/i915: Print formats with %p4cc * drm/ingenic: IPU planes are now always of type OVERLAY * drm/nouveau: Remove left-over reference to struct drm_device.pdev * drm/panfrost: Disable devfreq if num_supplies > 1; Add Mediatek MT8183 + DT bindings; Cleanups * drm/simpledrm: Print resources with %pr; Fix use-after-free errors; Fix NULL deref; Fix MAINTAINERS entry * drm/vmwgfx: Fix memory allocation and leak in FIFO allocation; Fix return value in PCI resource setup Signed-off-by: Dave Airlie <[email protected]> From: Thomas Zimmermann <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-05-19drm/amdgpu: stop touching sched.ready in the backendChristian König4-16/+1
This unfortunately comes up in regular intervals and breaks GPU reset for the engine in question. The sched.ready flag controls if an engine can't get working during hw_init, but should never be set to false during hw_fini. v2: squash in unused variable fix (Alex) Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-05-19drm/amd/amdgpu: fix a potential deadlock in gpu resetLang Yu1-1/+0
When amdgpu_ib_ring_tests failed, the reset logic called amdgpu_device_ip_suspend twice, then deadlock occurred. Deadlock log: [ 805.655192] amdgpu 0000:04:00.0: amdgpu: ib ring test failed (-110). [ 806.290952] [drm] free PSP TMR buffer [ 806.319406] ============================================ [ 806.320315] WARNING: possible recursive locking detected [ 806.321225] 5.11.0-custom #1 Tainted: G W OEL [ 806.322135] -------------------------------------------- [ 806.323043] cat/2593 is trying to acquire lock: [ 806.323825] ffff888136b1cdc8 (&adev->dm.dc_lock){+.+.}-{3:3}, at: dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.325668] but task is already holding lock: [ 806.326664] ffff888136b1cdc8 (&adev->dm.dc_lock){+.+.}-{3:3}, at: dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.328430] other info that might help us debug this: [ 806.329539] Possible unsafe locking scenario: [ 806.330549] CPU0 [ 806.330983] ---- [ 806.331416] lock(&adev->dm.dc_lock); [ 806.332086] lock(&adev->dm.dc_lock); [ 806.332738] *** DEADLOCK *** [ 806.333747] May be due to missing lock nesting notation [ 806.334899] 3 locks held by cat/2593: [ 806.335537] #0: ffff888100d3f1b8 (&attr->mutex){+.+.}-{3:3}, at: simple_attr_read+0x4e/0x110 [ 806.337009] #1: ffff888136b1fd78 (&adev->reset_sem){++++}-{3:3}, at: amdgpu_device_lock_adev+0x42/0x94 [amdgpu] [ 806.339018] #2: ffff888136b1cdc8 (&adev->dm.dc_lock){+.+.}-{3:3}, at: dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.340869] stack backtrace: [ 806.341621] CPU: 6 PID: 2593 Comm: cat Tainted: G W OEL 5.11.0-custom #1 [ 806.342921] Hardware name: AMD Celadon-CZN/Celadon-CZN, BIOS WLD0C23N_Weekly_20_12_2 12/23/2020 [ 806.344413] Call Trace: [ 806.344849] dump_stack+0x93/0xbd [ 806.345435] __lock_acquire.cold+0x18a/0x2cf [ 806.346179] lock_acquire+0xca/0x390 [ 806.346807] ? dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.347813] __mutex_lock+0x9b/0x930 [ 806.348454] ? dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.349434] ? amdgpu_device_indirect_rreg+0x58/0x70 [amdgpu] [ 806.350581] ? _raw_spin_unlock_irqrestore+0x47/0x50 [ 806.351437] ? dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.352437] ? rcu_read_lock_sched_held+0x4f/0x80 [ 806.353252] ? rcu_read_lock_sched_held+0x4f/0x80 [ 806.354064] mutex_lock_nested+0x1b/0x20 [ 806.354747] ? mutex_lock_nested+0x1b/0x20 [ 806.355457] dm_suspend+0xb8/0x1d0 [amdgpu] [ 806.356427] ? soc15_common_set_clockgating_state+0x17d/0x19 [amdgpu] [ 806.357736] amdgpu_device_ip_suspend_phase1+0x78/0xd0 [amdgpu] [ 806.360394] amdgpu_device_ip_suspend+0x21/0x70 [amdgpu] [ 806.362926] amdgpu_device_pre_asic_reset+0xb3/0x270 [amdgpu] [ 806.365560] amdgpu_device_gpu_recover.cold+0x679/0x8eb [amdgpu] Signed-off-by: Lang Yu <[email protected]> Acked-by: Christian KÃnig <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-05-19drm/amdgpu: modify system reference clock source for navi+ (V2)Aaron Liu1-0/+15
Starting from Navi+, the rlc reference clock is used for system clock from vbios gfx_info table. It is incorrect to use core_refclk_10khz of vbios smu_info table as system clock. Signed-off-by: Aaron Liu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Acked-by: Huang Rui <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-05-19drm/amdgpu: update sdma golden setting for Navi12Guchun Chen1-0/+4
Current golden setting is out of date. Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Kenneth Feng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-05-19drm/amdgpu: update gc golden setting for Navi12Guchun Chen1-2/+4
Current golden setting is out of date. Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Kenneth Feng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-05-19drm/amdgpu: Fix a use-after-freexinhui pan1-0/+1
looks like we forget to set ttm->sg to NULL. Hit panic below [ 1235.844104] general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b7b4b: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI [ 1235.989074] Call Trace: [ 1235.991751] sg_free_table+0x17/0x20 [ 1235.995667] amdgpu_ttm_backend_unbind.cold+0x4d/0xf7 [amdgpu] [ 1236.002288] amdgpu_ttm_backend_destroy+0x29/0x130 [amdgpu] [ 1236.008464] ttm_tt_destroy+0x1e/0x30 [ttm] [ 1236.013066] ttm_bo_cleanup_memtype_use+0x51/0xa0 [ttm] [ 1236.018783] ttm_bo_release+0x262/0xa50 [ttm] [ 1236.023547] ttm_bo_put+0x82/0xd0 [ttm] [ 1236.027766] amdgpu_bo_unref+0x26/0x50 [amdgpu] [ 1236.032809] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x7aa/0xd90 [amdgpu] [ 1236.040400] kfd_ioctl_alloc_memory_of_gpu+0xe2/0x330 [amdgpu] [ 1236.046912] kfd_ioctl+0x463/0x690 [amdgpu] Signed-off-by: xinhui pan <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-05-19drm/amdgpu/display: restore the backlight on modeset (v2)Alex Deucher1-0/+6
To stay consistent with the user's setting. v2: rebase on multi-eDP support Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1337 Reviewed-by: Harry Wentland <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-05-19drm/amdgpu/display: add helper functions to get/set backlight (v2)Alex Deucher2-11/+38
And cache the value. These can be used by the backlight callbacks and modesetting functions. v2: rebase on latest backlight changes. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1337 Reviewed-by: Harry Wentland <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-05-19drm/amdgpu: Query correct register for DF hashing on AldebaranMukul Joshi2-4/+8
For Aldebaran, driver needs to query DramMegaBaseAddress to check if DF hashing is enabled. Signed-off-by: Mukul Joshi <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Harish Kasiviswanathan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-05-19drm/amdgpu: add video_codecs query support for aldebaranJames Zhu1-0/+1
Add video_codecs query support for aldebaran. Signed-off-by: James Zhu <[email protected]> Reviewed-by: Leo Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-05-19drm/amdkfd: fix a resource leakage issueDennis Li1-0/+2
The function kfd_lookup_process_by_pasid will increase the reference count of kfd_process object, its caller should call kfd_unref_process to decrease the reference count. Otherwise resource leakage will happen. Signed-off-by: Dennis Li <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-05-19drm/amdgpu: Move dmabuf attach/detach to backend_(un)bindFelix Kuehling2-29/+25
The dmabuf attachment should be updated by moving the SG BO to DOMAIN_CPU and back to DOMAIN_GTT. This does not necessarily invoke the populate/unpopulate callbacks. Do this in backend_bind/unbind instead. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Acked-by: Oak Zeng <[email protected]> Acked-by: Ramesh Errabolu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-05-19drm/amdgpu: Add DMA mapping of GTT BOsFelix Kuehling2-1/+77
Use DMABufs with dynamic attachment to DMA-map GTT BOs on other GPUs. Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Oak Zeng <[email protected]> Acked-by: Ramesh Errabolu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-05-19drm/amdgpu: Move kfd_mem_attach outside reservationFelix Kuehling1-31/+44
This is needed to avoid deadlocks with DMA buf import in the next patch. Also move PT/PD validation out of kfd_mem_attach, that way the caller can bo this unconditionally. Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Oak Zeng <[email protected]> Acked-by: Ramesh Errabolu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-05-19drm/amdgpu: DMA map/unmap when updating GPU mappingsFelix Kuehling1-27/+29
DMA map kfd_mem_attachments in update_gpuvm_pte. This function is called with the BO and page tables reserved, so we can safely update the DMA mapping. DMA unmap when a BO is unmapped from a GPU and before updating mappings in restore workers. Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Oak Zeng <[email protected]> Acked-by: Ramesh Errabolu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-05-19drm/amdgpu: Add multi-GPU DMA mapping helpersFelix Kuehling2-9/+148
Add BO-type specific helpers functions to DMA-map and unmap kfd_mem_attachments. Implement this functionality for userptrs by creating one SG BO per GPU and filling it with a DMA mapping of the pages from the original mem->bo. Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Oak Zeng <[email protected]> Acked-by: Ramesh Errabolu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-05-19drm/amdgpu: Simplify AQL queue mappingFelix Kuehling1-55/+48
Do AQL queue double-mapping with a single attach call. That will make it easier to create per-GPU BOs later, to be shared between the two BO VA mappings on the same GPU. Freeing the attachments is not necessary if map_to_gpu fails. These will be cleaned up when the kdg_mem object is destroyed in amdgpu_amdkfd_gpuvm_free_memory_of_gpu. Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Oak Zeng <[email protected]> Acked-by: Ramesh Errabolu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-05-19drm/amdgpu: Keep a bo-reference per-attachmentFelix Kuehling1-5/+17
For now they all reference the same BO. For correct DMA mappings they will refer to different BOs per-GPU. Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Oak Zeng <[email protected]> Acked-by: Ramesh Errabolu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>