aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu
AgeCommit message (Collapse)AuthorFilesLines
2023-10-04drm/amdgpu: Fix a memory leakLuben Tuikov1-0/+1
Fix a memory leak in amdgpu_fru_get_product_info(). Cc: Alex Deucher <[email protected]> Reported-by: Yang Wang <[email protected]> Fixes: 0dbf2c562625 ("drm/amdgpu: Interpret IPMI data for product information (v2)") Signed-off-by: Luben Tuikov <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-04drm/amdgpu: ratelimited override pte flags messagesPhilip Yang2-16/+10
Use ratelimited version of dev_dbg to avoid flooding dmesg log. No functional change. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-04drm/amd: Drop all hand-built MIN and MAX macros in the amdgpu base driverMario Limonciello1-2/+0
Several files declare MIN() or MAX() macros that ignore the types of the values being compared. Drop these macros and switch to min() min_t(), and max() from `linux/minmax.h`. Suggested-by: Hamza Mahfooz <[email protected]> Signed-off-by: Mario Limonciello <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-04drm/amdgpu: cache gpuvm fault information for gmc7+Alex Deucher5-4/+19
Cache the current fault info in the vm struct. This can be queried by userspace later to help debug UMDs. Cc: [email protected] Reviewed-by: Christian König <[email protected]> Acked-by: Guchun Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-04drm/amdgpu: add cached GPU fault structure to vm structAlex Deucher2-0/+49
When we get a GPU page fault, cache the fault for later analysis. Cc: [email protected] Reviewed-by: Christian König <[email protected]> Acked-by: Guchun Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-04drm/amdgpu: Use ttm_pages_limit to override vram reportingRajneesh Bhardwaj2-10/+16
On GFXIP9.4.3 APU, allow the memory reporting as per the ttm pages limit in NPS1 mode. Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Rajneesh Bhardwaj <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-04drm/amdgpu: Rework KFD memory max limitsRajneesh Bhardwaj1-2/+8
To allow bigger allocations specially on systems such as GFXIP 9.4.3 that use GTT memory for VRAM allocations, relax the limits to maximize ROCm allocations. Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Rajneesh Bhardwaj <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-04drm/amdgpu/gmc11: set gart placement GC11Alex Deucher1-1/+1
Needed to avoid a hardware issue. v2: force high for all GC11 parts for consistency (Alex) v3: rebase Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-04drm/amdgpu/gmc: add a way to force a particular placement for GARTAlex Deucher8-12/+31
We normally place GART based on the location of VRAM and the available address space around that, but provide an option to force a particular location for hardware that needs it. v2: Switch to passing the placement via parameter Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-04drm/amdgpu: correct gpu clock counter query on cyan skilfishLang Yu1-0/+21
Cayn skilfish uses SMUIO v11.0.8 offset. Signed-off-by: Lang Yu <[email protected]> Reviewed-by: Aaron Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: <[email protected]> # v5.15+
2023-10-03drm/amd: Fix detection of _PR3 on the PCIe root portMario Limonciello1-1/+1
On some systems with Navi3x dGPU will attempt to use BACO for runtime PM but fails to resume properly. This is because on these systems the root port goes into D3cold which is incompatible with BACO. This happens because in this case dGPU is connected to a bridge between root port which causes BOCO detection logic to fail. Fix the intent of the logic by looking at root port, not the immediate upstream bridge for _PR3. Cc: [email protected] Suggested-by: Jun Ma <[email protected]> Tested-by: David Perry <[email protected]> Fixes: b10c1c5b3a4e ("drm/amdgpu: add check for ACPI power resources") Signed-off-by: Mario Limonciello <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-03drm/amd/display: Add writeback enable field (wb_enabled)Alex Hung1-0/+1
[WHAT] Add a new field to keep track whether a crtc is previously writeback-enabled. Reviewed-by: Harry Wentland <[email protected]> Signed-off-by: Alex Hung <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-03drm/amd/display: Hande writeback request from userspaceAlex Hung1-0/+3
[WHAT] Handle writeback requests and fill in the required information for DWB programming and setup. Reviewed-by: Harry Wentland <[email protected]> Signed-off-by: Alex Hung <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-10-03drm/amdkfd: drop struct kfd_cu_infoAlex Deucher2-24/+0
I think this was an abstraction back from when kfd supported both radeon and amdgpu. Since we just support amdgpu now, there is no more need for this and we can use the amdgpu structures directly. This also avoids having the kfd_cu_info structures on the stack when inlining which can blow up the stack. Cc: Arnd Bergmann <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-29Merge tag 'drm-misc-next-2023-09-27' of ↵Dave Airlie2-56/+8
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v6.7-rc1: UAPI Changes: - drm_file owner is now updated during use, in the case of a drm fd opened by the display server for a client, the correct owner is displayed. - Qaic gains support for the QAIC_DETACH_SLICE_BO ioctl to allow bo recycling. Cross-subsystem Changes: - Disable boot logo for au1200fb, mmpfb and unexport logo helpers. Only fbcon should manage display of logo. - Update freescale in MAINTAINERS. - Add some bridge files to bridge in MAINTAINERS. - Update gma500 driver repo in MAINTAINERS to point to drm-misc. Core Changes: - Move size computations to drm buddy allocator. - Make drm_atomic_helper_shutdown(NULL) a nop. - Assorted small fixes in drm_debugfs, DP-MST payload addition error handling. - Fix DRM_BRIDGE_ATTACH_NO_CONNECTOR handling. - Handle bad (h/v)sync_end in EDID by clipping to htotal. - Build GPUVM as a module. Driver Changes: - Simple drivers don't need to cache prepared result. - Call drm_atomic_helper_shutdown() in shutdown/unbind for a whole lot more drm drivers. - Assorted small fixes in amdgpu, ssd130x, bridge/it6621, accel/qaic, nouveau, tc358768. - Add NV12 for komeda writeback. - Add arbitration lost event to synopsis/dw-hdmi-cec. - Speed up s/r in nouveau by not restoring some big bo's. - Assorted nouveau display rework in preparation for GSP-RM, especially related to how the modeset sequence works and the DP sequence in relation to link training. - Update anx7816 panel. - Support NVSYNC and NHSYNC in tegra. - Allow multiple power domains in simple driver. Signed-off-by: Dave Airlie <[email protected]> From: Maarten Lankhorst <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-09-28drm/amdgpu: Use pci_get_base_class() to reduce duplicated codeSui Jingfeng2-22/+9
Use pci_get_base_class() to reduce duplicated code. No functional change intended. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sui Jingfeng <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Cc: Alex Deucher <[email protected]>
2023-09-28drm/amdgpu: update retry times for psp vmbx waitTao Zhou1-1/+4
Increase the retry loops and replace the constant number with macro. Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-28drm/amdgpu: exit directly if gpu reset failsTao Zhou1-1/+1
No need to perform the full reset operation in case of gpu reset failure. Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-28drm/amd: Move microcode init from sw_init to early_init for CIK SDMAMario Limonciello1-4/+5
As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-28drm/amd: Move microcode init from sw_init to early_init for SDMA v2.4Mario Limonciello1-4/+5
As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-28drm/amd: Move microcode init from sw_init to early_init for SDMA v3.0Mario Limonciello1-4/+5
As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-28drm/amd: Move microcode init from sw_init to early_init for SDMA v5.2Mario Limonciello1-4/+5
As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-28drm/amd: Move microcode init from sw_init to early_init for SDMA v6.0Mario Limonciello1-4/+5
As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-28drm/amd: Move microcode init from sw_init to early_init for SDMA v5.0Mario Limonciello1-4/+5
As part of IP discovery early_init is run for all HW IP blocks. During this phase all firmware is supposed to be identified that may be missing so that the driver can avoid releasing resources used by the EFI framebuffer or simpledrm until the last possible moment. Move microcode loading from sw_init to early_init. Signed-off-by: Mario Limonciello <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-28drm/amd: Drop error message about failing to load SDMA firmwareMario Limonciello8-24/+8
The error path for SDMA firmware loading is unnecessarily noisy. When a firmware is missing 3 errors show up: ``` amdgpu 0000:07:00.0: Direct firmware load for amdgpu/green_sardine_sdma.bin failed with error -2 [drm:sdma_v4_0_early_init [amdgpu]] *ERROR* Failed to load sdma firmware! [drm:amdgpu_device_init [amdgpu]] *ERROR* early_init of IP block <sdma_v4_0> failed -19 ``` The error code for the device init is bubbled up already, remove the second one. Signed-off-by: Mario Limonciello <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-28drm/amd/pm: deprecate allow_xgmi_power_down interfaceLe Ma1-2/+2
Replace with set_plpd_mode uniformly for places to use. Signed-off-by: Le Ma <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-28drm/amd: Limit seamless boot by default to APUsMario Limonciello1-0/+3
A hang is reported on DCN 3.2 with seamless boot enabled. As the benefits come from an eDP setup, limit it to only enabled by default with APUs. Suggested-by: [email protected] Reported-by: [email protected] Closes: https://lore.kernel.org/amd-gfx/[email protected]/T/#m2887e919d7c01b2a4860d2261b366d22e070f309 Signed-off-by: Mario Limonciello <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu/gmc11: disable AGP on GC 11.5Alex Deucher1-1/+2
AGP aperture is deprecated and no longer functional. v2: fix typo (Alex) v3: just skip the agp setup call v4: revert back to the original model v5: back to v3 Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu: not to save bo in the case of RAS err_event_athubDavid (Ming Qiang) Wu1-0/+7
err_event_athub will corrupt VCPU buffer and not good to be restored in amdgpu_vcn_resume() and in this case the VCPU buffer needs to be cleared for VCN firmware to work properly. Acked-by: Leo Liu <[email protected]> Signed-off-by: David (Ming Qiang) Wu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu: Fix a memory leakLuben Tuikov1-0/+1
Fix a memory leak in amdgpu_fru_get_product_info(). Cc: Alex Deucher <[email protected]> Reported-by: Yang Wang <[email protected]> Fixes: 0dbf2c562625 ("drm/amdgpu: Interpret IPMI data for product information (v2)") Signed-off-by: Luben Tuikov <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu/gmc: set a default disable value for AGPAlex Deucher9-17/+36
To disable AGP, the start needs to be set to a higher value than the end. Set a default disable value for the AGP aperture and allow the IP specific GMC code to enable it selectively be calling amdgpu_gmc_agp_location(). Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu/gmc6-8: properly disable the AGP apertureAlex Deucher3-3/+3
The BOT register needs to be larger than the TOP register for this to be properly disabled. The lower 22 bits of the BOT address are always 0 and the lower 22 bits of the TOP register are always 1 so you need to make the upper bits of BOT larger than the upper bits of BOT. Reviewed-by: Christian König <[email protected]> Reviewed-by: Yang Wang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu:Expose physical id of device in XGMI hiveMangesh Gadre1-0/+20
This identifies the physical ordering of devices in the hive v2: fix compilation issue Signed-off-by: Mangesh Gadre <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu/vpe: fix truncation warningsLang Yu1-4/+1
Fix truncation warnings. Fixes: 9d4346bdbc64 ("drm/amdgpu: add VPE 6.1.0 support") Signed-off-by: Lang Yu <[email protected]> Reviewed-by: Yifan Zhang <[email protected]> Reported-by: kernel test robot <[email protected]> Link: https://lore.kernel.org/oe-kbuild-all/[email protected] Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdkfd: Move dma unmapping after TLB flushPhilip Yang2-4/+23
Otherwise GPU may access the stale mapping and generate IOMMU IO_PAGE_FAULT. Move this to inside p->mutex to prevent multiple threads mapping and unmapping concurrently race condition. After kfd_mem_dmaunmap_attachment is removed from unmap_bo_from_gpuvm, kfd_mem_dmaunmap_attachment is called if failed to map to GPUs, and before free the mem attachment in case failed to unmap from GPUs. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu: further move TLB hw workarounds a layer upChristian König2-51/+42
For the PASID flushing we already handled that at a higher layer, apply those workarounds to the standard flush as well. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu: rework lock handling for flush_tlb v2Christian König5-20/+11
Instead of each implementation doing this more or less correctly move taking the reset lock at a higher level. v2: fix typo Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu: drop error return from flush_gpu_tlb_pasidChristian König7-29/+24
That function never fails, drop the error return. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu: fix and cleanup gmc_v11_0_flush_gpu_tlb_pasidChristian König1-44/+19
The same PASID can be used by more than one VMID, reset each of them. Use the common KIQ handling. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu: cleanup gmc_v10_0_flush_gpu_tlb_pasidChristian König1-47/+19
The same PASID can be used by more than one VMID, reset each of them. Use the common KIQ handling. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu: fix and cleanup gmc_v9_0_flush_gpu_tlb_pasidChristian König3-79/+102
Testing for reset is pointless since the reset can start right after the test. The same PASID can be used by more than one VMID, invalidate each of them. Move the KIQ and all the workaround handling into common GMC code. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu: fix and cleanup gmc_v8_0_flush_gpu_tlb_pasidChristian König1-10/+10
Testing for reset is pointless since the reset can start right after the test. Grab the reset semaphore instead. The same PASID can be used by more than once VMID, build a mask of VMIDs to invalidate instead of just restting the first one. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Shashank Sharma <[email protected]> Acked-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu: fix and cleanup gmc_v7_0_flush_gpu_tlb_pasidChristian König1-9/+10
Testing for reset is pointless since the reset can start right after the test. Grab the reset semaphore instead. The same PASID can be used by more than once VMID, build a mask of VMIDs to invalidate instead of just restting the first one. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Shashank Sharma <[email protected]> Acked-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu: cleanup gmc_v11_0_flush_gpu_tlbChristian König2-69/+43
Remove leftovers from copying this from the gmc v10 code. v2: squash in fix from Yifan Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu: rework gmc_v10_0_flush_gpu_tlb v2Christian König5-119/+99
Move the SDMA workaround necessary for Navi 1x into a higher layer. v2: use dev_err Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu: change if condition for bad channel bitmap updateTao Zhou1-2/+4
The amdgpu_ras_eeprom_control.bad_channel_bitmap is u32 type, but the channel index could be larger than 32. For the ASICs whose channel number is more than 32, the amdgpu_dpm_send_hbm_bad_channel_flag interface is not supported, so we simply bypass channel bitmap update under this condition. v2: replace sizeof with BITS_PER_TYPE, we should check bit number instead of byte number. Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu: fix value of some UMC parameters for UMC v12Tao Zhou2-1/+5
Prepare for bad page retirement. Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdkfd: Don't use sw fault filter if retry cam enabledPhilip Yang1-1/+4
If retry cam enabled, we don't use sw retry fault filter and add fault into sw filter ring, so we shouldn't remove fault from sw filter. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu: fix and cleanup gmc_v9_0_flush_gpu_tlbChristian König1-11/+18
The KIQ code path was ignoring the second flush. Also avoid long lines and re-calculating the register offsets over and over again. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-09-26drm/amdgpu: Increase IH soft ring size for GFX v9.4.3 dGPUPhilip Yang1-1/+1
On GFX v9.4.3 dGPU, applications have random timeout failure when XNACK on, dmesg log has "amdgpu: IH soft ring buffer overflow 0x900, 0x900", because dGPU mode has 272 cam entries. After increasing IH soft ring to 512 entries, no more IH soft ring overflow message and application passed. Fixes: bf80d34b6c58 ("drm/amdgpu: Increase soft IH ring size") Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>