aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
AgeCommit message (Collapse)AuthorFilesLines
2021-11-09drm/amd/amdgpu: fix the kfd pre_reset sequence in sriovshaoyunl1-4/+1
The KFD pre_reset should be called before reset been executed, it will hold the lock to prevent other rocm process to sent the packlage to hiq during host execute the real reset on the HW Signed-off-by: shaoyunl <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-11-05drm/amdgpu: fix SI handling in amdgpu_device_asic_has_dc_support()Alex Deucher1-1/+11
Properly handle SI DC support when CONFIG_DRM_AMD_DC_SI is not set. Fixes: f7f12b25823c0d ("drm/amdgpu: default to true in amdgpu_device_asic_has_dc_support") Reviewed-by: Evan Quan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-11-03drm/amdgpu: remove duplicated kfd_resume_iommuJames Zhu1-4/+0
Remove duplicated kfd_resume_iommu which already runs in mdgpu_amdkfd_device_init. Tested-By: Ken Moffat <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: James Zhu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-11-03drm/amd/amdgpu: fix bad job hw_fence use after free in advance tdrJingwen Chen1-0/+4
[Why] In advance tdr mode, the real bad job will be resubmitted twice, while in drm_sched_resubmit_jobs_ext, there's a dma_fence_put, so the bad job is put one more time than other jobs. [How] Adding dma_fence_get before resbumit job in amdgpu_device_recheck_guilty_jobs and put the fence for normal jobs Signed-off-by: Jingwen Chen <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-28drm/amdgpu: fix a potential memory leak in amdgpu_device_fini_sw()Lang Yu1-1/+1
amdgpu_fence_driver_sw_fini() should be executed before amdgpu_device_ip_fini(), otherwise fence driver resource won't be properly freed as adev->rings have been tore down. Fixes: 72c8c97b1522 ("drm/amdgpu: Split amdgpu_device_fini into early and late") Signed-off-by: Lang Yu <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-28BackMerge tag 'v5.15-rc7' into drm-nextDave Airlie1-0/+4
The msm next tree is based on rc3, so let's just backmerge rc7 before pulling it in. Signed-off-by: Dave Airlie <[email protected]>
2021-10-19drm/amd/amdgpu: Do irq_fini_hw after ip_fini_earlyYuBiao Wang1-2/+2
[Why] drm_irq_uninstall is called in irq_fini_hw so that irq is disabled in sw stage. SMU (and maybe other IP blocks) fini_hw will call irq_put for cleanup and the whole cleanup process will be skipped because of drm->irq_enable = false. [How] Move ip_fini_early before irq_fini_hw. Signed-off-by: YuBiao Wang <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-13drm/amdkfd: fix boot failure when iommu is disabled in Picasso.Yifan Zhang1-4/+0
When IOMMU disabled in sbios and kfd in iommuv2 path, iommuv2 init will fail. But this failure should not block amdgpu driver init. Reported-by: youling <[email protected]> Tested-by: youling <[email protected]> Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: James Zhu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-08drm/amdgpu: use adev_to_drm for consistency when accessing drm_deviceGuchun Chen1-1/+1
adev_to_drm is used everywhere, so improve recent changes when accessing drm_device pointer from amdgpu_device. Signed-off-by: Guchun Chen <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-07drm/amdgpu: unify BO evicting method in amdgpu_ttmNirmoy Das1-6/+24
Unify BO evicting functionality for possible memory types in amdgpu_ttm.c. Signed-off-by: Nirmoy Das <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-05drm/amdgpu: handle the case of pci_channel_io_frozen only in amdgpu_pci_resumeGuchun Chen1-0/+6
In current code, when a PCI error state pci_channel_io_normal is detectd, it will report PCI_ERS_RESULT_CAN_RECOVER status to PCI driver, and PCI driver will continue the execution of PCI resume callback report_resume by pci_walk_bridge, and the callback will go into amdgpu_pci_resume finally, where write lock is releasd unconditionally without acquiring such lock first. In this case, a deadlock will happen when other threads start to acquire the read lock. To fix this, add a member in amdgpu_device strucutre to cache pci_channel_state, and only continue the execution in amdgpu_pci_resume when it's pci_channel_io_frozen. Fixes: c9a6b82f45e2 ("drm/amdgpu: Implement DPC recovery") Suggested-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-05drm/amdgpu: init iommu after amdkfd device initYifan Zhang1-4/+4
This patch is to fix clinfo failure in Raven/Picasso: Number of platforms: 1 Platform Profile: FULL_PROFILE Platform Version: OpenCL 2.2 AMD-APP (3364.0) Platform Name: AMD Accelerated Parallel Processing Platform Vendor: Advanced Micro Devices, Inc. Platform Extensions: cl_khr_icd cl_amd_event_callback Platform Name: AMD Accelerated Parallel Processing Number of devices: 0 Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: James Zhu <[email protected]> Tested-by: James Zhu <[email protected]> Acked-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-05drm/amdgpu: handle the case of pci_channel_io_frozen only in amdgpu_pci_resumeGuchun Chen1-0/+6
In current code, when a PCI error state pci_channel_io_normal is detectd, it will report PCI_ERS_RESULT_CAN_RECOVER status to PCI driver, and PCI driver will continue the execution of PCI resume callback report_resume by pci_walk_bridge, and the callback will go into amdgpu_pci_resume finally, where write lock is releasd unconditionally without acquiring such lock first. In this case, a deadlock will happen when other threads start to acquire the read lock. To fix this, add a member in amdgpu_device strucutre to cache pci_channel_state, and only continue the execution in amdgpu_pci_resume when it's pci_channel_io_frozen. Fixes: c9a6b82f45e2 ("drm/amdgpu: Implement DPC recovery") Suggested-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-05drm/amdgpu: print warning and taint kernel if lockup timeout is disabledChristian König1-0/+2
Make sure that we notice this in error reports. Signed-off-by: Christian König <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-05drm/amdgpu: revert "Add autodump debugfs node for gpu reset v8"Christian König1-4/+0
This reverts commit 728e7e0cd61899208e924472b9e641dbeb0775c4. Further discussion reveals that this feature is severely broken and needs to be reverted ASAP. GPU reset can never be delayed by userspace even for debugging or otherwise we can run into in kernel deadlocks. Signed-off-by: Christian König <[email protected]> Acked-by: Alex Deucher <[email protected]> Acked-by: Nirmoy Das <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-05drm/amdgpu: init iommu after amdkfd device initYifan Zhang1-4/+4
This patch is to fix clinfo failure in Raven/Picasso: Number of platforms: 1 Platform Profile: FULL_PROFILE Platform Version: OpenCL 2.2 AMD-APP (3364.0) Platform Name: AMD Accelerated Parallel Processing Platform Vendor: Advanced Micro Devices, Inc. Platform Extensions: cl_khr_icd cl_amd_event_callback Platform Name: AMD Accelerated Parallel Processing Number of devices: 0 Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: James Zhu <[email protected]> Tested-by: James Zhu <[email protected]> Acked-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-04drm/amdgpu: add new asic_type for IP discoveryAlex Deucher1-0/+1
Add a new asic type for asics where we don't have an explicit entry in the PCI ID list. We don't need an asic type for these asics, other than something higher than the existing ones, so just use this for all new asics. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-04drm/amdgpu: default to true in amdgpu_device_asic_has_dc_supportAlex Deucher1-1/+3
We are not going to support any new chips with the old non-DC code so make it the default. Reviewed-by: Harry Wentland <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-04drm/amdgpu: drive all vega asics from the IP discovery tableAlex Deucher1-16/+0
Rather than hardcoding based on asic_type, use the IP discovery table to configure the driver. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-04drm/amdgpu: drive all navi asics from the IP discovery tableAlex Deucher1-20/+0
Rather than hardcoding based on asic_type, use the IP discovery table to configure the driver. v2: rebase Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-04drm/amdgpu: drive nav10 from the IP discovery tableAlex Deucher1-1/+0
Rather than hardcoding based on asic_type, use the IP discovery table to configure the driver. Only tested on Navi10 so far. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-04drm/amdgpu: Use IP discovery to drive setting IP blocks by defaultAlex Deucher1-2/+4
Drive the asic setup from the IP discovery table rather than hardcoded settings based on asic type. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-04drm/amd/display: add cyan_skillfish display supportZhan Liu1-0/+1
[Why] add display related cyan_skillfish files in. makefile controlled by CONFIG_DRM_AMD_DC_DCN201 flag. v2: squash in clang fixes from Harry, Nathan v3: squash in missing CONFIG_DRM_AMD_DC check (Alex) Signed-off-by: Charlene Liu <[email protected]> Signed-off-by: Zhan Liu <[email protected]> Reviewed-by: Charlene Liu <[email protected]> Acked-by: Jun Lei <[email protected]> Acked-by: Harry Wentland <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-09-29drm/amdgpu: drm/amdgpu: Handle IOMMU enabled caseAndrey Grodzovsky1-0/+2
Handle all DMA IOMMU group related dependencies before the group is removed and we try to access it after free. v2: Move the actul handling function to TTM Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-09-23drm/amdgpu: move amdgpu_virt_release_full_gpu to fini_early stageGuchun Chen1-4/+5
adev->rmmio is set to be NULL in amdgpu_device_unmap_mmio to prevent access after pci_remove, however, in SRIOV case, amdgpu_virt_release_full_gpu will still use adev->rmmio for access after amdgpu_device_unmap_mmio. The patch is to move such SRIOV calling earlier to fini_early stage. Fixes: 07775fc13878 ("drm/amdgpu: Unmap all MMIO mappings") Cc: Andrey Grodzovsky <[email protected]> Signed-off-by: Leslie Shi <[email protected]> Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-09-16drm/amdgpu: move iommu_resume before ip init/resumeJames Zhu1-0/+12
Separate iommu_resume from kfd_resume, and move it before other amdgpu ip init/resume. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277 Signed-off-by: James Zhu <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2021-09-14drm/amdgpu: move iommu_resume before ip init/resumeJames Zhu1-0/+12
Separate iommu_resume from kfd_resume, and move it before other amdgpu ip init/resume. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277 Signed-off-by: James Zhu <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-09-14drm/amdgpu: Get atomicOps info from Host for sriov setupshaoyunl1-11/+13
The AtomicOp Requester Enable bit is reserved in VFs and the PF value applies to all associated VFs. so guest driver can not directly enable the atomicOps for VF, it depends on PF to enable it. In current design, amdgpu driver will get the enabled atomicOps bits through private pf2vf data Signed-off-by: shaoyunl <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-20drm/amdgpu: Cancel delayed work when GFXOFF is disabledMichel Dänzer1-6/+5
schedule_delayed_work does not push back the work if it was already scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms after the first time GFXOFF was disabled and re-enabled, even if GFXOFF was disabled and re-enabled again during those 100 ms. This resulted in frame drops / stutter with the upcoming mutter 41 release on Navi 14, due to constantly enabling GFXOFF in the HW and disabling it again (for getting the GPU clock counter). To fix this, call cancel_delayed_work_sync when the disable count transitions from 0 to 1, and only schedule the delayed work on the reverse transition, not if the disable count was already 0. This makes sure the delayed work doesn't run at unexpected times, and allows it to be lock-free. v2: * Use cancel_delayed_work_sync & mutex_trylock instead of mod_delayed_work. v3: * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König) v4: * Fix race condition between amdgpu_gfx_off_ctrl incrementing adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off checking for it to be 0 (Evan Quan) Cc: [email protected] Reviewed-by: Evan Quan <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> # v3 Acked-by: Christian König <[email protected]> # v3 Signed-off-by: Michel Dänzer <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-18drm/amd/amdgpu:flush ttm delayed work before cancel_syncYuBiao Wang1-1/+3
[Why] In some cases when we unload driver, warning call trace will show up in vram_mgr_fini which claims that LRU is not empty, caused by the ttm bo inside delay deleted queue. [How] We should flush delayed work to make sure the delay deleting is done. Signed-off-by: YuBiao Wang <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/amd/amdgpu embed hw_fence into amdgpu_jobJack Zhang1-1/+12
Why: Previously hw fence is alloced separately with job. It caused historical lifetime issues and corner cases. The ideal situation is to take fence to manage both job and fence's lifetime, and simplify the design of gpu-scheduler. How: We propose to embed hw_fence into amdgpu_job. 1. We cover the normal job submission by this method. 2. For ib_test, and submit without a parent job keep the legacy way to create a hw fence separately. v2: use AMDGPU_FENCE_FLAG_EMBED_IN_JOB_BIT to show that the fence is embedded in a job. v3: remove redundant variable ring in amdgpu_job v4: add tdr sequence support for this feature. Add a job_run_counter to indicate whether this job is a resubmit job. v5 add missing handling in amdgpu_fence_enable_signaling Signed-off-by: Jingwen Chen <[email protected]> Signed-off-by: Jack Zhang <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Reviewed by: Monk Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-09drm/amd/amdgpu: skip locking delayed work if not initialized.YuBiao Wang1-1/+2
When init failed in early init stage, amdgpu_object has not been initialized, so hasn't the ttm delayed queue functions. Signed-off-by: YuBiao Wang <[email protected]> Reviewed-by: Emily.Deng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-05drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)Guchun Chen1-3/+2
In amdgpu_fence_driver_hw_fini, no need to call drm_sched_fini to stop scheduler in s3 test, otherwise, fence related failure will arrive after resume. To fix this and for a better clean up, move drm_sched_fini from fence_hw_fini to fence_sw_fini, as it's part of driver shutdown, and should never be called in hw_fini. v2: rename amdgpu_fence_driver_init to amdgpu_fence_driver_sw_init, to keep sw_init and sw_fini paired. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1668 Fixes: 8d35a2596164c1 ("drm/amdgpu: adjust fence driver enable sequence") Suggested-by: Christian König <[email protected]> Tested-by: Mike Lothian <[email protected]> Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-07-30Merge tag 'amd-drm-next-5.15-2021-07-29' of ↵Dave Airlie1-44/+97
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.15-2021-07-29: amdgpu: - VCN/JPEG power down sequencing fixes - Various navi pcie link handling fixes - Clockgating fixes - Yellow Carp fixes - Beige Goby fixes - Misc code cleanups - S0ix fixes - SMU i2c bus rework - EEPROM handling rework - PSP ucode handling cleanup - SMU error handling rework - AMD HDMI freesync fixes - USB PD firmware update rework - MMIO based vram access rework - Misc display fixes - Backlight fixes - Add initial Cyan Skillfish support - Overclocking fixes suspend/resume amdkfd: - Sysfs leak fix - Add counters for vm faults and migration - GPUVM TLB optimizations radeon: - Misc fixes Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-07-28drm/amdgpu: adjust fence driver enable sequenceLikun Gao1-4/+6
Fence driver was enabled per ring when sw init on per IP block before. Change to enable all the fence driver at the same time after amdgpu_device_ip_init finished. Rename some function related to fence to make it reasonable for read. Signed-off-by: Likun Gao <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-07-23drm/amdgpu: Clear doorbell interrupt status for Sienna CichlidChengzhe Liu1-0/+4
On Sienna Cichlid, in pass-through mode, if we unload the driver in BACO mode(RTPM), then the kernel would receive thousands of interrupts. That's because there is doorbell monitor interrupt on BIF, so KVM keeps injecting interrupts to the guest VM. So we should clear the doorbell interrupt status after BACO exit. v2: Modify coding style and commit message Signed-off-by: Chengzhe Liu <[email protected]> Reviewed-by: Luben Tuikov <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-07-23drm/amdgpu: init family name for cyan_skillfishTao Zhou1-0/+1
Use FAMILY_NV for cyan_skillfish. Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-07-23drm/amdgpu: add cyan_skillfish asic typeTao Zhou1-0/+5
Add cyan_skillfish asic family. Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-07-23drm/amd/amdgpu: consider kernel job always not guiltyJingwen Chen1-3/+3
[Why] Currently all timedout job will be considered to be guilty. In SRIOV multi-vf use case, the vf flr happens first and then job time out is found. There can be several jobs timeout during a very small time slice. And if the innocent sdma job time out is found before the real bad job, then the innocent sdma job will be set to guilty. This will lead to a page fault after resubmitting job. [How] If the job is a kernel job, we will always consider it not guilty Reviewed-by: Christian König <[email protected]> Signed-off-by: Jingwen Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-07-23drm/amdgpu: Change the imprecise function nameRoy Sun1-1/+1
The callback functions are used for SRIOV read/write instead of just for rlcg read/write Signed-off-by: Roy Sun <[email protected]> Reviewed-by: Zhou pengju <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-07-23Merge tag 'drm-misc-next-2021-07-22' of ↵Dave Airlie1-5/+6
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v5.15-rc1: UAPI Changes: - Remove sysfs stats for dma-buf attachments, as it causes a performance regression. Previous merge is not in a rc kernel yet, so no userspace regression possible. Cross-subsystem Changes: - Sanitize user input in kyro's viewport ioctl. - Use refcount_t in fb_info->count - Assorted fixes to dma-buf. - Extend x86 efifb handling to all archs. - Fix neofb divide by 0. - Document corpro,gm7123 bridge dt bindings. Core Changes: - Slightly rework drm master handling. - Cleanup vgaarb handling. - Assorted fixes. Driver Changes: - Add support for ws2401 panel. - Assorted fixes to stm, ast, bochs. - Demidlayer ingenic irq. Signed-off-by: Dave Airlie <[email protected]> From: Maarten Lankhorst <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-07-21vgaarb: don't pass a cookie to vga_client_registerChristoph Hellwig1-4/+5
The VGA arbitration is entirely based on pci_dev structures, so just pass that back to the set_vga_decode callback. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Acked-by: Christian König <[email protected]> Signed-off-by: Christian König <[email protected]>
2021-07-21vgaarb: remove the unused irq_set_state argument to vga_client_registerChristoph Hellwig1-1/+1
All callers pass NULL as the irq_set_state argument, so remove it and the ->irq_set_state member in struct vga_device. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Acked-by: Christian König <[email protected]> Signed-off-by: Christian König <[email protected]>
2021-07-21vgaarb: provide a vga_client_unregister wrapperChristoph Hellwig1-1/+1
Add a trivial wrapper for the unregister case that sets all fields to NULL. Signed-off-by: Christoph Hellwig <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Acked-by: Christian König <[email protected]> Signed-off-by: Christian König <[email protected]>
2021-07-16drm/amdgpu: split amdgpu_device_access_vram() into two small partsKevin Wang1-30/+75
split amdgpu_device_access_vram() 1. amdgpu_device_mm_access(): using MM_INDEX/MM_DATA to access vram 2. amdgpu_device_aper_access(): using vram aperature to access vram (option) Signed-off-by: Kevin Wang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-07-01drm/amdgpu: Fix resource leak on probe error pathJiri Kosina1-6/+2
This reverts commit 4192f7b5768912ceda82be2f83c87ea7181f9980. It is not true (as stated in the reverted commit changelog) that we never unmap the BAR on failure; it actually does happen properly on amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() -> amdgpu_device_fini() error path. What's worse, this commit actually completely breaks resource freeing on probe failure (like e.g. failure to load microcode), as amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too early, leaving all the resources that'd normally be freed in amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading to all sorts of oopses when someone tries to, for example, access the sysfs and procfs resources which are still around while the driver is gone. Fixes: 4192f7b57689 ("drm/amdgpu: unmap register bar on device init failure") Reported-by: Vojtech Pavlik <[email protected]> Signed-off-by: Jiri Kosina <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-07-01drm/amdgpu: move apu flags initialization to the start of device initHuang Rui1-0/+36
In some asics, we need to adjust the behavior according to the apu flags at very early stage. Signed-off-by: Huang Rui <[email protected]> Reviewed-by: Aaron Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-06-30drm/amd/amdgpu: enable gpu recovery for beige_gobyChengming Gui1-0/+1
Enable gpu recovery for beige_goby. Signed-off-by: Chengming Gui <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-06-15drm/amdgpu: move shadow_list to amdgpu_bo_vmNirmoy Das1-2/+3
Move shadow_list to struct amdgpu_bo_vm as shadow BOs are part of PT/PD BOs. Signed-off-by: Nirmoy Das <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-06-10Merge tag 'amd-drm-next-5.14-2021-06-09' of ↵Dave Airlie1-2/+38
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.14-2021-06-09: amdgpu: - SR-IOV fixes - Smartshift updates - GPUVM TLB flush updates - 16bpc fixed point display fix for DCE11 - BACO cleanups and core refactoring - Aldebaran updates - Initial Yellow Carp support - RAS fixes - PM API cleanup - DC visual confirm updates - DC DP MST fixes - DC DML fixes - Misc code cleanups and bug fixes amdkfd: - Initial Yellow Carp support radeon: - memcpy_to/from_io fixes UAPI: - Add Yellow Carp chip family id Used internally in the kernel driver and by mesa Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]