aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu
AgeCommit message (Collapse)AuthorFilesLines
2019-12-02drm/amdgpu: should stop GFX ring in hw_finiMonk Liu1-1/+1
To align with the scheme from gfx9 disabling GFX ring after VM shutdown could avoid garbage data be fetched to GFX RB which may lead to unnecessary screw up on GFX Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-12-02drm/amdgpu: do autoload right after MEC loaded for SRIOV VFMonk Liu1-2/+2
since we don't have RLCG ucode loading and no SRlist as well Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-12-02drm/amdgpu: skip rlc ucode loading for SRIOV gfx10Monk Liu1-39/+41
Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-12-02drm/amdgpu: fix calltrace during kmd unload(v3)Monk Liu5-144/+6
issue: kernel would report a warning from a double unpin during the driver unloading on the CSB bo why: we unpin it during hw_fini, and there will be another unpin in sw_fini on CSB bo. fix: actually we don't need to pin/unpin it during hw_init/fini since it is created with kernel pinned, we only need to fullfill the CSB again during hw_init to prevent CSB/VRAM lost after S3 v2: get_csb in init_rlc so hw_init() will make CSIB content back even after reset or s3 v3: use bo_create_kernel instead of bo_create_reserved for CSB otherwise the bo_free_kernel() on CSB is not aligned and would lead to its internal reserve pending there forever take care of gfx7/8 as well Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Xiaojie Yuan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-12-02drm/amdgpu: use CPU to flush vmhub if sched stoppedMonk Liu1-1/+2
otherwse the flush_gpu_tlb will hang if we unload the KMD becuase the schedulers already stopped Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-12-02drm/amdgpu/gfx: Increase dispatch packet numberJames Zhu1-2/+2
For Arcturus, increase dispatch packet number to stress scheduler. Signed-off-by: James Zhu <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-12-02drm/amdgpu/gfx: Clear more EDC cntJames Zhu1-0/+5
Clear SDMA and HDP EDC counter in GPR workarounds. Signed-off-by: James Zhu <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-12-02drm/amdgpu/gfx10: remove outdated commentsXiaojie Yuan1-3/+0
Signed-off-by: Xiaojie Yuan <[email protected]> Reviewed-by: Zhan Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-12-02drm/amdgpu/gfx10: unlock srbm_mutex after queue programming finishXiaojie Yuan1-2/+3
srbm_mutex is to guarantee atomicity for r/w of gfx indexed registers Signed-off-by: Xiaojie Yuan <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-30Merge tag 'for-linus-hmm' of ↵Linus Torvalds8-512/+169
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull hmm updates from Jason Gunthorpe: "This is another round of bug fixing and cleanup. This time the focus is on the driver pattern to use mmu notifiers to monitor a VA range. This code is lifted out of many drivers and hmm_mirror directly into the mmu_notifier core and written using the best ideas from all the driver implementations. This removes many bugs from the drivers and has a very pleasing diffstat. More drivers can still be converted, but that is for another cycle. - A shared branch with RDMA reworking the RDMA ODP implementation - New mmu_interval_notifier API. This is focused on the use case of monitoring a VA and simplifies the process for drivers - A common seq-count locking scheme built into the mmu_interval_notifier API usable by drivers that call get_user_pages() or hmm_range_fault() with the VA range - Conversion of mlx5 ODP, hfi1, radeon, nouveau, AMD GPU, and Xen GntDev drivers to the new API. This deletes a lot of wonky driver code. - Two improvements for hmm_range_fault(), from testing done by Ralph" * tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: mm/hmm: remove hmm_range_dma_map and hmm_range_dma_unmap mm/hmm: make full use of walk_page_range() xen/gntdev: use mmu_interval_notifier_insert mm/hmm: remove hmm_mirror and related drm/amdgpu: Use mmu_interval_notifier instead of hmm_mirror drm/amdgpu: Use mmu_interval_insert instead of hmm_mirror drm/amdgpu: Call find_vma under mmap_sem nouveau: use mmu_interval_notifier instead of hmm_mirror nouveau: use mmu_notifier directly for invalidate_range_start drm/radeon: use mmu_interval_notifier_insert RDMA/hfi1: Use mmu_interval_notifier_insert for user_exp_rcv RDMA/odp: Use mmu_interval_notifier_insert() mm/hmm: define the pre-processor related parts of hmm.h even if disabled mm/hmm: allow hmm_range to be used with a mmu_interval_notifier or hmm_mirror mm/mmu_notifier: add an interval tree notifier mm/mmu_notifier: define the header pre-processor parts even if disabled mm/hmm: allow snapshot of the special zero page
2019-11-28Merge branch 'pci/misc'Bjorn Helgaas2-71/+121
- Add NumaChip SPDX header (Krzysztof Wilczynski) - Replace EXTRA_CFLAGS with ccflags-y (Krzysztof Wilczynski) - Remove unused includes (Krzysztof Wilczynski) - Avoid AMD FCH XHCI USB PME# from D0 defect that prevents wakeup on USB 2.0 or 1.1 connect events (Kai-Heng Feng) - Removed unused sysfs attribute groups (Ben Dooks) - Remove PTM and ASPM dependencies on PCIEPORTBUS (Bjorn Helgaas) - Add PCIe Link Control 2 register field definitions to replace magic numbers in AMDGPU and Radeon CIK/SI (Bjorn Helgaas) - Fix incorrect Link Control 2 Transmit Margin usage in AMDGPU and Radeon CIK/SI PCIe Gen3 link training (Bjorn Helgaas) - Use pcie_capability_read_word() instead of pci_read_config_word() in AMDGPU and Radeon CIK/SI (Frederick Lawler) * pci/misc: drm/radeon: Prefer pcie_capability_read_word() drm/radeon: Replace numbers with PCI_EXP_LNKCTL2 definitions drm/radeon: Correct Transmit Margin masks drm/amdgpu: Prefer pcie_capability_read_word() drm/amdgpu: Replace numbers with PCI_EXP_LNKCTL2 definitions drm/amdgpu: Correct Transmit Margin masks PCI: Add #defines for Enter Compliance, Transmit Margin PCI: Allow building PCIe things without PCIEPORTBUS PCI: Remove PCIe Kconfig dependencies on PCI PCI/ASPM: Remove dependency on PCIEPORTBUS PCI/PTM: Remove dependency on PCIEPORTBUS PCI/PTM: Remove spurious "d" from granularity message PCI: sysfs: Remove unused attribute groups x86/PCI: Avoid AMD FCH XHCI USB PME# from D0 defect PCI: Remove unused includes and superfluous struct declaration x86/PCI: Replace deprecated EXTRA_CFLAGS with ccflags-y x86/PCI: Correct SPDX comment style x86/PCI: Add NumaChip SPDX GPL-2.0 to replace COPYING boilerplate
2019-11-26drm/amdgpu: Fix a bug in jpeg_v1_0_start()Dan Carpenter1-1/+2
Originally the last WREG32_SOC15() was a part of the if statement block but the curly braces are on the wrong line. Fixes: bb0db70f3f75 ("drm/amdgpu: separate JPEG1.0 code out from VCN1.0") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-26drm/amdgpu: flag vram lost on baco reset for VI/CIKAlex Deucher2-4/+10
VI/CIK BACO was inflight when this fix landed for SOC15/NV. Add the fix to VI/CIK as well. Acked-by: Evan Quan <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-26drm/amdgpu: move pci handling out of pm opsAlex Deucher3-29/+24
The documentation says the that PCI core handles this for you unless you choose to implement it. Just rely on the PCI core to handle the pci specific bits. Reviewed-by: Zhan Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-26drm/amdgpu: apply gpr/gds workaround before enabling GFX EDC modeHawking Zhang1-4/+4
gfx memory should be initialized before enabling DED and FUE field in mmGB_EDC_MODE Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-26drm/amd/amdgpu/sriov skip jpeg ip block for ARCTURUS VFJack Zhang1-1/+2
Currently ARCTURUS VF doesn't support jpeg ip block. Skip jpeg ip block in case guest driver load fail. Signed-off-by: Jack Zhang <[email protected]> Reviewed-by: Zhexi Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-26drm/amdgpu: Optimize KFD page table reservationFelix Kuehling1-1/+14
Be less pessimistic about estimated page table use for KFD. Most allocations use 2MB pages and therefore need less VRAM for page tables. This allows more VRAM to be used for applications especially on large systems with many GPUs and hundreds of GB of system memory. Example: 8 GPUs with 32GB VRAM each + 256GB system memory = 512GB Old page table reservation per GPU: 1GB New page table reservation per GPU: 32MB Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: xinhui pan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-26drm/amdgpu: Raise KFD unpinned system memory limitFelix Kuehling1-2/+2
Allow KFD applications to use more unpinned system memory through HMM. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Yong Zhao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-26drm/amdgpu: Optimize KFD page table reservationFelix Kuehling1-1/+14
Be less pessimistic about estimated page table use for KFD. Most allocations use 2MB pages and therefore need less VRAM for page tables. This allows more VRAM to be used for applications especially on large systems with many GPUs and hundreds of GB of system memory. Example: 8 GPUs with 32GB VRAM each + 256GB system memory = 512GB Old page table reservation per GPU: 1GB New page table reservation per GPU: 32MB Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: xinhui pan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-26drm/amdgpu: flag vram lost on baco reset for VI/CIKAlex Deucher2-4/+10
VI/CIK BACO was inflight when this fix landed for SOC15/NV. Add the fix to VI/CIK as well. Acked-by: Evan Quan <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-26drm/amdgpu: Resolved offchip EEPROM I/O issueJohn Clements2-5/+13
Updated target I2C address Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-26drm/amdgpu: Ensure ret is always initialized when using SOC15_WAIT_ON_RREGNathan Chancellor1-0/+1
Commit b0f3cd3191cd ("drm/amdgpu: remove unnecessary JPEG2.0 code from VCN2.0") introduced a new clang warning in the vcn_v2_0_stop function: ../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1082:2: warning: variable 'r' is used uninitialized whenever 'while' loop exits because its condition is false [-Wsometimes-uninitialized] SOC15_WAIT_ON_RREG(VCN, 0, mmUVD_STATUS, UVD_STATUS__IDLE, 0x7, r); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/../amdgpu/soc15_common.h:55:10: note: expanded from macro 'SOC15_WAIT_ON_RREG' while ((tmp_ & (mask)) != (expected_value)) { \ ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1083:6: note: uninitialized use occurs here if (r) ^ ../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1082:2: note: remove the condition if it is always true SOC15_WAIT_ON_RREG(VCN, 0, mmUVD_STATUS, UVD_STATUS__IDLE, 0x7, r); ^ ../drivers/gpu/drm/amd/amdgpu/../amdgpu/soc15_common.h:55:10: note: expanded from macro 'SOC15_WAIT_ON_RREG' while ((tmp_ & (mask)) != (expected_value)) { \ ^ ../drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:1072:7: note: initialize the variable 'r' to silence this warning int r; ^ = 0 1 warning generated. To prevent warnings like this from happening in the future, make the SOC15_WAIT_ON_RREG macro initialize its ret variable before the while loop that can time out. This macro's return value is always checked so it should set ret in both the success and fail path. Link: https://github.com/ClangBuiltLinux/linux/issues/776 Signed-off-by: Nathan Chancellor <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-25drm/amdgpu: Resolved offchip EEPROM I/O issueJohn Clements2-5/+13
Updated target I2C address Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-25drm/amdgpu: Apply noretry setting for mmhub9.4Oak Zeng1-2/+3
Config the translation retry behavior according to noretry kernel parameter Signed-off-by: Oak Zeng <[email protected]> Suggested-by: Jay Cornwall <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-23drm/amdgpu: Use mmu_interval_notifier instead of hmm_mirrorJason Gunthorpe5-237/+94
Convert the collision-retry lock around hmm_range_fault to use the one now provided by the mmu_interval notifier. Although this driver does not seem to use the collision retry lock that hmm provides correctly, it can still be converted over to use the mmu_interval_notifier api instead of hmm_mirror without too much trouble. This also deletes another place where a driver is associating additional data (struct amdgpu_mn) with a mmu_struct. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Philip Yang <[email protected]> Tested-by: Philip Yang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-11-23drm/amdgpu: Use mmu_interval_insert instead of hmm_mirrorJason Gunthorpe6-281/+77
Remove the interval tree in the driver and rely on the tree maintained by the mmu_notifier for delivering mmu_notifier invalidation callbacks. For some reason amdgpu has a very complicated arrangement where it tries to prevent duplicate entries in the interval_tree, this is not necessary, each amdgpu_bo can be its own stand alone entry. interval_tree already allows duplicates and overlaps in the tree. Also, there is no need to remove entries upon a release callback, the mmu_interval API safely allows objects to remain registered beyond the lifetime of the mm. The driver only has to stop touching the pages during release. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Philip Yang <[email protected]> Tested-by: Philip Yang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-11-23drm/amdgpu: Call find_vma under mmap_semJason Gunthorpe1-16/+21
find_vma() must be called under the mmap_sem, reorganize this code to do the vma check after entering the lock. Further, fix the unlocked use of struct task_struct's mm, instead use the mm from hmm_mirror which has an active mm_grab. Also the mm_grab must be converted to a mm_get before acquiring mmap_sem or calling find_vma(). Fixes: 66c45500bfdc ("drm/amdgpu: use new HMM APIs and helpers") Fixes: 0919195f2b0d ("drm/amdgpu: Enable amdgpu_ttm_tt_get_user_pages in worker threads") Link: https://lore.kernel.org/r/[email protected] Acked-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Reviewed-by: Philip Yang <[email protected]> Tested-by: Philip Yang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-11-22drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10changzhu3-2/+116
It may lose gpuvm invalidate acknowldege state across power-gating off cycle. To avoid this issue in gmc9/gmc10 invalidation, add semaphore acquire before invalidation and semaphore release after invalidation. After adding semaphore acquire before invalidation, the semaphore register become read-only if another process try to acquire semaphore. Then it will not be able to release this semaphore. Then it may cause deadlock problem. If this deadlock problem happens, it needs a semaphore firmware fix. Signed-off-by: changzhu <[email protected]> Acked-by: Huang Rui <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2019-11-22drm/amdgpu: initialize vm_inv_eng0_sem for gfxhub and mmhubchangzhu6-0/+13
SW must acquire/release one of the vm_invalidate_eng*_sem around the invalidation req/ack. Through this way,it can avoid losing invalidate acknowledge state across power-gating off cycle. To use vm_invalidate_eng*_sem, it needs to initialize vm_invalidate_eng*_sem firstly. Signed-off-by: changzhu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2019-11-22drm/amd/amdgpu/sriov skip RLCG s/r list for arcturus VF.Jack Zhang1-1/+4
After rlcg fw 2.1, kmd driver starts to load extra fw for LIST_CNTL,GPM_MEM,SRM_MEM. We needs to skip the three fw because all rlcg related fw have been loaded by host driver. Guest driver would load the three fw fail without this change. Signed-off-by: Jack Zhang <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-22drm/amd/amdgpu/sriov temporarily skip ras,dtm,hdcp for arcturus VFJack Zhang1-0/+36
Temporarily skip ras,dtm,hdcp initialize and terminate for arcturus VF Currently the three features haven't been enabled at SRIOV, it would trigger guest driver load fail with the bare-metal path of the three features. Signed-off-by: Jack Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-22drm/amdgpu/gfx10: re-init clear state buffer after gpu resetXiaojie Yuan1-6/+37
This patch fixes 2nd baco reset failure with gfxoff enabled on navi1x. clear state buffer (resides in vram) is corrupted after 1st baco reset, upon gfxoff exit, CPF gets garbage header in CSIB and hangs. Signed-off-by: Xiaojie Yuan <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2019-11-22merge fix for "ftrace: Rework event_create_dir()"Stephen Rothwell1-1/+1
Reviewed-by: Kevin Wang <[email protected]> Signed-off-by: Stephen Rothwell <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-22drm/amdgpu: Update Arcturus golden registersJay Cornwall1-0/+1
Signed-off-by: Jay Cornwall <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-22drm/amdgpu/gfx10: fix out-of-bound mqd_backup array accessXiaojie Yuan1-2/+0
Fixes: 0900a9efdb7909 ("drm/amdgpu/gfx10: fix mqd backup/restore for gfx rings (v2)") Signed-off-by: Xiaojie Yuan <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-22drm/amdgpu/gfx10: explicitly wait for cp idle after halt/unhaltXiaojie Yuan1-2/+12
50us is not enough to wait for cp ready after gpu reset on some navi asics. Signed-off-by: Xiaojie Yuan <[email protected]> Suggested-by: Jack Xiao <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2019-11-22Revert "drm/amd/display: enable S/G for RAVEN chip"Alex Deucher1-1/+1
This reverts commit 1c4259159132ae4ceaf7c6db37a6cf76417f73d9. S/G display is not stable with the IOMMU enabled on some platforms. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205523 Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-22drm/amdgpu: disable gfxoff on original ravenAlex Deucher1-2/+7
There are still combinations of sbios and firmware that are not stable. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=204689 Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-22drm/amdgpu: remove experimental flag for Navi14Alex Deucher1-4/+4
5.4 and newer works fine with navi14. Reviewed-by: Xiaojie Yuan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-22drm/amdgpu: disable gfxoff when using register read interfaceAlex Deucher1-1/+5
When gfxoff is enabled, accessing gfx registers via MMIO can lead to a hang. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205497 Acked-by: Xiaojie Yuan <[email protected]> Reviewed-by: Evan Quan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-22drm/amdgpu: fix bad DMA from INTERRUPT_CNTL2Sam Bobroff1-1/+2
The INTERRUPT_CNTL2 register expects a valid DMA address, but is currently set with a GPU MC address. This can cause problems on systems that detect the resulting DMA read from an invalid address (found on a Power8 guest). Instead, use the DMA address of the dummy page because it will always be safe. Fixes: 27ae10641e9c ("drm/amdgpu: add interupt handler implementation for si v3") Signed-off-by: Sam Bobroff <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-22drm/amdgpu/nv: add asic func for fetching vbios from rom directlyAlex Deucher1-2/+22
Needed as a fallback if the vbios can't be fetched by other means. Reviewed-by: Xiaojie Yuan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-22drm/amdgpu: put flush_delayed_work at firstYintian Tao1-3/+1
There is one regression from 042f3d7b745cd76aa To put flush_delayed_work after adev->shutdown = true which will make amdgpu_ih_process not response the irq At last, all ib ring tests will be failed just like below [drm] amdgpu: finishing device. [drm] Fence fallback timer expired on ring gfx [drm] Fence fallback timer expired on ring comp_1.0.0 [drm] Fence fallback timer expired on ring comp_1.1.0 [drm] Fence fallback timer expired on ring comp_1.2.0 [drm] Fence fallback timer expired on ring comp_1.3.0 [drm] Fence fallback timer expired on ring comp_1.0.1 amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.1.1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.2.1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on comp_1.3.1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on sdma0 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on sdma1 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on uvd_enc_0.0 (-110). amdgpu 0000:00:07.0: [drm:amdgpu_ib_ring_tests [amdgpu]] *ERROR* IB test failed on vce0 (-110). [drm:amdgpu_device_delayed_init_work_handler [amdgpu]] *ERROR* ib ring test failed (-110). v2: replace cancel_delayed_work_sync() with flush_delayed_work() Signed-off-by: Yintian Tao <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-22drm/amdgpu/vcn2.5: fix the enc loop with hw finiLeo Liu1-3/+3
Signed-off-by: Leo Liu <[email protected]> Reviewed-by: James Zhu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-22drm/amdgpu/gfx10: fix mqd backup/restore for gfx rings (v2)Xiaojie Yuan3-6/+7
1. no need to allocate an extra member for 'mqd_backup' array 2. backup/restore mqd to/from the correct 'mqd_backup' array slot v2: warning fix (Alex) Signed-off-by: Xiaojie Yuan <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-22drm/amdgpu: Use ARRAY_SIZE for sos_old_versionszhengbin1-1/+1
Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/psp_v3_1.c:182:40-41: WARNING: Use ARRAY_SIZE Reported-by: Hulk Robot <[email protected]> Signed-off-by: zhengbin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-22drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10changzhu3-2/+116
It may lose gpuvm invalidate acknowldege state across power-gating off cycle. To avoid this issue in gmc9/gmc10 invalidation, add semaphore acquire before invalidation and semaphore release after invalidation. After adding semaphore acquire before invalidation, the semaphore register become read-only if another process try to acquire semaphore. Then it will not be able to release this semaphore. Then it may cause deadlock problem. If this deadlock problem happens, it needs a semaphore firmware fix. Signed-off-by: changzhu <[email protected]> Acked-by: Huang Rui <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-22drm/amdgpu: initialize vm_inv_eng0_sem for gfxhub and mmhubchangzhu6-0/+13
SW must acquire/release one of the vm_invalidate_eng*_sem around the invalidation req/ack. Through this way,it can avoid losing invalidate acknowledge state across power-gating off cycle. To use vm_invalidate_eng*_sem, it needs to initialize vm_invalidate_eng*_sem firstly. Signed-off-by: changzhu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-22drm/amd/amdgpu/sriov skip RLCG s/r list for arcturus VF.Jack Zhang1-1/+4
After rlcg fw 2.1, kmd driver starts to load extra fw for LIST_CNTL,GPM_MEM,SRM_MEM. We needs to skip the three fw because all rlcg related fw have been loaded by host driver. Guest driver would load the three fw fail without this change. Signed-off-by: Jack Zhang <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-22drm/amd/amdgpu/sriov temporarily skip ras,dtm,hdcp for arcturus VFJack Zhang1-0/+36
Temporarily skip ras,dtm,hdcp initialize and terminate for arcturus VF Currently the three features haven't been enabled at SRIOV, it would trigger guest driver load fail with the bare-metal path of the three features. Signed-off-by: Jack Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>