Age | Commit message (Collapse) | Author | Files | Lines |
|
The scratch register should be accessed through MMIO instead of RLCG
in SRIOV, since it being used in RLCG register access function.
Fixes: d54762cc3e6a ("drm/amdgpu: nuke dynamic gfx scratch reg allocation")
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Gavin Wan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
TTM used to track the "acc_size" of all BOs internally. We needed to
keep track of it in our memory reservation to avoid TTM running out
of memory in its own accounting. However, that "acc_size" accounting
has since been removed from TTM. Therefore we don't really need to
track it any more.
Signed-off-by: Alex Sierra <[email protected]>
Reviewed-by: Philip Yang <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Protect the struct amdgpu_bo_list with a mutex. This is used during command
submission in order to avoid buffer object corruption as recorded in
the link below.
v2 (chk): Keep the mutex looked for the whole CS to avoid using the
list from multiple CS threads at the same time.
Suggested-by: Christian König <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Cc: Vitaly Prosyak <[email protected]>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2048
Signed-off-by: Luben Tuikov <[email protected]>
Signed-off-by: Christian König <[email protected]>
Tested-by: Luben Tuikov <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
enable mode1 reset for smu_v13_0_7 since it's missing.
Signed-off-by: Kenneth Feng <[email protected]>
Reviewed-by: Yang Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
To match with the enum defined in trusted os
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Reviewed-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
It's redundant, as now switching to rpm_mode to indicate
runtime power management mode.
Suggested-by: Lijo Lazar <[email protected]>
Signed-off-by: Guchun Chen <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This quirk is not needed any more as it's fixed by bypassing
SMU FW reloading in runtime resume.
Signed-off-by: Guchun Chen <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
SMU is always alive, so it's fine to skip SMU FW reloading
when runpm resumed from BACO, this can avoid some race issues
when resuming SMU.
Suggested-by: Evan Quan <[email protected]>
Signed-off-by: Guchun Chen <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
It can benefit code consistency in future.
Suggested-by: Lijo Lazar <[email protected]>
Signed-off-by: Guchun Chen <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Clarify which architecture those asics acronyms refers to.
Signed-off-by: André Almeida <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Save the original stable pstate on ctx init and restore
it on ctx fini so that we restore a manually selected
stable pstate on ctx exit.
v2: fix init order (Alex)
v3: don't add new variable to ctx struct (Evan)
Fixes: c65b364c52ba ("drm/amdgpu/ctx: only reset stable pstate if the user changed it (v2)")
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Align RDNA2.x with other asics. One HDP bit per SDMA instance,
aligned with firmware. This is effectively a revert of
commit 369b7d04baf3 ("drm/amdgpu/nbio2.3: don't use GPU_HDP_FLUSH bit 12").
On further discussions with the relevant hardware teams,
re-align the bits for SDMA.
Fixes: 369b7d04baf3 ("drm/amdgpu/nbio2.3: don't use GPU_HDP_FLUSH bit 12")
Reviewed-by: Kent Russell <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Align aldebaran with all other asics. One HDP bit per
SDMA instance, aligned with firmware. This is effectively
a revert of
commit a0f9f8546668 ("drm/amdgpu/nbio7.4: don't use GPU_HDP_FLUSH bit 12").
On further discussions with the relevant hardware teams,
re-align the bits for SDMA.
Fixes: a0f9f8546668 ("drm/amdgpu/nbio7.4: don't use GPU_HDP_FLUSH bit 12")
Reviewed-by: Kent Russell <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Jadeite platform uses I2S MICSP instance.
Create platform devices for DMA controller and I2S controller for
Jadeite platform.
Signed-off-by: Vijendar Mukunda <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
DMI check is required to distinguish Jadeite platform from
Stoney base variant.
Add DMI check logic for Jadeite platform.
Signed-off-by: Vijendar Mukunda <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fixed below checkpatch warnings and errors
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:131: CHECK: Comparison to NULL could be written "apd"
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:150: CHECK: Comparison to NULL could be written "apd"
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:196: CHECK: Prefer kernel type 'u64' over 'uint64_t'
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:224: CHECK: Please don't use multiple blank lines
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:226: CHECK: Comparison to NULL could be written "!adev->acp.acp_genpd"
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:233: CHECK: Please don't use multiple blank lines
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:239: CHECK: Alignment should match open parenthesis
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:241: CHECK: Comparison to NULL could be written "!adev->acp.acp_cell"
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:247: CHECK: Comparison to NULL could be written "!adev->acp.acp_res"
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:253: CHECK: Comparison to NULL could be written "!i2s_pdata"
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:350: CHECK: Alignment should match open parenthesis
drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c:550: ERROR: that open brace { should be on the previous line
Signed-off-by: Vijendar Mukunda <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
1. Support query umc ras error counter.
2. Support ras umc ue error address remapping.
Signed-off-by: YiPeng Chai <[email protected]>
Reviewed-by: Alexander Deucher <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This is a follow-up cleanup to [1]. See bellow refcount balancing
for calling amdgpu_job_submit_direct after this cleanup as far
as I calculated.
amdgpu_fence_emit
dma_fence_init 1
dma_fence_get(fence) 2
rcu_assign_pointer(*ptr, dma_fence_get(fence) 3
---> amdgpu_job_submit_direct completes before fence signaled
amdgpu_sa_bo_free
(*sa_bo)->fence = dma_fence_get(fence) 4
amdgpu_job_free
dma_fence_put 3
amdgpu_vcn_enc_get_destroy_msg
*fence = dma_fence_get(f) 4
dma_fence_put(f); 3
amdgpu_vcn_enc_ring_test_ib
dma_fence_put(fence) 2
amdgpu_fence_process
dma_fence_put 1
amdgpu_sa_bo_remove_locked
dma_fence_put 0
---> amdgpu_job_submit_direct completes after fence signaled
amdgpu_fence_process
dma_fence_put 2
amdgpu_job_free
dma_fence_put 1
amdgpu_vcn_enc_get_destroy_msg
*fence = dma_fence_get(f) 2
dma_fence_put(f); 1
amdgpu_vcn_enc_ring_test_ib
dma_fence_put(fence) 0
[1] - https://patchwork.kernel.org/project/dri-devel/cover/[email protected]/
Signed-off-by: Andrey Grodzovsky <[email protected]>
Suggested-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
AV1 is only supported on first instance.
Signed-off-by: Sonny Jiang <[email protected]>
Reviewed-by: James Zhu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
User reported gpu page fault when running graphics applications
and in some cases garbaged graphics are observed as soon as X
starts. This patch fixes all the issues.
Fixed the typecast issue for fpfn and lpfn variables, thus
preventing the overflow problem which resolves the memory
corruption.
Signed-off-by: Arunpravin Paneer Selvam <[email protected]>
Reported-by: Mike Lothian <[email protected]>
Tested-by: Mike Lothian <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Christian König <[email protected]>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.20-2022-07-14:
amdgpu:
- DCN3.2 updates
- DC SubVP support
- DP MST fixes
- Audio fixes
- DC code cleanup
- SMU13 updates
- Adjust GART size on newer APUs for S/G display
- Soft reset for GFX 11
- Soft reset for SDMA 6
- Add gfxoff status query for vangogh
- Improve BO domain pinning
- Fix timestamps for cursor only commits
- MES fixes
- DCN 3.1.4 support
- Misc fixes
- Misc code cleanup
amdkfd:
- Simplify GPUVM validation
- Unified memory for CWSR save/restore area
- fix possible list corruption on queue failure
radeon:
- Fix bogus power of two warning
UAPI:
- Unified memory for CWSR save/restore area for KFD
Proposed userspace: https://lists.freedesktop.org/archives/amd-gfx/2022-June/080952.html
Signed-off-by: Dave Airlie <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
When pinning a buffer, we should check to see if there are any
additional restrictions imposed by bo->preferred_domains. This will
prevent the BO from being moved to an invalid domain when pinning.
For example, this can happen if the user requests to create a BO in GTT
domain for display scanout. amdgpu_dm will allow pinning to either VRAM
or GTT domains, since DCN can scanout from either or. However, in
amdgpu_bo_pin_restricted(), pinning to VRAM is preferred if there is
adequate carveout. This can lead to pinning to VRAM despite the user
requesting GTT placement for the BO.
v2: Allow the kernel to override the domain, which can happen when
exporting a BO to a V4L camera (for example).
Signed-off-by: Leo Li <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
Port aggregated doorbell support to gfx11.
Signed-off-by: Jack Xiao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Port aggregated doorbell support to sdma6.
Signed-off-by: Jack Xiao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Ring aggregated doorbel to make unmapped queue scheduled in mes firmware.
Signed-off-by: Le Ma <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Jack Xiao <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Allocate and enable aggregated doorbell.
Signed-off-by: Jack Xiao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Allocate and enable aggregated doorbell.
Signed-off-by: Le Ma <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Jack Xiao <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Move reset_context out of gpu recover function to make it configurable
for different reset purpose.
For the reset way of call gpu_recovery sysfs, force to use full reset
method. Otherwise, try soft reset by default if the related ASIC
supportted, if soft reset failed, will use full reset.
Signed-off-by: Likun Gao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Support SDMA soft reset for SDMA v6.
V3: use ib test to check soft reset.
V4: squash in unused variable fix (Alex)
Signed-off-by: Likun Gao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Enable soft reset for gfx 11.
V2: enable both gfx v11.0.0 and gfx v11.0.2.
Signed-off-by: Likun Gao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Support GFX soft reset for gfx v11.
V3: use ib test check soft reset.
V4: squash in unused variable fix (Alex)
Signed-off-by: Likun Gao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Set corresponding ready flag for mes ring when enable or disable
mes ring.
Signed-off-by: Jack Xiao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
That has been done in BO release notify.
Signed-off-by: xinhui pan <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
For GMC 10 parts which support scatter/gather display (display
from system memory), we should allocate a larger gart size
to better handler larger displays. This mirrors what we already
do for GMC 9 parts.
v2: fix typo (Alex)
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Need reserve buffers before unmap mes ctx bo va.
v2: fix removal of dma_resv_excl_fence() (Alex)
v3: fix dma_resv_usage (Alex)
Signed-off-by: Jack Xiao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.20-2022-07-05:
amdgpu:
- Various spelling and grammer fixes
- Various eDP fixes
- Various DMCUB fixes
- VCN fixes
- GMC 11 fixes
- RAS fixes
- TMZ support for GC 10.3.7
- GPUVM TLB flush fixes
- SMU 13.0.x updates
- DCN 3.2 Support
- DCN 3.2.1 Support
- MES updates
- GFX11 modifiers support
- USB-C fixes
- MMHUB 3.0.1 support
- SDMA 6.0 doorbell fixes
- Initial devcoredump support
- Enable high priority gfx queue on asics which support it
- Enable GPU reset for SMU 13.0.4
- OLED display fixes
- MPO fixes
- DC frame size fixes
- ASPM support for PCIE 7.4/7.6
- GPU reset support for SMU 13.0.0
- GFX11 updates
- VCN JPEG fix
- BACO support for SMU 13.0.7
- VCN instance handling fix
- GFX8 GPUVM TLB flush fix
- GPU reset rework
- VCN 4.0.2 support
- GTT size fixes
- DP link training fixes
- LSDMA 6.0.1 support
- Various backlight fixes
- Color encoding fixes
- Backlight config cleanup
- VCN 4.x unified queue cleanup
amdkfd:
- MMU notifier fixes
- Updates for GC 10.3.6 and 10.3.7
- P2P DMA support using dma-buf
- Add available memory IOCTL
- SDMA 6.0.1 fix
- MES fixes
- HMM profiler support
radeon:
- License fix
- Backlight config cleanup
UAPI:
- Add available memory IOCTL to amdkfd
Proposed userspace: https://www.mail-archive.com/[email protected]/msg75743.html
- HMM profiler support for amdkfd
Proposed userspace: https://lists.freedesktop.org/archives/amd-gfx/2022-June/080805.html
Signed-off-by: Dave Airlie <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
For some cases (accessing registers, unmap legacy queue), it needs
access mes in atomic context. Use spinlock to protect agaist mes
ring buffer race condition.
Signed-off-by: Jack Xiao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Seems to break hibernation. Disable for now until we can root
cause it.
Fixes: 087451f372bf ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.")
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=216119
Acked-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Was dropped when we converted to the generic helpers.
Fixes: 087451f372bf ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.")
Acked-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
To avoid code repetition, unify the function exit path when possible. No
functional changes.
Acked-by: Christian König <[email protected]>
Signed-off-by: André Almeida <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
if amdgpu_mes_ctx_alloc_meta_data() fails, we should call amdgpu_vm_fini()
to handle amdgpu_vm_init().
Add a new lable before amdgpu_vm_init() and goto this lable when
amdgpu_mes_ctx_alloc_meta_data() fails.
Signed-off-by: Jianglei Nie <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
MES fw updated to support unmapping legacy gfx/compute queue.
Signed-off-by: Jack Xiao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
It should not init whole ras bad page framework on sriov guest side
due to it is handled on host side.
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
GFX is the only IP block that RAS TA needs to program
the hardware when receiving enable_feature command.
Changed from V1:
remove amdgpu_ras_need_send_ras_feature inline function,
use GFX RAS block check directly.
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
We don't need to validate and map root PD specially here,
it would be validated and mapped by amdgpu_vm_validate_pt_bos
if it is evicted.
The special case is when turning a GFX VM to a compute VM,
if vm_update_mode changed, we should make sure root PD gets
mapped. So just map root PD after updating vm->update_funcs
in amdgpu_vm_make_compute whether the vm_update_mode changed
or not.
v3:
- Add some comments suggested by Christian.
v2:
- Don't rename vm_validate_pt_pd_bos and make it public.
Signed-off-by: Lang Yu <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Output user queue eviction and restore event. User queue eviction may be
triggered by svm or userptr MMU notifier, TTM eviction, device suspend
and CRIU checkpoint and restore.
User queue restore may be rescheduled if eviction happens again while
restore.
Signed-off-by: Philip Yang <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit 8748de873fedf4d55bdd99bbb738ee7ddf329792
since drv enabled mes to access registers.
Signed-off-by: Jack Xiao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Enable mes to access registers.
v2: squash mes sched ring enablement flag
Signed-off-by: Jack Xiao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add mes register access routines:
1. read register
2. write register
3. wait register
4. write and wait register
Signed-off-by: Jack Xiao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add misc op commands in mes11.
Signed-off-by: Jack Xiao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|