Age | Commit message (Collapse) | Author | Files | Lines |
|
Enable it by default.
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The problem case is as follows:
1. GPU A triggers a gpu ras reset, and GPU A drives
GPU B to also perform a gpu ras reset.
2. After gpu B ras reset started, gpu B queried a DE
data. Since the DE data was queried in the ras reset
thread instead of the page retirement thread, bad
page retirement work would not be triggered. Then
even if all gpu resets are completed, the bad pages
will be cached in RAM until GPU B's bad page retirement
work is triggered again and then saved to eeprom.
This patch can save the bad pages to eeprom in time after gpu
ras reset is completed.
v2:
1. Add the above description to code comments.
2. Reuse existing function.
Signed-off-by: YiPeng Chai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Before uninstalling gpu driver, flush all cached ras
bad pages to eeprom.
v2:
Put the same code into a function and reuse the function.
Signed-off-by: YiPeng Chai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
GFX ME right now is one but this could change in
future SOC's. Use no of ME for GFX as start point
for ME for compute for GFX12.
Signed-off-by: Sunil Khatri <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
To enable mesa to use display dcc, DM should expose them in the
supported modifiers. Add the best (most efficient) modifiers first.
Signed-off-by: Aurabindo Pillai <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
GFX ME right now is one but this could change in
future SOC's. Use no of ME for GFX as start point
for ME for compute for GFX11.
Signed-off-by: Sunil Khatri <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use the dev_info/err variants so we get per device logging
in multi-GPU cases.
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
GFX ME right now is one but this could change in
future SOC's. Use no of ME for GFX as start point
for ME for compute for GFX10.
Signed-off-by: Sunil Khatri <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Why:
If the reg mmMP1_SMN_C2PMSG_90 is being written to during amdgpu driver
load or driver unload, subsequent amdgpu driver load will fail at
smu_hw_init. The default of mmMP1_SMN_C2PMSG_90 register at a clean
environment is 0x1 and if value differs from expected, amdgpu driver
load will fail.
How to fix:
Ignore the initial value in smu response register before the first smu
message is sent,if smc in SMU_FW_INIT state, just proceed further to
send the message. If register holds an unexpected value after smu message
was sent set, smc_state to SMU_FW_HANG state and no further smu messages
will be sent.
v2:
Set SMU_FW_INIT state at the start of smu hw_init/resume.
Check smc_fw_state before sending smu message if in hang state skip
sending message.
Set SMU_FW_HANG only in case unexpected value is detected
Signed-off-by: Danijel Slivka <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Reviewed-by: Asad Kamal <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
For SOCs with GFX v9.4.3, a VF may have multiple compute partitions.
Fetch the partition information during init and initialize partition
nodes. There is no support to switch partition mode in VF mode, hence
disable the same.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
sdma has 2 instances in SRIOV cpx mode. Odd numbered VFs have
sdma0/sdma1 instances. Even numbered vfs have sdma2/sdma3. For
Even numbered vfs, the sdma2 & sdma3 (irq srouce id
CLIENTID_SDMA2 and CLIENTID_SDMA3) should map to irq seq 0 & 1.
Signed-off-by: Gavin Wan <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
to avoid reading wrong WPTR from doorbell in sriov vf, set
CP_HQD_PQ_DOORBELL_CONTROL.DOORBELL_MODE to 1 to read WPTR from MQD.
Signed-off-by: Zhigang Luo <[email protected]>
Acked-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
add amdgpu ras 'event_state' sysfs device attribute support
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
V1: This patch enables following UMD stable Pstates profile
levels for power_dpm_force_performance_level interface.
- profile_peak
- profile_min_mclk
- profile_min_sclk
- profile_standard
V2: Fix conflict with commit "drm/amd/pm: smu v14.0.4 reuse smu v14.0.0 dpmtable "
V3: Add VCLK1 and DCLK1 support for SMU V14.0.1
And avoid to set VCLK1 and DCLK1 for SMU v14.0.0
Signed-off-by: Li Ma <[email protected]>
Reviewed-by: Tim Huang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
add amdgpu ras POSION_CONSUMPTION event id support.
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
GFX v9.4.4 uses mode1 reset to handle poison consumption.
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
add amdgpu ras POSION_CREATION event id support.
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
v1:
- use unified event id to manage ras events
- add a new function amdgpu_ras_query_error_status_with_event() to accept
event type as parameter.
v2:
add a warn log to show the location of function failure
when calling amdgpu_ras_mark_event(). (Tao Zhou)
v3:
change RAS_EVENT_TYPE_ISR to RAS_EVENT_TYPE_FATAL.
v4:
rename amdgpu_ras_get_recovery_event() to
amdgpu_ras_get_fatal_error_event().
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
In dm resume, we firstly restore dc state and do the mst resume for topology
probing thereafter. If we change dpcd DP_MSTM_CTRL value after LT in mst reume,
it will cause light up problem on the hub.
[How]
Revert commit 202dc359adda ("drm/amd/display: Defer handling mst up request in resume").
And adjust the reason to trigger dc_link_detect by DETECT_REASON_RESUMEFROMS3S4.
Cc: [email protected]
Fixes: 202dc359adda ("drm/amd/display: Defer handling mst up request in resume")
Signed-off-by: Wayne Lin <[email protected]>
Reviewed-by: Fangzhi Zuo <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
A gang submit won't work if the VMID is reserved and we can't flush out
VM changes from multiple engines at the same time.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
DPG mode is enabled for vcn and jpeg on VCN v4_0_5
Signed-off-by: Saleemkhan Jamadar <[email protected]>
Reviewed-by: Tim Huang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
remove redundant semicolons in RAS_EVENT_LOG to avoid
code format check warning.
Fixes: b712d7c20133 ("drm/amdgpu: fix compiler 'side-effect' check issue for RAS_EVENT_LOG()")
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
While moving buffer which has dcc tiling config, it is needed to restore
its original dcc tiling.
1. extend copy flag to cover tiling bits
2. add logic to restore original dcc tiling config
Signed-off-by: Frank Min <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add support of all the CP GFX queues for gfx12 ipdump
to be used by devcoredump.
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Sunil Khatri <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add gfx12 support of CP queue registers for all queues
to be used by devcoredump.
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Sunil Khatri <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Enable redirection of irq for pagefaults for specific
clients to avoid overflow without dropping interrupts.
So here we redirect the interrupts to another IH ring
i.e ring1 where only these interrupts are processed.
Signed-off-by: Sunil Khatri <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
We need IH ring1 for handling the pagefault
interrupts which over flow in default
ring for specific usecases.
Enable ring1 allows software to redirect
high interrupts to ring1 from default IH
ring.
Signed-off-by: Sunil Khatri <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
If VF request full GPU access and the request failed,
the VF driver can get stuck accessing registers for an extended period during
the unload of KMS.
[How]
Set no_hw_access flag when VF request for full GPU access fails
This prevents further hardware access attempts, avoiding the prolonged
stuck state.
Signed-off-by: Yifan Zha <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add support of gfx12 ipdump print so devcoredump
could trigger it to dump the captured registers
in devcoredump.
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Sunil Khatri <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add general registers of gfx12 in ipdump for
devcoredump support.
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Sunil Khatri <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
update gfxhub client id for gfx12
Signed-off-by: Frank Min <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Certain call paths still load the SMU firmware for APUs,
which needs to be skipped.
Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Sysfs node disable query error count during gpu reset.
Signed-off-by: YiPeng Chai <[email protected]>
Reviewed-by: Stanley.Yang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.11-2024-07-03:
amdgpu:
- Use vmalloc for dc_state
- Replay fixes
- Freesync fixes
- DCN 4.0.1 fixes
- DML fixes
- DCC updates
- Misc code cleanups and bug fixes
- 8K display fixes
- DCN 3.5 fixes
- Restructure DIO code
- DML1 fixes
- DML2 fixes
- GFX11 fix
- GFX12 updates
- GFX12 modifiers fixes
- RAS fixes
- IP dump fixes
- Add some updated IP version checks
_ Silence UBSAN warning
radeon:
- GPUVM fix
Signed-off-by: Daniel Vetter <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.11-2024-06-28:
amdgpu:
- JPEG 5.x fixes
- More FW loading cleanups
- Misc code cleanups
- GC 12.x fixes
- ASPM fix
- DCN 4.0.1 updates
- SR-IOV fixes
- HDCP fix
- USB4 fixes
- Silence UBSAN warnings
- MES submission fixes
- Update documentation for new products
- DCC updates
- Initial ISP 4.x plumbing
- RAS fixes
- Misc small fixes
amdkfd:
- Fix missing unlock in error path for adding queues
Signed-off-by: Daniel Vetter <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The exynos-next pull is based on a newer -rc than drm-next. hence
backmerge first to make sure the unrelated conflicts we accumulated
don't end up randomly in the exynos merge pull, but are separated out.
Conflicts are all benign: Adjacent changes in amdgpu and fbdev-dma
code, and cherry-pick conflict in xe.
Signed-off-by: Daniel Vetter <[email protected]>
|
|
This is a variable sized array.
Link: https://lists.freedesktop.org/archives/amd-gfx/2024-June/110420.html
Tested-by: Jeff Layton <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
During ip dump in gfx11 the index variable is reused but is
not reinitialized to 0 and this causes the index calculation
to be wrong and access out of bound access.
Acked-by: Christian König <[email protected]>
Signed-off-by: Sunil Khatri <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This patch is to add firmware for PSP 14.0.4.
Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Set the default reset method to mode2 for SMU 14.0.4.
Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This patch is to add SMU 14.0.4 support
Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This patch is to add SMU 14.0.4 support.
Signed-off-by: Li Ma <[email protected]>
Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Replace IP VERSION with smu->is_apu in if condition.
And the dpmtable of smu v14.0.4 is same as smu v14.0.0.
Signed-off-by: Li Ma <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This patch is to add PSP 14.0.4 support.
Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This patch is to add PSP 14.0.4 support.
Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This patch is to add firmware for VPE 6.1.3.
Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This patch is to add VPE 6.1.3 support.
Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This patch is to add VPE 6.1.3 support.
Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Enable setting soc21 common clockgating for NBIO 7.11.3.
Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This patch is to add NBIO 7.11.3 support.
Signed-off-by: Tim Huang <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|