Age | Commit message (Collapse) | Author | Files | Lines |
|
These were leftover from bring up and are no longer
necessary. The information is available via
the IP discovery table.
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Rather than checking the asic_type.
Reviewed-by: Guchun Chen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Reviewed-by: Aric Cyr <[email protected]>
Acked-by: Alan Liu <[email protected]>
Signed-off-by: Dillon Varone <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
structure changed in smc_fw_version >= 0x3A4900,
"uint16_t VcnActivityPercentage" replaced with
"uint16_t VcnUsagePercentage0" and "uint16_t VcnUsagePercentage1"
Signed-off-by: Danijel Slivka <[email protected]>
Acked-by: Evan Quan <[email protected]>
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Enable gfxoff routine for GC 10.3.7.
Signed-off-by: Prike Liang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Enable gfx power gating for GC 10.3.7.
Signed-off-by: Prike Liang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This will enable the following block clock gating.
- MC
- SDMA
- HDP
- ATHUB
- IH
- VCN/JPEG
Signed-off-by: Prike Liang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Enable gfx cg gate/ungate control for GC 10.3.7.
Signed-off-by: Prike Liang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Regression has been reported that suspend/resume may hang with
the previous vm ready check commit.
So bring back the evicted list check as a temp fix.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1922
Fixes: c1a66c3bc425 ("drm/amdgpu: check vm ready by amdgpu_vm->evicting flag")
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Qiang Yu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
As PSP needs to verify the signature, CAP firmware must be loaded first when PSP loads firmwares.
Otherwise, when DFC feature is enabled, CP firmwares would be loaded failed.
[ 1149.160480] [drm] MM table gpu addr = 0x800022f000, cpu addr = 00000000a62afcea.
[ 1149.209874] [drm] failed to load ucode CP_CE(0x8)
[ 1149.209878] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007)
[ 1149.215914] [drm] failed to load ucode CP_PFP(0x9)
[ 1149.215917] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007)
[ 1149.221941] [drm] failed to load ucode CP_ME(0xA)
[ 1149.221944] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007)
[ 1149.228082] [drm] failed to load ucode CP_MEC1(0xB)
[ 1149.228085] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007)
[ 1149.234209] [drm] failed to load ucode CP_MEC2(0xD)
[ 1149.234212] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007)
[ 1149.242379] [drm] failed to load ucode VCN(0x1C)
[ 1149.242382] [drm] psp gfx command LOAD_IP_FW(0x6) failed and response status is (0xFFFF0007)
[How]
Move CAP UCODE ID to the beginning of AMDGPU_UCODE_ID enum list.
Signed-off-by: Yifan Zha <[email protected]>
Reviewed-by: Bokun Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This will allow to enable the tests only after latest fix
after which the tests passed on my system.
I tested on NV21 standalone and Vega 10 and Polaris as
pair with DRI_PRIME.
It's possible there might be still issues on ASICs i don't
have at my posession but that that the point of enbling
the tests finally - if other people during testing will
encounter errors they will report and I will be able to fix.
The releated merge request for enabling libdrm tests suite is in
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/227
Signed-off-by: Andrey Grodzovsky <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Protect with drm_dev_enter/exit
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Andrey Grodzovsky <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use IP version rather than codename for noretry set.
Acked-by: Christian König <[email protected]>
Reviewed-by: Huang Rui <[email protected]>
Signed-off-by: Yifan Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
otherwise adev->ip_versions is still empty when amdgpu_gmc_noretry_set
is called.
Reviewed-by: Huang Rui <[email protected]>
Signed-off-by: Yifan Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
1. Define amdgpu_ras_block_late_fini_default in amdgpu_ras.c as
.ras_fini common function, which is called when
.ras_fini of ras block isn't initialized.
2. Remove the code of using amdgpu_ras_block_late_fini to
initialize .ras_fini in ras blocks.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
block
Remove redundant calls of amdgpu_ras_block_late_fini in mca ras block.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
block
Remove redundant calls of amdgpu_ras_block_late_fini in sdma ras block.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
block
Remove redundant calls of amdgpu_ras_block_late_fini in hdp ras block.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
block
Remove redundant calls of amdgpu_ras_block_late_fini in xgmi ras block.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
block
Remove redundant calls of amdgpu_ras_block_late_fini in umc ras block.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
block
Remove redundant calls of amdgpu_ras_block_late_fini in nbio ras block.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
ras block
Remove redundant calls of amdgpu_ras_block_late_fini in mmhub ras block.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
block
Remove redundant calls of amdgpu_ras_block_late_fini in gfx ras block.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
centrally calls the .ras_fini function of all ras blocks.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
1. Move the variables of ras block instance members from
specific xxx_ras_fini to general ras_fini call.
2. Function calls inside the modules only use parameters
passed from xxx_ras_fini instead of ras block instance
members.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Modify .ras_fini function pointer parameter so that
we can remove redundant intermediate calls in some
ras blocks.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
for PSR
[Why]
PSR Power on/off is done in PSR. Add a dc_debug option
and dmub setting to use PHY implementation of this instead.
[How]
Add a dc_debug option and dmub setting to use
PHY FSM Power up/down for PSR.
Co-authored-by: Shah Dharati <[email protected]>
Reviewed-by: Hansen Dsouza <[email protected]>
Acked-by: Alan Liu <[email protected]>
Signed-off-by: Shah Dharati <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[WHY?]
Some projectors support frame alternate 3D modes at 120Hz, but DAL3 does
not create timings. Most active DP to HDMI dongles do not translate
infoframes properly to use HW packing stereo mode.
[HOW?]
Create frame alternate 3D timings for displays that support it. Disable HW
packing 3D mode on DP active dongles.
Reviewed-by: Martin Leung <[email protected]>
Acked-by: Alan Liu <[email protected]>
Signed-off-by: Dillon Varone <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Clang static analysis reports this error
amdgpu_debugfs.c:1690:9: warning: 1st function call
argument is an uninitialized value
tmp = krealloc_array(tmp, i + 1,
^~~~~~~~~~~~~~~~~~~~~~~~~~~
realloc uses tmp, so tmp can not be garbage.
And the return needs to be checked.
Fixes: 5ce5a584cb82 ("drm/amdgpu: add debugfs for reset registers list")
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Tom Rix <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
HDMI Compliance requires VIC to be set to 0
on 2D mode if HDMI_VIC is present.
[How]
When VIC and HDMI_VIC is both present,
reset VIC to 0.
Reviewed-by: Martin Leung <[email protected]>
Acked-by: Alan Liu <[email protected]>
Signed-off-by: Chris Park <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why & How]
To align coding style for how we use this across DCN. The resource
creation ones can remain static, however.
Reviewed-by: Eric Yang <[email protected]>
Acked-by: Alan Liu <[email protected]>
Signed-off-by: Nicholas Kazlauskas <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
RDPCS programming is done in DMUB remove legacy invalid code
Reviewed-by: Alvin Lee <[email protected]>
Acked-by: Alan Liu <[email protected]>
Signed-off-by: Hansen Dsouza <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
To remove duplicate code, unify event message format and simplify new
event add in the following patches.
Use KFD_SMI_EVENT_MSG_SIZE to define msg size, the same size will be
used in user space to alloc the msg receive buffer.
Signed-off-by: Philip Yang <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
sizeof(buf) is 8 bytes because it is defined as unsigned char *buf,
each SMI event read only copy max 8 bytes to user buffer. Correct this
by using the buf allocate size.
Signed-off-by: Philip Yang <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit 3abfe30d803e62cc75dec254eefab3b04d69219b.
To fix deadlock in kFDSVMEvictTest when xnack off.
Signed-off-by: Philip Yang <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
In file vega10_hwmgr.c, the names of struct vega10_power_state *
and struct pp_power_state * are confusingly used, which may lead
to some confusion.
Status quo is that variables of type struct vega10_power_state *
are named "vega10_ps", "ps", "vega10_power_state". A more
appropriate usage is that struct are named "ps" is used for
variabled of type struct pp_power_state *.
So rename struct vega10_power_state * which are named "ps" and
"vega10_power_state" to "vega10_ps", I also renamed "psa" to
"vega10_psa" and "psb" to "vega10_psb" to make it more clearly.
The rows longer than 100 columns are involved.
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Meng Tang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Don't fill up the logs with:
[253557.859575] [drm:amdgpu_dm_atomic_check [amdgpu]] DSC precompute is not needed.
[253557.892966] [drm:amdgpu_dm_atomic_check [amdgpu]] DSC precompute is not needed.
[253557.926070] [drm:amdgpu_dm_atomic_check [amdgpu]] DSC precompute is not needed.
[253557.959344] [drm:amdgpu_dm_atomic_check [amdgpu]] DSC precompute is not needed.
which prints many times a second, when the kernel is run with
drm.debug=2.
Instead of DRM_DEBUG_DRIVER(), make it DRM_INFO_ONCE().
Cc: Alex Deucher <[email protected]>
Cc: Roman Li <[email protected]>
Cc: Felix Kuehling <[email protected]>
Cc: Hersen Wu <[email protected]>
Cc: Daniel Wheeler <[email protected]>
Fixes: 17ce8a6907f7 ("drm/amd/display: Add dsc pre-validation in atomic check")
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Roman Li <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add AST2600 chip support and setting.
Signed-off-by: Tommy Haung <[email protected]>
Reviewed-by: Joel Stanley <[email protected]>
Signed-off-by: Joel Stanley <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Add interrupt clear register define for further chip support.
Signed-off-by: Tommy Haung <[email protected]>
Reviewed-by: Joel Stanley <[email protected]>
Signed-off-by: Joel Stanley <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Add a TODO item for optimizing blitting and format-conversion helpers
in DRM and fbdev. There's always demand for faster graphics output.
v4:
* fix typos (Sam)
Signed-off-by: Thomas Zimmermann <[email protected]>
Reviewed-by: Javier Martinez Canillas <[email protected]>
Acked-by: Sam Ravnborg <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Improve the performance of cfb_imageblit() by manually unrolling
the inner blitting loop and moving some invariants out. The compiler
failed to do this automatically. This change keeps cfb_imageblit()
in sync with sys_imagebit().
A microbenchmark measures the average number of CPU cycles
for cfb_imageblit() after a stabilizing period of a few minutes
(i7-4790, FullHD, simpledrm, kernel with debugging).
cfb_imageblit(), new: 15724 cycles
cfb_imageblit(): old: 30566 cycles
In the optimized case, cfb_imageblit() is now ~2x faster than before.
v3:
* fix commit description (Pekka)
Signed-off-by: Thomas Zimmermann <[email protected]>
Reviewed-by: Javier Martinez Canillas <[email protected]>
Acked-by: Sam Ravnborg <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Fix coding style. No functional changes.
Signed-off-by: Thomas Zimmermann <[email protected]>
Reviewed-by: Javier Martinez Canillas <[email protected]>
Acked-by: Sam Ravnborg <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Improve the performance of sys_imageblit() by manually unrolling
the inner blitting loop and moving some invariants out. The compiler
failed to do this automatically. The resulting binary code was even
slower than the cfb_imageblit() helper, which uses the same algorithm,
but operates on I/O memory.
A microbenchmark measures the average number of CPU cycles
for sys_imageblit() after a stabilizing period of a few minutes
(i7-4790, FullHD, simpledrm, kernel with debugging). The value
for CFB is given as a reference.
sys_imageblit(), new: 25934 cycles
sys_imageblit(), old: 35944 cycles
cfb_imageblit(): 30566 cycles
In the optimized case, sys_imageblit() is now ~30% faster than before
and ~20% faster than cfb_imageblit().
v2:
* move switch out of inner loop (Gerd)
* remove test for alignment of dst1 (Sam)
Signed-off-by: Thomas Zimmermann <[email protected]>
Reviewed-by: Javier Martinez Canillas <[email protected]>
Acked-by: Sam Ravnborg <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Improve the performance of sys_fillrect() by using word-aligned
32/64-bit mov instructions. While the code tried to implement this,
the compiler failed to create fast instructions. The resulting
binary instructions were even slower than cfb_fillrect(), which
uses the same algorithm, but operates on I/O memory.
A microbenchmark measures the average number of CPU cycles
for sys_fillrect() after a stabilizing period of a few minutes
(i7-4790, FullHD, simpledrm, kernel with debugging). The value
for CFB is given as a reference.
sys_fillrect(), new: 26586 cycles
sys_fillrect(), old: 166603 cycles
cfb_fillrect(): 41012 cycles
In the optimized case, sys_fillrect() is now ~6x faster than before
and ~1.5x faster than the CFB implementation.
Signed-off-by: Thomas Zimmermann <[email protected]>
Reviewed-by: Javier Martinez Canillas <[email protected]>
Reviewed-by: Sam Ravnborg <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Registers that exist in the shared render/compute reset domain need to
be placed on an engine workaround list to ensure that they are properly
re-applied whenever an RCS or CCS engine is reset. We have a number of
workarounds (updating registers MLTICTXCTL, L3SQCREG1_CCS0,
GEN12_MERT_MOD_CTRL, and GEN12_GAMCNTRL_CTRL) that are incorrectly
implemented on the 'gt' workaround list and need to be moved
accordingly.
Cc: Matt Roper <[email protected]>
Signed-off-by: Srinivasan Shanmugam <[email protected]>
Signed-off-by: Matt Roper <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Additional workarounds are required once we start exposing CCS engines.
Note that we have a number of workarounds that update registers in the
shared render/compute reset domain. Historically we've just added such
registers to the RCS engine's workaround list. But going forward we
should be more careful to place such workarounds on a wa_list for an
engine that definitely exists and is not fused off (e.g., a platform
with no RCS would never apply the RCS wa_list). We'll keep
rcs_engine_wa_init() focused on RCS-specific workarounds that only need
to be applied if the RCS engine is present. A separate
general_render_compute_wa_init() function will be used to define
workarounds that touch registers in the shared render/compute reset
domain and that we need to apply regardless of what render and/or
compute engines actually exist. Any workarounds defined in this new
function will internally be added to the first present RCS or CCS
engine's workaround list to ensure they get applied (and only get
applied once rather than being needlessly re-applied several times).
Co-author: Srinivasan Shanmugam
Signed-off-by: Matt Roper <[email protected]>
Reviewed-by: Daniele Ceraolo Spurio <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
HW resources are divided across the active CCS engines at the compute
slice level, with each CCS having priority on one of the cslices.
If a compute slice has no enabled DSS, its paired compute engine is not
usable in full parallel execution because the other ones already fully
saturate the HW, so consider it fused off.
v2 (José):
- moved it to its own function
- fixed definition of ccs_mask
v3 (Matt):
- Replace fls() condition with a simple IP version test
v4 (Matt):
- Don't try to calculate a ccs_mask using
intel_slicemask_from_dssmask() until we've determined that we're
running on an Xe_HP platform where the logic makes sense (and won't
overflow).
Cc: Stuart Summers <[email protected]>
Cc: Vinay Belgaumkar <[email protected]>
Cc: Ashutosh Dixit <[email protected]>
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Signed-off-by: Stuart Summers <[email protected]>
Signed-off-by: Matt Roper <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
A different emit breadcrumbs ring programming is required for compute /
render and we don't have UMD user so just reject parallel submission for
these engine classes.
Signed-off-by: Matthew Brost <[email protected]>
Signed-off-by: Matt Roper <[email protected]>
Reviewed-by: Daniele Ceraolo Spurio <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Tell GuC that CCS is enabled by setting the CCS mask in its ADS.
Cc: Vinay Belgaumkar <[email protected]>
Original-author: Michel Thierry
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Signed-off-by: Matt Roper <[email protected]>
Reviewed-by: Daniele Ceraolo Spurio <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
We have to specify in the Render Control Unit Mode register
when CCS is enabled.
v2:
- Move RCU_MODE programming to a helper function. (Tvrtko)
- Clean up and clarify comments. (Tvrtko)
- Add RCU_MODE to the GuC save/restore list. (Daniele)
v3:
- Move this patch before the GuC ADS update to enable compute engines;
the definition of RCU_MODE and its insertion into the save/restore
list moves to this patch. (Daniele)
v4:
- Call xehp_enable_ccs_engines() directly in guc_resume() and
execlists_resume() rather than adding an extra layer of wrapping to
the engine->resume() vfunc. (Umesh)
Bspec: 46034
Original-author: Michel Thierry
Cc: Daniele Ceraolo Spurio <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Vinay Belgaumkar <[email protected]>
Cc: Umesh Nerlige Ramappa <[email protected]>
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Signed-off-by: Aravind Iddamsetty <[email protected]>
Signed-off-by: Matt Roper <[email protected]>
Reviewed-by: Umesh Nerlige Ramappa <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|