Age | Commit message (Collapse) | Author | Files | Lines |
|
add RAS error info support for sdma_v4_4_2.
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
introduced "ras_err_info" to better identify a RAS ERROR source.
NOTE:
For legacy chips, keep the original RAS error print format.
v1:
RAS errors may come from different dies during a RAS error query,
therefore, need a new data structure to identify the source of RAS ERROR.
v2:
- use new data structure 'amdgpu_smuio_mcm_config_info' instead of
ras_err_id (in v1 patch)
- refine ras error dump function name
- refine ras error dump log format
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
flush the correct vmid tlb for specific pasid on gmc 11.
Fixes: 041a5743883d ("drm/amdgpu: fix and cleanup gmc_v11_0_flush_gpu_tlb_pasid")
Signed-off-by: Yifan Zhang <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
(No effect outside the ras_mgr data structure)
Since a new member was added to the ras_err_data data structure,
it becomes unreasonable for the ras_mgr instance to contain this data,
because ras mgr only uses the 2 member information of ue_count/ce_count in err_data.
This patch changes the code err_data into built-in structure members,
making the code directly compatible.
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Temporary workaround to fix issues observed in some compute
applications when GFXOFF is enabled on GFX9.
Signed-off-by: Jesse Zhang <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
These are missed during rebase.
Signed-off-by: Lang Yu <[email protected]>
Reviewed-by: Veerabadhran Gopalakrishnan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
FW uses IP_VERSION_MAJ_MIN_REV format.
Signed-off-by: Lang Yu <[email protected]>
Reviewed-by: Veerabadhran Gopalakrishnan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Legacy invalidation is not supported.
This is missed during rebase.
Signed-off-by: Lang Yu <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use v7.7 before, switch to v7.11 now.
Fix incorrect programing.
Signed-off-by: Lang Yu <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This is needed to correctly handle BOs imported into compute VM from gfx.
Both kfd and gfx should use same bo_va and set bo_va->ref_count correctly
when map the Bos into same VM, otherwise we may trigger kernel general
protection when iterate mappings over bo_va's valids or invalids list.
Signed-off-by: Felix Kuehling <[email protected]>
Signed-off-by: Xiaogang Chen <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Ramesh Errabolu <[email protected]>
Tested-by: Xiaogang Chen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add P2S table load support on SMU v13.0.6 ASICs.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Yang Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add support to load P2S tables through PSP.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Yang Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Adds FW id for P2S table.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Yang Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
FRU EEPROM access is not valid for APU devices.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Yang Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
JPEG init header will overwirte vcn init header info which will loss
some debug information
Signed-off-by: Lin.Cao <[email protected]>
Reviewed-by: Jingwen Chen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit cd1a4bc22821eea9a98f1beddd1a8d789989a720.
[WHY & HOW]
The writeback series cause a regression in thunderbolt display.
Signed-off-by: Alex Hung <[email protected]>
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit f6893fcb10c7b24526454e465f6ec2563ef044cc.
[WHY & HOW]
The writeback series cause a regression in thunderbolt display.
Signed-off-by: Alex Hung <[email protected]>
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Expose ras table version & schema info to sysfs
v2: Updated schema to get poison support info
from ras context, removed asic specific checks
Signed-off-by: Asad Kamal <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
PSP OS updates the version information in register. On APUs with PSPv13,
PSP OS will already be loaded with SBIOS. Hence use the version register
instead of using information in driver binary header.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Program vcn_doorbell_range with vcn_ring0_1.
Signed-off-by: Lang Yu <[email protected]>
Reviewed-by: Veerabadhran Gopalakrishnan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
amdgpu_uvd_suspend() allocates memory and copies objects into that
allocated memory. This fails under memory pressure. Instead move
majority of this code into a prepare step when swap can still be
allocated.
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
If any IP blocks allocate memory during their hw_fini() sequence
this can cause the suspend to fail under memory pressure. Introduce
a new phase that IP blocks can use to allocate memory before suspend
starts so that it can potentially be evicted into swap instead.
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Linux PM core has a prepare() callback run before suspend.
If the system is under high memory pressure, the resources may need
to be evicted into swap instead. If the storage backing for swap
is offlined during the suspend() step then such a call may fail.
So move this step into prepare() to move evict majority of
resources and update all non-pmops callers to call the same callback.
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2362
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add CG support for GFX/MC/HDP/ATHUB/IH/BIF.
Add PG support for GFX.
Signed-off-by: Li Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
add smu 14 into the IP discovery list.
Signed-off-by: Li Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
VCN 4.0.5 uses DLDO.
Signed-off-by: Lang Yu <[email protected]>
Reviewed-by: Veerabadhran Gopalakrishnan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
These changes are missed in rebase.
Signed-off-by: Lang Yu <[email protected]>
Reviewed-by: Veerabadhran Gopalakrishnan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
IP discovery region has increased to > 8K on some SOCs.Maximum reserve
size is upto 12K, but not used. For now increase to 10K.
Signed-off-by: Lijo Lazar <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Return -EINVAL when MMSCH init fail which can be handle by function
amdgpu_device_reset_sriov correctly.
Signed-off-by: Lin.Cao <[email protected]>
Reviewed-by: Jingwen Chen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Avoid infinite loop when count is 0.
This is missed in rebase.
Signed-off-by: Lang Yu <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
'amdgpu_gmc_gart_location'
Fixes the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c:274: warning: Function parameter or member 'gart_placement' not described in 'amdgpu_gmc_gart_location'
Cc: Christian König <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: "Pan, Xinhui" <[email protected]>
Signed-off-by: Srinivasan Shanmugam <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
MCBP is decided by adev->gfx.mcbp now.
This is missed in rebase.
Signed-off-by: Lang Yu <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Remove IB end boundary requirement,
VPE has no such limitions, use existing
amdgpu_ring_generic_pad_ib() instead.
This is missed in rebase.
Signed-off-by: Lang Yu <[email protected]>
Reviewed-by: Yifan Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
When MES is oversubscribed it may not frequently check for new
command submissions from driver if the scheduling load is high.
Response latency as high as 5 seconds has been observed.
Enable a flag which adds a check for new commands between
scheduling quantums.
Signed-off-by: Jay Cornwall <[email protected]>
Cc: Alexandru Tudor <[email protected]>
Reviewed-by: Harish Kasiviswanathan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Updating drm-misc-next to the state of Linux v6.6-rc2.
Signed-off-by: Thomas Zimmermann <[email protected]>
|
|
SI hardware does not have doorbells at all, however currently the code
will try to do the allocation and thus fail, makes SI AMDGPU not usable.
Fix this failure by skipping doorbells allocation when doorbells count
is zero.
Fixes: 54c30d2a8def ("drm/amdgpu: create kernel doorbell pages")
Reviewed-by: Shashank Sharma <[email protected]>
Signed-off-by: Icenowy Zheng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
bo->tbo.resource can easily be NULL here.
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2902
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
CC: [email protected]
|
|
SI hardware does not have doorbells at all, however currently the code
will try to do the allocation and thus fail, makes SI AMDGPU not usable.
Fix this failure by skipping doorbells allocation when doorbells count
is zero.
Fixes: 54c30d2a8def ("drm/amdgpu: create kernel doorbell pages")
Reviewed-by: Shashank Sharma <[email protected]>
Signed-off-by: Icenowy Zheng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Enable DCN 3.5.0 support.
Signed-off-by: Aaron Liu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Here, Adding db_size in byte to find the doorbell's
absolute offset for both 32-bit and 64-bit doorbell sizes.
So that doorbell offset will be aligned based on the doorbell
size.
v2:
- Addressed the review comment from Felix.
v3:
- Adding doorbell_size as parameter to get db absolute offset.
v4:
Squash the two patches into one.
Cc: Christian Koenig <[email protected]>
Cc: Alex Deucher <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Shashank Sharma <[email protected]>
Signed-off-by: Arvind Yadav <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
bo->tbo.resource can easily be NULL here.
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2902
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
CC: [email protected]
|
|
add hub->ctx_distance when read CONTEXT1_CNTL, align w/
write back operation.
v2: fix coding style errors reported by checkpatch.pl (Christian)
Signed-off-by: Yifan Zhang <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Lang Yu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The amdgpu_ras_get_context may return NULL if device
not support ras feature, so add check before using.
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add documentation for the newly added manufacturer and fru_id attributes
in sysfs.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add support to read Manufacturer Name and FRU File Id fields. Also add
sysfs device attributes for external usage.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Keep FRU related information together in a separate structure.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
v1:
enable GFX v9.4.3 FRU device to query board information.
v2:
use MP1 version to identify different asic
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Update IB starting address alignment and size alignment with correct values
for decode and encode IPs.
Decode IB starting address alignment: 256 bytes
Decode IB size alignment: 64 bytes
Encode IB starting address alignment: 256 bytes
Encode IB size alignment: 4 bytes
Also bump amdgpu driver version for this update.
Signed-off-by: Boyuan Zhang <[email protected]>
Reviewed-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for
array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
As found with Coccinelle[1], add __counted_by for struct amdgpu_bo_list.
Additionally, since the element count member must be set before accessing
the annotated flexible array member, move its initialization earlier.
Cc: Alex Deucher <[email protected]>
Cc: "Christian König" <[email protected]>
Cc: "Pan, Xinhui" <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: "Gustavo A. R. Silva" <[email protected]>
Cc: Luben Tuikov <[email protected]>
Cc: Christophe JAILLET <[email protected]>
Cc: Felix Kuehling <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci [1]
Reviewed-by: Luben Tuikov <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
There is no reason to call return at the end of function that
returns void.
Fixes the below:
WARNING: void function return statements are not generally useful
Thus remove such a statement in the affected functions.
Cc: Christian König <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: "Pan, Xinhui" <[email protected]>
Signed-off-by: Srinivasan Shanmugam <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|