Age | Commit message (Collapse) | Author | Files | Lines |
|
Add ras_poison_irq and functions. And fix the amdgpu_irq_put
call trace in vcn_v4_0_hw_fini.
[ 44.563572] RIP: 0010:amdgpu_irq_put+0xa4/0xc0 [amdgpu]
[ 44.563629] RSP: 0018:ffffb36740edfc90 EFLAGS: 00010246
[ 44.563630] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[ 44.563630] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 44.563631] RBP: ffffb36740edfcb0 R08: 0000000000000000 R09: 0000000000000000
[ 44.563631] R10: 0000000000000000 R11: 0000000000000000 R12: ffff954c568e2ea8
[ 44.563631] R13: 0000000000000000 R14: ffff954c568c0000 R15: ffff954c568e2ea8
[ 44.563632] FS: 0000000000000000(0000) GS:ffff954f584c0000(0000) knlGS:0000000000000000
[ 44.563632] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 44.563633] CR2: 00007f028741ba70 CR3: 000000026ca10000 CR4: 0000000000750ee0
[ 44.563633] PKRU: 55555554
[ 44.563633] Call Trace:
[ 44.563634] <TASK>
[ 44.563634] vcn_v4_0_hw_fini+0x62/0x160 [amdgpu]
[ 44.563700] vcn_v4_0_suspend+0x13/0x30 [amdgpu]
[ 44.563755] amdgpu_device_ip_suspend_phase2+0x240/0x470 [amdgpu]
[ 44.563806] amdgpu_device_ip_suspend+0x41/0x80 [amdgpu]
[ 44.563858] amdgpu_device_pre_asic_reset+0xd9/0x4a0 [amdgpu]
[ 44.563909] amdgpu_device_gpu_recover.cold+0x548/0xcf1 [amdgpu]
[ 44.564006] amdgpu_debugfs_reset_work+0x4c/0x80 [amdgpu]
[ 44.564061] process_one_work+0x21f/0x400
[ 44.564062] worker_thread+0x200/0x3f0
[ 44.564063] ? process_one_work+0x400/0x400
[ 44.564064] kthread+0xee/0x120
[ 44.564065] ? kthread_complete_and_exit+0x20/0x20
[ 44.564066] ret_from_fork+0x22/0x30
Suggested-by: Hawking Zhang <[email protected]>
Signed-off-by: Horatio Zhang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add ras_poison_irq and functions.
Suggested-by: Hawking Zhang <[email protected]>
Signed-off-by: Horatio Zhang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Separate vcn RAS poison consumption handling from the instance irq, and
register dedicated ras_poison_irq src and funcs for UVD_POISON.
v2:
- Separate ras irq from vcn instance irq
- Improve the subject and code comments
v3:
- Split the patch into three parts
- Improve the code comments
Suggested-by: Hawking Zhang <[email protected]>
Signed-off-by: Horatio Zhang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The IMU firmware is loaded on the host side, not the guest.
Remove IMU in vf2pf ucode id enum.
Signed-off-by: YuanShang <[email protected]>
Reviewed-By: Horace Chen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This is introduced by the code merge and will let the
adev->gfx.kiq[0].ring struct being overrided
Signed-off-by: Shiwu Zhang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Signed-off-by: Shiwu Zhang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[why]
[drm] psp gfx command LOAD_TA(0x1) failed and response status is (0x7)
[drm] psp gfx command INVOKE_CMD(0x3) failed and response status is (0x4)
amdgpu 0000:04:00.0: amdgpu: Secure display: Generic Failure.
[how]
don't enable secure display on incompatible platforms
Suggested-by: Aaron Liu <[email protected]>
Signed-off-by: Jesse zhang <[email protected]>
Reviewed-by: Aaron Liu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
smatch warning -
1) drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:3615 gfx_v9_0_kiq_resume()
warn: inconsistent returns 'ring->mqd_obj->tbo.base.resv'.
2) drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:6901 gfx_v10_0_kiq_resume()
warn: inconsistent returns 'ring->mqd_obj->tbo.base.resv'.
Signed-off-by: Sukrut Bellary <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
When freesync video mode is enabled, switching resolution from native
mode to one of the freesync video compatible modes can trigger continous
artifacts on some eDP panels when running under KDE. The articating can be seen in the
attached bug report.
[How]
Fix this by restricting updates that require full commit by using the same checks
for stream and scaling changes in the the enable pass of dm_update_crtc_state()
along with the check for compatible timings for freesync vide mode.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2162
Fixes: da5e14909776 ("drm/amd/display: Fix hang when skipping modeset")
Signed-off-by: Aurabindo Pillai <[email protected]>
Reviewed-by: Rodrigo Siqueira <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
None have been defined yet, so reject anybody setting any. Mesa sets
it to 0 anyway.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
No need cast (void*) to (struct amdgpu_device *).
Signed-off-by: Su Hui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[why]
Passthrough case is treated as root bus and pcie_gen_mask is set as
default value that does not support GEN 3 and GEN 4 for PCIe link
speed. So PCIe link speed will be downgraded at smu hw init in
passthrough condition
[how]
Move get pci info after detect virtualization and check if it is
passthrough case when set pcie_gen_mask
Signed-off-by: Tong Liu01 <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Since, we are only interested in having
drm_edid_override_connector_update(), update the value of
connector->edid_blob_ptr. We don't care about the return value of
drm_edid_override_connector_update() here. So, drop count.
Fixes: 550e5d23f147 ("drm/amd/display: assign edid_blob_ptr with edid from debugfs")
Reported-by: kernel test robot <[email protected]>
Reviewed-by: Rodrigo Siqueira <[email protected]>
Signed-off-by: Hamza Mahfooz <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
set_abm_event() is never actually used. So, drop it.
Fixes: b8fe56375f78 ("drm/amd/display: Refactor ABM feature")
Reported-by: kernel test robot <[email protected]>
Reported-by: Tom Rix <[email protected]>
Reviewed-by: Rodrigo Siqueira <[email protected]>
Signed-off-by: Hamza Mahfooz <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
get_available_dsc_slices() returns the number of indices set, and all of
the users of get_available_dsc_slices() don't cross the returned bound
when iterating over available_slices[]. So, the memset() in
get_available_dsc_slices() is redundant and can be dropped.
Fixes: 97bda0322b8a ("drm/amd/display: Add DSC support for Navi (v2)")
Reported-by: Christophe JAILLET <[email protected]>
Reviewed-by: Rodrigo Siqueira <[email protected]>
Signed-off-by: Hamza Mahfooz <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
1. Need flush HDP for MQD putting in vram
2. Zero out mes MQD
Signed-off-by: Jack Xiao <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix below checkpatch warnings:
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
WARNING: Comparisons should place the constant on the right side of the test
WARNING: Missing a blank line after declarations
Cc: Luben Tuikov <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: Christian König <[email protected]>
Signed-off-by: Srinivasan Shanmugam <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix below checkpatch errors & warnings:
In amdgpu_uvd.c:
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
WARNING: Prefer 'unsigned int *' to bare use of 'unsigned *'
WARNING: Missing a blank line after declarations
WARNING: %Lx is non-standard C, use %llx
ERROR: space required before the open parenthesis '('
ERROR: space required before the open brace '{'
WARNING: %LX is non-standard C, use %llX
WARNING: Block comments use * on subsequent lines
+/* multiple fence commands without any stream commands in between can
+ crash the vcpu so just try to emmit a dummy create/destroy msg to
WARNING: Block comments use a trailing */ on a separate line
+ avoid this */
WARNING: braces {} are not necessary for single statement blocks
+ for (j = 0; j < adev->uvd.num_enc_rings; ++j) {
+ fences += amdgpu_fence_count_emitted(&adev->uvd.inst[i].ring_enc[j]);
+ }
In amdgpu_vce.c:
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
WARNING: Missing a blank line after declarations
WARNING: %Lx is non-standard C, use %llx
WARNING: Possible repeated word: 'we'
ERROR: space required before the open parenthesis '('
Cc: Alex Deucher <[email protected]>
Cc: Christian König <[email protected]>
Signed-off-by: Srinivasan Shanmugam <[email protected]>
Acked-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
perform mode2 reset for sdma fed error on gfx v11_0_3.
Signed-off-by: YiPeng Chai <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix below checkpatch insisted error & warnings:
ERROR: space required before the open brace '{'
WARNING: braces {} are not necessary for any arm of this statement
+ if ((type == VCN_ENCODE_RING) && (vcn_config & VCN_BLOCK_ENCODE_DISABLE_MASK)) {
[...]
+ } else if ((type == VCN_DECODE_RING) && (vcn_config & VCN_BLOCK_DECODE_DISABLE_MASK)) {
[...]
+ } else if ((type == VCN_UNIFIED_RING) && (vcn_config & VCN_BLOCK_QUEUE_DISABLE_MASK)) {
[...]
ERROR: code indent should use tabs where possible
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
WARNING: braces {} are not necessary for single statement blocks
+ for (i = 0; i < adev->vcn.num_enc_rings; ++i) {
+ fence[j] += amdgpu_fence_count_emitted(&adev->vcn.inst[j].ring_enc[i]);
+
ERROR: space required before the open parenthesis '('
WARNING: Missing a blank line after declarations
WARNING: please, no spaces at the start of a line
WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'.
Cc: Alex Deucher <[email protected]>
Cc: Christian König <[email protected]>
Signed-off-by: Srinivasan Shanmugam <[email protected]>
Acked-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix below checkpatch warnings:
WARNING: Missing a blank line after declarations
+ struct amdgpu_connector *amdgpu_connector = to_amdgpu_connector(connector);
+ amdgpu_encoder->active_device = amdgpu_encoder->devices & amdgpu_connector->devices;
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
Cc: Alex Deucher <[email protected]>
Cc: Christian König <[email protected]>
Signed-off-by: Srinivasan Shanmugam <[email protected]>
Acked-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Allocate large local variable on heap to avoid exceeding the
stack size:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c: In function ‘svm_range_validate_and_map’:
drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:1690:1: warning: the frame size of 2360 bytes is larger than 2048 bytes [-Wframe-larger-than=]
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix below checkpatch insisted error & warnings:
ERROR: Macros with complex values should be enclosed in parentheses
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
WARNING: braces {} are not necessary for single statement blocks
WARNING: Block comments use a trailing */ on a separate line
WARNING: Missing a blank line after declarations
Cc: Alex Deucher <[email protected]>
Cc: Christian König <[email protected]>
Signed-off-by: Srinivasan Shanmugam <[email protected]>
Acked-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
sq.is_enabled is a byte so there is no need to endian swap it.
Acked-by: Guchun Chen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Rework logic or use do_div() to avoid problems on 32 bit.
v2: add a missing case for XCP macro
v3: fix out of bounds array access
v4: fix xcp handling harder
Acked-by: Guchun Chen <[email protected]> (v1)
Reviewed-by: Mukul Joshi <[email protected]> (v3)
Signed-off-by: Alex Deucher <[email protected]>
|
|
Register GFX RAS functions and initialize GFX RAS.
v2: remove xcp operations.
v3: reuse the return value of gfx_ras_sw_init.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Query and reset sq timeout status.
v2: change instance from 0 to xcc_id for register access.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add GFX RAS error count reset function.
v2: remove xcp operation.
only select_se_sh when instance number is more than 1.
v3: add check for se_num before select_se_sh.
change instance from 0 to xcc_id for register access.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Query GFX RAS ce/ue count.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Prepare for the query of GFX RAS ce/ue count.
v2: remove xcp operation.
only select_se_sh when instance number is more than 1.
v3: add more CE/UE registsers to query list.
add check for se_num before select_se_sh.
change instance from 0 to xcc_id for register access.
v4: move gfx memory id definitions to gfx_v9_4_3.
v5: create a dedicated patch for adding error count query function.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add common GFX RAS definitions.
v2: remove instance from amdgpu_gfx_ras_reg_entry,
amdgpu_ras_err_status_reg_entry has already defined it.
v3: remove memory id definitions from amdgpu_gfx.h, they are
related to IP version.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
GC v9_4_3 introduces UE|CE_ERR_STATUS_LO|HI to log
hardware errors
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Reset GFX RAS status registers.
v2: fix typo in title.
remove xcp operation.
v3: change instance from 0 to xcc_id for register access.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Query GFX RAS status.
v2: remove xcp operation.
v3: change instance from 0 to xcc_id for register access.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The common function can help reduce redundant code.
v2: remove xcp operation, only need to do RAS operations for all
instances.
v3: remove check for GFX RAS support, will be checked in higher level.
add amdgpu prefix for the function name.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Not all the asic needs xcp. ensure check xcp availabity
before accessing its member.
v2: add missing change in kfd_topology.c
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Avoid access null xcp_mgr pointer.
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The mask is only needed to be set when RAS block instance number is
more than 1 and invalid bits should be also masked out.
We only check valid bits for GFX and SDMA block for now, and will
add check for other RAS blocks in the future.
v2: move the check under injection operation since the mask is only
used by RAS error inject.
v3: add valid bits handling for SDMA.
v4: print message if the mask is adjusted.
Signed-off-by: Tao Zhou <[email protected]>
Hawking Zhang <[email protected]>
Reviewed-by: Stanley.Yang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
No special requirement in RAS injection for the two versions, switch to
use default injection interface.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Stanley.Yang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
So GFX RAS injection could use default function if it doesn't define its
own injection interface.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Stanley.Yang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
User can specify injected instances by the mask. For backward
compatibility, the mask value is incorporated into sub block index
without interface change of RAS TA.
User uses logical mask and driver should convert it to physical value
before sending it to RAS TA.
v2: update parameter name.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Stanley.Yang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Convert instance mask for the convenience of RAS TA.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Stanley.Yang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This patch enables IH CAM on GFX9.4.3 ASIC.
Signed-off-by: Mukul Joshi <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Current calculation only works for NPS4/QPX mode, correct it for
NPS4/CPX mode.
Signed-off-by: Philip Yang <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Rename smv_migrate_init to a better name kgd2kfd_init_zone_device
because it setup zone devive pgmap for page migration and keep it in
kfd_migrate.c to access static functions svm_migrate_pgmap_ops. Call it
only once in amdgpu_device_ip_init after adev ip blocks are initialized,
but before amdgpu_amdkfd_device_init initialize kfd nodes which enable
SVM support based on pgmap.
svm_range_set_max_pages is called by kgd2kfd_device_init everytime after
switching compute partition mode.
Signed-off-by: Philip Yang <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
During XCP init, unlike the primary device, there is no amdgpu_device
attached to each XCP's drm_device
In case that user trying to open/close the primary node of XCP drm_device
this rerouting is to solve the NULL pointer issue causing by referring
to any member of the amdgpu_device
BUG: unable to handle page fault for address: 0000000000020c80
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
Oops: 0002 [#1] PREEMPT SMP NOPTI
Call Trace:
<TASK>
lock_timer_base+0x6b/0x90
try_to_del_timer_sync+0x2b/0x80
del_timer_sync+0x29/0x40
flush_delayed_work+0x1c/0x50
amdgpu_driver_open_kms+0x2c/0x280 [amdgpu]
drm_file_alloc+0x1b3/0x260 [drm]
drm_open+0xaa/0x280 [drm]
drm_stub_open+0xa2/0x120 [drm]
chrdev_open+0xa6/0x1c0
Signed-off-by: Shiwu Zhang <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
svm_migrate_init set the max svm range pages based on the KFD nodes
partition size. APU mode don't init pgmap because there is no migration.
kgd2kfd_device_init calls svm_migrate_init after KFD nodes allocation
and initialization.
Signed-off-by: Philip Yang <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This patch fixes memory reporting on the GFX 9.4.3 APU and dGPU
by reporting available memory on a per partition basis. If its an
APU, available and used memory calculations take into account
system and TTM memory.
v2: squash in fix ("drm/amdkfd: Fix array out of bound warning")
squash in fix ("drm/amdgpu: Update memory reporting for GFX9.4.3")
Signed-off-by: Mukul Joshi <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
We need to track memory usage on a per partition basis. To do
that, store the local memory information in KFD node instead
of kfd device.
v2: squash in fix ("amdkfd: Use mem_id to access mem_partition info")
Signed-off-by: Mukul Joshi <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Find xcp_id from amdgpu_fpriv, use it for amdgpu_gem_object_create.
Signed-off-by: James Zhu <[email protected]>
Acked-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|