author | Vignesh Chander <Vignesh.Chander@amd.com> | 2024-06-24 16:44:26 -0500 |
---|---|---|
committer | Alex Deucher <alexander.deucher@amd.com> | 2024-06-27 17:31:37 -0400 |
commit | cbda2758d8bfae323b846210a3e52f0ad5fe7164 (patch) | |
tree | 034e4d34e668d0dd014fe139efa6e9736b7b28e4 /drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | |
parent | 78146c1dcd220ae98fd5f4114f992299fc5ee161 (diff) |
drm/amdgpu: process RAS fatal error MB notification
In a RAS error scenario, the VF guest driver checks the mailbox
and sets the fed flag to avoid unnecessary HW accesses.
Additionally, it polls for the reset completion message first
to avoid accidentally spamming the host with multiple reset requests.
v2: add another mailbox check for handling case where kfd detects
timeout first
v3: set host_flr bit and use wait_for_reset
Signed-off-by: Vignesh Chander <Vignesh.Chander@amd.com>
Reviewed-by: Zhigang Luo <Zhigang.Luo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Diffstat (limited to 'drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h')
-rw-r--r-- | drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | 4 |
1 files changed, 3 insertions, 1 deletions
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
index f04cd1586c72..b42a8854dca0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
@@ -52,7 +52,7 @@
 /* tonga/fiji use this offset */
 #define mmBIF_IOV_FUNC_IDENTIFIER 0x1503
 
-#define AMDGPU_VF2PF_UPDATE_MAX_RETRY_LIMIT 5
+#define AMDGPU_VF2PF_UPDATE_MAX_RETRY_LIMIT 2
 
 enum amdgpu_sriov_vf_mode {
 	SRIOV_VF_MODE_BARE_METAL = 0,
@@ -94,6 +94,7 @@ struct amdgpu_virt_ops {
 			u32 data1, u32 data2, u32 data3);
 	void (*ras_poison_handler)(struct amdgpu_device *adev,
 					enum amdgpu_ras_block block);
+	bool (*rcvd_ras_intr)(struct amdgpu_device *adev);
 };
 
 /*
@@ -352,6 +353,7 @@ void amdgpu_virt_ready_to_reset(struct amdgpu_device *adev);
 int amdgpu_virt_wait_reset(struct amdgpu_device *adev);
 int amdgpu_virt_alloc_mm_table(struct amdgpu_device *adev);
 void amdgpu_virt_free_mm_table(struct amdgpu_device *adev);
+bool amdgpu_virt_rcvd_ras_interrupt(struct amdgpu_device *adev);
 void amdgpu_virt_release_ras_err_handler_data(struct amdgpu_device *adev);
 void amdgpu_virt_init_data_exchange(struct amdgpu_device *adev);
 void amdgpu_virt_exchange_data(struct amdgpu_device *adev);
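This header diff only declares the new rcvd_ras_intr callback and the amdgpu_virt_rcvd_ras_interrupt() wrapper; their bodies live in files not shown here. The following is a minimal, self-contained sketch of how such a wrapper might dispatch to the per-ASIC callback and how a caller might then set the fed flag, as the commit message describes. The struct layouts, the virt_ops, mailbox_has_ras_msg and fed field names, and the mock_* helpers are assumptions made for illustration and are not the kernel's real definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; NOT the real definitions. */
struct amdgpu_device;

struct amdgpu_virt_ops {
	/* Only the callback added by this patch is modeled here. */
	bool (*rcvd_ras_intr)(struct amdgpu_device *adev);
};

struct amdgpu_device {
	const struct amdgpu_virt_ops *virt_ops; /* hypothetical field name */
	bool mailbox_has_ras_msg;               /* hypothetical mock of mailbox state */
	bool fed;                               /* hypothetical mock of the fed flag */
};

/*
 * Assumed wrapper body: report false when no backend callback is
 * registered, otherwise ask the backend whether a RAS fatal error
 * notification was received.
 */
bool amdgpu_virt_rcvd_ras_interrupt(struct amdgpu_device *adev)
{
	if (!adev->virt_ops || !adev->virt_ops->rcvd_ras_intr)
		return false;

	return adev->virt_ops->rcvd_ras_intr(adev);
}

/* Mock backend: a real one would inspect the mailbox message from the host. */
static bool mock_rcvd_ras_intr(struct amdgpu_device *adev)
{
	return adev->mailbox_has_ras_msg;
}

static const struct amdgpu_virt_ops mock_virt_ops = {
	.rcvd_ras_intr = mock_rcvd_ras_intr,
};

/* Hypothetical caller: set the fed flag once the notification is seen. */
void mock_handle_ras_error(struct amdgpu_device *adev)
{
	if (amdgpu_virt_rcvd_ras_interrupt(adev))
		adev->fed = true;
}
```

Guarding against a missing callback keeps the wrapper safe to call on configurations whose ops table does not implement rcvd_ras_intr; whether the real driver does exactly this is not visible from this header.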