Age | Commit message (Collapse) | Author | Files | Lines |
|
return 0 to avoid returning an uninitialized variable r
Signed-off-by: Ma Jun <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The ecc_irq is disabled while GPU mode2 reset suspending process,
but not be enabled during GPU mode2 reset resume process.
Changed from V1:
only do sdma/gfx ras_late_init in aldebaran_mode2_restore_ip
delete amdgpu_ras_late_resume function
Changed from V2:
check umc ras supported before put ecc_irq
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use an inline function for version check. Gives more flexibility to
handle any format changes.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
When reset method is not passed in reset context, look for the handler
for default reset method. On Aldebaran, default reset method for SOCs
connected to CPU over XGMI is MODE2.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Asad Kamal <[email protected]>
Tested-by: Asad Kamal <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Instead of maintaining a list per device, keep the reset handlers common
per ASIC family. A pointer to the list of handlers is maintained in
reset control.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Reviewed-by: Asad Kamal <[email protected]>
Tested-by: Asad Kamal <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
A list of devices to be reset is already created in
amdgpu_device_gpu_recover function. Creating another list with the
same nodes is incorrect and not supported in list_head. Instead, pass
the device list as part of reset context.
Fixes: 9e08564727fc (drm/amdgpu: Refactor mode2 reset logic for v13.0.2)
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use IP version and refactor reset logic to apply to a list of devices.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Those implementation details(whether swsmu supported, some ppt_funcs supported,
accessing internal statistics ...)should be kept internally. It's not a good
practice and even error prone to expose implementation details.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add the missing call to re-enable RAS error injections on the Aldebaran
mode2 reset code path.
Signed-off-by: Mukul Joshi <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Remove unneeded semicolon.
Signed-off-by: Zheng Yongjun <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix header guard and make internal functions static. Fixes the below warnings:
drivers/gpu/drm/amd/amdgpu/../amdgpu/amdgpu_reset.h:24:9: warning: '__AMDUGPU_RESET_H__' is used as a header guard here, followed by #define of a different macro [-Wheader-guard]
drivers/gpu/drm/amd/amdgpu/aldebaran.c:110:6: warning: no previous prototype for function 'aldebaran_async_reset' [-Wmissing-prototypes]
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/aldebaran_ppt.c:1435:5: warning: no previous prototype for function 'aldebaran_mode2_reset' [-Wmissing-prototypes]
Signed-off-by: Lijo Lazar <[email protected]>
Reported-by: kernel test robot <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
v1: Aldebaran uses reset control to support mode2 reset. The sequences to
reset and restore hardware context are specific to a particular
configuration.
v2: Clear bus mastering before reset.
Fix coding style issues, drop unwanted variables and info log.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Feifei Xu <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|