Age | Commit message (Collapse) | Author | Files | Lines |
|
No need to reset error status since only umc ras supported on psp v13_0_0.
Signed-off-by: Candice Li <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Move reset_context out of gpu recover function to make it configurable
for different reset purpose.
For the reset way of call gpu_recovery sysfs, force to use full reset
method. Otherwise, try soft reset by default if the related ASIC
supportted, if soft reset failed, will use full reset.
Signed-off-by: Likun Gao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
It should not init whole ras bad page framework on sriov guest side
due to it is handled on host side.
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
GFX is the only IP block that RAS TA needs to program
the hardware when receiving enable_feature command.
Changed from V1:
remove amdgpu_ras_need_send_ras_feature inline function,
use GFX RAS block check directly.
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
amdgpu_device_gpu_recover
We removed the wrapper that was queueing the recover function
into reset domain queue who was using this name.
Signed-off-by: Andrey Grodzovsky <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Save the extra usless work schedule.
Signed-off-by: Andrey Grodzovsky <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Adjust the sequence for ras late init and separate ras reset error status
from query status.
v2: squash in fix from Candice
Signed-off-by: Candice Li <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix aldebaran ras supported check on SRIOV guest side,
the previous check conditicon block all ras feature
on baremetal
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
support umc/gfx/sdma ras on guest side
Changed from V1:
move sriov judgment in amdgpu_ras_interrupt_fatal_error_handler
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Qeury ras status before ras poison consumption handling, add more
comment and log.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-and-tested-by: Mohammad Zafar Ziya <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Enable RAS IH if poison consumption handler is implemented.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Mohammad Zafar Ziya <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
After alloc fail, we do not need to kfree.
Signed-off-by: Haowen Bai <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The fatal error handler is independent from general ras interrupt
handler since there is no related IH ring.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add support for general RAS poison consumption handler.
v2: remove callback function for poison consumption.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Prepare for the implementation of poison consumption handler.
v2: separate umc handler from poison creation.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add vcn and jpeg ras support options
V2: vcn and jpeg ras flag enabled for aldebaran asic only
V3: vcn and jpeg ras flag disabled for error counter query
Generic poison query interface added
VCN and JPEG ras enabled based on IP version check
V4: vcn and jpeg ras flag moved under ecc flag for dGPU
Signed-off-by: Mohammad Zafar Ziya <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
It should notice SMU to update bad channel info when detected
uncorrectable error in UMC block
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
1. Define amdgpu_ras_block_late_fini_default in amdgpu_ras.c as
.ras_fini common function, which is called when
.ras_fini of ras block isn't initialized.
2. Remove the code of using amdgpu_ras_block_late_fini to
initialize .ras_fini in ras blocks.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
centrally calls the .ras_fini function of all ras blocks.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fixed warning reported by kernel test robot:
1.warning: variable 'ras_obj' is used uninitialized
whenever '||' condition is true.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Turn previously global function into a static function to avoid the
following Clang warning:
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2459:5: warning: no previous prototype
for function 'amdgpu_ras_block_late_init_default' [-Wmissing-prototypes]
int amdgpu_ras_block_late_init_default(struct amdgpu_device *adev,
^
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2459:1: note: declare 'static' if the
function is not intended to be used outside of this translation unit
int amdgpu_ras_block_late_init_default(struct amdgpu_device *adev,
^
static
Signed-off-by: Maíra Canal <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Clang build fails with
amdgpu_ras.c:2416:7: error: variable 'ras_obj' is used uninitialized
whenever 'if' condition is true
if (adev->in_suspend || amdgpu_in_reset(adev)) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
amdgpu_ras.c:2453:6: note: uninitialized use occurs here
if (ras_obj->ras_cb)
^~~~~~~
There is a logic error in the error handler's labels.
ex/ The sysfs: is the last goto label in the normal code but
is the middle of error handler. Rework the error handler.
cleanup: is the first error, so it's handler should be last.
interrupt: is the second error, it's handler is next. interrupt:
handles the failure of amdgpu_ras_interrupt_add_hander() by
calling amdgpu_ras_interrupt_remove_handler(). This is wrong,
remove() assumes the interrupt has been setup, not torn down by
add(). Change the goto label to cleanup.
sysfs is the last error, it's handler should be first. sysfs:
handles the failure of amdgpu_ras_sysfs_create() by calling
amdgpu_ras_sysfs_remove(). But when the create() fails there
is nothing added so there is nothing to remove. This error
handler is not needed. Remove the error handler and change
goto label to interrupt.
Fixes: b293e891b057 ("drm/amdgpu: add helper function to do common ras_late_init/fini (v3)")
Reviewed-by: Luben Tuikov <[email protected]>
Signed-off-by: Tom Rix <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
1. Define amdgpu_ras_block_late_init_default in amdgpu_ras.c as
.ras_late_init common function, which is called when
.ras_late_init of ras block isn't initialized.
2. Remove the code of using amdgpu_ras_block_late_init to
initialize .ras_late_init in ras blocks.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Define amdgpu_ras_late_init to call all ras blocks' .ras_late_init.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
amdgpu_ras_block_late_init/amdgpu_ras_block_late_fini
1. Merge amdgpu_ras_late_init to
amdgpu_ras_block_late_init.
2. Remove amdgpu_ras_late_init since no ras block
calls amdgpu_ras_late_init.
3. Merge amdgpu_ras_late_fini to
amdgpu_ras_block_late_fini.
4. Remove amdgpu_ras_late_fini since no ras block
calls amdgpu_ras_late_fini.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
amdgpu_ras.c
In order to reduce redundant struct conversion, modify
operating sysfs and interrupt function interface parameters.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
code
Optimize amdgpu_nbio_ras_late_init/amdgpu_nbio_ras_fini function code.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
1. Define amdgpu_ras_block_late_init to create sysfs nodes
and interrupt handles.
2. Define amdgpu_ras_block_late_fini to remove sysfs nodes
and interrupt handles.
3. Replace ras block variable members in struct
amdgpu_ras_block_object with struct ras_common_if, which
can make it easy to associate each ras block instance
with each ras block functional interface.
4. Add .ras_cb to struct amdgpu_ras_block_object.
5. Change each ras block to fit for the changement of struct
amdgpu_ras_block_object.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.18-2022-02-11-1:
amdgpu:
- Clean up of power management code
- Enable freesync video mode by default
- Clean up of RAS code
- Improve VRAM access for debug using SDMA
- Coding style cleanups
- SR-IOV fixes
- More display FP reorg
- TLB flush fixes for Arcuturus, Vega20
- Misc display fixes
- Rework special register access methods for SR-IOV
- DP2 fixes
- DP tunneling fixes
- DSC fixes
- More IP discovery cleanups
- Misc RAS fixes
- Enable both SMU i2c buses where applicable
- s2idle improvements
- DPCS header cleanup
- Add new CAP firmware support for SR-IOV
amdkfd:
- Misc cleanups
- SVM fixes
- CRIU support
- Clean up MQD manager
UAPI:
- Add interface to amdgpu CTX ioctl to request a stable power state for profiling
https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/207
- Add amdkfd support for CRIU
https://github.com/checkpoint-restore/criu/pull/1709
- Remove old unused amdkfd debugger interface
Was only implemented for Kaveri and was only ever used by an old HSA tool that was never open sourced
radeon:
- Fix error handling in radeon_driver_open_kms
- UVD suspend fix
- Misc fixes
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The commit d5e8ff5f7b2a ("drm/amdgpu: Fixed the defect of soft lock caused by infinite loop")
had fixed this defect.
Revert workaround
commit a2170b4af62f ("drm/amdgpu: Add judgement to avoid infinite loop").
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
1. The infinite loop case only occurs on multiple cards support
ras functions.
2. The explanation of root cause refer to commit 76641cbbf196
("drm/amdgpu: Add judgement to avoid infinite loop").
3. Create new node to manage each unique ras instance to guarantee
each device .ras_list is completely independent.
4. Fixes: commit 7a6b8ab3231b51 ("drm/amdgpu: Unify ras block
interface for each ras block").
5. The soft locked logs are as follows:
[ 262.165690] CPU: 93 PID: 758 Comm: kworker/93:1 Tainted: G OE 5.13.0-27-generic #29~20.04.1-Ubuntu
[ 262.165695] Hardware name: Supermicro AS -4124GS-TNR/H12DSG-O-CPU, BIOS T20200717143848 07/17/2020
[ 262.165698] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
[ 262.165980] RIP: 0010:amdgpu_ras_get_ras_block+0x86/0xd0 [amdgpu]
[ 262.166239] Code: 68 d8 4c 8d 71 d8 48 39 c3 74 54 49 8b 45 38 48 85 c0 74 32 44 89 fa 44 89 e6 4c 89 ef e8 82 e4 9b dc 85 c0 74 3c 49 8b 46 28 <49> 8d 56 28 4d 89 f5 48 83 e8 28 48 39 d3 74 25 49 89 c6 49 8b 45
[ 262.166243] RSP: 0018:ffffac908fa87d80 EFLAGS: 00000202
[ 262.166247] RAX: ffffffffc1394248 RBX: ffff91e4ab8d6e20 RCX: ffffffffc1394248
[ 262.166249] RDX: ffff91e4aa356e20 RSI: 000000000000000e RDI: ffff91e4ab8c0000
[ 262.166252] RBP: ffffac908fa87da8 R08: 0000000000000007 R09: 0000000000000001
[ 262.166254] R10: ffff91e4930b64ec R11: 0000000000000000 R12: 000000000000000e
[ 262.166256] R13: ffff91e4aa356df8 R14: ffffffffc1394320 R15: 0000000000000003
[ 262.166258] FS: 0000000000000000(0000) GS:ffff92238fb40000(0000) knlGS:0000000000000000
[ 262.166261] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 262.166264] CR2: 00000001004865d0 CR3: 000000406d796000 CR4: 0000000000350ee0
[ 262.166267] Call Trace:
[ 262.166272] amdgpu_ras_do_recovery+0x130/0x290 [amdgpu]
[ 262.166529] ? psi_task_switch+0xd2/0x250
[ 262.166537] ? __switch_to+0x11d/0x460
[ 262.166542] ? __switch_to_asm+0x36/0x70
[ 262.166549] process_one_work+0x220/0x3c0
[ 262.166556] worker_thread+0x4d/0x3f0
[ 262.166560] ? process_one_work+0x3c0/0x3c0
[ 262.166563] kthread+0x12b/0x150
[ 262.166568] ? set_kthread_struct+0x40/0x40
[ 262.166571] ret_from_fork+0x22/0x30
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
MESA polls for errors every 2-3 seconds. Printing with dev_info() causes
the dmesg log to fill up with the same message, e.g,
[18028.206676] amdgpu 0000:0b:00.0: amdgpu: df doesn't config ras function.
Make it dev_dbg_once(), as it isn't something correctible during boot or
thereafter, so printing just once is sufficient. Also sanitize the message.
Cc: Alex Deucher <[email protected]>
Cc: Hawking Zhang <[email protected]>
Cc: John Clements <[email protected]>
Cc: Tao Zhou <[email protected]>
Cc: yipechai <[email protected]>
Fixes: 8b0fb0e967c1 ("drm/amdgpu: Modify gfx block to fit for the unified ras block data and ops")
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
1. The infinite loop causing soft lock occurs on multiple amdgpu cards
supporting ras feature.
2. This a workaround patch to fix 6492e1b07c03397f85bd6dc0e230ea6cd9394635.
It is valid for multiple amdgpu cards of the same type.
3. The root cause is that each GPU card device has a separate .ras_list
link header, but the instance and linked list node of each ras block
are unique. When each device is initialized, each ras instance will
repeatedly add link node to the device every time. In this way, only
the .ras_list of the last initialized device is completely correct.
the .ras_list->prev and .ras_list->next of the device initialzied
before can still point to the correct ras instance, but the prev
pointer and next pointer of the pointed ras instance both point to
the last initialized device's .ras_ list instead of the beginning
.ras_ list. When using list_for_each_entry_safe searches for
non-existent Ras nodes on devices other than the last device, the
last ras instance next pointer cannot always be equal to the
beginning .ras_list, so that the loop cannot be terminated, the
program enters a infinite loop.
BTW: Since the data and initialization process of each card are the same,
the link list between ras instances will not be destroyed every time
the device is initialized.
4. The soft locked logs are as follows:
[ 262.165690] CPU: 93 PID: 758 Comm: kworker/93:1 Tainted: G OE 5.13.0-27-generic #29~20.04.1-Ubuntu
[ 262.165695] Hardware name: Supermicro AS -4124GS-TNR/H12DSG-O-CPU, BIOS T20200717143848 07/17/2020
[ 262.165698] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
[ 262.165980] RIP: 0010:amdgpu_ras_get_ras_block+0x86/0xd0 [amdgpu]
[ 262.166239] Code: 68 d8 4c 8d 71 d8 48 39 c3 74 54 49 8b 45 38 48 85 c0 74 32 44 89 fa 44 89 e6 4c 89 ef e8 82 e4 9b dc 85 c0 74 3c 49 8b 46 28 <49> 8d 56 28 4d 89 f5 48 83 e8 28 48 39 d3 74 25 49 89 c6 49 8b 45
[ 262.166243] RSP: 0018:ffffac908fa87d80 EFLAGS: 00000202
[ 262.166247] RAX: ffffffffc1394248 RBX: ffff91e4ab8d6e20 RCX: ffffffffc1394248
[ 262.166249] RDX: ffff91e4aa356e20 RSI: 000000000000000e RDI: ffff91e4ab8c0000
[ 262.166252] RBP: ffffac908fa87da8 R08: 0000000000000007 R09: 0000000000000001
[ 262.166254] R10: ffff91e4930b64ec R11: 0000000000000000 R12: 000000000000000e
[ 262.166256] R13: ffff91e4aa356df8 R14: ffffffffc1394320 R15: 0000000000000003
[ 262.166258] FS: 0000000000000000(0000) GS:ffff92238fb40000(0000) knlGS:0000000000000000
[ 262.166261] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 262.166264] CR2: 00000001004865d0 CR3: 000000406d796000 CR4: 0000000000350ee0
[ 262.166267] Call Trace:
[ 262.166272] amdgpu_ras_do_recovery+0x130/0x290 [amdgpu]
[ 262.166529] ? psi_task_switch+0xd2/0x250
[ 262.166537] ? __switch_to+0x11d/0x460
[ 262.166542] ? __switch_to_asm+0x36/0x70
[ 262.166549] process_one_work+0x220/0x3c0
[ 262.166556] worker_thread+0x4d/0x3f0
[ 262.166560] ? process_one_work+0x3c0/0x3c0
[ 262.166563] kthread+0x12b/0x150
[ 262.166568] ? set_kthread_struct+0x40/0x40
[ 262.166571] ret_from_fork+0x22/0x30
Fixes: 6492e1b07c0339 ("drm/amdgpu: Unify ras block interface for each ras block")
Signed-off-by: yipechai <[email protected]>
Reviewed-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Create common amdgpu_umc_fill_error_record function for all versions
of UMC and clean up related codes.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
exists in ras_list"
This reverts commit df4f0041c6ef497e598a67e367db835489162754.
Xgmi ras initialization had been moved from .late_init to early_init,
the defect of repeated calling amdgpu_ras_register_ras_block had been
fixed, so revert this patch.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Remove repeated calls.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix the code style warnings in amdgpu_ras:
1. ERROR: space required before the open parenthesis '('.
2. WARNING: line length of xxx exceeds 100 columns.
3. ERROR: "foo* bar" should be "foo *bar".
4. WARNING: unnecessary whitespace before a quoted newline.
5. WARNING: space prohibited before semicolon.
6. WARNING: suspect code indent for conditional statements.
7. WARNING: braces {} are not necessary for single statement blocks.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Changed from v1:
remove unused brace
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Pull drm fixes from Daniel Vetter:
"drivers fixes:
- i915 fixes for ttm backend + one pm wakelock fix
- amdgpu fixes, fairly big pile of small things all over. Note this
doesn't yet containe the fixed version of the otg sync patch that
blew up
- small driver fixes: meson, sun4i, vga16fb probe fix
drm core fixes:
- cma-buf heap locking
- ttm compilation
- self refresh helper state check
- wrong error message in atomic helpers
- mipi-dbi buffer mapping"
* tag 'drm-next-2022-01-14' of git://anongit.freedesktop.org/drm/drm: (49 commits)
drm/mipi-dbi: Fix source-buffer address in mipi_dbi_buf_copy
drm: fix error found in some cases after the patch d1af5cd86997
drm/ttm: fix compilation on ARCH=um
dma-buf: cma_heap: Fix mutex locking section
video: vga16fb: Only probe for EGA and VGA 16 color graphic cards
drm/amdkfd: Fix ASIC name typos
drm/amdkfd: Fix DQM asserts on Hawaii
drm/amdgpu: Use correct VIEWPORT_DIMENSION for DCN2
drm/amd/pm: only send GmiPwrDnControl msg on master die (v3)
drm/amdgpu: use spin_lock_irqsave to avoid deadlock by local interrupt
drm/amdgpu: not return error on the init_apu_flags
drm/amdkfd: Use prange->update_list head for remove_list
drm/amdkfd: Use prange->list head for insert_list
drm/amdkfd: make SPDX License expression more sound
drm/amdkfd: Check for null pointer after calling kmemdup
drm/amd/display: invalid parameter check in dmub_hpd_callback
Revert "drm/amdgpu: Don't inherit GEM object VMAs in child process"
drm/amd/display: reset dcn31 SMU mailbox on failures
drm/amdkfd: use default_groups in kobj_type
drm/amdgpu: use default_groups in kobj_type
...
|
|
fix compile warning for ras_block_match_default
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use ARRAY_SIZE to get array length.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Eliminate the follow smatch warnings:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3504 amdgpu_device_init()
warn: inconsistent indenting
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1716
amdgpu_ras_error_status_query() warn: if statement not indented
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1058 amdgpu_ras_error_inject()
warn: inconsistent indenting
Reported-by: Abaci Robot <[email protected]>
Reviewed-by: Guchun Chen <[email protected]>
Signed-off-by: Yang Li <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Eliminate the following coccicheck warning:
./drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2725:16-17: Unneeded semicolon
Reported-by: Abaci Robot <[email protected]>
Reviewed-by: Guchun Chen <[email protected]>
Signed-off-by: Yang Li <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
in ras_list
No longer insert ras blocks into ras_list if it already exists in ras_list.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add ras supported check for register_ras_block.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Removed redundant ras code.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: John Clements <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
1. Move xgmi special error inject function from amdgpu_ras.c to xgmi block.
2. Support to use psp_ras_trigger_error as default error inject function in amdgpu_ras.c. If .ras_error_inject isn't defined in ras block, default error inject function will take effect.
v2: squash in warning fix (Alex)
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: John Clements <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
1.Modify sdma block to fit for the unified ras block data and ops.
2.Change amdgpu_sdma_ras_funcs to amdgpu_sdma_ras, and the corresponding variable name remove _funcs suffix.
3.Remove the const flag of sdma ras variable so that sdma ras block can be able to be inserted into amdgpu device ras block link list.
4.Invoke amdgpu_ras_register_ras_block function to register sdma ras block into amdgpu device ras block link list.
5.Remove the redundant code about sdma in amdgpu_ras.c after using the unified ras block.
6.Fill unified ras block .name .block .ras_late_init and .ras_fini for all of sdma versions. If .ras_late_init and .ras_fini had been defined by the selected sdma version, the defined functions will take effect; if not defined, default fill them with amdgpu_sdma_ras_late_init and amdgpu_sdma_ras_fini.
v2: squash in warning fix (Alex)
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: John Clements <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
1.Modify umc block to fit for the unified ras block data and ops.
2.Change amdgpu_umc_ras_funcs to amdgpu_umc_ras, and the corresponding variable name remove _funcs suffix.
3.Remove the const flag of umc ras variable so that umc ras block can be able to be inserted into amdgpu device ras block link list.
4.Invoke amdgpu_ras_register_ras_block function to register umc ras block into amdgpu device ras block link list.
5.Remove the redundant code about umc in amdgpu_ras.c after using the unified ras block.
6.Fill unified ras block .name .block .ras_late_init and .ras_fini for all of umc versions. If .ras_late_init and .ras_fini had been defined by the selected umc version, the defined functions will take effect; if not defined, default fill them with amdgpu_umc_ras_late_init and amdgpu_umc_ras_fini.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: John Clements <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
1.Modify nbio block to fit for the unified ras block data and ops.
2.Change amdgpu_nbio_ras_funcs to amdgpu_nbio_ras, and the corresponding variable name remove _funcs suffix.
3.Remove the const flag of mmhub ras variable so that nbio ras block can be able to be inserted into amdgpu device ras block link list.
4.Invoke amdgpu_ras_register_ras_block function to register nbio ras block into amdgpu device ras block link list.
5.Remove the redundant code about nbio in amdgpu_ras.c after using the unified ras block.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: John Clements <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|