diff options
author | Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> | 2024-03-18 08:49:24 -0700 |
---|---|---|
committer | Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> | 2024-03-20 14:13:58 -0700 |
commit | 649a125a88da64a66b0836cb7998bb433bbf1bf5 (patch) | |
tree | 7309d5ed751216aefa6dea1cd81244b4392c8ffb /drivers/gpu/drm/xe/xe_devcoredump.c | |
parent | 0267ee1914d21555e8e8817b32f2d07d8bf58cac (diff) |
drm/xe: Always check force_wake_get return code
A force_wake_get failure means that the HW might not be awake for the
access we're doing; this can lead to an immediate error or it can be a
more subtle problem (e.g. a register read might return an incorrect
value that is still valid, leading the driver to make a wrong choice
instead of flagging an error).
We avoid an error from the force_wake function because callers might
handle or tolerate the error, but this only works if all callers
are checking the error code. The majority already do, but a few are not.
These are mainly falling into 3 categories, which are each handled
differently:
1) error capture: in this case we want to continue the capture, but we
log an info message in dmesg to notify the user that the capture
might have incorrect data.
2) ioctl: in this case we return a -EIO error to userspace
3) unabortable actions: these are scenarios where we can't simply abort
and retry and so it's better to just try it anyway because there is a
chance the HW is awake even with the failure. In this case we throw a
warning so we know there was a forcewake problem if something fails
down the line.
v2: use gt_WARN_ON where appropriate
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tejas Upadhyay <tejas.upadhyay@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240318154924.3453513-1-daniele.ceraolospurio@intel.com
Diffstat (limited to 'drivers/gpu/drm/xe/xe_devcoredump.c')
-rw-r--r-- | drivers/gpu/drm/xe/xe_devcoredump.c | 9 |
1 files changed, 7 insertions, 2 deletions
diff --git a/drivers/gpu/drm/xe/xe_devcoredump.c b/drivers/gpu/drm/xe/xe_devcoredump.c index 0fcd30680323..7d3aa6bd3524 100644 --- a/drivers/gpu/drm/xe/xe_devcoredump.c +++ b/drivers/gpu/drm/xe/xe_devcoredump.c @@ -13,6 +13,7 @@ #include "xe_exec_queue.h" #include "xe_force_wake.h" #include "xe_gt.h" +#include "xe_gt_printk.h" #include "xe_guc_ct.h" #include "xe_guc_submit.h" #include "xe_hw_engine.h" @@ -64,7 +65,9 @@ static void xe_devcoredump_deferred_snap_work(struct work_struct *work) { struct xe_devcoredump_snapshot *ss = container_of(work, typeof(*ss), work); - xe_force_wake_get(gt_to_fw(ss->gt), XE_FORCEWAKE_ALL); + /* keep going if fw fails as we still want to save the memory and SW data */ + if (xe_force_wake_get(gt_to_fw(ss->gt), XE_FORCEWAKE_ALL)) + xe_gt_info(ss->gt, "failed to get forcewake for coredump capture\n"); xe_vm_snapshot_capture_delayed(ss->vm); xe_guc_exec_queue_snapshot_capture_delayed(ss->ge); xe_force_wake_put(gt_to_fw(ss->gt), XE_FORCEWAKE_ALL); @@ -180,7 +183,9 @@ static void devcoredump_snapshot(struct xe_devcoredump *coredump, } } - xe_force_wake_get(gt_to_fw(q->gt), XE_FORCEWAKE_ALL); + /* keep going if fw fails as we still want to save the memory and SW data */ + if (xe_force_wake_get(gt_to_fw(q->gt), XE_FORCEWAKE_ALL)) + xe_gt_info(ss->gt, "failed to get forcewake for coredump capture\n"); coredump->snapshot.ct = xe_guc_ct_snapshot_capture(&guc->ct, true); coredump->snapshot.ge = xe_guc_exec_queue_snapshot_capture(job); |