diff options
author | Matthew Brost <matthew.brost@intel.com> | 2023-10-30 20:24:36 -0700 |
---|---|---|
committer | Luben Tuikov <ltuikov89@gmail.com> | 2023-11-01 17:29:21 -0400 |
commit | a6149f0393699308fb00149be913044977bceb56 (patch) | |
tree | 893c8d5e70a5d877e7bf9a787e7b07702a8ef121 /include/drm | |
parent | 35963cf2cd25eeea8bdb4d02853dac1e66fb13a0 (diff) |
drm/sched: Convert drm scheduler to use a work queue rather than kthread
In Xe, the new Intel GPU driver, a choice has made to have a 1 to 1
mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
seems a bit odd but let us explain the reasoning below.
1. In Xe the submission order from multiple drm_sched_entity is not
guaranteed to be the same completion even if targeting the same hardware
engine. This is because in Xe we have a firmware scheduler, the GuC,
which allowed to reorder, timeslice, and preempt submissions. If a using
shared drm_gpu_scheduler across multiple drm_sched_entity, the TDR falls
apart as the TDR expects submission order == completion order. Using a
dedicated drm_gpu_scheduler per drm_sched_entity solve this problem.
2. In Xe submissions are done via programming a ring buffer (circular
buffer), a drm_gpu_scheduler provides a limit on number of jobs, if the
limit of number jobs is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow
control on the ring for free.
A problem with this design is currently a drm_gpu_scheduler uses a
kthread for submission / job cleanup. This doesn't scale if a large
number of drm_gpu_scheduler are used. To work around the scaling issue,
use a worker rather than kthread for submission / job cleanup.
v2:
- (Rob Clark) Fix msm build
- Pass in run work queue
v3:
- (Boris) don't have loop in worker
v4:
- (Tvrtko) break out submit ready, stop, start helpers into own patch
v5:
- (Boris) default to ordered work queue
v6:
- (Luben / checkpatch) fix alignment in msm_ringbuffer.c
- (Luben) s/drm_sched_submit_queue/drm_sched_wqueue_enqueue
- (Luben) Update comment for drm_sched_wqueue_enqueue
- (Luben) Positive check for submit_wq in drm_sched_init
- (Luben) s/alloc_submit_wq/own_submit_wq
v7:
- (Luben) s/drm_sched_wqueue_enqueue/drm_sched_run_job_queue
v8:
- (Luben) Adjust var names / comments
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://lore.kernel.org/r/20231031032439.1558703-3-matthew.brost@intel.com
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
Diffstat (limited to 'include/drm')
-rw-r--r-- | include/drm/gpu_scheduler.h | 14 |
1 files changed, 9 insertions, 5 deletions
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h index 7e622c18d336..4bf94aca2631 100644 --- a/include/drm/gpu_scheduler.h +++ b/include/drm/gpu_scheduler.h @@ -475,17 +475,16 @@ struct drm_sched_backend_ops { * @num_rqs: Number of run-queues. This is at most DRM_SCHED_PRIORITY_COUNT, * as there's usually one run-queue per priority, but could be less. * @sched_rq: An allocated array of run-queues of size @num_rqs; - * @wake_up_worker: the wait queue on which the scheduler sleeps until a job - * is ready to be scheduled. * @job_scheduled: once @drm_sched_entity_do_release is called the scheduler * waits on this wait queue until all the scheduled jobs are * finished. * @hw_rq_count: the number of jobs currently in the hardware queue. * @job_id_count: used to assign unique id to the each job. + * @submit_wq: workqueue used to queue @work_run_job * @timeout_wq: workqueue used to queue @work_tdr + * @work_run_job: work which calls run_job op of each scheduler. * @work_tdr: schedules a delayed call to @drm_sched_job_timedout after the * timeout interval is over. - * @thread: the kthread on which the scheduler which run. * @pending_list: the list of jobs which are currently in the job queue. * @job_list_lock: lock to protect the pending_list. * @hang_limit: once the hangs by a job crosses this limit then it is marked @@ -494,6 +493,8 @@ struct drm_sched_backend_ops { * @_score: score used when the driver doesn't provide one * @ready: marks if the underlying HW is ready to work * @free_guilty: A hit to time out handler to free the guilty job. + * @pause_submit: pause queuing of @work_run_job on @submit_wq + * @own_submit_wq: scheduler owns allocation of @submit_wq * @dev: system &struct device * * One scheduler is implemented for each hardware ring. @@ -505,13 +506,13 @@ struct drm_gpu_scheduler { const char *name; u32 num_rqs; struct drm_sched_rq **sched_rq; - wait_queue_head_t wake_up_worker; wait_queue_head_t job_scheduled; atomic_t hw_rq_count; atomic64_t job_id_count; + struct workqueue_struct *submit_wq; struct workqueue_struct *timeout_wq; + struct work_struct work_run_job; struct delayed_work work_tdr; - struct task_struct *thread; struct list_head pending_list; spinlock_t job_list_lock; int hang_limit; @@ -519,11 +520,14 @@ struct drm_gpu_scheduler { atomic_t _score; bool ready; bool free_guilty; + bool pause_submit; + bool own_submit_wq; struct device *dev; }; int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_backend_ops *ops, + struct workqueue_struct *submit_wq, u32 num_rqs, uint32_t hw_submission, unsigned int hang_limit, long timeout, struct workqueue_struct *timeout_wq, atomic_t *score, const char *name, struct device *dev); |