diff options
author | Chris Wilson <[email protected]> | 2015-12-11 11:32:58 +0000 |
---|---|---|
committer | Jani Nikula <[email protected]> | 2015-12-22 12:55:50 +0200 |
commit | f87a780f07b22b6dc4642dbaf44af65112076cb8 (patch) | |
tree | 6bdd9ac46bcad78ef80d119c55375cdcb2ad9f57 /drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c | |
parent | e7571f7fd66c77a760338340adbe41d994fe93ac (diff) |
drm/i915: Limit the busy wait on requests to 5us not 10ms!
When waiting for high frequency requests, the finite amount of time
required to set up the irq and wait upon it limits the response rate. By
busywaiting on the request completion for a short while we can service
the high frequency waits as quick as possible. However, if it is a slow
request, we want to sleep as quickly as possible. The tradeoff between
waiting and sleeping is roughly the time it takes to sleep on a request,
on the order of a microsecond. Based on measurements of synchronous
workloads from across big core and little atom, I have set the limit for
busywaiting as 10 microseconds. In most of the synchronous cases, we can
reduce the limit down to as little as 2 miscroseconds, but that leaves
quite a few test cases regressing by factors of 3 and more.
The code currently uses the jiffie clock, but that is far too coarse (on
the order of 10 milliseconds) and results in poor interactivity as the
CPU ends up being hogged by slow requests. To get microsecond resolution
we need to use a high resolution timer. The cheapest of which is polling
local_clock(), but that is only valid on the same CPU. If we switch CPUs
because the task was preempted, we can also use that as an indicator that
the system is too busy to waste cycles on spinning and we should sleep
instead.
__i915_spin_request was introduced in
commit 2def4ad99befa25775dd2f714fdd4d92faec6e34 [v4.2]
Author: Chris Wilson <[email protected]>
Date: Tue Apr 7 16:20:41 2015 +0100
drm/i915: Optimistically spin for the request completion
v2: Drop full u64 for unsigned long - the timer is 32bit wraparound safe,
so we can use native register sizes on smaller architectures. Mention
the approximate microseconds units for elapsed time and add some extra
comments describing the reason for busywaiting.
v3: Raise the limit to 10us
v4: Now 5us.
Reported-by: Jens Axboe <[email protected]>
Link: https://lkml.org/lkml/2015/11/12/621
Reviewed-by: Tvrtko Ursulin <[email protected]>
Cc: "Rogozhkin, Dmitry V" <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Eero Tamminen <[email protected]>
Cc: "Rantala, Valtteri" <[email protected]>
Cc: [email protected]
Signed-off-by: Daniel Vetter <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit ca5b721e238226af1d767103ac852aeb8e4c0764)
Signed-off-by: Jani Nikula <[email protected]>
Diffstat (limited to 'drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c')
0 files changed, 0 insertions, 0 deletions