aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/scripts/python/bin
diff options
context:
space:
mode:
authorTvrtko Ursulin <[email protected]>2016-03-17 12:59:46 +0000
committerTvrtko Ursulin <[email protected]>2016-03-18 10:25:56 +0000
commit26720ab97feac7153a7b5c3c79cf5d53a8531126 (patch)
treef76e05c8fd731ca54fa6fa021684fecf5faad582 /tools/perf/scripts/python/bin
parent39dabecd991b0a914f044af5824774825fb0923e (diff)
drm/i915: Move CSB MMIO reads out of the execlists lock
By reading the CSB (slow MMIO accesses) into a temporary local buffer we can decrease the duration of holding the execlist lock. Main advantage is that during heavy batch buffer submission we reduce the execlist lock contention, which should decrease the latency and CPU usage between the submitting userspace process and interrupt handling. Downside is that we need to grab and relase the forcewake twice, but as the below numbers will show this is completely hidden by the primary gains. Testing with "gem_latency -n 100" (submit batch buffers with a hundred nops each) shows more than doubling of the throughput and more than halving of the dispatch latency, overall latency and CPU time spend in the submitting process. Submitting empty batches ("gem_latency -n 0") does not seem significantly affected by this change with throughput and CPU time improving by half a percent, and overall latency worsening by the same amount. Above tests were done in a hundred runs on a big core Broadwell. v2: * Overflow protection to local CSB buffer. * Use closer dev_priv in execlists_submit_requests. (Chris Wilson) v3: Rebase. v4: Added commend about irq needed to be disabled in execlists_submit_request. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <[email protected]> Cc: Chris Wilson <[email protected]> Reviewed-by: Chris Wilsno <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
Diffstat (limited to 'tools/perf/scripts/python/bin')
0 files changed, 0 insertions, 0 deletions