path: root/drivers/gpu
Age  Commit message  Author  Files  Lines
2015-12-08  drm/i915: Leave FDI running after failed link training on LPT-H  Ville Syrjälä  1  -9/+15
Currently we disable some parts of FDI setup after a failed link training. But despite that we continue with the modeset as if everything is fine. This results in tons of noise from the state checker, and it means we're not following the proper modeset sequence for the rest of crtc enabling, nor for crtc disabling. Ideally we should abort the modeset and follow the proper disable sequence to shut off everything we enabled so far, but that would require a big rework of the modeset code. So instead just leave FDI up and running in its untrained state, and log an error. This is what we do on older platforms too. v2: Fix a typo in the commit message Signed-off-by: Ville Syrjälä <[email protected]> Reviewed-by: Paulo Zanoni <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2015-12-08  drm/i915: Disable LPT-H VGA dotclock during crtc disable  Ville Syrjälä  1  -0/+1
Currently we leave the LPT-H VGA dotclock running after turning off the pipe/fdi/port/etc. Properly disable the VGA dotclock as specified in the modeset sequence. v2: Fix commit message typo (Paulo) Signed-off-by: Ville Syrjälä <[email protected]> Reviewed-by: Paulo Zanoni <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2015-12-08  drm/i915: Refactor LPT-H VGA dotclock disabling  Ville Syrjälä  1  -14/+20
Extract the LPT-H VGA dotclock disable to a separate function in anticipation of further use. While at it move the sb_lock locking inwards when enabling the VGA dotclock, as it's only needed to protect the sideband accesses. v2: Keep the PIXCLK_GATE_GATE name for 0 (Paulo) Signed-off-by: Ville Syrjälä <[email protected]> Reviewed-by: Paulo Zanoni <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2015-12-08  drm/i915: Disable FDI after the CRT port on LPT-H  Ville Syrjälä  2  -7/+6
Bspec modeset sequence tells us to disable the PCH transcoder and FDI after the CRT port on LPT-H, so let's do that. And the CRT port should be disabled after the pipe, as we do on other PCH platforms too since commit 1ea56e269e13 ("drm/i915: Disable CRT port after pipe on PCH platforms"). Commit 00490c22b1b5 ("drm/i915: Consider SPLL as another shared pll, v2.") moved the SPLL disable from the .post_disable() hook to some upper level code, so we can just move the CRT port disabling into the .post_disable() hook. If we still had the non-shared SPLL, it would have needed to be moved into the .post_pll_disable() hook. v2: Actually move the CRT port disable to the .post_disable() hook, and amend the commit message with more details (Paulo) v3: Fix typos in commit message (Paulo) Cc: Paulo Zanoni <[email protected]> Signed-off-by: Ville Syrjälä <[email protected]> Reviewed-by: Paulo Zanoni <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2015-12-08  drm/i915: Round to closest when computing the VGA dotclock for LPT-H  Ville Syrjälä  1  -1/+1
Bspec says we should round to closest when computing the LPT-H VGA dotclock, so let's do that. v2: Fix typo in commit message (Paulo) Signed-off-by: Ville Syrjälä <[email protected]> Reviewed-by: Paulo Zanoni <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
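To illustrate why the rounding mode matters, here is a minimal sketch in plain C. Both helper names are hypothetical, and the second one only mirrors the idea behind the kernel's DIV_ROUND_CLOSEST() macro for non-negative operands; it is not the driver's actual dotclock computation.

```c
#include <stdint.h>

/* Plain integer division truncates toward zero. */
uint32_t div_truncate(uint32_t n, uint32_t d)
{
	return n / d;
}

/*
 * Round-to-closest division for a positive divisor: adding half of the
 * divisor before dividing rounds fractions of one half and above upward.
 * Hypothetical helper; assumes n + d / 2 does not overflow 32 bits.
 */
uint32_t div_round_closest(uint32_t n, uint32_t d)
{
	return (n + d / 2) / d;
}
```

With these helpers, 7 / 2 truncates to 3 but rounds to 4, so a truncating computation can end up one step below the nearest representable value.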
2015-12-08  drm/i915: Disable CLKOUT_DP bending on LPT/WPT as needed  Ville Syrjälä  2  -2/+67
When we want to use SPLL for FDI we want SSC, which means we have to disable clock bending for the PCH SSC reference (bend and spread are mutually exclusive). So let's turn off bending when we want spread. In case the BIOS enabled clock bending for some reason we'll just turn it off and enable the spread mode instead. Not sure what happens if the BIOS is actually using the bend source for HDMI at this time, but I suppose it should be no worse than what already happens when we simply turn on the spread. We don't currently use the bend source for anything, and only use the PCH SSC reference for the SPLL to drive FDI (always with spread). v2: Fix the %5 vs %10 fumble for SSCDITHPHASE (Paulo) Add 'WARN_ON(steps % 5 != 0)' sanity check (Paulo) Fix typos in commit message (Paulo) Cc: Paulo Zanoni <[email protected]> Signed-off-by: Ville Syrjälä <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Paulo Zanoni <[email protected]>
2015-12-08  drm/i915/skl: Double RC6 WRL always on  Mika Kuoppala  1  -2/+1
WaRsDoubleRc6WrlWithCoarsePowerGating should be enabled for all Skylakes. Make it so. Cc: Sagar Arun Kamble <[email protected]> Signed-off-by: Mika Kuoppala <[email protected]> Reviewed-by: Sagar Arun Kamble <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit e7674b8c31717dd0c58b3a9493d43249722071eb) Cc: [email protected] # v4.3+ Signed-off-by: Jani Nikula <[email protected]>
2015-12-08  drm/i915/skl: Disable coarse power gating up until F0  Mika Kuoppala  1  -1/+1
There is conflicting info between the E0 and F0 steppings for this workaround. Trust the more authoritative source, be conservative, and extend it to F0 as well. This prevents numerous (>50) GPU hangs with SKL GT4e during a piglit run. References: HSD: gen9lp/2134184 Cc: Sagar Arun Kamble <[email protected]> Signed-off-by: Mika Kuoppala <[email protected]> Reviewed-by: Sagar Arun Kamble <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit 6686ece19f7446f0e29c77d9e0402e1d0ce10c48) Cc: [email protected] # v4.3+ Signed-off-by: Jani Nikula <[email protected]>
2015-12-08  exynos: fixes an incorrect header guard  Ashley Towns  1  -1/+1
in the exynos gpu driver where the preprocessor #ifndef/#define variables were mismatched. Signed-off-by: Ashley Towns <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
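For readers unfamiliar with this class of bug, the following is a minimal sketch of a correctly matched header guard. It simulates including the same header twice within one file; all names are hypothetical and unrelated to the actual exynos header.

```c
/* First "inclusion" of a hypothetical header. */
#ifndef EXAMPLE_DRM_H
#define EXAMPLE_DRM_H /* must be the same symbol the #ifndef tests */
struct example_state {
	int enabled;
};
#endif /* EXAMPLE_DRM_H */

/*
 * Second "inclusion": skipped entirely, because EXAMPLE_DRM_H is now
 * defined. Had the #define above used a different symbol, the struct
 * would be declared again here and compilation would fail.
 */
#ifndef EXAMPLE_DRM_H
#define EXAMPLE_DRM_H
struct example_state {
	int enabled;
};
#endif /* EXAMPLE_DRM_H */

/* Small user of the type, so the sketch has observable behaviour. */
int example_is_enabled(const struct example_state *s)
{
	return s->enabled != 0;
}
```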
2015-12-08  drm/i915/skl: Double RC6 WRL always on  Mika Kuoppala  1  -2/+1
WaRsDoubleRc6WrlWithCoarsePowerGating should be enabled for all Skylakes. Make it so. Cc: Sagar Arun Kamble <[email protected]> Signed-off-by: Mika Kuoppala <[email protected]> Reviewed-by: Sagar Arun Kamble <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Jani Nikula <[email protected]>
2015-12-08  drm/i915/skl: Disable coarse power gating up until F0  Mika Kuoppala  1  -1/+1
There is conflicting info between the E0 and F0 steppings for this workaround. Trust the more authoritative source, be conservative, and extend it to F0 as well. This prevents numerous (>50) GPU hangs with SKL GT4e during a piglit run. References: HSD: gen9lp/2134184 Cc: Sagar Arun Kamble <[email protected]> Signed-off-by: Mika Kuoppala <[email protected]> Reviewed-by: Sagar Arun Kamble <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Jani Nikula <[email protected]>
2015-12-08  drm/vmwgfx: Implement the cursor_set2 callback v2  Thomas Hellstrom  7  -18/+61
Fixes native drm clients like Fedora 23 Wayland which now appears to be able to use cursor hotspots without strange cursor offsets. Also fixes a couple of ignored error paths. Since the core drm cursor hotspot is incompatible with the legacy vmwgfx hotspot (the core drm hotspot is reset when the drm_mode_cursor ioctl is used), we need to keep track of both and add them when the device hotspot is set. We assume that either is always zero. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Sinclair Yeh <[email protected]>
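The "track both and add them" approach described above can be sketched as follows. This is only an illustration under the stated assumption that one of the two hotspots is always zero; the type and function names are hypothetical and not taken from the vmwgfx driver.

```c
/* Hypothetical type, for illustration only. */
struct hotspot {
	int x;
	int y;
};

/*
 * Combine the legacy (driver-specific) hotspot with the core drm hotspot.
 * Because one of the two is assumed to be zero at any given time, the sum
 * equals whichever hotspot is actually in use.
 */
struct hotspot device_hotspot(struct hotspot legacy, struct hotspot core)
{
	struct hotspot sum = { legacy.x + core.x, legacy.y + core.y };

	return sum;
}
```

Adding the two values avoids having to know which interface the client used last, at the cost of relying on the "either is always zero" assumption.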
2015-12-08  drm/i915: Remove incorrect warning in context cleanup  Tvrtko Ursulin  1  -2/+0
Commit e9f24d5fb7cf3628b195b18ff3ac4e37937ceeae Author: Tvrtko Ursulin <[email protected]> Date: Mon Oct 5 13:26:36 2015 +0100 drm/i915: Clean up associated VMAs on context destruction Added a warning based on an incorrect assumption that all VMAs in a VM will be on the inactive list at the point last reference to a context and VM is dropped. This is not true because i915_gem_object_retire__read will not put VMA on the inactive list until all activities on the object in question (in all VMs) have been retired. As a consequence, whether or not a context/VM will be destroyed with its VMAs still on the active list, can depend on completely unrelated activities using the same object from a different context or engine. Signed-off-by: Tvrtko Ursulin <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92638 Testcase: igt/gem_request_retire/retire-vma-not-inactive Cc: Daniel Vetter <[email protected]> Cc: Chris Wilson <[email protected]> Cc: Michel Thierry <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Daniel Vetter <[email protected]> (cherry picked from commit 408952d43b27a54437244c56c0e0d8efa5607926) Signed-off-by: Jani Nikula <[email protected]>
2015-12-07  drm/vc4: Add an interface for capturing the GPU state after a hang.  Eric Anholt  3  -0/+191
This can be parsed with vc4-gpu-tools tools for trying to figure out what was going on. v2: Use __u32-style types. Signed-off-by: Eric Anholt <[email protected]>
2015-12-07  drm/vc4: Add support for async pageflips.  Eric Anholt  5  -2/+342
An async pageflip stores the modeset to be done and executes it once the BOs are ready to be displayed. This gets us about 3x performance in full screen rendering with pageflipping. Signed-off-by: Eric Anholt <[email protected]>
2015-12-07  drm/vc4: Add support for drawing 3D frames.  Eric Anholt  11  -1/+3102
The user submission is basically a pointer to a command list and a pointer to uniforms. We copy those in to the kernel, validate and relocate them, and store the result in a GPU BO which we queue for execution. v2: Drop support for NV shader recs (not necessary for GL), simplify vc4_use_bo(), improve bin flush/semaphore checks, use __u32 style types. Signed-off-by: Eric Anholt <[email protected]>
2015-12-07  drm/vc4: Bind and initialize the V3D engine.  Eric Anholt  5  -0/+242
This is the component of the GPU that does 3D rendering. Signed-off-by: Eric Anholt <[email protected]>
2015-12-07  drm/vc4: Fix a typo in a V3D debug register.  Eric Anholt  1  -1/+1
Signed-off-by: Eric Anholt <[email protected]>
2015-12-07  drm/vc4: Add an API for creating GPU shaders in GEM BOs.  Eric Anholt  6  -5/+974
Since we have no MMU, the kernel needs to validate that the submitted shader code won't make any accesses to memory that the user doesn't control, which involves banning some operations (general purpose DMA writes), and tracking where we need to write out pointers for other operations (texture sampling). Once it's validated, we return a GEM BO containing the shader, which doesn't allow mapping for write or exporting to other subsystems. v2: Use __u32-style types. Signed-off-by: Eric Anholt <[email protected]>
2015-12-07  drm/vc4: Add create and map BO ioctls.  Eric Anholt  3  -0/+48
While there exist dumb APIs for creating and mapping BOs, one of the rules is that drivers doing 3D acceleration have to provide their own APIs for buffer allocation (besides, the pitch/height parameters of the dumb alloc don't really make sense for a lot of 3D allocations). v2: Use __u32-style types, use "drm.h" instead of <drm/drm.h>. Signed-off-by: Eric Anholt <[email protected]>
2015-12-07  drm/vc4: Add a BO cache.  Eric Anholt  4  -8/+384
We need to allocate new BOs in the kernel as part of each frame, but the CMA allocator is way too slow for that. As an optimization, keep track of recently-freed BOs and reuse them, with a 1 second timeout to fully free them back to the system. This improves 3D performance by about 15%. Signed-off-by: Eric Anholt <[email protected]>
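The caching idea described above (park recently freed buffers, reuse them on the next allocation, release them for real after a timeout) can be sketched in a few lines of plain C. This is a deliberately simplified model, not the vc4 implementation: the fixed slot array, the exact-size matching and all names are assumptions made for illustration, and the caller supplies the current time in milliseconds.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS      4    /* arbitrary capacity for this sketch */
#define CACHE_TIMEOUT_MS 1000 /* the 1 second timeout from the commit message */

struct cached_bo {
	size_t   size;        /* size of the parked buffer */
	uint64_t freed_at_ms; /* time at which it was parked */
	int      occupied;    /* non-zero if this slot holds a buffer */
};

struct bo_cache {
	struct cached_bo slots[CACHE_SLOTS];
};

/* Park a freed buffer. Returns 0 on success, -1 if the cache is full. */
int bo_cache_put(struct bo_cache *c, size_t size, uint64_t now_ms)
{
	for (int i = 0; i < CACHE_SLOTS; i++) {
		if (!c->slots[i].occupied) {
			c->slots[i].size = size;
			c->slots[i].freed_at_ms = now_ms;
			c->slots[i].occupied = 1;
			return 0;
		}
	}
	return -1;
}

/* Try to reuse a parked buffer of exactly this size. Returns 1 on a hit. */
int bo_cache_get(struct bo_cache *c, size_t size)
{
	for (int i = 0; i < CACHE_SLOTS; i++) {
		if (c->slots[i].occupied && c->slots[i].size == size) {
			c->slots[i].occupied = 0;
			return 1;
		}
	}
	return 0;
}

/* Drop entries parked for at least the timeout. Returns how many were dropped. */
int bo_cache_expire(struct bo_cache *c, uint64_t now_ms)
{
	int dropped = 0;

	for (int i = 0; i < CACHE_SLOTS; i++) {
		if (c->slots[i].occupied &&
		    now_ms - c->slots[i].freed_at_ms >= CACHE_TIMEOUT_MS) {
			c->slots[i].occupied = 0;
			dropped++;
		}
	}
	return dropped;
}
```

In a real driver the expiry step would be driven by a timer and the "drop" would return memory to the allocator; here it only clears the slot so the control flow stays visible.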
2015-12-07  drm: Create a driver hook for allocating GEM object structs.  Eric Anholt  1  -4/+6
The CMA helpers had no way for a driver to extend the struct with its own fields. Since the CMA helpers are mostly "Allocate a drm_gem_cma_object, then fill in a few fields", it's hard to write as pure helpers without passing in a driver callback for the allocate step. Signed-off-by: Eric Anholt <[email protected]> Reviewed-by: Daniel Vetter <[email protected]>
2015-12-08  Back merge tag 'v4.4-rc4' into drm-next  Dave Airlie  49  -255/+641
We've picked up a few conflicts and it would be nice to resolve them before we move onwards.
2015-12-07  drm/i915: Fix idle_frames counter.  Rodrigo Vivi  1  -13/+7
'commit 97173eaf5 ("drm/i915: PSR: Increase idle_frames")' was a mistake. The special case it tried to cover was already being covered by the DP_PSR_NO_TRAIN_ON_EXIT. So this ended up duplicated. So, instead of reverting that let's take this opportunity and unify the idle_frame definition in a single place so we standardize the access and avoid room for that same mistake again. A few changes with this patch: 1. Instead of just respecting the VBT we set a global minimum with max(). So we are sure that we will avoid corner cases in case VBT is doing something we don't understand. 2. Instead of minimum 5 we use 6. When introducing the idle_frames += 4 case we considered that minimum was 2. All because of the off-by-one issue. v2: Unified idle_frame definition. Cc: Paulo Zanoni <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]> Reviewed-by: Paulo Zanoni <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
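The "global minimum with max()" idea from point 1 can be sketched as below. The constant and function names are hypothetical; only the value 6 and the use of a maximum come from the commit message, and the real driver code may be structured differently.

```c
/* Hypothetical constant name; the value 6 comes from the commit message. */
#define PSR_MIN_IDLE_FRAMES 6u

/*
 * Clamp the VBT-provided value to a global minimum, which is what
 * max(minimum, vbt_value) expresses.
 */
unsigned int psr_idle_frames(unsigned int vbt_idle_frames)
{
	return vbt_idle_frames > PSR_MIN_IDLE_FRAMES ?
	       vbt_idle_frames : PSR_MIN_IDLE_FRAMES;
}
```

A VBT value of 0 or 2 therefore yields 6, while a larger VBT value such as 10 is passed through unchanged.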
2015-12-07  drm/i915: Remove double wait_for_vblank on broadwell.  Maarten Lankhorst  1  -8/+0
wait_vblank is already set in intel_plane_atomic_calc_changes for broadwell, waiting for a double vblank is overkill. Signed-off-by: Maarten Lankhorst <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/1447945645-32005-5-git-send-email-maarten.lankhorst@linux.intel.com
2015-12-07  drm/i915/skl: Update watermarks before the crtc is disabled.  Maarten Lankhorst  1  -1/+4
On skylake some of the registers are only writable when the correct power wells are enabled. Because of this watermarks have to be updated before the crtc turns off, or you get unclaimed register read and write warnings. This patch needs to be modified slightly to apply to -fixes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92181 Signed-off-by: Maarten Lankhorst <[email protected]> Cc: [email protected] Cc: Matt Roper <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/1447945645-32005-4-git-send-email-maarten.lankhorst@linux.intel.com Reviewed-by: Ander Conselvan de Oliveira <[email protected]>
2015-12-07  drm/i915: Calculate watermark related members in the crtc_state, v4.  Maarten Lankhorst  3  -21/+21
This removes pre/post_wm_update from intel_crtc->atomic, and creates atomic state for it in intel_crtc. Changes since v1: - Rebase on top of wm changes. Changes since v2: - Split disable_cxsr into a separate patch. Changes since v3: - Move some of the changes to intel_wm_need_update. Signed-off-by: Maarten Lankhorst <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Daniel Vetter <[email protected]>
2015-12-07  drm/i915: Move disable_cxsr to the crtc_state.  Maarten Lankhorst  3  -7/+10
intel_crtc->atomic will be removed later on, move this member to intel_crtc_state. Signed-off-by: Maarten Lankhorst <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/1447945645-32005-2-git-send-email-maarten.lankhorst@linux.intel.com Reviewed-by: Ander Conselvan de Oliveira <[email protected]>
2015-12-07  Merge tag 'topic/drm-misc-2015-12-04' of git://anongit.freedesktop.org/drm-intel into drm-next  Dave Airlie  13  -95/+431
New -misc pull. Big thing is Thierry's atomic helpers for system suspend resume, which I'd like to use in i915 too. Hence the pull.
* tag 'topic/drm-misc-2015-12-04' of git://anongit.freedesktop.org/drm-intel:
  drm: keep connector status change logging human readable
  drm/atomic-helper: Reject attempts at re-stealing encoders
  drm/atomic-helper: Implement subsystem-level suspend/resume
  drm: Implement drm_modeset_lock_all_ctx()
  drm/gma500: Add driver private mutex for the fault handler
  drm/gma500: Drop dev->struct_mutex from mmap offset function
  drm/gma500: Drop dev->struct_mutex from fbdev init/teardown code
  drm/gma500: Drop dev->struct_mutex from modeset code
  drm/gma500: Use correct unref in the gem bo create function
  drm/edid: Make the detailed timing CEA/HDMI mode fixup accept up to 5kHz clock difference
  drm/atomic_helper: Add drm_atomic_helper_disable_planes_on_crtc()
  drm: Serialise multiple event readers
  drm: Drop dev->event_lock spinlock around faulting copy_to_user()
2015-12-07  i915: Replace "hweight8(dev_priv->info.subslice_7eu[i]) != 1" with "!is_power_of_2(dev_priv->info.subslice_7eu[i])"  Zeng Zhaoxiu  1  -1/+2
Signed-off-by: Zeng Zhaoxiu <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
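The two conditions in the subject line are equivalent for 8-bit values, which a small standalone check can confirm. The helpers below are portable stand-ins written for this sketch (popcount8() plays the role of the kernel's hweight8(), and is_pow2() follows the usual "non-zero with a single bit set" definition); they are not the kernel implementations.

```c
#include <stdbool.h>
#include <stdint.h>

/* Count the set bits in an 8-bit value (stand-in for hweight8()). */
unsigned int popcount8(uint8_t v)
{
	unsigned int count = 0;

	while (v) {
		v = (uint8_t)(v & (v - 1)); /* clear the lowest set bit */
		count++;
	}
	return count;
}

/* True if n is non-zero and has exactly one bit set. */
bool is_pow2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Check that "weight != 1" and "!is_pow2" agree for every 8-bit value. */
bool checks_agree(void)
{
	for (unsigned int v = 0; v < 256; v++) {
		bool by_weight = popcount8((uint8_t)v) != 1;
		bool by_pow2 = !is_pow2(v);

		if (by_weight != by_pow2)
			return false;
	}
	return true;
}
```

Note that zero is handled consistently by both forms: it has weight 0 and is not a power of two.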
2015-12-06  vgaarb: remove bogus checks  Al Viro  1  -7/+0
neither ->release() nor ->poll() can be called unless ->open() has succeeded on the same struct file, so checking for "has open() failed" is pointless. What's more, ->poll() doesn't return -E... - it always returns a bitmap of POLL... values, so the dead code in that one had been actively bogus. Signed-off-by: Al Viro <[email protected]>
2015-12-05  drm/armada: use a private mutex to protect priv->linear  Daniel Vetter  4  -8/+14
Reusing the Big DRM Lock just leaks, and the few things left that dev->struct_mutex protected are very well contained - it's just the linear drm_mm manager. With this armada is completely struct_mutex free! v2: Convert things properly and also take the lock in armada_gem_free_object, and remove the stale comment (Russell). Signed-off-by: Daniel Vetter <[email protected]> Signed-off-by: Russell King <[email protected]>
2015-12-05  drm/armada: drop struct_mutex from cursor paths  Daniel Vetter  1  -6/+1
The kms state itself is already protected by the modeset locks acquired by the drm core. The only thing left is gem bo state, and since the cursor code expects small objects which are statically mapped at create time and then invariant over the lifetime of the gem bo there's nothing to protect. See armada_gem_dumb_create -> armada_gem_linear_back which assigns obj->addr which is the only thing used by the cursor code. Only tricky bit is to switch to the _unlocked unreference function. Signed-off-by: Daniel Vetter <[email protected]> Signed-off-by: Russell King <[email protected]>
2015-12-05  Merge branch 'drm-fixes-4.4' of git://people.freedesktop.org/~agd5f/linux into drm-next  Dave Airlie  30  -91/+387
A few more last minute fixes for 4.4 on top of my pull request from earlier this week. The big change here is a vblank regression fix due to commit 4dfd6486 "drm: Use vblank timestamps to guesstimate how many vblanks were missed". Beyond that, a hotplug fix and a few VM fixes.
* 'drm-fixes-4.4' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v3)
  drm/radeon: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v2)
  drm/radeon: Retry DDC probing on DVI on failure if we got an HPD interrupt
  drm/amdgpu: add spin lock to protect freed list in vm (v2)
  drm/amdgpu: partially revert "drm/amdgpu: fix VM_CONTEXT*_PAGE_TABLE_END_ADDR" v2
  drm/amdgpu: take a BO reference for the user fence
  drm/amdgpu: take a BO reference in the display code
  drm/amdgpu: set snooped flags only on system addresses v2
  drm/amdgpu: fix race condition in amd_sched_entity_push_job
  drm/amdgpu: add err check for pin userptr
  add blacklist for thinkpad T40p
  drm/amdgpu: fix VM page table reference counting
  drm/amdgpu: fix userptr flags check
2015-12-04  drm/i915: Update DRIVER_DATE to 20151204  Daniel Vetter  1  -1/+1
Signed-off-by: Daniel Vetter <[email protected]>
2015-12-04  drm/amdgpu: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v3)  Alex Deucher  6  -30/+140
commit 4dfd6486 "drm: Use vblank timestamps to guesstimate how many vblanks were missed" introduced in Linux 4.4-rc1 makes the drm core more fragile to drivers which don't update hw vblank counters and vblank timestamps in sync with firing of the vblank irq and essentially at leading edge of vblank. This exposed a problem with radeon-kms/amdgpu-kms which do not satisfy above requirements: The vblank irq fires a few scanlines before start of vblank, but programmed pageflips complete at start of vblank and vblank timestamps update at start of vblank, whereas the hw vblank counter increments only later, at start of vsync. This leads to problems like off by one errors for vblank counter updates, vblank counters apparently going backwards or vblank timestamps apparently having time going backwards. The net result is stuttering of graphics in games, or little hangs, as well as total failure of timing sensitive applications. See bug #93147 for an example of the regression on Linux 4.4-rc: https://bugs.freedesktop.org/show_bug.cgi?id=93147 This patch tries to align all above events better from the viewpoint of the drm core / of external callers to fix the problem: 1. The apparent start of vblank is shifted a few scanlines earlier, so the vblank irq now always happens after start of this extended vblank interval and thereby drm_update_vblank_count() always samples the updated vblank count and timestamp of the new vblank interval. To achieve this, the reporting of scanout positions by radeon_get_crtc_scanoutpos() now operates as if the vblank starts radeon_crtc->lb_vblank_lead_lines before the real start of the hw vblank interval. This means that the vblank timestamps which are based on these scanout positions will now update at this earlier start of vblank. 2. 
The driver->get_vblank_counter() function will bump the returned vblank count as read from the hw by +1 if the query happens after the shifted earlier start of the vblank, but before the real hw increment at start of vsync, so the counter appears to increment at start of vblank in sync with the timestamp update. 3. Calls from vblank irq-context and regular non-irq calls are now treated identical, always simulating the shifted vblank start, to avoid inconsistent results for queries happening from vblank irq vs. happening from drm_vblank_enable() or vblank_disable_fn(). 4. The radeon_flip_work_func will delay mmio programming a pageflip until the start of the real vblank iff it happens to execute inside the shifted earlier start of the vblank, so pageflips now also appear to execute at start of the shifted vblank, in sync with vblank counter and timestamp updates. This to avoid some races between updates of vblank count and timestamps that are used for swap scheduling and pageflip execution which could cause pageflips to execute before the scheduled target vblank. The lb_vblank_lead_lines "fudge" value is calculated as the size of the display controllers line buffer in scanlines for the given video mode: Vblank irq's are triggered by the line buffer logic when the line buffer refill for a video frame ends, ie. when the line buffer source read position enters the hw vblank. This means that a vblank irq could fire at most as many scanlines before the current reported scanout position of the crtc timing generator as the number of scanlines the line buffer can maximally hold for a given video mode. This patch has been successfully tested on a RV730 card with DCE-3 display engine and on a evergreen card with DCE-4 display engine, in single-display and dual-display configuration, with different video modes. A similar patch is needed for amdgpu-kms to fix the same problem. 
Limitations: - Maybe replace the udelay() in the flip_work_func() by a suitable usleep_range() for a bit better efficiency? Will try that. - Line buffer sizes in pixels are hard-coded on < DCE-4 to a value i just guessed to be high enough to work ok, lacking info on the true sizes atm. Probably fixes: fdo#93147 Port of Mario's radeon fix to amdgpu. Signed-off-by: Alex Deucher <[email protected]> (v1) Reviewed-by: Mario Kleiner <[email protected]> (v2) Refine amdgpu_flip_work_func() for better efficiency. In amdgpu_flip_work_func, replace the busy waiting udelay(5) with event lock held by a more performance and energy efficient usleep_range() until at least predicted true start of hw vblank, with some slack for scheduler happiness. Release the event lock during waits to not delay other outputs in doing their stuff, as the waiting can last up to 200 usecs in some cases. Also small fix to code comment and formatting in that function. (v2) Signed-off-by: Mario Kleiner <[email protected]> (v3) Fix crash in crtc disabled case
2015-12-04  drm/i915/skl: Add SKL GT4 PCI IDs  Mika Kuoppala  1  -0/+1
Add Skylake Intel Graphics GT4 PCI IDs v2: Rebase Signed-off-by: Mika Kuoppala <[email protected]> Reviewed-by: Damien Lespiau <[email protected]> Signed-off-by: Damien Lespiau <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2015-12-04  drm/radeon: Fixup hw vblank counter/ts for new drm_update_vblank_count() (v2)  Mario Kleiner  9  -29/+164
commit 4dfd6486 "drm: Use vblank timestamps to guesstimate how many vblanks were missed" introduced in Linux 4.4-rc1 makes the drm core more fragile to drivers which don't update hw vblank counters and vblank timestamps in sync with firing of the vblank irq and essentially at leading edge of vblank. This exposed a problem with radeon-kms/amdgpu-kms which do not satisfy above requirements: The vblank irq fires a few scanlines before start of vblank, but programmed pageflips complete at start of vblank and vblank timestamps update at start of vblank, whereas the hw vblank counter increments only later, at start of vsync. This leads to problems like off by one errors for vblank counter updates, vblank counters apparently going backwards or vblank timestamps apparently having time going backwards. The net result is stuttering of graphics in games, or little hangs, as well as total failure of timing sensitive applications. See bug #93147 for an example of the regression on Linux 4.4-rc: https://bugs.freedesktop.org/show_bug.cgi?id=93147 This patch tries to align all above events better from the viewpoint of the drm core / of external callers to fix the problem: 1. The apparent start of vblank is shifted a few scanlines earlier, so the vblank irq now always happens after start of this extended vblank interval and thereby drm_update_vblank_count() always samples the updated vblank count and timestamp of the new vblank interval. To achieve this, the reporting of scanout positions by radeon_get_crtc_scanoutpos() now operates as if the vblank starts radeon_crtc->lb_vblank_lead_lines before the real start of the hw vblank interval. This means that the vblank timestamps which are based on these scanout positions will now update at this earlier start of vblank. 2. 
The driver->get_vblank_counter() function will bump the returned vblank count as read from the hw by +1 if the query happens after the shifted earlier start of the vblank, but before the real hw increment at start of vsync, so the counter appears to increment at start of vblank in sync with the timestamp update. 3. Calls from vblank irq-context and regular non-irq calls are now treated identical, always simulating the shifted vblank start, to avoid inconsistent results for queries happening from vblank irq vs. happening from drm_vblank_enable() or vblank_disable_fn(). 4. The radeon_flip_work_func will delay mmio programming a pageflip until the start of the real vblank iff it happens to execute inside the shifted earlier start of the vblank, so pageflips now also appear to execute at start of the shifted vblank, in sync with vblank counter and timestamp updates. This to avoid some races between updates of vblank count and timestamps that are used for swap scheduling and pageflip execution which could cause pageflips to execute before the scheduled target vblank. The lb_vblank_lead_lines "fudge" value is calculated as the size of the display controllers line buffer in scanlines for the given video mode: Vblank irq's are triggered by the line buffer logic when the line buffer refill for a video frame ends, ie. when the line buffer source read position enters the hw vblank. This means that a vblank irq could fire at most as many scanlines before the current reported scanout position of the crtc timing generator as the number of scanlines the line buffer can maximally hold for a given video mode. This patch has been successfully tested on a RV730 card with DCE-3 display engine and on a evergreen card with DCE-4 display engine, in single-display and dual-display configuration, with different video modes. A similar patch is needed for amdgpu-kms to fix the same problem. 
Limitations: - Line buffer sizes in pixels are hard-coded on < DCE-4 to a value i just guessed to be high enough to work ok, lacking info on the true sizes atm. Fixes: fdo#93147 Signed-off-by: Mario Kleiner <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Michel Dänzer <[email protected]> Cc: Harry Wentland <[email protected]> Cc: Ville Syrjälä <[email protected]> (v1) Tested-by: Dave Witbrodt <[email protected]> (v2) Refine radeon_flip_work_func() for better efficiency: In radeon_flip_work_func, replace the busy waiting udelay(5) with event lock held by a more performance and energy efficient usleep_range() until at least predicted true start of hw vblank, with some slack for scheduler happiness. Release the event lock during waits to not delay other outputs in doing their stuff, as the waiting can last up to 200 usecs in some cases. Retested on DCE-3 and DCE-4 to verify it still works nicely. (v2) Signed-off-by: Mario Kleiner <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
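The counter adjustment described in point 2 of the commit message can be modelled in a few lines. This is a simplified sketch under stated assumptions: positions are plain scanline numbers within one frame, vsync starts after the real vblank start, and there is no wrap-around. The struct, its fields and the function name are hypothetical and are not the radeon or amdgpu code.

```c
#include <stdint.h>

/* Hypothetical description of one video mode, in scanlines. */
struct mode_lines {
	int vblank_start; /* first line of the real hw vblank */
	int vsync_start;  /* line at which the hw counter increments */
	int lead_lines;   /* how far the apparent vblank start is moved earlier */
};

/*
 * Return the vblank count to report for a query made at scanline 'line'.
 * Between the shifted (earlier) vblank start and the real hw increment at
 * vsync start, the hw counter still holds the old value, so report it
 * bumped by one to stay in sync with the timestamp update.
 */
uint32_t reported_vblank_count(uint32_t hw_count, int line,
			       const struct mode_lines *m)
{
	int shifted_start = m->vblank_start - m->lead_lines;

	if (line >= shifted_start && line < m->vsync_start)
		return hw_count + 1;
	return hw_count;
}
```

For example, with vblank starting at line 1080, vsync at line 1084 and a lead of 3 lines, a query at line 1077 already reports the next count, while a query at line 1000 does not.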
2015-12-04  drm/radeon: Retry DDC probing on DVI on failure if we got an HPD interrupt  Lyude  10  -12/+32
HPD signals on DVI ports can be fired off before the pins required for DDC probing actually make contact, due to the pins for HPD making contact first. This results in a HPD signal being asserted but DDC probing failing, resulting in hotplugging occasionally failing. This is somewhat rare on most cards (depending on what angle you plug the DVI connector in), but on some cards it happens constantly. The Radeon R5 on the machine used for testing this patch for instance, runs into this issue just about every time I try to hotplug a DVI monitor and as a result hotplugging almost never works. Rescheduling the hotplug work for a second when we run into an HPD signal with a failing DDC probe usually gives enough time for the rest of the connector's pins to make contact, and fixes this issue. Reviewed-by: Christian König <[email protected]> Signed-off-by: Lyude <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2015-12-04  drm/amdgpu: add spin lock to protect freed list in vm (v2)  jimqu  2  -3/+15
A protection fault involving the freed list occurs when running the OCL test. Add a spin lock to protect the list. v2: drop changes in vm_fini Signed-off-by: JimQu <[email protected]> Reviewed-by: Christian König <[email protected]>
2015-12-04  drm/amdgpu: partially revert "drm/amdgpu: fix VM_CONTEXT*_PAGE_TABLE_END_ADDR" v2  Christian König  2  -2/+2
The gtt_end is already inclusive, so we don't need to subtract one here. v2 (chk): keep the fix for the VM code, because there it really applies. Signed-off-by: Christian König <[email protected]> Signed-off-by: Anatoli Antonovitch <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Cc: [email protected] Signed-off-by: Alex Deucher <[email protected]>
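The inclusive versus exclusive distinction behind this fix can be illustrated with a small sketch. The helper names are hypothetical and a 4 KiB page size is assumed; this is not the amdgpu register programming code.

```c
#include <stdint.h>

#define PAGE_SHIFT_4K 12 /* assumes 4 KiB pages */

/* Convert a start address and a size into an inclusive end address. */
uint64_t inclusive_end(uint64_t start, uint64_t size)
{
	return start + size - 1;
}

/* Index of the last page covered by an inclusive end address. */
uint64_t last_page(uint64_t end_inclusive)
{
	return end_inclusive >> PAGE_SHIFT_4K;
}
```

For a 64 KiB range starting at 0, the inclusive end is 0xFFFF and the last page index is 15. Subtracting one again after the shift would yield 14 and silently drop the final page from the range.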
2015-12-04  drm/amdgpu: take a BO reference for the user fence  Christian König  1  -2/+4
No need for a GEM reference here. Reviewed-by: Michel Dänzer <[email protected]> Signed-off-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2015-12-04  drm/amdgpu: take a BO reference in the display code  Christian König  1  -3/+3
No need for the GEM reference here. Reviewed-by: Michel Dänzer <[email protected]> Signed-off-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2015-12-04  drm/amdgpu: set snooped flags only on system addresses v2  Christian König  1  -3/+4
Not necessary for VRAM. v2: no need to check if ttm is NULL. Signed-off-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2015-12-04  drm/amdgpu: add spin lock to protect freed list in vm (v2)  jimqu  2  -3/+15
A protection fault involving the freed list occurs when running the OCL test. Add a spin lock to protect the list. v2: drop changes in vm_fini Signed-off-by: JimQu <[email protected]> Reviewed-by: Christian König <[email protected]>
2015-12-04  Revert "drm/i915: Extend LRC pinning to cover GPU context writeback"  Daniel Vetter  4  -118/+27
This reverts commit 6d65ba943a2d1e4292a07ca7ddb6c5138b9efa5d. Mika Kuoppala traced down a use-after-free crash in module unload to this commit, because ring->last_context is leaked beyond when the context gets destroyed. Mika submitted a quick fix to patch that up in the context destruction code, but that's too much of a hack. The right fix is instead for the ring to hold a full reference onto its last context, like we do for legacy contexts. Since this is causing a regression in BAT it gets reverted before we can close this. Cc: Nick Hoath <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Gordon <[email protected]> Cc: Chris Wilson <[email protected]> Cc: Alex Dai <[email protected]> Cc: Mika Kuoppala <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93248 Acked-by: Mika Kuoppala <[email protected]> Signed-off-by: Daniel Vetter <[email protected]>
2015-12-04  amdgpu/gfxv8: Remove magic numbers from function gfx_v8_0_tiling_mode_table_init()  Tom St Denis  1  -2/+2
Signed-off-by: Tom St Denis <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
2015-12-04  drm/amdgpu: fix race condition in amd_sched_entity_push_job  Nicolai Hähnle  1  -2/+3
As soon as we leave the spinlock after the job has been added to the job queue, we can no longer rely on the job's data to be available. I have seen a null-pointer dereference due to sched == NULL in amd_sched_wakeup via amd_sched_entity_push_job and amd_sched_ib_submit_kernel_helper. Since the latter initializes sched_job->sched with the address of the ring scheduler, which is guaranteed to be non-NULL, this race appears to be a likely culprit. Signed-off-by: Nicolai Hähnle <[email protected]> Bugzilla: https://bugs.freedesktop.org/attachment.cgi?bugid=93079 Reviewed-by: Christian König <[email protected]>
2015-12-04  drm/amdgpu: add err check for pin userptr  Chunming Zhou  1  -3/+7
Missing error check if the operation failed. Signed-off-by: Chunming Zhou <[email protected]> Reviewed-by: Christian König <[email protected]>
2015-12-04  amdgpu/gfxv8: Simplification in gfx_v8_0_enable_gui_idle_interrupt()  Tom St Denis  1  -11/+5
Simplified the function by folding the two paths into one. Signed-off-by: Tom St Denis <[email protected]> Reviewed-by: Alex Deucher <[email protected]>