Age | Commit message (Collapse) | Author | Files | Lines |
|
git://people.freedesktop.org/~gabbayo/linux into drm-next
- Add CWSR (compute wave save restore) support for GFX8 (Carrizo)
- Fix SDMA user-mode queues support for GFX7 (Kaveri)
- Add SDMA user-mode queues support for GFX8 (Carrizo)
- Allow HWS (hardware scheduling) to schedule multiple processes concurrently
- Add debugfs support
- Simplify process locking and lock dependencies
- Refactoring topology code to prepare for dGPU support + fixes to that code
- Add option to generate dummy/virtual CRAT table when its missing or deformed
- Recognize CPUs other then APUs as compute entities
- Various clean ups and bug fixes
I have not yet sent the dGPU topology code because it depends on a patch
for the PCI subsystem that adds PCIe atomics support. Once that patch is
upstreamed we can continue with the rest of the dGPU code.
* tag 'drm-amdkfd-next-2017-12-24' of git://people.freedesktop.org/~gabbayo/linux: (53 commits)
drm/amdgpu: Add support for reporting VRAM usage
drm/amdkfd: Ignore ACPI CRAT for non-APU systems
drm/amdkfd: Module option to disable CRAT table
drm/amdkfd: Add AQL Queue Memory flag on topology
drm/amdkfd: Fixup incorrect info in the CZ CRAT table
drm/amdkfd: Add perf counters to topology
drm/amdkfd: Add topology support for dGPUs
drm/amdkfd: Add topology support for CPUs
drm/amdkfd: Fix sibling_map[] size
drm/amdkfd: Simplify counting of memory banks
drm/amdkfd: Turn verbose topology messages into pr_debug
drm/amdkfd: sync IOLINK defines to thunk spec
drm/amdkfd: Support enumerating non-GPU devices
drm/amdkfd: Decouple CRAT parsing from device list update
drm/amdkfd: Reorganize CRAT fetching from ACPI
drm/amdkfd: Group up CRAT related functions
drm/amdkfd: Fix memory leaks in kfd topology
drm/amdkfd: Topology: Fix location_id
drm/amdkfd: Update number of compute unit from KGD
drm/amd: Remove get_vmem_size from KGD-KFD interface
...
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v4.15-rc7
- couple of documentation build fixes
- serialize non-blocking modesets
- prevent DMC from messing up GMBUS transfers
- PSR regression fix
* tag 'drm-intel-fixes-2018-01-04' of git://anongit.freedesktop.org/drm/drm-intel:
drm/i915: Apply Display WA #1183 on skl, kbl, and cfl
docs: fix, intel_guc_loader.c has been moved to intel_guc_fw.c
documentation/gpu/i915: fix docs build error after file rename
drm/i915: Put all non-blocking modesets onto an ordered wq
drm/i915: Disable DC states around GMBUS on GLK
drm/i915/psr: Fix register name mess up.
|
|
into drm-fixes
- backport of a DC change which fixes a greenish tint on some RV hw
- properly handle kzalloc fail in ttm
* 'drm-fixes-4.15' of git://people.freedesktop.org/~agd5f/linux:
drm/ttm: check the return value of kzalloc
drm/amd/display: call set csc_default if enable adjustment is false
|
|
git://git.armlinux.org.uk/~rmk/linux-arm into drm-fixes
Armada fixes.
* 'drm-armada-fixes-4.15' of git://git.armlinux.org.uk/~rmk/linux-arm:
drm/armada: fix YUV planar format framebuffer offsets
drm/armada: improve efficiency of armada_drm_plane_calc_addrs()
drm/armada: fix UV swap code
drm/armada: fix SRAM powerdown
drm/armada: fix leak of crtc structure
|
|
Add support for the A83T display pipeline.
Reviewed-by: Chen-Yu Tsai <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/614b430adf3a67320362a75c01b01bd53013da8a.1513854122.git-series.maxime.ripard@free-electrons.com
|
|
The TCON supports the LVDS interface to output to a panel or a bridge.
Let's add support for it.
Reviewed-by: Chen-Yu Tsai <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/7fbb85f33ee1d5009fde4f0d7d236e11ca58b114.1513854122.git-series.maxime.ripard@free-electrons.com
|
|
The various outputs the TCON can provide have different constraints on the
dotclock divider. Let's make them configurable by the various mode_set
functions.
Reviewed-by: Chen-Yu Tsai <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/92ff5881c8f8674056d34458b2f264cd48d4e136.1513854122.git-series.maxime.ripard@free-electrons.com
|
|
It seems like the mixer can only run properly when clocked at 150MHz. In
order to have something more robust than simply a fire-and-forget
assigned-clocks-rate, let's put that in the code.
Reviewed-by: Chen-Yu Tsai <[email protected]>
Signed-off-by: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/f5f05307972ed05250e8094b302d68b9e7e167f6.1513854122.git-series.maxime.ripard@free-electrons.com
|
|
Display WA #1183 was recently added to workaround
"Failures when enabling DPLL0 with eDP link rate 2.16
or 4.32 GHz and CD clock frequency 308.57 or 617.14 MHz
(CDCLK_CTL CD Frequency Select 10b or 11b) used in this
enabling or in previous enabling."
This workaround was designed to minimize the impact only
to save the bad case with that link rates. But HW engineers
indicated that it should be safe to apply broadly, although
they were expecting the DPLL0 link rate to be unchanged on
runtime.
We need to cover 2 cases: when we are in fact enabling DPLL0
and when we are just changing the frequency with small
differences.
This is based on previous patch by Rodrigo Vivi with suggestions
from Ville Syrjälä.
Cc: Arthur J Runyan <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: [email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Reviewed-by: Ville Syrjälä <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 53421c2fe99ce16838639ad89d772d914a119a49)
[ Lucas: Backport to 4.15 adding back variable that has been removed on
commits not meant to be backported ]
Signed-off-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
A shadow page table entry needs to be cleared after being set as
post-sync. This patch fixes the recent error reported in Win7-32 test.
Fixes: 2707e4446688 ("drm/i915/gvt: vGPU graphics memory virtualization")
Signed-off-by: Zhi Wang <[email protected]>
CC: Stable <[email protected]>
Signed-off-by: Zhenyu Wang <[email protected]>
|
|
We were calling enable_irq on bind, where it was already enabled previously
by the IRQ helper. Additionally, dev->irq is not set correctly until after
postinstall and so was always zero here, triggering a warning in 4.15.
Fix both by moving the enable to the power management resume path, where we
know there was a previous disable invocation during suspend.
Fixes: 253696ccd613 ("drm/vc4: Account for interrupts in flight")
Signed-off-by: Stefan Schake <[email protected]>
Signed-off-by: Eric Anholt <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Tested-by: Stefan Wahren <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
The msm/kms driver should work even if there is no GPU device specified
in DT. Currently, we get a NULL dereference crash in adreno_load_gpu
since the driver assumes that priv->gpu_pdev is non-NULL.
Perform an additional check on priv->gpu_pdev before trying to retrieve
the msm_gpu pointer from it.
v2: Incorporate Jordan's comments:
- Simplify the check to share the same error message.
- Use dev_err_once() to avoid an error message every time we open the
drm device fd.
Fixes: eec874ce5ff1 (drm/msm/adreno: load gpu at probe/bind time)
Signed-off-by: Archit Taneja <[email protected]>
Acked-by: Jordan Crouse <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
This adds a new driver for Sitronix ST7735R display panels.
This has been tested using an Adafruit 1.8" TFT.
Signed-off-by: David Lechner <[email protected]>
Reviewed-by: Noralf Trønnes <[email protected]>
Reviewed-by: Linus Walleij <[email protected]>
Signed-off-by: Noralf Trønnes <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
This updates the compatible string for a no-name LCD panel to
"vot,v220hf01a-t", "ilitek,ili9225".
The original bindings were the generic "ilitek,ili9225-2.2in-176x220"
because I could not find a datasheet. However, after some more research,
I finally found one, so the actual vendor and model name are now known.
This previous bindings have not made it to the mainline kernel yet, so
this is not breaking backwards compatibility.
Signed-off-by: David Lechner <[email protected]>
Signed-off-by: Noralf Trønnes <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
In the function ttm_page_alloc_init, kzalloc call is made for variable
_manager, we need to check its return value, it may return NULL.
Signed-off-by: Xiongwei Song <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fixes a greenish tint on RV displays.
Signed-off-by: Yue Hin Lau <[email protected]>
Reviewed-by: Eric Bernstein <[email protected]>
Acked-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
[[email protected]: backport to 4.15]
Signed-off-by: Daniel Drake <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
When destroying an inactive queue, we don't need to call
execute_queues_cpsch.
Signed-off-by: Yong Zhao <[email protected]>
Reviewed-by: Oak Zeng <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
|
|
Signed-off-by: Yong Zhao <[email protected]>
Reviewed-by: Oak Zeng <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
|
|
Now that memset32 is available, the open-coded pagetable initialization
loop can be replaced.
Signed-off-by: Lucas Stach <[email protected]>
|
|
There is no need to hold the GPU lock while freeing the submit
object. Only move the retired submits from the GPU active list to
a temporary retire list under the GPU lock.
Signed-off-by: Lucas Stach <[email protected]>
|
|
Now that the PMR lifetime issues are solved we can safely re-enable
performance counter profiling support.
Signed-off-by: Lucas Stach <[email protected]>
|
|
As long as there is an active submit, we want the GPU to stay awake. This
is slightly complicated by the fact that we really want to wake the GPU
at the last possible moment to achieve maximum power savings.
Signed-off-by: Lucas Stach <[email protected]>
|
|
The active count is used to check if the BO is idle, where idle is defined
as not active on the GPU and all VM mappings and reference counts dropped
to the initial state. As the idling of the mappings and references now only
happens in the submit cleanup, the active state handling must be moved to
the same location in order to keep the userspace semantics.
Signed-off-by: Lucas Stach <[email protected]>
|
|
Less dynamic allocations and slims down the cmdbuf object to only the
required information, as everything else is already available in the
submit object.
This also simplifies buffer and mappings lifetime management, as they
are now exlusively attached to the submit object and not additionally
to the cmdbuf.
Signed-off-by: Lucas Stach <[email protected]>
|
|
The GPU exec state may have changed at the time when the perfmon sampling
is done, as it reflects the state of the last submission, not the current
GPU execution state.
So for proper sampling we must use the submit exec_state.
Signed-off-by: Lucas Stach <[email protected]>
|
|
We'll need this in some places where only the submit is available. Also
this is a first step at slimming down the cmdbuf object.
Signed-off-by: Lucas Stach <[email protected]>
|
|
To make them available to the event worker even after the actual
command stream execution has finished.
Signed-off-by: Lucas Stach <[email protected]>
|
|
The submit object lifetime will get extended to the actual GPU
execution. As multiple users will depend on this, add a kref to
properly control destruction of the object.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
|
|
The acquire_ctx is special in that it needs to be released from the same
thread as has been used to initialize it. This collides with the intention to
extend the submit lifetime beyond the gem_submit function with potentially
other threads doing the final cleanup.
Move the ww_acquire_ctx to the function local stack as suggested in the
documentation.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
|
|
This is safe to call in all paths, as the BO_PINNED flag tells us if the BO
needs unpinning.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
|
|
Simplifies the cleanup path and moves fence waiting to a central location.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
|
|
This is the fence passed out on a sucessful GPU submit. Make the name
more clear.
Signed-off-by: Lucas Stach <[email protected]>
|
|
The object fencing has nothing to do with the actual GPU buffer submit,
so move it to the gem submit path to have a cleaner split.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
|
|
Use kzalloc so other code doesn't need to worry about uninitialized members.
Drop the non-standard GFP flags, as we really don't want to fail the submit
when under slight memory pressure. Remove one level of indentation by using
an early return if the allocation failed. Also remove the unused drm device
member.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
|
|
When manipulating the kernel command buffer the GPU mutex must be held, as
otherwise different callers might try to replace the same part of the
buffer, wreacking havok in the GPU execution.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
|
|
Inserting the END command when suspending the GPU is changing the
command buffer state, which requires the GPU to be held.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
While the etnaviv workqueue needs to be ordered, as we rely on work items
being executed in queuing order, this is only true for a single GPU.
Having a shared workqueue for all GPUs in the system limits concurrency
artificially.
Getting each GPU its own ordered workqueue still meets our ordering
expectations and enables retire workers to run concurrently.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
|
|
There is no need to store this in the gpu struct. MMU flushes are triggered
correctly in reaction to MMU maps and unmaps, independent of the current ctx.
Any required pipe switches can be infered from the current and the desired
GPU exec state.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
There is no need to synchronize with oustanding retire jobs if the object
has gone idle. Retire jobs only ever change the object state from active to
idle, not the other way around.
The IOVA put race is uncritical, as the GEM_WAIT ioctl itself is holding
a reference to the GEM object, so the retire worker will not pull the
object into the CPU domain, which is the thing we are trying to guard
against with etnaviv_gpu_wait_obj_inactive. The ordering of the various
counts and waits may change a bit, but the userspace visible behavior at
the bounds of the syscall are unchanged.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
|
|
Flush and prefetch are properly handled in the buffer code, data endianess
would need much wider changes than adding something to this single function.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
Now that the userptr BO handling doesn't rely on the userspace restarting
the submit after object population, there is no need to special case the
-EAGAIN return value anymore.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
|
|
All code paths which populate userptr BOs are fine with the get_pages
function taking the mmap_sem lock. This allows to get rid of the pretty
involved architecture with a worker being scheduled if the mmap_sem
needs to be taken, but instead call GUP directly and allow it to take
the lock if necessary.
This simplifies the code a lot and removes the possibility of this
function returning -EAGAIN, which complicates object population
handling at the callers.
A notable change in behavior is that we don't allow a process to populate
objects with user pages from a foreign MM anymore. This would have been an
invalid use before, as it breaks the assumptions made in the etnaviv kernel
driver to enfore cache coherence. We now disallow this by rejecting the
request to populate those objects. Well behaving userspace is unaffected by
this change.
Signed-off-by: Lucas Stach <[email protected]>
|
|
This function never fails, as it does nothing more than adding the GEM
object to the global device list. Making this explicit through the void
return type allows to drop some unnecessary error handling.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
|
|
This function has only one caller and it isn't expected that there will
be any more in the future. Folding this function into the caller is
helping the readability.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
The current userptr page population will defer work to a work item if
needed to avoid ever taking the mmap_sem in the direct call path. With
the more fine-grained locking in etnaviv this isn't needed anymore, so
a future commit will simplify this code.
Add a lockdep annotation to validate the assumption that the mmap_sem
can be taken in the direct call path.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
Userptr, prime and shmem buffer objects have different lock ordering
requirements. This is mostly due to the fact that we don't allow to mmap
userptr buffers, so we won't ever end up in our fault handler for those,
so some of the code paths are never called with the mmap_sem held.
To avoid lockdep false positives, split them up into different lock classes.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
|
|
If the FE is restarted before the sync point event is cleared, the GPU
might trigger a completion IRQ for the next sync point, corrupting
the state of the currently running worker.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
The omap4 CEC hardware cannot tell a Nack from a Low Drive from an
Arbitration Lost error, so just report a Nack, which is almost
certainly the reason for the error anyway.
This also simplifies the implementation. The only three interrupts
that need to be enabled are:
Transmit Buffer Full/Empty Change event: triggered when the
transmit finished successfully and cleared the buffer.
Receiver FIFO Not Empty event: triggered when a message was received.
Frame Retransmit Count Exceeded event: triggered when a transmit
failed repeatedly, usually due to the message being Nacked. Other
reasons are possible (Low Drive, Arbitration Lost) but there is no
way to know. If this happens the TX buffer needs to be cleared
manually.
While testing various error conditions I noticed that the hardware
can receive messages up to 18 bytes in total, which exceeds the legal
maximum of 16. This could cause a buffer overflow, so we check for
this and constrain the size to 16 bytes.
The old incorrect interrupt handler could cause the CEC framework to
enter into a bad state because it mis-detected the "Start Bit Irregularity
event" as an ARB_LOST transmit error when it actually is a receive error
which should be ignored.
Signed-off-by: Hans Verkuil <[email protected]>
Reported-by: Henrik Austad <[email protected]>
Tested-by: Henrik Austad <[email protected]>
Tested-by: Hans Verkuil <[email protected]>
Signed-off-by: Tomi Valkeinen <[email protected]>
|
|
We have plenty of global registers and whatnot programmed without
any further locking by the modeset code. Currently non-bocking
modesets are allowed to execute in parallel which could corrupt
said registers.
To avoid the problem let's run all non-blocking modesets on an
ordered workqueue. We still put page flips etc. to system_unbound_wq
allowing page flips on one pipe to execute in parallel with page flips
or a modeset on a another pipe (assuming no known state is shared
between them, at which point they would have been added to the same
atomic commit and serialized that way).
Blocking modesets are already serialized with each other by
connection_mutex, and thus are safe. To serialize them with
non-blocking modesets we just flush the workqueue before executing
blocking modesets.
Cc: Daniel Vetter <[email protected]>
Cc: Maarten Lankhorst <[email protected]>
Fixes: 94f050246b42 ("drm/i915: nonblocking commit")
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Acked-by: Daniel Vetter <[email protected]>
Reviewed-by: Maarten Lankhorst <[email protected]>
(cherry picked from commit 757fffcfdffb6c0dd46c1b264091c36b4e5a86ae)
Signed-off-by: Jani Nikula <[email protected]>
|
|
Prevent the DMC from destroying GMBUS transfers on GLK. GMBUS
lives in PG1 so DC off is all we need.
Cc: [email protected]
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Dhinakaran Pandiyan <[email protected]>
(cherry picked from commit 156961ae7bdf6feb72778e8da83d321b273343fd)
Signed-off-by: Jani Nikula <[email protected]>
|