Age | Commit message (Collapse) | Author | Files | Lines |
|
If for some unexpected reason the registers all read zero it's better
to WARN and return instead of dividing by zero and completely freezing
the machine.
I don't expect this to happen in the wild with the current code, but I
accidentally triggered the division by zero while doing some debugging
in an unusual environment.
Cc: Rodrigo Vivi <[email protected]>
Signed-off-by: Paulo Zanoni <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Looks like we were missing them.
Signed-off-by: Paulo Zanoni <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The AMDGPU_SCHED_OP_PROCESS_PRIORITY_OVERRIDE ioctls are used to set
the priority of a different process in the current system.
When a request is dropped, the process's contexts will be
restored to the priority specified at context creation time.
A request can be dropped by setting the override priority to
AMDGPU_CTX_PRIORITY_UNSET.
An fd is used to identify the remote process. This is simpler than
passing a pid number, which is vulnerable to re-use, etc.
This functionality is limited to DRM_MASTER since abuse of this
interface can have a negative impact on the system's performance.
v2: removed unused output structure
v3: change refcounted interface for a regular set operation
Signed-off-by: Andres Rodriguez <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Introduce amdgpu_ctx_priority_override(). A mechanism to override a
context's priority.
An override can be terminated by setting the override to
AMD_SCHED_PRIORITY_UNSET.
v2: change refcounted interface for a direct set
Signed-off-by: Andres Rodriguez <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use _INVALID to identify bad parameters and _UNSET to represent the
lack of interest in a specific value.
Signed-off-by: Andres Rodriguez <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This is useful for changing an entity's priority at runtime.
v2: don't modify the order of amd_sched_entity members
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Andres Rodriguez <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Returning invalid priorities as _NORMAL is a backwards compatibility
quirk of amdgpu_ctx_ioctl(). Move this detail one layer up where it
belongs.
Signed-off-by: Andres Rodriguez <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Programming CP_HQD_QUEUE_PRIORITY enables a queue to take priority over
other queues on the same pipe. Multiple queues on a pipe are timesliced
so this gives us full precedence over other queues.
Programming CP_HQD_PIPE_PRIORITY changes the SPI_ARB_PRIORITY of the
wave as follows:
0x2: CS_H
0x1: CS_M
0x0: CS_L
The SPI block will then dispatch work according to the policy set by
SPI_ARB_PRIORITY. In the current policy CS_H is higher priority than
gfx.
In order to prevent getting stuck in loops of resources bouncing between
GFX and high priority compute and introducing further latency, we
statically reserve a portion of the pipe.
v2: fix srbm_select to ring->queue and use ring->funcs->type
v3: use AMD_SCHED_PRIORITY_* instead of AMDGPU_CTX_PRIORITY_*
v4: switch int to enum amd_sched_priority
v5: corresponding changes for srbm_lock
v6: change CU reservation to PIPE_PERCENT allocation
v7: use kiq instead of MMIO
v8: back to MMIO, and make the implementation sleep safe.
v9: corresponding changes for splitting HIGH into _HW/_SW
Acked-by: Christian König <[email protected]>
Signed-off-by: Andres Rodriguez <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add an initial framework for changing the HW priorities of rings. The
framework allows requesting priority changes for the lifetime of an
amdgpu_job. After the job completes the priority will decay to the next
lowest priority for which a request is still valid.
A new ring function set_priority() can now be populated to take care of
the HW specific programming sequence for priority changes.
v2: set priority before emitting IB, and take a ref on amdgpu_job
v3: use AMD_SCHED_PRIORITY_* instead of AMDGPU_CTX_PRIORITY_*
v4: plug amdgpu_ring_restore_priority_cb into amdgpu_job_free_cb
v5: use atomic for tracking job priorities instead of last_job
v6: rename amdgpu_ring_priority_[get/put]() and align parameters
v7: replace spinlocks with mutexes for KIQ compatibility
v8: raise ring priority during cs_ioctl, instead of job_run
v9: priority_get() before push_job()
Reviewed-by: Christian König <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Andres Rodriguez <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add a new context creation parameter to express a global context priority.
The priority ranking in descending order is as follows:
* AMDGPU_CTX_PRIORITY_HIGH_HW
* AMDGPU_CTX_PRIORITY_HIGH_SW
* AMDGPU_CTX_PRIORITY_NORMAL
* AMDGPU_CTX_PRIORITY_LOW_SW
* AMDGPU_CTX_PRIORITY_LOW_HW
The driver will attempt to schedule work to the hardware according to
the priorities. No latency or throughput guarantees are provided by
this patch.
This interface intends to service the EGL_IMG_context_priority
extension, and vulkan equivalents.
Setting a priority above NORMAL requires CAP_SYS_NICE or DRM_MASTER.
v2: Instead of using flags, repurpose __pad
v3: Swap enum values of _NORMAL _HIGH for backwards compatibility
v4: Validate usermode priority and store it
v5: Move priority validation into amdgpu_ctx_ioctl(), headline reword
v6: add UAPI note regarding priorities requiring CAP_SYS_ADMIN
v7: remove ctx->priority
v8: added AMDGPU_CTX_PRIORITY_LOW, s/CAP_SYS_ADMIN/CAP_SYS_NICE
v9: change the priority parameter to __s32
v10: split priorities into _SW and _HW
v11: Allow DRM_MASTER without CAP_SYS_NICE
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Andres Rodriguez <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Introduce a flag to signal that access to a BO will be synchronized
through an external mechanism.
Currently all buffers shared between contexts are subject to implicit
synchronization. However, this is only required for protocols that
currently don't support an explicit synchronization mechanism (DRI2/3).
This patch introduces the AMDGPU_GEM_CREATE_EXPLICIT_SYNC, so that
users can specify when it is safe to disable implicit sync.
v2: only disable explicit sync in amdgpu_cs_ioctl
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Andres Rodriguez <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Andres Rodriguez <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Convert GTT mappings into linear ones for huge page handling.
v2: use fragment size as minimum for linear conversion
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Instead of mapping them bit by bit map/unmap all consecutive
pages as in one call.
v2: test for consecutive pages instead of using compound page order.
Signed-off-by: Christian König <[email protected]>
Acked-by: Felix Kuehling <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Totally surprisingly this is more efficient than doing it page by page.
Signed-off-by: Christian König <[email protected]>
Acked-by: Felix Kuehling <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
SR-IOV need to reserve a piece of shared VRAM at the exact place
to exchange data betweem PF and VF. The start address and size of
the shared mem are passed to guest through VBIOS structure
VRAM_UsageByFirmware.
VRAM_UsageByFirmware is a general feature in VBIOS, it indicates
that VBIOS need to reserve a piece of memory on the VRAM.
Because the mem address is specified. Reserve it early in
amdgpu_ttm_init to make sure that it can monoplize the space.
Signed-off-by: Horace Chen <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Without the additional bits set in PDEs/PTEs, the ATC memory access
would have failed on Raven.
Signed-off-by: Yong Zhao <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
It's a little unclear what the sg_mask actually is, so prefer the more
meaningful name of sg_page_sizes.
Suggested-by: Joonas Lahtinen <[email protected]>
Signed-off-by: Matthew Auld <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Chris Wilson <[email protected]>
Reviewed-by: Joonas Lahtinen <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
|
|
Currently, we reject attempting to pin a large bo into the mappable
aperture, but only after trying to create the vma. Under debug kernels,
repeatedly creating and freeing that vma for an oversized bo consumes
one-third of the runtime for pwrite/pread tests as it is spent on
kmalloc/kfree tracking. If we move the rejection to before creating that
vma, we lose some accuracy of checking against the fence_size as opposed
to object size, though the fence can never be smaller than the object.
Note that the vma creation itself will reject an attempt to create a vma
larger than the GTT so we can remove one redundant test.
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Joonas Lahtinen <[email protected]>
|
|
Both pread/pwrite GTT paths provide a fast fallback in case we cannot
map the whole object at a time. Currently, we use the fallback for very
large objects and for active objects that would require remapping, but
we can also add active fault mappable objects to the list that we want
to avoid evicting. The rationale is that such fault mappable objects are
in active use and to evict requires tearing down the CPU PTE and forcing
a page fault on the next access; more costly, and intefers with other
processes, than our per-page GTT fallback.
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Joonas Lahtinen <[email protected]>
|
|
As we have a lightweight fallback to insert a single page into the
aperture, try to avoid any heavier evictions when attempting to insert
the entire object.
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Joonas Lahtinen <[email protected]>
|
|
If the caller says that he doesn't want to evict any other faulting
vma, honour that flag. The logic was used in evict_something, but not
the more specific evict_for_node, now being used as a preliminary probe
since commit 606fec956c0e ("drm/i915: Prefer random replacement before
eviction search").
Fixes: 606fec956c0e ("drm/i915: Prefer random replacement before eviction search")
Fixes: 821188778b9b ("drm/i915: Choose not to evict faultable objects from the GGTT")
References: https://bugs.freedesktop.org/show_bug.cgi?id=102490
Signed-off-by: Chris Wilson <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Joonas Lahtinen <[email protected]>
|
|
We don't wish to refault the entire object (other vma) when unbinding
one partial vma. To do this track which vma have been faulted into the
user's address space.
v2: Use a local vma_offset to tidy up a multiline unmap_mapping_range().
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Joonas Lahtinen <[email protected]>
|
|
Following the pattern now used for obj->mm.pages, use just pin_fence and
unpin_fence to control access to the fence registers. I.e. instead of
calling get_fence(); pin_fence(), we now just need to call pin_fence().
This will make it easier to reduce the locking requirements around
fence registers.
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Joonas Lahtinen <[email protected]>
|
|
Acquire the fence register for the iomap in i915_vma_pin_iomap() on
behalf of the caller.
We probably want for the caller to specify whether the fence should be
pinned for their usage, but at the moment all callers do want the
associated fence, or none, so take it on their behalf.
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Joonas Lahtinen <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Add assert_forcewakes_active() (the complementary function to
assert_forcewakes_inactive) that documents the requirement of a
function for its callers to be holding the forcewake ref (i.e. the
function is part of a sequence over which RC6 must be prevented).
One such example is during ringbuffer reset, where RC6 must be held
across the whole reinitialisation sequence.
v2: Include debug information in the WARN so we know which fw domain is
missing.
Signed-off-by: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Reviewed-by: Mika Kuoppala <[email protected]> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The lowlevel reset functions expect the caller to be holding the rpm
wakeref for the device access across the reset. We were not explicitly
doing this in the sefltest, so for simplicity acquire the wakeref for
the duration of all subtests.
Signed-off-by: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Mika Kuoppala <[email protected]>
|
|
Resetting the engine requires us to hold the forcewake wakeref to
prevent RC6 trying to happen in the middle of the reset sequence. The
consequence of an unwanted RC6 event in the middle is that random state
is then saved to the powercontext and restored later, which may
overwrite the mmio state we need to preserve (e.g. PD_DIR_BASE in the
legacy ringbuffer reset_ring_common()).
This was noticed in the live_hangcheck selftests when Haswell would
sporadically fail to restart during igt_reset_queue().
Signed-off-by: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Mika Kuoppala <[email protected]>
|
|
During hangcheck testing, we try to execute requests following the GPU
reset, and in particular want to try and debug when those fail.
Signed-off-by: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Mika Kuoppala <[email protected]>
|
|
We can use drm_printer to hide the differences between printk and
seq_printf, and so make the i915_engine_info pretty printer able to be
called from different contexts and not just debugfs. For instance, I
want to use the pretty printer to debug kselftests.
Signed-off-by: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Mika Kuoppala <[email protected]>
|
|
We only apply the hugepage PD redirection inside the ppGTT, so during
i915_vma_insert() we want to exclude the GGTT from the additional
alignment constraints (thereby avoiding the extra GTT pressure from
fragmentation). Add an assert to document that intention alongside the
comment.
v2: After discussion with Matthew, make it a blanket GGTT ban
(previously we allowed the expansion for appgtt, and so indirectly
ggtt). There are issues we need to fix before allowing the current
appgtt to be used with hugepages, and if we do, we probably want more
care over when to expand/align, as the mappable aperture inside the ggtt
is precious.
Signed-off-by: Chris Wilson <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Reviewed-by: Matthew Auld <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Joonas Lahtinen <[email protected]>
|
|
When dma_fence_get_rcu() fails to acquire a reference it doesn't necessary
mean that there is no fence at all.
It usually mean that the fence was replaced by a new one and in this situation
we certainly want to have the new one as result and *NOT* NULL.
v2: Keep extra check after dma_fence_get_rcu().
Signed-off-by: Christian König <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: Sumit Semwal <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Reviewed-by: Maarten Lankhorst <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Stop requiring that the src reservation object is locked for this operation.
Acked-by: Chunming Zhou <[email protected]>
Signed-off-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
intel_crtc_mode_get()
Eliminate the duplicate code for pipe timing readout in
intel_crtc_mode_get() by using the functions we use for the normal state
readout.
v2: Store dotclock in adjusted_mode instead of the final mode
Cc: [email protected]
Cc: Rob Kramer <[email protected]>
Cc: Daniel Vetter <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Tested-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
intel_crtc->config->cpu_transcoder isn't yet filled out when
intel_crtc_mode_get() gets called during output probing, so we should
not use it there. Instead intel_crtc_mode_get() figures out the correct
transcoder on its own, and that's what we should use.
If the BIOS boots LVDS on pipe B, intel_crtc_mode_get() would actually
end up reading the timings from pipe A instead (since PIPE_A==0),
which clearly isn't what we want.
It looks to me like this may have been broken by
commit eccb140bca67 ("drm/i915: hw state readout&check support for cpu_transcoder")
as that one removed the early initialization of cpu_transcoder from
intel_crtc_init().
Cc: [email protected]
Cc: [email protected]
Cc: Rob Kramer <[email protected]>
Cc: Daniel Vetter <[email protected]>
Reported-by: Rob Kramer <[email protected]>
Fixes: eccb140bca67 ("drm/i915: hw state readout&check support for cpu_transcoder")
References: https://lists.freedesktop.org/archives/dri-devel/2016-April/104142.html
Signed-off-by: Ville Syrjälä <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Kmemleak reported memory leak after suspend and resume:
unreferenced object 0xffffffc0e31d8880 (size 128):
comm "bash", pid 181, jiffies 4294763583 (age 24.694s)
hex dump (first 32 bytes):
01 00 00 00 00 00 00 00 00 20 a2 eb c0 ff ff ff ......... ......
01 00 00 00 00 00 00 00 80 87 1d e3 c0 ff ff ff ................
backtrace:
[<ffffffc00034bb64>] __save_stack_trace+0x48/0x6c
[<ffffffc00034c244>] create_object+0x138/0x254
[<ffffffc0009dd218>] kmemleak_alloc+0x58/0x8c
[<ffffffc000346de4>] kmem_cache_alloc_trace+0x188/0x254
[<ffffffc0005af4c0>] drm_atomic_state_alloc+0x3c/0x88
[<ffffffc000591f0c>] drm_atomic_helper_duplicate_state+0x28/0x158
[<ffffffc000592098>] drm_atomic_helper_suspend+0x5c/0xf0
Problem here is that we are duplicating the drm_atomic_state in
drm_atomic_helper_suspend(), but not unreference it in the resume path.
Fixes: 1494276000db ("drm/atomic-helper: Implement subsystem-level suspend/resume")
Signed-off-by: Jeffy Chen <[email protected]>
Reviewed-by: Maarten Lankhorst <[email protected]>
Signed-off-by: Maarten Lankhorst <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Fixes: 0853695c3ba4 ("drm: Add reference counting to drm_atomic_state")
Cc: <[email protected]> # v4.10+
|
|
Add support for HDMI CEC to the drm adv7511/adv7533 drivers.
The CEC registers that we need to use are identical for both drivers,
but they appear at different offsets in the register map.
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Archit Taneja <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Document the cec clock binding.
Signed-off-by: Hans Verkuil <[email protected]>
Acked-by: Rob Herring <[email protected]>
Signed-off-by: Archit Taneja <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
into drm-next
More new stuff for 4.15. Highlights:
- Add clock query interface for raven
- Add new FENCE_TO_HANDLE ioctl
- UVD video encode ring support on polaris
- transparent huge page DMA support
- deadlock fixes
- compute pipe lru tweaks
- powerplay cleanups and regression fixes
- fix duplicate symbol issue with radeon and amdgpu
- misc bug fixes
* 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux: (72 commits)
drm/radeon/dp: make radeon_dp_get_dp_link_config static
drm/radeon: move ci_send_msg_to_smc to where it's used
drm/amd/sched: fix deadlock caused by unsignaled fences of deleted jobs
drm/amd/sched: NULL out the s_fence field after run_job
drm/amd/sched: move adding finish callback to amd_sched_job_begin
drm/amd/sched: fix an outdated comment
drm/amd/sched: rename amd_sched_entity_pop_job
drm/amdgpu: minor coding style fix
drm/ttm: add transparent huge page support for DMA allocations v2
drm/ttm: add support for different pool sizes
drm/ttm: remove unsued options from ttm_mem_global_alloc_page
drm/amdgpu: add uvd enc irq
drm/amdgpu: add uvd enc ib test
drm/amdgpu: add uvd enc ring test
drm/amdgpu: add uvd enc vm functions (v2)
drm/amdgpu: add uvd enc into run queue
drm/amdgpu: add uvd enc rings
drm/amdgpu: add new uvd enc ring methods
drm/amdgpu: add uvd enc command in header
drm/amdgpu: add uvd enc registers in header
...
|
|
It's not used outside this file any longer.
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
It's used in ci_dpm.c so move it there and make it static.
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Make the docs read a little better.
Cc: Laurent Pinchart <[email protected]>
Cc: Daniel Vetter <[email protected]>
Signed-off-by: Noralf Trønnes <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
For gen8+ platforms which support the 48b PPGTT, enable platform level
support for 2M pages. Also enable for mock testing.
Signed-off-by: Matthew Auld <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Chris Wilson <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
For gen9+ enable platform level support for 64K pages. Also enable for
mock testing.
Signed-off-by: Matthew Auld <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Chris Wilson <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Currently gvt gtt handling doesn't support huge page entries, so disable
for now.
v2: remove useless 48b PPGTT check
Suggested-by: Zhenyu Wang <[email protected]>
Signed-off-by: Matthew Auld <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Zhenyu Wang <[email protected]>
Reviewed-by: Zhenyu Wang <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Try to mix sg page sizes for 4K, 64K and 2M pages.
v2: s/BIT(x) >> 12/BIT(x) >> PAGE_SHIFT/
Suggested-by: Chris Wilson <[email protected]>
Signed-off-by: Matthew Auld <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Chris Wilson <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
v2: mock test page support configurations and add MI_STORE_DWORD test
v3: run all mockable huge page tests on all platforms via the mock_device
v4: add pin_update regression test
various improvements suggested by Chris
v5: fix issues reported by kbuild
test single sg spanning multiple page sizes
don't explode when running the live-tests through the appgtt
v6: lots of improvements from Chris
v7: run on each engine for igt_write_huge
add simple tmpfs fallback test
v8: size_t is bad
don't break the i386 build
Signed-off-by: Matthew Auld <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Chris Wilson <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Good to know, mostly for debugging purposes.
v2: some improvements from Chris
Signed-off-by: Matthew Auld <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Chris Wilson <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Now that we support multiple page sizes for the ppgtt, it would be
useful to track the real usage for debugging purposes.
Signed-off-by: Matthew Auld <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Chris Wilson <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Support inserting 64K pages into the 48b PPGTT.
v2: check for 64K scratch
v3: we should only have to re-adjust maybe_64K at every sg interval
Signed-off-by: Matthew Auld <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Chris Wilson <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|