Age | Commit message (Collapse) | Author | Files | Lines |
|
When this flag is set in the CS IB flags, it causes
a memory cache flush of the GFX.
v2:
Move new flag to drm_amdgpu_cs_chunk_ib.flags
Bump up UAPI version
Remove condition on job != null to emit mem_sync
Signed-off-by: Andrey Grodzovsky <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix the coding style, move and rename the definitions to
better match what they are supposed to be doing.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
We don't support secure operation on compute rings at the
moment so reject them.
Reviewed-by: Andrey Grodzovsky <[email protected]>
Reviewed-by: Aaron Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Since commit "Move to a per-IB secure flag (TMZ)",
we've been seeing hangs in GFX. We need to send
FRAME CONTROL stop/start back-to-back, every time
we flip the TMZ flag. That is, when we transition
from TMZ to non-TMZ we have to send a stop with
TMZ followed by a start with non-TMZ, and
similarly for transitioning from non-TMZ into TMZ.
This patch implements this, thus fixing the GFX
hang.
v1 -> v2:
As suggested by Luben, and accept part of implemetation from this patch:
- Put "secure" closed to the loop and use optimization
- Change "secure" to bool again, and move "secure == -1" out of loop.
v3: Small fixes/optimizations.
Reported-and-Tested-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Signed-off-by: Huang Rui <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Move from a per-CS secure flag (TMZ) to a per-IB
secure flag.
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Mark a job as secure, if and only if the command
submission flag has the secure flag set.
v2: fix the null job pointer while in vmid 0
submission.
v3: Context --> Command submission.
v4: filling cs parser with cs->in.flags
v5: move the job secure flag setting out of amdgpu_cs_submit()
Signed-off-by: Huang Rui <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This patch expands the context control function to support trusted flag while we
want to set command buffer in trusted mode.
Signed-off-by: Huang Rui <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This patch expands the emit_tmz function to support trusted flag while we want
to set command buffer in trusted mode.
Signed-off-by: Huang Rui <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
We have three ib pools, they are normal, VM, direct pools.
Any jobs which schedule IBs without dependence on gpu scheduler should
use DIRECT pool.
Any jobs schedule direct VM update IBs should use VM pool.
Any other jobs use NORMAL pool.
v2: squash in coding style fix
Signed-off-by: xinhui pan <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
In order to remove the load and unload drm callbacks,
we need to reorder the init sequence to move all the drm
debugfs file handling. Do this for SA (sub allocator).
Tested-by: Thomas Zimmermann <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Hawaii needs to flush caches explicitly, submitting an IB in a user
VMID from kernel mode. There is no s_fence in this case.
Fixes: eb3961a57424 ("drm/amdgpu: remove fence context from the job")
Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Backmerge drm-next and fix up conflicts due to drmP.h removal.
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add ib preemption status in amdgpu_job, so that ring level function
can detect preemption and program for resuming it.
v2: squash in fix to restore job->preamble_status back to status value (Jack)
Acked-by: Alex Deucher <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Jack Xiao <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
CSA is the Context Save Area for preemption.
Acked-by: Hawking Zhang <[email protected]>
Signed-off-by: Jack Xiao <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Drop use of drmP.h in all files named amdgpu*
in drm/amd/amdgpu/
Fix fallout.
Signed-off-by: Sam Ravnborg <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Cc: "Christian König" <[email protected]>
Cc: "David (ChunMing) Zhou" <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Daniel Vetter <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
On XGMI configuration the ib test may take longer to finish
Signed-off-by: shaoyunl <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Replace the last bool type parameter with a general flags parameter,
to make the last parameter be able to contain more information.
v2: drop setting need_ctx_switch = false
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Jack Xiao <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
use the point of struct amdgpu_job as the function
argument instand of vmid, so the other members of
struct amdgpu_job can be visit in emit_ib function.
v2: add a wrapper for getting the VMID
add the job before the ib on the parameter list.
v3: refine the wrapper name
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Rex Zhu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Instead of hard coding the ring type in the function just never provide
a test_ib callback.
Additional to that remove the emit_ib callback to make sure the nobody
ever tries to execute an IB on the KIQ.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Andrey Grodzovsky <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Test only initialized rings, use the ring name instead of the index in the
error message and note on which device the error occured.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Andrey Grodzovsky <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Start using drm_gpu_scheduler.ready isntead.
v3:
Add helper function to run ring test and set
sched.ready flag status accordingly, clean explicit
sched.ready sets from the IP specific files.
v4: Add kerneldoc and rebase.
Signed-off-by: Andrey Grodzovsky <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
1. We never submit IBs to KIQ.
2. Ring test pass without KIQ's ring also.
3. By skipping we see an improvement of around 500ms
in the amdgpu's resume time.
[How]
skip IB tests for KIQ ring type.
Signed-off-by: Shirish S <[email protected]>
Signed-off-by: Pratik Vishwakarma <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
It's useful to trace any dependency a job has on prevoius
jobs.
Signed-off-by: Andrey Grodzovsky <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
SWDEV-146499: hang during multi vulkan process testing
cause:
the second frame's PREAMBLE_IB have clear-state
and LOAD actions, those actions ruin the pipeline
that is still doing process in the previous frame's
work-load IB.
fix:
need insert pipeline sync if have context switch for
SRIOV (because only SRIOV will report PREEMPTION flag
to UMD)
Signed-off-by: Monk Liu <[email protected]>
Signed-off-by: Emily Deng <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Can be obtained directly from the fence as well.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Junwei Zhang <[email protected]>
Acked-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The buffer object backing the user fence is reserved using the non-user
fence, i.e., as soon as the non-user fence is signaled, the user fence
buffer object can be moved or even destroyed.
Therefore, emit the user fence first.
Both fences have the same cache invalidation behavior, so this should
have no user-visible effect.
Signed-off-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Enable vcn jpeg ib ring test in amdgpu_ib.c
Signed-off-by: Boyuan Zhang <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
There is a new IB flag that enables this new behavior.
Full invalidation is unnecessary for RELEASE_MEM and doesn't make sense
when draw calls from two adjacent gfx IBs run in parallel. This will be
the new default for Mesa.
v2: bump the version
Signed-off-by: Marek Olšák <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
should use bo_create_kernel instead of split to two
function that create and pin the SA bo
issue:
before this patch, there are DMAR read error in host
side when running SRIOV test, the DMAR address dropped
in the range of SA bo.
fix:
after this cleanups of SA init and fini, above DMAR
eror gone.
v2:
keep sa_bo's fini instead of suspend, to keep
reporting error
Signed-off-by: Monk Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
issue:
sometime GFX/MM ib test hit timeout under SRIOV env, root cause
is that engine doesn't come back soon enough so the current
IB test considered as timed out.
fix:
for SRIOV GFX IB test wait time need to be expanded a lot during
SRIOV runtimei mode since it couldn't really begin before GFX engine
come back.
for SRIOV MM IB test it always need more time since MM scheduling
is not go together with GFX engine, it is controled by h/w MM
scheduler so no matter runtime or exclusive mode MM IB test
always need more time.
v2:
use ring type instead of idx to judge
Signed-off-by: Monk Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
issue:
under SR-IOV sometimes the iB test will fail on
gfx ring
fix:
with cond_exec inserted in RB the gfx engine would
skip part packets if RLCV issue PREEMPT on gfx engine
if gfx engine is prior to COND_EXEC packet, this is
okay for regular command from UMD, but for the ib test
since the whole dma format doesn't support PREEMPT
so must remove the COND_EXEC from it.
Signed-off-by: Monk Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
All HDP invalidation and most flush can now be replaced by the generic
ASIC function.
Signed-off-by: Christian König <[email protected]>
Acked-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
When ring special operations aren't available we can fallback to the
generic ASIC operations.
Signed-off-by: Christian König <[email protected]>
Acked-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
sed -i "s/vm_id/vmid/g" drivers/gpu/drm/amd/amdgpu/*.c
sed -i "s/vm_id/vmid/g" drivers/gpu/drm/amd/amdgpu/*.h
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Move both into the new files amdgpu_ids.[ch]. No functional change.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Instead mark fence as explicit in it's amdgpu_sync_entry.
v2:
Fix use after free bug and add new parameter description.
Signed-off-by: Andrey Grodzovsky <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
for SR-IOV, we must keep the pipeline-sync in the protection
of COND_EXEC, otherwise the command consumed by CPG is not
consistent when world switch triggerd, e.g.:
world switch hit and the IB frame is skipped so the fence
won't signal, thus CP will jump to the next DMAframe's pipeline-sync
command, and it will make CP hang foever.
after pipelin-sync moved into COND_EXEC the consistency can be
guaranteed
Signed-off-by: Monk Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This allows us to queue IBs which needs an up to date system domain as well.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Acked-by: Felix Kuehling <[email protected]>
|
|
Needed for the proper command sequence for VCN.
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
v2: directly return for 'if' case.
Signed-off-by: Chunming Zhou <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
this is an improvement for previous patch, the sched_sync is to store fence
that could be skipped as scheduled, when job is executed, we didn't need
pipeline_sync if all fences in sched_sync are signalled, otherwise insert
pipeline_sync still.
v2: handle error when adding fence to sync failed.
Signed-off-by: Chunming Zhou <[email protected]>
Reviewed-by: Junwei Zhang <[email protected]> (v1)
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Signed-off-by: Monk Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
AI affected:
CP/HW team requires KMD insert FRAME_CONTROL(end) after
the last IB and before the fence of this DMAframe.
this is to make sure the cache are flushed, and it's a must
change no matter MCBP/SR-IOV or bare-metal case because new
CP hw won't do the cache flush for each IB anymore, it just
leaves it to KMD now.
with this patch, certain MCBP hang issue when rendering
vulkan/chained-ib are resolved.
v2: drop gfx8 changes. gfx8 is not affected (Alex)
Signed-off-by: Monk Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The problem is that executing the jobs in the right order doesn't give you the right result
because consecutive jobs executed on the same engine are pipelined.
In other words job B does it buffer read before job A has written it's result.
Signed-off-by: Chunming Zhou <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This way GFX and MM won't fight for VMIDs any more.
Initially disabled since we need to stop flushing all HUBS
at the same time as well.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Andres Rodriguez <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Signed-off-by: Harry Wentland <[email protected]>
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
1) Adapt to vulkan:
Now use double SWITCH BUFFER to replace the 128 nops w/a,
because when vulkan introduced, umd can insert 7 ~ 16 IBs
per submit which makes 256 DW size cannot hold the whole
DMAframe (if we still insert those 128 nops), CP team suggests
use double SWITCH_BUFFERs, instead of tricky 128 NOPs w/a.
2) To fix the CE VM fault issue when MCBP introduced:
Need one more COND_EXEC wrapping IB part (original one us
for VM switch part).
this change can fix vm fault issue caused by below scenario
without this change:
>CE passed original COND_EXEC (no MCBP issued this moment),
proceed as normal.
>DE catch up to this COND_EXEC, but this time MCBP issued,
thus DE treats all following packages as NOP. The following
VM switch packages now looks just as NOP to DE, so DE
dosen't do VM flush at all.
>Now CE proceeds to the first IBc, and triggers VM fault,
because DE didn't do VM flush for this DMAframe.
3) change estimated alloc size for gfx9.
with new DMAframe scheme, we need modify emit_frame_size
for gfx9
4) No need to insert 128 nops after gfx8 vm flush anymore
because there was double SWITCH_BUFFER append to vm flush,
and for gfx7 we already use double SWITCH_BUFFER following
after vm_flush so no change needed for it.
5) Change emit_frame_size for gfx8
v2: squash in BUG removal from Monk
Signed-off-by: Monk Liu <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
We completely bypass the HDP now.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Signed-off-by: Junwei Zhang <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|