Age | Commit message (Collapse) | Author | Files | Lines |
|
Set ring functions with software ring callbacks on gfx9.
The software ring could be tested by debugfs_test_ib case.
v2: Set sw_ring 2 to enable software ring by default.
v3: Remove the parameter for software ring enablement.
v4: Use amdgpu_ring_init/fini for software rings.
v5: Update for code format. Fix conflict.
v6: Remove unnecessary checks and enable software ring on gfx9 by default.
v7: Use static array for software ring names and priorities.
v8: Stop creating software rings if no gfx ring existed.
Cc: Christian Koenig <[email protected]>
Cc: Luben Tuikov <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Cc: Michel Dänzer <[email protected]>
Cc: Likun Gao <[email protected]>
Signed-off-by: Jiadong.Zhu <[email protected]>
Acked-by: Luben Tuikov <[email protected]>
Acked-by: Huang Rui <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The software ring is created to support priority context while there is only
one hardware queue for gfx.
Every software ring has its fence driver and could be used as an ordinary ring
for the GPU scheduler.
Multiple software rings are bound to a real ring with the ring muxer. The
packages committed on the software ring are copied to the real ring.
v2: Use array to store software ring entry.
v3: Remove unnecessary prints.
v4: Remove amdgpu_ring_sw_init/fini functions,
using gtt for sw ring buffer for later dma copy
optimization.
v5: Allocate ring entry dynamically in the muxer.
v6: Update comments for the ring muxer.
v7: Modify for function naming.
v8: Combine software ring functions into amdgpu_ring_mux.c
v9: Use kernel-doc comment on the get_rptr function.
Cc: Christian Koenig <[email protected]>
Cc: Luben Tuikov <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Cc: Michel Dänzer <[email protected]>
Signed-off-by: Jiadong.Zhu <[email protected]>
Acked-by: Huang Rui <[email protected]>
Acked-by: Luben Tuikov <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Change the type of parameter on amdgpu_gfx_cp_init_microcode to fix
compiler warning.
Signed-off-by: Likun Gao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add an common function to init CP related microcode.
Signed-off-by: Likun Gao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
cache rlcv/rlcvp ucode version info in amdgpu_gfx
structure
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Likun Gao <[email protected]>
Reviewed-by: Feifei Xu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add comments to document gfx_off related members of struct amdgpu_gfx.
Signed-off-by: André Almeida <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add debugfs interface to log GFXOFF statistics:
- Read amdgpu_gfxoff_count to get the total GFXOFF entry count at the
time of query since system power-up
- Write 1 to amdgpu_gfxoff_residency to start logging, and 0 to stop.
Read it to get average GFXOFF residency % multiplied by 100
during the last logging interval.
Both features are designed to be keep the values persistent between
suspends.
Signed-off-by: André Almeida <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Starting from SIENNA CICHLID asic supports two gfx pipes, enabling
two graphics queues, 1 on each pipe, pipe0 queue0 would be the normal
piority queue and pipe1 queue0 would be the high priority queue
Only one queue per pipe is visble to SPI, SPI looks at the priority
value assigned to CP_GFX_HQD_QUEUE_PRIORITY from each of the queue's
HQD/MQD.
Create contexts applying AMDGPU_CTX_PRIORITY_HIGH which submits job
to the high priority queue on GFX pipe1. There would be starvation
of LP workload if HP workload is always available.
v2:
- remove unnecessary check(Nirmoy)
- make pipe1 hardware support a separate patch(Nirmoy)
- remove duplicate code(Shashank)
- add CSA support for second gfx pipe(Alex)
v3(Christian):
- fix incorrect indentation
- merge COMPUTE and GFX switch cases as both calls the same function.
v4:
- rebase w/ latest code base
Signed-off-by: Arunpravin Paneer Selvam <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
It's over a decade ago that this was actually used for more than ring and
IB tests. Just use the static register directly where needed and nuke the
now useless infrastructure.
Signed-off-by: Christian König <[email protected]>
Acked-by: Lang Yu <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add initial support for GC version 11. GC is
the graphics and compute block on the GPU.
v1: add initial gfx11 support (Wenhui)
v2: switch to new amdgpu_gfx_is_high_priority_compute_queue
interface (Hawking)
v3: fix num_mec (Alex)
Signed-off-by: Wenhui Sheng <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add support to initialize imu for gfx v11.
IMU is a new power management block for
gfx which manages gfx power.
Signed-off-by: Likun Gao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add support for GPU memory controller v11.
v1: Add support for gmc v11.0
Add gmc 11 block (Tianci)
v2: drop unused amdgpu_bo_late_init (Hawking)
v3: squash in various fix
Signed-off-by: Tianci.Yin <[email protected]>
Signed-off-by: Likun Gao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
From the GC info table to the gfx config structure in the
driver. The driver will use this data to configure the
card correctly.
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add help functions to query and reset RAS UTCL2 poison status.
v2: implement it on amdgpu side and kfd only calls it.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
block
Remove redundant calls of amdgpu_ras_block_late_fini in gfx ras block.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Modify .ras_fini function pointer parameter so that
we can remove redundant intermediate calls in some
ras blocks.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Modify .ras_late_init function pointer parameter so that
it can remove redundant intermediate calls in some ras blocks.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
1.Modify gfx block to fit for the unified ras block data and ops.
2.Change amdgpu_gfx_ras_funcs to amdgpu_gfx_ras, and the corresponding variable name remove _funcs suffix.
3.Remove the const flag of gfx ras variable so that gfx ras block can be able to be inserted into amdgpu device ras block link list.
4.Invoke amdgpu_ras_register_ras_block function to register gfx ras block into amdgpu device ras block link list.
5.Remove the redundant code about gfx in amdgpu_ras.c after using the unified ras block.
6.Fill unified ras block .name .block .ras_late_init and .ras_fini for all of gfx versions. If .ras_late_init and .ras_fini had been defined by the selected gfx version, the defined functions will take effect; if not defined, default fill with amdgpu_gfx_ras_late_init and amdgpu_gfx_ras_fini.
Signed-off-by: yipechai <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: John Clements <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Those implementation details(whether swsmu supported, some ppt_funcs supported,
accessing internal statistics ...)should be kept internally. It's not a good
practice and even error prone to expose implementation details.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Currently AMDGPU_RING_PRIO_MAX is redefinition of a
max gfx hwip priority, this won't work well when we will
have a hwip with different set of priorities than gfx.
Also, HW ring priorities are different from ring priorities.
Create a global enum for ring priority levels which each
HWIP can use to define its own priority levels.
Signed-off-by: Nirmoy Das <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
gfx ras is only available in cerntain ip generations.
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Dennis Li <[email protected]>
Reviewed-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
When connected to a host via xGMI, system fatal errors may trigger
warm reset, driver has no change to query edc status before reset.
Therefore in this case, driver should harvest previous error loging
registers during boot, instead of only resetting them.
v2:
1. IP's ras_manager object is created when its ras feature is enabled,
so change to query edc status after amdgpu_ras_late_init called
2. change to enable watchdog timer after finishing gfx edc init
Signed-off-by: Dennis Li <[email protected]>
Reivewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
SQ's watchdog timer monitors forward progress, a mask of which waves
caused the watchdog timeout is recorded into ras status registers and
then trigger a system fatal error event.
v2:
1. change *query_timeout_status to *query_sq_timeout_status.
2. move query_sq_timeout_status into amdgpu_ras_do_recovery.
3. add module parameters to enable/disable fatal error event and modify
the watchdog timer.
v3:
1. remove unused parameters of *enable_watchdog_timer
Signed-off-by: Dennis Li <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
add edc counter/status reset and query functions for gfx block of
aldebaran.
v2: change to clear edc counter explicitly
aldebaran hardware will not clear edc counter after driver reading them,
so driver should clear them explicitly.
Signed-off-by: Dennis Li <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
For high priority compute to work properly we need to enable
wave limiting on gfx pipe. Wave limiting is done through writing
into mmSPI_WCL_PIPE_PERCENT_GFX register. Enable only one high
priority compute queue to avoid race condition between multiple
high priority compute queues writing that register simultaneously.
Signed-off-by: Nirmoy Das <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The new amdgpu_gfx_state_change_set() funtion can support set GFX power
change status to D0/D3.
v2: squash in warning fix (Alex)
Signed-off-by: Prike Liang <[email protected]>
Acked-by: Huang Rui <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Compute queues are configurable with module param, num_kcq.
amdgpu_gfx_is_high_priority_compute_queue was setting 1st 4 queues to
high priority queue leaving a null drm scheduler in
adev->gpu_sched[hw_ip]["normal_prio"].sched if num_kcq < 5.
This patch tries to fix it by alternating compute queue priority between
normal and high priority.
Fixes: 33abcb1f5a1719b1c (drm/amdgpu: set compute queue priority at mqd_init)
Signed-off-by: Nirmoy Das <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add a helper so we can set per asic default values. Also,
the module parameter is currently clamped to 8, but clamp it
per asic just in case some asics have different limits in the
future. Enable the option on gfx6,7 as well for consistency.
Acked-by: Nirmoy Das <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Enable Navi1X MGCG perfmon setting.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
GCEA/MMHUB EA error should not result to DF freeze, this is
fixed in next generation, but for some reasons the GCEA/MMHUB
EA error will result to DF freeze in previous generation,
diver should avoid to indicate GCEA/MMHUB EA error as hw fatal
error in kernel message by read GCEA/MMHUB err status registers.
Changed from V1:
make query_ras_error_status function more general
make read mmhub er status register more friendly
Changed from V2:
move ras error status query function into do_recovery workqueue
Changed from V3:
remove useless code from V2, print GCEA error status
instance number
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
On Navi1x, the SPM golden settings are lost after GFXOFF
enter/exit, so reconfiguration is needed. Make the
configuration code as an interface for future use.
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Reviewed-by: Feifei Xu <[email protected]>
Signed-off-by: Tianci.Yin <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add interface for SMU12 device, used by UMR.
v2: fix code style
Signed-off-by: Jinzhou.Su <[email protected]>
Reviewed-by: Evan Quan <[email protected]>
Reviewed-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add support for GC 10.3.
v2: Squash in gb_addr_config fix (Alex)
v3: Add num_pkrs support (Alex)
Signed-off-by: Likun Gao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Rename it to amdgpu_queue_mask_bit_to_set_resource_bit() to be more
specific about its functionality. KFD will use it later.
Signed-off-by: Yong Zhao <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
According to the current kiq read register method,
there will be race condition when using KIQ to read
register if multiple clients want to read at same time
just like the expample below:
1. client-A start to read REG-0 throguh KIQ
2. client-A poll the seqno-0
3. client-B start to read REG-1 through KIQ
4. client-B poll the seqno-1
5. the kiq complete these two read operation
6. client-A to read the register at the wb buffer and
get REG-1 value
Therefore, use amdgpu_device_wb_get() to request reg_val_offs
for each kiq read register.
v2: fix the error remove
v3: fix the print typo
v4: remove unused variables
Signed-off-by: Yintian Tao <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Generate HW IP's sched_list in amdgpu_ring_init() instead of
amdgpu_ctx.c. This makes amdgpu_ctx_init_compute_sched(),
ring.has_high_prio and amdgpu_ctx_init_sched() unnecessary.
This patch also stores sched_list for all HW IPs in one big
array in struct amdgpu_device which makes amdgpu_ctx_init_entity()
much more leaner.
v2:
fix a coding style issue
do not use drm hw_ip const to populate amdgpu_ring_type enum
v3:
remove ctx reference and move sched array and num_sched to a struct
use num_scheds to detect uninitialized scheduler list
v4:
use array_index_nospec for user space controlled variables
fix possible checkpatch.pl warnings
Signed-off-by: Nirmoy Das <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
We were changing compute ring priority while rings were being used
before every job submission which is not recommended. This patch
sets compute queue priority at mqd initialization for gfx8, gfx9 and
gfx10.
Policy: make queue 0 of each pipe as high priority compute queue
High/normal priority compute sched lists are generated from set of high/normal
priority compute queues. At context creation, entity of compute queue
get a sched list from high or normal priority depending on ctx->priority
Signed-off-by: Nirmoy Das <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
GFX ras error counters are dirty ones after cold reboot
Read operation is needed to reset them to 0
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Reviewed-by: Guchun Chen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The two members will be used by KFD later.
Signed-off-by: Yong Zhao <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
register by KIQ
Move amdgpu_virt_kiq_rreg/amdgpu_virt_kiq_wreg function to amdgpu_gfx.c,
and rename them to amdgpu_kiq_rreg/amdgpu_kiq_wreg.Make it generic and
flexible.
Signed-off-by: chen gong <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Huang Rui <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
tlbs invalidate pointer function added to kiq_pm4_funcs struct.
This way, tlb flush can be done through kiq member.
TLBs invalidatation implemented for gfx9 and gfx10.
Signed-off-by: Alex Sierra <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This sched array can be passed on to entity creation routine
instead of manually creating such sched array on every context creation.
v2: squash in missing break fix
Signed-off-by: Nirmoy Das <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
1. no need to allocate an extra member for 'mqd_backup' array
2. backup/restore mqd to/from the correct 'mqd_backup' array slot
v2: warning fix (Alex)
Signed-off-by: Xiaojie Yuan <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
We have the i915 security fixes to backmerge, but first
let's clear the decks for other drivers to avoid a bigger
mess.
Signed-off-by: Dave Airlie <[email protected]>
|
|
The GRBM register interface is now capable of bursting 1 cycle per
register wr->wr, wr->rd much faster than previous muticycle per
transaction done interface. This has caused a problem where
status registers requiring HW to update have a 1 cycle delay, due
to the register update having to go through GRBM.
For cp ucode, it has realized dummy read in cp firmware.It covers
the use of WAIT_REG_MEM operation 1 case only.So it needs to call
gfx_v10_0_wait_reg_mem in gfx10. Besides it also needs to add warning to
update firmware in case firmware is too old to have function to realize
dummy read in cp firmware.
For sdma ucode, it hasn't realized dummy read in sdma firmware. sdma is
moved to gfxhub in gfx10. So it needs to add dummy read in driver
between amdgpu_ring_emit_wreg and amdgpu_ring_emit_reg_wait for sdma_v5_0.
Signed-off-by: changzhu <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Signed-off-by: Nirmoy Das <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
UMDs need this for correct programming of harvested chips.
Signed-off-by: Marek Olšák <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
gfx_ras_late_init can get the info by itself
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Guchun Chen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
gfx_ras_fini can be shared among all generations of gfx
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Guchun Chen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
gfx ras ecc common functions could be reused among all gfx generations
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Guchun Chen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|