Age | Commit message (Collapse) | Author | Files | Lines |
|
Match existing asics.
v2: rebase (Alex)
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Kevin Wang <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Aldebaran has fine grained DPM for GFXCLK. Instead of a discrete level,
user can specify a min/max range of GFXCLK for any profiling/tuning
purpose.This option is available only in manual performance level mode.
Select "manual" as power_dpm_force_performance_level and specify the
min/max range using pp_dpm_sclk sysfs node. User cannot specify a min/max
range outside of the default min/max range of the ASIC. If specified
outside the range, values will be bound by the default min/max range.
Ex: To use gfxclk min = 600MHz and max = 900MHz
echo manual > /sys/bus/pci/devices/.../power_dpm_force_performance_level
echo min 600 max 900 > /sys/bus/pci/devices/.../pp_dpm_sclk
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
the following message is not supported.
PPSMC_MSG_ReadSerialNumTop32
PPSMC_MSG_ReadSerialNumBottom32
Signed-off-by: Kevin Wang <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
With a recent gart page table re-construction, the gart page
table is now 2-level for some ASICs: PDB0->PTB.
In the case of 2-level gart page table, the page_table_base
of vmid0 should point to PDB0 instead of PTB.
Signed-off-by: Alex Sierra <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Reviewed-by: Oak Zeng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
More accurate words are used to address a
code review feedback
Signed-off-by: Oak Zeng <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
For the new 2-level GART table, the last PDE0 points
to PTB. Since PTB is in vram and right now we are
runing under s=0 mode (vram is treated as FB carveout),
so the s bit of this PDE0 should be set to 0.
Signed-off-by: Oak Zeng <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
update mmhub client id table for Aldebaran.
Signed-off-by: Alex Sierra <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Aldebaran can share the same initializing shader code witn
arcturus.
Signed-off-by: Dennis Li <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
With the 2-level gart page table, vram is squeezed into gart aperture
and FB aperture is disabled. Therefore all VRAM virtual addresses are
in the GART aperture. However currently PSP requires TMR addresses
in FB aperture. So we need some design change at PSP FW level to support
this 2-level gart table driver change. Right now this PSP FW support
doesn't exist. To workaround this issue temporarily, FB aperture is
added back and the gart aperture address is converted back to FB aperture
for this PSP TMR address.
Will revert it after we get a fix from PSP FW.
v2: squash in tmr fix for other asics (Kevin)
Signed-off-by: Oak Zeng <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Set up HW for 2-level vmid0 page table: 1. Set up
PAGE_TABLE_START/END registers. Currently only plan
to do 2-level page table for ALDEBARAN, so only gfxhub1.0
and mmhub1.7 is changed. 2. Set page table base register.
For 2-level page table, the page table base should point
to PDB0. 3. Disable AGP and FB aperture as they are not
used.
Signed-off-by: Oak Zeng <[email protected]>
Reviewed-by: Christian Konig <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
If use gart for FB translation, allocate and fill
PDB0.
Signed-off-by: Oak Zeng <[email protected]>
Reviewed-by: Christian Konig <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add functions to allocate PDB0, map it for CPU access,
and fill it.
Those functions are only used for 2-level vmid0 page
table construction
Signed-off-by: Oak Zeng <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
If use gart for FB translation, we will squeeze vram into
sysvm aperture. This requires 2 level gart table. Add
page table depth and page table block size parameters
to gmc. This is prepare work to 2-level gart table
construction
Signed-off-by: Oak Zeng <[email protected]>
Reviewed-by: Christian Konig <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
If use GART for FB translation, place both vram and gart to sysvm
aperture. AGP aperture is not set up in this case because it
is not used
Signed-off-by: Oak Zeng <[email protected]>
Reviewed-by: Christian Konig <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Modify the comment to reflect the fact that, if
use GART for vram address translation for vmid0,
[vram_start, vram_end] will be placed inside SYSVM
aperture, together with GART.
Signed-off-by: Oak Zeng <[email protected]>
Reviewed-by: Christian Konig <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
In amdgpu_gmc_gart_location function, gart_size is adjusted
by a smu_prv_buffer_size. This logic shouldn't belong to
this function. Move the logic to the mc_init functions
Signed-off-by: Oak Zeng <[email protected]>
Reviewed-by: Christian Konig <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
On A+A platform, CPU write page directory and page table in cached
mode. So it is necessary for page table walker to snoop CPU cache.
This setting is necessary for page walker to snoop page directory
and page table data out of CPU cache.
Signed-off-by: Oak Zeng <[email protected]>
Acked-by: Christian Konig <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
On A+A platform, vram can be mapped as WB. Not necessarily
to always map vram as WC on such platform.
Calling function arch_io_reserve_memtype_wc will mark the
whole vram region as WC. So don't call it for A+A platform.
Signed-off-by: Oak Zeng <[email protected]>
Suggested-by: Alex Deucher <[email protected]>
Acked-by: Christian Konig <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Status 0 indicates success, fix the check before using PPTable limit
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Evan Quan <[email protected]>
Reviewed-by: Kevin Wang <[email protected]>`
Signed-off-by: Alex Deucher <[email protected]>
|
|
Performance Determinism is a new mode in Aldebaran where PMFW tries to
maintain sustained performance level. It can be enabled on a per-die
basis on aldebaran. To guarantee that it remains within the power cap,
a max GFX frequency needs to be specified in this mode. A new
power_dpm_force_performance_level, "perf_determinism", is defined to enable
this mode in amdgpu. The max frequency (in MHz) can be specified through
pp_dpm_sclk. The mode will be disabled once any other performance level
is chosen.
Ex: To enable perf determinism at 900Mhz max gfx clock
echo perf_determinism > /sys/bus/pci/devices/.../power_dpm_force_performance_level
echo max 900 > /sys/bus/pci/devices/.../pp_dpm_sclk
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
On aldebaran DCBTC should be run after enabling DPM. DCBTC won't be run
if support is not enabled in PPTable. Without PPTable support the message
is dummy and will return success always.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Aldebaran doesn't have AC/DC power limits. Separate the implementation
from SMU13. Max power limit is queried from PPTable.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Kevin Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The psp supplies the link type in the upper 2 bits of the psp xgmi node
information num_hops field. With a new link type, Aldebaran has these
bits set to a non-zero value (1 = xGMI3) so the KFD topology will report
the incorrect IO link weights without proper masking.
The actual number of hops is located in the 3 least significant bits of
this field so mask if off accordingly before passing it to the KFD.
Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Amber Lin <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
By default this timestamp is 32 bit counter. It gets
overflowed in around 10 minutes.
Signed-off-by: Alex Sierra <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
If xnack is on, VM retry fault interrupt send to IH ring1, and ring1
will be full quickly. IH cannot receive other interrupts, this causes
deadlock if migrating buffer using sdma and waiting for sdma done while
handling retry fault.
Remove VMC from IH storm client, enable ring1 write pointer overflow,
then IH will drop retry fault interrupts and be able to receive other
interrupts while driver is handling retry fault.
IH ring1 write pointer doesn't writeback to memory by IH, and ring1
write pointer recorded by self-irq is not updated, so always read
the latest ring1 write pointer from register.
Signed-off-by: Philip Yang <[email protected]>
Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
With the current kfd memory accounting scheme, kfd applications
can use up to 15/16 of total system memory. For system which
has small total system memory size it leaves small system memory
for OS. For example, if the system has totally 16GB of system
memory, this scheme leave OS and non-kfd applications only 1GB
of system memory. In many cases, this leads to OOM killer.
This patch changed the KFD system memory accounting scheme.
15/16 of free system memory when kfd driver load. This deduct
the system memory that OS already use.
Signed-off-by: Oak Zeng <[email protected]>
Suggested-by: Philip Yang <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Arcturus and onwards products should follow the same sequence
that have pmfw loading ahead of tmr setup
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Kevin Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Aldebaran MMHUB CG/LS logic is controlled by VBIOS. Enable the state
change logic only if driver is used for control.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Feifei Xu <[email protected]>
Reviewed-by: Kevin Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
v1: The interrupts need to be enabled to move to DS clocks.
v2: Don't enable GFX IDLE interrupts if there are no GFX rings.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Remove SMU_MSG_GfxDriverReset generic index.
Always use SMU_MSG_GfxDeviceDriverReset as the generic index for reset.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Feifei Xu <[email protected]>
Reviewed-by: Kevin Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use the correct mapping for mode-reset messages on aldebaran
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Feifei Xu <[email protected]>
Reviewed-by: Kevin Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
PrepareMp1Reset and SoftReset messages are not supported on aldebaran.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Feifei Xu <[email protected]>
Reviewed-by: Kevin Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Aldebaran clock gating support for GFX,SDMA,IH blocks
VCN/JPEG blocks are excluded in this patch, to be enabled later
Signed-off-by: Lijo Lazar <[email protected]>
Acked-by: Feifei Xu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add the mmhub client id table for aldebaran.
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Enable dpg indirect sram mode on aldebaran.
Signed-off-by: James Zhu <[email protected]>
Reviewed-by: Leo Liu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Enable vcn dpg mode on aldebaran
Signed-off-by: James Zhu <[email protected]>
Reviewed-by: Leo Liu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Enable vcn and jpeg 2.6 on aldebaran.
Signed-off-by: James Zhu <[email protected]>
Reviewed-by: Leo Liu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Enable smu13 block on aldebaran
Signed-off-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
global noretry setting now is cached to gmc.noretry
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Kevin Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
get_num_acc_vgprs does not set status.scc if the number of acc vgprs
is 0, so use an and instruction to set the condition code.
The Aldebaran handler binary was not based on the latest version of
the sources, so this update to the binary is the minimal change only
adding two instructions to set the condition code.
A newer version of the handler should be generated and tested in
another commit.
Signed-off-by: Laurent Morichetti <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
For GPUs that don't support fan control, set the no fan control flag so
that they don't appear in hwmon sensors.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
hdp read cache is removed in aldebaran. don't issue
an mmio write or write data packet to hardware.
v2: rebase
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Feifei Xu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Simplify all Aldebaran DIDs into one ASIC type.
Signed-off-by: Amber Lin <[email protected]>
Reviewed-by: Kevin Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
This causes infinite retries on the UTCL1 RB, preventing
higher priority RB such as paging RB.
[How]
Set to one the SDMAx_UTLC1_TIMEOUT registers for all SDMAs.
Signed-off-by: Alex Sierra <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
0x61 is assigned to HBM2E in atom_dgpu_vram_type.
Signed-off-by: Feifei Xu <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
driver should use the gfx_info atomfirmware interface
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Feifei Xu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
For ASICs that don't support ip discovery feature, query
gfx configuration through atomfirmware interface, rather
than gpu_info firmware.
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Feifei Xu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
PPSMC_MSG_SetSystemVirtualDramAddrHigh/Low messages are not handled by
PMFW in aldebaran
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Reviewed-by: Feifei Xu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Temporarily force to use BU PPTable defined in VBIOS. Add support to
override PPTable defined by module parameter.Add FW reported version to
kernel log.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Reviewed-by: Kevin Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Temporarily add smu_pptable module parameter for aldebaran.This is used
to force soft PPTable use overriding any VBIOS PPTable.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Reviewed-by: Kevin Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|