Age | Commit message (Collapse) | Author | Files | Lines |
|
When aperture size is zero, there is no mapping done.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
On GFXIP 9.4.3, we don't need to rely on xGMI hive info to determine P2P
access.
Reviewed-by: Felix Kuehling <[email protected]>
Acked-and-tested-by: Mukul Joshi <[email protected]>
Signed-off-by: Rajneesh Bhardwaj <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
For native mode, after amdgpu_bo is created in the CPU domain, call
amdgpu_ttm_tt_set_mem_pool to select the TTM pool using bo->mem_id.
ttm_bo_validate will allocate the memory to the correct memory partition
before mapping to GPUs.
Reviewed-by: Felix Kuehling <[email protected]>
Acked-and-tested-by: Mukul Joshi <[email protected]>
Signed-off-by: Philip Yang <[email protected]>
Signed-off-by: Rajneesh Bhardwaj <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
For native mode only, create TTM pool for each memory partition to store
the NUMA node id, then the TTM pool will be selected using memory
partition id to allocate memory from the correct partition.
Acked-by: Christian König <[email protected]>
(rajneesh: changed need_swiotlb and need_dma32 to false for pool init)
Reviewed-by: Felix Kuehling <[email protected]>
Acked-and-tested-by: Mukul Joshi <[email protected]>
Signed-off-by: Philip Yang <[email protected]>
Signed-off-by: Rajneesh Bhardwaj <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
When auto mode is specified, driver will choose the right compute
partition mode.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Reviewed-by: Philip Yang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Also check the memory ranges available to the device when deciding on a
valid partition mode. Only select combinations are valid for a
particular mode.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Reviewed-by: Philip Yang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Instead of start xcc id and number of xcc per node, use the xcc mask
which is the mask of logical ids of xccs belonging to a partition.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
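
A minimal sketch of the idea described above, assuming a 32-bit mask in which bit N stands for logical XCC id N. The helper names are hypothetical and are not the driver's actual functions; they only show how a mask can carry the same information as a (start id, count) pair.

```c
#include <stdint.h>

/* Hypothetical helper: build a mask from the old (start id, count) pair. */
static uint32_t xcc_mask_from_range(unsigned int start, unsigned int num)
{
    uint32_t bits;

    if (num == 0 || start >= 32)
        return 0;
    /* Avoid shifting a 32-bit value by 32, which is undefined in C. */
    bits = (num >= 32) ? 0xFFFFFFFFu : ((1u << num) - 1u);
    return bits << start;
}

/* Number of XCCs in the partition: count the set bits. */
static unsigned int xcc_mask_count(uint32_t mask)
{
    unsigned int n = 0;

    for (; mask; mask &= mask - 1)  /* clears the lowest set bit */
        n++;
    return n;
}

/* Lowest logical XCC id in the mask, or -1 if the mask is empty. */
static int xcc_mask_first(uint32_t mask)
{
    int id;

    for (id = 0; id < 32; id++)
        if (mask & (1u << id))
            return id;
    return -1;
}
```

A mask can also describe a non-contiguous set of XCCs, which a start id and a count cannot.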
|
|
Fetch xcp information from xcp_mgr and also add xcc_mask to kfd node.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
After partition switch, fill all relevant xcp information before kfd
starts initialization.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Implement callbacks to fill memory node information in aquavanjaram.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add callback in xcp interface to fill xcp memory id information. Memory
id is used to identify the range/partition of an XCP from the available
memory partitions in device. Also, fill the id information.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
GC 9.4.3 ASICs may have memory split into multiple partitions. Initialize
the memory partition information for each range. The information may be
in the form of a numa node id or a range of pages.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Some ASICs have the device memory divided into multiple partitions. The
partitions could be denoted by a numa node or by a range of pages.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add interface to get numa information of ACPI XCC object. The interface
uses logical id to identify an XCC.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use a struct to store additional numa node information including size
and base address. Add numa_info pointer to xcc object to point to the
relevant structure based on its proximity domain.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Expand the interface to also return the supported memory partition modes
along with the current memory partition mode.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The GMC block handles memory-related information, so it makes more sense
to keep memory partition functions in the gmc block.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add utility functions to get details of xcp and iterate through
available xcps.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use the generic term fw_reserved_memory for FW reserve region. This
region may also hold discovery TMR in addition to other reserve
regions. This region size could be larger than discovery tmr size, hence
don't change the discovery tmr size based on this.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
For IH ring buffer and read/write pointers, use GPU VA space rather than
Guest PA on APU configs. Access through Guest PA doesn't work when IOMMU
is enabled. It is also beneficial in NUMA configs as it allocates from
the closest numa pool in a numa enabled system.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Harish Kasiviswanathan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Simplify so as to use the same sequence to assign logical to physical
ids for all IPs.
Signed-off-by: Lijo Lazar <[email protected]>
Acked-by: Leo Liu <[email protected]>
Tested-by: James Zhu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
VCN DPG buffer object is initialized to NULL. If allotted, buffer object
deletion logic will take care of NULL check and delete accordingly. This
is useful for cases where indirect sram flag could be manipulated later
after buffer allocation.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
0xDEADBEEF is the standard anti-hang value. Using it may cause a
false pass.
Signed-off-by: Sonny Jiang <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
To make sure VCN DB_CTRL is delivered before doorbell write.
Signed-off-by: Sonny Jiang <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Need parentheses for the macro parameters.
Signed-off-by: Sonny Jiang <[email protected]>
Reviewed-by: David (Ming Qiang) Wu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
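
A minimal sketch of why the parentheses matter. The macro names and the 0x100 stride are hypothetical and are not taken from the driver; the point is only that an unparenthesized parameter changes meaning when the argument is an expression.

```c
/* Hypothetical macros for illustration only; not the driver's definitions. */
#define REG_OFFSET_BAD(inst, idx)   (inst * 0x100 + idx)
#define REG_OFFSET_GOOD(inst, idx)  ((inst) * 0x100 + (idx))

static unsigned int offset_bad(unsigned int base, unsigned int idx)
{
    /* Expands to (base + 1 * 0x100 + idx): only the 1 is multiplied. */
    return REG_OFFSET_BAD(base + 1, idx);
}

static unsigned int offset_good(unsigned int base, unsigned int idx)
{
    /* Expands to ((base + 1) * 0x100 + (idx)), as intended. */
    return REG_OFFSET_GOOD(base + 1, idx);
}
```

With `base = 2` and `idx = 3`, the first form evaluates `2 + 0x100 + 3`, while the second evaluates `3 * 0x100 + 3`.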
|
|
The jpeg_v4_0_3 jpeg_pitch register uses UVD_JRBC_SCRATCH0. The WREG()
needs to move to after jpeg_start.
Switch to a posted register write when doing the ring test to make sure
the register write lands before we test the result.
Signed-off-by: Sonny Jiang <[email protected]>
Reviewed-by: Leo Liu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use physical AID index for VCN/JPEG ring name instead of
logical AID index.
Signed-off-by: James Zhu <[email protected]>
Reviewed-by: Sonny Jiang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use dummy register 0xDEADBEEF to select the AID for PSP VCN_RAM ucode.
Signed-off-by: James Zhu <[email protected]>
Reviewed-by: Sonny Jiang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use VCN instance mask to check if an instance is harvested or not.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: James Zhu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Address VCN/JPEG instances using logical ids. Whenever register access is
required, get the physical instance using GET_INST.
Signed-off-by: Lijo Lazar <[email protected]>
Acked-by: Leo Liu <[email protected]>
Tested-by: James Zhu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add mappings for logical to physical id for VCN/JPEG 4.0.3
v2: make local function static (Alex)
Signed-off-by: Lijo Lazar <[email protected]>
Acked-by: Leo Liu <[email protected]>
Tested-by: James Zhu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
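
A minimal sketch of one way a logical-to-physical mapping can be derived from a mask of physical instance numbers: the i-th set bit becomes logical id i. The struct, the function names, and the bound of 8 instances are assumptions made for this illustration and do not reproduce the driver's code or its GET_INST macro.

```c
#include <stdint.h>

#define MAX_INST 8  /* hypothetical upper bound for this sketch */

/* Hypothetical table: logical id -> physical instance number. */
struct inst_map {
    int8_t phys[MAX_INST];
    unsigned int count;   /* number of valid logical ids */
};

/* Fill the table so that the i-th set bit of phys_mask is logical id i. */
static void inst_map_init(struct inst_map *map, uint32_t phys_mask)
{
    unsigned int p, logical = 0;

    for (p = 0; p < MAX_INST; p++)
        map->phys[p] = -1;            /* mark every slot unused first */

    for (p = 0; p < MAX_INST; p++)
        if (phys_mask & (1u << p))
            map->phys[logical++] = (int8_t)p;

    map->count = logical;
}

/* Physical instance for a logical id, or -1 if the id is out of range. */
static int inst_map_get_phys(const struct inst_map *map, unsigned int logical)
{
    return logical < map->count ? map->phys[logical] : -1;
}
```

Callers then loop over logical ids `0..count-1` and translate to a physical instance only at the point of register access.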
|
|
Keep an instance mask formed by physical instance numbers for VCN and JPEG
IPs. Populate the mask from discovery table information.
Signed-off-by: Lijo Lazar <[email protected]>
Acked-by: Leo Liu <[email protected]>
Tested-by: James Zhu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
VCN ucode loading is moved to early_init, using the 'amdgpu_ucode_*'
helpers.
Reviewed-by: Leo Liu <[email protected]>
Signed-off-by: Sonny Jiang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
For topology reflection, each socket has exactly the same topology info
to every other socket as the other way around. So it is safe to keep the
reflected num_links value; otherwise it will be overridden by the link
info output of the GET_PEER_LINKS command.
Signed-off-by: Shiwu Zhang <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Initialize sysfs nodes after harvest information is fetched and fetch the
correct harvest info based on each IP instance.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
kfd_flush_tlb_after_unmap should return true for GFX v9.4.3, to do TLB
heavyweight flush after unmapping from GPU to guarantee that the GPU
will not access pages after they have been unmapped. This also helps
improve the mapping to GPU performance.
Without this, KFD accidentally flushes the TLB after mapping to GPU because
the vm update sequence number is increased by the previous unmapping.
Signed-off-by: Philip Yang <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
If SOC doesn't expose dedicated vram, discovery region may be
available through system memory. Rename the existing interface to
generic read_binary_from_mem and add a fallback path to read from system
memory.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
On certain ASICs, discovery info is available at reserved region in system
memory. The location is available through ACPI interface. Add API to read
discovery info from there.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
In certain configs, TMR information is available from ACPI. Add API to
fetch the information.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add parsing of ACPI xcc objects and fill in relevant info from them by
invoking the DSM methods.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-and-tested-by: Rajneesh Bhardwaj <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This patch enables SVM capability on GFX9.4.3 when
run in Native mode. It also sets best_prefetch and
best_restore locations to CPU as there is no VRAM.
Signed-off-by: Mukul Joshi <[email protected]>
Acked-by: Rajneesh Bhardwaj <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
It's not fine-grained; it behaves similarly to MGCG.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
During partition switch, keep the state as transient mode. Fetch the
latest state if switch fails.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
It's not required to take the lock in all cases while querying partition
mode. Querying partition mode during KFD init process doesn't need to
take a lock. Init process after a switch will already be happening under
lock. Control the behaviour by adding flags to xcp_query_partition_mode.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
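
A minimal sketch of a flag-controlled query, assuming a single "caller already holds the lock" flag. The flag names, the struct, and the counter that stands in for a real mutex are all hypothetical; they only illustrate how one query function can serve both locked and unlocked callers.

```c
/* Hypothetical flag names, used only for this sketch. */
#define QUERY_FL_NONE   0u
#define QUERY_FL_LOCKED (1u << 0)  /* caller already holds the lock */

/* Stand-in for the partition manager; a real one would hold a mutex. */
struct fake_mgr {
    int lock_depth;         /* 0 = free, 1 = held */
    int lock_acquisitions;  /* how many times the query took the lock */
    int mode;               /* current partition mode */
};

static int query_partition_mode(struct fake_mgr *mgr, unsigned int flags)
{
    int take_lock = !(flags & QUERY_FL_LOCKED);
    int mode;

    if (take_lock) {        /* a real implementation would lock a mutex */
        mgr->lock_depth++;
        mgr->lock_acquisitions++;
    }

    mode = mgr->mode;

    if (take_lock)          /* and unlock it here */
        mgr->lock_depth--;

    return mode;
}
```

A caller on the init path after a switch, which already runs under the lock, would pass `QUERY_FL_LOCKED`; other callers would pass `QUERY_FL_NONE`.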
|
|
Fix typo in the smu socclk value.
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Modifications to mode-2 reset flow for SMU v13.0.6 ASICs.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Reviewed-by: Asad Kamal <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
On SMU v13.0.6 APUs, FW will need to take some actions if driver is going
to halt RLC. Notify PMFW that driver is not going to manage device so
that FW takes care of the required actions.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Reviewed-by: Asad Kamal <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
It adds message support for FW notification on driver unload.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Le Ma <[email protected]>
Reviewed-by: Asad Kamal <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add mem temperature as part of hw mon attributes for GC version 9.4.3
Signed-off-by: Asad Kamal <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Update hw mon attributes for GC Version 9.4.3 to valid ones
on APU and Non APU systems
v2: Group checks along existing one
Added power limit & mclock for gc version 9.4.3
Signed-off-by: Asad Kamal <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|