path: root/drivers/gpu/drm/amd
Age | Commit message | Author | Files | Lines
2023-06-09 | drm/amdgpu: do some register access cleanup in nbio v7_9 | Le Ma | 1 | -6/+7
Use WREG_SOC15x() instead of WREG32(SOC15_REG_OFFSET()) Signed-off-by: Le Ma <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: extend max instances | Le Ma | 2 | -2/+2
Number of instances is extended. Signed-off-by: Le Ma <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: increase DISCOVERY_TMR_SIZE | Le Ma | 1 | -1/+1
New ip_discovery binary size is increased. Signed-off-by: Le Ma <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: update ip discovery header to v4 | Le Ma | 1 | -1/+29
version 4 supports 64bit ip base address Signed-off-by: Le Ma <[email protected]> Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: switch to aqua_vanjaram_doorbell_index_init | Le Ma | 2 | -1/+24
New doorbell index assignment is used by aqua_vanjaram. Signed-off-by: Le Ma <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: Use SDMA instance table for aqua vanjaram | Lijo Lazar | 1 | -1/+12
For aqua vanjaram, add mapping for logical to physical instances. v2: Register accesses on bare metal should be based on physical instance. Use GET_INST() to get physical instance. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: Add mask for SDMA instances | Lijo Lazar | 1 | -1/+1
Add a mask of SDMA instances available for use. On certain ASIC configs, not all SDMA instances are available for software use. v2: Change sdma mask type to uint32_t (Le) Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: Add IP instance map for aqua vanjaram | Lijo Lazar | 3 | -0/+53
Add XCC logical to physical instance map for aqua vanjaram v2: Keep look up table only for required IPs, for others return default mapping (Felix). Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Le Ma <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: add new doorbell assignment table for aqua_vanjaram | Le Ma | 2 | -1/+53
There are four basic reasons for the change: 1. The number of rings grows considerably on aqua_vanjaram, and adjusting the old assignment cannot place each ring in a contiguous doorbell space. 2. The SDMA doorbell index should not exceed 0x1FF on aqua_vanjaram due to the regDOORBELLx_CTRL_ENTRY.BIF_DOORBELLx_RANGE_OFFSET_ENTRY field width. 3. Redesigning the doorbell assignment and unifying the calculation as "start + ring/inst id" makes the code much more concise. 4. Defining only the START/END makes the table look simple. v2: (Lijo) 1. replace name 2. use num_inst_per_aid/sdma_doorbell_range instead of hardcoding Signed-off-by: Le Ma <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
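The "start + ring/inst id" calculation named in this commit message can be sketched roughly as follows. This is a minimal illustration rather than the driver's code: `struct doorbell_range` and `doorbell_index()` are hypothetical names, and the only value taken from the message is the 0x1FF ceiling for SDMA.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical START/END pair describing one engine type's doorbell window. */
struct doorbell_range {
    uint32_t start;
    uint32_t end;   /* inclusive */
};

/*
 * Compute "start + id" and reject ids that fall outside the window.
 * Returns 0 and stores the index in *out on success, -1 otherwise.
 */
static int doorbell_index(const struct doorbell_range *r, uint32_t id,
                          uint32_t *out)
{
    assert(r->start <= r->end);
    if (id > r->end - r->start)
        return -1;
    *out = r->start + id;
    return 0;
}
```

With an assumed window of 0x100 to 0x1FF, id 0 maps to 0x100 and id 0xFF maps to 0x1FF, while id 0x100 is rejected because it would exceed the end of the window.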
2023-06-09 | drm/amdgpu: Fix register access on GC v9.4.3 | Lijo Lazar | 1 | -2/+2
In GC v9.4.3 there are multiple XCCs. It's required to use physical instance number to get the right register offset. Use GET_INST API for that. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: Fix programming of initial XCP mode | Lijo Lazar | 1 | -22/+6
On initialization set the partition mode correctly to SPX (default) or any other user specified partition mode. Use switch_compute_partition API so that all settings are initialized correctly. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdkfd: Update interrupt handling for GFX9.4.3 | Mukul Joshi | 6 | -20/+19
Update interrupt handling in CPX mode for GFX9.4.3 by using the VMID space instead of SDMA client id to determine if an interrupt should be processed by a KFD node. This is especially needed for handling retry faults from MMHUB. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: Fix failure when switching to DPX mode | Mukul Joshi | 1 | -1/+5
Fix the if condition which causes dynamic repartitioning to fail when trying to switch to DPX mode. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Amber Lin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdkfd: Use instance table for GFX 9.4.3 | Mukul Joshi | 2 | -32/+33
For GFX 9.4.3, use the logical to physical mapping table, to get the correct XCD instance when accessing registers on bare metal. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Amber Lin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: Fix SWS on multi-XCD GPU | Amber Lin | 1 | -9/+22
GFX_9_4_3 supports multiple XCDs and multiple AIDs in one GPU device. SWS needs to program IH_VMID_x_LUT with the specified XCC instance and the corresponding AID instance. Signed-off-by: Amber Lin <[email protected]> Reviewed-by: Mukul Joshi <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: drop redundant csb init for gfx943 | Le Ma | 1 | -99/+0
It's not required for compute pipeline and will cause soft lockup on emulation due to long-time writing. Signed-off-by: Le Ma <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: adjust s2a entry register for sdma doorbell trans decoding | Le Ma | 1 | -24/+8
Use s2a entry 5/6 registers to decode sdma doorbell trans on different AIDs, which aligns the entry table in SHUB spec, and leave entry 4 dedicated for VCN doorbell to avoid conflict. Signed-off-by: Le Ma <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdkfd: Update SMI events for GFX9.4.3 | Mukul Joshi | 5 | -40/+40
On GFX 9.4.3, there can be multiple KFD nodes. As a result, SMI events for SVM, queue evict/restore should be raised for each node independently. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: Use status register for partition mode | Lijo Lazar | 1 | -16/+12
Program partition status register to reflect the current partition mode. Partition capability register is for capability and is a one-time setting. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdkfd: pass kfd_node ref to svm migration api | Alex Sierra | 9 | -133/+166
This work is required for GC 9.4.3 as a prerequisite to supporting memory partitions per node in SVM. When multiple partitions are configured, every BO should be allocated inside one specific partition, which corresponds to the current amdgpu_device and kfd_node. v2: squash in compilation fix (Alex) v3: squash in fix for pre-gfx 9.4.3 (Alex) v4: squash in best_loc fix (Alex) Signed-off-by: Alex Sierra <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: Conform to SET_UCONFIG_REG spec | Lijo Lazar | 1 | -3/+4
The packet expects only 16 bits register offset. Hence pass register offset which is local to each XCC. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu/vcn: add vcn multiple AIDs support | James Zhu | 1 | -376/+434
add vcn multiple AIDs support. v2: squash in FW setting fix (Alex) Signed-off-by: James Zhu <[email protected]> Acked-by: Leo Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu/vcn: update clock gate setting for VCN 4.0.3 | James Zhu | 1 | -14/+16
Update clock gate setting. Signed-off-by: James Zhu <[email protected]> Acked-by: Leo Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu/jpeg: add JPEG multiple AIDs support | James Zhu | 1 | -153/+227
Add JPEG multiple AIDs support. Signed-off-by: James Zhu <[email protected]> Acked-by: Leo Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu/nbio: add vcn doorbell multiple AIDs support | James Zhu | 2 | -3/+18
Update vcn doorbell range to support multiple AIDs. Signed-off-by: James Zhu <[email protected]> Acked-by: Leo Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: Fix GRBM programming sequence | Lijo Lazar | 1 | -3/+6
It needs to be done only for XCC instances in non-AID0. Use the physical instance to determine non-AID0 XCC instances. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: Use instance table for sdma 4.4.2 | Lijo Lazar | 3 | -14/+38
For ASICs with sdma IP v4.4.2, add mapping for logical to physical instances. v2: Register accesses on bare metal should be based on physical instance. Use GET_INST() to get physical instance. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: Add mask for SDMA instances | Lijo Lazar | 1 | -0/+3
Add a mask of SDMA instances available for use. On certain ASIC configs, not all SDMA instances are available for software use. v2: Change sdma mask type to uint32_t (Le) Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: Use instance lookup table for GC 9.4.3 | Lijo Lazar | 4 | -271/+279
Register accesses need to be based on physical instance on bare metal. Pass the right instance using logical to physical instance lookup table before accessing registers. Add a macro GET_INST to get the right physical instance of an IP corresponding to a logical instance. v2: fix gfx_v9_4_3_check_rlcg_range() (Alex) Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: Add map of logical to physical inst | Lijo Lazar | 1 | -0/+10
Add a map for logical to physical instances of an IP. For ex: on some device configurations, the first logical XCC may not be the first physical XCC. Software may continue to access in logical IP instance order. The map provides a convenient way to get to the actual physical instance. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
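One way to picture the logical-to-physical map described here is a small table built from a bitmask of the instances that are physically present. The sketch below is only an illustration under that assumption: `struct inst_map`, `inst_map_build()`, `inst_map_get()`, and `MAX_INST` are hypothetical names and do not reflect the driver's actual data structures or its GET_INST macro.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_INST 8  /* hypothetical upper bound on instances of one IP */

/* Hypothetical table: index is the logical instance, value the physical one. */
struct inst_map {
    int8_t phys[MAX_INST];
    int count;  /* number of logical instances */
};

/* Assign logical ids 0..count-1 to the set bits of present_mask, in order. */
static void inst_map_build(struct inst_map *m, uint16_t present_mask)
{
    m->count = 0;
    for (int p = 0; p < MAX_INST; p++) {
        if (present_mask & (1u << p))
            m->phys[m->count++] = (int8_t)p;
    }
}

/* Translate a logical instance into the physical instance to access. */
static int inst_map_get(const struct inst_map *m, int logical)
{
    assert(logical >= 0 && logical < m->count);
    return m->phys[logical];
}
```

For a mask of 0x36 (physical instances 1, 2, 4, and 5 present), logical instance 0 resolves to physical instance 1, which matches the message's point that the first logical instance need not be the first physical one.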
2023-06-09 | drm/amdkfd: Add device repartition support | Mukul Joshi | 5 | -5/+66
GFX9.4.3 will support dynamic repartitioning of the GPU through sysfs. Add device repartitioning support in KFD to repartition GPU from one mode to other. v2: squash in fix ("drm/amdkfd: Fix warning kgd2kfd_unlock_kfd defined but not used") Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdkfd: Rework kfd_locked handling | Mukul Joshi | 4 | -13/+25
Currently, even if kfd_locked is set, a process is first created and then removed to work around a race condition in updating the kfd_locked flag. Rework kfd_locked handling to ensure no process is created if kfd_locked is set. This is achieved by updating kfd_locked under kfd_processes_mutex. With this, there is no need for kfd_locked to be an atomic counter. Instead, it can be a regular integer. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
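The locking pattern described in this message (check and update the flag under the same mutex that guards process creation, so a plain integer suffices) can be sketched in user space as below. This is an analogy using POSIX threads, not the kernel code: `processes_mutex`, `locked`, `process_count`, `set_locked()`, and `create_process()` are hypothetical stand-ins.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t processes_mutex = PTHREAD_MUTEX_INITIALIZER;
static int locked;         /* plain int: only accessed under processes_mutex */
static int process_count;  /* likewise protected by processes_mutex */

/* Set or clear the lock flag under the mutex. */
static void set_locked(int value)
{
    assert(value == 0 || value == 1);
    pthread_mutex_lock(&processes_mutex);
    locked = value;
    pthread_mutex_unlock(&processes_mutex);
}

/*
 * Create a process only if the flag is clear. The check and the creation
 * happen in one critical section, so nothing is created and then torn down.
 * Returns 0 on success, -1 if creation is currently blocked.
 */
static int create_process(void)
{
    int ret = 0;

    pthread_mutex_lock(&processes_mutex);
    if (locked)
        ret = -1;
    else
        process_count++;
    pthread_mutex_unlock(&processes_mutex);
    return ret;
}
```

Because the flag can only change while the mutex is held, a creator that observes it clear cannot race with a concurrent `set_locked(1)` inside the same critical section.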
2023-06-09 | drm/amdgpu: configure the doorbell settings for sdma on non-AID0 | Le Ma | 1 | -9/+56
Configure the sdma doorbell settings on NBIF0 and SYSHUB of each AID v2: fetch aid_id from amdgpu_sdma_instance (Lijo) Signed-off-by: Le Ma <[email protected]> Acked-by: Felix Kuehling <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: add indirect r/w interface for smn address greater than 32bits | Le Ma | 5 | -0/+118
On multi-AID platforms, bit[34:32] in the SMN address is leveraged to access non-AID0 register SMN addresses, and a new PCI_INDEX_HI register is introduced to access the higher bits. v2: rebase on latest register accessors (Alex) Signed-off-by: Le Ma <[email protected]> Acked-by: Felix Kuehling <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
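The address split implied by this message can be illustrated as follows: the low 32 bits would go to the existing index register and bits [34:32] to the new PCI_INDEX_HI register. This is a minimal sketch of the arithmetic only; `struct smn_index`, `smn_split()`, and `SMN_HI_MASK` are hypothetical names, and the actual register write sequence is not shown.

```c
#include <assert.h>
#include <stdint.h>

#define SMN_HI_MASK 0x7u  /* bits [34:32], as stated in the commit message */

/* Hypothetical pair of values for the low and high index registers. */
struct smn_index {
    uint32_t lo;  /* address bits [31:0] */
    uint32_t hi;  /* address bits [34:32] */
};

/* Split an SMN address of up to 35 bits into its two index values. */
static struct smn_index smn_split(uint64_t addr)
{
    struct smn_index idx;

    assert((addr >> 35) == 0);  /* nothing above bit 34 is representable */
    idx.lo = (uint32_t)addr;
    idx.hi = (uint32_t)(addr >> 32) & SMN_HI_MASK;
    return idx;
}
```

An address that fits in 32 bits yields a high value of 0, so the split reduces to the existing 32-bit case in this sketch.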
2023-06-09 | drm/amdkfd: EOP Removal - Handle size 0 correctly | David Belanger | 1 | -2/+7
On GC 9.4.3, we are removing the EOP buffer. If we specify 0 for the size, CP_HQD_EOP_CONTROL ends up with an incorrect value because the order_size_2 calculation does not handle 0. Fix it by using zero for the MQD entry for EOP size 0. v2: Reworked code with a conditional assignment and fixed style issues. Signed-off-by: David Belanger <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
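The failure mode described here, a log2-style size encoding that breaks on a zero input, and the conditional fix can be sketched as below. The encoding used (ceil(log2(size in dwords)) minus 1) is an assumption made for illustration, and `ceil_log2()` and `eop_size_field()` are hypothetical helpers, not the driver's functions.

```c
#include <assert.h>
#include <stdint.h>

/* ceil(log2(n)) for n >= 1; there is no meaningful result for n == 0. */
static unsigned int ceil_log2(uint32_t n)
{
    unsigned int order = 0;

    assert(n >= 1);
    while ((UINT64_C(1) << order) < n)
        order++;
    return order;
}

/*
 * Assumed encoding: field = ceil(log2(size in dwords)) - 1.
 * Sizes below two dwords, including the size 0 case from the commit,
 * are mapped to 0 instead of being passed to the log2 helper.
 */
static uint32_t eop_size_field(uint32_t eop_bytes)
{
    uint32_t dwords = eop_bytes / 4;

    if (dwords < 2)
        return 0;
    return ceil_log2(dwords) - 1;
}
```

Without the conditional, a size of 0 would reach the log2 step and the subsequent subtraction would wrap around in unsigned arithmetic, which is the kind of incorrect register value the message refers to.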
2023-06-09 | drm/amdgpu: reflect psp xgmi topology info for gfx9.4.3 | Jonathan Kim | 1 | -4/+7
Similar to GFX9.4.2 non-A+A devices, GFX9.4.3 psp xgmi topology info is half duplex and requires the driver to fill in the bidirectional info. Signed-off-by: Jonathan Kim <[email protected]> Reviewed-by: Shiwu Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu/vcn: update amdgpu_fw_shared to amdgpu_vcn4_fw_shared | James Zhu | 1 | -29/+11
Use amdgpu_vcn4_fw_shared for vcn 4.0.3. Signed-off-by: James Zhu <[email protected]> Acked-by: Leo Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu/vcn: remove unused code | James Zhu | 1 | -121/+0
Remove unused code. Signed-off-by: James Zhu <[email protected]> Acked-by: Leo Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu/vcn: update ucode setup | James Zhu | 1 | -10/+1
Use common amdgpu_vcn_setup_ucode for ucode setup. Signed-off-by: James Zhu <[email protected]> Acked-by: Leo Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu/vcn: update new doorbell map | James Zhu | 3 | -5/+5
New doorbell map is used for VCN 4.0.3. Signed-off-by: James Zhu <[email protected]> Acked-by: Leo Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu/jpeg: update jpeg header to support multiple AIDs | James Zhu | 1 | -0/+2
Add aid_id in jpeg header to support multiple AIDs. Signed-off-by: James Zhu <[email protected]> Acked-by: Leo Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu/vcn: update vcn header to support multiple AIDs | James Zhu | 1 | -0/+3
Add aid_id in vcn header to support multiple AIDs Signed-off-by: James Zhu <[email protected]> Acked-by: Leo Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu/vcn: use vcn4 irqsrc header for VCN 4.0.3 | James Zhu | 1 | -3/+3
Use vcn4 irqsrc header for VCN 4.0.3. Signed-off-by: James Zhu <[email protected]> Acked-by: Leo Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: Change num_xcd to xcc_mask | Lijo Lazar | 7 | -99/+141
Instead of number of XCCs, keep a mask of XCCs for the exact XCCs available on the ASIC. XCC configuration could differ based on different ASIC configs. v2: Rename num_xcd to num_xcc (Hawking) Use smaller xcc_mask size, changed to u16 (Le) Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: add the support of XGMI link for GC 9.4.3 | Shiwu Zhang | 2 | -4/+47
Add the xgmi LFB_CNTL/LFB_SIZE reg addresses to fetch the xgmi info from. v2: move get_xgmi_info() to GC_V9_4_3 specific source files to utilize the register definitions specific for GC_V9_4_3 v3: remove the duplicated register definitions v4: enable xgmi based on asic_type as XGMI_IP ver is not available yet for IP discovery Signed-off-by: Shiwu Zhang <[email protected]> Reviewed-by: Le Ma <[email protected]> Ack-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdgpu: add new vram type for dgpu | Hawking Zhang | 2 | -0/+2
hbm3 will be supported in some dgpu program Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdkfd: Populate memory info before adding GPU node to topology | Mukul Joshi | 1 | -2/+2
The local memory info needs to be fetched before the GPU node is added to topology. Without this, the sysfs is incorrectly populated and the size is reported as 0. This was causing rocr tests to fail. This issue was caused because of a bad merge. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Amber Lin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdkfd: Add SDMA info for SDMA 4.4.2 | Mukul Joshi | 1 | -0/+1
Update SDMA queue information for SDMA 4.4.2. Signed-off-by: Mukul Joshi <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdkfd: Fix SDMA in CPX mode | Mukul Joshi | 1 | -4/+15
When creating a user-mode SDMA queue, CP FW expects the driver to set the virtual SDMA engine id in the MAP_QUEUES packet instead of the physical SDMA engine id. Each partition node's virtual SDMA number should start from 0. However, when allocating a doorbell for the queue, KFD needs to allocate the doorbell from the doorbell space corresponding to the physical SDMA engine id, otherwise the hardware will not see the doorbell press. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Amber Lin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09 | drm/amdkfd: add gpu compute cores io links for gfx9.4.3 | Jonathan Kim | 4 | -16/+47
The PSP TA will only provide xGMI topology info for links between GPU sockets, so links between partitions from different sockets will be hardcoded as 3 xGMI hops, with 1 hop weighted as xGMI and 2 hops weighted with a new intra-socket weight to indicate the longest possible distance. If the link between a partition and the CPU is non-PCIe, then assume the CPU (CCDs) is located within the same socket as the partition and represent the link as an intra-socket weighted single hop xGMI link with memory bandwidth. Links between partitions within a single socket will be abstracted as single hop xGMI links weighted with the new intra-socket weight and will have memory bandwidth. Finally, use the unused function bits in the location ID to represent the coordinates of the compute partition within its socket. A follow-on patch will resolve the requirement for GPU socket xGMI link representation sometime later. Signed-off-by: Jonathan Kim <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>