aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
AgeCommit message (Collapse)AuthorFilesLines
2023-07-07drm/amdgpu: Update invalid PTE flag settingMukul Joshi1-0/+2
Update the invalid PTE flag setting with TF enabled. This is to ensure, in addition to transitioning the retry fault to a no-retry fault, it also causes the wavefront to enter the trap handler. With the current setting, the fault only transitions to a no-retry fault. Additionally, have 2 sets of invalid PTE settings, one for TF enabled, the other for TF disabled. The setting with TF disabled, doesn't work with TF enabled. Signed-off-by: Mukul Joshi <[email protected]> Acked-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: Update total channel number for umc v8_10Candice Li1-0/+2
Update total channel number for umc v8_10. Signed-off-by: Candice Li <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: Override MTYPE per page on GFXv9.4.3 APUsFelix Kuehling1-0/+7
On GFXv9.4.3 NUMA APUs, system memory locality must be determined per page to choose the correct MTYPE. This patch adds a GMC callback that can provide this per-page override and implements it for native mode. Carve-out mode is not yet supported and will use the safe default (remote) MTYPE for system memory. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Philip Yang <[email protected]> Reviewed-and-tested-by: Rajneesh Bhardwaj <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: Add memory partitions to gmcLijo Lazar1-0/+17
Some ASICs have the device memory divided into multiple partitions. The parititions could be denoted by a numa node or by a range of pages. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: Move memory partition query to gmcLijo Lazar1-0/+16
GMC block handles memory related information, it makes more sense to keep memory partition functions in gmc block. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: Check APU supports true APP modeRajneesh Bhardwaj1-0/+1
On GPXIP 9.4.3 APU, in no carveout mode there is no real vram heap and could be emulated by the driver over the interleaved NUMA system memory and the APU could also be in the carveout mode during early development stage or otherwise for debugging purpose so introduce a new member in amdgpu_gmc to figure out whether the APU is in the native mode as per the production configuration. AMD_IS_APU cannot be used for Accelerated Processing Platform APUs as it might be used in a different context on previous generations or on small APUs. Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Tested-by: Graham Sider <[email protected]> Signed-off-by: Rajneesh Bhardwaj <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: Add XCC inst to PASID TLB flushingMukul Joshi1-3/+4
Add XCC instance to select the correct KIQ ring when flushing TLBs on a multi-XCC setup. Signed-off-by: Mukul Joshi <[email protected]> Tested-by: Amber Lin <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-04-13drm/amdgpu: Rework retry fault removalMukul Joshi1-0/+1
Rework retry fault removal from the software filter by storing an expired timestamp for a fault that is being removed. When a new fault comes, and it matches an entry in the sw filter, it will be added as a new fault only when its timestamp is greater than the timestamp expiry of the fault in the sw filter. This helps in avoiding stale faults being added back into the filter and preventing legitimate faults from being handled. Suggested-by: Felix Kuehling <[email protected]> Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Philip Yang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-03-15drm/amdgpu: Init MMVM_CONTEXTS_DISABLE in gmc11 golden setting under SRIOVYifan Zha1-0/+2
[Why] If disable the mmhub vm contexts(set MMVM_CONTEXTS_DISABLE to 0xffff), driver loading failed on vf due to fence fallback timer expired on all rings. FLR cannot reset MMVM_CONTEXTS_DISABLE. So this vf can not be recovered anymore unless trigger a whole gpu reset. [How] Under SRIOV, init MMVM_CONTEXTS_DISABLE in gmc11 golden register setting. Signed-off-by: Yifan Zha <[email protected]> Reviewed-by: Horace Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-03-13drm/amdgpu: Move umc ras block init to gmc ras sw_initHawking Zhang1-1/+1
Initialize umc ras block only when umc ip block supports ras. Driver queries ras capabilities after early_init, ras block init needs to be moved to sw_init. Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Stanley Yang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-16drm/amdgpu: save and restore gc hub regsVictor Zhao1-0/+26
Save and restore gfxhub regs as they will be reset during mode 2 Signed-off-by: Victor Zhao <[email protected]> Acked-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-05-04drm/amdgpu: save the setting of VM_CONTEXT_CNTLJack Xiao1-0/+1
MES firmware needs the setting of VM_CONTEXT_CNTL to perform vmid switch. Save the initial setting when hub initializing. Signed-off-by: Jack Xiao <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-05-04drm/amdgpu: add mmhub v3_0 ip blockTianci.Yin1-0/+1
Add support for mmhub v3.0 Signed-off-by: Tianci.Yin <[email protected]> Signed-off-by: Likun Gao <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-04-28drm/amdgpu/discovery: store the number of UMC IPs on the asicAlex Deucher1-0/+2
For chips with IP discovery get this from the table, hardcode it for older asics. Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-04-28drm/amdgpu: store the mall size in the gmc structureAlex Deucher1-0/+3
This will be useful in future patches. Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15drm/amdgpu: Merge get_reserved_allocation to get_vbios_allocations.Yongqiang Sun1-1/+0
Some ASICs need reserved memory for firmware or other components, which is not allowed to be used by driver. amdgpu_gmc_get_reserved_allocation is to handle additional areas. To avoid any missing calling, merged amdgpu_gmc_get_reserved_allocation to amdgpu_gmc_get_vbios_allocations. Acked-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Yongqiang Sun <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-01-25drm/amdgpu: Move xgmi ras initialization from .late_init to .early_inityipechai1-0/+1
Move xgmi ras initialization from .late_init to .early_init, which let xgmi ras can be initialized only once. Signed-off-by: yipechai <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-01-19drm/amdgpu: add vram check function for GMCXiaojian Du1-0/+1
This patch will add vram check function for GMC block. It will write pattern data to the vram and then read back from the vram, so that to verify the work status of vram. This patch will cover gmc v6/7/8/9/10. Signed-off-by: Xiaojian Du <[email protected]> Reviewed-by: Huang Rui <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-01-14drm/amdgpu: Modify xgmi block to fit for the unified ras block data and opsyipechai1-7/+4
1.Modify gmc block to fit for the unified ras block data and ops. 2.Change amdgpu_xgmi_ras_funcs to amdgpu_xgmi_ras, and the corresponding variable name remove _funcs suffix. 3.Remove the const flag of gmc ras variable so that gmc ras block can be able to be inserted into amdgpu device ras block link list. 4.Invoke amdgpu_ras_register_ras_block function to register gmc ras block into amdgpu device ras block link list. 5.Remove the redundant code about gmc in amdgpu_ras.c after using the unified ras block. Signed-off-by: yipechai <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: John Clements <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-01drm/amdgpu: handle IH ring1 overflowPhilip Yang1-1/+2
IH ring1 is used to process GPU retry fault, overflow is enabled to drain retry fault because we want receive other interrupts while handling retry fault to recover range. There is no overflow flag set when wptr pass rptr. Use timestamp of rptr and wptr to handle overflow and drain retry fault. If fault timestamp goes backward, the fault is filtered and should not be processed. Drain fault is finished if processed_timestamp is equal to or larger than checkpoint timestamp. Add amdgpu_ih_functions interface decode_iv_ts for different chips to get timestamp from IV entry with different iv size and timestamp offset. amdgpu_ih_decode_iv_ts_helper is used for vega10, vega20, navi10. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-06-04drm/amdgpu: introduce a stolen reserved buffer to protect specific buffer ↵Huang Rui1-0/+1
region (v2) Some ASICs such as Yellow Carp needs to reserve a region of video memory to avoid access from driver. So this patch is to introduce a stolen reserved buffer to protect specific buffer region. v2: free this buffer in amdgpu_ttm_fini. Signed-off-by: Huang Rui <[email protected]> Acked-and-Tested-by: Aaron Liu <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-04-28drm/amdgpu: address remove from fault filterPhilip Yang1-2/+4
Add interface to remove address from fault filter ring by resetting fault ring entry key, then future vm fault on the address will be processed to recover. Define fault key as atomic64_t type to use atomic read/set/cmpxchg key to protect fault ring access by interrupt handler and interrupt deferred work for vg20. Change fault->timestamp to 48-bit to share same uint64_t with 8-bit fault->next, it is enough for 48bit IH timestamp. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-04-20Revert "drm/amdgpu: workaround the TMR MC address issue (v2)"Oak Zeng1-9/+0
This reverts commit 2f055097daef498da57552f422f49de50a1573e6. 2f055097daef498da57552f422f49de50a1573e6 was a driver workaround when PSP firmware was not ready. Now the PSP fw is ready so we revert this driver workaround. Signed-off-by: Oak Zeng <[email protected]> Reviewed-by: Harish Kasiviswanathan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-04-15drm/amdgpu: Introduce functions for vram physical addr calculationOak Zeng1-0/+3
Add one function to calculate BO's GPU physical address. And another function to calculate BO's CPU physical address. v2: Use functions vs macros (Christian) Use more proper function names (Christian) Signed-off-by: Oak Zeng <[email protected]> Suggested-by: Lijo Lazar <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-04-09drm/amdgpu: move xgmi ras functions to xgmi_ras_funcsHawking Zhang1-0/+9
xgmi ras is not managed by gpu driver when gpu is connected to cpu through xgmi. move all xgmi ras functions to xgmi_ras_funcs so gpu driver only initializes xgmi ras functions when it manages xgmi ras. Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Dennis Li <[email protected]> Reviewed-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-03-23drm/amdgpu: Reset the devices in the XGMI hive duirng probeshaoyunl1-0/+1
In passthrough configuration, hypervisior will trigger the SBR(Secondary bus reset) to the devices without sync to each other. This could cause device hang since for XGMI configuration, all the devices within the hive need to be reset at a limit time slot. This serial of patches try to solve this issue by co-operate with new SMU which will only do minimum house keeping to response the SBR request but don't do the real reset job and leave it to driver. Driver need to do the whole sw init and minimum HW init to bring up the SMU and trigger the reset(possibly BACO) on all the ASICs at the same time Signed-off-by: shaoyunl <[email protected]> Acked-by: Andrey Grodzovsky [email protected] Signed-off-by: Alex Deucher <[email protected]>
2021-03-23drm/amdgpu: Fix the comment in amdgpu_gmc.hOak Zeng1-3/+3
More accurate words are used to address a code review feedback Signed-off-by: Oak Zeng <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-03-23drm/amdgpu: workaround the TMR MC address issue (v2)Oak Zeng1-0/+9
With the 2-level gart page table, vram is squeezed into gart aperture and FB aperture is disabled. Therefore all VRAM virtual addresses are in the GART aperture. However currently PSP requires TMR addresses in FB aperture. So we need some design change at PSP FW level to support this 2-level gart table driver change. Right now this PSP FW support doesn't exist. To workaround this issue temporarily, FB aperture is added back and the gart aperture address is converted back to FB aperture for this PSP TMR address. Will revert it after we get a fix from PSP FW. v2: squash in tmr fix for other asics (Kevin) Signed-off-by: Oak Zeng <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-03-23drm/amdgpu: Add function to allocate and fill PDB0Oak Zeng1-0/+5
Add functions to allocate PDB0, map it for CPU access, and fill it. Those functions are only used for 2-level vmid0 page table construction Signed-off-by: Oak Zeng <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-03-23drm/amdgpu: Use different gart table parameters for 2-level gart tableOak Zeng1-0/+3
If use gart for FB translation, we will squeeze vram into sysvm aperture. This requires 2 level gart table. Add page table depth and page table block size parameters to gmc. This is prepare work to 2-level gart table construction Signed-off-by: Oak Zeng <[email protected]> Reviewed-by: Christian Konig <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-03-23drm/amdgpu: Placement of gart and vram in sysvm apertureOak Zeng1-0/+1
If use GART for FB translation, place both vram and gart to sysvm aperture. AGP aperture is not set up in this case because it is not used Signed-off-by: Oak Zeng <[email protected]> Reviewed-by: Christian Konig <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-03-23drm/amdgpu: Modify comments of vram_start/endOak Zeng1-4/+7
Modify the comment to reflect the fact that, if use GART for vram address translation for vmid0, [vram_start, vram_end] will be placed inside SYSVM aperture, together with GART. Signed-off-by: Oak Zeng <[email protected]> Reviewed-by: Christian Konig <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-03-23drm/amdgpu: define address map for host xgmi link (v3)Rajneesh Bhardwaj1-0/+1
This applies to AMD Accelerated Processing Platforms that support host gpu interconnect throguh a special link (xgmi). Aldebaran systems will support this special feature for utilizing the benefits of host-gpu cache coherence. This change outlines the basic framework for mapping the GPU VRAM (HBM) to system address space making it accesible to the host but managed by the amdgpu driver since this region is marked as reserved memory in host address space by the underlying system firmware. v2: switch to smuio callback function to check the type of host-gpu interface (Hawking) v3: use hub callbacks rather than direct function calls (Alex) Reviewed-by: Oak Zeng <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Rajneesh Bhardwaj <[email protected]> Signed-off-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-09-25drm/amdgpu: store noretry parameter per driver instanceAlex Deucher1-0/+2
This will allow us to have different defaults per asic in a future patch. Reviewed-by: Christian König <[email protected]> Reviewed-by: Luben Tuikov <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-04drm/amdgpu: move stolen memory from gmc to mmanAlex Deucher1-5/+0
It's more related to memory management than memory controller. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-04drm/amdgpu/gmc: add new helper to get the FB size used by pre-OS consoleAlex Deucher1-0/+5
This adds a new gmc callback to get the size reserved by the pre-OS console and provides a helper function for use by gmc IP drivers. Reviewed-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-04drm/amdgpu: add support for extended stolen vga memoryAlex Deucher1-0/+2
This will allow us to split the allocation for systems where we have to keep the stolen memory around to avoid S3 issues. This way we don't waste as much memory and still avoid any screen artifacts during the bios to driver transition. Reviewed-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-04drm/amdgpu: move keep stolen memory check into gmc coreAlex Deucher1-0/+1
Rather than leaving this as a gmc v9 specific hack. Reviewed-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-04drm/amdgpu: move stolen vga bo from amdgpu to amdgpu.gmcAlex Deucher1-1/+2
Since that is where we store the other data related to the stolen vga memory. Reviewed-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-22drm/amdgpu: move get_invalidate_req function into gfxhub/mmhub levelHuang Rui1-0/+1
This patch is to move get_invalidate_req into gfxhub/mmhub level. It will avoid mismatch of the different gfxhub/mmhub register offsets and fields in the same gmc block. Signed-off-by: Huang Rui <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-22drm/amdgpu: add vmhub funcs helper (v2)Huang Rui1-0/+7
This patch is to introduce vmhub funcs helper to add following callback (print_l2_protection_fault_status). Each GC/MMHUB register specific programming should be in gfxhub/mmhub level. v2: remove the condition of funcs assignment. Signed-off-by: Huang Rui <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-22drm/amdgpu: abstract set_vm_fault_masks function to refine the programmingHuang Rui1-0/+4
This patch is to add set_vm_fault_masks helper to amdgpu_gmc to refine the original programming. Signed-off-by: Huang Rui <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-22drm/amdgpu: add member to store vm fault interrupt masksHuang Rui1-0/+2
This patch adds a member in vmhub structure to store the vm fault interrupt masks for different version gfxhubs/mmhubs. Signed-off-by: Huang Rui <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-08drm/amdgpu: add register distance members into vmhub structureHuang Rui1-0/+9
This patch is to abstract register distances between two continuous context domains and invalidation engines. In different ip headers, these distances may be differences. Signed-off-by: Huang Rui <[email protected]> Tested-by: AnZhong Huang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-04-28drm/amdgpu: implement TMZ accessor (v3)Luben Tuikov1-0/+4
Implement an accessor of adev->tmz.enabled. Let not code around access it as "if (adev->tmz.enabled)" as the organization may change. Instead... Recruit "bool amdgpu_is_tmz(adev)" to return exactly this Boolean value. That is, this function is now an accessor of an already initialized and set adev and adev->tmz. Add "void amdgpu_gmc_tmz_set(adev)" to check and set adev->gmc.tmz_enabled at initialization time. After which one uses "bool amdgpu_is_tmz(adev)" to query whether adev supports TMZ. Also, remove circular header file include. v2: Remove amdgpu_tmz.[ch] as requested. v3: Move TMZ into GMC. Signed-off-by: Luben Tuikov <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-02-25amdgpu/gmc_v9: save/restore sdpif regs during S3Shirish S1-0/+1
fixes S3 issue with IOMMU + S/G enabled @ 64M VRAM. Suggested-by: Alex Deucher <[email protected]> Signed-off-by: Shirish S <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-01-22Revert "drm/amdgpu: fix modprobe failure of the secondary GPU when GDDR6 ↵Tianci.Yin1-5/+0
training enabled(V5)" This reverts commit 9e441478623fd913d4340654682b19f0c24e629d. The patch will be replaced with a better solution, revert it. Reviewed-by: Christian König <[email protected]> Signed-off-by: Tianci.Yin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-01-16drm/amdgpu: fix modprobe failure of the secondary GPU when GDDR6 training ↵Tianci.Yin1-0/+5
enabled(V5) [why] In dual GPUs scenario, stolen_size is assigned to zero on the secondary GPU, since there is no pre-OS console using that memory. Then the bottom region of VRAM was allocated as GTT, unfortunately a small region of bottom VRAM was encroached by UMC firmware during GDDR6 BIST training, this cause page fault. [how] Forcing stolen_size to 3MB, then the bottom region of VRAM was allocated as stolen memory, GTT corruption avoid. Reviewed-by: Christian König <[email protected]> Reviewed-by: Feifei Xu <[email protected]> Signed-off-by: Tianci.Yin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-01-16drm/amdgpu: export function to flush TLB via pasidAlex Sierra1-0/+6
This can be used directly from amdgpu and amdkfd to invalidate TLB through pasid. It supports gmc v7, v8, v9 and v10. Signed-off-by: Alex Sierra <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-01-07drm/amdgpu/gmc: move invaliation bitmap setup to common codeAlex Deucher1-0/+1
So it can be shared with newer GMC versions. Reviewed-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>