diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2023-11-01 06:28:35 -1000 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2023-11-01 06:28:35 -1000 |
commit | 7d461b291e65938f15f56fe58da2303b07578a76 (patch) | |
tree | 015dd7c2f1743dd70be52787dd9aff33822bc938 /drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | |
parent | 8bc9e6515183935fa0cccaf67455c439afe4982b (diff) | |
parent | 631808095a82e6b6f8410a95f8b12b8d0d38b161 (diff) |
Merge tag 'drm-next-2023-10-31-1' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"Highlights:
- AMD adds some more upcoming HW platforms
- Intel made Meteorlake stable and started adding Lunarlake
- nouveau has a bunch of display rework in prepartion for the NVIDIA
GSP firmware support
- msm adds a7xx support
- habanalabs has finished migration to accel subsystem
Detail summary:
kernel:
- add initial vmemdup-user-array
core:
- fix platform remove() to return void
- drm_file owner updated to reflect owner
- move size calcs to drm buddy allocator
- let GPUVM build as a module
- allow variable number of run-queues in scheduler
edid:
- handle bad h/v sync_end in EDIDs
panfrost:
- add Boris as maintainer
fbdev:
- use fb_ops helpers more
- only allow logo use from fbcon
- rename fb_pgproto to pgprot_framebuffer
- add HPD state to drm_connector_oob_hotplug_event
- convert to fbdev i/o mem helpers
i915:
- Enable meteorlake by default
- Early Xe2 LPD/Lunarlake display enablement
- Rework subplatforms into IP version checks
- GuC based TLB invalidation for Meteorlake
- Display rework for future Xe driver integration
- LNL FBC features
- LNL display feature capability reads
- update recommended fw versions for DG2+
- drop fastboot module parameter
- added deviceid for Arrowlake-S
- drop preproduction workarounds
- don't disable preemption for resets
- cleanup inlines in headers
- PXP firmware loading fix
- Fix sg list lengths
- DSC PPS state readout/verification
- Add more RPL P/U PCI IDs
- Add new DG2-G12 stepping
- DP enhanced framing support to state checker
- Improve shared link bandwidth management
- stop using GEM macros in display code
- refactor related code into display code
- locally enable W=1 warnings
- remove PSR watchdog timers on LNL
amdgpu:
- RAS/FRU EEPROM updatse
- IP discovery updatses
- GC 11.5 support
- DCN 3.5 support
- VPE 6.1 support
- NBIO 7.11 support
- DML2 support
- lots of IP updates
- use flexible arrays for bo list handling
- W=1 fixes
- Enable seamless boot in more cases
- Enable context type property for HDMI
- Rework GPUVM TLB flushing
- VCN IB start/size alignment fixes
amdkfd:
- GC 10/11 fixes
- GC 11.5 support
- use partial migration in GPU faults
radeon:
- W=1 Fixes
- fix some possible buffer overflow/NULL derefs
nouveau:
- update uapi for NO_PREFETCH
- scheduler/fence fixes
- rework suspend/resume for GSP-RM
- rework display in preparation for GSP-RM
habanalabs:
- uapi: expose tsc clock
- uapi: block access to eventfd through control device
- uapi: force dma-buf export to PAGE_SIZE alignments
- complete move to accel subsystem
- move firmware interface include files
- perform hard reset on PCIe AXI drain event
- optimise user interrupt handling
msm:
- DP: use existing helpers for DPCD
- DPU: interrupts reworked
- gpu: a7xx (a730/a740) support
- decouple msm_drv from kms for headless devices
mediatek:
- MT8188 dsi/dp/edp support
- DDP GAMMA - 12 bit LUT support
- connector dynamic selection capability
rockchip:
- rv1126 mipi-dsi/vop support
- add planar formats
ast:
- rename constants
panels:
- Mitsubishi AA084XE01
- JDI LPM102A188A
- LTK050H3148W-CTA6
ivpu:
- power management fixes
qaic:
- add detach slice bo api
komeda:
- add NV12 writeback
tegra:
- support NVSYNC/NHSYNC
- host1x suspend fixes
ili9882t:
- separate into own driver"
* tag 'drm-next-2023-10-31-1' of git://anongit.freedesktop.org/drm/drm: (1803 commits)
drm/amdgpu: Remove unused variables from amdgpu_show_fdinfo
drm/amdgpu: Remove duplicate fdinfo fields
drm/amd/amdgpu: avoid to disable gfxhub interrupt when driver is unloaded
drm/amdgpu: Add EXT_COHERENT support for APU and NUMA systems
drm/amdgpu: Retrieve CE count from ce_count_lo_chip in EccInfo table
drm/amdgpu: Identify data parity error corrected in replay mode
drm/amdgpu: Fix typo in IP discovery parsing
drm/amd/display: fix S/G display enablement
drm/amdxcp: fix amdxcp unloads incompletely
drm/amd/amdgpu: fix the GPU power print error in pm info
drm/amdgpu: Use pcie domain of xcc acpi objects
drm/amd: check num of link levels when update pcie param
drm/amdgpu: Add a read to GFX v9.4.3 ring test
drm/amd/pm: call smu_cmn_get_smc_version in is_mode1_reset_supported.
drm/amdgpu: get RAS poison status from DF v4_6_2
drm/amdgpu: Use discovery table's subrevision
drm/amd/display: 3.2.256
drm/amd/display: add interface to query SubVP status
drm/amd/display: Read before writing Backlight Mode Set Register
drm/amd/display: Disable SYMCLK32_SE RCO on DCN314
...
Diffstat (limited to 'drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c')
-rw-r--r-- | drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 209 |
1 files changed, 186 insertions, 23 deletions
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index d78bd9732543..2dce338b0f1e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c @@ -32,6 +32,7 @@ #include "amdgpu.h" #include "amdgpu_gmc.h" #include "amdgpu_ras.h" +#include "amdgpu_reset.h" #include "amdgpu_xgmi.h" #include <drm/drm_drv.h> @@ -263,12 +264,14 @@ void amdgpu_gmc_sysvm_location(struct amdgpu_device *adev, struct amdgpu_gmc *mc * * @adev: amdgpu device structure holding all necessary information * @mc: memory controller structure holding memory information + * @gart_placement: GART placement policy with respect to VRAM * * Function will place try to place GART before or after VRAM. * If GART size is bigger than space left then we ajust GART size. * Thus function will never fails. */ -void amdgpu_gmc_gart_location(struct amdgpu_device *adev, struct amdgpu_gmc *mc) +void amdgpu_gmc_gart_location(struct amdgpu_device *adev, struct amdgpu_gmc *mc, + enum amdgpu_gart_placement gart_placement) { const uint64_t four_gb = 0x100000000ULL; u64 size_af, size_bf; @@ -286,11 +289,22 @@ void amdgpu_gmc_gart_location(struct amdgpu_device *adev, struct amdgpu_gmc *mc) mc->gart_size = max(size_bf, size_af); } - if ((size_bf >= mc->gart_size && size_bf < size_af) || - (size_af < mc->gart_size)) - mc->gart_start = 0; - else + switch (gart_placement) { + case AMDGPU_GART_PLACEMENT_HIGH: mc->gart_start = max_mc_address - mc->gart_size + 1; + break; + case AMDGPU_GART_PLACEMENT_LOW: + mc->gart_start = 0; + break; + case AMDGPU_GART_PLACEMENT_BEST_FIT: + default: + if ((size_bf >= mc->gart_size && size_bf < size_af) || + (size_af < mc->gart_size)) + mc->gart_start = 0; + else + mc->gart_start = max_mc_address - mc->gart_size + 1; + break; + } mc->gart_start &= ~(four_gb - 1); mc->gart_end = mc->gart_start + mc->gart_size - 1; @@ -315,14 +329,6 @@ void amdgpu_gmc_agp_location(struct amdgpu_device *adev, struct amdgpu_gmc *mc) const uint64_t sixteen_gb_mask = ~(sixteen_gb - 1); u64 size_af, size_bf; - if (amdgpu_sriov_vf(adev)) { - mc->agp_start = 0xffffffffffff; - mc->agp_end = 0x0; - mc->agp_size = 0; - - return; - } - if (mc->fb_start > mc->gart_start) { size_bf = (mc->fb_start & sixteen_gb_mask) - ALIGN(mc->gart_end + 1, sixteen_gb); @@ -347,6 +353,25 @@ void amdgpu_gmc_agp_location(struct amdgpu_device *adev, struct amdgpu_gmc *mc) } /** + * amdgpu_gmc_set_agp_default - Set the default AGP aperture value. + * @adev: amdgpu device structure holding all necessary information + * @mc: memory controller structure holding memory information + * + * To disable the AGP aperture, you need to set the start to a larger + * value than the end. This function sets the default value which + * can then be overridden using amdgpu_gmc_agp_location() if you want + * to enable the AGP aperture on a specific chip. + * + */ +void amdgpu_gmc_set_agp_default(struct amdgpu_device *adev, + struct amdgpu_gmc *mc) +{ + mc->agp_start = 0xffffffffffff; + mc->agp_end = 0; + mc->agp_size = 0; +} + +/** * amdgpu_gmc_fault_key - get hask key from vm fault address and pasid * * @addr: 48 bit physical address, page aligned (36 significant bits) @@ -452,7 +477,10 @@ void amdgpu_gmc_filter_faults_remove(struct amdgpu_device *adev, uint64_t addr, uint32_t hash; uint64_t tmp; - ih = adev->irq.retry_cam_enabled ? &adev->irq.ih_soft : &adev->irq.ih1; + if (adev->irq.retry_cam_enabled) + return; + + ih = &adev->irq.ih1; /* Get the WPTR of the last entry in IH ring */ last_wptr = amdgpu_ih_get_wptr(adev, ih); /* Order wptr with ring data. */ @@ -549,13 +577,17 @@ int amdgpu_gmc_allocate_vm_inv_eng(struct amdgpu_device *adev) /* reserve engine 5 for firmware */ if (adev->enable_mes) vm_inv_engs[i] &= ~(1 << 5); + /* reserve mmhub engine 3 for firmware */ + if (adev->enable_umsch_mm) + vm_inv_engs[i] &= ~(1 << 3); } for (i = 0; i < adev->num_rings; ++i) { ring = adev->rings[i]; vmhub = ring->vm_hub; - if (ring == &adev->mes.ring) + if (ring == &adev->mes.ring || + ring == &adev->umsch_mm.ring) continue; inv_eng = ffs(vm_inv_engs[vmhub]); @@ -575,6 +607,142 @@ int amdgpu_gmc_allocate_vm_inv_eng(struct amdgpu_device *adev) return 0; } +void amdgpu_gmc_flush_gpu_tlb(struct amdgpu_device *adev, uint32_t vmid, + uint32_t vmhub, uint32_t flush_type) +{ + struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring; + struct amdgpu_vmhub *hub = &adev->vmhub[vmhub]; + struct dma_fence *fence; + struct amdgpu_job *job; + int r; + + if (!hub->sdma_invalidation_workaround || vmid || + !adev->mman.buffer_funcs_enabled || + !adev->ib_pool_ready || amdgpu_in_reset(adev) || + !ring->sched.ready) { + + /* + * A GPU reset should flush all TLBs anyway, so no need to do + * this while one is ongoing. + */ + if (!down_read_trylock(&adev->reset_domain->sem)) + return; + + if (adev->gmc.flush_tlb_needs_extra_type_2) + adev->gmc.gmc_funcs->flush_gpu_tlb(adev, vmid, + vmhub, 2); + + if (adev->gmc.flush_tlb_needs_extra_type_0 && flush_type == 2) + adev->gmc.gmc_funcs->flush_gpu_tlb(adev, vmid, + vmhub, 0); + + adev->gmc.gmc_funcs->flush_gpu_tlb(adev, vmid, vmhub, + flush_type); + up_read(&adev->reset_domain->sem); + return; + } + + /* The SDMA on Navi 1x has a bug which can theoretically result in memory + * corruption if an invalidation happens at the same time as an VA + * translation. Avoid this by doing the invalidation from the SDMA + * itself at least for GART. + */ + mutex_lock(&adev->mman.gtt_window_lock); + r = amdgpu_job_alloc_with_ib(ring->adev, &adev->mman.high_pr, + AMDGPU_FENCE_OWNER_UNDEFINED, + 16 * 4, AMDGPU_IB_POOL_IMMEDIATE, + &job); + if (r) + goto error_alloc; + + job->vm_pd_addr = amdgpu_gmc_pd_addr(adev->gart.bo); + job->vm_needs_flush = true; + job->ibs->ptr[job->ibs->length_dw++] = ring->funcs->nop; + amdgpu_ring_pad_ib(ring, &job->ibs[0]); + fence = amdgpu_job_submit(job); + mutex_unlock(&adev->mman.gtt_window_lock); + + dma_fence_wait(fence, false); + dma_fence_put(fence); + + return; + +error_alloc: + mutex_unlock(&adev->mman.gtt_window_lock); + dev_err(adev->dev, "Error flushing GPU TLB using the SDMA (%d)!\n", r); +} + +int amdgpu_gmc_flush_gpu_tlb_pasid(struct amdgpu_device *adev, uint16_t pasid, + uint32_t flush_type, bool all_hub, + uint32_t inst) +{ + u32 usec_timeout = amdgpu_sriov_vf(adev) ? SRIOV_USEC_TIMEOUT : + adev->usec_timeout; + struct amdgpu_ring *ring = &adev->gfx.kiq[inst].ring; + struct amdgpu_kiq *kiq = &adev->gfx.kiq[inst]; + unsigned int ndw; + signed long r; + uint32_t seq; + + if (!adev->gmc.flush_pasid_uses_kiq || !ring->sched.ready || + !down_read_trylock(&adev->reset_domain->sem)) { + + if (adev->gmc.flush_tlb_needs_extra_type_2) + adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid, + 2, all_hub, + inst); + + if (adev->gmc.flush_tlb_needs_extra_type_0 && flush_type == 2) + adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid, + 0, all_hub, + inst); + + adev->gmc.gmc_funcs->flush_gpu_tlb_pasid(adev, pasid, + flush_type, all_hub, + inst); + return 0; + } + + /* 2 dwords flush + 8 dwords fence */ + ndw = kiq->pmf->invalidate_tlbs_size + 8; + + if (adev->gmc.flush_tlb_needs_extra_type_2) + ndw += kiq->pmf->invalidate_tlbs_size; + + if (adev->gmc.flush_tlb_needs_extra_type_0) + ndw += kiq->pmf->invalidate_tlbs_size; + + spin_lock(&adev->gfx.kiq[inst].ring_lock); + amdgpu_ring_alloc(ring, ndw); + if (adev->gmc.flush_tlb_needs_extra_type_2) + kiq->pmf->kiq_invalidate_tlbs(ring, pasid, 2, all_hub); + + if (flush_type == 2 && adev->gmc.flush_tlb_needs_extra_type_0) + kiq->pmf->kiq_invalidate_tlbs(ring, pasid, 0, all_hub); + + kiq->pmf->kiq_invalidate_tlbs(ring, pasid, flush_type, all_hub); + r = amdgpu_fence_emit_polling(ring, &seq, MAX_KIQ_REG_WAIT); + if (r) { + amdgpu_ring_undo(ring); + spin_unlock(&adev->gfx.kiq[inst].ring_lock); + goto error_unlock_reset; + } + + amdgpu_ring_commit(ring); + spin_unlock(&adev->gfx.kiq[inst].ring_lock); + r = amdgpu_fence_wait_polling(ring, seq, usec_timeout); + if (r < 1) { + dev_err(adev->dev, "wait for kiq fence error: %ld.\n", r); + r = -ETIME; + goto error_unlock_reset; + } + r = 0; + +error_unlock_reset: + up_read(&adev->reset_domain->sem); + return r; +} + /** * amdgpu_gmc_tmz_set -- check and set if a device supports TMZ * @adev: amdgpu_device pointer @@ -584,7 +752,7 @@ int amdgpu_gmc_allocate_vm_inv_eng(struct amdgpu_device *adev) */ void amdgpu_gmc_tmz_set(struct amdgpu_device *adev) { - switch (adev->ip_versions[GC_HWIP][0]) { + switch (amdgpu_ip_version(adev, GC_HWIP, 0)) { /* RAVEN */ case IP_VERSION(9, 2, 2): case IP_VERSION(9, 1, 0): @@ -618,6 +786,7 @@ void amdgpu_gmc_tmz_set(struct amdgpu_device *adev) /* YELLOW_CARP*/ case IP_VERSION(10, 3, 3): case IP_VERSION(11, 0, 4): + case IP_VERSION(11, 5, 0): /* Don't enable it by default yet. */ if (amdgpu_tmz < 1) { @@ -648,7 +817,7 @@ void amdgpu_gmc_tmz_set(struct amdgpu_device *adev) void amdgpu_gmc_noretry_set(struct amdgpu_device *adev) { struct amdgpu_gmc *gmc = &adev->gmc; - uint32_t gc_ver = adev->ip_versions[GC_HWIP][0]; + uint32_t gc_ver = amdgpu_ip_version(adev, GC_HWIP, 0); bool noretry_default = (gc_ver == IP_VERSION(9, 0, 1) || gc_ver == IP_VERSION(9, 3, 0) || gc_ver == IP_VERSION(9, 4, 0) || @@ -721,12 +890,6 @@ void amdgpu_gmc_get_vbios_allocations(struct amdgpu_device *adev) case CHIP_RENOIR: adev->mman.keep_stolen_vga_memory = true; break; - case CHIP_YELLOW_CARP: - if (amdgpu_discovery == 0) { - adev->mman.stolen_reserved_offset = 0x1ffb0000; - adev->mman.stolen_reserved_size = 64 * PAGE_SIZE; - } - break; default: adev->mman.keep_stolen_vga_memory = false; break; |