diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2022-12-13 11:59:58 -0800 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2022-12-13 11:59:58 -0800 |
commit | a594533df0f6ca391da003f43d53b336a2d23ffa (patch) | |
tree | ec984c693b0bfc208519c43134f21365797f90ee /drivers/gpu/drm/i915/gt/intel_gt.c | |
parent | cdb9d3537711939e4d8fd0de2889c966f88346eb (diff) | |
parent | 66efff515a6500d4b4976fbab3bee8b92a1137fb (diff) |
Merge tag 'drm-next-2022-12-13' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"The biggest highlight is that the accel subsystem framework is merged.
Hopefully for 6.3 we will be able to line up a driver to use it.
In drivers land, i915 enables DG2 support by default now, and nouveau
has a big stability refactoring and initial ampere support, AMD
includes new hw IP support and should build on ARM again. There is
also an ofdrm driver to take over offb on platforms it's used.
Stuff outside my tree, the dma-buf patches hit a few places, the vc4
firmware changes also do, and i915 has some interactions with MEI for
discrete GPUs. I think all of those should have been acked/reviewed by
relevant parties.
New driver:
- ofdrm - replacement for offb
fbdev:
- add support for nomodeset
fourcc:
- add Vivante tiled modifier
core:
- atomic-helpers: CRTC primary plane test fixes, fb access hooks
- connector: TV API consistency, cmdline parser improvements
- send connector hotplug on cleanup
- sort makefile objects
tests:
- sort kunit tests
- improve DP-MST tests
- add kunit helpers to create a device
sched:
- module param for scheduling policy
- refcounting fix
buddy:
- add back random seed log
ttm:
- convert ttm_resource to size_t
- optimize pool allocations
edid:
- HFVSDB parsing support fixes
- logging/debug improvements
- DSC quirks
dma-buf:
- Add unlocked vmap and attachment mapping
- move drivers to common locking convention
- locking improvements
firmware:
- new API for rPI firmware and vc4
xilinx:
- zynqmp: displayport bridge support
- dpsub fix
bridge:
- adv7533: Remove dynamic lane switching
- it6505: Runtime PM support, sync improvements
- ps8640: Handle AUX defer messages
- tc358775: Drop soft-reset over I2C
panel:
- panel-edp: Add INX N116BGE-EA2 C2 and C4 support.
- Jadard JD9365DA-H3
- NewVision NV3051D
amdgpu:
- DCN support on ARM
- DCN 2.1 secure display
- Sienna Cichlid mode2 reset fixes
- new GC 11.x firmware versions
- drop AMD specific DSC workarounds in favour of drm code
- clang warning fixes
- scheduler rework
- SR-IOV fixes
- GPUVM locking fixes
- fix memory leak in CS IOCTL error path
- flexible array updates
- enable new GC/PSP/SMU/NBIO IP
- GFX preemption support for gfx9
amdkfd:
- cache size fixes
- userptr fixes
- enable cooperative launch on gfx 10.3
- enable GC 11.0.4 KFD support
radeon:
- replace kmap with kmap_local_page
- ACPI ref count fix
- HDA audio notifier support
i915:
- DG2 enabled by default
- MTL enablement work
- hotplug refactoring
- VBT improvements
- Display and watermark refactoring
- ADL-P workaround
- temp disable runtime_pm for discrete-
- fix for A380 as a secondary GPU
- Wa_18017747507 for DG2
- CS timestamp support fixes for gen5 and earlier
- never purge busy TTM objects
- use i915_sg_dma_sizes for all backends
- demote GuC kernel contexts to normal priority
- gvt: refactor for new MDEV interface
- enable DC power states on eDP ports
- fix gen 2/3 workarounds
nouveau:
- fix page fault handling
- Ampere acceleration support
- driver stability improvements
- nva3 backlight support
msm:
- MSM_INFO_GET_FLAGS support
- DPU: XR30 and P010 image formats
- Qualcomm SM6115 support
- DSI PHY support for QCM2290
- HDMI: refactored dev init path
- remove exclusive-fence hack
- fix speed-bin detection
- enable clamp to idle on 7c3
- improved hangcheck detection
vmwgfx:
- fb and cursor refactoring
- convert to generic hashtable
- cursor improvements
etnaviv:
- hw workarounds
- softpin MMU fixes
ast:
- atomic gamma LUT support
- convert to SHMEM
lcdif:
- support YUV planes
- Increase DMA burst size
- FIFO threshold tuning
meson:
- fix return type of cvbs mode_valid
mgag200:
- fix PLL setup on some revisions
sun4i:
- A100 and D1 support
udl:
- modesetting improvements
- hot unplug support
vc4:
- support PAL-M
- fix regression preventing 4K @ 60Hz
- fix NULL ptr deref
v3d:
- switch to drm managed resources
renesas:
- RZ/G2L DSI support
- DU Kconfig cleanup
mediatek:
- fixup dpi and hdmi
- MT8188 dpi support
- MT8195 AFBC support
tegra:
- NVDEC hardware on Tegra234 SoC
hdlcd:
- switch to drm managed resources
ingenic:
- fix registration error path
hisilicon:
- convert to drm_mode_init
maildp:
- use managed resources
mtk:
- use drm_mode_init
rockchip:
- use drm_mode_copy"
* tag 'drm-next-2022-12-13' of git://anongit.freedesktop.org/drm/drm: (1397 commits)
drm/amdgpu: fix mmhub register base coding error
drm/amdgpu: add tmz support for GC IP v11.0.4
drm/amdgpu: enable GFX Clock Gating control for GC IP v11.0.4
drm/amdgpu: enable GFX Power Gating for GC IP v11.0.4
drm/amdgpu: enable GFX IP v11.0.4 CG support
drm/amdgpu: Make amdgpu_ring_mux functions as static
drm/amdgpu: generally allow over-commit during BO allocation
drm/amd/display: fix array index out of bound error in DCN32 DML
drm/amd/display: 3.2.215
drm/amd/display: set optimized required for comp buf changes
drm/amd/display: Add debug option to skip PSR CRTC disable
drm/amd/display: correct DML calc error of UrgentLatency
drm/amd/display: correct static_screen_event_mask
drm/amd/display: Ensure commit_streams returns the DC return code
drm/amd/display: read invalid ddc pin status cause engine busy
drm/amd/display: Bypass DET swath fill check for max clocks
drm/amd/display: Disable uclk pstate for subvp pipes
drm/amd/display: Fix DCN2.1 default DSC clocks
drm/amd/display: Enable dp_hdmi21_pcon support
drm/amd/display: prevent seamless boot on displays that don't have the preferred dig
...
Diffstat (limited to 'drivers/gpu/drm/i915/gt/intel_gt.c')
-rw-r--r-- | drivers/gpu/drm/i915/gt/intel_gt.c | 159 |
1 files changed, 125 insertions, 34 deletions
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 7caa3412a244..767e329e1cc5 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -12,6 +12,7 @@ #include "i915_drv.h" #include "i915_perf_oa_regs.h" +#include "i915_reg.h" #include "intel_context.h" #include "intel_engine_pm.h" #include "intel_engine_regs.h" @@ -40,8 +41,6 @@ void intel_gt_common_init_early(struct intel_gt *gt) { spin_lock_init(gt->irq_lock); - INIT_LIST_HEAD(>->lmem_userfault_list); - mutex_init(>->lmem_userfault_lock); INIT_LIST_HEAD(>->closed_vma); spin_lock_init(>->closed_lock); @@ -56,6 +55,7 @@ void intel_gt_common_init_early(struct intel_gt *gt) seqcount_mutex_init(>->tlb.seqno, >->tlb.invalidate_lock); intel_gt_pm_init_early(gt); + intel_wopcm_init_early(>->wopcm); intel_uc_init_early(>->uc); intel_rps_init_early(>->rps); } @@ -192,7 +192,7 @@ int intel_gt_init_hw(struct intel_gt *gt) ret = i915_ppgtt_init_hw(gt); if (ret) { - DRM_ERROR("Enabling PPGTT failed (%d)\n", ret); + drm_err(&i915->drm, "Enabling PPGTT failed (%d)\n", ret); goto out; } @@ -231,6 +231,16 @@ static void gen6_clear_engine_error_register(struct intel_engine_cs *engine) GEN6_RING_FAULT_REG_POSTING_READ(engine); } +i915_reg_t intel_gt_perf_limit_reasons_reg(struct intel_gt *gt) +{ + /* GT0_PERF_LIMIT_REASONS is available only for Gen11+ */ + if (GRAPHICS_VER(gt->i915) < 11) + return INVALID_MMIO_REG; + + return gt->type == GT_MEDIA ? + MTL_MEDIA_PERF_LIMIT_REASONS : GT0_PERF_LIMIT_REASONS; +} + void intel_gt_clear_error_registers(struct intel_gt *gt, intel_engine_mask_t engine_mask) @@ -254,13 +264,17 @@ intel_gt_clear_error_registers(struct intel_gt *gt, * some errors might have become stuck, * mask them. */ - DRM_DEBUG_DRIVER("EIR stuck: 0x%08x, masking\n", eir); + drm_dbg(>->i915->drm, "EIR stuck: 0x%08x, masking\n", eir); rmw_set(uncore, EMR, eir); intel_uncore_write(uncore, GEN2_IIR, I915_MASTER_ERROR_INTERRUPT); } - if (GRAPHICS_VER(i915) >= 12) { + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { + intel_gt_mcr_multicast_rmw(gt, XEHP_RING_FAULT_REG, + RING_FAULT_VALID, 0); + intel_gt_mcr_read_any(gt, XEHP_RING_FAULT_REG); + } else if (GRAPHICS_VER(i915) >= 12) { rmw_clear(uncore, GEN12_RING_FAULT_REG, RING_FAULT_VALID); intel_uncore_posting_read(uncore, GEN12_RING_FAULT_REG); } else if (GRAPHICS_VER(i915) >= 8) { @@ -298,6 +312,42 @@ static void gen6_check_faults(struct intel_gt *gt) } } +static void xehp_check_faults(struct intel_gt *gt) +{ + u32 fault; + + /* + * Although the fault register now lives in an MCR register range, + * the GAM registers are special and we only truly need to read + * the "primary" GAM instance rather than handling each instance + * individually. intel_gt_mcr_read_any() will automatically steer + * toward the primary instance. + */ + fault = intel_gt_mcr_read_any(gt, XEHP_RING_FAULT_REG); + if (fault & RING_FAULT_VALID) { + u32 fault_data0, fault_data1; + u64 fault_addr; + + fault_data0 = intel_gt_mcr_read_any(gt, XEHP_FAULT_TLB_DATA0); + fault_data1 = intel_gt_mcr_read_any(gt, XEHP_FAULT_TLB_DATA1); + + fault_addr = ((u64)(fault_data1 & FAULT_VA_HIGH_BITS) << 44) | + ((u64)fault_data0 << 12); + + drm_dbg(>->i915->drm, "Unexpected fault\n" + "\tAddr: 0x%08x_%08x\n" + "\tAddress space: %s\n" + "\tEngine ID: %d\n" + "\tSource ID: %d\n" + "\tType: %d\n", + upper_32_bits(fault_addr), lower_32_bits(fault_addr), + fault_data1 & FAULT_GTT_SEL ? "GGTT" : "PPGTT", + GEN8_RING_FAULT_ENGINE_ID(fault), + RING_FAULT_SRCID(fault), + RING_FAULT_FAULT_TYPE(fault)); + } +} + static void gen8_check_faults(struct intel_gt *gt) { struct intel_uncore *uncore = gt->uncore; @@ -344,7 +394,9 @@ void intel_gt_check_and_clear_faults(struct intel_gt *gt) struct drm_i915_private *i915 = gt->i915; /* From GEN8 onwards we only have one 'All Engine Fault Register' */ - if (GRAPHICS_VER(i915) >= 8) + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) + xehp_check_faults(gt); + else if (GRAPHICS_VER(i915) >= 8) gen8_check_faults(gt); else if (GRAPHICS_VER(i915) >= 6) gen6_check_faults(gt); @@ -812,7 +864,6 @@ static int intel_gt_tile_setup(struct intel_gt *gt, phys_addr_t phys_addr) } intel_uncore_init_early(gt->uncore, gt); - intel_wakeref_auto_init(>->userfault_wakeref, gt->uncore->rpm); ret = intel_uncore_setup_mmio(gt->uncore, phys_addr); if (ret) @@ -833,7 +884,7 @@ int intel_gt_probe_all(struct drm_i915_private *i915) unsigned int i; int ret; - mmio_bar = GRAPHICS_VER(i915) == 2 ? GEN2_GTTMMADR_BAR : GTTMMADR_BAR; + mmio_bar = intel_mmio_bar(GRAPHICS_VER(i915)); phys_addr = pci_resource_start(pdev, mmio_bar); /* @@ -944,7 +995,10 @@ void intel_gt_info_print(const struct intel_gt_info *info, } struct reg_and_bit { - i915_reg_t reg; + union { + i915_reg_t reg; + i915_mcr_reg_t mcr_reg; + }; u32 bit; }; @@ -970,6 +1024,32 @@ get_reg_and_bit(const struct intel_engine_cs *engine, const bool gen8, return rb; } +/* + * HW architecture suggest typical invalidation time at 40us, + * with pessimistic cases up to 100us and a recommendation to + * cap at 1ms. We go a bit higher just in case. + */ +#define TLB_INVAL_TIMEOUT_US 100 +#define TLB_INVAL_TIMEOUT_MS 4 + +/* + * On Xe_HP the TLB invalidation registers are located at the same MMIO offsets + * but are now considered MCR registers. Since they exist within a GAM range, + * the primary instance of the register rolls up the status from each unit. + */ +static int wait_for_invalidate(struct intel_gt *gt, struct reg_and_bit rb) +{ + if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50)) + return intel_gt_mcr_wait_for_reg(gt, rb.mcr_reg, rb.bit, 0, + TLB_INVAL_TIMEOUT_US, + TLB_INVAL_TIMEOUT_MS); + else + return __intel_wait_for_register_fw(gt->uncore, rb.reg, rb.bit, 0, + TLB_INVAL_TIMEOUT_US, + TLB_INVAL_TIMEOUT_MS, + NULL); +} + static void mmio_invalidate_full(struct intel_gt *gt) { static const i915_reg_t gen8_regs[] = { @@ -985,6 +1065,13 @@ static void mmio_invalidate_full(struct intel_gt *gt) [COPY_ENGINE_CLASS] = GEN12_BLT_TLB_INV_CR, [COMPUTE_CLASS] = GEN12_COMPCTX_TLB_INV_CR, }; + static const i915_mcr_reg_t xehp_regs[] = { + [RENDER_CLASS] = XEHP_GFX_TLB_INV_CR, + [VIDEO_DECODE_CLASS] = XEHP_VD_TLB_INV_CR, + [VIDEO_ENHANCEMENT_CLASS] = XEHP_VE_TLB_INV_CR, + [COPY_ENGINE_CLASS] = XEHP_BLT_TLB_INV_CR, + [COMPUTE_CLASS] = XEHP_COMPCTX_TLB_INV_CR, + }; struct drm_i915_private *i915 = gt->i915; struct intel_uncore *uncore = gt->uncore; struct intel_engine_cs *engine; @@ -993,7 +1080,10 @@ static void mmio_invalidate_full(struct intel_gt *gt) const i915_reg_t *regs; unsigned int num = 0; - if (GRAPHICS_VER(i915) == 12) { + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { + regs = NULL; + num = ARRAY_SIZE(xehp_regs); + } else if (GRAPHICS_VER(i915) == 12) { regs = gen12_regs; num = ARRAY_SIZE(gen12_regs); } else if (GRAPHICS_VER(i915) >= 8 && GRAPHICS_VER(i915) <= 11) { @@ -1018,16 +1108,22 @@ static void mmio_invalidate_full(struct intel_gt *gt) if (!intel_engine_pm_is_awake(engine)) continue; - rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num); - if (!i915_mmio_reg_offset(rb.reg)) - continue; - - if (GRAPHICS_VER(i915) == 12 && (engine->class == VIDEO_DECODE_CLASS || - engine->class == VIDEO_ENHANCEMENT_CLASS || - engine->class == COMPUTE_CLASS)) - rb.bit = _MASKED_BIT_ENABLE(rb.bit); - - intel_uncore_write_fw(uncore, rb.reg, rb.bit); + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { + intel_gt_mcr_multicast_write_fw(gt, + xehp_regs[engine->class], + BIT(engine->instance)); + } else { + rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num); + if (!i915_mmio_reg_offset(rb.reg)) + continue; + + if (GRAPHICS_VER(i915) == 12 && (engine->class == VIDEO_DECODE_CLASS || + engine->class == VIDEO_ENHANCEMENT_CLASS || + engine->class == COMPUTE_CLASS)) + rb.bit = _MASKED_BIT_ENABLE(rb.bit); + + intel_uncore_write_fw(uncore, rb.reg, rb.bit); + } awake |= engine->mask; } @@ -1047,22 +1143,17 @@ static void mmio_invalidate_full(struct intel_gt *gt) for_each_engine_masked(engine, gt, awake, tmp) { struct reg_and_bit rb; - /* - * HW architecture suggest typical invalidation time at 40us, - * with pessimistic cases up to 100us and a recommendation to - * cap at 1ms. We go a bit higher just in case. - */ - const unsigned int timeout_us = 100; - const unsigned int timeout_ms = 4; - - rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num); - if (__intel_wait_for_register_fw(uncore, - rb.reg, rb.bit, 0, - timeout_us, timeout_ms, - NULL)) + if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) { + rb.mcr_reg = xehp_regs[engine->class]; + rb.bit = BIT(engine->instance); + } else { + rb = get_reg_and_bit(engine, regs == gen8_regs, regs, num); + } + + if (wait_for_invalidate(gt, rb)) drm_err_ratelimited(>->i915->drm, "%s TLB invalidation did not complete in %ums!\n", - engine->name, timeout_ms); + engine->name, TLB_INVAL_TIMEOUT_MS); } /* |