aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu
AgeCommit message (Collapse)AuthorFilesLines
2023-06-30drm/amdgpu: port SRIOV VF missed changesZhigang Luo1-1/+14
port SRIOV VF missed changes from gfx_v9_0 to gfx_v9_4_3. Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Zhigang Luo <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-30drm/amd: Don't try to enable secure display TA multiple timesMario Limonciello1-0/+2
If the securedisplay TA failed to load the first time, it's unlikely to work again after a suspend/resume cycle or reset cycle and it appears to be causing problems in futher attempts. Fixes: e42dfa66d592 ("drm/amdgpu: Add secure display TA load for Renoir") Reported-by: Filip Hejsek <[email protected]> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/2633 Signed-off-by: Mario Limonciello <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-30drm/amdgpu: fix number of fence calculationsChristian König1-5/+6
Since adding gang submit we need to take the gang size into account while reserving fences. Signed-off-by: Christian König <[email protected]> Fixes: 4624459c84d7 ("drm/amdgpu: add gang submit frontend v6") Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-30drm/amdgpu: Modify for_each_inst macroLijo Lazar1-3/+4
Modify it such that it doesn't change the instance mask parameter. Signed-off-by: Lijo Lazar <[email protected]> Acked-by: Victor Skvortsov <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-30drm/amdgpu:Remove sdma halt/unhalt during frontdoor loadMangesh Gadre1-4/+9
sdma halt/unhalt is performed by psp when frontdoor loading used,so this can be skipped. v2: Instead of removing halt/unhalt completely, driver will do it only during backdoor load. Signed-off-by: Mangesh Gadre <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Reviewed-by: Asad Kamal <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-30drm/amdgpu: check RAS irq existence for VCN/JPEGTao Zhou2-2/+4
No RAS irq is allowed. Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-30drm/amdgpu: remove vm sanity check from amdgpu_vm_make_computeXiaogang Chen1-6/+6
Since we allow kfd and graphic operate on same GPU VM to have interoperation between them GPU VM may have been used by graphic vm operations before kfd turns a GPU VM into a compute VM. Remove vm clean checking at amdgpu_vm_make_compute. Signed-off-by: Xiaogang Chen <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-29Merge tag 'drm-next-2023-06-29' of git://anongit.freedesktop.org/drm/drmLinus Torvalds184-3251/+17286
Pull drm updates from Dave Airlie: "There is one set of patches to misc for a i915 gsc/mei proxy driver. Otherwise it's mostly amdgpu/i915/msm, lots of hw enablement and lots of refactoring. core: - replace strlcpy with strscpy - EDID changes to support further conversion to struct drm_edid - Move i915 DSC parameter code to common DRM helpers - Add Colorspace functionality aperture: - ignore framebuffers with non-primary devices fbdev: - use fbdev i/o helpers - add Kconfig options for fb_ops helpers - use new fb io helpers directly in drivers sysfs: - export DRM connector ID scheduler: - Avoid an infinite loop ttm: - store function table in .rodata - Add query for TTM mem limit - Add NUMA awareness to pools - Export ttm_pool_fini() bridge: - fsl-ldb: support i.MX6SX - lt9211, lt9611: remove blanking packets - tc358768: implement input bus formats, devm cleanups - ti-snd65dsi86: implement wait_hpd_asserted - analogix: fix endless probe loop - samsung-dsim: support swapped clock, fix enabling, support var clock - display-connector: Add support for external power supply - imx: Fix module linking - tc358762: Support reset GPIO panel: - nt36523: Support Lenovo J606F - st7703: Support Anbernic RG353V-V2 - InnoLux G070ACE-L01 support - boe-tv101wum-nl6: Improve initialization - sharp-ls043t1le001: Mode fixes - simple: BOE EV121WXM-N10-1850, S6D7AA0 - Ampire AM-800480L1TMQW-T00H - Rocktech RK043FN48H - Starry himax83102-j02 - Starry ili9882t amdgpu: - add new ctx query flag to handle reset better - add new query/set shadow buffer for rdna3 - DCN 3.2/3.1.x/3.0.x updates - Enable DC_FP on loongarch - PCIe fix for RDNA2 - improve DC FAMS/SubVP support for better power management - partition support for lots of engines - Take NUMA into account when allocating memory - Add new DRM_AMDGPU_WERROR config parameter to help with CI - Initial SMU13 overdrive support - Add support for new colorspace KMS API - W=1 fixes amdkfd: - Query TTM mem limit rather than hardcoding it - GC 9.4.3 partition support - Handle NUMA for partitions - Add debugger interface for enabling gdb - Add KFD event age tracking radeon: - Fix possible UAF i915: - new getparam for PXP support - GSC/MEI proxy driver - Meteorlake display enablement - avoid clearing preallocated framebuffers with TTM - implement framebuffer mmap support - Disable sampler indirect state in bindless heap - Enable fdinfo for GuC backends - GuC loading and firmware table handling fixes - Various refactors for multi-tile enablement - Define MOCS and PAT tables for MTL - GSC/MEI support for Meteorlake - PMU multi-tile support - Large driver kernel doc cleanup - Allow VRR toggling and arbitrary refresh rates - Support async flips on linear buffers on display ver 12+ - Expose CRTC CTM property on ILK/SNB/VLV - New debugfs for display clock frequencies - Hotplug refactoring - Display refactoring - I915_GEM_CREATE_EXT_SET_PAT for Mesa on Meteorlake - Use large rings for compute contexts - HuC loading for MTL - Allow user to set cache at BO creation - MTL powermanagement enhancements - Switch to dedicated workqueues to stop using flush_scheduled_work() - Move display runtime init under display/ - Remove 10bit gamma on desktop gen3 parts, they don't support it habanalabs: - uapi: return 0 for user queries if there was a h/w or f/w error - Add pci health check when we lose connection with the firmware. This can be used to distinguish between pci link down and firmware getting stuck. - Add more info to the error print when TPC interrupt occur. - Firmware fixes msm: - Adreno A660 bindings - SM8350 MDSS bindings fix - Added support for DPU on sm6350 and sm6375 platforms - Implemented tearcheck support to support vsync on SM150 and newer platforms - Enabled missing features (DSPP, DSC, split display) on sc8180x, sc8280xp, sm8450 - Added support for DSI and 28nm DSI PHY on MSM8226 platform - Added support for DSI on sm6350 and sm6375 platforms - Added support for display controller on MSM8226 platform - A690 GPU support - Move cmdstream dumping out of fence signaling path - a610 support - Support for a6xx devices without GMU nouveau: - NULL ptr before deref fixes armada: - implement fbdev emulation as client sun4i: - fix mipi-dsi dotclock - release clocks vc4: - rgb range toggle property - BT601 / BT2020 HDMI support vkms: - convert to drmm helpers - add reflection and rotation support - fix rgb565 conversion gma500: - fix iomem access shmobile: - support renesas soc platform - enable fbdev mxsfb: - Add support for i.MX93 LCDIF stm: - dsi: Use devm_ helper - ltdc: Fix potential invalid pointer deref renesas: - Group drivers in renesas subdirectory to prepare for new platform - Drop deprecated R-Car H3 ES1.x support meson: - Add support for MIPI DSI displays virtio: - add sync object support mediatek: - Add display binding document for MT6795" * tag 'drm-next-2023-06-29' of git://anongit.freedesktop.org/drm/drm: (1791 commits) drm/i915: Fix a NULL vs IS_ERR() bug drm/i915: make i915_drm_client_fdinfo() reference conditional again drm/i915/huc: Fix missing error code in intel_huc_init() drm/i915/gsc: take a wakeref for the proxy-init-completion check drm/msm/a6xx: Add A610 speedbin support drm/msm/a6xx: Add A619_holi speedbin support drm/msm/a6xx: Use adreno_is_aXYZ macros in speedbin matching drm/msm/a6xx: Use "else if" in GPU speedbin rev matching drm/msm/a6xx: Fix some A619 tunables drm/msm/a6xx: Add A610 support drm/msm/a6xx: Add support for A619_holi drm/msm/adreno: Disable has_cached_coherent in GMU wrapper configurations drm/msm/a6xx: Introduce GMU wrapper support drm/msm/a6xx: Move CX GMU power counter enablement to hw_init drm/msm/a6xx: Extend and explain UBWC config drm/msm/a6xx: Remove both GBIF and RBBM GBIF halt on hw init drm/msm/a6xx: Add a helper for software-resetting the GPU drm/msm/a6xx: Improve a6xx_bus_clear_pending_transactions() drm/msm/a6xx: Move a6xx_bus_clear_pending_transactions to a6xx_gpu drm/msm/a6xx: Move force keepalive vote removal to a6xx_gmu_force_off() ...
2023-06-27Merge tag 'hardening-v6.5-rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: "There are three areas of note: A bunch of strlcpy()->strscpy() conversions ended up living in my tree since they were either Acked by maintainers for me to carry, or got ignored for multiple weeks (and were trivial changes). The compiler option '-fstrict-flex-arrays=3' has been enabled globally, and has been in -next for the entire devel cycle. This changes compiler diagnostics (though mainly just -Warray-bounds which is disabled) and potential UBSAN_BOUNDS and FORTIFY _warning_ coverage. In other words, there are no new restrictions, just potentially new warnings. Any new FORTIFY warnings we've seen have been fixed (usually in their respective subsystem trees). For more details, see commit df8fc4e934c12b. The under-development compiler attribute __counted_by has been added so that we can start annotating flexible array members with their associated structure member that tracks the count of flexible array elements at run-time. It is possible (likely?) that the exact syntax of the attribute will change before it is finalized, but GCC and Clang are working together to sort it out. Any changes can be made to the macro while we continue to add annotations. As an example of that last case, I have a treewide commit waiting with such annotations found via Coccinelle: https://git.kernel.org/linus/adc5b3cb48a049563dc673f348eab7b6beba8a9b Also see commit dd06e72e68bcb4 for more details. Summary: - Fix KMSAN vs FORTIFY in strlcpy/strlcat (Alexander Potapenko) - Convert strreplace() to return string start (Andy Shevchenko) - Flexible array conversions (Arnd Bergmann, Wyes Karny, Kees Cook) - Add missing function prototypes seen with W=1 (Arnd Bergmann) - Fix strscpy() kerndoc typo (Arne Welzel) - Replace strlcpy() with strscpy() across many subsystems which were either Acked by respective maintainers or were trivial changes that went ignored for multiple weeks (Azeem Shaikh) - Remove unneeded cc-option test for UBSAN_TRAP (Nick Desaulniers) - Add KUnit tests for strcat()-family - Enable KUnit tests of FORTIFY wrappers under UML - Add more complete FORTIFY protections for strlcat() - Add missed disabling of FORTIFY for all arch purgatories. - Enable -fstrict-flex-arrays=3 globally - Tightening UBSAN_BOUNDS when using GCC - Improve checkpatch to check for strcpy, strncpy, and fake flex arrays - Improve use of const variables in FORTIFY - Add requested struct_size_t() helper for types not pointers - Add __counted_by macro for annotating flexible array size members" * tag 'hardening-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (54 commits) netfilter: ipset: Replace strlcpy with strscpy uml: Replace strlcpy with strscpy um: Use HOST_DIR for mrproper kallsyms: Replace all non-returning strlcpy with strscpy sh: Replace all non-returning strlcpy with strscpy of/flattree: Replace all non-returning strlcpy with strscpy sparc64: Replace all non-returning strlcpy with strscpy Hexagon: Replace all non-returning strlcpy with strscpy kobject: Use return value of strreplace() lib/string_helpers: Change returned value of the strreplace() jbd2: Avoid printing outside the boundary of the buffer checkpatch: Check for 0-length and 1-element arrays riscv/purgatory: Do not use fortified string functions s390/purgatory: Do not use fortified string functions x86/purgatory: Do not use fortified string functions acpi: Replace struct acpi_table_slit 1-element array with flex-array clocksource: Replace all non-returning strlcpy with strscpy string: use __builtin_memcpy() in strlcpy/strlcat staging: most: Replace all non-returning strlcpy with strscpy drm/i2c: tda998x: Replace all non-returning strlcpy with strscpy ...
2023-06-26drm: Clear fd/handle callbacks in struct drm_driverThomas Zimmermann1-4/+0
Clear all assignments of struct drm_driver's fd/handle callbacks to drm_gem_prime_fd_to_handle() and drm_gem_prime_handle_to_fd(). These functions are called by default. Add a TODO item to convert vmwgfx to the defaults as well. v2: * remove TODO item (Zack) * also update amdgpu's amdgpu_partition_driver Signed-off-by: Thomas Zimmermann <[email protected]> Reviewed-by: Simon Ser <[email protected]> Acked-by: Alex Deucher <[email protected]> Acked-by: Jeffrey Hugo <[email protected]> # qaic Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-06-23drm/amd/pm: update the LC_L1_INACTIVITY setting to address possible noise issueEvan Quan1-1/+1
It is proved that insufficient LC_L1_INACTIVITY setting can cause audio noise on some platform. With the LC_L1_INACTIVITY increased to 4ms, the issue can be resolved. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-23drm/amdgpu: Skip TMR for MP0_HWIP 13.0.6Zhigang Luo1-0/+1
For SRIOV VF, no TMR needed. Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Zhigang Luo <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-23drm/amd/pm: revise the ASPM settings for thunderbolt attached scenarioEvan Quan1-4/+7
Also, correct the comment for NAVI10_PCIE__LC_L1_INACTIVITY_TBT_DEFAULT as 0x0000000E stands for 400ms instead of 4ms. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-23drm/amdgpu: fix clearing mappings for BOs that are always valid in VMSamuel Pitoiset1-0/+12
Per VM BOs must be marked as moved or otherwise their ranges are not updated on use which might be necessary when the replace operation splits mappings. This fixes random GPU hangs when replacing sparse mappings from the userspace, while OP_MAP/OP_UNMAP works fine because always valid BOs are correctly handled there. Cc: [email protected] Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-23drm/amdgpu: Add vbios attribute only if supportedLijo Lazar4-2/+15
Not all devices carry VBIOS version information. Add the device attribute only if supported. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-23drm/amdgpu/atomfirmware: fix LPDDR5 width reportingAlex Deucher1-6/+12
LPDDR5 channels are 32 bit rather than 64, report the width properly in the log. v2: Only LPDDR5 are 32 bits per channel. DDR5 is 64 bits per channel Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2468 Acked-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-23drm/amdgpu: Remove CONFIG_DEBUG_FS guard around body of ↵Nathan Chancellor1-2/+0
amdgpu_rap_debugfs_init() After commit 8020f0f9316b ("drm/amd/amdgpu: enable W=1 for amdgpu"), there is an instance of -Wunused-const-variable when CONFIG_DEBUG_FS is disabled: drivers/gpu/drm/amd/amdgpu/amdgpu_rap.c:110:37: error: unused variable 'amdgpu_rap_debugfs_ops' [-Werror,-Wunused-const-variable] 110 | static const struct file_operations amdgpu_rap_debugfs_ops = { | ^ 1 error generated. There is no reason for the body of this function to be guarded when CONFIG_DEBUG_FS is disabled, as debugfs_create_file() is a stub that just returns an error pointer in that situation. Remove the preprocessor guards so that the variable never appears unused, while not changing anything at run time. Signed-off-by: Nathan Chancellor <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-23drm/amdgpu: Move calculation of xcp per memory nodeLijo Lazar1-1/+3
Its value is required for finding the memory id of xcp. Fixes: d26ea1b346e7 ("drm/amdgpu: Add xcp manager num_xcp_per_mem_partition") Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-23drm/amdgpu: Skip mark offset for high priority ringsJiadong Zhu1-0/+3
Only low priority rings are using chunks to save the offset. Bypass the mark offset callings from high priority rings. Signed-off-by: Jiadong Zhu <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-19drm/amdgpu: Remove struct drm_driver.gem_prime_mmapThomas Zimmermann1-1/+0
The callback struct drm_driver.gem_prime_mmap as been removed in commit 0adec22702d4 ("drm: Remove struct drm_driver.gem_prime_mmap"). Do not assign to it. The assigned function, drm_gem_prime_mmap(), is now the default for the operation, so there is no change in functionality. Signed-off-by: Thomas Zimmermann <[email protected]> Fixes: 0adec22702d4 ("drm: Remove struct drm_driver.gem_prime_mmap") Cc: Thomas Zimmermann <[email protected]> Cc: Alex Deucher <[email protected]> Cc: "Christian König" <[email protected]> Cc: "Pan, Xinhui" <[email protected]> Cc: [email protected] Cc: [email protected] Reviewed-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-06-19Merge drm/drm-next into drm-misc-nextThomas Zimmermann185-3341/+17711
Backmerging into drm-misc-next to get commit 2c1c7ba457d4 ("drm/amdgpu: support partition drm devices"), which is required to fix commit 0adec22702d4 ("drm: Remove struct drm_driver.gem_prime_mmap"). Signed-off-by: Thomas Zimmermann <[email protected]>
2023-06-19drm: Remove struct drm_driver.gem_prime_mmapThomas Zimmermann1-1/+0
All drivers initialize this field with drm_gem_prime_mmap(). Call the function directly and remove the field. Simplifies the code and resolves a long-standing TODO item. Signed-off-by: Thomas Zimmermann <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-06-15drm/amdgpu: Increase hmm range get pages timeoutPhilip Yang1-2/+2
If hmm_range_fault returns -EBUSY, we should call hmm_range_fault again to validate the remaining pages. On one system with NUMA auto balancing enabled, hmm_range_fault takes 6 seconds for 1GB range because CPU migrate the range one page at a time. To be safe, increase timeout value to 1 second for 128MB range. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: Enable translate further for GC v9.4.3Philip Yang1-0/+1
To extend UTCL2 reach. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: Remove unused NBIO interfaceLijo Lazar2-16/+0
Set compute partition mode interface in NBIO is no longer used. Remove the only implementation from NBIO v7.9 Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Le Ma <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: add entity error check in amdgpu_ctx_get_entityZhenGuo Yin1-1/+9
[Why] UMD is not aware of entity error, and will keep submitting jobs into the error entity. [How] Add entity error check when getting entity from ctx. Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: ZhenGuo Yin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: add VM generation tokenChristian König7-7/+37
Instead of using the VRAM lost counter add a 64bit token which indicates if a context or job is still valid to use. Should the VRAM be lost or the page tables need re-creation the token will change indicating that userspace needs to act and re-create the contexts and re-submit the work. Signed-off-by: Christian König <[email protected]> Reviewed-by: Luben Tuikov <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: reset VM when an error is detectedChristian König1-16/+65
When some problem with the updates of page tables is detected reset the state machine of the VM and re-create all page tables from scratch. Signed-off-by: Christian König <[email protected]> Reviewed-by: Luben Tuikov <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: abort submissions during prepare on errorChristian König1-1/+12
Forward errors from previous submissions to this one. Signed-off-by: Christian König <[email protected]> Reviewed-by: Luben Tuikov <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: mark soft recovered fences with -ENODATAChristian König1-0/+7
Set the fence error code before trying to soft-recover it. It gets overwritten when a hard recovery is required. Signed-off-by: Christian König <[email protected]> Reviewed-by: Luben Tuikov <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: mark force completed fences with -ECANCELEDChristian König1-0/+1
When we force complete fences we should mark them as canceled. Signed-off-by: Christian König <[email protected]> Reviewed-by: Luben Tuikov <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: add amdgpu_error_* debugfs fileChristian König3-0/+41
This allows us to insert some error codes into the bottom of the pipeline on an engine. Signed-off-by: Christian König <[email protected]> Reviewed-by: Luben Tuikov <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: mark GC 9.4.3 experimental for nowAlex Deucher1-0/+2
Mark as experimental for now until we get closer to production to avoid possible undesireable behavior when mixing newer boards with older kernels. Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: Use PSP FW API for partition switchLijo Lazar2-15/+6
Use PSP firmware interface for switching compute partitions. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Le Ma <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: Change nbio v7.9 xcp status definitionLijo Lazar1-5/+3
PARTITION_MODE field in PARTITION_COMPUTE_STATUS register is defined as below by firmware. SPX = 0, DPX = 1, TPX = 2, QPX = 3, CPX = 4 Change driver definition accordingly. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Le Ma <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: make sure that BOs have a backing storeChristian König1-1/+5
It's perfectly possible that the BO is about to be destroyed and doesn't have a backing store associated with it. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Acked-by: Guchun Chen <[email protected]> Tested-by: Mikhail Gavrilov <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: make sure BOs are locked in amdgpu_vm_get_memoryChristian König1-30/+39
We need to grab the lock of the BO or otherwise can run into a crash when we try to inspect the current location. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Acked-by: Guchun Chen <[email protected]> Tested-by: Mikhail Gavrilov <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: Add checking mc_vram_sizeStanley.Yang1-1/+2
Do not compare injection address with mc_vram_size if mc_vram_size is zero. Signed-off-by: Stanley.Yang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: Optimize checking ras supportedStanley.Yang3-21/+23
Using "is_app_apu" to identify device in the native APU mode or carveout mode. Signed-off-by: Stanley.Yang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: Add channel_dis_num to ras init flagsCandice Li2-0/+2
Add disabled channel number to ras init flags. Signed-off-by: Candice Li <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: Update total channel number for umc v8_10Candice Li3-1/+5
Update total channel number for umc v8_10. Signed-off-by: Candice Li <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: Fix usage of UMC fill record in RASLuben Tuikov1-2/+1
The fixed commit listed in the Fixes tag below, introduced a bug in amdgpu_ras.c::amdgpu_reserve_page_direct(), in that when introducing the new amdgpu_umc_fill_error_record() and internally in that new function the physical address (argument "uint64_t retired_page"--wrong name) is right-shifted by AMDGPU_GPU_PAGE_SHIFT. Thus, in amdgpu_reserve_page_direct() when we pass "address" to that new function, we should NOT right-shift it, since this results, erroneously, in the page address to be 0 for first 2^(2*AMDGPU_GPU_PAGE_SHIFT) memory addresses. This commit fixes this bug. Cc: Tao Zhou <[email protected]> Cc: Hawking Zhang <[email protected]> Cc: Alex Deucher <[email protected]> Fixes: 400013b268cb ("drm/amdgpu: add umc_fill_error_record to make code more simple") Signed-off-by: Luben Tuikov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu/sdma4: set align mask to 255Alex Deucher2-4/+4
The wptr needs to be incremented at at least 64 dword intervals, use 256 to align with windows. This should fix potential hangs with unaligned updates. Reviewed-by: Felix Kuehling <[email protected]> Reviewed-by: Aaron Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: Report ras_num_recs in debugfsLuben Tuikov1-0/+2
Report the number of records stored in the RAS EEPROM table in debugfs. This can be used by user-space to calculate the capacity of the RAS EEPROM table since "bad_page_cnt_threshold" is also reported in the same place in debugfs. See commit 7fb640714547 ("drm/amdgpu: Add bad_page_cnt_threshold to debugfs"). ras_num_recs can already be inferred by dumping the RAS EEPROM table, also in the same debugfs location, see commit reference c65b0805e77919 (drm/amdgpu: RAS EEPROM table is now in debugfs, 2021-04-08). This commit makes it an integer value easily shown in a single file. Cc: Alex Deucher <[email protected]> Cc: Hawking Zhang <[email protected]> Cc: Tao Zhou <[email protected]> Cc: Stanley Yang <[email protected]> Cc: John Clements <[email protected]> Signed-off-by: Luben Tuikov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: Release SDMAv4.4.2 ecc irq properlyLijo Lazar1-6/+10
Release ECC irq only if irq is enabled - only when RAS feature is enabled ECC irq gets enabled. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: add wait_for helper for spirom updateLikun Gao4-4/+29
Spirom update typically requires extremely long duration for command execution, and special helper function to wait for it completion. Signed-off-by: Likun Gao <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: Print client id for the unregistered interrupt resourceMa Jun1-1/+2
Modify the debug information and print the clien id for these interrupts as well. Signed-off-by: Ma Jun <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15Revert "drm/amdgpu: remove TOPDOWN flags when allocating VRAM in large bar ↵Arunpravin Paneer Selvam1-1/+1
system" This reverts commit c105518679b6e87232874ffc989ec403bee59664. This patch disables the TOPDOWN flag for APU and few dGPU cards which has the VRAM size equal to the BAR size. When we enable the TOPDOWN flag, we get the free blocks at the highest available memory region and we don't split the lower order blocks. This change is required to keep off the fragmentation related issues particularly in ASIC which has VRAM space <= 500MiB Hence, we are reverting this patch. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2270 Signed-off-by: Arunpravin Paneer Selvam <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: expose num_hops and num_links xgmi info through dev attrShiwu Zhang1-0/+46
Add these two dev attrs for xgmi info details which is helpful for developers checking the xgmi topology by catting the sys file directly. Take 4 cards with xgmi connection as an example, get the num_hops for each device or node through xmig_hive_info dir like, cat /sys/bus/pci/devices/0000:41:00.0/xgmi_hive_info/node1/num_hops will return "00 41 41 41" where "00" stands for the hops to node1 itself and "41" is the hops in hex format to every other node in the same hive. There are node1/node2/node3/node4 representing 4 cards in the hive. The same for num_links dev attr. Signed-off-by: Shiwu Zhang <[email protected]> Acked-by: Le Ma <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdgpu: vcn_4_0 set instance 0 init sched score to 1Sonny Jiang1-1/+5
Only vcn0 can process AV1 codecx. In order to use both vcn0 and vcn1 in h264/265 transcode to AV1 cases, set vcn0 sched score to 1 at initialization time. Signed-off-by: Sonny Jiang <[email protected]> Reviewed-by: Leo Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>