aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu
AgeCommit message (Collapse)AuthorFilesLines
2022-10-27drm/amdgpu: correct MES debugfs versionsGraham Sider1-4/+6
Use mes.sched_version, mes.kiq_version for debugfs as mes.ucode_fw_version does not contain correct versioning information. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amdgpu: Move the mutex_lock to protect the return status of ↵Alan Liu2-4/+9
securedisplay command buffer [Why] Before we call psp_securedisplay_invoke(), we call psp_prep_securedisplay_cmd_buf() to prepare and initialize the command buffer. However, we didn't use the mutex_lock to protect the status of command buffer. So when multiple threads are using the command buffer, after thread A return from psp_securedisplay_invoke() and the command buffer status is set to SUCCESS, another thread B may call psp_prep_securedisplay_cmd_buf() and initialize the status to FAILURE again, and cause Thread A to get a failure return status. [How] Move the mutex_lock out of psp_securedisplay_invoke() to its caller to cover psp_prep_securedisplay_cmd_buf() and the code checking the return status of command buffer. Signed-off-by: Alan Liu <HaoPing.Liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amdgpu: remove ras_error_status parameter for UMC poison handlerTao Zhou4-16/+8
Make the code simpler. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amdgpu: add RAS poison handling for MCATao Zhou1-11/+20
For MCA poison, if unmap queue fails, only gpu reset should be triggered without page retirement handling, MCA notifier will do it. v2: handle MCA poison consumption in umc_poison_handler directly. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amdgpu: use page retirement API in MCA notifierTao Zhou1-33/+3
Make the code more readable. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amdgpu: add RAS page retirement functions for MCATao Zhou2-0/+55
Define page retirement functions for MCA platform. v2: remove page retirement handling from MCA poison handler, let MCA notifier do page retirement. v3: remove specific poison handler for MCA to simplify code. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-27drm/amdgpu: set fb_modifiers_not_supported in vkmsYifan Zhang1-0/+2
This patch to fix the gdm3 start failure with virual display: /usr/libexec/gdm-x-session[1711]: (II) AMDGPU(0): Setting screen physical size to 270 x 203 /usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): Failed to make import prime FD as pixmap: 22 /usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): failed to set mode: Invalid argument /usr/libexec/gdm-x-session[1711]: (WW) AMDGPU(0): Failed to set mode on CRTC 0 /usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): Failed to enable any CRTC gnome-shell[1840]: Running GNOME Shell (using mutter 42.2) as a X11 window and compositing manager /usr/libexec/gdm-x-session[1711]: (EE) AMDGPU(0): failed to set mode: Invalid argument vkms doesn't have modifiers support, set fb_modifiers_not_supported to bring the gdm back. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24amd/amdgpu: fix repeated words in commentswangjianli1-1/+1
Delete the redundant word 'the'. Signed-off-by: wangjianli <wangjianli@cdjrlc.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24drm/amdgpu: disallow gfxoff until GC IP blocks complete s2idle resumePrike Liang1-0/+16
In the S2idle suspend/resume phase the gfxoff is keeping functional so some IP blocks will be likely to reinitialize at gfxoff entry and that will result in failing to program GC registers.Therefore, let disallow gfxoff until AMDGPU IPs reinitialized completely. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24drm/amdgpu: skip mes self test for gc 11.0.3 in recoverYuBiao Wang1-1/+1
Temporary disable mes self teset for gc 11.0.3 during gpu_recovery. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24drm/amd: Add IMU fw version to fw version queriesDavid Francis4-1/+18
IMU is a new firmware for GFX11. There are four means by which firmware version can be queried from the driver: device attributes, vf2pf, debugfs, and the AMDGPU_INFO_FW_VERSION option in the amdgpu info ioctl. Add IMU as an option for those four methods. V2: Added debugfs Reviewed-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24drm/amdgpu: fix sdma doorbell init ordering on APUsAlex Deucher2-5/+21
Commit 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") uncovered a bug in amdgpu that required a reordering of the driver init sequence to avoid accessing a special register on the GPU before it was properly set up leading to an PCI AER error. This reordering uncovered a different hw programming ordering dependency in some APUs where the SDMA doorbells need to be programmed before the GFX doorbells. To fix this, move the SDMA doorbell programming back into the soc15 common code, but use the actual doorbell range values directly rather than the values stored in the ring structure since those will not be initialized at this point. This is a partial revert, but with the doorbell assignment fixed so the proper doorbell index is set before it's used. Fixes: e3163bc8ffdfdb ("drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega") Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: skhan@linuxfoundation.org
2022-10-24drm/amd/pm: allow gfxoff on gc_11_0_3Kenneth Feng1-0/+1
allow gfxoff on gc_11_0_3 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24drm/amdkfd: Fix memory leak in kfd_mem_dmamap_userptr()Rafael Mendonca1-3/+3
If the number of pages from the userptr BO differs from the SG BO then the allocated memory for the SG table doesn't get freed before returning -EINVAL, which may lead to a memory leak in some error paths. Fix this by checking the number of pages before allocating memory for the SG table. Fixes: 264fb4d332f5 ("drm/amdgpu: Add multi-GPU DMA mapping helpers") Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-24drm/amdgpu: Remove ATC L2 access for MMHUB 2.1.xLijo Lazar1-20/+8
MMHUB 2.1.x versions don't have ATCL2. Remove accesses to ATCL2 registers. Since they are non-existing registers, read access will cause a 'Completer Abort' and gets reported when AER is enabled with the below patch. Tagging with the patch so that this is backported along with it. v2: squash in uninitialized warning fix (Nathan Chancellor) Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-10-21drm/amdgpu: Adjust MES polling timeout for sriovYiqing Yao1-1/+8
[why] MES response time in sriov may be longer than default value due to reset or init in other VF. A timeout value specific to sriov is needed. [how] When in sriov, adjust the timeout value to calculated worst case scenario. Signed-off-by: Yiqing Yao <yiqing.yao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-21drm/amdgpu: fix pstate setting issueChengming Gui1-1/+4
[WHY] 0, original pstate X 1, ctx_A_create -> ctx_A->stable_pstate = X 2, ctx_A_set_pstate (Y) -> current pstate is Y (PEAK or STANDARD) 3, ctx_B_create -> ctx_B->stable_pstate = Y 4, ctx_A_destroy -> restore pstate to X 5, ctx_B_destroy -> restore pstate to Y Above sequence will cause final pstate is wrong (Y), should be original X. [HOW] When ctx_B create, if ctx_A touched pstate setting (not auto, stable_pstate_ctx != NULL), set ctx_B->stable_pstate the same value as ctx_A saved, if stable_pstate_ctx == NULL, fetch current pstate to fill ctx_B->stable_pstate. Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amdgpu: Fix for BO move issueArunpravin Paneer Selvam1-0/+3
A user reported a bug on CAPE VERDE system where uvd_v3_1 IP component failed to initialize as there is an issue with BO move code from one memory to other. In function amdgpu_mem_visible() called by amdgpu_bo_move(), when there are no blocks to compare or if we have a single block then break the loop. Fixes: 312b4dc11d4f ("drm/amdgpu: Fix VRAM BO swap issue") Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amdgpu: dequeue mes scheduler during finiYuBiao Wang1-3/+39
[Why] If mes is not dequeued during fini, mes will be in an uncleaned state during reload, then mes couldn't receive some commands which leads to reload failure. [How] Perform MES dequeue via MMIO after all the unmap jobs are done by mes and before kiq fini. v2: Move the dequeue operation inside kiq_hw_fini. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Reviewed-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-18drm/amdgpu: Program GC registers through RLCG interface in gfx_v11/gmc_v11Yifan Zha3-9/+13
[Why] L1 blocks most of GC registers accessing by MMIO. [How] Use RLCG interface to program GC registers under SRIOV VF in full access time. Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-17drm/amd/amdgpu: Replace kmap() with kmap_local_page()Fabio M. De Francesco1-4/+4
kmap() is being deprecated in favor of kmap_local_page(). There are two main problems with kmap(): (1) It comes with an overhead as mapping space is restricted and protected by a global lock for synchronization and (2) it also requires global TLB invalidation when the kmap’s pool wraps and it might block when the mapping space is fully utilized until a slot becomes available. With kmap_local_page() the mappings are per thread, CPU local, can take page faults, and can be called from any context (including interrupts). It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore, the tasks can be preempted and, when they are scheduled to run again, the kernel virtual addresses are restored and are still valid. Since its use in amdgpu/amdgpu_ttm.c is safe, it should be preferred. Therefore, replace kmap() with kmap_local_page() in amdgpu/amdgpu_ttm.c. Suggested-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-17drm/amdgpu: move convert_error_address out of umc_rasHawking Zhang4-12/+18
RAS error address translation algorithm is common across dGPU and A + A platform as along as the SOC integrates the same generation of UMC IP. UMC RAS is managed by x86 MCA on A + A platform, umc_ras in GPU driver is not initialized at all on A + A platform. In such case, any umc_ras callback implemented for dGPU config shouldn't be invoked from A + A specific callback. The change moves convert_error_address out of dGPU umc_ras structure and makes it share between A + A and dGPU config. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-17drm/amd/pm: disable cstate feature for gpu reset scenarioEvan Quan1-0/+8
Suggested by PMFW team and same as what did for gfxoff feature. This can address some Mode1Reset failures observed on SMU13.0.0. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.0.x
2022-10-17drm/amdgpu: Add sriov vf ras support in amdgpu_ras_asic_supportedYiPeng Chai1-5/+9
V2: Add sriov vf ras support in amdgpu_ras_asic_supported. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-17drm/amdgpu: Enable ras support for mp0 v13_0_0 and v13_0_10YiPeng Chai1-0/+10
V1: Enable ras support for CHIP_IP_DISCOVERY asic type. V2: 1. Change commit comment. 2. Enable ras support for mp0 v13_0_0 and v13_0_10. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-17drm/amdgpu: Enable gmc soft reset on gmc_v11_0_3YiPeng Chai1-0/+1
Enable gmc soft reset on gmc_v11_0_3. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-17drm/amdgpu: skip mes self test for gc 11.0.3Likun Gao1-1/+2
Temporary disable mes self teset for gc 11.0.3. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-17drm/amd/amdgpu: enable gfx clock gating features on smu_v13_0_10Kenneth Feng2-1/+6
enable gfx clock gating features on smu_v13_0_10 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Jack Gui <Jack.Gui@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-17drm/amdgpu: Refactor mode2 reset logic for v11.0.7Victor Zhao1-8/+17
- refactor mode2 on v11.0.7 to align with aldebaran - comment out using mode2 reset as default for now, will introduce another controller to replace previous reset_level_mask v2: squash in unused variable removal (Alex) Signed-off-by: Victor Zhao <Victor.Zhao@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-17Revert "drm/amdgpu: let mode2 reset fallback to default when failure"Victor Zhao9-20/+2
This reverts commit dac6b80818ac2353631c5a33d140d8d5508e2957. This commit reverted the AMDGPU_SKIP_MODE2_RESET as it conflicts with the original design of reset handler. Will redesign it. Fixes: dac6b80818ac23 ("drm/amdgpu: let mode2 reset fallback to default when failure") Signed-off-by: Victor Zhao <Victor.Zhao@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-17Revert "drm/amdgpu: add debugfs amdgpu_reset_level"Victor Zhao4-17/+0
This reverts commit 5bd8d53f6fa53eab5433698d1362dae2aa53c1cc. This commit breaks the reset logic for aldebaran, revert it for now. Will move the mask inside the reset handler. Fixes: 5bd8d53f6fa53e ("drm/amdgpu: add debugfs amdgpu_reset_level") Signed-off-by: Victor Zhao <Victor.Zhao@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-17drm/amdgpu: set vm_update_mode=0 as default for Sienna Cichlid in SRIOV caseDanijel Slivka3-1/+15
For asic with VF MMIO access protection avoid using CPU for VM table updates. CPU pagetable updates have issues with HDP flush as VF MMIO access protection blocks write to mmBIF_BX_DEV0_EPF0_VF0_HDP_MEM_COHERENCY_FLUSH_CNTL register during sriov runtime. v3: introduce virtualization capability flag AMDGPU_VF_MMIO_ACCESS_PROTECT which indicates that VF MMIO write access is not allowed in sriov runtime Signed-off-by: Danijel Slivka <danijel.slivka@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-17drm/amdgpu/si_dma: remove unused variable in si_dma_stop()Yang Yingliang1-2/+0
After commit 571c05365892 ("drm/amdgpu: switch sdma buffer function tear down to a helper"), the variable 'ring' is not used anymore, it can be removed. Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-17drm/amdgpu: convert amdgpu_amdkfd_gpuvm.c to IP version checksAlex Deucher1-4/+4
For consistency with the rest of the code. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-17drm/amdgpu: convert vega20_ih.c to IP version checksAlex Deucher1-3/+3
For consistency with newer asics. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-17drm/amdgpu: update psp_fw_type enum in amdgpu_ucode headerHawking Zhang1-0/+1
To match with the definition in psp firmware Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-17drm/amdgpu: extend HWIP_MAX_INSTANCE to 28Hawking Zhang1-1/+1
more ip instances are available Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-17drm/amdgpu: allow secure submission on gfx11 and sdma6Yifan Zhang2-0/+2
This patch to allow secure submission on gfx11 and sdma6. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-17drm/amdgpu: add tmz support for GC 11.0.1Yifan Zhang1-0/+1
this patch to add tmz support for GC 11.0.1. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-13Merge tag 'drm-next-2022-10-14' of git://anongit.freedesktop.org/drm/drmLinus Torvalds21-269/+231
Pull more drm updates from Dave Airlie: "Round of fixes for the merge window stuff, bunch of amdgpu and i915 changes, this should have the gcc11 warning fix, amongst other changes. amdgpu: - DC mutex fix - DC SubVP fixes - DCN 3.2.x fixes - DCN 3.1.x fixes - SDMA 6.x fixes - Enable DPIA for 3.1.4 - VRR fixes - VRAM BO swapping fix - Revert dirty fb helper change - SR-IOV suspend/resume fixes - Work around GCC array bounds check fail warning - UMC 8.10 fixes - Misc fixes and cleanups i915: - Round to closest in g4x+ HDMI clock readout - Update MOCS table for EHL - Fix PSR_IMR/IIR field handling - Fix watermark calculations for gen12+/DG2 modifiers - Reject excessive dotclocks early - Fix revocation of non-persistent contexts - Handle migration for dpt - Fix display problems after resume - Allow control over the flags when migrating - Consider DG2_RC_CCS_CC when migrating buffers" * tag 'drm-next-2022-10-14' of git://anongit.freedesktop.org/drm/drm: (110 commits) drm/amd/display: Add HUBP surface flip interrupt handler drm/i915/display: consider DG2_RC_CCS_CC when migrating buffers drm/i915: allow control over the flags when migrating drm/amd/display: Simplify bool conversion drm/amd/display: fix transfer function passed to build_coefficients() drm/amd/display: add a license to cursor_reg_cache.h drm/amd/display: make virtual_disable_link_output static drm/amd/display: fix indentation in dc.c drm/amd/display: make dcn32_split_stream_for_mpc_or_odm static drm/amd/display: fix build error on arm64 drm/amd/display: 3.2.207 drm/amd/display: Clean some DCN32 macros drm/amdgpu: Add poison mode query for umc v8_10_0 drm/amdgpu: Update umc v8_10_0 headers drm/amdgpu: fix coding style issue for mca notifier drm/amdgpu: define convert_error_address for umc v8.7 drm/amdgpu: define RAS convert_error_address API drm/amdgpu: remove check for CE in RAS error address query drm/i915: Fix display problems after resume drm/amd/display: fix array-bounds error in dc_stream_remove_writeback() [take 2] ...
2022-10-11drm/amdgpu: Add poison mode query for umc v8_10_0Candice Li1-0/+26
Add poison mode query support on umc v8_10_0. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-11drm/amdgpu: fix coding style issue for mca notifierTao Zhou1-3/+3
Fix some issues found by checkpatch script. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-11drm/amdgpu: define convert_error_address for umc v8.7Tao Zhou1-22/+25
So the code can be simplified. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-11drm/amdgpu: define RAS convert_error_address APITao Zhou3-93/+64
Make the code reusable and remove redundant code. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-11drm/amdgpu: remove check for CE in RAS error address queryTao Zhou4-90/+59
Only RAS UE error address is queried currently, no need to check CE status. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-10drm/amdgpu: fix SDMA suspend/resume on SR-IOVAlex Deucher4-13/+18
Update all SDMA versions that support SR-IOV to properly tear down the ttm buffer functions on suspend. Tested-by: Bokun Zhang <Bokun.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-10drm/amdgpu: switch sdma buffer function tear down to a helperAlex Deucher10-59/+36
Switch all of the SDMA implementations to use the helper to tear down the ttm buffer manager. Tested-by: Bokun Zhang <Bokun.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-10drm/amdgpu: Fix SDMA engine resume issue under SRIOVBokun Zhang1-3/+10
- Under SRIOV, SDMA engine is shared between VFs. Therefore, we will not stop SDMA during hw_fini. This is not an issue with normal dirver loading and unloading. - However, when we put the SDMA engine to suspend state and resume it, the issue starts to show up. Something could attempt to use that SDMA engine to clear or move memory before the engine is initialized since the DRM entity is still there. - Therefore, we will call sdma_v5_2_enable(false) during hw_fini, and if we are under SRIOV, we will call sdma_v5_2_enable(true) afterwards to allow other VFs to use SDMA. This way, the DRM entity of SDMA engine is emptied and it will follow the flow of resume code path. Tested-by: Bokun Zhang <Bokun.Zhang@amd.com> Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-10-07Merge tag 'driver-core-6.1-rc1' of ↵Linus Torvalds1-0/+14
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the big set of driver core and debug printk changes for 6.1-rc1. Included in here is: - dynamic debug updates for the core and the drm subsystem. The drm changes have all been acked by the relevant maintainers - kernfs fixes for syzbot reported problems - kernfs refactors and updates for cgroup requirements - magic number cleanups and removals from the kernel tree (they were not being used and they really did not actually do anything) - other tiny cleanups All of these have been in linux-next for a while with no reported issues" * tag 'driver-core-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (74 commits) docs: filesystems: sysfs: Make text and code for ->show() consistent Documentation: NBD_REQUEST_MAGIC isn't a magic number a.out: restore CMAGIC device property: Add const qualifier to device_get_match_data() parameter drm_print: add _ddebug descriptor to drm_*dbg prototypes drm_print: prefer bare printk KERN_DEBUG on generic fn drm_print: optimize drm_debug_enabled for jump-label drm-print: add drm_dbg_driver to improve namespace symmetry drm-print.h: include dyndbg header drm_print: wrap drm_*_dbg in dyndbg descriptor factory macro drm_print: interpose drm_*dbg with forwarding macros drm: POC drm on dyndbg - use in core, 2 helpers, 3 drivers. drm_print: condense enum drm_debug_category debugfs: use DEFINE_SHOW_ATTRIBUTE to define debugfs_regset32_fops driver core: use IS_ERR_OR_NULL() helper in device_create_groups_vargs() Documentation: ENI155_MAGIC isn't a magic number Documentation: NBD_REPLY_MAGIC isn't a magic number nbd: remove define-only NBD_MAGIC, previously magic number Documentation: FW_HEADER_MAGIC isn't a magic number Documentation: EEPROM_MAGIC_VALUE isn't a magic number ...
2022-10-06Revert "drm/amdgpu: use dirty framebuffer helper"Hamza Mahfooz1-12/+2
This reverts commit 66f99628eb24409cb8feb5061f78283c8b65f820. Unfortunately, that commit causes performance regressions on non-PSR setups. So, just revert it until FB_DAMAGE_CLIPS support can be added. Cc: stable@vger.kernel.org Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2189 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216554 Fixes: 66f99628eb2440 ("drm/amdgpu: use dirty framebuffer helper") Fixes: abbc7a3dafb91b ("drm/amdgpu: don't register a dirty callback for non-atomic") Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>