blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2022-01-19	drm/amdgpu: modify a pair of functions for the pcie port wreg/rreg	Xiaojian Du	1	-0/+33
	This patch will modify a pair of functions for pcie port wreg/rreg. AMD GPU have had an independent NBIO block from SOC15 arch. If the dirver wants to read/write the address space of the pcie devices, it has to go through the NBIO block. This patch will move the pcie port wreg/rreg functions to "amdgpu_device.c", so that to reuse the functions on the future GPU ASICs. Signed-off-by: Xiaojian Du <[email protected]> Reviewed-by: Huang Rui <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-01-18	drm/amd/amdgpu: fixing read wrong pf2vf data in SRIOV	Jingwen Chen	1	-1/+1
	[Why] This fixes 892deb48269c ("drm/amdgpu: Separate vf2pf work item init from virt data exchange"). we should read pf2vf data based at mman.fw_vram_usage_va after gmc sw_init. commit 892deb48269c breaks this logic. [How] calling amdgpu_virt_exchange_data in amdgpu_virt_init_data_exchange to set the right base in the right sequence. v2: call amdgpu_virt_init_data_exchange after gmc sw_init to make data exchange workqueue run v3: clean up the code logic v4: add some comment and make the code more readable Fixes: 892deb48269c ("drm/amdgpu: Separate vf2pf work item init from virt data exchange") Signed-off-by: Jingwen Chen <[email protected]> Reviewed-by: Horace Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-01-14	drm/amdgpu: invert the logic in amdgpu_device_should_recover_gpu()	Alex Deucher	1	-27/+17
	Rather than opting into GPU recovery support, default to on, and opt out if it's not working on a particular GPU. This avoids the need to add new asics to this list since this is a core feature. Reviewed-by: Evan Quan <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-01-14	drm/amdgpu: Enable recovery on yellow carp	CHANDAN VURDIGERE NATARAJ	1	-0/+1
	Add yellow carp to devices which support recovery Signed-off-by: CHANDAN VURDIGERE NATARAJ <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-01-14	drm/amdgpu: clean up some inconsistent indenting	Yang Li	1	-1/+1
	Eliminate the follow smatch warnings: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3504 amdgpu_device_init() warn: inconsistent indenting drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1716 amdgpu_ras_error_status_query() warn: if statement not indented drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1058 amdgpu_ras_error_inject() warn: inconsistent indenting Reported-by: Abaci Robot <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Signed-off-by: Yang Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-01-14	drm/amdgpu: Modify mmhub block to fit for the unified ras block data and ops	yipechai	1	-6/+6
	1.Modify mmhub block to fit for the unified ras block data and ops. 2.Change amdgpu_mmhub_ras_funcs to amdgpu_mmhub_ras, and the corresponding variable name remove _funcs suffix. 3.Remove the const flag of mmhub ras variable so that mmhub ras block can be able to be inserted into amdgpu device ras block link list. 4.Invoke amdgpu_ras_register_ras_block function to register mmhub ras block into amdgpu device ras block link list. 5.Remove the redundant code about mmhub in amdgpu_ras.c after using the unified ras block. 5.Remove the redundant code about mmhub in amdgpu_ras.c after using the unified ras block. 6.Fill unified ras block .name .block .ras_late_init and .ras_fini for all of mmhub versions. If .ras_late_init and .ras_fini had been defined by the selected mmhub version, the defined functions will take effect; if not defined, default fill them with amdgpu_mmhub_ras_late_init and amdgpu_mmhub_ras_fini. Signed-off-by: yipechai <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: John Clements <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-01-14	drm/amdgpu: Unify ras block interface for each ras block	yipechai	1	-0/+2
	1. Define unified ops interface for each block. 2. Add ras_block_match function pointer in ops interface, each ras block can customize specail match function to identify itself. 3. Add amdgpu_ras_block_match_default new function. If a ras block doesn't define .ras_block_match, default execute amdgpu_ras_block_match_default to identify this ras block. 4. Define unified basic ras block data for each ras block. 5. Create dedicated amdgpu device ras block link list to manage all of the ras blocks. 6. Add amdgpu_ras_register_ras_block new function interface for each ras block to register itself to ras controlling block. Signed-off-by: yipechai <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: John Clements <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-01-14	drm/amd/pm: do not expose implementation details to other blocks out of power	Evan Quan	1	-3/+3
	Those implementation details(whether swsmu supported, some ppt_funcs supported, accessing internal statistics ...)should be kept internally. It's not a good practice and even error prone to expose implementation details. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-01-11	drm/amdgpu: not return error on the init_apu_flags	Prike Liang	1	-4/+2
	In some APU project we needn't always assign flags to identify each other, so we may not need return an error. Signed-off-by: Prike Liang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Huang Rui <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-01-11	drm/amd/amdgpu: Add pcie indirect support to amdgpu_mm_wreg_mmio_rlc()	Tom St Denis	1	-1/+3
	The function amdgpu_mm_wreg_mmio_rlc() is used by debugfs to write to MMIO registers. It didn't support registers beyond the BAR mapped MMIO space. This adds pcie indirect write support. Signed-off-by: Tom St Denis <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-01-11	drm/amdgpu: recover gart table at resume	Nirmoy Das	1	-11/+0
	Get rid off pin/unpin of gart BO at resume/suspend and instead pin only once and try to recover gart content at resume time. This is much more stable in case there is OOM situation at 2nd call to amdgpu_device_evict_resources() while evicting GART table. v3: remove gart recovery from other places v2: pin gart at amdgpu_gart_table_vram_alloc() Reviewed-by: Christian König <[email protected]> Signed-off-by: Nirmoy Das <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-01-11	drm/amdgpu: do not pass ttm_resource_manager to gtt_mgr	Nirmoy Das	1	-2/+2
	Do not allow exported amdgpu_gtt_mgr_*() to accept any ttm_resource_manager pointer. Also there is no need to force other module to call a ttm function just to eventually call gtt_mgr functions. v4: remove unused adev. v3: upcast mgr from ttm resopurce manager instead of getting it from adev. v2: pass adev's gtt_mgr instead of adev. Reviewed-by: Christian König <[email protected]> Signed-off-by: Nirmoy Das <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-01-11	drm/amdgpu: Unmap MMIO mappings when device is not unplugged	Leslie Shi	1	-0/+11
	Patch: 3efb17ae7e92 ("drm/amdgpu: Call amdgpu_device_unmap_mmio() if device is unplugged to prevent crash in GPU initialization failure") makes call to amdgpu_device_unmap_mmio() conditioned on device unplugged. This patch unmaps MMIO mappings even when device is not unplugged. v2: Add condition of drm_dev_enter() to deleted unmaps in patch "drm/amdgpu: Unmap all MMIO mappings" Signed-off-by: Leslie Shi <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-01-07	drm/amdgpu: explicitly check for s0ix when evicting resources	Mario Limonciello	1	-2/+2
	This codepath should be running in both s0ix and s3, but only does currently because s3 and s0ix are both set in the s0ix case. Signed-off-by: Mario Limonciello <[email protected]> Acked-by: Evan Quan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-31	Merge tag 'amd-drm-next-5.17-2021-12-30' of ↵	Dave Airlie	1	-17/+29
	ssh://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.17-2021-12-30: amdgpu: - Suspend/resume fixes - Fence fix - Misc code cleanups - IP discovery fixes - SRIOV fixes - RAS fixes - GMC 8 VRAM detection fix - FRU fixes for Aldebaran - Display fixes amdkfd: - SVM fixes - IP discovery fixes Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-12-30	drm/amdgpu: no DC support for headless chips	Alex Deucher	1	-0/+6
	Chips with no display hardware should return false for DC support. v2: drop Arcturus and Aldebaran Fixes: f7f12b25823c0d ("drm/amdgpu: default to true in amdgpu_device_asic_has_dc_support") Reviewed-by: Evan Quan <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Reported-by: Tareque Md.Hanif <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-30	drm/amdgpu: Check the memory can be accesssed by ttm_device_clear_dma_mappings.	Surbhi Kakarya	1	-1/+2
	If the event guard is enabled and VF doesn't receive an ack from PF for full access, the guest driver load crashes. This is caused due to the call to ttm_device_clear_dma_mappings with non-initialized mman during driver tear down. This patch adds the necessary condition to check if the mman initialization passed or not and takes the path based on the condition output. Signed-off-by: Surbhi Kakarya <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-30	drm/amdgpu: Call amdgpu_device_unmap_mmio() if device is unplugged to ↵	Leslie Shi	1	-1/+3
	prevent crash in GPU initialization failure [Why] In amdgpu_driver_load_kms, when amdgpu_device_init returns error during driver modprobe, it will start the error handle path immediately and call into amdgpu_device_unmap_mmio as well to release mapped VRAM. However, in the following release callback, driver stills visits the unmapped memory like vcn.inst[i].fw_shared_cpu_addr in vcn_v3_0_sw_fini. So a kernel crash occurs. [How] call amdgpu_device_unmap_mmio() if device is unplugged to prevent invalid memory address in vcn_v3_0_sw_fini() when GPU initialization failure. Signed-off-by: Leslie Shi <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-28	drm/amdgpu: Send Message to SMU on aldebaran passthrough for sbr handling	sashank saye	1	-5/+4
	For Aldebaran chip passthrough case we need to intimate SMU about special handling for SBR.On older chips we send LightSBR to SMU, enabling the same for Aldebaran. Slight difference, compared to previous chips, is on Aldebaran, SMU would do a heavy reset on SBR. Hence, the word Heavy instead of Light SBR is used for SMU to differentiate. Reviewed by: Shaoyun.liu <[email protected]> Signed-off-by: sashank saye <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-28	drm/amdgpu: get xgmi info before ip_init	Victor Skvortsov	1	-0/+7
	Driver needs to call get_xgmi_info() before ip_init to determine whether it needs to handle a pending hive reset. Signed-off-by: Victor Skvortsov <[email protected]> Reviewed-by: David Nieto <[email protected]> Reviewed by: shaoyun.liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-23	Merge tag 'amd-drm-next-5.17-2021-12-16' of ↵	Dave Airlie	1	-27/+118
	https://gitlab.freedesktop.org/agd5f/linux into drm-next amdgpu: - Add some display debugfs entries - RAS fixes - SR-IOV fixes - W=1 fixes - Documentation fixes - IH timestamp fix - Misc power fixes - IP discovery fixes - Large driver documentation updates - Multi-GPU memory use reductions - Misc display fixes and cleanups - Add new SMU debug option amdkfd: - SVM fixes radeon: - Fix typo in comment From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-12-16	drm/amdgpu: Separate vf2pf work item init from virt data exchange	Victor Skvortsov	1	-1/+5
	We want to be able to call virt data exchange conditionally after gmc sw init to reserve bad pages as early as possible. Since this is a conditional call, we will need to call it again unconditionally later in the init sequence. Refactor the data exchange function so it can be called multiple times without re-initializing the work item. v2: Cleaned up the code. Kept the original call to init_exchange_data() inside early init to initialize the work item, afterwards call exchange_data() when needed. Signed-off-by: Victor Skvortsov <[email protected]> Reviewed By: Shaoyun.liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-16	drm/amdgpu: introduce new amdgpu_fence object to indicate the job embedded fence	Huang Rui	1	-9/+2
	The job embedded fence donesn't initialize the flags at dma_fence_init(). Then we will go a wrong way in amdgpu_fence_get_timeline_name callback and trigger a null pointer panic once we enabled the trace event here. So introduce new amdgpu_fence object to indicate the job embedded fence. [ 156.131790] BUG: kernel NULL pointer dereference, address: 00000000000002a0 [ 156.131804] #PF: supervisor read access in kernel mode [ 156.131811] #PF: error_code(0x0000) - not-present page [ 156.131817] PGD 0 P4D 0 [ 156.131824] Oops: 0000 [#1] PREEMPT SMP PTI [ 156.131832] CPU: 6 PID: 1404 Comm: sdma0 Tainted: G OE 5.16.0-rc1-custom #1 [ 156.131842] Hardware name: Gigabyte Technology Co., Ltd. Z170XP-SLI/Z170XP-SLI-CF, BIOS F20 11/04/2016 [ 156.131848] RIP: 0010:strlen+0x0/0x20 [ 156.131859] Code: 89 c0 c3 0f 1f 80 00 00 00 00 48 01 fe eb 0f 0f b6 07 38 d0 74 10 48 83 c7 01 84 c0 74 05 48 39 f7 75 ec 31 c0 c3 48 89 f8 c3 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 c3 31 [ 156.131872] RSP: 0018:ffff9bd0018dbcf8 EFLAGS: 00010206 [ 156.131880] RAX: 00000000000002a0 RBX: ffff8d0305ef01b0 RCX: 000000000000000b [ 156.131888] RDX: ffff8d03772ab924 RSI: ffff8d0305ef01b0 RDI: 00000000000002a0 [ 156.131895] RBP: ffff9bd0018dbd60 R08: ffff8d03002094d0 R09: 0000000000000000 [ 156.131901] R10: 000000000000005e R11: 0000000000000065 R12: ffff8d03002094d0 [ 156.131907] R13: 000000000000001f R14: 0000000000070018 R15: 0000000000000007 [ 156.131914] FS: 0000000000000000(0000) GS:ffff8d062ed80000(0000) knlGS:0000000000000000 [ 156.131923] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 156.131929] CR2: 00000000000002a0 CR3: 000000001120a005 CR4: 00000000003706e0 [ 156.131937] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 156.131942] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 156.131949] Call Trace: [ 156.131953] <TASK> [ 156.131957] ? trace_event_raw_event_dma_fence+0xcc/0x200 [ 156.131973] ? ring_buffer_unlock_commit+0x23/0x130 [ 156.131982] dma_fence_init+0x92/0xb0 [ 156.131993] amdgpu_fence_emit+0x10d/0x2b0 [amdgpu] [ 156.132302] amdgpu_ib_schedule+0x2f9/0x580 [amdgpu] [ 156.132586] amdgpu_job_run+0xed/0x220 [amdgpu] v2: fix mismatch warning between the prototype and function name (Ray, kernel test robot) Signed-off-by: Huang Rui <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-14	amdgpu: fix some kernel-doc markup	Yann Dirson	1	-3/+3
	Those are not today pulled by the sphinx doc, but better be ready. Signed-off-by: Yann Dirson <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-14	drm/amdgpu: use adev_to_drm to get drm_device pointer	Guchun Chen	1	-1/+1
	Updated for consistency when accessing drm_device from amdgpu driver. Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-14	Merge v5.16-rc5 into drm-next	Daniel Vetter	1	-6/+9
	Thomas Zimmermann requested a fixes backmerge, specifically also for 96c5f82ef0a1 ("drm/vc4: fix error code in vc4_create_object()") Just a bunch of adjacent changes conflicts, even the big pile of them in vc4. Signed-off-by: Daniel Vetter <[email protected]>
2021-12-13	drm/amdgpu: Detect if amdgpu in IOMMU direct map mode	Philip Yang	1	-0/+19
	If host and amdgpu IOMMU is not enabled or IOMMU is pass through mode, set adev->ram_is_direct_mapped flag which will be used to optimize memory usage for multi GPU mappings. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-13	drm/amdgpu: introduce a kind of halt state for amdgpu device	Lang Yu	1	-0/+39
	It is useful to maintain error context when debugging SW/FW issues. Introduce amdgpu_device_halt() for this purpose. It will bring hardware to a kind of halt state, so that no one can touch it any more. Compare to a simple hang, the system will keep stable at least for SSH access. Then it should be trivial to inspect the hardware state and see what's going on. v2: - Set adev->no_hw_access earlier to avoid potential crashes.(Christian) Suggested-by: Christian Koenig <[email protected]> Suggested-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Lang Yu <[email protected]> Reviewed-by: Christian Koenig <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-13	drm/amdgpu: only hw fini SMU fisrt for ASICs need that	Lang Yu	1	-15/+32
	We found some headaches on ASICs don't need that, so remove that for them. Suggested-by: Lijo Lazar <[email protected]> Signed-off-by: Lang Yu <[email protected]> Reviewed-by: Kevin Wang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-13	drm/amd: fix improper docstring syntax	Isabella Basso	1	-2/+2
	This fixes various warnings relating to erroneous docstring syntax, of which some are listed below: warning: Function parameter or member 'adev' not described in 'amdgpu_atomfirmware_ras_rom_addr' ... warning: expecting prototype for amdgpu_atpx_validate_functions(). Prototype was for amdgpu_atpx_validate() instead ... warning: Excess function parameter 'mem' description in 'amdgpu_preempt_mgr_new' ... warning: Cannot understand * @kfd_get_cu_occupancy - Collect number of waves in-flight on this device ... warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst Signed-off-by: Isabella Basso <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-13	drm/amdgpu: recover XGMI topology for SRIOV VF after reset	Zhigang Luo	1	-3/+14
	For SRIOV VF, the XGMI topology was not recovered after reset. This change added code to SRIOV VF reset function to update XGMI topology for SRIOV VF after reset. Signed-off-by: Zhigang Luo <[email protected]> Reviewed-by: Shaoyun Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-13	drm/amdgpu: skip reset other device in the same hive if it's SRIOV VF	Zhigang Luo	1	-3/+4
	On SRIOV, host driver can support FLR(function level reset) on individual VF within the hive which might bring the individual device back to normal without the necessary to execute the hive reset. If the FLR failed , host driver will trigger the hive reset, each guest VF will get reset notification before the real hive reset been executed. The VF device can handle the reset request individually in it's reset work handler. This change updated gpu recover sequence to skip reset other device in the same hive for SRIOV VF. Signed-off-by: Zhigang Luo <[email protected]> Reviewed-by: Shaoyun Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-01	drm/amdgpu: adjust the kfd reset sequence in reset sriov function	shaoyunl	1	-4/+8
	This change revert previous commits: 9f4f2c1a3524 ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov") 271fd38ce56d ("drm/amdgpu: move kfd post_reset out of reset_sriov function") This change moves the amdgpu_amdkfd_pre_reset to an earlier place in amdgpu_device_reset_sriov, presumably to address the sequence issue that the first patch was originally meant to fix. Some register access(GRBM_GFX_CNTL) only be allowed on full access mode. Move kfd_pre_reset and kfd_post_reset back inside reset_sriov function. Fixes: 9f4f2c1a3524 ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov") Fixes: 271fd38ce56d ("drm/amdgpu: move kfd post_reset out of reset_sriov function") Signed-off-by: shaoyunl <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-01	drm/amdgpu: check atomic flag to differeniate with legacy path	Flora Cui	1	-2/+2
	since vkms support atomic KMS interface Signed-off-by: Flora Cui <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-01	drm/amdgpu: adjust the kfd reset sequence in reset sriov function	shaoyunl	1	-4/+8
	This change revert previous commits: 9f4f2c1a3524 ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov") 271fd38ce56d ("drm/amdgpu: move kfd post_reset out of reset_sriov function") This change moves the amdgpu_amdkfd_pre_reset to an earlier place in amdgpu_device_reset_sriov, presumably to address the sequence issue that the first patch was originally meant to fix. Some register access(GRBM_GFX_CNTL) only be allowed on full access mode. Move kfd_pre_reset and kfd_post_reset back inside reset_sriov function. Fixes: 9f4f2c1a3524 ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov") Fixes: 271fd38ce56d ("drm/amdgpu: move kfd post_reset out of reset_sriov function") Signed-off-by: shaoyunl <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-01	drm/amdgpu: fix disable ras feature failed when unload drvier v2	Stanley.Yang	1	-2/+3
	v2: still need call ras_disable_all_featrures to handle ras initilization failure case. Function amdgpu_device_fini_hw is called before amdgpu_device_fini_sw, so ras ta will unload before send ras disable command, ras dsiable operation must before hw fini. Signed-off-by: Stanley.Yang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-12-01	drm/amdgpu: check atomic flag to differeniate with legacy path	Flora Cui	1	-2/+2
	since vkms support atomic KMS interface Signed-off-by: Flora Cui <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-11-24	drm/amdgpu: move kfd post_reset out of reset_sriov function	shaoyunl	1	-4/+3
	Fixes: 9f4f2c1a3524 ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov") For sriov XGMI configuration, the host driver will handle the hive reset, so in guest side, the reset_sriov only be called once on one device. This will make kfd post_reset unblanced with kfd pre_reset since kfd pre_reset already been moved out of reset_sriov function. Move kfd post_reset out of reset_sriov function to make them balance . Signed-off-by: shaoyunl <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-11-24	drm/amdgpu: move kfd post_reset out of reset_sriov function	shaoyunl	1	-4/+3
	Fixes: 9f4f2c1a3524 ("drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov") For sriov XGMI configuration, the host driver will handle the hive reset, so in guest side, the reset_sriov only be called once on one device. This will make kfd post_reset unblanced with kfd pre_reset since kfd pre_reset already been moved out of reset_sriov function. Move kfd post_reset out of reset_sriov function to make them balance . Signed-off-by: shaoyunl <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-11-22	drm/amd/pm: avoid duplicate powergate/ungate setting	Evan Quan	1	-0/+3
	Just bail out if the target IP block is already in the desired powergate/ungate state. This can avoid some duplicate settings which sometimes may cause unexpected issues. Link: https://lore.kernel.org/all/[email protected]/ Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214921 Bug: https://bugzilla.kernel.org/show_bug.cgi?id=215025 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1789 Signed-off-by: Evan Quan <[email protected]> Tested-by: Borislav Petkov <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-11-17	drm/amd/pm: avoid duplicate powergate/ungate setting	Evan Quan	1	-0/+3
	Just bail out if the target IP block is already in the desired powergate/ungate state. This can avoid some duplicate settings which sometimes may cause unexpected issues. Link: https://lore.kernel.org/all/[email protected]/ Bug: https://bugzilla.kernel.org/show_bug.cgi?id=214921 Bug: https://bugzilla.kernel.org/show_bug.cgi?id=215025 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1789 Fixes: bf756fb833cb ("drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend") Signed-off-by: Evan Quan <[email protected]> Tested-by: Borislav Petkov <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2021-11-17	drm/amdgpu: use generic fb helpers instead of setting up AMD own's.	Evan Quan	1	-8/+4
	With the shadow buffer support from generic framebuffer emulation, it's possible now to have runpm kicked when no update for console. Signed-off-by: Evan Quan <[email protected]> Acked-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-11-09	drm/amd/amdgpu: fix the kfd pre_reset sequence in sriov	shaoyunl	1	-4/+1
	The KFD pre_reset should be called before reset been executed, it will hold the lock to prevent other rocm process to sent the packlage to hiq during host execute the real reset on the HW Signed-off-by: shaoyunl <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-11-05	drm/amdgpu: fix SI handling in amdgpu_device_asic_has_dc_support()	Alex Deucher	1	-1/+11
	Properly handle SI DC support when CONFIG_DRM_AMD_DC_SI is not set. Fixes: f7f12b25823c0d ("drm/amdgpu: default to true in amdgpu_device_asic_has_dc_support") Reviewed-by: Evan Quan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-11-03	drm/amdgpu: remove duplicated kfd_resume_iommu	James Zhu	1	-4/+0
	Remove duplicated kfd_resume_iommu which already runs in mdgpu_amdkfd_device_init. Tested-By: Ken Moffat <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: James Zhu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-11-03	drm/amd/amdgpu: fix bad job hw_fence use after free in advance tdr	Jingwen Chen	1	-0/+4
	[Why] In advance tdr mode, the real bad job will be resubmitted twice, while in drm_sched_resubmit_jobs_ext, there's a dma_fence_put, so the bad job is put one more time than other jobs. [How] Adding dma_fence_get before resbumit job in amdgpu_device_recheck_guilty_jobs and put the fence for normal jobs Signed-off-by: Jingwen Chen <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-28	drm/amdgpu: fix a potential memory leak in amdgpu_device_fini_sw()	Lang Yu	1	-1/+1
	amdgpu_fence_driver_sw_fini() should be executed before amdgpu_device_ip_fini(), otherwise fence driver resource won't be properly freed as adev->rings have been tore down. Fixes: 72c8c97b1522 ("drm/amdgpu: Split amdgpu_device_fini into early and late") Signed-off-by: Lang Yu <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-28	BackMerge tag 'v5.15-rc7' into drm-next	Dave Airlie	1	-0/+4
	The msm next tree is based on rc3, so let's just backmerge rc7 before pulling it in. Signed-off-by: Dave Airlie <[email protected]>
2021-10-19	drm/amd/amdgpu: Do irq_fini_hw after ip_fini_early	YuBiao Wang	1	-2/+2
	[Why] drm_irq_uninstall is called in irq_fini_hw so that irq is disabled in sw stage. SMU (and maybe other IP blocks) fini_hw will call irq_put for cleanup and the whole cleanup process will be skipped because of drm->irq_enable = false. [How] Move ip_fini_early before irq_fini_hw. Signed-off-by: YuBiao Wang <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-10-13	drm/amdkfd: fix boot failure when iommu is disabled in Picasso.	Yifan Zhang	1	-4/+0
	When IOMMU disabled in sbios and kfd in iommuv2 path, iommuv2 init will fail. But this failure should not block amdgpu driver init. Reported-by: youling <[email protected]> Tested-by: youling <[email protected]> Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: James Zhu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>