aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd
AgeCommit message (Collapse)AuthorFilesLines
2023-01-04Revert "drm/amd/display: Enable Freesync Video Mode by default"Michel Dänzer3-5/+35
This reverts commit de05abe6b9d0fe08f65d744f7f75a4cba4df27ad. The bug referenced below was bisected to this commit. There has been no activity toward fixing it in 3 months, so let's revert for now. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2162 Signed-off-by: Michel Dänzer <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2023-01-03drm/amd/display: Uninitialized variables causing 4k60 UCLK to stay at DPM1 ↵Samson Tam1-3/+3
and not DPM0 [Why] SwathSizePerSurfaceY[] and SwathSizePerSurfaceC[] values are uninitialized because we are using += instead of = operator. [How] Assign values in loop with = operator. Acked-by: Aurabindo Pillai <[email protected]> Signed-off-by: Samson Tam <[email protected]> Reviewed-by: Aric Cyr <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 6.0.x, 6.1.x
2023-01-03drm/amdkfd: Fix kernel warning during topology setupMukul Joshi1-1/+1
This patch fixes the following kernel warning seen during driver load by correctly initializing the p2plink attr before creating the sysfs file: [ +0.002865] ------------[ cut here ]------------ [ +0.002327] kobject: '(null)' (0000000056260cfb): is not initialized, yet kobject_put() is being called. [ +0.004780] WARNING: CPU: 32 PID: 1006 at lib/kobject.c:718 kobject_put+0xaa/0x1c0 [ +0.001361] Call Trace: [ +0.001234] <TASK> [ +0.001067] kfd_remove_sysfs_node_entry+0x24a/0x2d0 [amdgpu] [ +0.003147] kfd_topology_update_sysfs+0x3d/0x750 [amdgpu] [ +0.002890] kfd_topology_add_device+0xbd7/0xc70 [amdgpu] [ +0.002844] ? lock_release+0x13c/0x2e0 [ +0.001936] ? smu_cmn_send_smc_msg_with_param+0x1e8/0x2d0 [amdgpu] [ +0.003313] ? amdgpu_dpm_get_mclk+0x54/0x60 [amdgpu] [ +0.002703] kgd2kfd_device_init.cold+0x39f/0x4ed [amdgpu] [ +0.002930] amdgpu_amdkfd_device_init+0x13d/0x1f0 [amdgpu] [ +0.002944] amdgpu_device_init.cold+0x1464/0x17b4 [amdgpu] [ +0.002970] ? pci_bus_read_config_word+0x43/0x80 [ +0.002380] amdgpu_driver_load_kms+0x15/0x100 [amdgpu] [ +0.002744] amdgpu_pci_probe+0x147/0x370 [amdgpu] [ +0.002522] local_pci_probe+0x40/0x80 [ +0.001896] work_for_cpu_fn+0x10/0x20 [ +0.001892] process_one_work+0x26e/0x5a0 [ +0.002029] worker_thread+0x1fd/0x3e0 [ +0.001890] ? process_one_work+0x5a0/0x5a0 [ +0.002115] kthread+0xea/0x110 [ +0.001618] ? kthread_complete_and_exit+0x20/0x20 [ +0.002422] ret_from_fork+0x1f/0x30 [ +0.001808] </TASK> [ +0.001103] irq event stamp: 59837 [ +0.001718] hardirqs last enabled at (59849): [<ffffffffb30fab12>] __up_console_sem+0x52/0x60 [ +0.004414] hardirqs last disabled at (59860): [<ffffffffb30faaf7>] __up_console_sem+0x37/0x60 [ +0.004414] softirqs last enabled at (59654): [<ffffffffb307d9c7>] irq_exit_rcu+0xd7/0x130 [ +0.004205] softirqs last disabled at (59649): [<ffffffffb307d9c7>] irq_exit_rcu+0xd7/0x130 [ +0.004203] ---[ end trace 0000000000000000 ]--- Fixes: 0f28cca87e9a ("drm/amdkfd: Extend KFD device topology to surface peer-to-peer links") Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2022-12-30Merge tag 'acpi-6.2-rc2' of ↵Linus Torvalds1-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These are new ACPI IRQ override quirks, low-power S0 idle (S0ix) support adjustments and ACPI backlight handling fixes, mostly for platforms using AMD chips. Specifics: - Add ACPI IRQ override quirks for Asus ExpertBook B2502, Lenovo 14ALC7, and XMG Core 15 (Hans de Goede, Adrian Freund, Erik Schumacher). - Adjust ACPI video detection fallback path to prevent non-operational ACPI backlight devices from being created on systems where the native driver does not detect a suitable panel (Mario Limonciello). - Fix Apple GMUX backlight detection (Hans de Goede). - Add a low-power S0 idle (S0ix) handling quirk for HP Elitebook 865 and stop using AMD-specific low-power S0 idle code path for systems with Rembrandt chips and newer (Mario Limonciello)" * tag 'acpi-6.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: x86: s2idle: Stop using AMD specific codepath for Rembrandt+ ACPI: x86: s2idle: Force AMD GUID/_REV 2 on HP Elitebook 865 ACPI: video: Fix Apple GMUX backlight detection ACPI: resource: Add Asus ExpertBook B2502 to Asus quirks ACPI: resource: do IRQ override on Lenovo 14ALC7 ACPI: resource: do IRQ override on XMG Core 15 ACPI: video: Don't enable fallback path for creating ACPI backlight by default drm/amd/display: Report to ACPI video if no panels were found ACPI: video: Allow GPU drivers to report no panels
2022-12-23Merge tag 'drm-next-2022-12-23' of git://anongit.freedesktop.org/drm/drmLinus Torvalds37-284/+551
Pull drm fixes from Dave Airlie: "Holiday fixes! Two batches from amd, and one group of i915 changes. amdgpu: - Spelling fix - BO pin fix - Properly handle polaris 10/11 overlap asics - GMC9 fix - SR-IOV suspend fix - DCN 3.1.4 fix - KFD userptr locking fix - SMU13.x fixes - GDS/GWS/OA handling fix - Reserved VMID handling fixes - FRU EEPROM fix - BO validation fixes - Avoid large variable on the stack - S0ix fixes - SMU 13.x fixes - VCN fix - Add missing fence reference amdkfd: - Fix init vm error handling - Fix double release of compute pasid i915 - Documentation fixes - OA-perf related fix - VLV/CHV HDMI/DP audio fix - Display DDI/Transcoder fix - Migrate fixes" * tag 'drm-next-2022-12-23' of git://anongit.freedesktop.org/drm/drm: (39 commits) drm/amdgpu: grab extra fence reference for drm_sched_job_add_dependency drm/amdgpu: enable VCN DPG for GC IP v11.0.4 drm/amdgpu: skip mes self test after s0i3 resume for MES IP v11.0 drm/amd/pm: correct the fan speed retrieving in PWM for some SMU13 asics drm/amd/pm: bump SMU13.0.0 driver_if header to version 0x34 drm/amdgpu: skip MES for S0ix as well since it's part of GFX drm/amd/pm: avoid large variable on kernel stack drm/amdkfd: Fix double release compute pasid drm/amdkfd: Fix kfd_process_device_init_vm error handling drm/amd/pm: update SMU13.0.0 reported maximum shader clock drm/amd/pm: correct SMU13.0.0 pstate profiling clock settings drm/amd/pm: enable GPO dynamic control support for SMU13.0.7 drm/amd/pm: enable GPO dynamic control support for SMU13.0.0 drm/amdgpu: revert "generally allow over-commit during BO allocation" drm/amdgpu: Remove unnecessary domain argument drm/amdgpu: Fix size validation for non-exclusive domains (v4) drm/amdgpu: Check if fru_addr is not NULL (v2) drm/i915/ttm: consider CCS for backup objects drm/i915/migrate: fix corner case in CCS aux copying drm/amdgpu: rework reserved VMID handling ...
2022-12-22drm/amd/display: Report to ACPI video if no panels were foundMario Limonciello1-0/+4
On desktop APUs amdgpu doesn't create a native backlight device as no eDP panels are found. However if the BIOS has reported backlight control methods in the ACPI tables then an acpi_video0 backlight device will be made 8 seconds after boot. This has manifested in a power slider on a number of desktop APUs ranging from Ryzen 5000 through Ryzen 7000 on various motherboard manufacturers. To avoid this, report to the acpi video detection that the system does not have any panel connected in the native driver. Link: https://bugzilla.redhat.com/show_bug.cgi?id=1783786 Reported-by: Hans de Goede <[email protected]> Signed-off-by: Mario Limonciello <[email protected]> Reviewed-by: Hans de Goede <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2022-12-21drm/amdgpu: grab extra fence reference for drm_sched_job_add_dependencyChristian König1-0/+2
That function consumes the reference. Reviewed-by: Luben Tuikov <[email protected]> Reported-by: Borislav Petkov (AMD) <[email protected]> Tested-by: Borislav Petkov (AMD) <[email protected]> Signed-off-by: Christian König <[email protected]> Fixes: aab9cf7b6954 ("drm/amdgpu: use scheduler dependencies for VM updates") Signed-off-by: Alex Deucher <[email protected]>
2022-12-21drm/amdgpu: enable VCN DPG for GC IP v11.0.4Saleemkhan Jamadar1-0/+1
Enable VCN Dynamic Power Gating control for GC IP v11.0.4. Signed-off-by: Saleemkhan Jamadar <[email protected]> Reviewed-by: Veerabadhran Gopalakrishnan <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 6.0, 6.1
2022-12-20drm/amdgpu: skip mes self test after s0i3 resume for MES IP v11.0Tim Huang1-1/+2
MES is part of gfxoff and MES suspend and resume are skipped for S0i3. But the mes_self_test call path is still in the amdgpu_device_ip_late_init. it's should also be skipped for s0ix as no hardware re-initialization happened. Besides, mes_self_test will free the BO that triggers a lot of warning messages while in the suspend state. [ 81.656085] WARNING: CPU: 2 PID: 1550 at drivers/gpu/drm/amd/amdgpu/amdgpu_object.c:425 amdgpu_bo_free_kernel+0xfc/0x110 [amdgpu] [ 81.679435] Call Trace: [ 81.679726] <TASK> [ 81.679981] amdgpu_mes_remove_hw_queue+0x17a/0x230 [amdgpu] [ 81.680857] amdgpu_mes_self_test+0x390/0x430 [amdgpu] [ 81.681665] mes_v11_0_late_init+0x37/0x50 [amdgpu] [ 81.682423] amdgpu_device_ip_late_init+0x53/0x280 [amdgpu] [ 81.683257] amdgpu_device_resume+0xae/0x2a0 [amdgpu] [ 81.684043] amdgpu_pmops_resume+0x37/0x70 [amdgpu] [ 81.684818] pci_pm_resume+0x5c/0xa0 [ 81.685247] ? pci_pm_thaw+0x90/0x90 [ 81.685658] dpm_run_callback+0x4e/0x160 [ 81.686110] device_resume+0xad/0x210 [ 81.686529] async_resume+0x1e/0x40 [ 81.686931] async_run_entry_fn+0x33/0x120 [ 81.687405] process_one_work+0x21d/0x3f0 [ 81.687869] worker_thread+0x4a/0x3c0 [ 81.688293] ? process_one_work+0x3f0/0x3f0 [ 81.688777] kthread+0xff/0x130 [ 81.689157] ? kthread_complete_and_exit+0x20/0x20 [ 81.689707] ret_from_fork+0x22/0x30 [ 81.690118] </TASK> [ 81.690380] ---[ end trace 0000000000000000 ]--- v2: make the comment clean and use adev->in_s0ix instead of adev->suspend Signed-off-by: Tim Huang <[email protected]> Reviewed-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 6.0, 6.1
2022-12-20drm/amd/pm: correct the fan speed retrieving in PWM for some SMU13 asicsEvan Quan2-6/+28
For SMU 13.0.0 and 13.0.7, the output from PMFW is in percent. Driver need to convert that into correct PMW(255) based. Signed-off-by: Evan Quan <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 6.0, 6.1
2022-12-20drm/amd/pm: bump SMU13.0.0 driver_if header to version 0x34Evan Quan3-1/+4
To fit the latest PMFW and suppress the warning emerged on driver loading. Signed-off-by: Evan Quan <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 6.0, 6.1
2022-12-20drm/amdgpu: skip MES for S0ix as well since it's part of GFXAlex Deucher1-2/+3
It's also part of gfxoff. Cc: [email protected] # 6.0, 6.1 Reviewed-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-20drm/amd/pm: avoid large variable on kernel stackArnd Bergmann1-5/+16
The activity_monitor_external[] array is too big to fit on the kernel stack, resulting in this warning with clang: drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0_7_ppt.c:1438:12: error: stack frame size (1040) exceeds limit (1024) in 'smu_v13_0_7_get_power_profile_mode' [-Werror,-Wframe-larger-than] Use dynamic allocation instead. It should also be possible to have single element here instead of the array, but this seems easier. v2: fix up argument to sizeof() (Alex) Fixes: 334682ae8151 ("drm/amd/pm: enable workload type change on smu_v13_0_7") Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-20drm/amdkfd: Fix double release compute pasidPhilip Yang3-15/+40
If kfd_process_device_init_vm returns failure after vm is converted to compute vm and vm->pasid set to compute pasid, KFD will not take pdd->drm_file reference. As a result, drm close file handler maybe called to release the compute pasid before KFD process destroy worker to release the same pasid and set vm->pasid to zero, this generates below WARNING backtrace and NULL pointer access. Add helper amdgpu_amdkfd_gpuvm_set_vm_pasid and call it at the last step of kfd_process_device_init_vm, to ensure vm pasid is the original pasid if acquiring vm failed or is the compute pasid with pdd->drm_file reference taken to avoid double release same pasid. amdgpu: Failed to create process VM object ida_free called for id=32770 which is not allocated. WARNING: CPU: 57 PID: 72542 at ../lib/idr.c:522 ida_free+0x96/0x140 RIP: 0010:ida_free+0x96/0x140 Call Trace: amdgpu_pasid_free_delayed+0xe1/0x2a0 [amdgpu] amdgpu_driver_postclose_kms+0x2d8/0x340 [amdgpu] drm_file_free.part.13+0x216/0x270 [drm] drm_close_helper.isra.14+0x60/0x70 [drm] drm_release+0x6e/0xf0 [drm] __fput+0xcc/0x280 ____fput+0xe/0x20 task_work_run+0x96/0xc0 do_exit+0x3d0/0xc10 BUG: kernel NULL pointer dereference, address: 0000000000000000 RIP: 0010:ida_free+0x76/0x140 Call Trace: amdgpu_pasid_free_delayed+0xe1/0x2a0 [amdgpu] amdgpu_driver_postclose_kms+0x2d8/0x340 [amdgpu] drm_file_free.part.13+0x216/0x270 [drm] drm_close_helper.isra.14+0x60/0x70 [drm] drm_release+0x6e/0xf0 [drm] __fput+0xcc/0x280 ____fput+0xe/0x20 task_work_run+0x96/0xc0 do_exit+0x3d0/0xc10 Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-20drm/amdkfd: Fix kfd_process_device_init_vm error handlingPhilip Yang1-6/+6
Should only destroy the ib_mem and let process cleanup worker to free the outstanding BOs. Reset the pointer in pdd->qpd structure, to avoid NULL pointer access in process destroy worker. BUG: kernel NULL pointer dereference, address: 0000000000000010 Call Trace: amdgpu_amdkfd_gpuvm_unmap_gtt_bo_from_kernel+0x46/0xb0 [amdgpu] kfd_process_device_destroy_cwsr_dgpu+0x40/0x70 [amdgpu] kfd_process_destroy_pdds+0x71/0x190 [amdgpu] kfd_process_wq_release+0x2a2/0x3b0 [amdgpu] process_one_work+0x2a1/0x600 worker_thread+0x39/0x3d0 Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-15drm/amd/pm: update SMU13.0.0 reported maximum shader clockEvan Quan1-1/+69
Update the reported maximum shader clock to the value which can be guarded to be achieved on all cards. This is to align with Window setting. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 6.0.x
2022-12-15drm/amd/pm: correct SMU13.0.0 pstate profiling clock settingsEvan Quan1-7/+15
Correct the pstate standard/peak profiling mode clock settings for SMU13.0.0. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 6.0.x
2022-12-15drm/amd/pm: enable GPO dynamic control support for SMU13.0.7Evan Quan1-0/+2
To better support UMD pstate profilings, the GPO feature needs to be switched on/off accordingly. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 6.0.x
2022-12-15drm/amd/pm: enable GPO dynamic control support for SMU13.0.0Evan Quan4-1/+22
To better support UMD pstate profilings, the GPO feature needs to be switched on/off accordingly. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 6.0.x
2022-12-14drm/amdgpu: revert "generally allow over-commit during BO allocation"Christian König2-4/+18
This reverts commit f9d00a4a8dc8fff951c97b3213f90d6bc7a72175. This causes problem for KFD because when we overcommit we accidentially bind the BO to GTT for moving it into VRAM. We also need to make sure that this is done only as fallback after trying to evict first. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-14drm/amdgpu: Remove unnecessary domain argumentLuben Tuikov4-14/+6
Remove the "domain" argument to amdgpu_bo_create_kernel_at() since this function takes an "offset" argument which is the offset off of VRAM, and as such allocation always takes place in VRAM. Thus, the "domain" argument is unnecessary. Cc: Alex Deucher <[email protected]> Cc: Christian König <[email protected]> Cc: AMD Graphics <[email protected]> Signed-off-by: Luben Tuikov <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-14drm/amdgpu: Fix size validation for non-exclusive domains (v4)Luben Tuikov1-11/+8
Fix amdgpu_bo_validate_size() to check whether the TTM domain manager for the requested memory exists, else we get a kernel oops when dereferencing "man". v2: Make the patch standalone, i.e. not dependent on local patches. v3: Preserve old behaviour and just check that the manager pointer is not NULL. v4: Complain if GTT domain requested and it is uninitialized--most likely a bug. Cc: Alex Deucher <[email protected]> Cc: Christian König <[email protected]> Cc: AMD Graphics <[email protected]> Signed-off-by: Luben Tuikov <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-14drm/amdgpu: Check if fru_addr is not NULL (v2)Luben Tuikov1-2/+4
Always check if fru_addr is not NULL. This commit also fixes a "smatch" warning. v2: Add a Fixes tag. Cc: Alex Deucher <[email protected]> Cc: Dan Carpenter <[email protected]> Cc: kernel test robot <[email protected]> Cc: AMD Graphics <[email protected]> Fixes: afbe5d1e4bd7c7 ("drm/amdgpu: Bug-fix: Reading I2C FRU data on newer ASICs") Signed-off-by: Luben Tuikov <[email protected]> Reviewed-by: Kent Russell <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-14drm/amdgpu: rework reserved VMID handlingChristian König3-30/+24
Instead of reserving a VMID for a single process allow that many processes use the reserved ID. This allows for proper isolation between the processes. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-14drm/amdgpu: stop waiting for the VM during unreserveChristian König1-16/+0
This is completely pointless since the VMID always stays allocated until the VM is idle. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-14drm/amdgpu: cleanup SPM support a bitChristian König3-4/+7
This should probably not access job->vm and also emit the SPM switch under the conditional execute. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-14drm/amdgpu: fix GDS/GWS/OA switch handlingChristian König3-43/+54
Bas pointed out that this isn't working as expected and could cause crashes. Fix the handling by storing the marker that a switch is needed inside the job instead. Reported-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-14drm/amd/pm: add missing SMU13.0.7 mm_dpm feature mappingEvan Quan1-0/+2
Without this, the pp_dpm_vclk and pp_dpm_dclk outputs are not with correct data. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 6.0.x
2022-12-14drm/amd/pm: add missing SMU13.0.0 mm_dpm feature mappingEvan Quan1-0/+2
Without this, the pp_dpm_vclk and pp_dpm_dclk outputs are not with correct data. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 6.0.x
2022-12-14drm/amdgpu: Add notifier lock for KFD userptrsFelix Kuehling6-91/+172
Add a per-process MMU notifier lock for processing notifiers from userptrs. Use that lock to properly synchronize page table updates with MMU notifiers. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Xiaogang Chen<[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-14drm/amd/display: Add DCN314 display SG SupportYifan Zhang1-0/+1
Add display SG support for DCN 3.1.4. Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2022-12-14drm/amdgpu: WARN when freeing kernel memory during suspendChristian König1-0/+2
When buffers are freed during suspend there is no guarantee that they can be re-allocated during resume. The PSP subsystem seems to be quite buggy regarding this, so add a WARN_ON() to point out those bugs. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Tested-by: Guilherme G. Piccoli <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-14drm/amdgpu: fixx NULL pointer deref in gmc_v9_0_get_vm_pteChristian König1-1/+3
We not only need to make sure that we have a BO, but also that the BO has some backing store. Fixes: d1a372af1c3d ("drm/amdgpu: Set MTYPE in PTE based on BO flags") Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Luben Tuikov <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-13Merge tag 'mm-stable-2022-12-13' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - More userfaultfs work from Peter Xu - Several convert-to-folios series from Sidhartha Kumar and Huang Ying - Some filemap cleanups from Vishal Moola - David Hildenbrand added the ability to selftest anon memory COW handling - Some cpuset simplifications from Liu Shixin - Addition of vmalloc tracing support by Uladzislau Rezki - Some pagecache folioifications and simplifications from Matthew Wilcox - A pagemap cleanup from Kefeng Wang: we have VM_ACCESS_FLAGS, so use it - Miguel Ojeda contributed some cleanups for our use of the __no_sanitize_thread__ gcc keyword. This series should have been in the non-MM tree, my bad - Naoya Horiguchi improved the interaction between memory poisoning and memory section removal for huge pages - DAMON cleanups and tuneups from SeongJae Park - Tony Luck fixed the handling of COW faults against poisoned pages - Peter Xu utilized the PTE marker code for handling swapin errors - Hugh Dickins reworked compound page mapcount handling, simplifying it and making it more efficient - Removal of the autonuma savedwrite infrastructure from Nadav Amit and David Hildenbrand - zram support for multiple compression streams from Sergey Senozhatsky - David Hildenbrand reworked the GUP code's R/O long-term pinning so that drivers no longer need to use the FOLL_FORCE workaround which didn't work very well anyway - Mel Gorman altered the page allocator so that local IRQs can remnain enabled during per-cpu page allocations - Vishal Moola removed the try_to_release_page() wrapper - Stefan Roesch added some per-BDI sysfs tunables which are used to prevent network block devices from dirtying excessive amounts of pagecache - David Hildenbrand did some cleanup and repair work on KSM COW breaking - Nhat Pham and Johannes Weiner have implemented writeback in zswap's zsmalloc backend - Brian Foster has fixed a longstanding corner-case oddity in file[map]_write_and_wait_range() - sparse-vmemmap changes for MIPS, LoongArch and NIOS2 from Feiyang Chen - Shiyang Ruan has done some work on fsdax, to make its reflink mode work better under xfstests. Better, but still not perfect - Christoph Hellwig has removed the .writepage() method from several filesystems. They only need .writepages() - Yosry Ahmed wrote a series which fixes the memcg reclaim target beancounting - David Hildenbrand has fixed some of our MM selftests for 32-bit machines - Many singleton patches, as usual * tag 'mm-stable-2022-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (313 commits) mm/hugetlb: set head flag before setting compound_order in __prep_compound_gigantic_folio mm: mmu_gather: allow more than one batch of delayed rmaps mm: fix typo in struct pglist_data code comment kmsan: fix memcpy tests mm: add cond_resched() in swapin_walk_pmd_entry() mm: do not show fs mm pc for VM_LOCKONFAULT pages selftests/vm: ksm_functional_tests: fixes for 32bit selftests/vm: cow: fix compile warning on 32bit selftests/vm: madv_populate: fix missing MADV_POPULATE_(READ|WRITE) definitions mm/gup_test: fix PIN_LONGTERM_TEST_READ with highmem mm,thp,rmap: fix races between updates of subpages_mapcount mm: memcg: fix swapcached stat accounting mm: add nodes= arg to memory.reclaim mm: disable top-tier fallback to reclaim on proactive reclaim selftests: cgroup: make sure reclaim target memcg is unprotected selftests: cgroup: refactor proactive reclaim code to reclaim_until() mm: memcg: fix stale protection of reclaim target memcg mm/mmap: properly unaccount memory on mas_preallocate() failure omfs: remove ->writepage jfs: remove ->writepage ...
2022-12-13drm/amdgpu: Add an extra evict_resource call during device_suspend.Shikang Fan1-0/+5
- evict_resource is taking too long causing sriov full access mode timeout. So, add an extra evict_resource in the beginning as an early evict. Signed-off-by: Shikang Fan <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-13Merge tag 'drm-next-2022-12-13' of git://anongit.freedesktop.org/drm/drmLinus Torvalds292-3757/+7004
Pull drm updates from Dave Airlie: "The biggest highlight is that the accel subsystem framework is merged. Hopefully for 6.3 we will be able to line up a driver to use it. In drivers land, i915 enables DG2 support by default now, and nouveau has a big stability refactoring and initial ampere support, AMD includes new hw IP support and should build on ARM again. There is also an ofdrm driver to take over offb on platforms it's used. Stuff outside my tree, the dma-buf patches hit a few places, the vc4 firmware changes also do, and i915 has some interactions with MEI for discrete GPUs. I think all of those should have been acked/reviewed by relevant parties. New driver: - ofdrm - replacement for offb fbdev: - add support for nomodeset fourcc: - add Vivante tiled modifier core: - atomic-helpers: CRTC primary plane test fixes, fb access hooks - connector: TV API consistency, cmdline parser improvements - send connector hotplug on cleanup - sort makefile objects tests: - sort kunit tests - improve DP-MST tests - add kunit helpers to create a device sched: - module param for scheduling policy - refcounting fix buddy: - add back random seed log ttm: - convert ttm_resource to size_t - optimize pool allocations edid: - HFVSDB parsing support fixes - logging/debug improvements - DSC quirks dma-buf: - Add unlocked vmap and attachment mapping - move drivers to common locking convention - locking improvements firmware: - new API for rPI firmware and vc4 xilinx: - zynqmp: displayport bridge support - dpsub fix bridge: - adv7533: Remove dynamic lane switching - it6505: Runtime PM support, sync improvements - ps8640: Handle AUX defer messages - tc358775: Drop soft-reset over I2C panel: - panel-edp: Add INX N116BGE-EA2 C2 and C4 support. - Jadard JD9365DA-H3 - NewVision NV3051D amdgpu: - DCN support on ARM - DCN 2.1 secure display - Sienna Cichlid mode2 reset fixes - new GC 11.x firmware versions - drop AMD specific DSC workarounds in favour of drm code - clang warning fixes - scheduler rework - SR-IOV fixes - GPUVM locking fixes - fix memory leak in CS IOCTL error path - flexible array updates - enable new GC/PSP/SMU/NBIO IP - GFX preemption support for gfx9 amdkfd: - cache size fixes - userptr fixes - enable cooperative launch on gfx 10.3 - enable GC 11.0.4 KFD support radeon: - replace kmap with kmap_local_page - ACPI ref count fix - HDA audio notifier support i915: - DG2 enabled by default - MTL enablement work - hotplug refactoring - VBT improvements - Display and watermark refactoring - ADL-P workaround - temp disable runtime_pm for discrete- - fix for A380 as a secondary GPU - Wa_18017747507 for DG2 - CS timestamp support fixes for gen5 and earlier - never purge busy TTM objects - use i915_sg_dma_sizes for all backends - demote GuC kernel contexts to normal priority - gvt: refactor for new MDEV interface - enable DC power states on eDP ports - fix gen 2/3 workarounds nouveau: - fix page fault handling - Ampere acceleration support - driver stability improvements - nva3 backlight support msm: - MSM_INFO_GET_FLAGS support - DPU: XR30 and P010 image formats - Qualcomm SM6115 support - DSI PHY support for QCM2290 - HDMI: refactored dev init path - remove exclusive-fence hack - fix speed-bin detection - enable clamp to idle on 7c3 - improved hangcheck detection vmwgfx: - fb and cursor refactoring - convert to generic hashtable - cursor improvements etnaviv: - hw workarounds - softpin MMU fixes ast: - atomic gamma LUT support - convert to SHMEM lcdif: - support YUV planes - Increase DMA burst size - FIFO threshold tuning meson: - fix return type of cvbs mode_valid mgag200: - fix PLL setup on some revisions sun4i: - A100 and D1 support udl: - modesetting improvements - hot unplug support vc4: - support PAL-M - fix regression preventing 4K @ 60Hz - fix NULL ptr deref v3d: - switch to drm managed resources renesas: - RZ/G2L DSI support - DU Kconfig cleanup mediatek: - fixup dpi and hdmi - MT8188 dpi support - MT8195 AFBC support tegra: - NVDEC hardware on Tegra234 SoC hdlcd: - switch to drm managed resources ingenic: - fix registration error path hisilicon: - convert to drm_mode_init maildp: - use managed resources mtk: - use drm_mode_init rockchip: - use drm_mode_copy" * tag 'drm-next-2022-12-13' of git://anongit.freedesktop.org/drm/drm: (1397 commits) drm/amdgpu: fix mmhub register base coding error drm/amdgpu: add tmz support for GC IP v11.0.4 drm/amdgpu: enable GFX Clock Gating control for GC IP v11.0.4 drm/amdgpu: enable GFX Power Gating for GC IP v11.0.4 drm/amdgpu: enable GFX IP v11.0.4 CG support drm/amdgpu: Make amdgpu_ring_mux functions as static drm/amdgpu: generally allow over-commit during BO allocation drm/amd/display: fix array index out of bound error in DCN32 DML drm/amd/display: 3.2.215 drm/amd/display: set optimized required for comp buf changes drm/amd/display: Add debug option to skip PSR CRTC disable drm/amd/display: correct DML calc error of UrgentLatency drm/amd/display: correct static_screen_event_mask drm/amd/display: Ensure commit_streams returns the DC return code drm/amd/display: read invalid ddc pin status cause engine busy drm/amd/display: Bypass DET swath fill check for max clocks drm/amd/display: Disable uclk pstate for subvp pipes drm/amd/display: Fix DCN2.1 default DSC clocks drm/amd/display: Enable dp_hdmi21_pcon support drm/amd/display: prevent seamless boot on displays that don't have the preferred dig ...
2022-12-09drm/amdgpu: handle polaris10/11 overlap asics (v2)Alex Deucher1-2/+11
Some special polaris 10 chips overlap with the polaris11 DID range. Handle this properly in the driver. v2: use local flags for other function calls. Acked-by: Luben Tuikov <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2022-12-09drm/amdgpu: make display pinning more flexible (v2)Alex Deucher1-1/+2
Only apply the static threshold for Stoney and Carrizo. This hardware has certain requirements that don't allow mixing of GTT and VRAM. Newer asics do not have these requirements so we should be able to be more flexible with where buffers end up. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2270 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2291 Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2255 Acked-by: Luben Tuikov <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2022-12-09drm/amd/display: Fix spelling mistake: "dram_clk_chanage" -> "dram_clk_change"Colin Ian King7-20/+20
There is a spelling mistake in the struct field dram_clk_chanage. Fix it. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-09Merge tag 'amd-drm-next-6.2-2022-12-07' of ↵Dave Airlie37-94/+156
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.2-2022-12-07: amdgpu: - DSC fixes for DCN 2.1 - HDMI PCON fixes - PSR fixes - DC DML fixes - Properly throttle on BO allocation - GFX 11.0.4 fixes - MMHUB fix - Make some functions static Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-12-09Merge tag 'amd-drm-next-6.2-2022-12-02' of ↵Dave Airlie63-361/+1670
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.2-2022-12-02: amdgpu: - Fix CPU stalls when allocating large amounts of system memory - SR-IOV fixes - BACO fixes - Enable GC 11.0.4 - Enable PSP 13.0.11 - Enable SMU 13.0.11 - Enable NBIO 7.7.1 - Fix reported VCN capabilities for RDNA2 - Misc cleanups - PCI ref count fixes - DCN DPIA fixes - DCN 3.2.x fixes - Documentation updates - GC 11.x fixes - VCN RAS fixes - APU fix for passthrough - PSR fixes - GFX preemption support for gfx9 - SDMA fix for S0ix amdkfd: - Enable KFD support for GC 11.0.4 - Misc cleanups - Fix memory leak Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-12-07drm/amd/display: fix array index out of bound error in DCN32 DMLAurabindo Pillai1-1/+1
[Why&How] LinkCapacitySupport array is indexed with the number of voltage states and not the number of max DPPs. Fix the error by changing the array declaration to use the correct (larger) array size of total number of voltage states. Signed-off-by: Aurabindo Pillai <[email protected]> Reviewed-by: Rodrigo Siqueira <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 6.0.x
2022-12-07drm/amdgpu/sdma_v4_0: turn off SDMA ring buffer in the s2idle suspendPrike Liang1-9/+15
In the SDMA s0ix save process requires to turn off SDMA ring buffer for avoiding the SDMA in-flight request, otherwise will suffer from SDMA page fault which causes by page request from in-flight SDMA ring accessing at SDMA restore phase. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2248 Cc: [email protected] # 6.0,5.15+ Fixes: f8f4e2a51834 ("drm/amdgpu: skipping SDMA hw_init and hw_fini for S0ix.") Signed-off-by: Prike Liang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Tested-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-06drm/amdgpu: fix mmhub register base coding errorYang Wang5-5/+5
fix MMHUB register base coding error. Fixes: ec6837591f992 ("drm/amdgpu/gmc10: program the smallK fragment size") Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2022-12-06drm/amdgpu: add tmz support for GC IP v11.0.4Tim Huang1-0/+1
Add tmz support for GC 11.0.4. Signed-off-by: Tim Huang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-06drm/amdgpu: enable GFX Clock Gating control for GC IP v11.0.4Tim Huang1-0/+1
Enable GFX IP v11.0.4 CG gate/ungate control. Signed-off-by: Tim Huang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-06drm/amdgpu: enable GFX Power Gating for GC IP v11.0.4Tim Huang1-0/+2
Enable GFX Power Gating control for GC IP v11.0.4. Signed-off-by: Tim Huang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-06drm/amdgpu: enable GFX IP v11.0.4 CG supportTim Huang1-1/+17
Add CG support for GFX/MC/HDP/ATHUB/IH/BIF. Signed-off-by: Tim Huang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-06drm/amdgpu: Make amdgpu_ring_mux functions as staticJiadong Zhu1-5/+3
lkp robot reported missing-prototypes and unused-but-set-variable warnings on some functions of amdgpu_mcbp_mux.c. Make them static and remove the unused variable. Reported-by: kernel test robot <[email protected]> Signed-off-by: Jiadong Zhu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-12-06drm/amdgpu: generally allow over-commit during BO allocationChristian König2-18/+4
We already fallback to a dummy BO with no backing store when we allocate GDS,GWS and OA resources and to GTT when we allocate VRAM. Drop all those workarounds and generalize this for GTT as well. This fixes ENOMEM issues with runaway applications which try to allocate/free GTT in a loop and are otherwise only limited by the CPU speed. The CS will wait for the cleanup of freed up BOs to satisfy the various domain specific limits and so effectively throttle those buggy applications down to a sane allocation behavior again. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Arunpravin Paneer Selvam <[email protected]> Signed-off-by: Alex Deucher <[email protected]>