aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu
AgeCommit message (Collapse)AuthorFilesLines
2021-08-31drm/amdgpu: Fix a deadlock if previous GEM object allocation failsxinhui pan1-13/+10
Fall through to handle the error instead of return. Fixes: f8aab60422c37 ("drm/amdgpu: Initialise drm_gem_object_funcs for imported BOs") Cc: [email protected] Signed-off-by: xinhui pan <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-31drm/amdgpu: stop scheduler when calling hw_fini (v2)Guchun Chen1-0/+8
This gurantees no more work on the ring can be submitted to hardware in suspend/resume case, otherwise a potential race will occur and the ring will get no chance to stay empty before suspend. v2: Call drm_sched_resubmit_job before drm_sched_start to restart jobs from the pending list. Suggested-by: Andrey Grodzovsky <[email protected]> Suggested-by: Christian König <[email protected]> Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2021-08-31drm/amdgpu: Clear RAS interrupt status on aldebaranJohn Clements1-5/+25
Resolve incorrect register address Reviewed-by: Candice Li <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-30Merge tag 'irq-core-2021-08-30' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "Updates to the interrupt core and driver subsystems: Core changes: - The usual set of small fixes and improvements all over the place, but nothing stands out MSI changes: - Further consolidation of the PCI/MSI interrupt chip code - Make MSI sysfs code independent of PCI/MSI and expose the MSI interrupts of platform devices in the same way as PCI exposes them. Driver changes: - Support for ARM GICv3 EPPI partitions - Treewide conversion to generic_handle_domain_irq() for all chained interrupt controllers - Conversion to bitmap_zalloc() throughout the irq chip drivers - The usual set of small fixes and improvements" * tag 'irq-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits) platform-msi: Add ABI to show msi_irqs of platform devices genirq/msi: Move MSI sysfs handling from PCI to MSI core genirq/cpuhotplug: Demote debug printk to KERN_DEBUG irqchip/qcom-pdc: Trim unused levels of the interrupt hierarchy irqdomain: Export irq_domain_disconnect_hierarchy() irqchip/gic-v3: Fix priority comparison when non-secure priorities are used irqchip/apple-aic: Fix irq_disable from within irq handlers pinctrl/rockchip: drop the gpio related codes gpio/rockchip: drop irq_gc_lock/irq_gc_unlock for irq set type gpio/rockchip: support next version gpio controller gpio/rockchip: use struct rockchip_gpio_regs for gpio controller gpio/rockchip: add driver for rockchip gpio dt-bindings: gpio: change items restriction of clock for rockchip,gpio-bank pinctrl/rockchip: add pinctrl device to gpio bank struct pinctrl/rockchip: separate struct rockchip_pin_bank to a head file pinctrl/rockchip: always enable clock for gpio controller genirq: Fix kernel doc indentation EDAC/altera: Convert to generic_handle_domain_irq() powerpc: Bulk conversion to generic_handle_domain_irq() nios2: Bulk conversion to generic_handle_domain_irq() ...
2021-08-30drm/amdgpu: show both cmd id and name when psp cmd failedLang Yu1-4/+4
To cover the corner case that people want to know the ID of an UNKNOWN CMD. Suggested-by: John Clements <[email protected]> Signed-off-by: Lang Yu <[email protected]> Reviewed-by: Huang Rui <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-30drm/amdgpu: Enable S/G for Yellow CarpNicholas Kazlauskas1-0/+1
Missing code for Yellow Carp to enable scatter gather - follows how DCN21 support was added. Tested that 8k framebuffer allocation and display can now succeed after applying the patch. v2: Add hookup in DM Reviewed-by: Aaron Liu <[email protected]> Acked-by: Huang Rui <[email protected]> Signed-off-by: Nicholas Kazlauskas <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2021-08-30drm/amdgpu: reenable BACO support for 699F:C7 polaris12 SKUEvan Quan1-8/+1
This reverts the commit below: "drm/amdgpu: disable BACO support for 699F:C7 polaris12 SKU temporarily". As the S3 hang issue has been fixed by another commit: "drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend". Signed-off-by: Evan Quan <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-30drm/amd/amdgpu: Add ready_to_reset resp for vega10YuBiao Wang2-0/+3
Send response to host after received the flr notification from host. Port NV change to vega10. Signed-off-by: YuBiao Wang <[email protected]> Reviewed-by: Jingwen Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-30drm/amdgpu: add some additional RDNA2 PCI IDsAlex Deucher1-0/+17
New PCI IDs. Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-30drm/amdgpu: correct comments in memory type managersYifan Zhang2-5/+5
The parameters were renamed. Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-30drm/amdgpu: Process any VBIOS RAS EEPROM addressLuben Tuikov1-12/+13
We can now process any RAS EEPROM address from VBIOS. Generalize so as to compute the top three bits of the 19-bit EEPROM address, from any byte returned as the "i2c address" from VBIOS. Cc: John Clements <[email protected]> Cc: Hawking Zhang <[email protected]> Cc: Alex Deucher <[email protected]> Signed-off-by: Luben Tuikov <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-30drm/amdgpu: Fixes to returning VBIOS RAS EEPROM addressLuben Tuikov1-17/+33
1) Generalize the function--if the user didn't set i2c_address, still return true/false to indicate whether VBIOS contains the RAS EEPROM address. This function shouldn't evaluate whether the user set the i2c_address pointer or not. 2) Don't touch the caller's i2c_address, unless you have to--this function shouldn't have side effects. 3) Correctly set the function comment as a kernel-doc comment. Cc: John Clements <[email protected]> Cc: Hawking Zhang <[email protected]> Cc: Alex Deucher <[email protected]> Signed-off-by: Luben Tuikov <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-30drm/sched: drop entity parameter from drm_sched_push_jobDaniel Vetter2-2/+2
Originally a job was only bound to the queue when we pushed this, but now that's done in drm_sched_job_init, making that parameter entirely redundant. Remove it. The same applies to the context parameter in lima_sched_context_queue_task, simplify that too. v2: Rebase on top of msm adopting drm/sched Reviewed-by: Christian König <[email protected]> Acked-by: Emma Anholt <[email protected]> Acked-by: Melissa Wen <[email protected]> Reviewed-by: Steven Price <[email protected]> (v1) Reviewed-by: Boris Brezillon <[email protected]> (v1) Signed-off-by: Daniel Vetter <[email protected]> Cc: Lucas Stach <[email protected]> Cc: Russell King <[email protected]> Cc: Christian Gmeiner <[email protected]> Cc: Qiang Yu <[email protected]> Cc: Rob Herring <[email protected]> Cc: Tomeu Vizoso <[email protected]> Cc: Steven Price <[email protected]> Cc: Alyssa Rosenzweig <[email protected]> Cc: Emma Anholt <[email protected]> Cc: David Airlie <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: "Christian König" <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Nirmoy Das <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Chen Li <[email protected]> Cc: Lee Jones <[email protected]> Cc: Deepak R Varma <[email protected]> Cc: Kevin Wang <[email protected]> Cc: Luben Tuikov <[email protected]> Cc: "Marek Olšák" <[email protected]> Cc: Maarten Lankhorst <[email protected]> Cc: Andrey Grodzovsky <[email protected]> Cc: Dennis Li <[email protected]> Cc: Boris Brezillon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Rob Clark <[email protected]> Cc: Sean Paul <[email protected]> Cc: Melissa Wen <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-08-30drm/sched: Split drm_sched_job_initDaniel Vetter2-0/+4
This is a very confusingly named function, because not just does it init an object, it arms it and provides a point of no return for pushing a job into the scheduler. It would be nice if that's a bit clearer in the interface. But the real reason is that I want to push the dependency tracking helpers into the scheduler code, and that means drm_sched_job_init must be called a lot earlier, without arming the job. v2: - don't change .gitignore (Steven) - don't forget v3d (Emma) v3: Emma noticed that I leak the memory allocated in drm_sched_job_init if we bail out before the point of no return in subsequent driver patches. To be able to fix this change drm_sched_job_cleanup() so it can handle being called both before and after drm_sched_job_arm(). Also improve the kerneldoc for this. v4: - Fix the drm_sched_job_cleanup logic, I inverted the booleans, as usual (Melissa) - Christian pointed out that drm_sched_entity_select_rq() also needs to be moved into drm_sched_job_arm, which made me realize that the job->id definitely needs to be moved too. Shuffle things to fit between job_init and job_arm. v5: Reshuffle the split between init/arm once more, amdgpu abuses drm_sched.ready to signal gpu reset failures. Also document this somewhat. (Christian) v6: Rebase on top of the msm drm/sched support. Note that the drm_sched_job_init() call is completely misplaced, and hence also the split-out drm_sched_entity_push_job(). I've put in a FIXME which the next patch will address. v7: Drop the FIXME in msm, after discussions with Rob I agree it shouldn't be a problem where it is now. Acked-by: Christian König <[email protected]> Acked-by: Melissa Wen <[email protected]> Cc: Melissa Wen <[email protected]> Acked-by: Emma Anholt <[email protected]> Acked-by: Steven Price <[email protected]> (v2) Reviewed-by: Boris Brezillon <[email protected]> (v5) Signed-off-by: Daniel Vetter <[email protected]> Cc: Lucas Stach <[email protected]> Cc: Russell King <[email protected]> Cc: Christian Gmeiner <[email protected]> Cc: Qiang Yu <[email protected]> Cc: Rob Herring <[email protected]> Cc: Tomeu Vizoso <[email protected]> Cc: Steven Price <[email protected]> Cc: Alyssa Rosenzweig <[email protected]> Cc: David Airlie <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: "Christian König" <[email protected]> Cc: Masahiro Yamada <[email protected]> Cc: Kees Cook <[email protected]> Cc: Adam Borowski <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Mauro Carvalho Chehab <[email protected]> Cc: Paul Menzel <[email protected]> Cc: Sami Tolvanen <[email protected]> Cc: Viresh Kumar <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Nirmoy Das <[email protected]> Cc: Deepak R Varma <[email protected]> Cc: Lee Jones <[email protected]> Cc: Kevin Wang <[email protected]> Cc: Chen Li <[email protected]> Cc: Luben Tuikov <[email protected]> Cc: "Marek Olšák" <[email protected]> Cc: Dennis Li <[email protected]> Cc: Maarten Lankhorst <[email protected]> Cc: Andrey Grodzovsky <[email protected]> Cc: Sonny Jiang <[email protected]> Cc: Boris Brezillon <[email protected]> Cc: Tian Tao <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Emma Anholt <[email protected]> Cc: Rob Clark <[email protected]> Cc: Sean Paul <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-08-30Merge tag 'amd-drm-next-5.15-2021-08-27' of ↵Dave Airlie40-143/+761
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.15-2021-08-27: amdgpu: - PLL fix for SI - Misc code cleanups - RAS fixes - PSP cleanups - Polaris UVD/VCE suspend fixes - aldebaran fixes - DCN3.x mclk fixes amdkfd: - CWSR fixes for arcturus and aldebaran - SVM fixes Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-08-29Merge tag 'irqchip-5.15' of ↵Thomas Gleixner1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core Pull irqchip updates from Marc Zyngier: - API updates: - Treewide conversion to generic_handle_domain_irq() for anything that looks like a chained interrupt controller - Update the irqdomain documentation - Use of bitmap_zalloc() throughout the tree - New functionalities: - Support for GICv3 EPPI partitions - Fixes: - Qualcomm PDC hierarchy fixes - Yet another priority decoding fix for the GICv3 pseudo-NMIs - Fix the apple-aic driver irq_eoi() callback to always unmask the interrupt - Properly handle edge interrupts on loongson-pch-pic - Let the mtk-sysirq driver advertise IRQCHIP_SKIP_SET_WAKE Link: https://lore.kernel.org/r/[email protected]
2021-08-26drm/amdgpu: disable GFX CGCG in aldebaranHawking Zhang1-2/+0
disable GFX CGCG and CGLS to workaround a hardware issue found in aldebaran. Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-26drm/amdgpu: Clear RAS interrupt status on aldebaranJohn Clements1-5/+29
resolve register address issue for detecting/clearing RAS interrupt Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-26drm/amdgpu: Add support for RAS XGMI err queryJohn Clements1-0/+65
Update XGMI RAS to support error query on aldebaran Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-26Merge tag 'amd-drm-next-5.15-2021-08-20' of ↵Dave Airlie45-439/+716
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.15-2021-08-20: amdgpu: - embed hw fence into job - Misc SMU fixes - PSP TA code cleanup - RAS fixes - PWM fan speed fixes - DC workqueue cleanups - SR-IOV fixes - gfxoff delayed work fix - Pin domain check fix amdkfd: - SVM fixes radeon: - Code cleanup Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-08-25drm/amdgpu: rename amdgpu_bo_get_preferred_pin_domainYifan Zhang4-7/+7
amdgpu_bo_get_preferred_pin_domain is used for page tables creation, which is not involved with page pinning. And it is used in more cases than display scanout, modify its documentation as well. Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-25drm/amdgpu: drop redundant cancel_delayed_work_sync callEvan Quan4-6/+0
As those _sw_fini() APIs follow just after _suspend() APIs. And the cancel_delayed_work_sync was already called in latter. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-25drm/amdgpu: add missing cleanups for more ASICs on UVD/VCE suspendEvan Quan6-1/+144
This is a supplement for commit below: "drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend". Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-25drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspendEvan Quan2-0/+47
Perform proper cleanups on UVD/VCE suspend: powergate enablement, clockgating enablement and dpm disablement. This can fix some hangs observed on suspending when UVD/VCE still using(e.g. issue "pm-suspend" when video is still playing). Signed-off-by: Evan Quan <[email protected]> Signed-off-by: xinhui pan <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-24drm/amdkfd: check access permisson to restore retry faultPhilip Yang4-5/+8
Check range access permission to restore GPU retry fault, if GPU retry fault on address which belongs to VMA, and VMA has no read or write permission requested by GPU, failed to restore the address. The vm fault event will pass back to user space. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-24drm/amdgpu: Update RAS XGMI Error QueryJohn Clements1-1/+3
Resolve bug querying error on unsupported ASIC Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-24drm/amdgpu: Add driver infrastructure for MCA RASJohn Clements9-2/+388
Add MCA specific IP blocks targetting RAS features Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-24drm/amd/amdgpu: consolidate PSP TA init shared buf functionsCandice Li1-99/+43
Signed-off-by: Candice Li <[email protected]> Reviewed-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-24drm/amd/amdgpu: add name field back to ras_common_ifCandice Li1-0/+1
Adding name field back to ras_common_if to work around error injection failure with amdgpuras tool. Signed-off-by: Candice Li <[email protected]> Reviewed-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-24drm/amdgpu: Fix build with missing pm_suspend_target_state module exportBorislav Petkov1-1/+1
Building a randconfig here triggered: ERROR: modpost: "pm_suspend_target_state" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! because the module export of that symbol happens in kernel/power/suspend.c which is enabled with CONFIG_SUSPEND. The ifdef guards in amdgpu_acpi_is_s0ix_supported(), however, test for CONFIG_PM_SLEEP which is defined like this: config PM_SLEEP def_bool y depends on SUSPEND || HIBERNATE_CALLBACKS and that randconfig has: # CONFIG_SUSPEND is not set CONFIG_HIBERNATE_CALLBACKS=y leading to the module export missing. Change the ifdeffery to depend directly on CONFIG_SUSPEND. Fixes: 5706cb3c910c ("drm/amdgpu: fix checking pmops when PM_SLEEP is not enabled") Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2021-08-24drm/amdgpu: switch from 'pci_' to 'dma_' APIChristophe JAILLET1-3/+3
The wrappers in include/linux/pci-dma-compat.h should go away. The patch has been generated with the coccinelle script below. It has been compile tested. @@ @@ - PCI_DMA_BIDIRECTIONAL + DMA_BIDIRECTIONAL @@ @@ - PCI_DMA_TODEVICE + DMA_TO_DEVICE @@ @@ - PCI_DMA_FROMDEVICE + DMA_FROM_DEVICE @@ @@ - PCI_DMA_NONE + DMA_NONE @@ expression e1, e2, e3; @@ - pci_alloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3; @@ - pci_zalloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3, e4; @@ - pci_free_consistent(e1, e2, e3, e4) + dma_free_coherent(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_single(e1, e2, e3, e4) + dma_map_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_single(e1, e2, e3, e4) + dma_unmap_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4, e5; @@ - pci_map_page(e1, e2, e3, e4, e5) + dma_map_page(&e1->dev, e2, e3, e4, e5) @@ expression e1, e2, e3, e4; @@ - pci_unmap_page(e1, e2, e3, e4) + dma_unmap_page(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_sg(e1, e2, e3, e4) + dma_map_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_sg(e1, e2, e3, e4) + dma_unmap_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_cpu(e1, e2, e3, e4) + dma_sync_single_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_device(e1, e2, e3, e4) + dma_sync_single_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_cpu(e1, e2, e3, e4) + dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_device(e1, e2, e3, e4) + dma_sync_sg_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2; @@ - pci_dma_mapping_error(e1, e2) + dma_mapping_error(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_dma_mask(e1, e2) + dma_set_mask(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_consistent_dma_mask(e1, e2) + dma_set_coherent_mask(&e1->dev, e2) Reviewed-by: Christian König <[email protected]> Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-24drm/amdkfd: CWSR with sw scheduler on Aldebaran and ArcturusMukul Joshi4-2/+6
Program trap handler settings to enable CWSR with software scheduler on Aldebaran and Arcturus. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Amber Lin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-24drm/amdgpu/OLAND: clip the ref divider max valueShashank Sharma3-9/+16
This patch limits the ref_div_max value to 100, during the calculation of PLL feedback reference divider. With current value (128), the produced fb_ref_div value generates unstable output at particular frequencies. Radeon driver limits this value at 100. On Oland, when we try to setup mode 2048x1280@60 (a bit weird, I know), it demands a clock of 221270 Khz. It's been observed that the PLL calculations using values 128 and 100 are vastly different, and look like this: +------------------------------------------+ |Parameter |AMDGPU |Radeon | | | | | +-------------+----------------------------+ |Clock feedback | | |divider max | 128 | 100 | |cap value | | | | | | | | | | | +------------------------------------------+ |ref_div_max | | | | | 42 | 20 | | | | | | | | | +------------------------------------------+ |ref_div | 42 | 20 | | | | | +------------------------------------------+ |fb_div | 10326 | 8195 | +------------------------------------------+ |fb_div | 1024 | 163 | +------------------------------------------+ |fb_dev_p | 4 | 9 | |frac fb_de^_p| | | +----------------------------+-------------+ With ref_div_max value clipped at 100, AMDGPU driver can also drive videmode 2048x1280@60 (221Mhz) and produce proper output without any blanking and distortion on the screen. PS: This value was changed from 128 to 100 in Radeon driver also, here: https://github.com/freedesktop/drm-tip/commit/4b21ce1b4b5d262e7d4656b8ececc891fc3cb806 V1: Got acks from: Acked-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]> V2: - Restricting the changes only for OLAND, just to avoid any regression for other cards. - Changed unsigned -> unsigned int to make checkpatch quiet. V3: Apply the change on SI family (not only oland) (Christian) Cc: Alex Deucher <[email protected]> Cc: Christian König <[email protected]> Cc: Eddy Qin <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Shashank Sharma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-24drm/amdgpu: Fix build with missing pm_suspend_target_state module exportBorislav Petkov1-1/+1
Building a randconfig here triggered: ERROR: modpost: "pm_suspend_target_state" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! because the module export of that symbol happens in kernel/power/suspend.c which is enabled with CONFIG_SUSPEND. The ifdef guards in amdgpu_acpi_is_s0ix_supported(), however, test for CONFIG_PM_SLEEP which is defined like this: config PM_SLEEP def_bool y depends on SUSPEND || HIBERNATE_CALLBACKS and that randconfig has: # CONFIG_SUSPEND is not set CONFIG_HIBERNATE_CALLBACKS=y leading to the module export missing. Change the ifdeffery to depend directly on CONFIG_SUSPEND. Fixes: 5706cb3c910c ("drm/amdgpu: fix checking pmops when PM_SLEEP is not enabled") Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2021-08-23drm/ttm: remove ttm_tt_destroy_common v2Christian König1-1/+0
Move the functionality into ttm_tt_fini and ttm_bo_tt_destroy instead. We don't need this any more since we removed the unbind from the destroy code paths in the drivers. Also add a warning to ttm_tt_fini() if we try to fini a still populated TT object. v2: instead of reverting the patch move the functionality to different places. Signed-off-by: Christian König <[email protected]> Reviewed-by: Thomas Hellström <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-08-23drm/amdgpu: unbind in amdgpu_ttm_tt_unpopulateChristian König1-1/+2
Doing this in amdgpu_ttm_backend_destroy() is to late. It turned out that this is not a good idea at all because it leaves pointers to freed up system memory pages in the GART tables of the drivers. Signed-off-by: Christian König <[email protected]> Reviewed-by: Thomas Hellström <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-08-20drm/amdgpu: Cancel delayed work when GFXOFF is disabledMichel Dänzer2-17/+30
schedule_delayed_work does not push back the work if it was already scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms after the first time GFXOFF was disabled and re-enabled, even if GFXOFF was disabled and re-enabled again during those 100 ms. This resulted in frame drops / stutter with the upcoming mutter 41 release on Navi 14, due to constantly enabling GFXOFF in the HW and disabling it again (for getting the GPU clock counter). To fix this, call cancel_delayed_work_sync when the disable count transitions from 0 to 1, and only schedule the delayed work on the reverse transition, not if the disable count was already 0. This makes sure the delayed work doesn't run at unexpected times, and allows it to be lock-free. v2: * Use cancel_delayed_work_sync & mutex_trylock instead of mod_delayed_work. v3: * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König) v4: * Fix race condition between amdgpu_gfx_off_ctrl incrementing adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off checking for it to be 0 (Evan Quan) Cc: [email protected] Reviewed-by: Evan Quan <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> # v3 Acked-by: Christian König <[email protected]> # v3 Signed-off-by: Michel Dänzer <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-20drm/amdgpu: use the preferred pin domain after the checkChristian König1-5/+5
For some reason we run into an use case where a BO is already pinned into GTT, but should be pinned into VRAM|GTT again. Handle that case gracefully as well. Reviewed-by: Shashank Sharma <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2021-08-20drm/amdgpu: Cancel delayed work when GFXOFF is disabledMichel Dänzer2-17/+30
schedule_delayed_work does not push back the work if it was already scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms after the first time GFXOFF was disabled and re-enabled, even if GFXOFF was disabled and re-enabled again during those 100 ms. This resulted in frame drops / stutter with the upcoming mutter 41 release on Navi 14, due to constantly enabling GFXOFF in the HW and disabling it again (for getting the GPU clock counter). To fix this, call cancel_delayed_work_sync when the disable count transitions from 0 to 1, and only schedule the delayed work on the reverse transition, not if the disable count was already 0. This makes sure the delayed work doesn't run at unexpected times, and allows it to be lock-free. v2: * Use cancel_delayed_work_sync & mutex_trylock instead of mod_delayed_work. v3: * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König) v4: * Fix race condition between amdgpu_gfx_off_ctrl incrementing adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off checking for it to be 0 (Evan Quan) Cc: [email protected] Reviewed-by: Evan Quan <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> # v3 Acked-by: Christian König <[email protected]> # v3 Signed-off-by: Michel Dänzer <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-20drm/amdgpu: use the preferred pin domain after the checkChristian König1-5/+5
For some reason we run into an use case where a BO is already pinned into GTT, but should be pinned into VRAM|GTT again. Handle that case gracefully as well. Reviewed-by: Shashank Sharma <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2021-08-18drm/amd/amdgpu:flush ttm delayed work before cancel_syncYuBiao Wang1-1/+3
[Why] In some cases when we unload driver, warning call trace will show up in vram_mgr_fini which claims that LRU is not empty, caused by the ttm bo inside delay deleted queue. [How] We should flush delayed work to make sure the delay deleting is done. Signed-off-by: YuBiao Wang <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-18drm/amd: consolidate TA shared memory structuresCandice Li4-162/+134
Signed-off-by: Candice Li <[email protected]> Reviewed-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-18drm/amdgpu: increase max xgmi physical node for aldebaranHawking Zhang1-3/+2
aldebaran supports up to 16 xgmi physical nodes. Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-18drm/amdgpu: disable BACO support for 699F:C7 polaris12 SKU temporarilyEvan Quan1-1/+8
We have a S3 issue on that SKU with BACO enabled. Will bring back this when that root caused. Signed-off-by: Evan Quan <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-18drm/amdgpu: correct MMSCH 1.0 versionZhigang Luo1-3/+1
MMSCH 1.0 doesn't have major/minor version, only verison. Signed-off-by: Zhigang Luo <[email protected]> Reviewed by Shaoyun.liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-18drm/amdgpu: get extended xgmi topology dataJonathan Kim4-14/+145
The TA has a limit to the amount of data that can be retrieved from GET_TOPOLOGY. For setups that exceed this limit, the xGMI topology needs to be re-initialized and data needs to be re-fetched from the extended link records by setting a flag in the shared command buffer. The number of hops and the number of links must be accumulated by the driver. Other data points are all fetched from the first request. Because the TA has already exceeded its link record limit, it cannot hold bidirectional information. Otherwise the driver would have to do more than two fetches so the driver has to reflect the topology information in the opposite direction. v2: squashed with internal reviewed fix Signed-off-by: Jonathan Kim <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/amd/pm: drop the unnecessary intermediate percent-based transitionEvan Quan1-0/+2
Currently, the readout of fan speed pwm is transited into percent-based and then pwm-based. However, the transition into percent-based is totally unnecessary and make the final output less accurate. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/amd/amdgpu: remove unnecessary RAS context fieldCandice Li10-14/+6
Delete ras_if->name in the RAS ctx structure and remove related lines. Signed-off-by: Candice Li <[email protected]> Reviewed-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/amd/amdgpu: consolidate PSP TA contextCandice Li11-187/+158
Signed-off-by: Candice Li <[email protected]> Reviewed-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/amdgpu: Add MB_REQ_MSG_READY_TO_RESET response when VF get FLR notification.Jiange Zhao2-1/+4
When guest received FLR notification from host, it would lock adapter into reset state. There will be no more job submission and hardware access after that. Then it should send a response to host that it has prepared for host reset. Signed-off-by: Jiange Zhao <[email protected]> Signed-off-by: Peng Ju Zhou <[email protected]> Reviewed-by: Emily.Deng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>