aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd
AgeCommit message (Collapse)AuthorFilesLines
2024-01-22drm/amd/display: Add usb4_bw_alloc_support flagPeichen Huang3-0/+77
[Why] dc should have a flag for DM to enable usb4_bw_alloc in dptx [How] - Add usb4_bw_alloc_support flag in dc_config Reviewed-by: Wayne Lin <[email protected]> Reviewed-by: Meenakshikumar Somasundaram <[email protected]> Acked-by: Roman Li <[email protected]> Signed-off-by: Peichen Huang <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amd/display: Promote DAL to 3.2.268Aric Cyr1-1/+1
Acked-by: Roman Li <[email protected]> Signed-off-by: Aric Cyr <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amd/display: create DCN3-specific log for MPC stateMelissa Wen2-1/+54
Logging DCN3 MPC state was following DCN1 implementation that doesn't consider new DCN3 MPC color blocks. Create new elements according to DCN3 MPC color caps and a new DCN3-specific function for reading MPC data. v3: - remove gamut remap reg reading in favor of fixed31_32 matrix data Signed-off-by: Melissa Wen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amd/display: Fix timing bandwidth calculation for HDMILeo (Hanghong) Ma2-0/+6
[Why && How] The current bandwidth calculation for timing doesn't account for certain HDMI modes overhead which leads to DSC can't be enabled. Add support to calculate the actual bandwidth for these HDMI modes. Reviewed-by: Chris Park <[email protected]> Acked-by: Roman Li <[email protected]> Signed-off-by: Leo (Hanghong) Ma <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amd/display: add get_gamut_remap helper for MPC3Melissa Wen2-0/+62
We want to be able to read the MPC's gamut remap matrix similar to what we do with .dpp_get_gamut_remap functions. On the other hand, we don't need a hook here because only DCN3+ has the MPC gamut remap block, being absent in previous families. Signed-off-by: Melissa Wen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amd/display: fill up DCN3 DPP color stateMelissa Wen2-2/+43
DCN3 DPP color state was uncollected and some state elements from DCN1 doesn't fit DCN3. Create new elements according to DCN3 color caps and fill them up for DTN log output. rfc-v2: - fix reading of gamcor and blnd gamma states - remove gamut remap register in favor of gamut remap matrix reading Signed-off-by: Melissa Wen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amd/display: read gamut remap matrix in fixed-point 31.32 formatMelissa Wen2-12/+29
Instead of read gamut remap data from hw values, convert HW register values (S2D13) into a fixed-point 31.32 matrix for color state log. Change DCN10 log to print data in the format of the gamut remap matrix. Signed-off-by: Melissa Wen <[email protected]> Reviewed-by: Rodrigo Siqueira <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amd/display: Add dpp_get_gamut_remap functionsHarry Wentland16-3/+256
We want to be able to read the DPP's gamut remap matrix. v2: - code-style and doc comments clean-up (Melissa) Signed-off-by: Harry Wentland <[email protected]> Signed-off-by: Melissa Wen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amd/display: decouple color state from hw state logMelissa Wen1-8/+18
Prepare to hook up color state log according to the DCN version. v3: - put functions in single line (Siqueira) Signed-off-by: Melissa Wen <[email protected]> Reviewed-by: Rodrigo Siqueira <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amd/display: Fix uninitialized variable usage in core_link_ 'read_dpcd() ↵Srinivasan Shanmugam1-2/+2
& write_dpcd()' functions The 'status' variable in 'core_link_read_dpcd()' & 'core_link_write_dpcd()' was uninitialized. Thus, initializing 'status' variable to 'DC_ERROR_UNEXPECTED' by default. Fixes the below: drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dpcd.c:226 core_link_read_dpcd() error: uninitialized symbol 'status'. drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dpcd.c:248 core_link_write_dpcd() error: uninitialized symbol 'status'. Cc: [email protected] Cc: Jerry Zuo <[email protected]> Cc: Jun Lei <[email protected]> Cc: Wayne Lin <[email protected]> Cc: Aurabindo Pillai <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: Hamza Mahfooz <[email protected]> Signed-off-by: Srinivasan Shanmugam <[email protected]> Reviewed-by: Rodrigo Siqueira <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amdgpu: update check condition of query for ras page retireTao Zhou1-2/+8
Support page retirement handling in debug mode. v2: revert smu_v13_0_6_get_ecc_info directly. Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22Revert "drm/amd/pm: smu v13_0_6 supports ecc info by default"Tao Zhou1-8/+0
This reverts commit 6fe08f56db798659beca41ab5b1727a31518f794. We use debug mode flag instead of this interface. Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amd/display: Drop kdoc markers for some Panel Replay functionsSrinivasan Shanmugam1-2/+2
Fixes the below gcc with W=1: drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dmub_replay.c:262: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Set REPLAY power optimization flags and coasting vtotal. drivers/gpu/drm/amd/amdgpu/../display/dc/dce/dmub_replay.c:284: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * send Replay general cmd to DMUB. Fixes: e379787cbc2a ("drm/amd/display: Add some functions for Panel Replay") Cc: Aurabindo Pillai <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: Leo Li <[email protected]> Cc: Tom Chung <[email protected]> Signed-off-by: Srinivasan Shanmugam <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amdgpu: Cleanup inconsistent indenting in 'amdgpu_gfx_enable_kcq()'Srinivasan Shanmugam1-2/+2
Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:645 amdgpu_gfx_enable_kcq() warn: inconsistent indenting Cc: Le Ma <[email protected]> Cc: Hawking Zhang <[email protected]> Cc: Christian König <[email protected]> Cc: Alex Deucher <[email protected]> Signed-off-by: Srinivasan Shanmugam <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amd/pm: udpate smu v13.0.6 message permissionYang Wang1-2/+2
update smu v13.0.6 message to allow guest driver set gfx clock. Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amdgpu:Support retiring multiple MCA error address pagesYiPeng Chai3-40/+77
Support retiring multiple MCA error address pages in one in-band query for umc v12_0. Signed-off-by: YiPeng Chai <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amdgpu: add interface to check mca umc statusYiPeng Chai5-5/+38
Add interface to check mca umc status. Signed-off-by: YiPeng Chai <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amdgpu: Use asynchronous polling to handle umc_v12_0 poisoningYiPeng Chai3-31/+116
Use asynchronous polling to handle umc_v12_0 poisoning. v2: 1. Change function name. 2. Change the debugging information content. Signed-off-by: YiPeng Chai <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amdgpu: Fix ras features value calltraceStanley.Yang2-5/+12
The high three bits of ras features mask indicate socket id, it should skip to check high three bits of ras features mask before disable all ras features. Signed-off-by: Stanley.Yang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amdgpu: Prepare for asynchronous processing of umc page retirementYiPeng Chai2-0/+39
Preparing for asynchronous processing of umc page retirement. Signed-off-by: YiPeng Chai <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amdgpu: Add log info for umc_v12_0YiPeng Chai2-1/+16
Add log info for umc_v12_0. v2: Delete redundant logs. Signed-off-by: YiPeng Chai <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amdgpu: fix wrong sizeof argumentSamasth Norway Ananda1-1/+1
voltage_parameters is a point to a struct of type SET_VOLTAGE_PARAMETERS_V1_3. Passing just voltage_parameters would not print the right size of the struct variable. So we need to pass *voltage_parameters to sizeof(). Fixes: 4630d5031cd8 ("drm/amdgpu: check PS, WS index") Signed-off-by: Samasth Norway Ananda <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amdgpu: Enable seq64 manager and fix bugsArunpravin Paneer Selvam11-52/+68
- Enable the seq64 mapping sequence. - Fix wflinfo va conflict and other bugs. v1: - The seq64 area needs to be included in the AMDGPU_VA_RESERVED_SIZE otherwise the areas will conflict with user space allocations (Alex) - It needs to be mapped read only in the user VM (Alex) v2: - Instead of just one define for TOP/BOTTOM reserved space separate them into two (Christian) - Fix the CPU and VA calculations and while at it also cleanup error handling and kerneldoc (Christian) Signed-off-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Arunpravin Paneer Selvam <[email protected]> Reviewed-by: Christian König <[email protected]>
2024-01-19Merge tag 'drm-next-2024-01-19' of git://anongit.freedesktop.org/drm/drmLinus Torvalds67-373/+554
Pull more drm fixes from Dave Airlie: "This is mostly amdgpu and xe fixes, with an amdkfd and nouveau fix thrown in. The amdgpu ones are just the usual couple of weeks of fixes. The xe ones are bunch of cleanups for the new xe driver, the fix you put in on the merge commit and the kconfig fix that was hiding the problem from me. amdgpu: - DSC fixes - DC resource pool fixes - OTG fix - DML2 fixes - Aux fix - GFX10 RLC firmware handling fix - Revert a broken workaround for SMU 13.0.2 - DC writeback fix - Enable gfxoff when ROCm apps are active on gfx11 with the proper FW version amdkfd: - Fix dma-buf exports using GEM handles nouveau: - fix a unneeded WARN_ON triggering xe: - Fix for definition of wakeref_t - Fix for an error code aliasing - Fix for VM_UNBIND_ALL in the case there are no bound VMAs - Fixes for a number of __iomem address space mismatches reported by sparse - Fixes for the assignment of exec_queue priority - A Fix for skip_guc_pc not taking effect - Workaround for a build problem on GCC 11 - A couple of fixes for error paths - Fix a Flat CCS compression metadata copy issue - Fix a misplace array bounds checking - Don't have display support depend on EXPERT (as discussed on IRC)" * tag 'drm-next-2024-01-19' of git://anongit.freedesktop.org/drm/drm: (71 commits) nouveau/vmm: don't set addr on the fail path to avoid warning drm/amdgpu: Enable GFXOFF for Compute on GFX11 drm/amd/display: Drop 'acrtc' and add 'new_crtc_state' NULL check for writeback requests. drm/amdgpu: revert "Adjust removal control flow for smu v13_0_2" drm/amdkfd: init drm_client with funcs hook drm/amd/display: Fix a switch statement in populate_dml_output_cfg_from_stream_state() drm/amdgpu: Fix the null pointer when load rlc firmware drm/amd/display: Align the returned error code with legacy DP drm/amd/display: Fix DML2 watermark calculation drm/amd/display: Clear OPTC mem select on disable drm/amd/display: Port DENTIST hang and TDR fixes to OTG disable W/A drm/amd/display: Add logging resource checks drm/amd/display: Init link enc resources in dc_state only if res_pool presents drm/amd/display: Fix late derefrence 'dsc' check in 'link_set_dsc_pps_packet()' drm/amd/display: Avoid enum conversion warning drm/amd/pm: Fix smuv13.0.6 current clock reporting drm/amd/pm: Add error log for smu v13.0.6 reset drm/amdkfd: Fix 'node' NULL check in 'svm_range_get_range_boundaries()' drm/amdgpu: drop exp hw support check for GC 9.4.3 drm/amdgpu: move debug options init prior to amdgpu device init ...
2024-01-18Merge tag 'i2c-for-6.8-rc1-rebased' of ↵Linus Torvalds5-5/+0
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c updates from Wolfram Sang: "This removes the currently unused CLASS_DDC support (controllers set the flag, but there is no client to use it). Also, CLASS_SPD support gets simplified to prepare removal in the future. Class based instantiation is not recommended these days anyhow. Furthermore, I2C core now creates a debugfs directory per I2C adapter. Current bus driver users were converted to use it. Finally, quite some driver updates. Standing out are patches for the wmt-driver which is refactored to support more variants. This is the rebased pull request where a large series for the designware driver was dropped" * tag 'i2c-for-6.8-rc1-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (38 commits) MAINTAINERS: use proper email for my I2C work i2c: stm32f7: add support for stm32mp25 soc i2c: stm32f7: perform I2C_ISR read once at beginning of event isr dt-bindings: i2c: document st,stm32mp25-i2c compatible i2c: stm32f7: simplify status messages in case of errors i2c: stm32f7: perform most of irq job in threaded handler i2c: stm32f7: use dev_err_probe upon calls of devm_request_irq i2c: i801: Add lis3lv02d for Dell XPS 15 7590 i2c: i801: Add lis3lv02d for Dell Precision 3540 i2c: wmt: Reduce redundant: REG_CR setting i2c: wmt: Reduce redundant: function parameter i2c: wmt: Reduce redundant: clock mode setting i2c: wmt: Reduce redundant: wait event complete i2c: wmt: Reduce redundant: bus busy check i2c: mux: reg: Remove class-based device auto-detection support i2c: make i2c_bus_type const dt-bindings: at24: add ROHM BR24G04 eeprom: at24: use of_match_ptr() i2c: cpm: Remove linux,i2c-index conversion from be32 i2c: imx: Make SDA actually optional for bus recovering ...
2024-01-18drm/amdgpu: Enable GFXOFF for Compute on GFX11Ori Messinger1-4/+2
On GFX version 11, GFXOFF was disabled due to a MES KIQ firmware issue, which has since been fixed after version 64. This patch only re-enables GFXOFF for GFX version 11 if the GPU's MES KIQ firmware version is newer than version 64. V2: Keep GFXOFF disabled on GFX11 if MES KIQ is below version 64. V3: Add parentheses to avoid GCC warning for parentheses: "suggest parentheses around comparison in operand of ‘&’" V4: Remove "V3" from commit title V5: Change commit description and insert 'Acked-by' Signed-off-by: Ori Messinger <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Harish Kasiviswanathan <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2024-01-18drm/amd/display: Drop 'acrtc' and add 'new_crtc_state' NULL check for ↵Srinivasan Shanmugam1-3/+3
writeback requests. Return value of 'to_amdgpu_crtc' which is container_of(...) can't be null, so it's null check 'acrtc' is dropped. Fixing the below: drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:9302 amdgpu_dm_atomic_commit_tail() error: we previously assumed 'acrtc' could be null (see line 9299) Added 'new_crtc_state' NULL check for function 'drm_atomic_get_new_crtc_state' that retrieves the new state for a CRTC, while enabling writeback requests. Cc: [email protected] Cc: Alex Hung <[email protected]> Cc: Aurabindo Pillai <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: Hamza Mahfooz <[email protected]> Signed-off-by: Srinivasan Shanmugam <[email protected]> Reviewed-by: Alex Hung <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18drm/amdgpu: revert "Adjust removal control flow for smu v13_0_2"Christian König4-65/+1
Calling amdgpu_device_ip_resume_phase1() during shutdown leaves the HW in an active state and is an unbalanced use of the IP callbacks. Using the IP callbacks like this can lead to memory leaks, double free and imbalanced reference counters. Leaving the HW in an active state can lead to DMA accesses to memory now freed by the driver. Both is a complete no-go for driver unload so completely revert the workaround for now. This reverts commit f5c7e7797060255dbc8160734ccc5ad6183c5e04. Signed-off-by: Christian König <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2024-01-18drm/amdkfd: init drm_client with funcs hookFlora Cui1-1/+4
otherwise drm_client_dev_unregister() would try to kfree(&adev->kfd.client). Fixes: 1819200166ce ("drm/amdkfd: Export DMABufs from KFD using GEM handles") Signed-off-by: Flora Cui <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18drm/amd/display: Fix a switch statement in ↵Christophe JAILLET1-1/+1
populate_dml_output_cfg_from_stream_state() It is likely that the statement related to 'dml_edp' is misplaced. So move it in the correct "case SIGNAL_TYPE_EDP". Fixes: 7966f319c66d ("drm/amd/display: Introduce DML2") Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2024-01-18drm/amdgpu: Fix the null pointer when load rlc firmwareMa Jun1-9/+6
If the RLC firmware is invalid because of wrong header size, the pointer to the rlc firmware is released in function amdgpu_ucode_request. There will be a null pointer error in subsequent use. So skip validation to fix it. Fixes: 3da9b71563cb ("drm/amd: Use `amdgpu_ucode_*` helpers for GFX10") Signed-off-by: Ma Jun <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2024-01-18drm/amd/display: Align the returned error code with legacy DPWayne Lin1-0/+5
[Why] For usb4 connector, AUX transaction is handled by dmub utilizing a differnt code path comparing to legacy DP connector. If the usb4 DP connector is disconnected, AUX access will report EBUSY and cause igt@kms_dp_aux_dev fail. [How] Align the error code with the one reported by legacy DP as EIO. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Acked-by: Alex Hung <[email protected]> Signed-off-by: Wayne Lin <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18drm/amd/display: Fix DML2 watermark calculationOvidiu Bunea1-7/+7
[Why] core_mode_programming in DML2 should output watermark calculations to locals, but it incorrectly uses mode_lib [How] update code to match HW DML2 Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Reviewed-by: Charlene Liu <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: Ovidiu Bunea <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18drm/amd/display: Clear OPTC mem select on disableIlya Bakoulin2-0/+6
[Why] Not clearing the memory select bits prior to OPTC disable can cause DSC corruption issues when attempting to reuse a memory instance for another OPTC that enables ODM. [How] Clear the memory select bits prior to disabling an OPTC. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Reviewed-by: Charlene Liu <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: Ilya Bakoulin <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18drm/amd/display: Port DENTIST hang and TDR fixes to OTG disable W/ANicholas Kazlauskas1-12/+9
[Why] We can experience DENTIST hangs during optimize_bandwidth or TDRs if FIFO is toggled and hangs. [How] Port the DCN35 fixes to DCN314. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Reviewed-by: Charlene Liu <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: Nicholas Kazlauskas <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18drm/amd/display: Add logging resource checksCharlene Liu3-3/+10
[Why] When mapping resources, resources could be unavailable. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Reviewed-by: Sung joon Kim <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: Charlene Liu <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18drm/amd/display: Init link enc resources in dc_state only if res_pool presentsDillon Varone1-1/+2
[Why & How] res_pool is not initialized in all situations such as virtual environments, and therefore link encoder resources should not be initialized if res_pool is NULL. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Reviewed-by: Martin Leung <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: Dillon Varone <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18drm/amd/display: Fix late derefrence 'dsc' check in 'link_set_dsc_pps_packet()'Srinivasan Shanmugam1-2/+6
In link_set_dsc_pps_packet(), 'struct display_stream_compressor *dsc' was dereferenced in a DC_LOGGER_INIT(dsc->ctx->logger); before the 'dsc' NULL pointer check. Fixes the below: drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_dpms.c:905 link_set_dsc_pps_packet() warn: variable dereferenced before check 'dsc' (see line 903) Cc: [email protected] Cc: Aurabindo Pillai <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: Hamza Mahfooz <[email protected]> Cc: Wenjing Liu <[email protected]> Cc: Qingqing Zhuo <[email protected]> Signed-off-by: Srinivasan Shanmugam <[email protected]> Reviewed-by: Aurabindo Pillai <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18drm/amd/pm: enable amdgpu smu send message logYang Wang1-1/+8
v1: enable amdgpu smu driver message log. v2: add smu/pmfw response value into debug log. Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Kenneth Feng <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18drm/amdgpu: update error condition check for umc_v12_0_query_error_addressTao Zhou1-4/+1
Deferred error is also taken into account. Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18drm/amdgpu: Skip do PCI error slot reset during RAS recoveryStanley.Yang1-0/+14
Why: The PCI error slot reset maybe triggered after inject ue to UMC multi times, this caused system hang. [ 557.371857] amdgpu 0000:af:00.0: amdgpu: GPU reset succeeded, trying to resume [ 557.373718] [drm] PCIE GART of 512M enabled. [ 557.373722] [drm] PTB located at 0x0000031FED700000 [ 557.373788] [drm] VRAM is lost due to GPU reset! [ 557.373789] [drm] PSP is resuming... [ 557.547012] mlx5_core 0000:55:00.0: mlx5_pci_err_detected Device state = 1 pci_status: 0. Exit, result = 3, need reset [ 557.547067] [drm] PCI error: detected callback, state(1)!! [ 557.547069] [drm] No support for XGMI hive yet... [ 557.548125] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 0. Enter [ 557.607763] mlx5_core 0000:55:00.0: wait vital counter value 0x16b5b after 1 iterations [ 557.607777] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 1. Exit, err = 0, result = 5, recovered [ 557.610492] [drm] PCI error: slot reset callback!! ... [ 560.689382] amdgpu 0000:3f:00.0: amdgpu: GPU reset(2) succeeded! [ 560.689546] amdgpu 0000:5a:00.0: amdgpu: GPU reset(2) succeeded! [ 560.689562] general protection fault, probably for non-canonical address 0x5f080b54534f611f: 0000 [#1] SMP NOPTI [ 560.701008] CPU: 16 PID: 2361 Comm: kworker/u448:9 Tainted: G OE 5.15.0-91-generic #101-Ubuntu [ 560.712057] Hardware name: Microsoft C278A/C278A, BIOS C2789.5.BS.1C11.AG.1 11/08/2023 [ 560.720959] Workqueue: amdgpu-reset-hive amdgpu_ras_do_recovery [amdgpu] [ 560.728887] RIP: 0010:amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu] [ 560.736891] Code: ff 41 89 c6 e9 1b ff ff ff 44 0f b6 45 b0 e9 4f ff ff ff be 01 00 00 00 4c 89 e7 e8 76 c9 8b ff 44 0f b6 45 b0 e9 3c fd ff ff <48> 83 ba 18 02 00 00 00 0f 84 6a f8 ff ff 48 8d 7a 78 be 01 00 00 [ 560.757967] RSP: 0018:ffa0000032e53d80 EFLAGS: 00010202 [ 560.763848] RAX: ffa00000001dfd10 RBX: ffa0000000197090 RCX: ffa0000032e53db0 [ 560.771856] RDX: 5f080b54534f5f07 RSI: 0000000000000000 RDI: ff11000128100010 [ 560.779867] RBP: ffa0000032e53df0 R08: 0000000000000000 R09: ffffffffffe77f08 [ 560.787879] R10: 0000000000ffff0a R11: 0000000000000001 R12: 0000000000000000 [ 560.795889] R13: ffa0000032e53e00 R14: 0000000000000000 R15: 0000000000000000 [ 560.803889] FS: 0000000000000000(0000) GS:ff11007e7e800000(0000) knlGS:0000000000000000 [ 560.812973] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 560.819422] CR2: 000055a04c118e68 CR3: 0000000007410005 CR4: 0000000000771ee0 [ 560.827433] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 560.835433] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 560.843444] PKRU: 55555554 [ 560.846480] Call Trace: [ 560.849225] <TASK> [ 560.851580] ? show_trace_log_lvl+0x1d6/0x2ea [ 560.856488] ? show_trace_log_lvl+0x1d6/0x2ea [ 560.861379] ? amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu] [ 560.867778] ? show_regs.part.0+0x23/0x29 [ 560.872293] ? __die_body.cold+0x8/0xd [ 560.876502] ? die_addr+0x3e/0x60 [ 560.880238] ? exc_general_protection+0x1c5/0x410 [ 560.885532] ? asm_exc_general_protection+0x27/0x30 [ 560.891025] ? amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu] [ 560.898323] amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu] [ 560.904520] process_one_work+0x228/0x3d0 How: In RAS recovery, mode-1 reset is issued from RAS fatal error handling and expected all the nodes in a hive to be reset. no need to issue another mode-1 during this procedure. Signed-off-by: Stanley.Yang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18drm/amdgpu: Show deferred error count for UMCStanley.Yang1-2/+6
Show deferred error count for UMC syfs node Signed-off-by: Stanley.Yang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18drm/amdgpu: Enable GFXOFF for Compute on GFX11Ori Messinger1-4/+2
On GFX version 11, GFXOFF was disabled due to a MES KIQ firmware issue, which has since been fixed after version 64. This patch only re-enables GFXOFF for GFX version 11 if the GPU's MES KIQ firmware version is newer than version 64. V2: Keep GFXOFF disabled on GFX11 if MES KIQ is below version 64. V3: Add parentheses to avoid GCC warning for parentheses: "suggest parentheses around comparison in operand of ‘&’" V4: Remove "V3" from commit title V5: Change commit description and insert 'Acked-by' Signed-off-by: Ori Messinger <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Harish Kasiviswanathan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18drm/amdgpu: fix UBSAN array-index-out-of-bounds for ras_block_string[]Yang Wang1-1/+4
fix array index out of bounds issue for ras_block_string[] array. Fixes: 30df05fb74f6 ("drm/amdgpu: Align ras block enum with firmware") Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18drm/amd/amdgpu: Update RLC_SPM_MC_CNT by ring wreg in guestYuanShang8-13/+23
Submit command of wreg in GFX and COMPUTE ring to update RLC_SPM_MC_CNT in guest machine during runtime. Signed-off-by: YuanShang <[email protected]> Reviewed-by: Emily Deng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18drm/amd/display: Drop 'acrtc' and add 'new_crtc_state' NULL check for ↵Srinivasan Shanmugam1-3/+3
writeback requests. Return value of 'to_amdgpu_crtc' which is container_of(...) can't be null, so it's null check 'acrtc' is dropped. Fixing the below: drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:9302 amdgpu_dm_atomic_commit_tail() error: we previously assumed 'acrtc' could be null (see line 9299) Added 'new_crtc_state' NULL check for function 'drm_atomic_get_new_crtc_state' that retrieves the new state for a CRTC, while enabling writeback requests. Cc: [email protected] Cc: Alex Hung <[email protected]> Cc: Aurabindo Pillai <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: Hamza Mahfooz <[email protected]> Signed-off-by: Srinivasan Shanmugam <[email protected]> Reviewed-by: Alex Hung <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18drm/amdgpu: Remove unnecessary NULL checkFelix Kuehling1-1/+1
A static checker pointed out, that bo_va->base.bo was already derefenced earlier in the same scope. Therefore this check is unnecessary here. Reported-by: Dan Carpenter <[email protected]> Fixes: 50661eb1a2c8 ("drm/amdgpu: Auto-validate DMABuf imports in compute VMs") Reviewed-by: Kent Russell <[email protected]> Signed-off-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18drm/amdgpu: revert "Adjust removal control flow for smu v13_0_2"Christian König4-65/+1
Calling amdgpu_device_ip_resume_phase1() during shutdown leaves the HW in an active state and is an unbalanced use of the IP callbacks. Using the IP callbacks like this can lead to memory leaks, double free and imbalanced reference counters. Leaving the HW in an active state can lead to DMA accesses to memory now freed by the driver. Both is a complete no-go for driver unload so completely revert the workaround for now. This reverts commit f5c7e7797060255dbc8160734ccc5ad6183c5e04. Signed-off-by: Christian König <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18drm/amdgpu: Remove usage of the deprecated ida_simple_xx() APIChristophe JAILLET1-4/+3
ida_alloc() and ida_free() should be preferred to the deprecated ida_simple_get() and ida_simple_remove(). Note that the upper limit of ida_simple_get() is exclusive, but the one of ida_alloc_range() is inclusive. So a -1 has been added when needed. Signed-off-by: Christophe JAILLET <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18drm/amdkfd: init drm_client with funcs hookFlora Cui1-1/+4
otherwise drm_client_dev_unregister() would try to kfree(&adev->kfd.client). Fixes: 1819200166ce ("drm/amdkfd: Export DMABufs from KFD using GEM handles") Signed-off-by: Flora Cui <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>