path: root/drivers/gpu
Age  Commit message  Author  Files  Lines
2024-06-20drm/i915/dsb: Convert the DSB code to use intel_display rather than i915Ville Syrjälä1-26/+26
The future direction will be to mainly use intel_display rather than i915 in the display code. Start on that path for the DSB code. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Jani Nikula <[email protected]>
2024-06-20drm/i915/dsb: Plumb the whole atomic state into intel_dsb_prepare()Ville Syrjälä3-6/+11
The DSB code will need to examine both the old and new crtc states. Pass in the whole atomic state so we can dig up what we need. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Jani Nikula <[email protected]>
2024-06-20drm/i915: Pass the whole atomic state to intel_color_prepare_commit()Ville Syrjälä3-9/+13
We'll need to examine both the old and new crtc states in intel_color_prepare_commit(), so let's just pass in the whole atomic state. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Jani Nikula <[email protected]>
2024-06-20drm/i915: Introduce intel_mode_vdisplay()Ville Syrjälä2-0/+11
The DSB code will need to know the hardware's idea of vertical active, as that is also what defines the start of undelayed vblank. Introduce a helper that gives us that information, in line with the other intel_mode_v*() functions. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Jani Nikula <[email protected]>
2024-06-20drm/i915: Add flip done tracepointVille Syrjälä2-0/+24
Add a tracepoint to see exactly when async flips complete. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Jani Nikula <[email protected]>
2024-06-20drm/i915: Add async flip tracepointVille Syrjälä4-7/+50
Add a separate tracepoint for async flips vs. sync plane updates to make it a bit easier to figure out what is happening. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Jani Nikula <[email protected]>
2024-06-20drm/i915: Extract intel_crtc_arm_vblank_event()Ville Syrjälä2-10/+20
We'll need to arm the vblank event also from the future DSB based codepath. Extract the function that does the whole dance for us. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Jani Nikula <[email protected]>
2024-06-20drm/xe/guc: Move ARAT interrupts enabling to the upload stepMichal Wajdeczko1-3/+3
Even though ARAT interrupts are enabled by default, we still want to keep the code that enables them. But instead of doing that in the CTB enabling step, move this code to the upload step, where we already set up a few other registers related to GuC. Signed-off-by: Michal Wajdeczko <[email protected]> Cc: Matthew Brost <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-06-20drm/tests: add drm_hdmi_state_helper_test MODULE_DESCRIPTION()Jeff Johnson1-0/+1
make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/gpu/drm/tests/drm_hdmi_state_helper_test.o Add the missing invocation of the MODULE_DESCRIPTION() macro. Fixes: eb66d34d793e ("drm/tests: Add output bpc tests") Signed-off-by: Jeff Johnson <[email protected]> Signed-off-by: Maxime Ripard <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/20240619-md-drm-tests-drm_hdmi_state_helper_test-v1-1-41c5fe2fdb4a@quicinc.com
2024-06-20drm/vc4: vec: Add the margin properties to the connectorDave Stevenson1-0/+2
All the handling for the properties was present, but they were never attached to the connector to allow userspace to change them. Add them to the connector. Signed-off-by: Dave Stevenson <[email protected]> Reviewed-by: Maxime Ripard <[email protected]> Signed-off-by: Maxime Ripard <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-06-20drm/vc4: Add monochrome mode to the VEC.Dave Stevenson1-1/+28
The VEC supports not producing colour bursts for monochrome output. It also has an option for disabling the chroma input to remove chroma from the signal. Now that there is a DRM_MODE_TV_MODE_MONOCHROME defined, plumb this in. Signed-off-by: Dave Stevenson <[email protected]> Signed-off-by: Maxime Ripard <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-06-20drm/ttm/tests: Let ttm_bo_test consider different ww_mutex implementation.Sebastian Andrzej Siewior1-1/+7
PREEMPT_RT has a different locking implementation for ww_mutex. The base mutex of struct ww_mutex is declared as struct WW_MUTEX_BASE. The latter is defined as `mutex' for non-PREEMPT_RT builds and `rt_mutex' for PREEMPT_RT builds. Using mutex_lock() directly on the base mutex in ttm_bo_reserve_deadlock() leads to a compile error on PREEMPT_RT. The locking-selftest has its own defines to deal with this and it is probably best to define the needed one within the test program since their usefulness is limited outside of well known selftests. Provide ww_mutex_base_lock() which points to the correct function for PREEMPT_RT and non-PREEMPT_RT builds. Fixes: 995279d280d1e ("drm/ttm/tests: Add tests for ttm_bo functions") Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Maxime Ripard <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
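The macro selection described above can be sketched in userspace as follows. This is a minimal illustration, not the patch itself: the two lock types and their lock functions are reduced to stand-ins with a single flag, and lock_base_directly() is a hypothetical helper that plays the role of the test code that previously called mutex_lock() on the base.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel lock types (assumption: the real
 * structures and lock functions are far more involved). */
struct mutex { bool locked; };
struct rt_mutex { bool locked; };

static inline void mutex_lock(struct mutex *m) { m->locked = true; }
static inline void rt_mutex_lock(struct rt_mutex *m) { m->locked = true; }

/* Pick the base type and its lock function from the build configuration. */
#ifdef CONFIG_PREEMPT_RT
#define WW_MUTEX_BASE rt_mutex
#define ww_mutex_base_lock(b) rt_mutex_lock(b)
#else
#define WW_MUTEX_BASE mutex
#define ww_mutex_base_lock(b) mutex_lock(b)
#endif

struct ww_mutex { struct WW_MUTEX_BASE base; };

/* Hypothetical helper: lock the base mutex without naming its type. */
static inline void lock_base_directly(struct ww_mutex *ww)
{
	assert(ww != NULL);
	ww_mutex_base_lock(&ww->base);
}
```

Because the caller only ever uses ww_mutex_base_lock(), the same source compiles whether or not CONFIG_PREEMPT_RT is defined.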
2024-06-20drm/xe/vf: Don't touch GuC irq registers if using memory irqsMichal Wajdeczko1-2/+2
On platforms where VFs are using memory based interrupts, we missed invalid access to no longer existing interrupt registers, as we keep them marked with XE_REG_OPTION_VF. To fix that, either set up memirq vectors in GuC or enable legacy interrupts. Fixes: aef4eb7c7dec ("drm/xe/vf: Setup memory based interrupts in GuC") Signed-off-by: Michal Wajdeczko <[email protected]> Cc: Matt Roper <[email protected]> Reviewed-by: Matt Roper <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] (cherry picked from commit f0ccd2d805e55e12b430d5d6b9acd9f891af455e) Signed-off-by: Thomas Hellström <[email protected]>
2024-06-20drm/i915/gem: Use the correct format specifier for resource_size_tAndi Shyti1-2/+2
Commit 05da7d9f717b ("drm/i915/gem: Downgrade stolen lmem setup warning") adds a debug message where the "lmem_size" and "dsm_base" variables are printed using the %lli identifier. However, these variables are defined as resource_size_t, which are unsigned long for 32-bit machines and unsigned long long for 64-bit machines. The documentation (core-api/printk-formats.rst) recommends using the %pa specifier for printing addresses and sizes of resources. Replace %lli with %pa. This patch also mutes the following sparse warning when compiling with: make W=1 ARCH=i386 drivers/gpu/drm/i915 >> drivers/gpu/drm/i915/gem/i915_gem_stolen.c:941:5: error: format '%lli' expects argument of type 'long long int', but argument 5 has type 'resource_size_t' {aka 'unsigned int'} [-Werror=format=] Signed-off-by: Andi Shyti <[email protected]> Cc: Jonathan Cavitt <[email protected]> Reviewed-by: Jonathan Cavitt <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-06-20drm/i915/gem: Return NULL instead of '0'Andi Shyti1-1/+1
Commit 05da7d9f717b ("drm/i915/gem: Downgrade stolen lmem setup warning") returns '0' from i915_gem_stolen_lmem_setup(), but it's supposed to return a pointer to the intel_memory_region structure. Sparse complains with the following message: >> drivers/gpu/drm/i915/gem/i915_gem_stolen.c:943:32: sparse: sparse: Using plain integer as NULL pointer Return NULL. Signed-off-by: Andi Shyti <[email protected]> Cc: Jonathan Cavitt <[email protected]> Reviewed-by: Jonathan Cavitt <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-06-19drm/amdgpu: init TA fw for psp v14Likun Gao1-0/+5
Add support to init TA firmware for psp v14. Signed-off-by: Likun Gao <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-19drm/amdgpu: cleanup MES11 command submissionChristian König1-28/+48
The approach of having a separate WB slot for each submission doesn't really work well and for example breaks GPU reset. Use a status query packet for the fence update instead. Since those should always succeed, we can use the fence of the original packet to signal the state of the operation. While at it, clean up the coding style. Fixes: eef016ba8986 ("drm/amdgpu/mes11: Use a separate fence per transaction") Reviewed-by: Mukul Joshi <[email protected]> Signed-off-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-19drm/amdgpu: fix UBSAN warning in kv_dpm.cAlex Deucher1-0/+2
Adds bounds check for sumo_vid_mapping_entry. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3392 Reviewed-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
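The kind of bounds check this commit adds can be sketched as below. The sketch assumes a firmware-provided row carries an index into a fixed-size mapping table. All names here (MAX_ENTRIES, struct fw_row, fill_vid_mapping) are hypothetical and the layout is simplified, so it should be read as an illustration of the pattern rather than the driver's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 8 /* hypothetical capacity of the mapping table */

struct vid_mapping_entry { uint16_t vid_2bit; uint16_t vid_7bit; };

struct vid_mapping_table {
	struct vid_mapping_entry entries[MAX_ENTRIES];
};

/* Hypothetical row as it might arrive from a firmware table. */
struct fw_row { uint32_t sclk; uint16_t voltage_index; uint16_t voltage_id; };

/* Copy rows into the table, skipping any row whose index is out of range.
 * Returns the number of rows actually stored. */
static inline size_t fill_vid_mapping(struct vid_mapping_table *out,
				      const struct fw_row *rows, size_t n)
{
	size_t stored = 0;

	assert(out != NULL && rows != NULL);
	for (size_t i = 0; i < n; i++) {
		if (rows[i].sclk == 0)
			continue; /* unused row */
		if (rows[i].voltage_index >= MAX_ENTRIES)
			continue; /* the bounds check: never index past the array */
		out->entries[rows[i].voltage_index].vid_7bit = rows[i].voltage_id;
		out->entries[rows[i].voltage_index].vid_2bit = (uint16_t)i;
		stored++;
	}
	return stored;
}
```

Without the second check, an index taken straight from the firmware data could write past the end of entries[], which is the class of access UBSAN reports.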
2024-06-19drm/radeon: fix UBSAN warning in kv_dpm.cAlex Deucher1-0/+2
Adds bounds check for sumo_vid_mapping_entry. Reviewed-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2024-06-19drm/etnaviv: don't disable TS on MMUv2 core when moving the linear windowLucas Stach1-2/+5
On MMUv2 cores the linear window is only relevant when starting the FE, before the MMU has been activated. Once the MMU is active, all accesses are translated with no way to bypass the MMU via the linear window. Thus TS ignoring the linear window offset is not an issue on cores with MMUv2 present and there is no need to disable TS when we need to move the linear window. Signed-off-by: Lucas Stach <[email protected]> Tested-by: Joao Paulo Goncalves <[email protected]>
2024-06-19drm/etnaviv: Read some FE registers twiceDerek Foreman1-0/+8
On some hardware (such as the GC7000 rev 6009), these registers need to be read twice to return the correct value. Hide that in gpu_read(). Signed-off-by: Derek Foreman <[email protected]> Signed-off-by: Lucas Stach <[email protected]>
2024-06-19drm/amd/display: Disable CONFIG_DRM_AMD_DC_FP for RISC-V with clangNathan Chancellor1-1/+1
Commit 77acc6b55ae4 ("riscv: add support for kernel-mode FPU") and commit a28e4b672f04 ("drm/amd/display: use ARCH_HAS_KERNEL_FPU_SUPPORT") enabled support for CONFIG_DRM_AMD_DC_FP with RISC-V. Unfortunately, this exposed -Wframe-larger-than warnings (which become fatal with CONFIG_WERROR=y) when building ARCH=riscv allmodconfig with clang: drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn32/display_mode_vba_32.c:58:13: error: stack frame size (2448) exceeds limit (2048) in 'DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation' [-Werror,-Wframe-larger-than] 58 | static void DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation( | ^ 1 error generated. Many functions in this file use a large number of parameters, which must be passed on the stack at a certain point due to register exhaustion, which can cause high stack usage when inlining and issues with stack slot analysis get involved. While the compiler can and should do better (as GCC uses less than half the amount of stack space for the same function), it is not as simple a fix as adjusting the functions not to take a large number of parameters. Unfortunately, modifying these files to avoid the problem is a difficult-to-justify approach because any revisions to the files in the kernel tree never make it back to the original source (so copies of the code for newer hardware revisions just reintroduce the issue) and the files are hard to read/modify due to being "gcc-parsable HW gospel, coming straight from HW engineers". Avoid building the problematic code for RISC-V by modifying the existing condition for arm64 that exists for the same reason. Factor out the logical not to make the condition a little more readable naturally.
Fixes: a28e4b672f04 ("drm/amd/display: use ARCH_HAS_KERNEL_FPU_SUPPORT") Reported-by: Palmer Dabbelt <[email protected]> Closes: https://lore.kernel.org/[email protected]/ Reviewed-by: Harry Wentland <[email protected]> Signed-off-by: Nathan Chancellor <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-19drm/amd/display: Attempt to avoid empty TUs when endpoint is DPIAMichael Strauss3-1/+75
[WHY] Empty SST TUs are illegal to transmit over a USB4 DP tunnel. Current policy is to configure stream encoder to pack 2 pixels per pclk even when ODM combine is not in use, allowing seamless dynamic ODM reconfiguration. However, in extreme edge cases where average pixel count per TU is less than 2, this can lead to unexpected empty TU generation during compliance testing. For example, VIC 1 with a 1xHBR3 link configuration will average 1.98 pix/TU. [HOW] Calculate average pixel count per TU, and block 2 pixels per clock if endpoint is a DPIA tunnel and pixel clock is low enough that we will never require 2:1 ODM combine. Cc: [email protected] # 6.6+ Reviewed-by: Wenjing Liu <[email protected]> Acked-by: Hamza Mahfooz <[email protected]> Signed-off-by: Michael Strauss <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
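The 1.98 pix/TU figure quoted above can be reproduced with a short calculation. The sketch below rests on assumptions that are not stated in the commit message: a transfer unit of 64 link symbols, an HBR3 symbol clock of 810 MHz per lane (8.1 Gbit/s with 8b/10b coding), and a VIC 1 pixel clock of 25.175 MHz. The function names are hypothetical, and the blocking condition is one reading of the [HOW] paragraph, not the driver's actual logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TU_SIZE_SYMBOLS 64u /* assumption: 64 link symbols per transfer unit */

/* Average pixels per TU, scaled by 100 to stay in integer arithmetic.
 * One TU lasts TU_SIZE_SYMBOLS symbol clocks, so the number of pixels that
 * arrive during one TU is pixel_clock * TU_SIZE_SYMBOLS / symbol_clock. */
static inline uint64_t avg_pixels_per_tu_x100(uint64_t pixel_clock_khz,
					      uint64_t link_symbol_clock_khz)
{
	assert(link_symbol_clock_khz != 0);
	return pixel_clock_khz * TU_SIZE_SYMBOLS * 100u / link_symbol_clock_khz;
}

/* Hypothetical policy check: refuse 2 pixels per clock only on a DPIA
 * endpoint, when 2:1 ODM combine can never be needed, and when fewer than
 * two pixels arrive per TU on average. */
static inline bool should_block_two_pixels_per_clock(bool is_dpia,
						     bool may_need_odm_combine,
						     uint64_t pix_per_tu_x100)
{
	return is_dpia && !may_need_odm_combine && pix_per_tu_x100 < 200u;
}
```

With those assumptions, avg_pixels_per_tu_x100(25175, 810000) evaluates to 198, i.e. 1.98 pixels per TU after truncation, which is below the two pixels needed to fill a 2-pixel-per-clock packing step in every TU.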
2024-06-19drm/amd/display: change dram_clock_latency to 34us for dcn35Paul Hsieh1-1/+1
[Why & How] Current DRAM setting would cause underflow on customer platform. Modify dram_clock_change_latency_us from 11.72 to 34.0 us as per recommendation from HW team Reviewed-by: Nicholas Kazlauskas <[email protected]> Acked-by: Zaeem Mohamed <[email protected]> Signed-off-by: Paul Hsieh <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-19drm/amd/display: Change dram_clock_latency to 34us for dcn351Daniel Miess1-1/+1
[Why] Intermittent underflow observed when using 4k144 display on dcn351 [How] Update dram_clock_change_latency_us from 11.72us to 34us Reviewed-by: Nicholas Kazlauskas <[email protected]> Acked-by: Zaeem Mohamed <[email protected]> Signed-off-by: Daniel Miess <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-19drm/amdgpu: revert "take runtime pm reference when we attach a buffer" v2Christian König3-51/+0
This reverts commit b8c415e3bf98 ("drm/amdgpu: take runtime pm reference when we attach a buffer") and commit 425285d39afd ("drm/amdgpu: add amdgpu runpm usage trace for separate funcs"). Taking a runtime pm reference for DMA-buf is actually completely unnecessary and even dangerous. The problem is that calling pm_runtime_get_sync() from the DMA-buf callbacks is illegal because we have the reservation locked here which is also taken during resume. So this would deadlock. When the buffer is in GTT it is still accessible even when the GPU is powered down and when it is in VRAM the buffer gets migrated to GTT before powering down. The only use case which would make it mandatory to keep the runtime pm reference would be if we pin the buffer into VRAM, and that's not something we currently do. v2: improve the commit message Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> CC: [email protected]
2024-06-19drm/amdgpu: Indicate CU havest info to CPHarish Kasiviswanathan1-2/+13
To achieve full occupancy, CP hardware needs to know whether the CUs in an SE are symmetrically or asymmetrically harvested. v2: Reset is_symmetric_cus for each loop. Signed-off-by: Harish Kasiviswanathan <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-19drm/amd/display: prevent register access while in IPSHamza Mahfooz1-0/+10
We can't read/write to DCN registers while in IPS, since that can cause the system to hang. So, before proceeding with the access in that scenario, force the system out of IPS. Cc: [email protected] # 6.6+ Reviewed-by: Roman Li <[email protected]> Signed-off-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-19drm/amdgpu: fix locking scope when flushing tlbYunxiang Li1-32/+34
Which method is used to flush tlb does not depend on whether a reset is in progress or not. We should skip flush altogether if the GPU will get reset. So put both paths under reset_domain read lock. Signed-off-by: Yunxiang Li <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]> CC: [email protected]
2024-06-19drm/amd/display: Remove redundant idle optimization checkRoman Li1-3/+0
[Why] Disabling idle optimization for each atomic commit is unnecessary, and can lead to a potential race condition. [How] Remove idle optimization check from amdgpu_dm_atomic_commit_tail() Fixes: 196107eb1e15 ("drm/amd/display: Add IPS checks before dcn register access") Cc: [email protected] Reviewed-by: Hamza Mahfooz <[email protected]> Acked-by: Roman Li <[email protected]> Signed-off-by: Roman Li <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-19drm/i915: Enable plane/pipeDMC ATS fault interrupts on mtlVille Syrjälä2-0/+12
MTL has some new IOMMU thing that has a few new fault interrupts. Enable those so we can know if things are going poorly. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Jani Nikula <[email protected]>
2024-06-19drm/i915: Enable pipeDMC fault interrupts on tgl+Ville Syrjälä2-2/+15
PipeDMC has its own fault interrupt. Enable that so that we can know if things are failing. While at it, define the other pipeDMC interrupt as well, even though we're not currently using it. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Jani Nikula <[email protected]>
2024-06-19drm/i915: Nuke the intermediate pipe fault bitmasksVille Syrjälä2-22/+22
GEN8_DE_PIPE_IRQ_FAULT_ERRORS & co. don't really achieve anything. Get rid of them and just declare all the bits directly in gen8_de_pipe_fault_mask(). Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Jani Nikula <[email protected]>
2024-06-19drm/i915: Extend GEN9_PIPE_PLANE_FLIP_DONE() to cover all universal planesVille Syrjälä1-1/+5
GEN9_PIPE_PLANE_FLIP_DONE() only works for planes 1-4. Extend it to handle planes 5-7 as well. Somewhat annoyingly the bits are spread around into two distinct clumps. Currently this doesn't achieve anything, but if we ever extend async flip support to more than just the first plane then we'll need this. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Jani Nikula <[email protected]>
2024-06-19drm/i915: Sort bdw+ pipe interrupt bitsVille Syrjälä1-11/+11
It's really hard to figure out which bdw+ pipe interrupt bits we've defined and which we have not. Sort the defines to make that a bit easier (still not super easy since the bits have been shuffled a bit over the years). Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Jani Nikula <[email protected]>
2024-06-19drm/i915: Document bdw+ pipe interrupt bitsVille Syrjälä1-21/+21
Sprinkle some notes indicating which platforms have which pipe interrupt bits. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Jani Nikula <[email protected]>
2024-06-19drm/i915: Use REG_BIT() for bdw+ pipe interruptsVille Syrjälä1-27/+27
Replace the hand rolled (1<<n) with the modern REG_BIT() approach for the bdw+ pipe interrupt bits. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Jani Nikula <[email protected]>
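One reason to prefer a typed bit macro over a hand-rolled (1<<n) can be sketched in plain C. With a 32-bit int, 1 << 31 shifts into the sign bit, which is undefined behaviour, whereas an unsigned 32-bit operand is well defined for every bit from 0 to 31. REG_BIT_SKETCH below is a simplified stand-in and the two bit names are hypothetical. The kernel's own REG_BIT() is more elaborate than this.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in: always shift an unsigned 32-bit one. */
#define REG_BIT_SKETCH(n) ((uint32_t)1 << (n))

/* Hypothetical interrupt bits, for illustration only. */
#define PIPE_FAULT_SKETCH  REG_BIT_SKETCH(31)
#define PIPE_VBLANK_SKETCH REG_BIT_SKETCH(0)

/* Compile-time check that the top bit is representable as unsigned. */
static_assert(REG_BIT_SKETCH(31) == UINT32_C(0x80000000),
	      "bit 31 must fit in an unsigned 32-bit value");

/* Combine the hypothetical bits into one mask. */
static inline uint32_t pipe_irq_mask_sketch(void)
{
	return PIPE_FAULT_SKETCH | PIPE_VBLANK_SKETCH;
}
```

Defining every bit through one macro also keeps the list of defines uniform, which makes gaps in the bit assignments easier to spot.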
2024-06-19drm/amdgpu: init TA fw for psp v14Likun Gao1-0/+5
Add support to init TA firmware for psp v14. Signed-off-by: Likun Gao <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-19drm/amdgpu: refine gfx6 firmware loadingYang Wang1-10/+9
refine gfx6 firmware loading Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-19drm/amd/pm: powerplay: Add `__counted_by` attribute for flexible arraysMario Limonciello2-37/+37
This attribute is used to hint the length of flexible arrays to compiler and sanitizers. Acked-by: Alex Deucher <[email protected]> Signed-off-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
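A minimal userspace sketch of the annotation follows. It assumes a compiler that may or may not support the counted_by attribute, so the COUNTED_BY macro falls back to nothing when it is unavailable. The structure and helper names are hypothetical and unrelated to the powerplay tables touched by the commit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Use the attribute when the compiler knows it, otherwise expand to nothing. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

/* Hypothetical table with a flexible array whose length lives in 'count'. */
struct clock_table {
	uint32_t count;
	uint32_t entries[] COUNTED_BY(count);
};

/* Allocate a zeroed table with room for 'count' entries. */
static inline struct clock_table *clock_table_alloc(uint32_t count)
{
	struct clock_table *t;

	t = calloc(1, sizeof(*t) + (size_t)count * sizeof(t->entries[0]));
	if (t)
		t->count = count; /* set the count before touching entries[] */
	return t;
}

/* Store a value; the index is asserted against the recorded count. */
static inline void clock_table_set(struct clock_table *t, uint32_t i, uint32_t v)
{
	assert(t != NULL && i < t->count);
	t->entries[i] = v;
}
```

When the attribute is honoured, bounds sanitizers and fortified helpers can compare an index against count at run time. The count member therefore has to be assigned before the array is accessed.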
2024-06-19Revert "drm/amdgpu: change aca bank error lock type to spinlock"Yang Wang2-11/+11
This reverts commit f6bce954f432c556659a57be9e18fecdc575affb. Revert this patch to modify lock type back to 'mutex' to avoid kernel calltrace issue. [ 602.668806] Workqueue: amdgpu-reset-dev amdgpu_ras_do_recovery [amdgpu] [ 602.668939] Call Trace: [ 602.668940] <TASK> [ 602.668941] dump_stack_lvl+0x4c/0x70 [ 602.668945] dump_stack+0x14/0x20 [ 602.668946] __schedule_bug+0x5a/0x70 [ 602.668950] __schedule+0x940/0xb30 [ 602.668952] ? srso_alias_return_thunk+0x5/0xfbef5 [ 602.668955] ? hrtimer_reprogram+0x77/0xb0 [ 602.668957] ? srso_alias_return_thunk+0x5/0xfbef5 [ 602.668959] ? hrtimer_start_range_ns+0x126/0x370 [ 602.668961] schedule+0x39/0xe0 [ 602.668962] schedule_hrtimeout_range_clock+0xb1/0x140 [ 602.668964] ? __pfx_hrtimer_wakeup+0x10/0x10 [ 602.668966] schedule_hrtimeout_range+0x17/0x20 [ 602.668967] usleep_range_state+0x69/0x90 [ 602.668970] psp_cmd_submit_buf+0x132/0x570 [amdgpu] [ 602.669066] psp_ras_invoke+0x75/0x1a0 [amdgpu] [ 602.669156] psp_ras_query_address+0x9c/0x120 [amdgpu] [ 602.669245] umc_v12_0_update_ecc_status+0x16d/0x520 [amdgpu] [ 602.669337] ? srso_alias_return_thunk+0x5/0xfbef5 [ 602.669339] ? stack_depot_save+0x12/0x20 [ 602.669342] ? srso_alias_return_thunk+0x5/0xfbef5 [ 602.669343] ? set_track_prepare+0x52/0x70 [ 602.669346] ? kmemleak_alloc+0x4f/0x90 [ 602.669348] ? __kmalloc_node+0x34b/0x450 [ 602.669352] amdgpu_umc_update_ecc_status+0x23/0x40 [amdgpu] [ 602.669438] mca_umc_mca_get_err_count+0x85/0xc0 [amdgpu] [ 602.669554] mca_smu_parse_mca_error_count+0x120/0x1d0 [amdgpu] [ 602.669655] amdgpu_mca_dispatch_mca_set.part.0+0x141/0x250 [amdgpu] [ 602.669743] ? kmemleak_free+0x36/0x60 [ 602.669745] ? kvfree+0x32/0x40 [ 602.669747] ? srso_alias_return_thunk+0x5/0xfbef5 [ 602.669749] ? kfree+0x15d/0x2a0 [ 602.669752] amdgpu_mca_smu_log_ras_error+0x1f6/0x210 [amdgpu] [ 602.669839] amdgpu_ras_query_error_status_helper+0x2ad/0x390 [amdgpu] [ 602.669924] ? srso_alias_return_thunk+0x5/0xfbef5 [ 602.669925] ? 
__call_rcu_common.constprop.0+0xa6/0x2b0 [ 602.669929] amdgpu_ras_query_error_status+0xf3/0x620 [amdgpu] [ 602.670014] ? srso_alias_return_thunk+0x5/0xfbef5 [ 602.670017] amdgpu_ras_log_on_err_counter+0xe1/0x170 [amdgpu] [ 602.670103] amdgpu_ras_do_recovery+0xd2/0x2c0 [amdgpu] [ 602.670187] ? srso_alias_return_thunk+0x5/0 Signed-off-by: Yang Wang <[email protected]> Reviewed-by: YiPeng Chai <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-19Revert "drm/amdgpu: change bank cache lock type to spinlock"Yang Wang2-6/+7
This reverts commit 258ed689bc3163f86204f75df6c23f92b59b3fad revert this patch to modify lock type back to 'mutex' to avoid kernel calltrace issue. [ 602.668806] Workqueue: amdgpu-reset-dev amdgpu_ras_do_recovery [amdgpu] [ 602.668939] Call Trace: [ 602.668940] <TASK> [ 602.668941] dump_stack_lvl+0x4c/0x70 [ 602.668945] dump_stack+0x14/0x20 [ 602.668946] __schedule_bug+0x5a/0x70 [ 602.668950] __schedule+0x940/0xb30 [ 602.668952] ? srso_alias_return_thunk+0x5/0xfbef5 [ 602.668955] ? hrtimer_reprogram+0x77/0xb0 [ 602.668957] ? srso_alias_return_thunk+0x5/0xfbef5 [ 602.668959] ? hrtimer_start_range_ns+0x126/0x370 [ 602.668961] schedule+0x39/0xe0 [ 602.668962] schedule_hrtimeout_range_clock+0xb1/0x140 [ 602.668964] ? __pfx_hrtimer_wakeup+0x10/0x10 [ 602.668966] schedule_hrtimeout_range+0x17/0x20 [ 602.668967] usleep_range_state+0x69/0x90 [ 602.668970] psp_cmd_submit_buf+0x132/0x570 [amdgpu] [ 602.669066] psp_ras_invoke+0x75/0x1a0 [amdgpu] [ 602.669156] psp_ras_query_address+0x9c/0x120 [amdgpu] [ 602.669245] umc_v12_0_update_ecc_status+0x16d/0x520 [amdgpu] [ 602.669337] ? srso_alias_return_thunk+0x5/0xfbef5 [ 602.669339] ? stack_depot_save+0x12/0x20 [ 602.669342] ? srso_alias_return_thunk+0x5/0xfbef5 [ 602.669343] ? set_track_prepare+0x52/0x70 [ 602.669346] ? kmemleak_alloc+0x4f/0x90 [ 602.669348] ? __kmalloc_node+0x34b/0x450 [ 602.669352] amdgpu_umc_update_ecc_status+0x23/0x40 [amdgpu] [ 602.669438] mca_umc_mca_get_err_count+0x85/0xc0 [amdgpu] [ 602.669554] mca_smu_parse_mca_error_count+0x120/0x1d0 [amdgpu] [ 602.669655] amdgpu_mca_dispatch_mca_set.part.0+0x141/0x250 [amdgpu] [ 602.669743] ? kmemleak_free+0x36/0x60 [ 602.669745] ? kvfree+0x32/0x40 [ 602.669747] ? srso_alias_return_thunk+0x5/0xfbef5 [ 602.669749] ? kfree+0x15d/0x2a0 [ 602.669752] amdgpu_mca_smu_log_ras_error+0x1f6/0x210 [amdgpu] [ 602.669839] amdgpu_ras_query_error_status_helper+0x2ad/0x390 [amdgpu] [ 602.669924] ? srso_alias_return_thunk+0x5/0xfbef5 [ 602.669925] ? 
__call_rcu_common.constprop.0+0xa6/0x2b0 [ 602.669929] amdgpu_ras_query_error_status+0xf3/0x620 [amdgpu] [ 602.670014] ? srso_alias_return_thunk+0x5/0xfbef5 [ 602.670017] amdgpu_ras_log_on_err_counter+0xe1/0x170 [amdgpu] [ 602.670103] amdgpu_ras_do_recovery+0xd2/0x2c0 [amdgpu] [ 602.670187] ? srso_alias_return_thunk+0x5/0xfbef5 [ 602.670189] ? __schedule+0x37d/0xb30 [ 602.670191] process_one_work+0x176/0x350 [ 602.670194] worker_thread+0x2f7/0x420 [ 602.670197] ? Signed-off-by: Yang Wang <[email protected]> Reviewed-by: YiPeng Chai <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-19drm/amdgpu: remove amdgpu_mes_fence_wait_polling()Alex Deucher2-16/+0
No longer used so remove it. Reviewed-by: Mukul Joshi <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-19drm/amdgpu: cleanup MES12 command submissionAlex Deucher1-28/+48
The approach of having a separate WB slot for each submission doesn't really work well and for example breaks GPU reset. Use a status query packet for the fence update instead. Since those should always succeed, we can use the fence of the original packet to signal the state of the operation. While at it, clean up the coding style. Fixes: ade887c63394 ("drm/amdgpu/mes12: Use a separate fence per transaction") Reviewed-by: Mukul Joshi <[email protected]> Suggested-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-19drm/amdgpu: refine gfx10 firmware loadingYang Wang1-13/+12
refine gfx10 firmware loading Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-19drm/amdgpu: refine gfx9 firmware loadingYang Wang2-30/+26
refine gfx9 firmware loading Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-19drm/amdgpu: cleanup MES11 command submissionChristian König1-28/+48
The approach of having a separate WB slot for each submission doesn't really work well and for example breaks GPU reset. Use a status query packet for the fence update instead. Since those should always succeed, we can use the fence of the original packet to signal the state of the operation. While at it, clean up the coding style. Fixes: eef016ba8986 ("drm/amdgpu/mes11: Use a separate fence per transaction") Reviewed-by: Mukul Joshi <[email protected]> Signed-off-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-19drm/amdgpu: fix UBSAN warning in kv_dpm.cAlex Deucher1-0/+2
Adds bounds check for sumo_vid_mapping_entry. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3392 Reviewed-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-19drm/radeon: fix UBSAN warning in kv_dpm.cAlex Deucher1-0/+2
Adds bounds check for sumo_vid_mapping_entry. Reviewed-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-19drm/amdgpu: fix using the reserved VMID with gang submitChristian König3-13/+45
We need to ensure that even when using a reserved VMID that the gang members can still run in parallel. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>