Age | Commit message (Collapse) | Author | Files | Lines |
|
refine gmc firmware loading
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
refine vpe firmware loading
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
refine vcn firmware loading
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
adjust the function position to better match aca/mca fini code in ras_fini().
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
To fix potential overflowed constant warning, modify the variables to u32
for getting the return value of RREG32_SOC15().
Signed-off-by: Bob Zhou <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
To achieve full occupancy CP hardware needs to know if CUs in SE are
symmetrically or asymmetrically harvested
v2: Reset is_symmetric_cus for each loop
Signed-off-by: Harish Kasiviswanathan <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Skip scheduling coredump when gpu reset is intentionally triggered
through debugfs.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Asad Kamal <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
refine sdma firmware loading
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
refine psp firmware loading
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
v1:
refine mes firmware loading
v2:
use dev_info instead of DRM_INFO
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add missing locking at a few places when calling MES APIs to ensure
exclusive access to MES queue.
Signed-off-by: Mukul Joshi <[email protected]>
Reviewed-by: Kent Russell <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Currently, amdgpu will always set up the brightness at 100% when it
loads. However this is jarring when the BIOS has it previously
programmed to a much lower value.
The ACPI ATIF method includes two members for "ac_level" and "dc_level".
These represent the default values that should be used if the system is
brought up in AC and DC respectively.
Use these values to set up the default brightness when the backlight
device is registered.
v2: squash in ACPI fix
Reviewed-by: Leo Li <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Similar to commit 813e7d4cd05e where some kernel log
messages are dropped. With this commit, more log
messages in older version of VCN/JPEG code are dropped.
Acked-by: Leo Liu <[email protected]>
Signed-off-by: David (Ming Qiang) Wu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
refine gpu_info firmware loading
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
v1:
Adding formatting string feature to improve function flexibility.
v2:
modify macro name to ADMGPU_UCODE_MAX_NAME.
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
To fix potential overflowed constant warning reported by Coverity,
modify the variables to uint32_t.
Signed-off-by: Bob Zhou <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
We need to take the reset domain lock before flush hdp. We can't put the
lock inside amdgpu_device_flush_hdp itself because it is used during
reset where we already take the write side lock.
Signed-off-by: Yunxiang Li <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Which method is used to flush tlb does not depend on whether a reset is
in progress or not. We should skip flush altogether if the GPU will get
reset. So put both path under reset_domain read lock.
Signed-off-by: Yunxiang Li <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
CC: [email protected]
|
|
Here since we are in reset and takes the reset_domain write side lock
already. We can't use the flush tlb helper which tries to take the read
side.
Signed-off-by: Yunxiang Li <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
When amdgpu_gart_invalidate_tlb helper is introduced this part was left
out of the conversion. Avoid the code duplication here.
Signed-off-by: Yunxiang Li <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
At this point the gart is not set up, there's no point to invalidate tlb
here and it could even be harmful.
Signed-off-by: Yunxiang Li <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
We send back the ready to reset message before we stop anything. This is
wrong. Move it to when we are actually ready for the FLR to happen.
In the current state since we take tens of seconds to stop everything,
it is very likely that host would give up waiting and reset the GPU
before we send ready, so it would be the same as before. But this gets
rid of the hack with reset_domain locking and also let us tell how slow
ready to reset actually is from the host. The ready to reset speed can
be improved later.
Signed-off-by: Yunxiang Li <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Emily Deng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Accessing registers via host is missing the check for skip_hw_access and
the lockdep check that comes with it.
Signed-off-by: Yunxiang Li <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
To fullfill the reset event description.
Suggested-by: Lijo Lazar <[email protected]>
Signed-off-by: Eric Huang <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
amdgpu_job_ring may return NULL, which causes kernel NULL
pointer error, using another way to print ring name instead
of ring->name.
Suggested-by: Lijo Lazar <[email protected]>
Signed-off-by: Eric Huang <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Since the range of align is 0~7, the expression is: align = (attr >> 3) & 7.
In the case of ATOM_ARG_IMM, the code cannot reach the default case.
So there is no need for "break".
Signed-off-by: Jesse Zhang <[email protected]>
Suggested-by: Tim Huang <[email protected]>
Reviewed-by: Tim Huang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
1. Add dcc buffer flag for copy buffer
2. Add sdma 7.0 support copy dcc buffer
Signed-off-by: Likun Gao <[email protected]>
Signed-off-by: Frank Min <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Deal with AMDGPU_GEM_CREATE_GFX12_DCC to set DCC bit
when needed.
Signed-off-by: Likun Gao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add additional VM PTE bits.
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Roll -rc3 and current drm/fixes in.
This will also unstuck our for-next branch.
Signed-off-by: Maxime Ripard <[email protected]>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.11-2024-06-07:
amdgpu:
- DCN 4.0.x support
- DCN 3.5 updates
- GC 12.0 support
- DP MST fixes
- Cursor fixes
- MES11 updates
- MMHUB 4.1 support
- DML2 Updates
- DCN 3.1.5 fixes
- IPS fixes
- Various code cleanups
- GMC 12.0 support
- SDMA 7.0 support
- SMU 13 updates
- SR-IOV fixes
- VCN 5.x fixes
- MES12 support
- SMU 14.x updates
- Devcoredump improvements
- Fixes for HDP flush on platforms with >4k pages
- GC 9.4.3 fixes
- RAS ACA updates
- Silence UBSAN flex array warnings
- MMHUB 3.3 updates
amdkfd:
- Contiguous VRAM allocations
- GC 12.0 support
- SDMA 7.0 support
- SR-IOV fixes
radeon:
- Backlight workaround for iMac
- Silence UBSAN flex array warnings
UAPI:
- GFX12 modifier and DCC support
Proposed Mesa changes:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29510
- KFD GFX ALU exceptions
Proposed ROCdebugger changes:
https://github.com/ROCm/ROCdbgapi/commit/08c760622b6601abf906f75abbc5e21d9fd425df
https://github.com/ROCm/ROCgdb/commit/944fe1c1414a68700414e86e32273b6bfa62ba6f
- KFD Contiguous VRAM allocation flag
Proposed ROCr/HIP changes:
https://github.com/ROCm/ROCT-Thunk-Interface/commit/f7b4a269914a3ab4f1e2453c2879adb97b5cc9e5
https://github.com/ROCm/ROCR-Runtime/pull/214/commits/26e8530d05a775872cb06dde6693db72be0c454a
https://github.com/ROCm/clr/commit/1d48f2a1ab38b632919c4b7274899b3faf4279ff
Signed-off-by: Dave Airlie <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
This happens when the amdgpu_bo_release_notify running
before amdgpu_ttm_set_buffer_funcs_status set the buffer
funcs to enabled.
check the buffer funcs enablement before calling the fill
buffer memory.
v2:(Christian)
- Apply it only for GEM buffers and since GEM buffers are only
allocated/freed while the driver is loaded we never run into
the issue to clear with buffer funcs disabled.
v3:(Mario)
- drop the stable tag as this will presumably go into a
-fixes PR for 6.10
Log snip:
*ERROR* Trying to clear memory with ring turned off.
RIP: 0010:amdgpu_bo_release_notify+0x201/0x220 [amdgpu]
Fixes: a68c7eaa7a8f ("drm/amdgpu: Enable clear page functionality")
Signed-off-by: Arunpravin Paneer Selvam <[email protected]>
Reviewed-by: Christian König <[email protected]>
Tested-by: Mikhail Gavrilov <[email protected]>
Tested-by: Richard Gong <[email protected]>
Suggested-by: Christian König <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Set the flag to true if bad page number reaches threshold.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Flag "mes.ring.shced.ready" will be set as true after mes hw init and set
as false when mes hw fini to avoid duplicate initialization. But hw fini
will not be called when function level reset, which will cause mes hw
init be skipped during FLR, which will leads to mapping legacy queue
fail. Set this flag as false when post reset will fix this issue.
Signed-off-by: Lin.Cao <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
reset cause is requested by customer as additional
info for gpu reset smi event.
v2: integerate reset sources suggested by Lijo Lazar
Signed-off-by: Eric Huang <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Set PTE_IS_PTE bit while PRT is enabled on gfx12.
Signed-off-by: Frank Min <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
reset source or reset cause is very useful info
for reset context, it will be used by events API.
Suggested-by: Lijo Lazar <[email protected]>
Signed-off-by: Eric Huang <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This patch changes the implementation of AMDGPU_PTE_MTYPE_VG10,
clear the bits before setting the new one.
Suggested-by: Alex Deucher <[email protected]>
Signed-off-by: longlyao <[email protected]>
Signed-off-by: Shane Xiao <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit 6670142d25f3cc3166f2a6c8454acd310bf2776a.
Pierre-Eric reported problems with this on his navi33. Revert
for now until we understand what is going wrong.
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
This patch changes the implementation of AMDGPU_PTE_MTYPE_NV10,
clear the bits before setting the new one.
Suggested-by: Alex Deucher <[email protected]>
Signed-off-by: longlyao <[email protected]>
Signed-off-by: Shane Xiao <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This patch to disable lane0 L1TLB and enable lane1 L1TLB.
Signed-off-by: Yifan Zhang <[email protected]>
Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This patch to configure mmhub3.3 SAW registers
Signed-off-by: Yifan Zhang <[email protected]>
Reviewed-by: Mario Limonciello <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix comments and error messages to rightly represent
the information.
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Sunil Khatri <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Rename the variable ip_dump_cp_queues to ip_dump_compute_queue
as it represent compute queues.
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Sunil Khatri <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add gfx9 support of CP queue registers for all queues
to be used by devcoredump.
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Sunil Khatri <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add support of gfx9 ipdump print so devcoredump
could trigger it to dump the captured registers
in devcoredump.
Signed-off-by: Sunil Khatri <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add general registers of gfx9 in ipdump for
devcoredump support.
Signed-off-by: Sunil Khatri <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This commit fixes a type mismatch in the amdgpu_gfx_kiq_init_ring
function triggered by the snprintf function expecting unsigned char
arguments due to the '%hhu' format specifier, but receiving int and u32
arguments.
The issue occurred because the arguments xcc_id, ring->me, ring->pipe,
and ring->queue were of type int and u32, not unsigned char. This led to
a type mismatch when these arguments were passed to snprintf.
To resolve this, the snprintf line was modified to cast these arguments
to unsigned char. This ensures that the arguments are of the correct
type for the '%hhu' format specifier and resolves the warning.
Fixes the below:
>> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:333:4: warning: format
>> specifies type 'unsigned char' but the argument has type 'int'
>> [-Wformat]
xcc_id, ring->me, ring->pipe, ring->queue);
^~~~~~
>> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:333:12: warning: format
>> specifies type 'unsigned char' but the argument has type 'u32' (aka
>> 'unsigned int') [-Wformat]
xcc_id, ring->me, ring->pipe, ring->queue);
^~~~~~~~
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:333:22: warning: format specifies type 'unsigned char' but the argument has type 'u32' (aka 'unsigned int') [-Wformat]
xcc_id, ring->me, ring->pipe, ring->queue);
^~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:333:34: warning: format specifies type 'unsigned char' but the argument has type 'u32' (aka 'unsigned int') [-Wformat]
xcc_id, ring->me, ring->pipe, ring->queue);
^~~~~~~~~~~
4 warnings generated.
Fixes: 0ea554455542 ("drm/amdgpu: Fix snprintf usage in amdgpu_gfx_kiq_init_ring")
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Cc: Lijo Lazar <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: Christian König <[email protected]>
Signed-off-by: Srinivasan Shanmugam <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Potentially overflowing expression mall_size_per_umc * adev->gmc.num_umc with type unsigned int (32 bits, unsigned)
is evaluated using 32-bit arithmetic,and then used in a context that expects an expression of type u64 (64 bits, unsigned).
Signed-off-by: Jesse Zhang <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
AMDGPU_RAS_GPU_ERR_BOOT_STATUS field is no longer valid.
The polling sequence is also simplifed according to
the latest firmware change.
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|