Age | Commit message (Collapse) | Author | Files | Lines |
|
it will cause hwmon node of power1_label is not created.
v2:
the hwmon node of "power1_label" is always needed for all ASICs.
and the patch will remove ASIC type check for "power1_label".
Fixes: ae07970a0621d6 ("drm/amd/pm: add support for hwmon control of slow and fast PPT limit on vangogh")
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
Even if can_apply_edp_fast_boot is set to 1 at boot, this flag will
be cleared to 0 at S3 resume.
[How]
Keep eDP Vdd on when eDP stream is already enabled.
Reviewed-by: Charlene Liu <[email protected]>
Acked-by: Jasdeep Dhillon <[email protected]>
Signed-off-by: Zhan Liu <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix clamping to match register field size
Reviewed-by: Charlene Liu <[email protected]>
Acked-by: Jasdeep Dhillon <[email protected]>
Signed-off-by: Dmytro Laktyushkin <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
pflip interrupt order are mapped 1 to 1 to otg id.
e.g. if irq_src=26 corresponds to otg0 then 27->otg1, 28->otg2...
Linux DM registers pflip interrupts per number of crtcs.
In fused pipe case crtc numbers can be less than otg id.
e.g. if one pipe out of 3(otg#0-2) is fused adev->mode_info.num_crtc=2
so DM only registers irq_src 26,27.
This is a bug since if pipe#2 remains unfused DM never gets
otg2 pflip interrupt (irq_src=28)
That may results in gfx failure due to pflip timeout.
[How]
Register pflip interrupts per max num of otg instead of num_crtc
Signed-off-by: Roman Li <[email protected]>
Reviewed-by: Nicholas Kazlauskas <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Confirmed with hardware team, there is harvesting for gc 10.3.1.
Signed-off-by: Aaron Liu <[email protected]>
Reviewed-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
A number of BIOS versions have a problem with the watermarks table not
being configured properly. This manifests as a very scary looking warning
during resume from s0i3. This should be harmless in most cases and is well
understood, so decrease the assertion to a clearer warning about the problem.
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Replace reset queue for specific PASID with unmap all queues, reset
queue could break CP scheduler.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
As the function is used in more different cases, use a more general
name.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Since we have a single instance of reset semaphore which we
lock only once even for XGMI hive we don't need the nested
locking hint anymore.
Signed-off-by: Andrey Grodzovsky <[email protected]>
Reviewed-by: Christian König <[email protected]>
Link: https://www.spinics.net/lists/amd-gfx/msg74120.html
|
|
This functions needs to be split into 2 parts where
one is called only once for locking single instance of
reset_domain's sem and reset flag and the other part
which handles MP1 states should still be called for
each device in XGMI hive.
Signed-off-by: Andrey Grodzovsky <[email protected]>
Reviewed-by: Christian König <[email protected]>
Link: https://www.spinics.net/lists/amd-gfx/msg74118.html
|
|
We should have a single instance per entrire reset domain.
Signed-off-by: Andrey Grodzovsky <[email protected]>
Suggested-by: Lijo Lazar <[email protected]>
Reviewed-by: Christian König <[email protected]>
Link: https://www.spinics.net/lists/amd-gfx/msg74116.html
|
|
We want single instance of reset sem across all
reset clients because in case of XGMI we should stop
access cross device MMIO because any of them could be
in a reset in the moment.
Signed-off-by: Andrey Grodzovsky <[email protected]>
Reviewed-by: Christian König <[email protected]>
Link: https://www.spinics.net/lists/amd-gfx/msg74117.html
|
|
The reset domain contains register access semaphor
now and so needs to be present as long as each device
in a hive needs it and so it cannot be binded to XGMI
hive life cycle.
Adress this by making reset domain refcounted and pointed
by each member of the hive and the hive itself.
v4:
Fix crash on boot witrh XGMI hive by adding type to reset_domain.
XGMI will only create a new reset_domain if prevoius was of single
device type meaning it's first boot. Otherwsie it will take a
refocunt to exsiting reset_domain from the amdgou device.
Add a wrapper around reset_domain->refcount get/put
and a wrapper around send to reset wq (Lijo)
Signed-off-by: Andrey Grodzovsky <[email protected]>
Acked-by: Christian König <[email protected]>
Link: https://www.spinics.net/lists/amd-gfx/msg74121.html
|
|
Since now all GPU resets are serialzied there is no need for this.
This patch also reverts 'drm/amdgpu: race issue when jobs on 2 ring timeout'
Signed-off-by: Andrey Grodzovsky <[email protected]>
Reviewed-by: Christian König <[email protected]>
Link: https://www.spinics.net/lists/amd-gfx/msg74119.html
|
|
Since we serialize all resets no need to protect from concurrent
resets.
Signed-off-by: Andrey Grodzovsky <[email protected]>
Reviewed-by: Christian König <[email protected]>
Link: https://www.spinics.net/lists/amd-gfx/msg74115.html
|
|
No need to to trigger another work queue inside the work queue.
v3:
Problem:
Extra reset caused by host side FLR notification
following guest side triggered reset.
Fix: Preven qeuing flr_work from mailbox irq if guest
already executing a reset.
Suggested-by: Liu Shaoyun <[email protected]>
Signed-off-by: Andrey Grodzovsky <[email protected]>
Reviewed-by: Liu Shaoyun <[email protected]>
Link: https://www.spinics.net/lists/amd-gfx/msg74114.html
|
|
Use reset domain wq also for non TDR gpu recovery trigers
such as sysfs and RAS. We must serialize all possible
GPU recoveries to gurantee no concurrency there.
For TDR call the original recovery function directly since
it's already executed from within the wq. For others just
use a wrapper to qeueue work and wait on it to finish.
v2: Rename to amdgpu_recover_work_struct
Signed-off-by: Andrey Grodzovsky <[email protected]>
Reviewed-by: Christian König <[email protected]>
Link: https://www.spinics.net/lists/amd-gfx/msg74113.html
|
|
Before we initialize schedulers we must know which reset
domain are we in - for single device there iis a single
domain per device and so single wq per device. For XGMI
the reset domain spans the entire XGMI hive and so the
reset wq is per hive.
Signed-off-by: Andrey Grodzovsky <[email protected]>
Reviewed-by: Christian König <[email protected]>
Link: https://www.spinics.net/lists/amd-gfx/msg74112.html
|
|
Defined a reset_domain struct such that
all the entities that go through reset
together will be serialized one against
another. Do it for both single device and
XGMI hive cases.
Signed-off-by: Andrey Grodzovsky <[email protected]>
Suggested-by: Daniel Vetter <[email protected]>
Suggested-by: Christian König <[email protected]>
Reviewed-by: Christian König <[email protected]>
Link: https://www.spinics.net/lists/amd-gfx/msg74111.html
|
|
Instead of manually extracting the fence.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
To align with other headers.
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
To align with other headers.
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
MIT.
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fixes hangs on driver load with multiple displays on
DCN 2.0 parts.
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=215511
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1877
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1886
Fixes: ee2698cf79cc ("drm/amd/display: Changed pipe split policy to allow for multi-display pipe split")
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
These have been at production level for a while. Drop
the flag.
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Include the header with the prototype to silence the following clang
warnings:
drivers/gpu/drm/amd/amdgpu/../pm/amdgpu_dpm_internal.c:29:6: warning: no
previous prototype for function 'amdgpu_dpm_get_active_displays'
[-Wmissing-prototypes]
void amdgpu_dpm_get_active_displays(struct amdgpu_device *adev)
^
drivers/gpu/drm/amd/amdgpu/../pm/amdgpu_dpm_internal.c:29:1: note: declare
'static' if the function is not intended to be used outside of this
translation unit
void amdgpu_dpm_get_active_displays(struct amdgpu_device *adev)
^
static
drivers/gpu/drm/amd/amdgpu/../pm/amdgpu_dpm_internal.c:76:5: warning: no
previous prototype for function 'amdgpu_dpm_get_vrefresh'
[-Wmissing-prototypes]
u32 amdgpu_dpm_get_vrefresh(struct amdgpu_device *adev)
^
drivers/gpu/drm/amd/amdgpu/../pm/amdgpu_dpm_internal.c:76:1: note: declare
'static' if the function is not intended to be used outside of this
translation unit
u32 amdgpu_dpm_get_vrefresh(struct amdgpu_device *adev)
^
static
2 warnings generated.
Besides that, remove the duplicated prototype of the function
amdgpu_dpm_get_vblank_time in order to keep the consistency of the
headers.
Fixes: 6ddbd37f1074 ("drm/amd/pm: optimize the amdgpu_pm_compute_clocks() implementations")
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Maíra Canal <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
clang static analysis reports this error
amdgpu_smu.c:2289:9: warning: Called function pointer
is null (null dereference)
return smu->ppt_funcs->emit_clk_levels(
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There is a logic error in the earlier check of
emit_clk_levels. The error value is set to
the ret variable but ret is never used. Return
directly and remove the unneeded ret variable.
Fixes: 5d64f9bbb628 ("amdgpu/pm: Implement new API function "emit" that accepts buffer base and write offset")
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Tom Rix <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
We want to have lockdep annotation here, so make sure that we reserve
the PD while removing PRTs even if it isn't strictly necessary since the
VM object is about to be destroyed anyway.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Since newly added BOs don't have any mappings it's ok to add them
without holding the VM lock. Only when we add per VM BOs the lock is
mandatory.
Signed-off-by: Christian König <[email protected]>
Reported-by: Bhardwaj, Rajneesh <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
The link encoder mapping could return a null one and causes system crash.
[How]
Let the mapping can get an available link encoder without endpoint
identification check.
Reviewed-by: Wenjing Liu <[email protected]>
Acked-by: Jasdeep Dhillon <[email protected]>
Signed-off-by: Martin Tsai <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This version brings along the following fixes:
-fix for build failure uninitalized error
-Bug fix for DP2 using uncertified cable
-limit unbounded request to 5k
-fix DP LT sequence on EQ fail
-Bug fixes for S3/S4
Acked-by: Jasdeep Dhillon <[email protected]>
Signed-off-by: Aric Cyr <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Reviewed-by: Aric Cyr <[email protected]>
Acked-by: Jasdeep Dhillon <[email protected]>
Signed-off-by: Anthony Koo <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
The number of lanes wasn't being reset to maximum when reducing link
rate due to an EQ failure. This could result in having fewer lanes in
the verified link capabilities, a lower maximum link bandwidth, and
fewer modes being supported.
[How]
Reset the number of lanes to max when dropping link rate due to EQ
failure during link training.
Reviewed-by: Aric Cyr <[email protected]>
Acked-by: Jasdeep Dhillon <[email protected]>
Signed-off-by: Ilya <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
Even if can_apply_edp_fast_boot is set to 1 at boot, this flag will
be cleared to 0 at S3 resume.
[How]
Keep eDP Vdd on when eDP stream is already enabled.
Reviewed-by: Charlene Liu <[email protected]>
Acked-by: Jasdeep Dhillon <[email protected]>
Signed-off-by: Zhan Liu <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
VBIOS light up eDP with 6bpc but driver use 8bpc without
disable valid stream then re-enable valid stream. Some
panels can't runtime change color depth.
[How]
Change fastboot timing validation function. Not only check
LANE_COUNT, LINK_RATE...etc
Reviewed-by: Anthony Koo <[email protected]>
Acked-by: Jasdeep Dhillon <[email protected]>
Signed-off-by: Paul Hsieh <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix clamping to match register field size
Reviewed-by: Charlene Liu <[email protected]>
Acked-by: Jasdeep Dhillon <[email protected]>
Signed-off-by: Dmytro Laktyushkin <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Why:
When resume from sleep or hiberation, blocked MST Topology discovery might
need to be used.
How:
Added "DETECT_REASON_RESUMEFROMS3S4" to enum dc_detect_reason; use it to
require blocked MST Topology discovery.
Reviewed-by: Wenjing Liu <[email protected]>
Acked-by: Jasdeep Dhillon <[email protected]>
Signed-off-by: Bing Guo <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
remove static from optc31_set_drr
Reviewed-by: Nevenko Stupar <[email protected]>
Acked-by: Jasdeep Dhillon <[email protected]>
Signed-off-by: Eric Bernstein <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Unbounded requesting is unsupported on pipe split modes
and this change prevents us running into such a situation
with wide modes.
Reviewed-by: Charlene Liu <[email protected]>
Acked-by: Jasdeep Dhillon <[email protected]>
Signed-off-by: Dmytro Laktyushkin <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
Found when running igt@kms_atomic.
Userspace attempts to do a TEST_COMMIT when 0 streams which calls
dc_remove_stream_from_ctx. This in turn calls link_enc_unassign
which ends up modifying stream->link = NULL directly, causing the
global link_enc to be removed preventing further link activity
and future link validation from passing.
[How]
We take care of link_enc unassignment at the start of
link_enc_cfg_link_encs_assign so this call is no longer necessary.
Fixes global state from being modified while unlocked.
Reviewed-by: Jimmy Kizito <[email protected]>
Acked-by: Jasdeep Dhillon <[email protected]>
Signed-off-by: Nicholas Kazlauskas <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
Build failure due to ‘status’ may be used uninitialized
[How]
Initialize status to LINK_TRAINING_SUCCESS
Reviewed-by: Wenjing Liu <[email protected]>
Acked-by: Jasdeep Dhillon <[email protected]>
Signed-off-by: Eric Bernstein <[email protected]>
Tested-by: Daniel Wheeler <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
smu_cmn_disable_all_features_with_exception
As there is no internal cache for enabled ppfeatures now. Thus the 2nd
parameter will be not needed any more.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
As the enabled ppfeatures are just retrieved ahead. We can use
that directly instead of retrieving again and again.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The following scenarios make the driver cache for enabled ppfeatures
outdated and invalid:
- Other tools interact with PMFW to change the enabled ppfeatures.
- PMFW may enable/disable some features behind driver's back. E.g.
for sienna_cichild, on gfxoff entering, PMFW will disable gfx
related DPM features. All those are performed without driver's
notice.
Also considering driver does not actually interact with PMFW such
frequently, the benefit brought by such cache is very limited.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The supported features should be retrieved just after EnableAllDpmFeatures message
complete. And the check(whether some dpm feature is supported) is only needed when we
decide to enable or disable it.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use uint64_t instead of an array of uint32_t. This can avoid
some non-necessary intermediate uint32_t -> uint64_t conversions.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Instead of having two which do the same thing.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
As other dGPU asics, Renoir should use smu_cmn_get_enabled_mask() for
that job.
Signed-off-by: Evan Quan <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
pflip interrupt order are mapped 1 to 1 to otg id.
e.g. if irq_src=26 corresponds to otg0 then 27->otg1, 28->otg2...
Linux DM registers pflip interrupts per number of crtcs.
In fused pipe case crtc numbers can be less than otg id.
e.g. if one pipe out of 3(otg#0-2) is fused adev->mode_info.num_crtc=2
so DM only registers irq_src 26,27.
This is a bug since if pipe#2 remains unfused DM never gets
otg2 pflip interrupt (irq_src=28)
That may results in gfx failure due to pflip timeout.
[How]
Register pflip interrupts per max num of otg instead of num_crtc
Signed-off-by: Roman Li <[email protected]>
Reviewed-by: Nicholas Kazlauskas <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Bypass group programming (utcl2_harvest) aims to forbid UTCL2 to send
invalidation command to harvested SE/SA. Once invalidation command comes
into harvested SE/SA, SE/SA has no response and system hang.
This patch is to add checking if the GART table is already allocated before
invalidating TLB. The new procedure is as following:
1. Calling amdgpu_gtt_mgr_init() in amdgpu_ttm_init(). After this step GTT
BOs can be allocated, but GART mappings are still ignored.
2. Calling amdgpu_gart_table_vram_alloc() from the GMC code. This allocates
the GART backing store.
3. Initializing the hardware, and programming the backing store into VMID0
for all VMHUBs.
4. Calling amdgpu_gtt_mgr_recover() to make sure the table is updated with
the GTT allocations done before it was allocated.
Signed-off-by: Christian König <[email protected]>
Signed-off-by: Aaron Liu <[email protected]>
Acked-by: Huang Rui <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|