Age | Commit message (Collapse) | Author | Files | Lines |
|
It doesn't expose PPTable descriptor on APU platform. So max/min
temperature values cannot be got from APU platform.
v2: Stoney needs to skip crit temperature as well.
Signed-off-by: Huang Rui <[email protected]>
Reviewed-by: Kevin Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
NULL dereference occurs when string that is not ended with space or
newline is written to some dpm sysfs interface (for example pp_dpm_sclk).
This happens because strsep replaces the tmp with NULL if the delimiter
is not present in string, which is then dereferenced by tmp[0].
Reproduction example:
sudo sh -c 'echo -n 1 > /sys/class/drm/card0/device/pp_dpm_sclk'
Signed-off-by: Paweł Gronowski <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Leftover of previous performance level setting cleanups.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Move the mutext lock/unlock outside of the if(),
as the mutex is always taken: either in the if()
branch or in the else branch.
Signed-off-by: Alex Jivin <[email protected]>
Suggested-By: Luben Tukov <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Large clock values may overflow and show up as negative.
Reported by prOMiNd on IRC.
Acked-by: Nirmoy Das <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Port functionality from the Radeon driver to support
UVD and VCE power management.
Signed-off-by: Alex Jivin <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
On Navi12 platform, node power_dpm_force_performance_level
doesn't work correctly in one-VF mode with at least three
smu messages not supported:
SMU_MSG_SetSoftMaxByFreq
SMU_MSG_SetSoftMinByFreq
SMU_MSG_TransferTableDram2Smu
Reviewed-by: Kevin Wang <[email protected]>
Signed-off-by: Wenhui Sheng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The variable ret is being initialized with a value that is never read
and it is being updated later with a new value. The initialization is
redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The call to pm_runtime_get_sync increments the counter even in case of
failure, leading to incorrect ref count.
In case of failure, decrement the ref count before returning.
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add rename the gpu busy percentage for consistency and
add the mem busy percentage documentation.
Reviewed-by: Evan Quan <[email protected]>
Reviewed-by: Nirmoy Das <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Vega10 and previous asics use one interface, vega20 and newer
use another.
Reviewed-by: Evan Quan <[email protected]>
Acked-by: Nirmoy Das <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Drop unused APIs, variables and argument.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add support for unique_id and serial_number, as these are now
the same value, and will be for future ASICs as well.
v2: Explicitly create unique_id only for VG10/20/ARC
v3: Change set_unique_id to get_unique_id for clarity
Signed-off-by: Kent Russell <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
User can check and set the enablement of throttling logging and
the interval between each logging.
V2: simplify the sysfs interface(no string parsing)
V3: add proper lock protection on updating throttling_logging_rs.interval
V4: documentation cosmetic per Luben's suggestion
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Return an error for sysfs and debugfs power interfaces during
gpu reset and suspend. Prevents access to the hw while it may
be in an unusable state.
v2: squash in fix to drop suspend check
Acked-by: Nirmoy Das <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
the origin design will use varible of "attr->states" to save node
supported states on current gpu device, but for multi gpu device, when
probe second gpu device, the driver will check attribute node states
from previous gpu device wthether to create attribute node.
it will cause other gpu device create attribute node faild.
1. add member attr_list into amdgpu_device to link supported device attribute node.
2. add new structure "struct amdgpu_device_attr_entry{}" to track device attribute state.
3. drop member "states" from amdgpu_device_attr.
v2:
1. move "attr_list" into amdgpu_pm and rename to "pm_attr_list".
2. refine create & remove device node functions parameter.
fix:
drm/amdgpu: optimize amdgpu device attribute code
Signed-off-by: Kevin Wang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add some APU flags to simplify handling of different APU
variants. It's easier to understand the special cases
if we use names flags rather than checking device ids and
silicon revisions.
v2: rebase on latest code
Acked-by: Evan Quan <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Problem description]
1. Boot up picasso platform, launches desktop, Don't do anything (APU enter into "gfxoff" state)
2. Remote login to platform using SSH, then type the command line:
sudo su -c "echo manual > /sys/class/drm/card0/device/power_dpm_force_performance_level"
sudo su -c "echo 2 > /sys/class/drm/card0/device/pp_dpm_sclk" (fix SCLK to 1400MHz)
3. Move the mouse around in Window
4. Phenomenon : The screen frozen
Tester will switch sclk level during glmark2 run time.
APU will enter "gfxoff" state intermittently during glmark2 run time.
The system got hanged if fix GFXCLK to 1400MHz when APU is in "gfxoff"
state.
[Debug]
1. Fix SCLK to X MHz
1400: screen frozen, screen black, then OS will reboot.
1300: screen frozen.
1200: screen frozen, screen black.
1100: screen frozen, screen black, then OS will reboot.
1000: screen frozen, screen black.
900: screen frozen, screen black, then OS will reboot.
800: Situation Nomal, issue disappear.
700: Situation Nomal, issue disappear.
2. SBIOS setting: AMD CBS --> SMU Debug Options -->SMU Debug --> "GFX DLDO Psm Margin Control":
50 : Situation Nomal, issue disappear.
45 : Situation Nomal, issue disappear.
40 : Situation Nomal, issue disappear.
35 : Situation Nomal, issue disappear.
30 : screen black.
25 : screen frozen, then blurred screen.
20 : screen frozen.
15 : screen black.
10 : screen frozen.
5 : screen frozen, then blurred screen.
3. Disable GFXOFF feature
Situation Nomal, issue disappear.
[Why]
Through a period of time debugging with Sys Eng team and SMU team, Sys
Eng team said this is voltage/frequency marginal issue not a F/W or H/W
bug. This experiment proves that default targetPsm [for f=1400MHz] is
not sufficient when GFXOFF is enabled on Picasso.
SMU team think it is an odd test conditions to force sclk="1400MHz" when
GPU is in "gfxoff" state,then wake up the GFX. SCLK should be in the
"lowest frequency" when gfxoff.
[How]
Disable gfxoff when setting manual mode.
Enable gfxoff when setting other mode(exiting manual mode) again.
By the way, from the user point of view, now that user switch to manual
mode and force SCLK Frequency, he don't want SCLK be controlled by
workload.It becomes meaningless to "switch to manual mode" if APU enter "gfxoff"
due to lack of workload at this point.
Tips: Same issue observed on Raven.
Signed-off-by: chen gong <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix typos that prevented them from showing up.
v2: switch other files in addition to pp_clk_voltage
Fixes: 4e01847c38f7a5 ("drm/amdgpu: optimize amdgpu device attribute code")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1150
Signed-off-by: Alex Deucher <[email protected]>
Acked-by: Evan Quan <[email protected]>
|
|
1. Initialize the counters to 0 in case the callback
fails to initialize them.
2. The counters don't exist on APUs so return an error
for them.
3. Return an error if the callback doesn't exist.
Reviewed-by: Yong Zhao <[email protected]>
Reviewed-By: Kent Russell <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This loop in the error handling code should start a "i - 1" and end at
"i == 0". Currently it starts a "i" and ends at "i == 1". The result
is that it removes one attribute that wasn't created yet, and leaks the
zeroeth attribute.
Fixes: 4e01847c38f7 ("drm/amdgpu: optimize amdgpu device attribute code")
Acked-by: Michael J. Ruhl <[email protected]>
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Kevin Wang <[email protected]>
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
the amdgpu device attribute node will be created accordding to sriov vf
mode at runtime.
cleanup unnecessary sriov check in attribute operation function.
Signed-off-by: Kevin Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
unified amdgpu device attribute node functions:
1. add some helper functions to create amdgpu device attribute node.
2. create device node according to device attr flags on different VF mode.
3. rename some functions name to adapt a new interface.
v2:
1. remove ATTR_STATE_DEAD, ATTR_STATE_ALIVE enum.
2. rename callback function perform to attr_update.
3. modify some variable names
Signed-off-by: Kevin Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Signed-off-by: Monk Liu <[email protected]>
Acked-by: Yintian Tao <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
For chip like CHIP_OLAND with si enabled(amdgpu.si_support=1),
the amdgpu will expose pp_num_states to the /sys directory.
In this moment, read the pp_num_states file will excute the
amdgpu_get_pp_num_states func. In our case, the data hasn't
been initialized, so the kernel will access some ilegal
address, trigger the segmentfault and system will reboot soon:
uos@uos-PC:~$ cat /sys/devices/pci0000\:00/0000\:00\:00.0/0000\:01\:00
.0/pp_num_states
Message from syslogd@uos-PC at Apr 22 09:26:20 ...
kernel:[ 82.154129] Internal error: Oops: 96000004 [#1] SMP
This patch aims to fix this problem, avoid that reading file
triggers the kernel sementfault.
Signed-off-by: limingyu <[email protected]>
Signed-off-by: zhoubinbin <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
On ARCTURUS and RENOIR, powerplay is not supported yet.
When plug in or unplug power jack, ACPI event will issue.
Then kernel NULL pointer BUG will be triggered.
Check for NULL pointers before calling.
Signed-off-by: Aaron Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
For boards that do not support automatic AC/DC transitions
in firmware, manually tell the firmware when the status
changes.
Bug: https://gitlab.freedesktop.org/drm/amd/issues/1043
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
In order to remove the load and unload drm callbacks,
we need to reorder the init sequence to move all the drm
debugfs file handling. Do this for pm.
Tested-by: Thomas Zimmermann <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
count is size_t so don't use negative values.
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
If power management sysfs or debugfs files are accessed,
power up the GPU when necessary.
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Provided an unified entry point. And fixed the confusing that the API
usage is conflict with what the naming implies.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Originally, due to the restriction from PSP and SMU, VF has
to send message to hypervisor driver to handle powerplay
change which is complicated and redundant. Currently, SMU
and PSP can support VF to directly handle powerplay
change by itself. Therefore, the old code about the handshake
between VF and PF to handle powerplay will be removed and VF
will use new the registers below to handshake with SMU.
mmMP1_SMN_C2PMSG_101: register to handle SMU message
mmMP1_SMN_C2PMSG_102: register to handle SMU parameter
mmMP1_SMN_C2PMSG_103: register to handle SMU response
v2: remove module parameter pp_one_vf
v3: fix the parens
v4: forbid vf to change smu feature
v5: use hwmon_attributes_visible to skip sepicified hwmon atrribute
v6: change skip condition at vega10_copy_table_to_smc
Signed-off-by: Yintian Tao <[email protected]>
Acked-by: Evan Quan <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
By using its own enabling function
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
For Arcturus, clock limit settings on uclk/socclk/fclk domains
are not supported.
V2: simplify the code to support both SGPU and MGPU cases
Signed-off-by: Evan Quan <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This is a quick and low risk fix. Those APIs which
are exposed to other IPs or to support sysfs/hwmon
interfaces or DAL will have lock protection. Meanwhile
no lock protection is enforced for swSMU internal used
APIs. Future optimization is needed.
V2: strip the lock protection for all swSMU internal APIs
Signed-off-by: Evan Quan <[email protected]>
Acked-by: Andrey Grodzovsky <[email protected]>
Acked-by: Feifei Xu <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix DOC link name, clean up formatting in pp_dpm_* section.
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Some of the documentation formatting could be improved
which will resolve some Sphinx amdgpu build warnings e.g
WARNING: Unexpected indentation.
WARNING: Block quote ends without a blank line; unexpected unindent.
WARNING: Inline emphasis start-string without end-string.
Signed-off-by: Adam Zerella <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The driver does not support these sensors yet and there is no point in
creating sysfs attributes which will always return an error.
Signed-off-by: Jean Delvare <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: "Christian König" <[email protected]>
Cc: "David (ChunMing) Zhou" <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Do not expose those unsupported clock domains through sysfs on
Arcturus.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Kevin Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
VCN should be used for Vega20 later ASICs while UVD and VCE
are for previous ASICs.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
1. Unified feature enable status format in sysfs
2. Rename ppfeature to pp_features to adapt other pp sysfs node name
3. this function support all asic, not asic related function.
Signed-off-by: Kevin Wang <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Acked-by: Rui Huang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
DPM state relates are not supported on the new SW SMU ASICs. But still
it's not OK to trigger null pointer dereference on accessing them.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add checking for possible invalid input and null pointer. And
drop redundant code.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Drop redundant check, duplicate check, duplicate setting
and fix the return value.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
On vega20, there is an SMU message to query it. On navi, it's fetched
from the metrics table.
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The dpm sensor function already does this for us. This fixes
the freq*_input files with the new SMU implementation.
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
As the lock was already held on the entrance to smu_handle_task.
- V2: lock in small granularity
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Kevin Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Make mem_busy_percent sysfs interface invisible on Vega10.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Reported by smatch:
drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c:1496 amdgpu_hwmon_show_temp() error: uninitialized symbol 'temp'.
Signed-off-by: Ernst Sjöstrand <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Reported by smatch:
drivers/gpu/drm/amd/amdgpu/amdgpu_pm.c:693 amdgpu_set_pp_od_clk_voltage() error: uninitialized symbol 'ret'.
Signed-off-by: Ernst Sjöstrand <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|