Age | Commit message (Collapse) | Author | Files | Lines |
|
On a full device reset, PSP FW gets unloaded. Hence restore the
partition mode by placing a new request.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Asad Kamal <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Tested-by: Asad Kamal <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This patch fixes a memory leak in the amdgpu_ras_feature_enable() function.
The leak occurs when the function sends a command to the firmware to enable
or disable a RAS feature for a GFX block. If the command fails, the kfree()
function is not called to free the info memory.
Fixes: 9f051d6ff13f ("drm/amdgpu: Free ras cmd input buffer properly")
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Cong Liu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit 7748ce5b69581325cae40c2134088820f0957902.
vbios_version sysfs node is used to identify Part Number also. Revert to
the same so that it doesn't break scripts/software which parse this.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Include subrevision and variant fileds also to IP version.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Print channel index for UMC v12.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
update the query to return the number of functional
instances where there is more than an instance of the requested
type and for others continue to return one.
v2: count must reflect the actual number of engines (Alex)
v3: fix wrong number of engines for vcn (Alex)
Signed-off-by: Sathishkumar S <[email protected]>
Reviewed-by: Leo Liu <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
It should first check block ras obj whether be set, it should
return 0 directly if block ras obj or hw_ops is not set.
If block doesn't support RAS just return 0 is fine.
Changed from V1:
return 0 directly if block ras obj or hw ops is not set
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Host handles PG.
Signed-off-by: Vignesh Chander <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Instead of storing coredump information inside amdgpu_device struct,
move if to a proper separated struct and allocate it dynamically. This
will make it easier to further expand the logged information.
Signed-off-by: André Almeida <[email protected]>
Reviewed-by: Shashank Sharma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This patch fixes a memory leak in the amdgpu_ras_feature_enable() function.
The leak occurs when the function sends a command to the firmware to enable
or disable a RAS feature for a GFX block. If the command fails, the kfree()
function is not called to free the info memory.
Fixes: 9f051d6ff13f ("drm/amdgpu: Free ras cmd input buffer properly")
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Cong Liu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Search for vbios version string in STRING_OFFSET-ATOM_ROM_HEADER region
first. If those offsets are not populated, use the hardcoded region.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit 7748ce5b69581325cae40c2134088820f0957902.
vbios_version sysfs node is used to identify Part Number also. Revert to
the same so that it doesn't break scripts/software which parse this.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
These flags (for GEM and SVM allocations) allocate
memory that allows for system-scope atomic semantics.
On GFX943 these flags cause caches to be avoided on
non-local memory.
On all other ASICs they are identical in functionality to the
equivalent COHERENT flags.
Corresponding Thunk patch is at
https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/pull/88
Reviewed-by: David Yat Sin <[email protected]>
Signed-off-by: David Francis <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
add amdgpu mca debug sysfs support.
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add missing IP discovery info.
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Lang Yu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
add amdgpu smu mca dump feature support.
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Seamless boot can technically be supported as far back as DCN1
but to avoid regressions on older hardware, enable it for DCN3 and
later.
If users report using the module parameter that it works on older
ASICs as well, this can be adjusted.
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The module parameter can be used to test more easily enabling seamless
boot support on additional ASICs.
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
During jpeg init, CPU writes to frame buffer which can be cached by HDP,
occasionally causing invalid header to be sent to MMSCH. Perform HDP flush
after writing to frame buffer before continuing with jpeg init sequence.
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Timmy Tsai <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This will allow base driver to dictate whether seamless should be
enabled. No intended functional changes.
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
`amdgpu_gmc_get_vbios_allocations` has a special case for how to
bring up yellow carp when amdgpu discovery is turned off. As this ASIC
ships with discovery turned on, it's generally dead code and worse it
causes `adev->mman.keep_stolen_vga_memory` to not be initialized for
yellow carp.
Remove it.
Signed-off-by: Mario Limonciello <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use an inline function for version check. Gives more flexibility to
handle any format changes.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
With the typical model where the display server opens the file descriptor
and then hands it over to the client(*), we were showing stale data in
debugfs.
Fix it by updating the drm_file->pid on ioctl access from a different
process.
The field is also made RCU protected to allow for lockless readers. Update
side is protected with dev->filelist_mutex.
Before:
$ cat /sys/kernel/debug/dri/0/clients
command pid dev master a uid magic
Xorg 2344 0 y y 0 0
Xorg 2344 0 n y 0 2
Xorg 2344 0 n y 0 3
Xorg 2344 0 n y 0 4
After:
$ cat /sys/kernel/debug/dri/0/clients
command tgid dev master a uid magic
Xorg 830 0 y y 0 0
xfce4-session 880 0 n y 0 1
xfwm4 943 0 n y 0 2
neverball 1095 0 n y 0 3
*)
More detailed and historically accurate description of various handover
implementation kindly provided by Emil Velikov:
"""
The traditional model, the server was the orchestrator managing the
primary device node. From the fd, to the master status and
authentication. But looking at the fd alone, this has varied across
the years.
IIRC in the DRI1 days, Xorg (libdrm really) would have a list of open
fd(s) and reuse those whenever needed, DRI2 the client was responsible
for open() themselves and with DRI3 the fd was passed to the client.
Around the inception of DRI3 and systemd-logind, the latter became
another possible orchestrator. Whereby Xorg and Wayland compositors
could ask it for the fd. For various reasons (hysterical and genuine
ones) Xorg has a fallback path going the open(), whereas Wayland
compositors are moving to solely relying on logind... some never had
fallback even.
Over the past few years, more projects have emerged which provide
functionality similar (be that on API level, Dbus, or otherwise) to
systemd-logind.
"""
v2:
* Fixed typo in commit text and added a fine historical explanation
from Emil.
Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: "Christian König" <[email protected]>
Cc: Daniel Vetter <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Tested-by: Rob Clark <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Christian König <[email protected]>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.6-2023-09-13:
amdgpu:
- GC 9.4.3 fixes
- Fix white screen issues with S/G display on system with >= 64G of ram
- Replay fixes
- SMU 13.0.6 fixes
- AUX backlight fix
- NBIO 4.3 SR-IOV fixes for HDP
- RAS fixes
- DP MST resume fix
- Fix segfault on systems with no vbios
- DPIA fixes
amdkfd:
- CWSR grace period fix
- Unaligned doorbell fix
- CRIU fix for GFX11
- Add missing TLB flush on gfx10 and newer
Signed-off-by: Dave Airlie <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
One doc fix for drm/connector, one fix for amdgpu for an crash when
VRAM usage is high, and one fix in gm12u320 to fix the timeout units in
the code
Signed-off-by: Daniel Vetter <[email protected]>
From: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/w5nlld5ukeh6bgtljsxmkex3e7s7f4qquuqkv5lv4cv3uxzwqr@pgokpejfsyef
|
|
Implement support for remapping the HDP aperture registers for
NBIO 7.11.
Reviewed-by: Lang Yu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Implement support for setting up the VCN doorbell range for
NBIO 7.11.
Reviewed-by: Lang Yu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Forwarding to v6.6-rc1.
Signed-off-by: Thomas Zimmermann <[email protected]>
|
|
On some APU systems, there is no atom context and so the
atom_context struct is null.
Add a check to the VBIOS_INFO branch of amdgpu_info_ioctl
to handle this case, returning all zeroes.
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: David Francis <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
So driver doesn't generate incorrect message until
the new format is settled down for aqua_vanjaram
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Yang Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Needed for HDP flush to work correctly.
Reviewed-by: Timmy Tsai <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This matches the behavior for soc15 and nv.
Acked-by: Christian König <[email protected]>
Reviewed-by: Timmy Tsai <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit 70e64c4d522b732e31c6475a3be2349de337d321.
Since, we now have an actual fix for this issue, we can get rid of this
workaround as it can cause pin failures if enough VRAM isn't carved out
by the BIOS.
Cc: [email protected] # 6.1+
Acked-by: Harry Wentland <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Hamza Mahfooz <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Currently, we store CU info only for a single XCC assuming
that it is the same for all XCCs. However, that may not be
true. As a result, store CU info for all XCCs. This info is
later used for CU masking.
Signed-off-by: Mukul Joshi <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This patch fixes the case where the code currently passes
absolute register address and not the reg offset, which HWS
expects, when sending the PM4 packet to set/update CWSR grace
period. Additionally, cleanup the signature of
build_grace_period_packet_info function as it no longer needs
the inst parameter.
Signed-off-by: Mukul Joshi <[email protected]>
Reviewed-by: Jonathan Kim <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
On some APU systems, there is no atom context and so the
atom_context struct is null.
Add a check to the VBIOS_INFO branch of amdgpu_info_ioctl
to handle this case, returning all zeroes.
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: David Francis <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Create a module option to disable soft recoveries on amdgpu, making
every recovery go through the device reset path. This option makes
easier to force device resets for testing and debugging purposes.
Signed-off-by: André Almeida <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Hamza Mahfooz <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Merge all developer debug options available as separated module
parameters in one, making it obvious that are for developers.
Drop the obsolete module options in favor of the new ones.
Signed-off-by: André Almeida <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Hamza Mahfooz <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
gc info usage misses type conversion.
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Yifan Zhang <[email protected]>
Reviewed-by: Li Ma <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Rename KGD_MAX_QUEUES to AMDGPU_MAX_QUEUES to conform with
the naming convention followed in amdgpu_gfx.h. No functional
change.
Signed-off-by: Mukul Joshi <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Print out row, column and bank value of UMC error address for UMC v12.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
So driver doesn't generate incorrect message until
the new format is settled down for aqua_vanjaram
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Yang Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Needed for HDP flush to work correctly.
Reviewed-by: Timmy Tsai <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This matches the behavior for soc15 and nv.
Acked-by: Christian König <[email protected]>
Reviewed-by: Timmy Tsai <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit 70e64c4d522b732e31c6475a3be2349de337d321.
Since, we now have an actual fix for this issue, we can get rid of this
workaround as it can cause pin failures if enough VRAM isn't carved out
by the BIOS.
Cc: [email protected] # 6.1+
Acked-by: Harry Wentland <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Hamza Mahfooz <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Get UMC phyical channel index according to node id, umc instance and
channel instance.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Convert MCA error address to physical address and find out all pages in
one physical row.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
When reset method is not passed in reset context, look for the handler
for default reset method. On Aldebaran, default reset method for SOCs
connected to CPU over XGMI is MODE2.
Signed-off-by: Lijo Lazar <[email protected]>
Reviewed-by: Asad Kamal <[email protected]>
Tested-by: Asad Kamal <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Currently, we store CU info only for a single XCC assuming
that it is the same for all XCCs. However, that may not be
true. As a result, store CU info for all XCCs. This info is
later used for CU masking.
Signed-off-by: Mukul Joshi <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[1] Remove the irq flags setting code since pci_alloc_irq_vectors()
handles these flags.
[2] Free the msi vectors in case of error.
Signed-off-by: Ma Jun <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|