aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c
AgeCommit message (Collapse)AuthorFilesLines
2023-09-26drm/amdgpu: Restore partition mode after resetLijo Lazar1-7/+21
On a full device reset, PSP FW gets unloaded. Hence restore the partition mode by placing a new request. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Asad Kamal <[email protected]> Acked-by: Alex Deucher <[email protected]> Tested-by: Asad Kamal <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-08-16drm/amdgpu: skip xcp drm device allocation when out of drm resourceJames Zhu1-1/+12
Return 0 when drm device alloc failed with -ENOSPC in order to allow amdgpu drive loading. But the xcp without drm device node assigned won't be visiable in user space. This helps amdgpu driver loading on system which has more than 64 nodes, the current limitation. The proposal to add more drm nodes is discussed in public, which will support up to 2^20 nodes totally. kernel drm: https://lore.kernel.org/lkml/[email protected]/T/ libdrm: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305 Signed-off-by: James Zhu <[email protected]> Acked-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-07-18drm/amdgpu: use a macro to define no xcp partition caseGuchun Chen1-2/+2
~0 as no xcp partition is used in several places, so improve its definition by a macro for code consistency. Suggested-by: Christian König <[email protected]> Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-30drm/amdgpu: share drm device for pci amdgpu device with 1st partition deviceJames Zhu1-3/+6
To save render node resoure, share drm device setting for pci amdgpu device with 1st XCP partition device. Signed-off-by: James Zhu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-23drm/amdgpu: Move calculation of xcp per memory nodeLijo Lazar1-1/+3
Its value is required for finding the memory id of xcp. Fixes: d26ea1b346e7 ("drm/amdgpu: Add xcp manager num_xcp_per_mem_partition") Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: use amdxcp platform device as spatial partitionJames Zhu1-7/+5
Use amdxcp platform device as spatial partition device. -v2: remove unused variable Signed-off-by: James Zhu <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: save/restore part of xcp drm_device fieldsJames Zhu1-2/+14
Redirect xcp allocated drm_device::rdev/pdev/driver with amdgpu pci_device/drm_device setting. They need be saved before redirect and restored after unregister xcp drm_device. -v2: fix warning discarded-qualifiers Signed-off-by: James Zhu <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: Fix warningsLijo Lazar1-1/+1
Fix warnings reported by kernel test bot/smatch smatch warnings: drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.c:65 amdgpu_xcp_run_transition() error: buffer overflow 'xcp_mgr->xcp' 8 <= 8 Reported-by: kernel test robot <[email protected]> Reported-by: Dan Carpenter <[email protected]> Link: https://lore.kernel.org/r/[email protected]/ Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: Fix a couple of spelling mistakes in info and debug messagesColin Ian King1-1/+1
There are a couple of spelling mistakes, one in a dev_info message and the other in a dev_debug message. Fix them. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: route ioctls on primary node of XCPs to primary deviceShiwu Zhang1-0/+1
During XCP init, unlike the primary device, there is no amdgpu_device attached to each XCP's drm_device In case that user trying to open/close the primary node of XCP drm_device this rerouting is to solve the NULL pointer issue causing by referring to any member of the amdgpu_device BUG: unable to handle page fault for address: 0000000000020c80 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page Oops: 0002 [#1] PREEMPT SMP NOPTI Call Trace: <TASK> lock_timer_base+0x6b/0x90 try_to_del_timer_sync+0x2b/0x80 del_timer_sync+0x29/0x40 flush_delayed_work+0x1c/0x50 amdgpu_driver_open_kms+0x2c/0x280 [amdgpu] drm_file_alloc+0x1b3/0x260 [drm] drm_open+0xaa/0x280 [drm] drm_stub_open+0xa2/0x120 [drm] chrdev_open+0xa6/0x1c0 Signed-off-by: Shiwu Zhang <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: Add memory partition id to amdgpu_vmPhilip Yang1-0/+3
If xcp_mgr is initialized, add mem_id to amdgpu_vm structure to store memory partition number when creating amdgpu_vm for the xcp. The xcp number is decided when opening the render device, for example /dev/dri/renderD129 is xcp_id 0, /dev/dri/renderD130 is xcp_id 1. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: Add xcp manager num_xcp_per_mem_partitionPhilip Yang1-0/+1
Used by KFD to check memory limit accounting. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: update ref_cnt before ctx freeJames Zhu1-0/+16
Update ref_cnt before ctx free. Signed-off-by: James Zhu <[email protected]> Acked-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: add partition scheduler list updateJames Zhu1-0/+2
Add partition scheduler list update in late init and xcp partition mode switch. Signed-off-by: James Zhu <[email protected]> Acked-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: find partition ID when open deviceJames Zhu1-0/+29
Find partition ID when open device from render device minor. Signed-off-by: Christian König <[email protected]> Signed-off-by: James Zhu <[email protected]> Reviewed-and-tested-by: Philip Yang<[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: support partition drm devicesJames Zhu1-1/+58
Support partition drm devices on GC_HWIP IP_VERSION(9, 4, 3). This is a temporary solution and will be superceded. Signed-off-by: Christian König <[email protected]> Signed-off-by: James Zhu <[email protected]> Reviewed-and-tested-by: Philip Yang<[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: Move initialization of xcp before kfdLijo Lazar1-9/+7
After partition switch, fill all relevant xcp information before kfd starts initialization. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: Add callback to fill xcp memory idLijo Lazar1-0/+12
Add callback in xcp interface to fill xcp memory id information. Memory id is used to identify the range/partition of an XCP from the available memory partitions in device. Also, fill the id information. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: Add utility functions for xcpLijo Lazar1-0/+12
Add utility functions to get details of xcp and iterate through available xcps. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: Use transient mode during xcp switchLijo Lazar1-3/+15
During partition switch, keep the state as transient mode. Fetch the latest state if switch fails. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: Add flags for partition mode queryLijo Lazar1-3/+5
It's not required to take lock on all cases while querying partition mode. Querying partition mode during KFD init process doesn't need to take a lock. Init process after a switch will already be happening under lock. Control the behaviour by adding flags to xcp_query_partition_mode. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdgpu: Add initial version of XCP routinesLijo Lazar1-0/+244
Within a device, an accelerator core partition can be constituted with different IP instances. These partitions are spatial in nature. Number of partitions which can exist at the same time depends on the 'partition mode'. Add a manager entity which is responsible for switching between different partition modes and maintaining partitions. It is also responsible for suspend/resume of different partitions. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>