Age | Commit message | Author | Files | Lines |
|
The src isn't used any more after GART hack removal.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Keep track of relocated PDs/PTs instead of walking and checking all PDs.
v2: fix root PD handling
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]> (v1)
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
kfree() on a NULL pointer is a no-op, so checking for NULL first is redundant.
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Himanshu Jha <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
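
The same rule holds for free() in standard C, so the cleanup can be illustrated in user space. This is a minimal sketch, not the patched driver code: `struct demo_ctx` and both functions are hypothetical names introduced only for illustration.

```c
#include <stdlib.h>

/* Hypothetical container; not a real amdgpu structure. */
struct demo_ctx {
    int *entries;   /* may legitimately be NULL */
};

/* Redundant form: the NULL check adds nothing. */
void demo_ctx_fini_checked(struct demo_ctx *ctx)
{
    if (ctx->entries)
        free(ctx->entries);
    ctx->entries = NULL;
}

/* Simplified form: free(NULL) is defined to do nothing. */
void demo_ctx_fini(struct demo_ctx *ctx)
{
    free(ctx->entries);
    ctx->entries = NULL;
}
```

Both functions behave identically for NULL and non-NULL `entries`; the second simply drops the branch.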
|
|
Instead of validating all page tables when one was evicted,
track which one needs a validation.
v2: simplify amdgpu_vm_ready as well
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]> (v1)
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Except for the reference count all other members are protected
by the VM PD being reserved.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
We changed this to use an extra list a while back, but for the next
series I need a separate flag again.
v2: reorder to avoid unlocked list access
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Instead of using the vm_state use a separate flag to note
that the BO was moved.
v2: reorder patches to avoid temporary lockless access
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Allows writing data to vram via debugfs.
Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Christian König <[email protected]>
(v2): Call get_user before holding spinlock.
Signed-off-by: Alex Deucher <[email protected]>
|
|
To allocate additional space for the dynamic cu masks.
Confirmed with the hw team that we only need 1 dword
for the mask. The mask is the same for each SE so
you only need 1 dword.
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Confirmed with the hw team. It's the same for all asics.
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Those are certainly not kernel allocations; set the NO_CPU_ACCESS flag instead.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Stop checking the mapped BO itself, because that one is
certainly not a page table.
In addition, move the code into amdgpu_vm.c.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
That somehow got lost.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
sysfs is more stable and doesn't require root to access.
Signed-off-by: Kent Russell <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add 2 debugfs files, one that contains the VBIOS version, and one that
contains the VBIOS itself. These won't change after initialization,
so we can add the VBIOS version when we parse the atombios information.
This ensures that we can find out the VBIOS version, even when the dmesg
buffer fills up, and makes it easier to associate which VBIOS version is
for which GPU on mGPU configurations. Set the size to 20 characters in
case of some weird VBIOS version that exceeds the expected 17 character
format (3-8-3\0). The VBIOS dump also allows for easy debugging.
v2: Move to debugfs, clarify commit message, add VBIOS dump file
Signed-off-by: Kent Russell <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
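
The sizing argument can be checked with a small user-space sketch. It assumes the 3-8-3 format means three dash-separated fields (3 + 1 + 8 + 1 + 3 characters plus the terminating NUL, 17 bytes in total). `DEMO_VBIOS_VERSION_LEN` and `demo_store_vbios_version()` are hypothetical names, not identifiers from the driver.

```c
#include <stdio.h>

/* 20 bytes: room for the expected 17-byte string plus some slack. */
#define DEMO_VBIOS_VERSION_LEN 20

/*
 * Copy a version string into a fixed-size buffer.
 * Returns 1 if the source had to be truncated, 0 otherwise.
 * The destination is always NUL-terminated.
 */
int demo_store_vbios_version(char dst[DEMO_VBIOS_VERSION_LEN], const char *src)
{
    int n = snprintf(dst, DEMO_VBIOS_VERSION_LEN, "%s", src);

    return n >= DEMO_VBIOS_VERSION_LEN;
}
```

A placeholder string such as "AAA-BBBBBBBB-CCC" (16 characters) fits without truncation, while anything longer than 19 characters is cut off and reported.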
|
|
Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Switches the AMDGPU driver over to the TTM tracepoint and removes
our old one. Now you can enable traces before loading the module
and trace all mappings.
Signed-off-by: Tom St Denis <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
(v2): Use struct device instead of pci in trace.
|
|
Newer versions of the CP firmware require changes in how the driver
initializes the hw block.
Change the firmware name for new firmware to maintain compatibility with
older kernels.
Acked-by: Christian König <[email protected]>
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Remove a redundant identical return statement; it has no use.
Detected by CoverityScan, CID#1454586 ("Structurally dead code")
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Check memory allocation failure and return -ENOMEM in such a case.
'num_post_dep_syncobjs' still has to be set to 0 before the test in order
to have it initialized if 'amdgpu_cs_parser_fini()' is called to free
resources.
The call graph in such a case would be:
failure in amdgpu_cs_process_syncobj_out_dep()
---> error code returned by amdgpu_cs_dependencies()
--> amdgpu_cs_parser_fini() is called
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
BANK_SELECT should always be FRAGMENT_SIZE + 3 due to 8-entry (2^3)
per cache line in L2 TLB for Vega10.
v2: agd: fix warning
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Roger He <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
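
As a worked example of the relation stated above, the sketch below derives the bank select value from the fragment size. The macro and function names are hypothetical and do not correspond to register definitions in the driver; only the "+ 3" offset for 8 (2^3) entries per cache line comes from the commit message.

```c
/* log2 of the 8 entries per L2 TLB cache line mentioned above. */
#define DEMO_L2_TLB_ENTRIES_PER_LINE_LOG2 3u

/* BANK_SELECT is always FRAGMENT_SIZE + 3. */
unsigned int demo_bank_select(unsigned int fragment_size)
{
    return fragment_size + DEMO_L2_TLB_ENTRIES_PER_LINE_LOG2;
}
```

For instance, a fragment size of 9 would give a bank select of 12 under this rule.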
|
|
The function is called only once and doesn't do anything special.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Roger He <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use ttm_bo_mem_space instead of manually allocating GART space.
This allows us to evict BOs when there isn't enough GART space any more.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This isn't used since we don't map evicted BOs to GART any more.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Roger He <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
KIQ doesn't really use the GPU scheduler. The base
drivers generally use the KIQ ring directly rather than
submitting IBs. However, amdgpu_sched_hw_submission
(which defaults to 2) limits the number of outstanding
fences to 2. KFD uses the KIQ for TLB flushes and the
2 fence limit hurts performance when there are several KFD
processes running.
v2: move some expressions to one line
change KIQ sched_hw_submission to at least 16
v3: bump to 256
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
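
A minimal sketch of the "at least 256" rule from v3, assuming the limit is applied as a lower bound on top of the module parameter. `demo_ring_hw_submission()` and `DEMO_KIQ_MIN_HW_SUBMISSION` are hypothetical names; the real driver code may structure this differently.

```c
/* Lower bound for the KIQ ring, per the v3 note above. */
#define DEMO_KIQ_MIN_HW_SUBMISSION 256u

/*
 * Return the number of outstanding fences allowed on a ring.
 * Regular rings use the module parameter unchanged; the KIQ ring
 * is raised to at least DEMO_KIQ_MIN_HW_SUBMISSION.
 */
unsigned int demo_ring_hw_submission(int is_kiq, unsigned int sched_hw_submission)
{
    if (is_kiq && sched_hw_submission < DEMO_KIQ_MIN_HW_SUBMISSION)
        return DEMO_KIQ_MIN_HW_SUBMISSION;
    return sched_hw_submission;
}
```

With the default of 2, a regular ring keeps its limit of 2 while the KIQ ring gets 256; a user-supplied value above 256 is left untouched for both.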
|
|
Move the asic specific code into the IP modules.
Reviewed-by: Felix Kuehling <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Be more explicit and add comments explaining each case.
Also s/gart/GART/ in the parameter string as per Felix'
suggestion.
Reviewed-by: Felix Kuehling <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Set the shadow flag on the shadow and not the parent, always bind shadow BOs
during allocation instead of manually, use the reservation_object wrappers
to grab the lock.
This fixes a couple of issues with binding the shadow BOs as well as correctly
evicting them when memory becomes tight.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
We need a larger gart for asics that do not support GPUVM on all
engines (e.g., MM) to make sure we have enough space for all
gtt buffers in physical mode. Change the default size based on
the asic type.
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Acked-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
immediately
Virtual display uses a software timer to emulate the vsync interrupt. The
timer doesn't have high precision, so disabling vblank immediately isn't supported.
BUG: SWDEV-129274
Signed-off-by: Emily Deng <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Correctly detect system memory mappings when using CPU and don't use
huge pages for them.
Avoid incorrectly translating a physical page table GPU address when
splitting a huge page while mapping system memory.
Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The function has far more in common with drm_syncobj_find than with
any in the get/put functions.
Signed-off-by: Jason Ekstrand <[email protected]>
Acked-by: Christian König <[email protected]> (v1)
Signed-off-by: Dave Airlie <[email protected]>
|
|
git://people.freedesktop.org/~gabbayo/linux into drm-next
This is the amdkfd pull request for 4.14 merge window.
AMD has started cleaning the pipe and sending patches from their internal
development to the upstream community.
The plan as I understand it is to first get all the non-dGPU patches to
upstream and then move to upstream dGPU support.
The patches here are relevant only for Kaveri and Carrizo.
The following is a summary of the changes:
- Add new IOCTL to set a Scratch memory VA
- Update PM4 headers for new firmware that supports scratch memory
- Support image tiling mode
- Remove all uses of BUG_ON
- Various bug fixes and coding style fixes
* tag 'drm-amdkfd-next-2017-08-18' of git://people.freedesktop.org/~gabbayo/linux: (24 commits)
drm/amdkfd: Implement image tiling mode support v2
drm/amdgpu: Add kgd kfd interface get_tile_config() v2
drm/amdkfd: Adding new IOCTL for scratch memory v2
drm/amdgpu: Add kgd/kfd interface to support scratch memory v2
drm/amdgpu: Program SH_STATIC_MEM_CONFIG globally, not per-VMID
drm/amd: Update MEC HQD loading code for KFD
drm/amdgpu: Disable GFX PG on CZ
drm/amdkfd: Update PM4 packet headers
drm/amdkfd: Clamp EOP queue size correctly on Gfx8
drm/amdkfd: Add more error printing to help bringup v2
drm/amdkfd: Handle remaining BUG_ONs more gracefully v2
drm/amdkfd: Allocate gtt_sa_bitmap in long units
drm/amdkfd: Fix doorbell initialization and finalization
drm/amdkfd: Remove BUG_ONs for NULL pointer arguments
drm/amdkfd: Remove usage of alloc(sizeof(struct...
drm/amdkfd: Fix goto usage v2
drm/amdkfd: Change x==NULL/false references to !x
drm/amdkfd: Consolidate and clean up log commands
drm/amdkfd: Clean up KFD style errors and warnings v2
drm/amdgpu: Remove hard-coded assumptions about compute pipes
...
|
|
mmVGT_INDEX_TYPE has no default value, need to make sure
it's initialized when gfx is initialized.
Signed-off-by: Ken Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|