Age | Commit message | Author | Files | Lines |
|
This change guarantees that gfxoff is allowed before moving further in the
s2idle sequence, to add more reliability to gfxoff in the amdgpu IPs'
suspend flow.
Signed-off-by: Harsh Jain <harsh.jain@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Change the parameter type of amdgpu_gfx_cp_init_microcode to fix a
compiler warning.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add a common function to initialize CP-related microcode.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add debugfs interface to log GFXOFF statistics:
- Read amdgpu_gfxoff_count to get the total GFXOFF entry count at the
time of query since system power-up
- Write 1 to amdgpu_gfxoff_residency to start logging, and 0 to stop.
Read it to get average GFXOFF residency % multiplied by 100
during the last logging interval.
Both features are designed to keep their values persistent across
suspends.
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In the multi-container use case, reset time is important, so skip ring
tests and the CP halt wait during IP suspend for reset, as they are
going to fail and cost more time during reset.
v2: add a hang flag to indicate the reset comes from a job timeout,
skip ring test and cp halt wait in this case
v3: move hang flag to adev
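A minimal sketch of the resulting check; all names below are assumptions
rather than the actual patch:
  #include <stdbool.h>

  struct example_adev {
          bool in_reset;   /* true while a GPU reset is in progress */
          bool job_hang;   /* v3: flag on adev, set when the reset comes from a job timeout */
  };

  int example_halt_cp_and_wait(struct example_adev *adev);   /* normal suspend path */

  int example_cp_suspend(struct example_adev *adev)
  {
          if (adev->in_reset && adev->job_hang)
                  return 0;   /* ring test / CP halt wait would only fail and waste reset time */

          return example_halt_cp_and_wait(adev);
  }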
Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Enable mes to access registers.
v2: squash mes sched ring enablement flag
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fix alignment problems reported by zuul for the
commit b07d1d73b09e ("drm/amd/amdgpu: Enable high priority gfx queue")
Fixes: b07d1d73b09e ("drm/amd/amdgpu: Enable high priority gfx queue")
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Starting from SIENNA CICHLID, the ASIC supports two gfx pipes, so enable
two graphics queues, one on each pipe: pipe0 queue0 is the normal
priority queue and pipe1 queue0 is the high priority queue.
Only one queue per pipe is visible to SPI; SPI looks at the priority
value assigned to CP_GFX_HQD_QUEUE_PRIORITY in each queue's
HQD/MQD.
Contexts created with AMDGPU_CTX_PRIORITY_HIGH submit their jobs
to the high priority queue on GFX pipe1. The LP workload could be starved
if an HP workload is always available.
v2:
- remove unnecessary check(Nirmoy)
- make pipe1 hardware support a separate patch(Nirmoy)
- remove duplicate code(Shashank)
- add CSA support for second gfx pipe(Alex)
v3(Christian):
- fix incorrect indentation
- merge COMPUTE and GFX switch cases as both call the same function.
v4:
- rebase w/ latest code base
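As a rough illustration of the MQD side of this; the struct, field name and
priority encoding below are assumptions, not the gfx_v10 code:
  struct example_gfx_mqd {
          unsigned int cp_gfx_hqd_queue_priority;   /* the value SPI compares across pipes */
  };

  static void example_init_gfx_mqd(struct example_gfx_mqd *mqd, int pipe)
  {
          /* pipe0 queue0: normal priority queue, pipe1 queue0: high priority queue */
          mqd->cp_gfx_hqd_queue_priority = (pipe == 1) ? 15 : 0;   /* encoding is assumed */
  }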
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Adjust the sequence for ras late init and separate ras reset error status
from query status.
v2: squash in fix from Candice
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
It's been over a decade since this was actually used for more than ring and
IB tests. Just use the static register directly where needed and nuke the
now useless infrastructure.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Lang Yu <Lang.Yu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Since the MES KIQ has been taken over by the MES scheduler, the driver
can't directly use the MES KIQ to unmap queues; it has to use the MES
scheduler API to unmap the legacy queue.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
To make kgq/kcq and mes queues co-exist, kiq needs to take charge
of all queues.
Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This post-op should be a pre-op so that we do not pass -1 as the bit
number to test_bit(). The current code will loop downwards from 63 to
-1. After changing to a pre-op, it loops from 63 to 0.
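For illustration only (a standalone snippet, not the driver code):
  #include <stdio.h>

  int main(void)
  {
          int i = 64;

          while (i-- >= 0)      /* post-op: body runs for 63 .. 0 and once more for -1 */
                  ;             /* that last pass is where test_bit(-1, ...) happened */
          printf("post-op final value: %d\n", i);   /* -2 */

          i = 64;
          while (--i >= 0)      /* pre-op: body runs for 63 .. 0 only */
                  ;
          printf("pre-op final value: %d\n", i);    /* -1 */

          return 0;
  }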
Fixes: 71c37505e7ea ("drm/amdgpu/gfx: move more common KIQ code to amdgpu_gfx.c")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Remove redundant calls of amdgpu_ras_block_late_fini in gfx ras block.
Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
1. Move the ras block instance member variables from the specific
xxx_ras_fini functions to the general ras_fini call.
2. Function calls inside the modules only use the parameters
passed from xxx_ras_fini instead of ras block instance
members.
Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Modify the .ras_fini function pointer parameter so that
redundant intermediate calls in some ras blocks can be
removed.
Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
1. Move accesses to ras block instance members from the modules'
internal functions up to the top-level xxx_ras_late_init callers.
2. The modules' internal functions only use the parameter variables
of xxx_ras_late_init instead of ras block instance members.
Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Modify the .ras_late_init function pointer parameter so that
redundant intermediate calls in some ras blocks can be removed.
Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Optimize amdgpu_gfx_ras_late_init/amdgpu_gfx_ras_fini function code.
Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
1. Modify the gfx block to fit the unified ras block data and ops.
2. Change amdgpu_gfx_ras_funcs to amdgpu_gfx_ras, and remove the _funcs suffix from the corresponding variable names.
3. Remove the const qualifier from the gfx ras variables so that the gfx ras block can be inserted into the amdgpu device ras block linked list.
4. Invoke the amdgpu_ras_register_ras_block function to register the gfx ras block into the amdgpu device ras block linked list.
5. Remove the redundant gfx code in amdgpu_ras.c after switching to the unified ras block.
6. Fill in the unified ras block .name, .block, .ras_late_init and .ras_fini for all gfx versions. If .ras_late_init and .ras_fini are defined by the selected gfx version, the defined functions take effect; if not, they default to amdgpu_gfx_ras_late_init and amdgpu_gfx_ras_fini.
Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Those implementation details (whether swsmu is supported, whether some
ppt_funcs are supported, accessing internal statistics, ...) should be kept
internal. It's not good practice, and even error-prone, to expose
implementation details.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In the rare event that a GFX IP suspend coincides with an s0ix entry, don't
schedule delayed work; instead, signal PMFW immediately to allow GFXOFF
entry. GFXOFF is a prerequisite for s0ix entry. PMFW needs to be
signaled about the GFXOFF status before the amd-pmc module passes the OS HINT
to PMFW telling it that everything is ready for a safe s0ix entry.
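A minimal sketch of the idea, with assumed names rather than the driver's API:
  #include <stdbool.h>

  struct example_adev { bool in_s0ix; };

  void example_signal_gfxoff_now(struct example_adev *adev);        /* tell PMFW immediately */
  void example_arm_gfxoff_delayed_work(struct example_adev *adev);  /* usual delayed path */

  static void example_gfx_suspend(struct example_adev *adev)
  {
          if (adev->in_s0ix)
                  example_signal_gfxoff_now(adev);   /* GFXOFF must be in place before the OS HINT */
          else
                  example_arm_gfxoff_delayed_work(adev);
  }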
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1712
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
|
|
schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.
This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).
To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.
v2:
* Use cancel_delayed_work_sync & mutex_trylock instead of
mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
v4:
* Fix race condition between amdgpu_gfx_off_ctrl incrementing
adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
checking for it to be 0 (Evan Quan)
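A simplified sketch of the resulting transition handling; the struct, field
and helper names are assumptions and the real function carries additional
checks:
  #include <linux/types.h>
  #include <linux/mutex.h>
  #include <linux/workqueue.h>
  #include <linux/jiffies.h>

  struct example_adev {
          struct mutex gfx_off_mutex;
          unsigned int gfx_off_req_count;           /* number of "keep GFXOFF disabled" requests */
          struct delayed_work gfx_off_delay_work;   /* enables GFXOFF in HW when it fires */
  };

  void example_disable_gfxoff_in_hw(struct example_adev *adev);   /* assumed helper */

  /* Only the 0 -> 1 and 1 -> 0 transitions of the disable count touch the
   * delayed work, so it can never fire while GFXOFF is meant to stay disabled.
   * The delayed work itself must not take gfx_off_mutex (v3 made it lock-free),
   * otherwise the cancel_delayed_work_sync() below could deadlock. */
  void example_gfx_off_ctrl(struct example_adev *adev, bool enable)
  {
          mutex_lock(&adev->gfx_off_mutex);

          if (enable) {
                  if (adev->gfx_off_req_count && --adev->gfx_off_req_count == 0)
                          schedule_delayed_work(&adev->gfx_off_delay_work,
                                                msecs_to_jiffies(100));
          } else {
                  if (adev->gfx_off_req_count++ == 0) {
                          cancel_delayed_work_sync(&adev->gfx_off_delay_work);
                          example_disable_gfxoff_in_hw(adev);
                  }
          }

          mutex_unlock(&adev->gfx_off_mutex);
  }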
Cc: stable@vger.kernel.org
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> # v3
Acked-by: Christian König <christian.koenig@amd.com> # v3
Signed-off-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Delete ras_if->name in the RAS ctx structure and remove related lines.
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Only clear RAS error counters if persistent EDC harvesting is not supported.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
gfx ras is only available in certain IP generations.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
No need to have special handling for swSMU supported ASICs.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Allow separate rings to share the same scheduler score.
No functional change.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The KIQ ring is operated by KFD as well as amdgpu.
KFD uses the kiq lock; we should do the same from the amdgpu side
as well.
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When the recovery thread has begun a GPU reset, no other threads should
access the hardware, otherwise the system may randomly hang.
v2 (chk): rewritten from scratch, use trylock and lockdep instead of
hand wiring the logic.
v3: add in_irq check
v4: change to check in_task
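A rough sketch of the trylock idea with assumed names (the actual patch also
relies on lockdep annotations):
  #include <linux/types.h>
  #include <linux/rwsem.h>
  #include <linux/preempt.h>   /* in_task() */

  struct example_adev { struct rw_semaphore reset_sem; };

  /* Returns true when hardware access should be skipped: in task context,
   * a failed read-trylock means a recovery thread holds the lock for write. */
  static bool example_skip_hw_access(struct example_adev *adev)
  {
          if (!in_task())
                  return false;   /* IRQ/atomic context is handled separately (v3/v4 notes) */

          if (down_read_trylock(&adev->reset_sem)) {
                  up_read(&adev->reset_sem);
                  return false;   /* no reset in progress */
          }

          return true;            /* a reset is running, do not touch the hardware */
  }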
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When connected to a host via xGMI, system fatal errors may trigger a
warm reset, and the driver has no chance to query the EDC status before
the reset. Therefore, in this case, the driver should harvest the previous
error logging registers during boot, instead of only resetting them.
v2:
1. An IP's ras_manager object is created when its ras feature is enabled,
so change to query the EDC status after amdgpu_ras_late_init is called
2. change to enable the watchdog timer after finishing gfx EDC init
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Hawking Zhang <hawking.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
For high priority compute to work properly we need to enable
wave limiting on gfx pipe. Wave limiting is done through writing
into mmSPI_WCL_PIPE_PERCENT_GFX register. Enable only one high
priority compute queue to avoid race condition between multiple
high priority compute queues writing that register simultaneously.
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The gfx_state_change_set() function supports setting the GFX power
state to D0/D3.
v2: make sure to register callback (Alex)
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The new amdgpu_gfx_state_change_set() function supports setting the GFX
power state to D0/D3.
v2: squash in warning fix (Alex)
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The number of compute queues is configurable with the module parameter
num_kcq. amdgpu_gfx_is_high_priority_compute_queue was setting the first
4 queues to high priority, leaving a NULL drm scheduler in
adev->gpu_sched[hw_ip]["normal_prio"].sched if num_kcq < 5.
This patch fixes that by alternating compute queue priority between
normal and high priority.
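Illustration of the alternating scheme; the helper name and exact mapping are
assumptions:
  #include <stdbool.h>

  /* Alternate priorities so that both levels keep at least one queue (and
   * therefore at least one drm scheduler) even when num_kcq < 5:
   * even queue index -> normal priority, odd queue index -> high priority. */
  static bool example_is_high_priority_compute_queue(unsigned int queue_index)
  {
          return (queue_index % 2) != 0;
  }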
Fixes: 33abcb1f5a1719b1c (drm/amdgpu: set compute queue priority at mqd_init)
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add a helper so we can set per-ASIC default values. Also,
the module parameter is currently clamped to 8, but clamp it
per ASIC in case some ASICs have different limits in the
future. Enable the option on gfx6 and gfx7 as well for consistency.
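A rough sketch of such a helper, with assumed names and an assumed per-ASIC
limit:
  /* -1 (auto) and out-of-range values fall back to the ASIC limit instead
   * of a hard-coded 8. */
  static int example_get_num_kcq(int requested, int asic_max)
  {
          if (requested < 0 || requested > asic_max)
                  return asic_max;
          return requested;
  }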
Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
At this point the ASIC has already been reset by the HW/PSP,
so the HW is not in a proper state to be configured for suspend;
some blocks might even be gated, so it is best to avoid touching it.
v2: Rename in_dpc to more meaningful name
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use dev_xxx instead of DRM_xxx/pr_xxx to indicate which device
of a hive the message is for.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If other threads hold the reset lock, recovery will
fail to try_lock. Therefore we introduce the atomics hive->in_reset
and adev->in_gpu_reset, to avoid re-entering GPU recovery.
v2:
drop "? true : false" in the definition of amdgpu_in_reset
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The whole approach wasn't thought through till the end.
We already had a reset lock like this in the past and it caused the same problems as this one.
Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary.
This reverts commit df9c8d1aa278c435c30a69b8f2418b4a52fcb929.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
On Navi1x, the SPM golden settings are lost after GFXOFF
enter/exit, so reconfigure the golden settings after GFXOFF
exit.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
what:
the MQD save and restore of KCQs (kernel compute queues)
cost lots of clocks during world switch, which heavily impacts
multi-VF performance
how:
introduce a parameter to control the number of KCQs, to avoid
a performance drop if no kernel compute queue is needed
notes:
this parameter only affects gfx 8/9/10
v2:
refine namings
v3:
choose queues for each ring, trying to spread them across pipes evenly
v4:
fix indentation
some cleanups in gfx_compute_queue_acquire()
v5:
further fixes on indentation
more cleanups in gfx_compute_queue_acquire()
TODO:
in the future we will let the hypervisor driver set this parameter
automatically, so there is no need for the user to configure it through
modprobe in the virtual machine
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When the GPU hangs, the driver has multiple paths to enter
amdgpu_device_gpu_recover; the atomics adev->in_gpu_reset and
hive->in_reset are used to avoid re-entering GPU recovery.
During GPU reset and resume, it is unsafe for other threads to access the
GPU, which may cause the GPU reset to fail. Therefore the new rw_semaphore
adev->reset_sem is introduced, which protects the GPU from being accessed
by external threads during recovery.
v2:
1. add rwlock for some ioctls, debugfs and the file-close function.
2. change to use dqm->is_resetting and dqm_lock for protection in the kfd
driver.
3. remove try_lock and change adev->in_gpu_reset to an atomic, to avoid
re-entering GPU recovery for the same GPU hang.
v3:
1. change back to using adev->reset_sem to protect the kfd callback
functions, because dqm_lock couldn't protect all code paths, for example
free_mqd must be called outside of dqm_lock:
[ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019
[ 1230.177221] Call Trace:
[ 1230.178249] dump_stack+0x98/0xd5
[ 1230.179443] amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu]
[ 1230.180673] gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu]
[ 1230.181882] amdgpu_gart_unbind+0xa9/0xe0 [amdgpu]
[ 1230.183098] amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu]
[ 1230.184239] ? ttm_bo_put+0x171/0x5f0 [ttm]
[ 1230.185394] ttm_tt_unbind+0x21/0x40 [ttm]
[ 1230.186558] ttm_tt_destroy.part.12+0x12/0x60 [ttm]
[ 1230.187707] ttm_tt_destroy+0x13/0x20 [ttm]
[ 1230.188832] ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm]
[ 1230.189979] ttm_bo_put+0x1be/0x5f0 [ttm]
[ 1230.191230] amdgpu_bo_unref+0x1e/0x30 [amdgpu]
[ 1230.192522] amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu]
[ 1230.193833] free_mqd+0x25/0x40 [amdgpu]
[ 1230.195143] destroy_queue_cpsch+0x1a7/0x270 [amdgpu]
[ 1230.196475] pqm_destroy_queue+0x105/0x260 [amdgpu]
[ 1230.197819] kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu]
[ 1230.199154] kfd_ioctl+0x277/0x500 [amdgpu]
[ 1230.200458] ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu]
[ 1230.201656] ? tomoyo_file_ioctl+0x19/0x20
[ 1230.202831] ksys_ioctl+0x98/0xb0
[ 1230.204004] __x64_sys_ioctl+0x1a/0x20
[ 1230.205174] do_syscall_64+0x5f/0x250
[ 1230.206339] entry_SYSCALL_64_after_hwframe+0x49/0xbe
2. remove try_lock and introduce atomic hive->in_reset, to avoid
re-entering GPU recovery.
v4:
1. remove an unnecessary whitespace change in kfd_chardev.c
2. remove comment codes in amdgpu_device.c
3. add more detailed comment in commit message
4. define a wrap function amdgpu_in_reset
v5:
1. Fix some style issues.
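A simplified sketch of how the new lock and atomics interact (names
shortened; the real change touches many more paths):
  #include <linux/rwsem.h>
  #include <linux/atomic.h>

  struct example_adev {
          struct rw_semaphore reset_sem;   /* readers: external HW access paths */
          atomic_t in_gpu_reset;           /* guards against re-entering recovery */
  };

  static long example_ioctl_touch_hw(struct example_adev *adev)
  {
          long r;

          down_read(&adev->reset_sem);     /* blocks while a recovery is running */
          r = 0;                           /* ... hardware access goes here ... */
          up_read(&adev->reset_sem);
          return r;
  }

  static int example_gpu_recover(struct example_adev *adev)
  {
          if (atomic_cmpxchg(&adev->in_gpu_reset, 0, 1))
                  return 0;                /* another thread already handles this hang */

          down_write(&adev->reset_sem);    /* fence off all external HW access */
          /* ... reset and resume the ASIC ... */
          up_write(&adev->reset_sem);

          atomic_set(&adev->in_gpu_reset, 0);
          return 0;
  }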
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com>
Suggested-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add interface for SMU12 device, used by UMR.
v2: fix code style
Signed-off-by: Jinzhou.Su <Jinzhou.Su@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Rename it to amdgpu_queue_mask_bit_to_set_resource_bit() to be more
specific about its functionality. KFD will use it later.
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use a common method to set the queue mask before setting the kiq resources.
The queue mask value must be suitable for the designated format.
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Wait for the oldest sequence on the ring
to be signaled in order to make sure there
will be no command overrun.
v2: fix coding style and remove abs operation
v3: remove the initialization of variable r
Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
With the current KIQ register read method, there is a race condition
when multiple clients use the KIQ to read registers at the same time,
as in the example below:
1. client-A starts to read REG-0 through the KIQ
2. client-A polls seqno-0
3. client-B starts to read REG-1 through the KIQ
4. client-B polls seqno-1
5. the KIQ completes both read operations
6. client-A reads the register value from the wb buffer and
gets the REG-1 value
Therefore, use amdgpu_device_wb_get() to request a reg_val_offs
for each KIQ register read.
v2: fix the error remove
v3: fix the print typo
v4: remove unused variables
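A simplified sketch of the per-read writeback slot; the helper names are
assumptions:
  struct example_adev {
          unsigned int *wb;   /* writeback buffer, one slot per dword */
  };

  int example_wb_get(struct example_adev *adev, unsigned int *offs);    /* allocate a slot */
  void example_wb_free(struct example_adev *adev, unsigned int offs);
  int example_kiq_submit_rreg(struct example_adev *adev, unsigned int reg,
                              unsigned int offs);                       /* emit read + fence wait */

  /* Each read owns a private writeback slot, so concurrent readers can no
   * longer pick up each other's result (the step 6 failure above). */
  static int example_kiq_rreg(struct example_adev *adev, unsigned int reg,
                              unsigned int *val)
  {
          unsigned int reg_val_offs;
          int r;

          r = example_wb_get(adev, &reg_val_offs);
          if (r)
                  return r;

          r = example_kiq_submit_rreg(adev, reg, reg_val_offs);
          if (!r)
                  *val = adev->wb[reg_val_offs];

          example_wb_free(adev, reg_val_offs);
          return r;
  }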
Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We don't want a GPU scheduler for this ring.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Generate each HW IP's sched_list in amdgpu_ring_init() instead of
amdgpu_ctx.c. This makes amdgpu_ctx_init_compute_sched(),
ring.has_high_prio and amdgpu_ctx_init_sched() unnecessary.
This patch also stores the sched_list for all HW IPs in one big
array in struct amdgpu_device, which makes amdgpu_ctx_init_entity()
much leaner.
v2:
fix a coding style issue
do not use drm hw_ip const to populate amdgpu_ring_type enum
v3:
remove ctx reference and move sched array and num_sched to a struct
use num_scheds to detect uninitialized scheduler list
v4:
use array_index_nospec for user space controlled variables
fix possible checkpatch.pl warnings
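A rough sketch of the resulting layout and lookup, with assumed names and
sizes:
  #include <linux/nospec.h>

  struct drm_gpu_scheduler;   /* from drm/gpu_scheduler.h */

  #define EXAMPLE_HW_IP_NUM   9   /* assumed sizes */
  #define EXAMPLE_PRIO_NUM    4
  #define EXAMPLE_MAX_RINGS   8

  struct example_sched_list {
          struct drm_gpu_scheduler *sched[EXAMPLE_MAX_RINGS];
          unsigned int num_scheds;   /* 0 == scheduler list not initialized */
  };

  struct example_adev {
          struct example_sched_list gpu_sched[EXAMPLE_HW_IP_NUM][EXAMPLE_PRIO_NUM];
  };

  /* hw_ip/hw_prio come from user space, hence array_index_nospec() (v4). */
  static struct example_sched_list *
  example_ctx_get_scheds(struct example_adev *adev, unsigned int hw_ip,
                         unsigned int hw_prio)
  {
          hw_ip = array_index_nospec(hw_ip, EXAMPLE_HW_IP_NUM);
          hw_prio = array_index_nospec(hw_prio, EXAMPLE_PRIO_NUM);
          return &adev->gpu_sched[hw_ip][hw_prio];
  }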
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|