aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
AgeCommit message (Collapse)AuthorFilesLines
2023-08-09drm/amdgpu: Clean up errors in mxgpu_vi.cRan Sun1-1/+1
Fix the following errors reported by checkpatch: ERROR: spaces required around that '-=' (ctx:WxV) Signed-off-by: Ran Sun <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-10-18Revert "drm/amdgpu: let mode2 reset fallback to default when failure"Victor Zhao1-1/+0
This reverts commit dac6b80818ac2353631c5a33d140d8d5508e2957. This commit reverted the AMDGPU_SKIP_MODE2_RESET as it conflicts with the original design of reset handler. Will redesign it. Fixes: dac6b80818ac23 ("drm/amdgpu: let mode2 reset fallback to default when failure") Signed-off-by: Victor Zhao <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-08-16drm/amdgpu: let mode2 reset fallback to default when failureVictor Zhao1-0/+1
- introduce AMDGPU_SKIP_MODE2_RESET flag - let mode2 reset fallback to default reset method if failed v2: move this part out from the asic specific part Signed-off-by: Victor Zhao <[email protected]> Acked-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-07-13drm/amdgpu: support reset flag set for gpu resetLikun Gao1-2/+10
Move reset_context out of gpu recover function to make it configurable for different reset purpose. For the reset way of call gpu_recovery sysfs, force to use full reset method. Otherwise, try soft reset by default if the related ASIC supportted, if soft reset failed, will use full reset. Signed-off-by: Likun Gao <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-06-10drm/amdgpu: Rename amdgpu_device_gpu_recover_imp back to ↵Andrey Grodzovsky1-1/+1
amdgpu_device_gpu_recover We removed the wrapper that was queueing the recover function into reset domain queue who was using this name. Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-02-25Merge tag 'drm-misc-next-2022-02-23' of ↵Dave Airlie1-3/+8
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v5.18: UAPI Changes: Cross-subsystem Changes: - Split out panel-lvds and lvds dt bindings . - Put yes/no on/off disabled/enabled strings in linux/string_helpers.h and use it in drivers and tomoyo. - Clarify dma_fence_chain and dma_fence_array should never include eachother. - Flatten chains in syncobj's. - Don't double add in fbdev/defio when page is already enlisted. - Don't sort deferred-I/O pages by default in fbdev. Core Changes: - Fix missing pm_runtime_put_sync in bridge. - Set modifier support to only linear fb modifier if drivers don't advertise support. - As a result, we remove allow_fb_modifiers. - Add missing clear for EDID Deep Color Modes in drm_reset_display_info. - Assorted documentation updates. - Warn once in drm_clflush if there is no arch support. - Add missing select for dp helper in drm_panel_edp. - Assorted small fixes. - Improve fb-helper's clipping handling. - Don't dump shmem mmaps in a core dump. - Add accounting to ttm resource manager, and use it in amdgpu. - Allow querying the detected eDP panel through debugfs. - Add helpers for xrgb8888 to 8 and 1 bits gray. - Improve drm's buddy allocator. - Add selftests for the buddy allocator. Driver Changes: - Add support for nomodeset to a lot of drm drivers. - Use drm_module_*_driver in a lot of drm drivers. - Assorted small fixes to bridge/lt9611, v3d, vc4, vmwgfx, mxsfb, nouveau, bridge/dw-hdmi, panfrost, lima, ingenic, sprd, bridge/anx7625, ti-sn65dsi86. - Add bridge/it6505. - Create DP and DVI-I connectors in ast. - Assorted nouveau backlight fixes. - Rework amdgpu reset handling. - Add dt bindings for ingenic,jz4780-dw-hdmi. - Support reading edid through aux channel in ingenic. - Add a drm driver for Solomon SSD130x OLED displays. - Add simple support for sharp LQ140M1JW46. - Add more panels to nt35560. Signed-off-by: Dave Airlie <[email protected]> From: Maarten Lankhorst <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-02-09drm/amdgpu: Rework reset domain to be refcounted.Andrey Grodzovsky1-2/+4
The reset domain contains register access semaphor now and so needs to be present as long as each device in a hive needs it and so it cannot be binded to XGMI hive life cycle. Adress this by making reset domain refcounted and pointed by each member of the hive and the hive itself. v4: Fix crash on boot witrh XGMI hive by adding type to reset_domain. XGMI will only create a new reset_domain if prevoius was of single device type meaning it's first boot. Otherwsie it will take a refocunt to exsiting reset_domain from the amdgou device. Add a wrapper around reset_domain->refcount get/put and a wrapper around send to reset wq (Lijo) Signed-off-by: Andrey Grodzovsky <[email protected]> Acked-by: Christian König <[email protected]> Link: https://www.spinics.net/lists/amd-gfx/msg74121.html
2022-02-09drm/amd/virt: For SRIOV send GPU reset directly to TDR queue.Andrey Grodzovsky1-3/+6
No need to to trigger another work queue inside the work queue. v3: Problem: Extra reset caused by host side FLR notification following guest side triggered reset. Fix: Preven qeuing flr_work from mailbox irq if guest already executing a reset. Suggested-by: Liu Shaoyun <[email protected]> Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Liu Shaoyun <[email protected]> Link: https://www.spinics.net/lists/amd-gfx/msg74114.html
2022-01-27drm/amd/amdgpu: fix spelling mistake "disbale" -> "disable"tangmeng1-1/+1
There is a spelling mistake. Fix it. Signed-off-by: tangmeng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-04-19drm/amd/amdgpu: fix spelling mistake "recieve" -> "receive"Colin Ian King1-1/+1
There is a spelling mistake in a pr_err message. Fix it. Reviewed-by: Mukesh Ojha <[email protected]> Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-11-21drm/amd/amdgpu: Remove duplicate headerBrajeswar Ghosh1-1/+0
Remove gca/gfx_8_0_sh_mask.h which is included more than once Signed-off-by: Brajeswar Ghosh <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-09-26drm/amdgpu: move more defines into amdgpu_irq.hChristian König1-2/+2
Everything that isn't related to the IH ring. Signed-off-by: Christian König <[email protected]> Reviewed-by: Huang Rui <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-08-27drm/amdgpu: cleanup GPU recovery check a bit (v2)Christian König1-1/+2
Check if we should call the function instead of providing the forced flag. v2: rebase on KFD changes (Alex) Signed-off-by: Christian König <[email protected]> Acked-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Huang Rui <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-18drm/amdgpu: rename amdgpu_gpu_recoverAlex Deucher1-1/+1
add device to the name for consistency. Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-18drm/amdgpu: rename amdgpu_program_register_sequenceAlex Deucher1-24/+24
add device for consistency with other functions in this file. Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-15drm/amdgpu: Simplify amdgpu_lockup_timeout usage.Andrey Grodzovsky1-1/+1
With introduction of amdgpu_gpu_recovery we don't need any more to rely on amdgpu_lockup_timeout == 0 for disabling GPU reset. Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-15drm/amdgpu: Add gpu_recovery parameterAndrey Grodzovsky1-1/+1
Add new parameter to control GPU recovery procedure. v2: Add auto logic where reset is disabled for bare metal and enabled for SR-IOV. Allow forced reset from debugfs. Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-06drm/amdgpu: remove nonsense const u32 cast on ARRAY_SIZE resultChristian König1-6/+6
Not sure what that should originally been good for, but it doesn't seem to make any sense any more. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-04drm/amdgpu: return error when sriov access requests get timeoutpding1-2/+4
Reported-by: Sun Gary <[email protected]> Signed-off-by: pding <[email protected]> Reviewed-by: Xiangliang Yu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-04drm/amdgpu:implement new GPU recover(v3)Monk Liu1-1/+1
1,new imple names amdgpu_gpu_recover which gives more hint on what it does compared with gpu_reset 2,gpu_recover unify bare-metal and SR-IOV, only the asic reset part is implemented differently 3,gpu_recover will increase hang job karma and mark its entity/context as guilty if exceeds limit V2: 4,in scheduler main routine the job from guilty context will be immedialy fake signaled after it poped from queue and its fence be set with "-ECANCELED" error 5,in scheduler recovery routine all jobs from the guilty entity would be dropped 6,in run_job() routine the real IB submission would be skipped if @skip parameter equales true or there was VRAM lost occured. V3: 7,replace deprecated gpu reset, use new gpu recover Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-04drm/amdgpu/virt: implement wait_reset callbacks for vi/aipding1-0/+6
Reviewed-by: Monk Liu <[email protected]> Signed-off-by: pding <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-07-14drm/amdgpu: Support passing amdgpu critical error to host via GPU Mailbox.Gavin Wan1-0/+1
This feature works for SRIOV enviroment. For non-SRIOV enviroment, the trans_error function does nothing. The error information includes error_code (16bit), error_flags(16bit) and error_data(64bit). Since there are not many errors, we keep the errors in an array and transfer all errors to Host before amdgpu initialization function (amdgpu_device_init) exit. Signed-off-by: Gavin Wan <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-24drm/amdgpu:only call flr_work under infinite timeoutMonk Liu1-6/+9
Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-24drm/amdgpu:use job* to replace voluntaryMonk Liu1-1/+1
that way we can know which job cause hang and can do per sched reset/recovery instead of all sched. Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-24drm/amdgpu:need som change on vega10 mailboxMonk Liu1-5/+5
if sriov gpu reset is invoked by job timeout, it is run in a global work-queue which is very slow and better not call msleep ortherwise it takes long time to get back CPU. so make below changes: 1: Change msleep 1 to mdelay 5 2: Ignore the ack fail from pf after time out, because VF FLR will clear ack, sometime VF FLR is done prior to the beginning of poll_ack so we can ignore this ack TODO: Put job_timedout (and the following gpu reset) in a driver thread, instead of the global work_struct. Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Xiangliang Yu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amdgpu/virt: don't check VALID bit for FLR completion messagePixel Ding1-3/+6
The interrupt after FLR is missed sometimes due to hardware reason, so guest driver get the notification of FLR completion via polling message. Then host doesn't write VALID bit to avoid sending interrupt, otherwise the completion will be handled twice. So there's a valid message without VALID bit for FLR completion, driver should handle it without checking. Signed-off-by: Pixel Ding <[email protected]> Reviewed-by: Xiangliang Yu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-03-29drm/amdgpu: switch ih handling to two levels (v3)Alex Deucher1-2/+2
Newer asics have a two levels of irq ids now: client id - the IP src id - the interrupt src within the IP v2: integrated Christian's comments. v3: fix rebase fail in SI and CIK Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Ken Wang <[email protected]> Reviewed-by: Ken Wang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-03-29drm/amdgpu/virt: fix typoXiangliang Yu1-8/+8
When send messages to hypervior, the messages format should be is idh_request, not idh_event. Signed-off-by: Xiangliang Yu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Monk Liu <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-03-29drm/amdgpu:RUNTIME flag should clr laterMonk Liu1-3/+1
this flag will get cleared by request gpu access Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Xiangliang Yu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-03-29drm/amdgpu:use work instead of delay-workMonk Liu1-19/+17
no need to use a delay work since we don't know how much time hypervisor takes on FLR, so just polling and waiting in a work. Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Xiangliang Yu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-03-29drm/amdgpu:no kiq for mailbox registers accessMonk Liu1-16/+16
Use no kiq version reg access due to: 1) better performance 2) INTR context consideration (some routine in mailbox is in INTR context e.g.xgpu_vi_mailbox_rcv_irq) Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Xiangliang Yu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-03-29drm/amdgpu:Refine handshake of mailboxKen Xue1-1/+23
Signed-off-by: Ken Xue <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Xiangliang Yu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-01-27drm/amdgpu/virt: implement VI virt operation interfacesXiangliang Yu1-0/+592
VI has asic specific virt support, which including mailbox and golden registers init. Signed-off-by: Xiangliang Yu <[email protected]> Signed-off-by: Monk Liu <[email protected]> Signed-off-by: shaoyunl <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>