aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-05-10drm/amdgpu: add amd fan ctrl mode enums.Rex Zhu1-0/+6
Add common fan enums that can be used for both powerplay and dpm. Signed-off-by: Rex Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-10drm/amd/powerplay: add more smu message on Vega10.Rex Zhu1-1/+4
Add some new SMU messages. Signed-off-by: Rex Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-10drm/amdgpu: fix dependency issueChunming Zhou6-1/+27
The problem is that executing the jobs in the right order doesn't give you the right result because consecutive jobs executed on the same engine are pipelined. In other words job B does it buffer read before job A has written it's result. Signed-off-by: Chunming Zhou <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-10drm/amd: fix init order of sched jobChunming Zhou1-1/+1
Need to increment after the fence check. Signed-off-by: Chunming Zhou <[email protected]> Reviewed-by: Junwei Zhang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-10drm/amdgpu: add some additional vega10 pci idsAlex Deucher1-0/+2
Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-10drm/amdgpu/soc15: use atomfirmware for setting bios scratch for resetAlex Deucher1-3/+3
Need to use the atomfirmware interface rather than atombios since soc15 is atomfirmware based. Reviewed-by: Chunming Zhou <[email protected]> Acked-by: Harry Wentland <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-10drm/amdgpu/atomfirmware: add function to update engine hang statusAlex Deucher2-0/+15
Update the scratch reg for when the engine is hung. Reviewed-by: Chunming Zhou <[email protected]> Acked-by: Harry Wentland <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-10drm/radeon: only warn once in radeon_ttm_bo_destroy if va list not emptyJulien Isorce1-1/+1
Encountered a dozen of exact same backtraces when mesa's pb_cache_release_all_buffers is called after that a gpu reset failed. v2: Remove superfluous error message added in v1. bug: https://bugs.freedesktop.org/show_bug.cgi?id=96271 Signed-off-by: Julien Isorce <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-05drm/amdgpu: fix mutex list null pointer referencePixel Ding1-1/+2
Fix NULL pointer reference. Signed-off-by: Pixel Ding <[email protected]> Signed-off-by: Xiangliang Yu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-05drm/amd/powerplay: fix bug sclk/mclk level can't be set on vega10.Rex Zhu1-30/+31
Signed-off-by: Rex Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-05drm/amd/powerplay: Setup sw CTF to allow graceful exit when temperature ↵Rex Zhu1-28/+45
exceeds maximum. cherry-pick from amd windows driver. Signed-off-by: Rex Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-05drm/amd/powerplay: delete dead code in powerplay.Rex Zhu2-30/+0
Signed-off-by: Rex Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-05drm/amdgpu: Use less generic enum definitionsGuenter Roeck3-26/+26
alpha:allmodconfig fails to build as follows. drivers/gpu/drm/amd/amdgpu/amdgpu.h:1006:2: error: expected identifier before '(' token drivers/gpu/drm/amd/amdgpu/amdgpu.h:1011:28: error: 'NGG_BUF_MAX' undeclared here The problem is not really the enum definition of NGG_BUF_MAX but PARAM, which happens to be defined differently for alpha and a couple of other architectures. Use less generic defines for NGG enums to solve the problem. Fixes: bce23e00f3369 ("drm/amdgpu: add NGG parameters") Cc: Christian König <[email protected]> Cc: Alex Deucher <[email protected]> Signed-off-by: Guenter Roeck <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-05drm/amdgpu/gfx9: derive tile pipes from golden settingsAlex Deucher1-1/+4
rather than hardcoding it. Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-05drm/amdgpu/gfx: drop max_gs_waves_per_vgtAlex Deucher3-3/+1
We already have this info: max_gs_threads. Drop the duplicate. Reviewed-by: Junwei Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-05drm/amd/powerplay: disable engine spread spectrum feature on Vega10.Rex Zhu1-1/+5
Vega10 atomfirmware do not have ASIC_InternalSS_Info table so disable this feature by default in driver. Signed-off-by: Rex Zhu <[email protected]> Reviewed-by: Ken Wang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-05drm/amd/powerplay: clean up code in vega10_smumgr.cRex Zhu1-56/+45
1. fix typo in print message info. 2. fix block comments's coding style. Signed-off-by: Rex Zhu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-05drm/amdgpu:fix waiting on dirty fenceMonk Liu1-0/+2
if bo->shadow is NULL (race issue:BO shadow was just released and gpu-reset kick in but BO hasn't yet) recover_vram_from_shadow won't set @next, so the following "fence=next" will wrongly use a fence pointer which may already dirty. fixing it by set next to NULL prior to recover_vram_from_shadow Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Chunming Zhou<[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-05drm/amdgpu:PTE flag should be 64 bit widthMonk Liu1-1/+1
otherwise we'll lost the high 32 bit for pte, which lead to incorrect MTYPE for vega10. Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-05drm/amd/powerplay: correct LoadLineResistance value in pptable.Rex Zhu2-4/+4
this value is used by avfs to adjust inversion voltage. Signed-off-by: Rex Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-05drm/amd/powerplay: Allow duplicate enteries in pptable.Rex Zhu1-4/+4
This is a valid configuration. Signed-off-by: Rex Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-05drm/amd/powerplay: set fan target temperature by msg on vega10.Rex Zhu1-0/+5
SMU not support FanTargetTemperature in pptable, so send msg instand. Signed-off-by: Rex Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-05drm/amd/powerplay: set soc floor voltage on boot on vega10.Rex Zhu4-0/+71
Send the VBIOS bootup VDDC as a SOC floor voltage to SMU before populating the PPTABLE. After DPM is enabled, This floor voltage will be removed. This will prevent SMC from going to Vmin upon receiving PPTable causing a violation. Signed-off-by: Rex Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-05drm/amd/powerplay: refine code in vega10_smumgr.cRex Zhu1-39/+39
1. return error code instand of -1. 2. print msg info if send msg failed Signed-off-by: Rex Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-02drm/amdgpu: Reserve 0-2 invalidation reg sets for none-amdgpu usagesShaoyun Liu1-1/+1
Firmware used reg set 2 for tlb invalidation. AMDGPU can start from reg set 3 to avoid the conflict. AMDKFD will use the reg set 0 or 1 when necesary. Signed-off-by: Shaoyun Liu <[email protected]> Reviewws-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-02drm/amdgpu/gfx9: add additional MQD initializationAlex Deucher1-0/+5
Need to properly set the ROQ space setting. Reviewed-by: monk liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-02drm/amdgpu/gfx9: fix typo in mpd initAlex Deucher1-2/+2
Using the wrong macro for soc15 register access. Reviewed-by: monk liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-02drm/amdgpu/gfx9: use actual gpu num se setting for ngg allocationAlex Deucher1-2/+1
Rather than using a hardcoded value. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-02drm/amdgpu: update revision id settings for BR/STAlex Deucher1-0/+12
Add new RIDs. Acked-by: Christian König <[email protected]> Acked-by: Alex Xie <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-01Revert "drm/amdgpu: Refactor flip into prepare submit and submit. (v3)"Michel Dänzer2-124/+29
This reverts commit cb341a319f7e66f879d69af929c3dadfc1a8f31e. The purpose of the refactor was for amdgpu_crtc_prepare/submit_flip to be used by the DC code, but that's no longer the case. Reviewed-by: Christian König <[email protected]> Signed-off-by: Michel Dänzer <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amdgpu: Make amdgpu_bo_reserve use uninterruptible waits for cleanupMichel Dänzer16-36/+36
Some of these paths probably cannot be interrupted by a signal anyway. Those that can would fail to clean up things if they actually got interrupted. Reviewed-by: Christian König <[email protected]> Signed-off-by: Michel Dänzer <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amd/powerplay: implement stop dpm task for vega10.Rex Zhu5-1/+123
Add functions to disable dpm for S3/S4. Signed-off-by: Rex Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Huang Rui <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amd/powerplay: complete disable_smc_firmware_ctf_tasks.Rex Zhu3-1/+7
Disable ctf in eventmgr to fix S3/S4 support. Signed-off-by: Rex Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Huang Rui <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amd/powerplay: add disable_smc_ctf callback in hwmgr.Rex Zhu4-1/+13
export disablesmcctf to eventmgr. need to disable temperature alert when s3/s4. otherwise, when resume back,enable temperature alert will fail. Signed-off-by: Rex Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Huang Rui <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amdgpu: fix deadlock of reservation between cs and gpu reset v2Chunming Zhou1-0/+4
the case could happen when gpu reset: 1. when gpu reset, cs can be continue until sw queue is full, then push job will wait with holding pd reservation. 2. gpu_reset routine will also need pd reservation to restore page table from their shadow. 3. cs is waiting for gpu_reset complete, but gpu reset is waiting for cs releases reservation. v2: handle amdgpu_cs_submit error path. Signed-off-by: Chunming Zhou <[email protected]> Reviewed-by: Christian König <[email protected]> Reviewed-by: Junwei Zhang <[email protected]> Reviewed-by: Monk Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amdgpu: bump version for exporting gpu info for gfx9Junwei Zhang1-1/+2
Signed-off-by: Junwei Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amdgpu: export more gpu info for gfx9Junwei Zhang4-0/+37
v2: 64-bit aligned for gpu info v3: squash in wave_front_fix Signed-off-by: Ken Wang <[email protected]> Signed-off-by: Junwei Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Qiang Yu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amdgpu: remove unused and mostly unimplemented CGS functions v2Christian König2-468/+0
Those functions are all unused and some not even implemented. v2: keep cgs_get_pci_resource, it is used by the ACP driver. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amd/powerplay: refine set pcie dpm default table on vega10.Rex Zhu1-21/+1
Signed-off-by: Rex Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amd/powerplay: disable cks by default on vega10.Rex Zhu1-1/+1
run gpu test auto reboot when enable cks right now. Signed-off-by: Rex Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amd/powerplay: correct UlvOffsetVid on Vega10.Rex Zhu1-3/+2
Signed-off-by: Rex Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amdgpu: Fix use of interruptible waitingAlex Xie1-1/+1
There is no good mechanism to handle the corresponding error. When signal interrupt happens, unpin is not called. As a result, inside AMDGPU, the statistic of pin size will be wrong. Signed-off-by: Alex Xie <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amdgpu: Fix use of interruptible waitingAlex Xie1-7/+7
Either in cgs functions or for callers of cgs functions: 1. The signal interrupt can affect the expected behaviour 2. There is no good mechanism to handle the corresponding error 3. There is no chance of deadlock in these single BO waiting 4. There is no clear benefit for interruptible waiting 5. Future caller of these functions might have same issue. Signed-off-by: Alex Xie <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amdgpu: fix NULL pointer errorChunming Zhou1-1/+3
[ 141.420491] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030 [ 141.420532] IP: [<ffffffff81579ee1>] fence_remove_callback+0x11/0x60 [ 141.420563] PGD 20a030067 [ 141.420575] PUD 2088ca067 [ 141.420587] PMD 0 [ 141.420599] Oops: 0000 [#1] SMP [ 141.420612] Modules linked in: amdgpu(OE) ttm(OE) drm_kms_helper(E) drm(E) i2c_algo_bit(E) fb_sys_fops(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) rpcsec_gss_krb5(E) nfsv4(E) nfs(E) fscache(E) eeepc_wmi(E) asus_wmi(E) sparse_keymap(E) snd_hda_codec_realtek(E) video(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) joydev(E) snd_hda_codec(E) snd_seq_midi(E) snd_seq_midi_event(E) snd_hda_core(E) snd_hwdep(E) snd_rawmidi(E) snd_pcm(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) snd_seq(E) crc32_pclmul(E) ghash_clmulni_intel(E) snd_seq_device(E) snd_timer(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) snd(E) soundcore(E) serio_raw(E) shpchp(E) i2c_piix4(E) i2c_designware_platform(E) 8250_dw(E) i2c_designware_core(E) mac_hid(E) binfmt_misc(E) [ 141.420948] nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) parport_pc(E) ppdev(E) lp(E) parport(E) autofs4(E) hid_generic(E) usbhid(E) hid(E) psmouse(E) r8169(E) ahci(E) mii(E) libahci(E) wmi(E) [ 141.421042] CPU: 14 PID: 223 Comm: kworker/14:2 Tainted: G OE 4.9.0-custom #4 [ 141.421074] Hardware name: System manufacturer System Product Name/PRIME B350-PLUS, BIOS 0606 04/06/2017 [ 141.421146] Workqueue: events amd_sched_job_timedout [amdgpu] [ 141.421169] task: ffff88020b03ba80 task.stack: ffffc900016f4000 [ 141.421193] RIP: 0010:[<ffffffff81579ee1>] [<ffffffff81579ee1>] fence_remove_callback+0x11/0x60 [ 141.421229] RSP: 0018:ffffc900016f7d30 EFLAGS: 00010202 [ 141.421250] RAX: ffff8801c049fc00 RBX: ffff8801d4d8dc00 RCX: 0000000000000000 [ 141.421278] RDX: 0000000000000001 RSI: ffff8801c049fcc0 RDI: 0000000000000000 [ 141.421307] RBP: ffffc900016f7d48 R08: 0000000000000000 R09: 0000000000000000 [ 141.421334] R10: 00000020ed512a30 R11: 0000000000000001 R12: 0000000000000000 [ 141.421362] R13: ffff880209ba4ba0 R14: ffff880209ba4c58 R15: ffff8801c055cc60 [ 141.421390] FS: 0000000000000000(0000) GS:ffff88021ef80000(0000) knlGS:0000000000000000 [ 141.421421] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 141.421443] CR2: 0000000000000030 CR3: 000000020b554000 CR4: 00000000003406e0 [ 141.421471] Stack: [ 141.421480] ffff8801d4d8dc00 ffff880209ba4c48 ffff880209ba4ba0 ffffc900016f7d78 [ 141.421513] ffffffffa0697920 ffff880209ba0000 0000000000000000 ffff880209ba2770 [ 141.421549] ffff880209ba4b08 ffffc900016f7df0 ffffffffa05ce2ae ffffffffa0509eb7 [ 141.421583] Call Trace: [ 141.421628] [<ffffffffa0697920>] amd_sched_hw_job_reset+0x50/0xb0 [amdgpu] [ 141.421676] [<ffffffffa05ce2ae>] amdgpu_gpu_reset+0x8e/0x690 [amdgpu] [ 141.421712] [<ffffffffa0509eb7>] ? drm_printk+0x97/0xa0 [drm] [ 141.421770] [<ffffffffa0698156>] amdgpu_job_timedout+0x46/0x50 [amdgpu] [ 141.421829] [<ffffffffa0696a07>] amd_sched_job_timedout+0x17/0x20 [amdgpu] [ 141.421859] [<ffffffff81095493>] process_one_work+0x153/0x3f0 [ 141.421884] [<ffffffff81095c5b>] worker_thread+0x12b/0x4b0 [ 141.421907] [<ffffffff81095b30>] ? rescuer_thread+0x350/0x350 [ 141.421931] [<ffffffff8109b423>] kthread+0xd3/0xf0 [ 141.421951] [<ffffffff8109b350>] ? kthread_park+0x60/0x60 [ 141.421975] [<ffffffff817e1ee5>] ret_from_fork+0x25/0x30 [ 141.421996] Code: ac 81 e8 a3 1f b0 ff 48 c7 c0 ea ff ff ff e9 48 ff ff ff 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 49 89 fc 53 <48> 8b 7f 30 48 89 f3 e8 73 7c 26 00 48 8b 13 48 39 d3 41 0f 95 [ 141.422156] RIP [<ffffffff81579ee1>] fence_remove_callback+0x11/0x60 [ 141.422183] RSP <ffffc900016f7d30> [ 141.422197] CR2: 0000000000000030 [ 141.433483] ---[ end trace bc0949bf7ddd6d4b ]--- if the job is reset twice, then the parent could be NULL. Signed-off-by: Chunming Zhou <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amdgpu: validate shadow before restoring from itRoger.He3-0/+34
Reviewed-by: Chunming Zhou <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Roger.He <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amdgpu: Fix use of interruptible waitingAlex Xie1-2/+2
1. The signal interrupt can affect the expected behaviour. 2. There is no good mechanism to handle the corresponding error. When signal interrupt happens, unpin is not called. As a result, inside AMDGPU, the statistic of pin size will be wrong. Signed-off-by: Alex Xie <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amdgpu: Real return value can be over-written when clean upAlex Xie1-4/+5
Signed-off-by: Alex Xie <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amdgpu: Fix use of interruptible waitingAlex Xie1-1/+1
1. The signal interrupt can affect the expected behaviour. 2. There is no good mechanism to handle the corresponding error. Signed-off-by: Alex Xie <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amdgpu: Fix use of interruptible waitingAlex Xie1-1/+1
1. The signal interrupt can affect the expected behaviour. 2. There is no good mechanism to handle the corresponding error. Signed-off-by: Alex Xie <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-28drm/amdgpu: Fix use of interruptible waitingAlex Xie1-3/+3
1. The signal interrupt can affect the expected behaviour. 2. There is no mechanism to handle the corresponding error. Signed-off-by: Alex Xie <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>