blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message	Author	Files	Lines
2019-08-02	drm/amdgpu: replace AMDGPU_RAS_UE with AMDGPU_RAS_SUCCESS	Tao Zhou	4	-4/+4
	ce can also trigger interrupt, and even both ce and ue error can be found in one ras query, distinguishing between ce and ue in interrupt handler is unnecessary. Signed-off-by: Tao Zhou <[email protected]> Suggested-by: Guchun Chen <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-08-02	drm/amdgpu: only uncorrectable error needs gpu reset	Tao Zhou	1	-1/+5
	we only read error information for correctable error in interrupt handler, gpu reset is unnecessary since there is no data lost in correctable error Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
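The reset policy described in this commit can be sketched as a single predicate. This is a minimal illustration, not the driver's code: the struct and function names below are hypothetical, and the only behaviour taken from the commit message is that correctable errors alone do not call for a GPU reset.

```c
#include <stdbool.h>

/* Hypothetical container for the counts returned by one RAS query. */
struct ras_err_counts {
    unsigned long ce_count; /* correctable errors: no data lost */
    unsigned long ue_count; /* uncorrectable errors: data may be lost */
};

/* A reset is only worthwhile when at least one uncorrectable error was seen. */
static bool needs_gpu_reset(const struct ras_err_counts *counts)
{
    return counts->ue_count > 0;
}
```

A handler built on this would log the correctable count and return without scheduling a reset when `needs_gpu_reset()` is false.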
2019-08-02	drm/amdgpu: update the calc algorithm of umc ecc error count	Tao Zhou	1	-4/+6
	the initial value of ecc error count can be adjusted Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-08-02	drm/amdgpu: implement umc ras init function	Tao Zhou	2	-0/+39
	enable umc ce interrupt and initialize ecc error count Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-08-02	drm/amdgpu: support ce interrupt in ras module	Tao Zhou	1	-4/+8
	correctable error can also trigger interrupt in some ras blocks Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-08-02	drm/amdgpu: add error address query for umc ras	Tao Zhou	2	-0/+10
	umc error address query can get ce/ue error address and clear error status Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-08-02	drm/amdgpu: apply umc_for_each_channel macro to umc_6_1	Tao Zhou	1	-56/+28
	use umc_for_each_channel to make code simpler Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-08-02	drm/amdgpu: add macro of umc for each channel	Tao Zhou	1	-0/+23
	common function for all umc versions, loop for each umc channel is a frequently used operation in umc block, define it as a macro to simplify code Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
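To show why a per-channel loop macro shortens callers, here is a hedged sketch of the general pattern. The macro shape, the instance and channel counts, and the helper names are all assumptions made for illustration; the macro actually added by this commit may look quite different.

```c
/* Hypothetical topology: 2 UMC instances with 4 channels each. */
#define UMC_NUM_INST          2
#define UMC_CHANNELS_PER_INST 4

/* Illustrative for-each macro: expands to the nested loop every caller needs. */
#define umc_for_each_channel(inst, chan)                          \
    for ((inst) = 0; (inst) < UMC_NUM_INST; (inst)++)             \
        for ((chan) = 0; (chan) < UMC_CHANNELS_PER_INST; (chan)++)

/* Counts how many (instance, channel) pairs the macro visits. */
static unsigned count_channels(void)
{
    unsigned inst, chan, n = 0;

    umc_for_each_channel(inst, chan)
        n++;
    return n;
}

/* Sums a per-channel error counter table using the same macro. */
static unsigned long sum_error_counts(unsigned counts[UMC_NUM_INST][UMC_CHANNELS_PER_INST])
{
    unsigned inst, chan;
    unsigned long total = 0;

    umc_for_each_channel(inst, chan)
        total += counts[inst][chan];
    return total;
}
```

Each caller states only what to do per channel; the iteration bounds live in one place.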
2019-08-02	drm/amdgpu: initialize new parameters and functions for amdgpu_umc structure	Tao Zhou	3	-3/+17
	add initialization for new members of amdgpu_umc structure Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-08-02	drm/amdgpu: add more parameters and functions to amdgpu_umc structure	Tao Zhou	2	-0/+15
	expose more parameters and functions of specific umc version to common umc layer, so amdgpu_umc layer and other blocks could access them Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-08-02	drm/amdgpu: remove the clear of MCA_ADDR	Tao Zhou	1	-2/+0
	clearing MCA_STATUS is enough to reset the whole MCA, writing zero to MCA_ADDR is unnecessary Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-08-02	drm/amdgpu: fix unsigned variable instance compared to less than zero	Colin Ian King	1	-1/+2
	Currently the error check on variable instance is always false because it is a uint32_t type and this is never less than zero. Fix this by making it an int type. Addresses-Coverity: ("Unsigned compared against 0") Fixes: 7d0e6329dfdc ("drm/amdgpu: update more sdma instances irq support") Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
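The class of bug fixed here is easy to reproduce in isolation. The sketch below uses a hypothetical lookup function (`find_instance` is not a driver symbol) that returns -1 on failure, and contrasts storing the result in a `uint32_t` with storing it in an `int`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lookup: returns an instance index, or -1 when nothing matches. */
static int find_instance(int client_id)
{
    return (client_id >= 0 && client_id < 8) ? client_id : -1;
}

/* Buggy variant: -1 wraps to UINT32_MAX, so the error check can never fire. */
static bool lookup_failed_unsigned(int client_id)
{
    uint32_t instance = find_instance(client_id);

    return instance < 0; /* always false for an unsigned type */
}

/* Fixed variant: a signed type preserves the negative error value. */
static bool lookup_failed_signed(int client_id)
{
    int instance = find_instance(client_id);

    return instance < 0;
}
```

Compilers can flag the buggy comparison (for example GCC's `-Wtype-limits`, enabled by `-Wextra`), which is the same condition Coverity reported.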
2019-08-02	drm/amdkfd: Extend CU mask to 8 SEs (v3)	Jay Cornwall	1	-0/+4
	Following bitmap layout logic introduced by: "drm/amdgpu: support get_cu_info for Arcturus". v2: squash in fixup for gfx_v9_0.c (Alex) v3: squash in debug print output fix Signed-off-by: Jay Cornwall <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-08-02	drm/amdgpu: support get_cu_info for Arcturus	Le Ma	1	-7/+28
	This change is because SE/SH layout on Arcturus is 8*1, different from 4*2(or 4*1) on Vega ASICs. Currently the cu bitmap array is 4x4 size, and besides the bitmap is used widely across SW stack. To mostly reduce the scale of impact, we make the cu bitmap array compatible with SE/SH layout on Arcturus. Then the store of cu bits of each shader array for Arcturus will be like below: SE0,SH0 --> bitmap[0][0] SE1,SH0 --> bitmap[1][0] SE2,SH0 --> bitmap[2][0] SE3,SH0 --> bitmap[3][0] SE4,SH0 --> bitmap[0][1] SE5,SH0 --> bitmap[1][1] SE6,SH0 --> bitmap[2][1] SE7,SH0 --> bitmap[3][1] Signed-off-by: Le Ma <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
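The SE-to-bitmap table in this commit message follows a simple index rule: with SH fixed at 0, shader engine `se` lands in `bitmap[se % 4][se / 4]`. The sketch below encodes just that rule; the struct and function names are hypothetical and the driver's own code may compute the indices differently.

```c
/* First dimension of the existing 4x4 CU bitmap array. */
#define CU_BITMAP_ROWS 4

/* Hypothetical pair of indices into bitmap[row][col]. */
struct cu_slot {
    unsigned row;
    unsigned col;
};

/* Maps an Arcturus shader engine (0..7, SH always 0) to its bitmap slot. */
static struct cu_slot arcturus_cu_slot(unsigned se)
{
    struct cu_slot slot = { se % CU_BITMAP_ROWS, se / CU_BITMAP_ROWS };

    return slot;
}
```

For example, SE5 maps to row 1, column 1, matching the `SE5,SH0 --> bitmap[1][1]` entry in the list.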
2019-08-02	drm/amdgpu: Fix pcie_bw on Vega20	Kent Russell	1	-8/+52
	The registers used for VG20 are different in that certain performance counters were split off to TXCLK3/4. Vega10/12 doesn't have this, so add a new vg20_get_pcie_usage to reflect this change. Signed-off-by: Kent Russell <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-08-02	drm/amdgpu: Add amdgpu_asic_funcs.reset_method for Vega20	Andrey Grodzovsky	1	-0/+1
	Fixes GPU reset crash. Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-08-02	drm/amdgpu: Mark KFD VRAM allocations for wipe on release	Felix Kuehling	1	-1/+1
	Memory used by KFD applications can contain sensitive information that should not be leaked to other processes. The current approach to prevent leaks is to clear VRAM at allocation time. This is not effective because memory can be reused in other ways without being cleared. Synchronously clearing memory on the allocation path also carries a significant performance penalty. Stop clearing memory at allocation time. Instead mark the memory for wipe on release. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-08-02	drm/amdgpu: Implement VRAM wipe on release	Felix Kuehling	4	-3/+56
	Wipe VRAM memory containing sensitive data when moving or releasing BOs. Clearing the memory is pipelined to minimize any impact on subsequent memory allocation latency. Use of a poison value should help debug future use-after-free bugs. When moving BOs, the existing ttm_bo_pipelined_move ensures that the memory won't be reused before being wiped. When releasing BOs, the BO is fenced with the memory fill operation, which results in queuing the BO for a delayed delete. v2: Move amdgpu_amdkfd_unreserve_memory_limit into amdgpu_bo_release_notify so that KFD can use memory that's still being cleared in the background Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
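The two VRAM commits above split the work into marking a buffer at allocation time and wiping it at release time. The following is a simplified, synchronous sketch of that idea only. The flag name, the poison value, and the struct are assumptions; the real implementation pipelines the fill on the GPU and fences the BO rather than looping on the CPU.

```c
#include <stddef.h>
#include <stdint.h>

#define BO_FLAG_WIPE_ON_RELEASE 0x1u        /* hypothetical flag bit */
#define POISON_WORD             0xDEADBEEFu /* hypothetical poison value */

/* Hypothetical stand-in for a buffer object backed by plain memory. */
struct buffer_object {
    unsigned flags;
    uint32_t *words;
    size_t n_words;
};

/* On release, overwrite marked buffers so later users cannot read old data. */
static void bo_release(struct buffer_object *bo)
{
    if (!(bo->flags & BO_FLAG_WIPE_ON_RELEASE))
        return;

    for (size_t i = 0; i < bo->n_words; i++)
        bo->words[i] = POISON_WORD;
}
```

Using a recognizable poison word instead of zero means a later read of stale memory stands out in a debugger, which is the debugging benefit the commit message mentions.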
2019-08-02	drm/amdgpu: fix double ucode load by PSP(v3)	Monk Liu	1	-21/+38
	previously the ucode loading of PSP was repeated, one executed in phase_1 init/re-init/resume and the other in fw_loading routine Avoid this double loading by clearing ip_blocks.status.hw in suspend or reset prior to the FW loading and any block's hw_init/resume v2: still do the smu fw loading since it is needed by bare-metal v3: drop the change in reinit_early_sriov, just clear all block's status.hw in the head place and set the status.hw after hw_init done is enough Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Emily Deng <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
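The guard described here, clearing a per-block `hw` status before re-init and setting it once `hw_init` has run, can be modeled in a few lines. This is a sketch under stated assumptions: the struct, the load counter, and both function names are hypothetical and stand in for the driver's much larger init paths.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical IP block: 'hw' mirrors the status.hw flag named in the commit. */
struct ip_block {
    bool hw;        /* true once hardware init (including fw load) is done */
    int load_count; /* how many times firmware was loaded, for illustration */
};

/* Initialize a block at most once per cycle; a second caller is a no-op. */
static void block_hw_init(struct ip_block *block)
{
    if (block->hw)
        return;
    block->load_count++; /* stands in for the actual ucode load */
    block->hw = true;
}

/* Before suspend/reset re-init, clear every flag so init runs exactly once more. */
static void clear_hw_status(struct ip_block *blocks, size_t n)
{
    for (size_t i = 0; i < n; i++)
        blocks[i].hw = false;
}
```

With this shape, two init paths that both reach the same block result in one load, not two.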
2019-08-02	drm/amdgpu: fix incorrect judge on sos fw version	Monk Liu	1	-1/+1
	for SRIOV the SOS fw of PSP is loaded in hypervisor thus guest won't tell the version of it, and judging feature by reading the sos fw version in guest side is completely wrong Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Emily Deng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-08-02	drm/amdgpu: cleanup vega10 SRIOV code path	Monk Liu	11	-118/+38
	we can simplify all those unnecessary function under SRIOV for vega10 since: 1) PSP L1 policy is by force enabled in SRIOV 2) original logic always set all flags which make itself a dummy step besides, 1) the ih_doorbell_range set should also be skipped for VEGA10 SRIOV. 2) the gfx_common registers should also be skipped for VEGA10 SRIOV. Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Emily Deng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-08-02	Merge tag 'drm-fixes-5.3-2019-07-31' of git://people.freedesktop.org/~agd5f/linux into drm-fixes	Dave Airlie	4	-41/+64
	drm-fixes-5.3-2019-07-31: amdgpu: - Fix temperature granularity for navi - Fix stable pstate setting for navi - Fix VCN DPM enablement on navi - Fix error handling on CS ioctl when processing dependencies - Fix possible information leak in debugfs amdkfd: - fix memory alignment for VegaM Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-07-31	drm/amdkfd: enable KFD support for navi14	Alex Deucher	1	-0/+1
	Same as navi10. Reviewed-by: Xiaojie Yuan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: disable inject for failed subblocks of gfx	Dennis Li	1	-116/+165
	some subblocks of gfx fail in inject test, disable them Signed-off-by: Dennis Li <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: support gfx ras error injection and err_cnt query	Dennis Li	2	-3/+18
	check gfx error count in both ras query function and ras interrupt handler. gfx ras is still disabled by default due to known stability issue found in gpu reset. Signed-off-by: Dennis Li <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: add RAS callback for gfx	Dennis Li	2	-1/+531
	Add functions for RAS error inject and query error counter Signed-off-by: Dennis Li <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: add define for gfx ras subblock	Dennis Li	2	-0/+431
	Signed-off-by: Dennis Li <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: remove ras_reserve_vram in ras injection	Tao Zhou	1	-11/+10
	error injection address is not in gpu address space Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Dennis Li <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: add check for ras error type	Tao Zhou	1	-3/+8
	only ue and ce errors are supported Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Dennis Li <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: update interrupt callback for all ras clients	Tao Zhou	3	-2/+6
	add err_data parameter in interrupt cb for ras clients Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Dennis Li <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: allow ras interrupt callback to return error data	Tao Zhou	2	-21/+22
	add error data as parameter for ras interrupt cb and process it Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Dennis Li <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: query umc ras error address	Tao Zhou	1	-0/+80
	query umc ras error address, translate it to gpu 4k page view and save it. Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Dennis Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
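The final step mentioned in this commit, expressing an error address in a 4K page view, amounts to dropping the low 12 bits. The sketch below covers only that arithmetic; the function names are hypothetical, and the channel-specific translation that produces the address in the first place is not modeled.

```c
#include <stdint.h>

#define PAGE_SHIFT_4K 12 /* 4 KiB pages: 2^12 bytes */

/* Page frame number of the 4K page containing 'addr'. */
static uint64_t addr_to_4k_page(uint64_t addr)
{
    return addr >> PAGE_SHIFT_4K;
}

/* Byte address of the start of the 4K page containing 'addr'. */
static uint64_t page_base_4k(uint64_t addr)
{
    return addr & ~((UINT64_C(1) << PAGE_SHIFT_4K) - 1);
}
```

Recording the page rather than the exact byte is what lets a later consumer treat the whole page as suspect.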
2019-07-31	drm/amdgpu: add structures for umc error address translation	Tao Zhou	2	-0/+12
	add related registers, callback function and channel index table Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: add support for recording ras error address	Tao Zhou	2	-1/+3
	more than one error address may be recorded in one query Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Dennis Li <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: update algorithm of umc uncorrectable error counting	Tao Zhou	1	-6/+6
	remove the check of ErrorCodeExt v2: refine the if condition for ue counting Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Dennis Li <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: switch to amdgpu_umc structure	Tao Zhou	4	-6/+16
	create new amdgpu_umc structure for more umc settings in future and switch to the new structure Signed-off-by: Tao Zhou <[email protected]> Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Dennis Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: use 64bit operation macros for umc	Tao Zhou	1	-17/+8
	replace some 32bit macros with 64bit operations to simplify code Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Dennis Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: add RREG64/WREG64(_PCIE) operations	Tao Zhou	3	-0/+129
	add 64 bits register access functions v2: implement 64 bit functions in low level Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Dennis Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
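To illustrate what a 64-bit register accessor saves the caller, here is a sketch against a simulated register file. It assumes a layout where the low half sits at `idx` and the high half at `idx + 1`; that layout, the array, and the function names are illustrative assumptions, not the RREG64/WREG64 implementation from this commit.

```c
#include <stdint.h>

/* Simulated 32-bit register file standing in for real MMIO space. */
static uint32_t regs[16];

/* Read a 64-bit value stored as two adjacent 32-bit registers (low word first). */
static uint64_t rreg64(unsigned idx)
{
    return ((uint64_t)regs[idx + 1] << 32) | regs[idx];
}

/* Write a 64-bit value as two adjacent 32-bit registers (low word first). */
static void wreg64(unsigned idx, uint64_t value)
{
    regs[idx] = (uint32_t)value;
    regs[idx + 1] = (uint32_t)(value >> 32);
}
```

A caller that previously read two halves and shifted them together by hand can use one call, which is the simplification the follow-up umc commit relies on.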
2019-07-31	drm/amdgpu: add ras error count after each query (v2)	Tao Zhou	1	-0/+11
	v1: increase ras ce/ue error count v2: log the number of correctable and uncorrectable errors Signed-off-by: Tao Zhou <[email protected]> Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Dennis Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: querry umc error count	Hawking Zhang	2	-1/+13
	check umc error count in both ras query function and ras interrupt handler Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Dennis Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: init umc v6_1 functions for vega20	Hawking Zhang	1	-0/+14
	init umc callback function for vega20 in sw early init phase Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Dennis Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: add umc v6_1 query error count support	Hawking Zhang	3	-0/+205
	Implement umc query_ras_error_count function to support querying both correctable and uncorrectable errors Signed-off-by: Hawking Zhang <[email protected]> Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Dennis Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: add amdgpu_umc_functions structure	Hawking Zhang	2	-0/+31
	This is common structure as UMC callback function Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Dennis Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: init RSMU and UMC ip base address for vega20	Hawking Zhang	2	-0/+4
	the driver needs to program RSMU and UMC registers to support vega20 RAS feature Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Dennis Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: move some ras data structure to amdgpu_ras.h	Hawking Zhang	2	-69/+68
	These are common structures that can be included by IP specific source files Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Dennis Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: drop drmP.h from vcn_v2_5.c	Alex Deucher	1	-1/+1
	Unused. Acked-by: Sam Ravnborg <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: drop drmP.h from vcn_v2_0.c	Alex Deucher	1	-2/+2
	And fix the fallout. Acked-by: Sam Ravnborg <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: drop drmP.h from sdma_v5_0.c	Alex Deucher	1	-3/+6
	And fix the fallout. Acked-by: Sam Ravnborg <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: drop drmP.h from nv.c	Alex Deucher	1	-1/+2
	And fix up the fallout. Acked-by: Sam Ravnborg <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-07-31	drm/amdgpu: drop drmP.h from navi10_ih.c	Alex Deucher	1	-1/+2
	And fix the fallout. Acked-by: Sam Ravnborg <[email protected]> Signed-off-by: Alex Deucher <[email protected]>