blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2024-01-22	drm/amdgpu: add interface to check mca umc status	YiPeng Chai	4	-2/+35
	Add interface to check mca umc status. Signed-off-by: YiPeng Chai <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22	drm/amdgpu: Use asynchronous polling to handle umc_v12_0 poisoning	YiPeng Chai	3	-31/+116
	Use asynchronous polling to handle umc_v12_0 poisoning. v2: 1. Change function name. 2. Change the debugging information content. Signed-off-by: YiPeng Chai <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22	drm/amdgpu: Fix ras features value calltrace	Stanley.Yang	2	-5/+12
	The high three bits of ras features mask indicate socket id, it should skip to check high three bits of ras features mask before disable all ras features. Signed-off-by: Stanley.Yang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22	drm/amdgpu: Prepare for asynchronous processing of umc page retirement	YiPeng Chai	2	-0/+39
	Preparing for asynchronous processing of umc page retirement. Signed-off-by: YiPeng Chai <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22	drm/amdgpu: Add log info for umc_v12_0	YiPeng Chai	1	-0/+11
	Add log info for umc_v12_0. v2: Delete redundant logs. Signed-off-by: YiPeng Chai <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22	drm/amdgpu: Enable seq64 manager and fix bugs	Arunpravin Paneer Selvam	11	-52/+68
	- Enable the seq64 mapping sequence. - Fix wflinfo va conflict and other bugs. v1: - The seq64 area needs to be included in the AMDGPU_VA_RESERVED_SIZE otherwise the areas will conflict with user space allocations (Alex) - It needs to be mapped read only in the user VM (Alex) v2: - Instead of just one define for TOP/BOTTOM reserved space separate them into two (Christian) - Fix the CPU and VA calculations and while at it also cleanup error handling and kerneldoc (Christian) Signed-off-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Arunpravin Paneer Selvam <[email protected]> Reviewed-by: Christian König <[email protected]>
2024-01-19	Merge tag 'drm-next-2024-01-19' of git://anongit.freedesktop.org/drm/drm	Linus Torvalds	23	-133/+87
	Pull more drm fixes from Dave Airlie: "This is mostly amdgpu and xe fixes, with an amdkfd and nouveau fix thrown in. The amdgpu ones are just the usual couple of weeks of fixes. The xe ones are bunch of cleanups for the new xe driver, the fix you put in on the merge commit and the kconfig fix that was hiding the problem from me. amdgpu: - DSC fixes - DC resource pool fixes - OTG fix - DML2 fixes - Aux fix - GFX10 RLC firmware handling fix - Revert a broken workaround for SMU 13.0.2 - DC writeback fix - Enable gfxoff when ROCm apps are active on gfx11 with the proper FW version amdkfd: - Fix dma-buf exports using GEM handles nouveau: - fix a unneeded WARN_ON triggering xe: - Fix for definition of wakeref_t - Fix for an error code aliasing - Fix for VM_UNBIND_ALL in the case there are no bound VMAs - Fixes for a number of __iomem address space mismatches reported by sparse - Fixes for the assignment of exec_queue priority - A Fix for skip_guc_pc not taking effect - Workaround for a build problem on GCC 11 - A couple of fixes for error paths - Fix a Flat CCS compression metadata copy issue - Fix a misplace array bounds checking - Don't have display support depend on EXPERT (as discussed on IRC)" * tag 'drm-next-2024-01-19' of git://anongit.freedesktop.org/drm/drm: (71 commits) nouveau/vmm: don't set addr on the fail path to avoid warning drm/amdgpu: Enable GFXOFF for Compute on GFX11 drm/amd/display: Drop 'acrtc' and add 'new_crtc_state' NULL check for writeback requests. drm/amdgpu: revert "Adjust removal control flow for smu v13_0_2" drm/amdkfd: init drm_client with funcs hook drm/amd/display: Fix a switch statement in populate_dml_output_cfg_from_stream_state() drm/amdgpu: Fix the null pointer when load rlc firmware drm/amd/display: Align the returned error code with legacy DP drm/amd/display: Fix DML2 watermark calculation drm/amd/display: Clear OPTC mem select on disable drm/amd/display: Port DENTIST hang and TDR fixes to OTG disable W/A drm/amd/display: Add logging resource checks drm/amd/display: Init link enc resources in dc_state only if res_pool presents drm/amd/display: Fix late derefrence 'dsc' check in 'link_set_dsc_pps_packet()' drm/amd/display: Avoid enum conversion warning drm/amd/pm: Fix smuv13.0.6 current clock reporting drm/amd/pm: Add error log for smu v13.0.6 reset drm/amdkfd: Fix 'node' NULL check in 'svm_range_get_range_boundaries()' drm/amdgpu: drop exp hw support check for GC 9.4.3 drm/amdgpu: move debug options init prior to amdgpu device init ...
2024-01-18	Merge tag 'i2c-for-6.8-rc1-rebased' of ↵	Linus Torvalds	1	-1/+0
	git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c updates from Wolfram Sang: "This removes the currently unused CLASS_DDC support (controllers set the flag, but there is no client to use it). Also, CLASS_SPD support gets simplified to prepare removal in the future. Class based instantiation is not recommended these days anyhow. Furthermore, I2C core now creates a debugfs directory per I2C adapter. Current bus driver users were converted to use it. Finally, quite some driver updates. Standing out are patches for the wmt-driver which is refactored to support more variants. This is the rebased pull request where a large series for the designware driver was dropped" * tag 'i2c-for-6.8-rc1-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (38 commits) MAINTAINERS: use proper email for my I2C work i2c: stm32f7: add support for stm32mp25 soc i2c: stm32f7: perform I2C_ISR read once at beginning of event isr dt-bindings: i2c: document st,stm32mp25-i2c compatible i2c: stm32f7: simplify status messages in case of errors i2c: stm32f7: perform most of irq job in threaded handler i2c: stm32f7: use dev_err_probe upon calls of devm_request_irq i2c: i801: Add lis3lv02d for Dell XPS 15 7590 i2c: i801: Add lis3lv02d for Dell Precision 3540 i2c: wmt: Reduce redundant: REG_CR setting i2c: wmt: Reduce redundant: function parameter i2c: wmt: Reduce redundant: clock mode setting i2c: wmt: Reduce redundant: wait event complete i2c: wmt: Reduce redundant: bus busy check i2c: mux: reg: Remove class-based device auto-detection support i2c: make i2c_bus_type const dt-bindings: at24: add ROHM BR24G04 eeprom: at24: use of_match_ptr() i2c: cpm: Remove linux,i2c-index conversion from be32 i2c: imx: Make SDA actually optional for bus recovering ...
2024-01-18	drm/amdgpu: Enable GFXOFF for Compute on GFX11	Ori Messinger	1	-4/+2
	On GFX version 11, GFXOFF was disabled due to a MES KIQ firmware issue, which has since been fixed after version 64. This patch only re-enables GFXOFF for GFX version 11 if the GPU's MES KIQ firmware version is newer than version 64. V2: Keep GFXOFF disabled on GFX11 if MES KIQ is below version 64. V3: Add parentheses to avoid GCC warning for parentheses: "suggest parentheses around comparison in operand of ‘&’" V4: Remove "V3" from commit title V5: Change commit description and insert 'Acked-by' Signed-off-by: Ori Messinger <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Harish Kasiviswanathan <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2024-01-18	drm/amdgpu: revert "Adjust removal control flow for smu v13_0_2"	Christian König	4	-65/+1
	Calling amdgpu_device_ip_resume_phase1() during shutdown leaves the HW in an active state and is an unbalanced use of the IP callbacks. Using the IP callbacks like this can lead to memory leaks, double free and imbalanced reference counters. Leaving the HW in an active state can lead to DMA accesses to memory now freed by the driver. Both is a complete no-go for driver unload so completely revert the workaround for now. This reverts commit f5c7e7797060255dbc8160734ccc5ad6183c5e04. Signed-off-by: Christian König <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2024-01-18	drm/amdkfd: init drm_client with funcs hook	Flora Cui	1	-1/+4
	otherwise drm_client_dev_unregister() would try to kfree(&adev->kfd.client). Fixes: 1819200166ce ("drm/amdkfd: Export DMABufs from KFD using GEM handles") Signed-off-by: Flora Cui <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18	drm/amdgpu: Fix the null pointer when load rlc firmware	Ma Jun	1	-9/+6
	If the RLC firmware is invalid because of wrong header size, the pointer to the rlc firmware is released in function amdgpu_ucode_request. There will be a null pointer error in subsequent use. So skip validation to fix it. Fixes: 3da9b71563cb ("drm/amd: Use `amdgpu_ucode_*` helpers for GFX10") Signed-off-by: Ma Jun <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2024-01-18	drm/amdgpu: update error condition check for umc_v12_0_query_error_address	Tao Zhou	1	-4/+1
	Deferred error is also taken into account. Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18	drm/amdgpu: Skip do PCI error slot reset during RAS recovery	Stanley.Yang	1	-0/+14
	Why: The PCI error slot reset maybe triggered after inject ue to UMC multi times, this caused system hang. [ 557.371857] amdgpu 0000:af:00.0: amdgpu: GPU reset succeeded, trying to resume [ 557.373718] [drm] PCIE GART of 512M enabled. [ 557.373722] [drm] PTB located at 0x0000031FED700000 [ 557.373788] [drm] VRAM is lost due to GPU reset! [ 557.373789] [drm] PSP is resuming... [ 557.547012] mlx5_core 0000:55:00.0: mlx5_pci_err_detected Device state = 1 pci_status: 0. Exit, result = 3, need reset [ 557.547067] [drm] PCI error: detected callback, state(1)!! [ 557.547069] [drm] No support for XGMI hive yet... [ 557.548125] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 0. Enter [ 557.607763] mlx5_core 0000:55:00.0: wait vital counter value 0x16b5b after 1 iterations [ 557.607777] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 1. Exit, err = 0, result = 5, recovered [ 557.610492] [drm] PCI error: slot reset callback!! ... [ 560.689382] amdgpu 0000:3f:00.0: amdgpu: GPU reset(2) succeeded! [ 560.689546] amdgpu 0000:5a:00.0: amdgpu: GPU reset(2) succeeded! [ 560.689562] general protection fault, probably for non-canonical address 0x5f080b54534f611f: 0000 [#1] SMP NOPTI [ 560.701008] CPU: 16 PID: 2361 Comm: kworker/u448:9 Tainted: G OE 5.15.0-91-generic #101-Ubuntu [ 560.712057] Hardware name: Microsoft C278A/C278A, BIOS C2789.5.BS.1C11.AG.1 11/08/2023 [ 560.720959] Workqueue: amdgpu-reset-hive amdgpu_ras_do_recovery [amdgpu] [ 560.728887] RIP: 0010:amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu] [ 560.736891] Code: ff 41 89 c6 e9 1b ff ff ff 44 0f b6 45 b0 e9 4f ff ff ff be 01 00 00 00 4c 89 e7 e8 76 c9 8b ff 44 0f b6 45 b0 e9 3c fd ff ff <48> 83 ba 18 02 00 00 00 0f 84 6a f8 ff ff 48 8d 7a 78 be 01 00 00 [ 560.757967] RSP: 0018:ffa0000032e53d80 EFLAGS: 00010202 [ 560.763848] RAX: ffa00000001dfd10 RBX: ffa0000000197090 RCX: ffa0000032e53db0 [ 560.771856] RDX: 5f080b54534f5f07 RSI: 0000000000000000 RDI: ff11000128100010 [ 560.779867] RBP: ffa0000032e53df0 R08: 0000000000000000 R09: ffffffffffe77f08 [ 560.787879] R10: 0000000000ffff0a R11: 0000000000000001 R12: 0000000000000000 [ 560.795889] R13: ffa0000032e53e00 R14: 0000000000000000 R15: 0000000000000000 [ 560.803889] FS: 0000000000000000(0000) GS:ff11007e7e800000(0000) knlGS:0000000000000000 [ 560.812973] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 560.819422] CR2: 000055a04c118e68 CR3: 0000000007410005 CR4: 0000000000771ee0 [ 560.827433] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 560.835433] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 [ 560.843444] PKRU: 55555554 [ 560.846480] Call Trace: [ 560.849225] <TASK> [ 560.851580] ? show_trace_log_lvl+0x1d6/0x2ea [ 560.856488] ? show_trace_log_lvl+0x1d6/0x2ea [ 560.861379] ? amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu] [ 560.867778] ? show_regs.part.0+0x23/0x29 [ 560.872293] ? __die_body.cold+0x8/0xd [ 560.876502] ? die_addr+0x3e/0x60 [ 560.880238] ? exc_general_protection+0x1c5/0x410 [ 560.885532] ? asm_exc_general_protection+0x27/0x30 [ 560.891025] ? amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu] [ 560.898323] amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu] [ 560.904520] process_one_work+0x228/0x3d0 How: In RAS recovery, mode-1 reset is issued from RAS fatal error handling and expected all the nodes in a hive to be reset. no need to issue another mode-1 during this procedure. Signed-off-by: Stanley.Yang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18	drm/amdgpu: Show deferred error count for UMC	Stanley.Yang	1	-2/+6
	Show deferred error count for UMC syfs node Signed-off-by: Stanley.Yang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18	drm/amdgpu: Enable GFXOFF for Compute on GFX11	Ori Messinger	1	-4/+2
	On GFX version 11, GFXOFF was disabled due to a MES KIQ firmware issue, which has since been fixed after version 64. This patch only re-enables GFXOFF for GFX version 11 if the GPU's MES KIQ firmware version is newer than version 64. V2: Keep GFXOFF disabled on GFX11 if MES KIQ is below version 64. V3: Add parentheses to avoid GCC warning for parentheses: "suggest parentheses around comparison in operand of ‘&’" V4: Remove "V3" from commit title V5: Change commit description and insert 'Acked-by' Signed-off-by: Ori Messinger <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Harish Kasiviswanathan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18	drm/amdgpu: fix UBSAN array-index-out-of-bounds for ras_block_string[]	Yang Wang	1	-1/+4
	fix array index out of bounds issue for ras_block_string[] array. Fixes: 30df05fb74f6 ("drm/amdgpu: Align ras block enum with firmware") Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18	drm/amd/amdgpu: Update RLC_SPM_MC_CNT by ring wreg in guest	YuanShang	8	-13/+23
	Submit command of wreg in GFX and COMPUTE ring to update RLC_SPM_MC_CNT in guest machine during runtime. Signed-off-by: YuanShang <[email protected]> Reviewed-by: Emily Deng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18	drm/amdgpu: Remove unnecessary NULL check	Felix Kuehling	1	-1/+1
	A static checker pointed out, that bo_va->base.bo was already derefenced earlier in the same scope. Therefore this check is unnecessary here. Reported-by: Dan Carpenter <[email protected]> Fixes: 50661eb1a2c8 ("drm/amdgpu: Auto-validate DMABuf imports in compute VMs") Reviewed-by: Kent Russell <[email protected]> Signed-off-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18	drm/amdgpu: revert "Adjust removal control flow for smu v13_0_2"	Christian König	4	-65/+1
	Calling amdgpu_device_ip_resume_phase1() during shutdown leaves the HW in an active state and is an unbalanced use of the IP callbacks. Using the IP callbacks like this can lead to memory leaks, double free and imbalanced reference counters. Leaving the HW in an active state can lead to DMA accesses to memory now freed by the driver. Both is a complete no-go for driver unload so completely revert the workaround for now. This reverts commit f5c7e7797060255dbc8160734ccc5ad6183c5e04. Signed-off-by: Christian König <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18	drm/amdgpu: Remove usage of the deprecated ida_simple_xx() API	Christophe JAILLET	1	-4/+3
	ida_alloc() and ida_free() should be preferred to the deprecated ida_simple_get() and ida_simple_remove(). Note that the upper limit of ida_simple_get() is exclusive, but the one of ida_alloc_range() is inclusive. So a -1 has been added when needed. Signed-off-by: Christophe JAILLET <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18	drm/amdkfd: init drm_client with funcs hook	Flora Cui	1	-1/+4
	otherwise drm_client_dev_unregister() would try to kfree(&adev->kfd.client). Fixes: 1819200166ce ("drm/amdkfd: Export DMABufs from KFD using GEM handles") Signed-off-by: Flora Cui <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18	drm/amdgpu: Clean up errors in umc_v6_0.c	chenxuebing	1	-1/+1
	Fix the following errors reported by checkpatch: ERROR: space required after that ',' (ctx:VxV) Signed-off-by: chenxuebing <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18	drm/amdgpu: Clean up errors in clearstate_si.h	chenxuebing	1	-16/+8
	Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: chenxuebing <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-18	drm: remove I2C_CLASS_DDC support	Heiner Kallweit	1	-1/+0
	After removal of the legacy EEPROM driver and I2C_CLASS_DDC support in olpc_dcon there's no i2c client driver left supporting I2C_CLASS_DDC. Class-based device auto-detection is a legacy mechanism and shouldn't be used in new code. So we can remove this class completely now. Acked-by: Alex Deucher <[email protected]> Acked-by: Dmitry Baryshkov <[email protected]> Acked-by: Harry Wentland <[email protected]> Acked-by: Heiko Stuebner <[email protected]> Acked-by: Jani Nikula <[email protected]> Acked-by: Jernej Skrabec <[email protected]> Reviewed-by: Thomas Zimmermann <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Reviewed-by: AngeloGioacchino Del Regno <[email protected]> Reviewed-by: Neil Armstrong <[email protected]> Signed-off-by: Heiner Kallweit <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2024-01-15	drm/amdgpu: Clean up errors in amdgpu.h	chenxuebing	1	-6/+3
	Fix the following errors reported by checkpatch: ERROR: open brace '{' following struct go on the same line Signed-off-by: chenxuebing <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: Clean up errors in amdgpu_gmc.c	chenxuebing	1	-1/+1
	Fix the following errors reported by checkpatch: ERROR: need consistent spacing around '-' (ctx:WxV) Signed-off-by: chenxuebing <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: Clean up errors in jpeg_v2_5.c	chenxuebing	1	-6/+4
	Fix the following errors reported by checkpatch: ERROR: space required before the open parenthesis '(' ERROR: that open brace { should be on the previous line Signed-off-by: chenxuebing <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: Clean up errors in gfx_v9_4.c	chenxuebing	1	-2/+3
	Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: chenxuebing <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: Clean up errors in amdgpu_drv.c	chenxuebing	1	-2/+2
	Fix the following errors reported by checkpatch: ERROR: do not initialise globals to 0 Signed-off-by: chenxuebing <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amd: Clean up errors in amdgpu_vkms.c	chenxuebing	1	-2/+1
	Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: chenxuebing <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: Fix the null pointer when load rlc firmware	Ma Jun	1	-9/+6
	If the RLC firmware is invalid because of wrong header size, the pointer to the rlc firmware is released in function amdgpu_ucode_request. There will be a null pointer error in subsequent use. So skip validation to fix it. Signed-off-by: Ma Jun <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: Centralize ras cap query to amdgpu_ras_check_supported	Hawking Zhang	1	-77/+93
	Move ras capablity check to amdgpu_ras_check_supported. Driver will query ras capablity through psp interace, or vbios interface, or specific ip callbacks. Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: Query ras capablity from psp v2	Hawking Zhang	3	-0/+38
	Instead of traditional atomfirmware interfaces for RAS capability, host driver can query ras capability from psp starting from psp v13_0_6. v2: drop redundant local variable from get_ras_capability. Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: Clean up errors in amdgpu_rlc.c	chenxuebing	1	-1/+1
	Fix the following errors reported by checkpatch: ERROR: space prohibited before that '++' (ctx:WxB) Signed-off-by: chenxuebing <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amd: Clean up errors in sdma_v2_4.c	chenxuebing	1	-9/+6
	Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line ERROR: trailing statements should be on next line Signed-off-by: chenxuebing <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amd/amdgpu: Clean up errors in amdgpu_umr.h	chenxuebing	1	-2/+2
	Fix the following errors reported by checkpatch: spaces required around that '=' (ctx:VxV) Signed-off-by: chenxuebing <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: Clean up errors in amdgpu_atomfirmware.h	chenxuebing	1	-1/+1
	Fix the following errors reported by checkpatch: ERROR: "foo* bar" should be "foo *bar" Signed-off-by: chenxuebing <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: Clean up errors in clearstate_gfx9.h	chenxuebing	1	-18/+9
	Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: chenxuebing <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: Clean up errors in navi10_ih.c	chenxuebing	1	-2/+1
	Fix the following errors reported by checkpatch: ERROR: that open brace { should be on the previous line Signed-off-by: chenxuebing <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: check PS, WS index	Alexander Richards	8	-47/+75
	Theoretically, it would be possible for a buggy or malicious VBIOS to overwrite past the bounds of the passed parameters (or its own workspace); add bounds checking to prevent this from happening. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3093 Signed-off-by: Alexander Richards <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: Align ras block enum with firmware	Hawking Zhang	1	-0/+2
	Driver and firmware share the same ras block enum. Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: replace MCA macro with ACA for XGMI	Yang Wang	1	-12/+12
	use new ACA macro to instead of MCA Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: Log deferred error separately	Candice Li	6	-58/+139
	Separate deferred error from UE and CE and log it individually. Signed-off-by: Candice Li <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: Do bad page retirement for deferred errors	Candice Li	1	-6/+4
	Needs to do bad page retirement for deferred errors. v2: Drop unused dev_info. Signed-off-by: YiPeng Chai <[email protected]> Signed-off-by: Candice Li <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: add xgmi v6.4.0 ACA support	Yang Wang	1	-1/+60
	add xgmi v6.4.0 ACA driver support Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: add mmhub v1.8 ACA support	Yang Wang	1	-0/+87
	v1: add mmhub v1.8 ACA driver support v2: use macro to define smn address value. Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: add sdma v4.4.2 ACA support	Yang Wang	1	-0/+72
	v1: add sdma v4.4.2 ACA driver support v2: use macro to define smn address value. v3: squash in fix for unbalanced irqs Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: add gfx v9.4.3 ACA support	Yang Wang	1	-0/+88
	v1: add gfx v9.4.3 ACA driver support v2: use macro to define smn address value. Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-15	drm/amdgpu: Check extended configuration space register when system uses ↵	Ma Jun	1	-0/+4
	large bar Some customer platforms do not enable mmconfig for various reasons, such as bios bug, and therefore cannot access the GPU extend configuration space through mmio. When the system enters the d3cold state and resumes, the amdgpu driver fails to resume because the extend configuration space registers of GPU can't be restored. At this point, Usually we only see some failure dmesg log printed by amdgpu driver, it is difficult to find the root cause. Therefor print a warnning message if the system can't access the extended configuration space register when using large bar. Signed-off-by: Ma Jun <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>