Age | Commit message (Collapse) | Author | Files | Lines |
|
Add interface to check mca umc status.
Signed-off-by: YiPeng Chai <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use asynchronous polling to handle umc_v12_0 poisoning.
v2:
1. Change function name.
2. Change the debugging information content.
Signed-off-by: YiPeng Chai <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The high three bits of ras features mask indicate socket
id, it should skip to check high three bits of ras features
mask before disable all ras features.
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Preparing for asynchronous processing of umc page retirement.
Signed-off-by: YiPeng Chai <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add log info for umc_v12_0.
v2:
Delete redundant logs.
Signed-off-by: YiPeng Chai <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
- Enable the seq64 mapping sequence.
- Fix wflinfo va conflict and other bugs.
v1:
- The seq64 area needs to be included in the AMDGPU_VA_RESERVED_SIZE
otherwise the areas will conflict with user space allocations (Alex)
- It needs to be mapped read only in the user VM (Alex)
v2:
- Instead of just one define for TOP/BOTTOM
reserved space separate them into two (Christian)
- Fix the CPU and VA calculations and while at it
also cleanup error handling and kerneldoc (Christian)
Signed-off-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Arunpravin Paneer Selvam <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
|
Pull more drm fixes from Dave Airlie:
"This is mostly amdgpu and xe fixes, with an amdkfd and nouveau fix
thrown in.
The amdgpu ones are just the usual couple of weeks of fixes. The xe
ones are bunch of cleanups for the new xe driver, the fix you put in
on the merge commit and the kconfig fix that was hiding the problem
from me.
amdgpu:
- DSC fixes
- DC resource pool fixes
- OTG fix
- DML2 fixes
- Aux fix
- GFX10 RLC firmware handling fix
- Revert a broken workaround for SMU 13.0.2
- DC writeback fix
- Enable gfxoff when ROCm apps are active on gfx11 with the proper FW
version
amdkfd:
- Fix dma-buf exports using GEM handles
nouveau:
- fix a unneeded WARN_ON triggering
xe:
- Fix for definition of wakeref_t
- Fix for an error code aliasing
- Fix for VM_UNBIND_ALL in the case there are no bound VMAs
- Fixes for a number of __iomem address space mismatches reported by
sparse
- Fixes for the assignment of exec_queue priority
- A Fix for skip_guc_pc not taking effect
- Workaround for a build problem on GCC 11
- A couple of fixes for error paths
- Fix a Flat CCS compression metadata copy issue
- Fix a misplace array bounds checking
- Don't have display support depend on EXPERT (as discussed on IRC)"
* tag 'drm-next-2024-01-19' of git://anongit.freedesktop.org/drm/drm: (71 commits)
nouveau/vmm: don't set addr on the fail path to avoid warning
drm/amdgpu: Enable GFXOFF for Compute on GFX11
drm/amd/display: Drop 'acrtc' and add 'new_crtc_state' NULL check for writeback requests.
drm/amdgpu: revert "Adjust removal control flow for smu v13_0_2"
drm/amdkfd: init drm_client with funcs hook
drm/amd/display: Fix a switch statement in populate_dml_output_cfg_from_stream_state()
drm/amdgpu: Fix the null pointer when load rlc firmware
drm/amd/display: Align the returned error code with legacy DP
drm/amd/display: Fix DML2 watermark calculation
drm/amd/display: Clear OPTC mem select on disable
drm/amd/display: Port DENTIST hang and TDR fixes to OTG disable W/A
drm/amd/display: Add logging resource checks
drm/amd/display: Init link enc resources in dc_state only if res_pool presents
drm/amd/display: Fix late derefrence 'dsc' check in 'link_set_dsc_pps_packet()'
drm/amd/display: Avoid enum conversion warning
drm/amd/pm: Fix smuv13.0.6 current clock reporting
drm/amd/pm: Add error log for smu v13.0.6 reset
drm/amdkfd: Fix 'node' NULL check in 'svm_range_get_range_boundaries()'
drm/amdgpu: drop exp hw support check for GC 9.4.3
drm/amdgpu: move debug options init prior to amdgpu device init
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
"This removes the currently unused CLASS_DDC support (controllers set
the flag, but there is no client to use it).
Also, CLASS_SPD support gets simplified to prepare removal in the
future. Class based instantiation is not recommended these days
anyhow.
Furthermore, I2C core now creates a debugfs directory per I2C adapter.
Current bus driver users were converted to use it.
Finally, quite some driver updates. Standing out are patches for the
wmt-driver which is refactored to support more variants.
This is the rebased pull request where a large series for the
designware driver was dropped"
* tag 'i2c-for-6.8-rc1-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (38 commits)
MAINTAINERS: use proper email for my I2C work
i2c: stm32f7: add support for stm32mp25 soc
i2c: stm32f7: perform I2C_ISR read once at beginning of event isr
dt-bindings: i2c: document st,stm32mp25-i2c compatible
i2c: stm32f7: simplify status messages in case of errors
i2c: stm32f7: perform most of irq job in threaded handler
i2c: stm32f7: use dev_err_probe upon calls of devm_request_irq
i2c: i801: Add lis3lv02d for Dell XPS 15 7590
i2c: i801: Add lis3lv02d for Dell Precision 3540
i2c: wmt: Reduce redundant: REG_CR setting
i2c: wmt: Reduce redundant: function parameter
i2c: wmt: Reduce redundant: clock mode setting
i2c: wmt: Reduce redundant: wait event complete
i2c: wmt: Reduce redundant: bus busy check
i2c: mux: reg: Remove class-based device auto-detection support
i2c: make i2c_bus_type const
dt-bindings: at24: add ROHM BR24G04
eeprom: at24: use of_match_ptr()
i2c: cpm: Remove linux,i2c-index conversion from be32
i2c: imx: Make SDA actually optional for bus recovering
...
|
|
On GFX version 11, GFXOFF was disabled due to a MES KIQ firmware
issue, which has since been fixed after version 64.
This patch only re-enables GFXOFF for GFX version 11 if the GPU's
MES KIQ firmware version is newer than version 64.
V2: Keep GFXOFF disabled on GFX11 if MES KIQ is below version 64.
V3: Add parentheses to avoid GCC warning for parentheses:
"suggest parentheses around comparison in operand of ‘&’"
V4: Remove "V3" from commit title
V5: Change commit description and insert 'Acked-by'
Signed-off-by: Ori Messinger <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Reviewed-by: Harish Kasiviswanathan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
Calling amdgpu_device_ip_resume_phase1() during shutdown leaves the
HW in an active state and is an unbalanced use of the IP callbacks.
Using the IP callbacks like this can lead to memory leaks, double
free and imbalanced reference counters.
Leaving the HW in an active state can lead to DMA accesses to memory now
freed by the driver.
Both is a complete no-go for driver unload so completely revert the
workaround for now.
This reverts commit f5c7e7797060255dbc8160734ccc5ad6183c5e04.
Signed-off-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
otherwise drm_client_dev_unregister() would try to
kfree(&adev->kfd.client).
Fixes: 1819200166ce ("drm/amdkfd: Export DMABufs from KFD using GEM handles")
Signed-off-by: Flora Cui <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
If the RLC firmware is invalid because of wrong header size,
the pointer to the rlc firmware is released in function
amdgpu_ucode_request. There will be a null pointer error
in subsequent use. So skip validation to fix it.
Fixes: 3da9b71563cb ("drm/amd: Use `amdgpu_ucode_*` helpers for GFX10")
Signed-off-by: Ma Jun <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
Deferred error is also taken into account.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Why:
The PCI error slot reset maybe triggered after inject ue to UMC multi times, this
caused system hang.
[ 557.371857] amdgpu 0000:af:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 557.373718] [drm] PCIE GART of 512M enabled.
[ 557.373722] [drm] PTB located at 0x0000031FED700000
[ 557.373788] [drm] VRAM is lost due to GPU reset!
[ 557.373789] [drm] PSP is resuming...
[ 557.547012] mlx5_core 0000:55:00.0: mlx5_pci_err_detected Device state = 1 pci_status: 0. Exit, result = 3, need reset
[ 557.547067] [drm] PCI error: detected callback, state(1)!!
[ 557.547069] [drm] No support for XGMI hive yet...
[ 557.548125] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 0. Enter
[ 557.607763] mlx5_core 0000:55:00.0: wait vital counter value 0x16b5b after 1 iterations
[ 557.607777] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 1. Exit, err = 0, result = 5, recovered
[ 557.610492] [drm] PCI error: slot reset callback!!
...
[ 560.689382] amdgpu 0000:3f:00.0: amdgpu: GPU reset(2) succeeded!
[ 560.689546] amdgpu 0000:5a:00.0: amdgpu: GPU reset(2) succeeded!
[ 560.689562] general protection fault, probably for non-canonical address 0x5f080b54534f611f: 0000 [#1] SMP NOPTI
[ 560.701008] CPU: 16 PID: 2361 Comm: kworker/u448:9 Tainted: G OE 5.15.0-91-generic #101-Ubuntu
[ 560.712057] Hardware name: Microsoft C278A/C278A, BIOS C2789.5.BS.1C11.AG.1 11/08/2023
[ 560.720959] Workqueue: amdgpu-reset-hive amdgpu_ras_do_recovery [amdgpu]
[ 560.728887] RIP: 0010:amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu]
[ 560.736891] Code: ff 41 89 c6 e9 1b ff ff ff 44 0f b6 45 b0 e9 4f ff ff ff be 01 00 00 00 4c 89 e7 e8 76 c9 8b ff 44 0f b6 45 b0 e9 3c fd ff ff <48> 83 ba 18 02 00 00 00 0f 84 6a f8 ff ff 48 8d 7a 78 be 01 00 00
[ 560.757967] RSP: 0018:ffa0000032e53d80 EFLAGS: 00010202
[ 560.763848] RAX: ffa00000001dfd10 RBX: ffa0000000197090 RCX: ffa0000032e53db0
[ 560.771856] RDX: 5f080b54534f5f07 RSI: 0000000000000000 RDI: ff11000128100010
[ 560.779867] RBP: ffa0000032e53df0 R08: 0000000000000000 R09: ffffffffffe77f08
[ 560.787879] R10: 0000000000ffff0a R11: 0000000000000001 R12: 0000000000000000
[ 560.795889] R13: ffa0000032e53e00 R14: 0000000000000000 R15: 0000000000000000
[ 560.803889] FS: 0000000000000000(0000) GS:ff11007e7e800000(0000) knlGS:0000000000000000
[ 560.812973] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 560.819422] CR2: 000055a04c118e68 CR3: 0000000007410005 CR4: 0000000000771ee0
[ 560.827433] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 560.835433] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 560.843444] PKRU: 55555554
[ 560.846480] Call Trace:
[ 560.849225] <TASK>
[ 560.851580] ? show_trace_log_lvl+0x1d6/0x2ea
[ 560.856488] ? show_trace_log_lvl+0x1d6/0x2ea
[ 560.861379] ? amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu]
[ 560.867778] ? show_regs.part.0+0x23/0x29
[ 560.872293] ? __die_body.cold+0x8/0xd
[ 560.876502] ? die_addr+0x3e/0x60
[ 560.880238] ? exc_general_protection+0x1c5/0x410
[ 560.885532] ? asm_exc_general_protection+0x27/0x30
[ 560.891025] ? amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu]
[ 560.898323] amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu]
[ 560.904520] process_one_work+0x228/0x3d0
How:
In RAS recovery, mode-1 reset is issued from RAS fatal error handling and expected
all the nodes in a hive to be reset. no need to issue another mode-1 during this procedure.
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Show deferred error count for UMC syfs node
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
On GFX version 11, GFXOFF was disabled due to a MES KIQ firmware
issue, which has since been fixed after version 64.
This patch only re-enables GFXOFF for GFX version 11 if the GPU's
MES KIQ firmware version is newer than version 64.
V2: Keep GFXOFF disabled on GFX11 if MES KIQ is below version 64.
V3: Add parentheses to avoid GCC warning for parentheses:
"suggest parentheses around comparison in operand of ‘&’"
V4: Remove "V3" from commit title
V5: Change commit description and insert 'Acked-by'
Signed-off-by: Ori Messinger <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Reviewed-by: Harish Kasiviswanathan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
fix array index out of bounds issue for ras_block_string[] array.
Fixes: 30df05fb74f6 ("drm/amdgpu: Align ras block enum with firmware")
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Submit command of wreg in GFX and COMPUTE ring to update
RLC_SPM_MC_CNT in guest machine during runtime.
Signed-off-by: YuanShang <[email protected]>
Reviewed-by: Emily Deng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
A static checker pointed out, that bo_va->base.bo was already derefenced
earlier in the same scope. Therefore this check is unnecessary here.
Reported-by: Dan Carpenter <[email protected]>
Fixes: 50661eb1a2c8 ("drm/amdgpu: Auto-validate DMABuf imports in compute VMs")
Reviewed-by: Kent Russell <[email protected]>
Signed-off-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Calling amdgpu_device_ip_resume_phase1() during shutdown leaves the
HW in an active state and is an unbalanced use of the IP callbacks.
Using the IP callbacks like this can lead to memory leaks, double
free and imbalanced reference counters.
Leaving the HW in an active state can lead to DMA accesses to memory now
freed by the driver.
Both is a complete no-go for driver unload so completely revert the
workaround for now.
This reverts commit f5c7e7797060255dbc8160734ccc5ad6183c5e04.
Signed-off-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
ida_alloc() and ida_free() should be preferred to the deprecated
ida_simple_get() and ida_simple_remove().
Note that the upper limit of ida_simple_get() is exclusive, but the one of
ida_alloc_range() is inclusive. So a -1 has been added when needed.
Signed-off-by: Christophe JAILLET <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
otherwise drm_client_dev_unregister() would try to
kfree(&adev->kfd.client).
Fixes: 1819200166ce ("drm/amdkfd: Export DMABufs from KFD using GEM handles")
Signed-off-by: Flora Cui <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix the following errors reported by checkpatch:
ERROR: space required after that ',' (ctx:VxV)
Signed-off-by: chenxuebing <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix the following errors reported by checkpatch:
ERROR: that open brace { should be on the previous line
Signed-off-by: chenxuebing <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
After removal of the legacy EEPROM driver and I2C_CLASS_DDC support in
olpc_dcon there's no i2c client driver left supporting I2C_CLASS_DDC.
Class-based device auto-detection is a legacy mechanism and shouldn't
be used in new code. So we can remove this class completely now.
Acked-by: Alex Deucher <[email protected]>
Acked-by: Dmitry Baryshkov <[email protected]>
Acked-by: Harry Wentland <[email protected]>
Acked-by: Heiko Stuebner <[email protected]>
Acked-by: Jani Nikula <[email protected]>
Acked-by: Jernej Skrabec <[email protected]>
Reviewed-by: Thomas Zimmermann <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Reviewed-by: AngeloGioacchino Del Regno <[email protected]>
Reviewed-by: Neil Armstrong <[email protected]>
Signed-off-by: Heiner Kallweit <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
|
|
Fix the following errors reported by checkpatch:
ERROR: open brace '{' following struct go on the same line
Signed-off-by: chenxuebing <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix the following errors reported by checkpatch:
ERROR: need consistent spacing around '-' (ctx:WxV)
Signed-off-by: chenxuebing <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix the following errors reported by checkpatch:
ERROR: space required before the open parenthesis '('
ERROR: that open brace { should be on the previous line
Signed-off-by: chenxuebing <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix the following errors reported by checkpatch:
ERROR: that open brace { should be on the previous line
Signed-off-by: chenxuebing <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix the following errors reported by checkpatch:
ERROR: do not initialise globals to 0
Signed-off-by: chenxuebing <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix the following errors reported by checkpatch:
ERROR: that open brace { should be on the previous line
Signed-off-by: chenxuebing <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
If the RLC firmware is invalid because of wrong header size,
the pointer to the rlc firmware is released in function
amdgpu_ucode_request. There will be a null pointer error
in subsequent use. So skip validation to fix it.
Signed-off-by: Ma Jun <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Move ras capablity check to amdgpu_ras_check_supported.
Driver will query ras capablity through psp interace, or
vbios interface, or specific ip callbacks.
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Instead of traditional atomfirmware interfaces for RAS
capability, host driver can query ras capability from
psp starting from psp v13_0_6.
v2: drop redundant local variable from get_ras_capability.
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix the following errors reported by checkpatch:
ERROR: space prohibited before that '++' (ctx:WxB)
Signed-off-by: chenxuebing <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix the following errors reported by checkpatch:
ERROR: that open brace { should be on the previous line
ERROR: trailing statements should be on next line
Signed-off-by: chenxuebing <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix the following errors reported by checkpatch:
spaces required around that '=' (ctx:VxV)
Signed-off-by: chenxuebing <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix the following errors reported by checkpatch:
ERROR: "foo* bar" should be "foo *bar"
Signed-off-by: chenxuebing <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix the following errors reported by checkpatch:
ERROR: that open brace { should be on the previous line
Signed-off-by: chenxuebing <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix the following errors reported by checkpatch:
ERROR: that open brace { should be on the previous line
Signed-off-by: chenxuebing <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Theoretically, it would be possible for a buggy or malicious VBIOS to
overwrite past the bounds of the passed parameters (or its own
workspace); add bounds checking to prevent this from happening.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3093
Signed-off-by: Alexander Richards <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Driver and firmware share the same ras block enum.
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
use new ACA macro to instead of MCA
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Separate deferred error from UE and CE and log it
individually.
Signed-off-by: Candice Li <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Needs to do bad page retirement for deferred errors.
v2: Drop unused dev_info.
Signed-off-by: YiPeng Chai <[email protected]>
Signed-off-by: Candice Li <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
add xgmi v6.4.0 ACA driver support
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
v1:
add mmhub v1.8 ACA driver support
v2:
use macro to define smn address value.
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
v1:
add sdma v4.4.2 ACA driver support
v2:
use macro to define smn address value.
v3: squash in fix for unbalanced irqs
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
v1:
add gfx v9.4.3 ACA driver support
v2:
use macro to define smn address value.
Signed-off-by: Yang Wang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
large bar
Some customer platforms do not enable mmconfig for various reasons,
such as bios bug, and therefore cannot access the GPU extend configuration
space through mmio.
When the system enters the d3cold state and resumes, the amdgpu driver
fails to resume because the extend configuration space registers of
GPU can't be restored. At this point, Usually we only see some failure
dmesg log printed by amdgpu driver, it is difficult to find the root
cause.
Therefor print a warnning message if the system can't access the
extended configuration space register when using large bar.
Signed-off-by: Ma Jun <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|