Age | Commit message (Collapse) | Author | Files | Lines |
|
Delete ras_if->name in the RAS ctx structure and remove related lines.
Signed-off-by: Candice Li <[email protected]>
Reviewed-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Signed-off-by: Candice Li <[email protected]>
Reviewed-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
In amdgpu_ras_query_error_count() return an error
if the device doesn't support RAS. This prevents
that function from having to always set the values
of the integer pointers (if set), and thus
prevents function side effects--always to have to
set values of integers if integer pointers set,
regardless of whether RAS is supported or
not--with this change this side effect is
mitigated.
Also, if no pointers are set, don't count, since
we've no way of reporting the counts.
Also, give this function a kernel-doc.
Cc: Alexander Deucher <[email protected]>
Cc: John Clements <[email protected]>
Cc: Hawking Zhang <[email protected]>
Reported-by: Tom Rix <[email protected]>
Fixes: a46751fbcde505 ("drm/amdgpu: Fix RAS function interface")
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Debugfs RAS EEPROM files are available when
the ASIC supports RAS, and when the debugfs is
enabled, an also when "ras_enable" module
parameter is set to 0. However in this case,
we get a kernel oops when accessing some of
the "ras_..." controls in debugfs. The reason
for this is that struct amdgpu_ras::adev is
unset. This commit sets it, thus enabling access
to those facilities. Note that this facilitates
EEPROM access and not necessarily RAS features or
functionality.
Cc: Alexander Deucher <[email protected]>
Cc: John Clements <[email protected]>
Cc: Hawking Zhang <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Acked-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add "ras_eeprom_size" file in debugfs, which
reports the maximum size allocated to the RAS
table in EEROM, as the number of bytes and the
number of records it could store. For instance,
$cat /sys/kernel/debug/dri/0/ras/ras_eeprom_size
262144 bytes or 10921 records
$_
Add "ras_eeprom_table" file in debugfs, which
dumps the RAS table stored EEPROM, in a formatted
way. For instance,
$cat ras_eeprom_table
Signature Version FirstOffs Size Checksum
0x414D4452 0x00010000 0x00000014 0x000000EC 0x000000DA
Index Offset ErrType Bank/CU TimeStamp Offs/Addr MemChl MCUMCID RetiredPage
0 0x00014 ue 0x00 0x00000000607608DC 0x000000000000 0x00 0x00 0x000000000000
1 0x0002C ue 0x00 0x00000000607608DC 0x000000001000 0x00 0x00 0x000000000001
2 0x00044 ue 0x00 0x00000000607608DC 0x000000002000 0x00 0x00 0x000000000002
3 0x0005C ue 0x00 0x00000000607608DC 0x000000003000 0x00 0x00 0x000000000003
4 0x00074 ue 0x00 0x00000000607608DC 0x000000004000 0x00 0x00 0x000000000004
5 0x0008C ue 0x00 0x00000000607608DC 0x000000005000 0x00 0x00 0x000000000005
6 0x000A4 ue 0x00 0x00000000607608DC 0x000000006000 0x00 0x00 0x000000000006
7 0x000BC ue 0x00 0x00000000607608DC 0x000000007000 0x00 0x00 0x000000000007
8 0x000D4 ue 0x00 0x00000000607608DD 0x000000008000 0x00 0x00 0x000000000008
$_
Cc: Alexander Deucher <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Cc: John Clements <[email protected]>
Cc: Hawking Zhang <[email protected]>
Cc: Xinhui Pan <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Acked-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Split functionality between read and write, which
simplifies the code and exposes areas of
optimization and more or less complexity, and take
advantage of that.
Read and write the table in one go; use a separate
stage to decode or encode the data, as opposed to
on the fly, which keeps the I2C bus busy. Use a
single read/write to read/write the table or at
most two if the number of records we're
reading/writing wraps around.
Check the check-sum of a table in EEPROM on init.
Update the checksum at the same time as when
updating the table header signature, when the
threshold was increased on boot.
Take advantage of arithmetic modulo 256, that is,
use a byte!, to greatly simplify checksum
arithmetic.
Cc: Alexander Deucher <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Acked-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Qualify with "ras_". Use kernel's own--don't
redefine your own.
Cc: Alexander Deucher <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
RAS_MAX_RECORD_NUM may mean the maximum record
number, as in the maximum house number on your
street, or it may mean the maximum number of
records, as in the count of records, which is also
a number. To make this distinction whether the
number is ordinal (index) or cardinal (count),
rename this macro to RAS_MAX_RECORD_COUNT.
This makes it easy to understand what it refers
to, especially when we compute quantities such as,
how many records do we have left in the table,
especially when there are so many other numbers,
quantities and numerical macros around.
Also rename the long,
amdgpu_ras_eeprom_get_record_max_length() to the
more succinct and clear,
amdgpu_ras_eeprom_max_record_count().
When computing the threshold, which also deals
with counts, i.e. "how many", use cardinal
"max_eeprom_records_count", than the quantitative
"max_eeprom_records_len".
Simplify the logic here and there, as well.
Cc: Guchun Chen <[email protected]>
Cc: John Clements <[email protected]>
Cc: Hawking Zhang <[email protected]>
Cc: Alexander Deucher <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Acked-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The low level EEPROM write method, doesn't return
1, but the number of bytes written. Thus do not
compare to 1, instead, compare to greater than 0
for success.
Other cleanup: if the lower layers returned
-errno, then return that, as opposed to
overwriting the error code with one-fits-all
-EINVAL. For instance, some return -EAGAIN.
Cc: Jean Delvare <[email protected]>
Cc: Alexander Deucher <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Cc: Lijo Lazar <[email protected]>
Cc: Stanley Yang <[email protected]>
Cc: Hawking Zhang <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Wrap amdgpu_ras_eeprom_xfer(..., bool write),
into amdgpu_ras_eeprom_read() and
amdgpu_ras_eeprom_write(), as that makes reading
and understanding the code clearer.
Cc: Jean Delvare <[email protected]>
Cc: Alexander Deucher <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Cc: Lijo Lazar <[email protected]>
Cc: Stanley Yang <[email protected]>
Cc: Hawking Zhang <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Acked-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Instead of fixing the spelling in
amdgpu_ras_eeprom_process_recods(),
rename it to,
amdgpu_ras_eeprom_xfer(),
to look similar to other I2C and protocol
transfer (read/write) functions.
Also to keep the column span to within reason by
using a shorter name.
Change the "num" function parameter from "int" to
"const u32" since it is the number of items
(records) to xfer, i.e. their count, which cannot
be a negative number.
Also swap the order of parameters, keeping the
pointer to records and their number next to each
other, while the direction now becomes the last
parameter.
Cc: Jean Delvare <[email protected]>
Cc: Alexander Deucher <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Cc: Lijo Lazar <[email protected]>
Cc: Stanley Yang <[email protected]>
Cc: Hawking Zhang <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Acked-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use SMU to update the bad pages rather than directly
accessing the EEPROM from the driver.
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use adev_to_drm() to get to the drm_device pointer.
Signed-off-by: Guchun Chen <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.14-2021-06-02:
amdgpu:
- GC/MM register access macro clean up for SR-IOV
- Beige Goby updates
- W=1 Fixes
- Aldebaran fixes
- Misc display fixes
- ACPI ATCS/ATIF handling rework
- SR-IOV fixes
- RAS fixes
- 16bpc fixed point format support
- Initial smartshift support
- RV/PCO power tuning fixes for suspend/resume
- More buffer object subclassing work
- Add new INFO query for additional vbios information
- Add new placement for preemptable SG buffers
amdkfd:
- Misc fixes
radeon:
- W=1 Fixes
- Misc cleanups
UAPI:
- Add new INFO query for additional vbios information
Useful for debugging vbios related issues. Proposed umr patch:
https://patchwork.freedesktop.org/patch/433297/
- 16bpc fixed point format support
IGT test:
https://lists.freedesktop.org/archives/igt-dev/2021-May/031507.html
Proposed Vulkan patch:
https://github.com/kleinerm/pal/commit/a25d4802074b13a8d5f7edc96ae45469ecbac3c4
- Add a new GEM flag which is only used internally in the kernel driver. Userspace
is not allowed to set it.
drm:
- 16bpc fixed point format fourcc
Signed-off-by: Dave Airlie <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
On Context Query2 IOCTL return the correctable and
uncorrectable errors in O(1) fashion, from cached
values, and schedule a delayed work function to
calculate and cache them for the next such IOCTL.
v2: Cancel pending delayed work at ras_fini().
v3: Remove conditionals when dealing with delayed
work manipulation as they're inherently racy.
Cc: Alexander Deucher <[email protected]>
Cc: Christian König <[email protected]>
Cc: John Clements <[email protected]>
Cc: Hawking Zhang <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Alexander Deucher <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The correctable and uncorrectable errors
are calculated at each invocation of this
function. Therefore, it is highly inefficient to
return just one of them based on a Boolean
input. If the caller wants both, twice the work
would be done. (And this work is O(n^3) on
Vega20.)
Fix this "interface" to simply return what it had
calculated--both values. Let the caller choose
what it wants to record, inspect, use.
Cc: Alexander Deucher <[email protected]>
Cc: John Clements <[email protected]>
Cc: Hawking Zhang <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Backmerging from drm/drm-next to the patches for AMD devices
for v5.14.
Signed-off-by: Thomas Zimmermann <[email protected]>
|
|
Some of the stuff in amdgpu_device_fini such as HW interrupts
disable and pending fences finilization must be done right away on
pci_remove while most of the stuff which relates to finilizing and
releasing driver data structures can be kept until
drm_driver.release hook is called, i.e. when the last device
reference is dropped.
v4: Change functions prefix early->hw and late->sw
Signed-off-by: Andrey Grodzovsky <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Only clear RAS error counters if perestent EDC harvesting is not supported
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix a couple of syntax errors and removed one excess
parameter in the function documentations which lead
to kernel docs build warning.
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Dwaipayan Ray <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
For aldebaran, hardware will not clear error status automatically when
reading error status register, insteadly driver should set clear bit of
the error status register explicitly to clear error status.
Signed-off-by: Dennis Li <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The original codes use ras status and kernl errno together in the same
function, which is a wrong code style.
Signed-off-by: Dennis Li <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
If RAS is disabled through amdgpu_ras_enable kernel parameter,
we should quit the RAS initialization eariler to avoid initialization
of some RAS data structure such as sysfs etc.
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Oak Zeng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Export the runtime-set "ras_hw_enabled" and
"ras_enabled" to debugfs, for debugging.
Cc: Alexander Deucher <[email protected]>
Cc: John Clements <[email protected]>
Cc: Hawking Zhang <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Rename,
ras_hw_supported --> ras_hw_enabled, and
ras_features --> ras_enabled,
to show that ras_enabled is a subset of
ras_hw_enabled, which itself is a subset
of the ASIC capability.
Cc: Alexander Deucher <[email protected]>
Cc: John Clements <[email protected]>
Cc: Hawking Zhang <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Move ras_hw_supported into struct amdgpu_dev.
The dependency is:
struct amdgpu_ras <== struct amdgpu_dev <== ASIC,
read as "struct amdgpu_ras depends on struct
amdgpu_dev, which depends on the hardware."
This can be loosely understood as, "if RAS is
supported, which is property of the ASIC (struct
amdgpu_dev), then we can access struct
amdgpu_ras."
v2: Fix a typo: must binary AND in ternary cond
in amdgpu_ras.c
Cc: Alexander Deucher <[email protected]>
Cc: John Clements <[email protected]>
Cc: Hawking Zhang <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Remove redundant ras->supported, as this value
is also stored in adev->ras_features.
Use adev->ras_features, as that supercedes "ras",
since the latter is its member.
The dependency goes like this:
ras <== adev->ras_features <== hw_supported,
and is read as "ras depends on ras_features, which
depends on hw_supported." The arrows show the flow
of information, i.e. the dependency update.
"hw_supported" should also live in "adev".
Cc: Alexander Deucher <[email protected]>
Cc: John Clements <[email protected]>
Cc: Hawking Zhang <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Feifei Xu <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
gfx ras now can be enabled by default in aldebaran
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: John Clements <[email protected]>
Reviewed-by: Dennis Li <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
add hdp block ras error query and reset support in
amdgpu ras error count query and reset interface
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: John Clements <[email protected]>
Reviewed-by: Dennis Li <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add socket/die information in RAS messages for platforms
that support query those information
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
aldebaran gfx ras is still under development
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Dennis Li <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Feifei Xu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Reset the RAS error count and error status registers after
reading to prevent over reporting error counts on Aldebaran.
Signed-off-by: Mukul Joshi <[email protected]>
Reviewed-By: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
because "sscanf(str, "retire_page")" always return 0, if application use
the raw data for error injection, it always wrongly falls into "op ==
3". Change to use strstr instead.
Signed-off-by: Dennis Li <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add back the double-sscanf so that both decimal
and hexadecimal values could be read in, but this
time invert the scan so that hexadecimal format
with a leading 0x is tried first, and if that
fails, then try decimal format.
Also use a logical-AND instead of nesting double
if-conditional.
See commit "drm/amdgpu: Fix a bug for input with double sscanf"
Cc: Alexander Deucher <[email protected]>
Cc: Hawking Zhang <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Imporve the kernel-doc for the RAS sysfs
interface. Fix the grammar, fix the context.
Cc: Alexander Deucher <[email protected]>
Cc: Hawking Zhang <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add bad_page_cnt_threshold to debugfs, an optional
file system used for debugging, for reporting
purposes only--it usually matches the size of
EEPROM but may be different depending on the
"bad_page_threshold" kernel module option.
The "bad_page_cnt_threshold" is a dynamically
computed value. It depends on three things: the
VRAM size; the size of the EEPROM (or the size
allocated to the RAS table therein); and the
"bad_page_threshold" module parameter. It is a
dynamically computed value, when the amdgpu module
is run, on which further parameters and logic
depend, and as such it is helpful to see the
dynamically computed value in debugfs.
Cc: Alexander Deucher <[email protected]>
Cc: Hawking Zhang <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix if (ret) --> if (!ret), a bug, for
"retire_page", which caused the kernel to recall
the method with *pos == end of file, and that
bounced back with error. On the first run, we
advanced *pos, but returned 0 back to fs layer,
also a bug.
Fix the logic of the check of the result of
amdgpu_reserve_page_direct()--it is 0 on success,
and non-zero on error, not the other way
around. This patch fixes this bug.
Cc: Alexander Deucher <[email protected]>
Cc: John Clements <[email protected]>
Cc: Hawking Zhang <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Remove double-sscanf to scan for %llu and 0x%llx,
as that is not going to work!
The %llu will consume the "0" in "0x" of your
input, and the hex value you think you're entering
will always be 0. That is, a valid hex value can
never be consumed.
On the other hand, just entering a hex number
without leading 0x will either be scanned as a
string and not match, for instance FAB123, or
the leading decimal portion is scanned as the
%llu, for instance 123FAB will be scanned as 123,
which is not correct.
Thus remove the first %llu scan and leave only the
%llx scan, removing the leading 0x since %llx can
scan either.
Addresses are usually always hex values, so this
suffices.
Cc: Alexander Deucher <[email protected]>
Cc: Xinhui Pan <[email protected]>
Cc: Hawking Zhang <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
added support in RAS debugfs to add bad page for isolated page retirement testing
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
In event of RAS UE + warm reset, error counters shall be harvested and cleared on driver load
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
gfx ras is only available in cerntain ip generations.
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Dennis Li <[email protected]>
Reviewed-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
mmhub ras is only avaiable in cerntain mmhub ip
generation.
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Dennis Li <[email protected]>
Reviewed-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
umc ras is not managed by gpu driver when gpu is
connected to cpu through xgmi. split umc callbacks
into ras and non-ras ones so gpu driver only
initializes umc ras callbacks when it manages
umc ras.
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Dennis Li <[email protected]>
Reviewed-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
xgmi ras is not managed by gpu driver when gpu is
connected to cpu through xgmi. move all xgmi ras
functions to xgmi_ras_funcs so gpu driver only
initializes xgmi ras functions when it manages
xgmi ras.
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Dennis Li <[email protected]>
Reviewed-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
nbio ras is not managed by gpu driver when gpu is
connected to cpu through xgmi. split nbio callbacks
into ras and non-ras ones so gpu driver only
initializes nbio ras callbacks when it manages
nbio ras.
Signed-off-by: Hawking Zhang <[email protected]>
Reviewed-by: Dennis Li <[email protected]>
Reviewed-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Driver only manages GFX/SDMA/MMHUB RAS in platforms
that gpu node is connected to cpu through XGMI, other
than that, it queries VBIOS for RAS capabilities.
Signed-off-by: Hawking Zhang <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Reviewed-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Fix patch check warning:
WARNING: suspect code indent for conditional statements (8, 17)
+ if (obj && obj->use < 0) {
+ DRM_ERROR("RAS ERROR: Unbalance obj(%s) use\n", obj->head.name);
WARNING: braces {} are not necessary for single statement blocks
+ if (obj && obj->use < 0) {
+ DRM_ERROR("RAS ERROR: Unbalance obj(%s) use\n", obj->head.name);
+ }
Signed-off-by: Bernard Zhao <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|