Age | Commit message (Collapse) | Author | Files | Lines |
|
The irq number should be decided by num_crtc, and the num_crtc could change
by parameter.
Signed-off-by: Emily Deng <[email protected]>
Reviewed by: Monk Liu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
In amdgpu_ras_query_error_count() return an error
if the device doesn't support RAS. This prevents
that function from having to always set the values
of the integer pointers (if set), and thus
prevents function side effects--always to have to
set values of integers if integer pointers set,
regardless of whether RAS is supported or
not--with this change this side effect is
mitigated.
Also, if no pointers are set, don't count, since
we've no way of reporting the counts.
Also, give this function a kernel-doc.
Cc: Alexander Deucher <[email protected]>
Cc: John Clements <[email protected]>
Cc: Hawking Zhang <[email protected]>
Reported-by: Tom Rix <[email protected]>
Fixes: a46751fbcde505 ("drm/amdgpu: Fix RAS function interface")
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
If flr_work takes read_lock, then other threads who takes
read_lock can access hardware when host is doing vf flr.
[How]
flr_work should take write_lock to avoid this case.
Signed-off-by: Jingwen Chen <[email protected]>
Reviewed-by: Monk Liu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use correct channel and instance values
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: John Clements <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit 075e8080c1a7571563171a07fa9ce47c4bc80044.
Reason for revert: the related commit is reverted.
Signed-off-by: Eric Huang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit 31f33243788dcbae8bd2819ed83923a73f7dfd30.
Reason for revert: it causes regressions on several Asics.
Signed-off-by: Eric Huang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit 7a68d188d1c4a9d947369acaa19040a58baaaeda.
Reason for revert: the related commit is reverted.
Signed-off-by: Eric Huang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit 51627f03804173a64d23828bc9e4d8474451814f.
Reason for revert: it causes regression on Aldebaran.
Signed-off-by: Eric Huang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
After FLR, the msix will be cleared, so need to re-enable it.
Signed-off-by: Peng Ju Zhou <[email protected]>
Signed-off-by: Emily.Deng <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The thunk needs to mmap all BOs for CPU access to allow the debugger to
access them. Invisible ones are mapped with PROT_NONE.
Fixes: 71df0368e9b6 ("drm/amdgpu: Implement mmap as GEM object function")
Signed-off-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
support umc ras function initialization for aldebaran
v2: squash in compile fix
Signed-off-by: John Clements <[email protected]>
Signed-off-by: Stanley.Yang <[email protected]>
Reviewed-by: Guchun Chen <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The irq number should be decided by num_crtc, and the num_crtc could change
by parameter.
Signed-off-by: Emily Deng <[email protected]>
Reviewed by: Monk Liu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
If copy_to_user() fails then this should return -EFAULT instead of
-EINVAL.
Fixes: c65b0805e77919 ("drm/amdgpu: RAS EEPROM table is now in debugfs")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This error path needs to unlock before returning. While we're at it,
the correct error code from copy_to_user() failure is -EFAULT, not
-EINVAL.
Fixes: c65b0805e77919 ("drm/amdgpu: RAS EEPROM table is now in debugfs")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Pull drm fixes from Dave Airlie:
"Some fixes for rc1 that came in the past weeks, mainly a bunch of
amdgpu fixes, some i915 and the rest are misc around the place. I'm
sending this a bit early so some more stuff may show up, but I'll
probably take tomorrow off.
dma-buf:
- doc fixes
amdgpu:
- Misc Navi fixes
- Powergating fix
- Yellow Carp updates
- Beige Goby updates
- S0ix fix
- Revert overlay validation fix
- GPU reset fix for DC
- PPC64 fix
- Add new dimgrey cavefish DID
- RAS fix
- TTM fixes
amdkfd:
- SVM fixes
radeon:
- Fix missing drm_gem_object_put in error path
- NULL ptr deref fix
i915:
- display DP VSC fix
- DG1 display fix
- IRQ fixes
- IRQ demidlayering
gma500:
- bo leaks in error paths fixed"
* tag 'drm-next-2021-07-08-1' of git://anongit.freedesktop.org/drm/drm: (52 commits)
drm/i915: Drop all references to DRM IRQ midlayer
drm/i915: Use the correct IRQ during resume
drm/i915/display/dg1: Correctly map DPLLs during state readout
drm/i915/display: Do not zero past infoframes.vsc
drm/amdgpu: Conditionally reset SDMA RAS error counts
drm/amdkfd: Maintain svm_bo reference in page->zone_device_data
drm/amdkfd: add invalid pages debug at vram migration
drm/amdkfd: skip migration for pages already in VRAM
drm/amdkfd: skip invalid pages during migrations
drm/amdkfd: classify and map mixed svm range pages in GPU
drm/amdkfd: use hmm range fault to get both domain pfns
drm/amdgpu: get owner ref in validate and map
drm/amdkfd: set owner ref to svm range prefault
drm/amdkfd: add owner ref param to get hmm pages
drm/amdkfd: device pgmap owner at the svm migrate init
drm/amdkfd: inc counter on child ranges with xnack off
drm/amd/display: Extend DMUB diagnostic logging to DCN3.1
drm/amdgpu: Update NV SIMD-per-CU to 2
drm/amdgpu: add new dimgrey cavefish DID
drm/amd/pm: skip PrepareMp1ForUnload message in s0ix
...
|
|
The i2c_transfer() function returns negatives or else the number of
messages transferred. This code does not work because ARRAY_SIZE()
is type size_t and so that means negative values of "r" are type
promoted to high positive values which are greater than the ARRAY_SIZE().
Fix this by changing the < to != which works regardless of type
promotion.
Fixes: 746b584762e452 ("drm/amdgpu: Fixes to the AMDGPU EEPROM driver")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
If amdgpu_eeprom_read() returns a negative error code then the error
handling checks:
if (res < buf_size) {
The problem is that "buf_size" is a u32 so negative values are type
promoted to a high positive values and the condition is false. Fix
this by changing the type of "buf_size" to int.
Fixes: 63d4c081a556a1 ("drm/amdgpu: Optimize EEPROM RAS table I/O")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
In amdgpu_ras_query_error_count() return an error
if the device doesn't support RAS. This prevents
that function from having to always set the values
of the integer pointers (if set), and thus
prevents function side effects--always to have to
set values of integers if integer pointers set,
regardless of whether RAS is supported or
not--with this change this side effect is
mitigated.
Also, if no pointers are set, don't count, since
we've no way of reporting the counts.
Also, give this function a kernel-doc.
Cc: Alexander Deucher <[email protected]>
Cc: John Clements <[email protected]>
Cc: Hawking Zhang <[email protected]>
Reported-by: Tom Rix <[email protected]>
Fixes: a46751fbcde505 ("drm/amdgpu: Fix RAS function interface")
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
If flr_work takes read_lock, then other threads who takes
read_lock can access hardware when host is doing vf flr.
[How]
flr_work should take write_lock to avoid this case.
Signed-off-by: Jingwen Chen <[email protected]>
Reviewed-by: Monk Liu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The I2C IP doesn't support writes or reads of 0 bytes.
In order for a START/STOP transaction to take
place on the bus, the data written/read has to be
at least one byte.
That is, you cannot generate a write with 0 bytes,
just to get the ACK from a device, just so you can
probe that device if it is on the bus and so to
discover all devices on the bus--you'll have to
read at least one byte. Writes of 0 bytes generate
no START/STOP on this I2C IP--the bus is not
engaged at all.
Set the I2C_AQ_NO_ZERO_LEN to the existing I2C
quirk tables for Aldebaran, Arcturus, Navi10 and
Sienna Cichlid, and add a quirk table to the I2C
driver which drives the bus when the SMU
doesn't--for instance on Vega20.
Cc: Alexander Deucher <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Cc: Lijo Lazar <[email protected]>
Cc: John Clements <[email protected]>
Cc: Hawking Zhang <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
GPU timing counters are read via KIQ under sriov, which will introduce
a delay.
[How]
It could be directly read by MMIO.
v2: Add additional check to prevent carryover issue.
v3: Only check for carryover for once to prevent performance issue.
v4: Add comments of the rough frequency where carryover happens.
v5: Remove mutex and gfxoff ctrl unused with current timing registers.
Signed-off-by: YuBiao Wang <[email protected]>
Acked-by: Horace Chen <[email protected]>
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Monk Liu <[email protected]>
Reviewed-by: Lijo Lazar <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
It is based on reverting two patches back.
drm/amdkfd: Make TLB flush conditional on mapping
drm/amdgpu: Add table_freed parameter to amdgpu_vm_bo_update
Signed-off-by: Eric Huang <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use new helper function amdgpu_vm_set_pasid() to
assign vm pasid value. This also ensures that we don't free
a pasid from vm code as pasids are allocated somewhere else.
Signed-off-by: Nirmoy Das <[email protected]>
Acked-by: Felix Kuehling <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Replace idr with xarray as we actually need hash functionality.
Cleanup code related to vm pasid by adding helper function.
Signed-off-by: Nirmoy Das <[email protected]>
Acked-by: Felix Kuehling <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-next
Short summary of fixes pull:
* amdgpu: TTM fixes
* dma-buf: Doc fixes
* gma500: Fix potential BO leaks in error handling
* radeon: Fix NULL-ptr deref
Signed-off-by: Dave Airlie <[email protected]>
From: Thomas Zimmermann <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/YN2GK2SH64yqXqh9@linux-uq9g
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.14-2021-07-01:
amdgpu:
- Misc Navi fixes
- Powergating fix
- Yellow Carp updates
- Beige Goby updates
- S0ix fix
- Revert overlay validation fix
- GPU reset fix for DC
- PPC64 fix
- Add new dimgrey cavefish DID
- RAS fix
amdkfd:
- SVM fixes
radeon:
- Fix missing drm_gem_object_put in error path
Signed-off-by: Dave Airlie <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
- Added option for per CPU threads to the hwlat tracer
- Have hwlat tracer handle hotplug CPUs
- New tracer: osnoise, that detects latency caused by interrupts,
softirqs and scheduling of other tasks.
- Added timerlat tracer that creates a thread and measures in detail
what sources of latency it has for wake ups.
- Removed the "success" field of the sched_wakeup trace event. This has
been hardcoded as "1" since 2015, no tooling should be looking at it
now. If one exists, we can revert this commit, fix that tool and try
to remove it again in the future.
- tgid mapping fixed to handle more than PID_MAX_DEFAULT pids/tgids.
- New boot command line option "tp_printk_stop", as tp_printk causes
trace events to write to console. When user space starts, this can
easily live lock the system. Having a boot option to stop just after
boot up is useful to prevent that from happening.
- Have ftrace_dump_on_oops boot command line option take numbers that
match the numbers shown in /proc/sys/kernel/ftrace_dump_on_oops.
- Bootconfig clean ups, fixes and enhancements.
- New ktest script that tests bootconfig options.
- Add tracepoint_probe_register_may_exist() to register a tracepoint
without triggering a WARN*() if it already exists. BPF has a path
from user space that can do this. All other paths are considered a
bug.
- Small clean ups and fixes
* tag 'trace-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (49 commits)
tracing: Resize tgid_map to pid_max, not PID_MAX_DEFAULT
tracing: Simplify & fix saved_tgids logic
treewide: Add missing semicolons to __assign_str uses
tracing: Change variable type as bool for clean-up
trace/timerlat: Fix indentation on timerlat_main()
trace/osnoise: Make 'noise' variable s64 in run_osnoise()
tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
tracing: Fix spelling in osnoise tracer "interferences" -> "interference"
Documentation: Fix a typo on trace/osnoise-tracer
trace/osnoise: Fix return value on osnoise_init_hotplug_support
trace/osnoise: Make interval u64 on osnoise_main
trace/osnoise: Fix 'no previous prototype' warnings
tracing: Have osnoise_main() add a quiescent state for task rcu
seq_buf: Make trace_seq_putmem_hex() support data longer than 8
seq_buf: Fix overflow in seq_buf_putmem_hex()
trace/osnoise: Support hotplug operations
trace/hwlat: Support hotplug operations
trace/hwlat: Protect kdata->kthread with get/put_online_cpus
trace: Add timerlat tracer
trace: Add osnoise tracer
...
|
|
Pull drm updates from Dave Airlie:
"Highlights:
- AMD enables two more GPUs, with resulting header files
- i915 has started to move to TTM for discrete GPU and enable DG1
discrete GPU support (not by default yet)
- new HyperV drm driver
- vmwgfx adds arm64 support
- TTM refactoring ongoing
- 16bpc display support for AMD hw
Otherwise it's just the usual insane amounts of work all over the
place in lots of drivers and the core, as mostly summarised below:
Core:
- mark AGP ioctls as legacy
- disable force probing for non-master clients
- HDR metadata property helpers
- HDMI infoframe signal colorimetry support
- remove drm_device.pdev pointer
- remove DRM_KMS_FB_HELPER config option
- remove drm_pci_alloc/free
- drm_err_*/drm_dbg_* helpers
- use drm driver names for fbdev
- leaked DMA handle fix
- 16bpc fixed point format fourcc
- add prefetching memcpy for WC
- Documentation fixes
aperture:
- add aperture ownership helpers
dp:
- aux fixes
- downstream 0 port handling
- use extended base receiver capability DPCD
- Rename DP_PSR_SELECTIVE_UPDATE to better mach eDP spec
- mst: use khz as link rate during init
- VCPI fixes for StarTech hub
ttm:
- provide tt_shrink file via debugfs
- warn about freeing pinned BOs
- fix swapping error handling
- move page alignment into BO
- cleanup ttm_agp_backend
- add ttm_sys_manager
- don't override vm_ops
- ttm_bo_mmap removed
- make ttm_resource base of all managers
- remove VM_MIXEDMAP usage
panel:
- sysfs_emit support
- simple: runtime PM support
- simple: power up panel when reading EDID + caching
bridge:
- MHDP8546: HDCP support + DT bindings
- MHDP8546: Register DP AUX channel with userspace
- TI SN65DSI83 + SN65DSI84: add driver
- Sil8620: Fix module dependencies
- dw-hdmi: make CEC driver loading optional
- Ti-sn65dsi86: refclk fixes, subdrivers, runtime pm
- It66121: Add driver + DT bindings
- Adv7511: Support I2S IEC958 encoding
- Anx7625: fix power-on delay
- Nwi-dsi: Modesetting fixes; Cleanups
- lt6911: add missing MODULE_DEVICE_TABLE
- cdns: fix PM reference leak
hyperv:
- add new DRM driver for HyperV graphics
efifb:
- non-PCI device handling fixes
i915:
- refactor IP/device versioning
- XeLPD Display IP preperation work
- ADL-P enablement patches
- DG1 uAPI behind BROKEN
- disable mmap ioctl for discerte GPUs
- start enabling HuC loading for Gen12+
- major GuC backend rework for new platforms
- initial TTM support for Discrete GPUs
- locking rework for TTM prep
- use correct max source link rate for eDP
- %p4cc format printing
- GLK display fixes
- VLV DSI panel power fixes
- PSR2 disabled for RKL and ADL-S
- ACPI _DSM invalid access fixed
- DMC FW path abstraction
- ADL-S PCI ID update
- uAPI headers converted to kerneldoc
- initial LMEM support for DG1
- x86/gpu: add Jasperlake to gen11 early quirks
amdgpu:
- Aldebaran updates + initial SR-IOV
- new GPU: Beige Goby and Yellow Carp support
- more LTTPR display work
- Vangogh updates
- SDMA 5.x GCR fixes
- PCIe ASPM support
- Renoir TMZ enablement
- initial multiple eDP panel support
- use fdinfo to track devices/process info
- pin/unpin TTM fixes
- free resource on fence usage query
- fix fence calculation
- fix hotunplug/suspend issues
- GC/MM register access macro cleanup for SR-IOV
- W=1 fixes
- ACPI ATCS/ATIF handling rework
- 16bpc fixed point format support
- Initial smartshift support
- RV/PCO power tuning fixes
- new INFO query for additional vbios info
amdkfd:
- SR-IOV aldebaran support
- HMM SVM support
radeon:
- SMU regression fixes
- Oland flickering fix
vmwgfx:
- enable console with fbdev emulation
- fix cpu updates of coherent multisample surfaces
- remove reservation semaphore
- add initial SVGA3 support
- support arm64
msm:
- devcoredump support for display errors
- dpu/dsi: yaml bindings conversion
- mdp5: alpha/blend_mode/zpos support
- a6xx: cached coherent buffer support
- gpu iova fault improvement
- a660 support
rockchip:
- RK3036 win1 scaling support
- RK3066/3188 missing register support
- RK3036/3066/3126/3188 alpha support
mediatek:
- MT8167 HDMI support
- MT8183 DPI dual edge support
tegra:
- fixed YUV support/scaling on Tegra186+
ast:
- use pcim_iomap
- fix DP501 EDID
bochs:
- screen blanking support
etnaviv:
- export more GPU ID values to userspace
- add HWDB entry for GPU on i.MX8MP
- rework linear window calcs
exynos:
- pm runtime changes
imx:
- Annotate dma_fence critical section
- fix PRG modifiers after drmm conversion
- Add 8 pixel alignment fix for 1366x768
- fix YUV advertising
- add color properties
ingenic:
- IPU planes fix
panfrost:
- Mediatek MT8183 support + DT bindings
- export AFBC_FEATURES register to userspace
simpledrm:
- %pr for printing resources
nouveau:
- pin/unpin TTM fixes
qxl:
- unpin shadow BO
virtio:
- create dumb BOs as guest blob
vkms:
- drmm_universal_plane_alloc
- add XRGB plane composition
- overlay support"
* tag 'drm-next-2021-07-01' of git://anongit.freedesktop.org/drm/drm: (1570 commits)
drm/i915: Reinstate the mmap ioctl for some platforms
drm/i915/dsc: abstract helpers to get bigjoiner primary/secondary crtc
Revert "drm/msm/mdp5: provide dynamic bandwidth management"
drm/msm/mdp5: provide dynamic bandwidth management
drm/msm/mdp5: add perf blocks for holding fudge factors
drm/msm/mdp5: switch to standard zpos property
drm/msm/mdp5: add support for alpha/blend_mode properties
drm/msm/mdp5: use drm_plane_state for pixel blend mode
drm/msm/mdp5: use drm_plane_state for storing alpha value
drm/msm/mdp5: use drm atomic helpers to handle base drm plane state
drm/msm/dsi: do not enable PHYs when called for the slave DSI interface
drm/msm: Add debugfs to trigger shrinker
drm/msm/dpu: Avoid ABBA deadlock between IRQ modules
drm/msm: devcoredump iommu fault support
iommu/arm-smmu-qcom: Add stall support
drm/msm: Improve the a6xx page fault handler
iommu/arm-smmu-qcom: Add an adreno-smmu-priv callback to get pagefault info
iommu/arm-smmu: Add support for driver IOMMU fault handlers
drm/msm: export hangcheck_period in debugfs
drm/msm/a6xx: add support for Adreno 660 GPU
...
|
|
In case when psp_init_asd_microcode() fails to load ASD microcode file,
psp_v12_0_init_microcode() tries to print the firmware filename that
failed to load before bailing out.
This is wrong because:
- the firmware filename it would want it print is an incorrect one as
psp_init_asd_microcode() and psp_v12_0_init_microcode() are loading
different filenames
- it tries to print fw_name, but that's not yet been initialized by that
time, so it prints random stack contents, e.g.
amdgpu 0000:04:00.0: Direct firmware load for amdgpu/renoir_asd.bin failed with error -2
amdgpu 0000:04:00.0: amdgpu: fail to initialize asd microcode
amdgpu 0000:04:00.0: amdgpu: psp v12.0: Failed to load firmware "\xfeTO\x8e\xff\xff"
Fix that by bailing out immediately, instead of priting the bogus error
message.
Reported-by: Vojtech Pavlik <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit 4192f7b5768912ceda82be2f83c87ea7181f9980.
It is not true (as stated in the reverted commit changelog) that we never
unmap the BAR on failure; it actually does happen properly on
amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() ->
amdgpu_device_fini() error path.
What's worse, this commit actually completely breaks resource freeing on
probe failure (like e.g. failure to load microcode), as
amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too
early, leaving all the resources that'd normally be freed in
amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading
to all sorts of oopses when someone tries to, for example, access the
sysfs and procfs resources which are still around while the driver is
gone.
Fixes: 4192f7b57689 ("drm/amdgpu: unmap register bar on device init failure")
Reported-by: Vojtech Pavlik <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Print the name of the DRM driver when taking over fbdev devices. Makes
the output to dmesg more consistent. Note that the driver name is only
used for printing a string to the kernel log. No UAPI is affected by this
change.
Signed-off-by: Thomas Zimmermann <[email protected]>
Acked-by: Nirmoy Das <[email protected]>
Acked-by: Chen-Yu Tsai <[email protected]> # sun4i
Acked-by: Neil Armstrong <[email protected]> # meson
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Mali Midgard/Bifrost GPUs have 3 hardware queues but only a global GPU
reset. This leads to extra complexity when we need to synchronize timeout
works with the reset work. One solution to address that is to have an
ordered workqueue at the driver level that will be used by the different
schedulers to queue their timeout work. Thanks to the serialization
provided by the ordered workqueue we are guaranteed that timeout
handlers are executed sequentially, and can thus easily reset the GPU
from the timeout handler without extra synchronization.
v5:
* Add a new paragraph to the timedout_job() method
v3:
* New patch
v4:
* Actually use the timeout_wq to queue the timeout work
Suggested-by: Daniel Vetter <[email protected]>
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Steven Price <[email protected]>
Reviewed-by: Lucas Stach <[email protected]>
Acked-by: Daniel Vetter <[email protected]>
Acked-by: Christian König <[email protected]>
Cc: Qiang Yu <[email protected]>
Cc: Emma Anholt <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: "Christian König" <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Use amdgpu_ucode_name to show ucode name and psp_gfx_cmd_name to
show psp_gfx_cmd name in psp_cmd_submit_buf.
v2: adjust function name
Signed-off-by: Lang Yu <[email protected]>
Reviewed-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Implement function psp_gfx_cmd_name to show cmd name
via cmd id.
v2: rename it to psp_gfx_cmd_name
Signed-off-by: Lang Yu <[email protected]>
Reviewed-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Implement function amdgpu_ucode_name to show ucode name
via ucode id.
v2: rename it to amdgpu_ucode_name
Signed-off-by: Lang Yu <[email protected]>
Reviewed-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
On long transfers to the EEPROM device,
i.e. write, it is observed that the driver aborts
the transfer.
The reason for this is that the driver isn't
patient enough--the IC_STATUS register's contents
is 0x27, which is MST_ACTIVITY | TFE | TFNF |
ACTIVITY. That is, while the transmission FIFO is
empty, we, the I2C master device, are still
driving the bus.
Implement the correct procedure to disable
the block, as described in the DesignWare I2C
Databook, section 3.8.3 Disabling DW_apb_i2c on
page 56. Now there are no premature aborts on long
data transfers.
Cc: Alexander Deucher <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Acked-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
In smu_v11_0_i2c_transmit() use a single loop to
transmit bytes, instead of two nested loops.
Cc: Alexander Deucher <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Debugfs RAS EEPROM files are available when
the ASIC supports RAS, and when the debugfs is
enabled, an also when "ras_enable" module
parameter is set to 0. However in this case,
we get a kernel oops when accessing some of
the "ras_..." controls in debugfs. The reason
for this is that struct amdgpu_ras::adev is
unset. This commit sets it, thus enabling access
to those facilities. Note that this facilitates
EEPROM access and not necessarily RAS features or
functionality.
Cc: Alexander Deucher <[email protected]>
Cc: John Clements <[email protected]>
Cc: Hawking Zhang <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Acked-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
pos is 64 bits.
Fixes: c65b0805e77919 ("drm/amdgpu: RAS EEPROM table is now in debugfs")
Cc: [email protected]
Reported-by: kernel test robot <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add "ras_eeprom_size" file in debugfs, which
reports the maximum size allocated to the RAS
table in EEROM, as the number of bytes and the
number of records it could store. For instance,
$cat /sys/kernel/debug/dri/0/ras/ras_eeprom_size
262144 bytes or 10921 records
$_
Add "ras_eeprom_table" file in debugfs, which
dumps the RAS table stored EEPROM, in a formatted
way. For instance,
$cat ras_eeprom_table
Signature Version FirstOffs Size Checksum
0x414D4452 0x00010000 0x00000014 0x000000EC 0x000000DA
Index Offset ErrType Bank/CU TimeStamp Offs/Addr MemChl MCUMCID RetiredPage
0 0x00014 ue 0x00 0x00000000607608DC 0x000000000000 0x00 0x00 0x000000000000
1 0x0002C ue 0x00 0x00000000607608DC 0x000000001000 0x00 0x00 0x000000000001
2 0x00044 ue 0x00 0x00000000607608DC 0x000000002000 0x00 0x00 0x000000000002
3 0x0005C ue 0x00 0x00000000607608DC 0x000000003000 0x00 0x00 0x000000000003
4 0x00074 ue 0x00 0x00000000607608DC 0x000000004000 0x00 0x00 0x000000000004
5 0x0008C ue 0x00 0x00000000607608DC 0x000000005000 0x00 0x00 0x000000000005
6 0x000A4 ue 0x00 0x00000000607608DC 0x000000006000 0x00 0x00 0x000000000006
7 0x000BC ue 0x00 0x00000000607608DC 0x000000007000 0x00 0x00 0x000000000007
8 0x000D4 ue 0x00 0x00000000607608DD 0x000000008000 0x00 0x00 0x000000000008
$_
Cc: Alexander Deucher <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Cc: John Clements <[email protected]>
Cc: Hawking Zhang <[email protected]>
Cc: Xinhui Pan <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Acked-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Split functionality between read and write, which
simplifies the code and exposes areas of
optimization and more or less complexity, and take
advantage of that.
Read and write the table in one go; use a separate
stage to decode or encode the data, as opposed to
on the fly, which keeps the I2C bus busy. Use a
single read/write to read/write the table or at
most two if the number of records we're
reading/writing wraps around.
Check the check-sum of a table in EEPROM on init.
Update the checksum at the same time as when
updating the table header signature, when the
threshold was increased on boot.
Take advantage of arithmetic modulo 256, that is,
use a byte!, to greatly simplify checksum
arithmetic.
Cc: Alexander Deucher <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Acked-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The code is now tested from userspace.
Remove already macroed out test function.
Cc: Alexander Deucher <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Qualify with "ras_". Use kernel's own--don't
redefine your own.
Cc: Alexander Deucher <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
buff --> buf. Essentially buffer abbreviates to
buf, remove 1/2 of it, or just the iron part, as
opposed to just the Er,
Cc: Alexander Deucher <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
RAS_MAX_RECORD_NUM may mean the maximum record
number, as in the maximum house number on your
street, or it may mean the maximum number of
records, as in the count of records, which is also
a number. To make this distinction whether the
number is ordinal (index) or cardinal (count),
rename this macro to RAS_MAX_RECORD_COUNT.
This makes it easy to understand what it refers
to, especially when we compute quantities such as,
how many records do we have left in the table,
especially when there are so many other numbers,
quantities and numerical macros around.
Also rename the long,
amdgpu_ras_eeprom_get_record_max_length() to the
more succinct and clear,
amdgpu_ras_eeprom_max_record_count().
When computing the threshold, which also deals
with counts, i.e. "how many", use cardinal
"max_eeprom_records_count", than the quantitative
"max_eeprom_records_len".
Simplify the logic here and there, as well.
Cc: Guchun Chen <[email protected]>
Cc: John Clements <[email protected]>
Cc: Hawking Zhang <[email protected]>
Cc: Alexander Deucher <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Acked-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Rename update_table_header() to
write_table_header() as this function is actually
writing it to EEPROM.
Use kernel types; use u8 to carry around the
checksum, in order to take advantage of arithmetic
modulo 8-bits (256).
Tidy up to 80 columns.
When updating the checksum, just recalculate the
whole thing.
Cc: Alexander Deucher <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Acked-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
No need to account for the 2 bytes of EEPROM
address--this is now well abstracted away by
the fixes the the lower layers.
Cc: Andrey Grodzovsky <[email protected]>
Cc: Alexander Deucher <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Acked-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The low level EEPROM write method, doesn't return
1, but the number of bytes written. Thus do not
compare to 1, instead, compare to greater than 0
for success.
Other cleanup: if the lower layers returned
-errno, then return that, as opposed to
overwriting the error code with one-fits-all
-EINVAL. For instance, some return -EAGAIN.
Cc: Jean Delvare <[email protected]>
Cc: Alexander Deucher <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Cc: Lijo Lazar <[email protected]>
Cc: Stanley Yang <[email protected]>
Cc: Hawking Zhang <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Reviewed-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The I2C address is kept as a 16-bit quantity in
the kernel. The I2C_TAR::I2C_TAR field is 10-bit
wide.
Fix the width of the I2C address for Vega20 from 8
bits to 16 bits to accommodate the full spectrum
of I2C address space.
Cc: Jean Delvare <[email protected]>
Cc: Alexander Deucher <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Cc: Lijo Lazar <[email protected]>
Cc: Stanley Yang <[email protected]>
Cc: Hawking Zhang <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Acked-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add explicit amdgpu_eeprom_read() and
amdgpu_eeprom_write() for clarity.
Cc: Jean Delvare <[email protected]>
Cc: Alexander Deucher <[email protected]>
Cc: Andrey Grodzovsky <[email protected]>
Cc: Lijo Lazar <[email protected]>
Cc: Stanley Yang <[email protected]>
Cc: Hawking Zhang <[email protected]>
Signed-off-by: Luben Tuikov <[email protected]>
Acked-by: Alexander Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|