aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-04-30Merge tag 'amd-drm-next-6.10-2024-04-26' of ↵Dave Airlie185-286/+1650
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.10-2024-04-26: amdgpu: - Misc code cleanups and refactors - Support setting reset method at runtime - Report OD status - SMU 14.0.1 fixes - SDMA 4.4.2 fixes - VPE fixes - MES fixes - Update BO eviction priorities - UMSCH fixes - Reset fixes - Freesync fixes - GFXIP 9.4.3 fixes - SDMA 5.2 fixes - MES UAF fix - RAS updates - Devcoredump updates for dumping IP state - DSC fixes - JPEG fix - Fix VRAM memory accounting - VCN 5.0 fixes - MES fixes - UMC 12.0 updates - Modify contiguous flags handling - Initial support for mapping kernel queues via MES amdkfd: - Fix rescheduling of restore worker - VRAM accounting for SVM migrations - mGPU fix - Enable SQ watchpoint for gfx10 Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-04-30Merge tag 'drm-intel-gt-next-2024-04-26' of ↵Dave Airlie54-156/+414
https://anongit.freedesktop.org/git/drm/drm-intel into drm-next UAPI Changes: - drm/i915/guc: Use context hints for GT frequency Allow user to provide a low latency context hint. When set, KMD sends a hint to GuC which results in special handling for this context. SLPC will ramp the GT frequency aggressively every time it switches to this context. The down freq threshold will also be lower so GuC will ramp down the GT freq for this context more slowly. We also disable waitboost for this context as that will interfere with the strategy. We need to enable the use of SLPC Compute strategy during init, but it will apply only to contexts that set this bit during context creation. Userland can check whether this feature is supported using a new param- I915_PARAM_HAS_CONTEXT_FREQ_HINT. This flag is true for all guc submission enabled platforms as they use SLPC for frequency management. The Mesa usage model for this flag is here - https://gitlab.freedesktop.org/sushmave/mesa/-/commits/compute_hint - drm/i915/gt: Enable only one CCS for compute workload Enable only one CCS engine by default with all the compute sices allocated to it. While generating the list of UABI engines to be exposed to the user, exclude any additional CCS engines beyond the first instance *** NOTE: This W/A will make all DG2 SKUs appear like single CCS SKUs by default to mitigate a hardware bug. All the EUs will still remain usable, and all the userspace drivers have been confirmed to be able to dynamically detect the change in number of CCS engines and adjust. For the smaller percent of applications that get perf benefit from letting the userspace driver dispatch across all 4 CCS engines we will be introducing a sysfs control as a later patch to choose 4 CCS each with 25% EUs (or 50% if 2 CCS). NOTE: A regression has been reported at https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10895 However Andi has been triaging the issue and we're closing in a fix to the gap in the W/A implementation: https://lists.freedesktop.org/archives/intel-gfx/2024-April/348747.html Driver Changes: - Add new and fix to existing workarounds: Wa_14018575942 (MTL), Wa_16019325821 (Gen12.70), Wa_14019159160 (MTL), Wa_16015675438, Wa_14020495402 (Gen12.70) (Tejas, John, Lucas) - Fix UAF on destroy against retire race and remove two earlier partial fixes (Janusz) - Limit the reserved VM space to only the platforms that need it (Andi) - Reset queue_priority_hint on parking for execlist platforms (Chris) - Fix gt reset with GuC submission is disabled (Nirmoy) - Correct capture of EIR register on hang (John) - Remove usage of the deprecated ida_simple_xx() API - Refactor confusing __intel_gt_reset() (Nirmoy) - Fix the fix for GuC reset lock confusion (John) - Simplify/extend platform check for Wa_14018913170 (John) - Replace dev_priv with i915 (Andi) - Add and use gt_to_guc() wrapper (Andi) - Remove bogus null check (Rodrigo, Dan) . Selftest improvements (Janusz, Nirmoy, Daniele) Signed-off-by: Dave Airlie <[email protected]> From: Joonas Lahtinen <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-04-29Merge v6.9-rc6 into drm-nextDaniel Vetter355-1900/+3033
Thomas needs the defio fixes, Maíra needs the vkms fixes and Joonas has some fun with i915-gem conflicts. Signed-off-by: Daniel Vetter <[email protected]>
2024-04-28Linux 6.9-rc6Linus Torvalds1-1/+1
2024-04-28Merge tag 'sched-urgent-2024-04-28' of ↵Linus Torvalds3-21/+38
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: - Fix EEVDF corner cases - Fix two nohz_full= related bugs that can cause boot crashes and warnings * tag 'sched-urgent-2024-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/isolation: Fix boot crash when maxcpus < first housekeeping CPU sched/isolation: Prevent boot crash when the boot CPU is nohz_full sched/eevdf: Prevent vlag from going out of bounds in reweight_eevdf() sched/eevdf: Fix miscalculation in reweight_entity() when se is not curr sched/eevdf: Always update V if se->on_rq when reweighting
2024-04-28Merge tag 'x86-urgent-2024-04-28' of ↵Linus Torvalds10-17/+53
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: - Make the CPU_MITIGATIONS=n interaction with conflicting mitigation-enabling boot parameters a bit saner. - Re-enable CPU mitigations by default on non-x86 - Fix TDX shared bit propagation on mprotect() - Fix potential show_regs() system hang when PKE initialization is not fully finished yet. - Add the 0x10-0x1f model IDs to the Zen5 range - Harden #VC instruction emulation some more * tag 'x86-urgent-2024-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpu: Ignore "mitigations" kernel parameter if CPU_MITIGATIONS=n cpu: Re-enable CPU mitigations by default for !X86 architectures x86/tdx: Preserve shared bit on mprotect() x86/cpu: Fix check for RDPKRU in __show_regs() x86/CPU/AMD: Add models 0x10-0x1f to the Zen5 range x86/sev: Check for MWAITX and MONITORX opcodes in the #VC handler
2024-04-28Merge tag 'irq-urgent-2024-04-28' of ↵Linus Torvalds1-7/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Ingo Molnar: "Fix a double free bug in the init error path of the GICv3 irqchip driver" * tag 'irq-urgent-2024-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/gic-v3-its: Prevent double free on error
2024-04-28sched/isolation: Fix boot crash when maxcpus < first housekeeping CPUOleg Nesterov1-1/+6
housekeeping_setup() checks cpumask_intersects(present, online) to ensure that the kernel will have at least one housekeeping CPU after smp_init(), but this doesn't work if the maxcpus= kernel parameter limits the number of processors available after bootup. For example, a kernel with "maxcpus=2 nohz_full=0-2" parameters crashes at boot time on a virtual machine with 4 CPUs. Change housekeeping_setup() to use cpumask_first_and() and check that the returned CPU number is valid and less than setup_max_cpus. Another corner case is "nohz_full=0" on a machine with a single CPU or with the maxcpus=1 kernel argument. In this case non_housekeeping_mask is empty and tick_nohz_full_setup() makes no sense. And indeed, the kernel hits the WARN_ON(tick_nohz_full_running) in tick_sched_do_timer(). And how should the kernel interpret the "nohz_full=" parameter? It should be silently ignored, but currently cpulist_parse() happily returns the empty cpumask and this leads to the same problem. Change housekeeping_setup() to check cpumask_empty(non_housekeeping_mask) and do nothing in this case. Signed-off-by: Oleg Nesterov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Phil Auld <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-04-28sched/isolation: Prevent boot crash when the boot CPU is nohz_fullOleg Nesterov2-6/+12
Documentation/timers/no_hz.rst states that the "nohz_full=" mask must not include the boot CPU, which is no longer true after: 08ae95f4fd3b ("nohz_full: Allow the boot CPU to be nohz_full"). However after: aae17ebb53cd ("workqueue: Avoid using isolated cpus' timers on queue_delayed_work") the kernel will crash at boot time in this case; housekeeping_any_cpu() returns an invalid CPU number until smp_init() brings the first housekeeping CPU up. Change housekeeping_any_cpu() to check the result of cpumask_any_and() and return smp_processor_id() in this case. This is just the simple and backportable workaround which fixes the symptom, but smp_processor_id() at boot time should be safe at least for type == HK_TYPE_TIMER, this more or less matches the tick_do_timer_boot_cpu logic. There is no worry about cpu_down(); tick_nohz_cpu_down() will not allow to offline tick_do_timer_cpu (the 1st online housekeeping CPU). Fixes: aae17ebb53cd ("workqueue: Avoid using isolated cpus' timers on queue_delayed_work") Reported-by: Chris von Recklinghausen <[email protected]> Signed-off-by: Oleg Nesterov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Phil Auld <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Link: https://lore.kernel.org/r/[email protected] Closes: https://lore.kernel.org/all/[email protected]/
2024-04-27Merge tag 'rust-fixes-6.9' of https://github.com/Rust-for-Linux/linuxLinus Torvalds9-94/+132
Pull Rust fixes from Miguel Ojeda: - Soundness: make internal functions generated by the 'module!' macro inaccessible, do not implement 'Zeroable' for 'Infallible' and require 'Send' for the 'Module' trait. - Build: avoid errors with "empty" files and workaround 'rustdoc' ICE. - Kconfig: depend on '!CFI_CLANG' and avoid selecting 'CONSTRUCTORS'. - Code docs: remove non-existing key from 'module!' macro example. - Docs: trivial rendering fix in arch table. * tag 'rust-fixes-6.9' of https://github.com/Rust-for-Linux/linux: rust: remove `params` from `module` macro example kbuild: rust: force `alloc` extern to allow "empty" Rust files kbuild: rust: remove unneeded `@rustc_cfg` to avoid ICE rust: kernel: require `Send` for `Module` implementations rust: phy: implement `Send` for `Registration` rust: make mutually exclusive with CFI_CLANG rust: macros: fix soundness issue in `module!` macro rust: init: remove impl Zeroable for Infallible docs: rust: fix improper rendering in Arch Support page rust: don't select CONSTRUCTORS
2024-04-27Merge tag 'riscv-for-linus-6.9-rc6' of ↵Linus Torvalds11-29/+59
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - A fix for TASK_SIZE on rv64/NOMMU, to reflect the lack of user/kernel separation - A fix to avoid loading rv64/NOMMU kernel past the start of RAM - A fix for RISCV_HWPROBE_EXT_ZVFHMIN on ilp32 to avoid signed integer overflow in the bitmask - The sud_test kselftest has been fixed to properly swizzle the syscall number into the return register, which are not the same on RISC-V - A fix for a build warning in the perf tools on rv32 - A fix for the CBO selftests, to avoid non-constants leaking into the inline asm - A pair of fixes for T-Head PBMT errata probing, which has been renamed MAE by the vendor * tag 'riscv-for-linus-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: RISC-V: selftests: cbo: Ensure asm operands match constraints, take 2 perf riscv: Fix the warning due to the incompatible type riscv: T-Head: Test availability bit before enabling MAE errata riscv: thead: Rename T-Head PBMT to MAE selftests: sud_test: return correct emulated syscall value on RISC-V riscv: hwprobe: fix invalid sign extension for RISCV_HWPROBE_EXT_ZVFHMIN riscv: Fix loading 64-bit NOMMU kernels past the start of RAM riscv: Fix TASK_SIZE on 64-bit NOMMU
2024-04-27Merge tag '6.9-rc5-cifs-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds3-4/+9
Pull smb client fixes from Steve French: "Three smb3 client fixes, all also for stable: - two small locking fixes spotted by Coverity - FILE_ALL_INFO and network_open_info packing fix" * tag '6.9-rc5-cifs-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: smb3: fix lock ordering potential deadlock in cifs_sync_mid_result smb3: missing lock when picking channel smb: client: Fix struct_group() usage in __packed structs
2024-04-27Merge tag 'i2c-for-6.9-rc6' of ↵Linus Torvalds4-25/+16
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Fix a race condition in the at24 eeprom handler, a NULL pointer exception in the I2C core for controllers only using target modes, drop a MAINTAINERS entry, and fix an incorrect DT binding for at24" * tag 'i2c-for-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: smbus: fix NULL function pointer dereference MAINTAINERS: Drop entry for PCA9541 bus master selector eeprom: at24: fix memory corruption race condition dt-bindings: eeprom: at24: Fix ST M24C64-D compatible schema
2024-04-27profiling: Remove create_prof_cpu_mask().Tetsuo Handa2-48/+0
create_prof_cpu_mask() is no longer used after commit 1f44a225777e ("s390: convert interrupt handling to use generic hardirq"). Signed-off-by: Tetsuo Handa <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2024-04-27Merge tag 'soundwire-6.9-fixes' of ↵Linus Torvalds2-1/+17
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire Pull soundwire fix from Vinod Koul: - Single AMD driver fix for wake interrupt handling in clockstop mode * tag 'soundwire-6.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: soundwire: amd: fix for wake interrupt handling for clockstop mode
2024-04-27Merge tag 'dmaengine-fix-6.9' of ↵Linus Torvalds14-42/+64
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine fixes from Vinod Koul: - Revert pl330 issue_pending waits until WFP state due to regression reported in Bluetooth loading - Xilinx driver fixes for synchronization, buffer offsets, locking and kdoc - idxd fixes for spinlock and preventing the migration of the perf context to an invalid target - idma driver fix for interrupt handling when powered off - Tegra driver residual calculation fix - Owl driver register access fix * tag 'dmaengine-fix-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: dmaengine: idxd: Fix oops during rmmod on single-CPU platforms dmaengine: xilinx: xdma: Clarify kdoc in XDMA driver dmaengine: xilinx: xdma: Fix synchronization issue dmaengine: xilinx: xdma: Fix wrong offsets in the buffers addresses in dma descriptor dma: xilinx_dpdma: Fix locking dmaengine: idxd: Convert spinlock to mutex to lock evl workqueue idma64: Don't try to serve interrupts when device is powered off dmaengine: tegra186: Fix residual calculation dmaengine: owl: fix register access functions dmaengine: Revert "dmaengine: pl330: issue_pending waits until WFP state"
2024-04-27Merge tag 'phy-fixes-6.9' of ↵Linus Torvalds10-42/+80
git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy fixes from Vinod Koul: - static checker (array size, bounds) fix for marvel driver - Rockchip rk3588 pcie fixes for bifurcation and mux - Qualcomm qmp-compbo fix for VCO, register base and regulator name for m31 driver - charger det crash fix for ti driver * tag 'phy-fixes-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: phy: ti: tusb1210: Resolve charger-det crash if charger psy is unregistered phy: qcom: qmp-combo: fix VCO div offset on v5_5nm and v6 phy: phy-rockchip-samsung-hdptx: Select CONFIG_RATIONAL phy: qcom: m31: match requested regulator name with dt schema phy: qcom: qmp-combo: Fix register base for QSERDES_DP_PHY_MODE phy: qcom: qmp-combo: Fix VCO div offset on v3 phy: rockchip: naneng-combphy: Fix mux on rk3588 phy: rockchip-snps-pcie3: fix clearing PHP_GRF_PCIESEL_CON bits phy: rockchip-snps-pcie3: fix bifurcation on rk3588 phy: freescale: imx8m-pcie: fix pcie link-up instability phy: marvell: a3700-comphy: Fix hardcoded array size phy: marvell: a3700-comphy: Fix out of bounds read
2024-04-27i2c: smbus: fix NULL function pointer dereferenceWolfram Sang1-6/+6
Baruch reported an OOPS when using the designware controller as target only. Target-only modes break the assumption of one transfer function always being available. Fix this by always checking the pointer in __i2c_transfer. Reported-by: Baruch Siach <[email protected]> Closes: https://lore.kernel.org/r/4269631780e5ba789cf1ae391eec1b959def7d99.1712761976.git.baruch@tkos.co.il Fixes: 4b1acc43331d ("i2c: core changes for slave support") [wsa: dropped the simplification in core-smbus to avoid theoretical regressions] Signed-off-by: Wolfram Sang <[email protected]> Tested-by: Baruch Siach <[email protected]>
2024-04-26Merge tag 'soc-fixes-6.9-2' of ↵Linus Torvalds44-193/+331
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "There are a lot of minor DT fixes for Mediatek, Rockchip, Qualcomm and Microchip and NXP, addressing both build-time warnings and bugs found during runtime testing. Most of these changes are machine specific fixups, but there are a few notable regressions that affect an entire SoC: - The Qualcomm MSI support that was improved for 6.9 ended up being wrong on some chips and now gets fixed. - The i.MX8MP camera interface broke due to a typo and gets updated again. The main driver fix is also for Qualcomm platforms, rewriting an interface in the QSEECOM firmware support that could lead to crashing the kernel from a trusted application. The only other code changes are minor fixes for Mediatek SoC drivers" * tag 'soc-fixes-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (50 commits) ARM: dts: imx6ull-tarragon: fix USB over-current polarity soc: mediatek: mtk-socinfo: depends on CONFIG_SOC_BUS soc: mediatek: mtk-svs: Append "-thermal" to thermal zone names arm64: dts: imx8mp: Fix assigned-clocks for second CSI2 ARM: dts: microchip: at91-sama7g54_curiosity: Replace regulator-suspend-voltage with the valid property ARM: dts: microchip: at91-sama7g5ek: Replace regulator-suspend-voltage with the valid property arm64: dts: rockchip: Fix USB interface compatible string on kobol-helios64 arm64: dts: qcom: sc8180x: Fix ss_phy_irq for secondary USB controller arm64: dts: qcom: sm8650: Fix the msi-map entries arm64: dts: qcom: sm8550: Fix the msi-map entries arm64: dts: qcom: sm8450: Fix the msi-map entries arm64: dts: qcom: sc8280xp: add missing PCIe minimum OPP arm64: dts: qcom: x1e80100: Fix the compatible for cluster idle states arm64: dts: qcom: Fix type of "wdog" IRQs for remoteprocs arm64: dts: rockchip: regulator for sd needs to be always on for BPI-R2Pro dt-bindings: rockchip: grf: Add missing type to 'pcie-phy' node arm64: dts: rockchip: drop redundant disable-gpios in Lubancat 2 arm64: dts: rockchip: drop redundant disable-gpios in Lubancat 1 arm64: dts: rockchip: drop redundant pcie-reset-suspend in Scarlet Dumo arm64: dts: rockchip: mark system power controller and fix typo on orangepi-5-plus ...
2024-04-26drm/amd/display: Add some HDCP registers DCN35 listRodrigo Siqueira1-1/+11
Add some missing HDCP registers to be used in DCN35. Signed-off-by: Rodrigo Siqueira <[email protected]> Acked-by: Aurabindo Pillai <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amdgpu/mes11: update ADD_QUEUE interfaceJack Xiao1-3/+14
Update ADD_QUEUE interface for mes11 to support mes mapping legacy queue. Signed-off-by: Jack Xiao <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amdgpu: fix the warning about the expression (int)size - lenJesse Zhang1-2/+3
Converting size from size_t to int may overflow. v2: keep reverse xmas tree order (Christian) Signed-off-by: Jesse Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amdgpu/mes: add mes mapping legacy queue supportJack Xiao2-0/+36
Add mes mapping legacy queue framework support. Signed-off-by: Jack Xiao <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amdgpu: fix uninitialized scalar variable warningTim Huang1-1/+2
Clear warning that uses uninitialized value fw_size. Signed-off-by: Tim Huang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amd/display: Code style adjustmentsRodrigo Siqueira2-3/+3
This commit address some small code style issues in DC. Signed-off-by: Rodrigo Siqueira <[email protected]> Acked-by: Aurabindo Pillai <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amd/display: Adjust registers sequence in the DIO listRodrigo Siqueira1-2/+3
This commit reorganizes the order in which some control registers are presented to make it easier to identify the operations based on the hardware doc. Signed-off-by: Rodrigo Siqueira <[email protected]> Acked-by: Aurabindo Pillai <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amd/display: Clean up code in DCRodrigo Siqueira5-25/+10
This commit removes some unnecessary code and makes the required adjustments to replace other parts of the code with a short option. Signed-off-by: Rodrigo Siqueira <[email protected]> Acked-by: Aurabindo Pillai <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amdgpu: skip ip dump if devcoredump flag is setSunil Khatri1-6/+7
Do not dump the ip registers during driver reload in passthrough environment. Signed-off-by: Sunil Khatri <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amdgpu: Modify the contiguous flags behaviourArunpravin Paneer Selvam2-7/+24
Now we have two flags for contiguous VRAM buffer allocation. If the application request for AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS, it would set the ttm place TTM_PL_FLAG_CONTIGUOUS flag in the buffer's placement function. This patch will change the default behaviour of the two flags. When we set AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS - This means contiguous is not mandatory. - we will try to allocate the contiguous buffer. Say if the allocation fails, we fallback to allocate the individual pages. When we setTTM_PL_FLAG_CONTIGUOUS - This means contiguous allocation is mandatory. - we are setting this in amdgpu_bo_pin_restricted() before bo validation and check this flag in the vram manager file. - if this is set, we should allocate the buffer pages contiguously. the allocation fails, we return -ENOSPC. v2: - keep the mem_flags and bo->flags check as is(Christian) - place the TTM_PL_FLAG_CONTIGUOUS flag setting into the amdgpu_bo_pin_restricted function placement range iteration loop(Christian) - rename find_pages with amdgpu_vram_mgr_calculate_pages_per_block (Christian) - Keep the kernel BO allocation as is(Christain) - If BO pin vram allocation failed, we need to return -ENOSPC as RDMA cannot work with scattered VRAM pages(Philip) v3(Christian): - keep contiguous flag handling outside of pages_per_block calculation - remove the hacky implementation in contiguous flag error handling code v4(Christian): - use any variable and return value for non-contiguous fallback v5: rebase to amd-staging-drm-next branch Signed-off-by: Arunpravin Paneer Selvam <[email protected]> Suggested-by: Christian König <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amd/display: Fix uninitialized variables in DCAlex Hung14-19/+19
This fixes 29 UNINIT issues reported by Coverity. Reviewed-by: Hersen Wu <[email protected]> Acked-by: Wayne Lin <[email protected]> Signed-off-by: Alex Hung <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amd/display: Fix uninitialized variables in DCAlex Hung18-39/+39
This fixes 49 UNINIT issues reported by Coverity. Reviewed-by: Hersen Wu <[email protected]> Acked-by: Wayne Lin <[email protected]> Signed-off-by: Alex Hung <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amd/display: Fix uninitialized variables in DMAlex Hung2-6/+6
This fixes 11 UNINIT issues reported by Coverity. Reviewed-by: Hersen Wu <[email protected]> Acked-by: Wayne Lin <[email protected]> Signed-off-by: Alex Hung <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amd/display: Remove redundant include fileAlex Hung1-1/+0
This fixes 1 PW.INCLUDE_RECURSION reported by Coverity. "./drivers/gpu/drm/amd/amdgpu/../display/dc/dc_types.h" includes itself: dc_types.h -> dal_types.h -> dc_types.h Acked-by: Wayne Lin <[email protected]> Signed-off-by: Alex Hung <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amd/display: ASSERT when failing to find index by plane/stream idAlex Hung1-2/+4
[WHY] find_disp_cfg_idx_by_plane_id and find_disp_cfg_idx_by_stream_id returns an array index and they return -1 when not found; however, -1 is not a valid index number. [HOW] When this happens, call ASSERT(), and return a positive number (which is fewer than callers' array size) instead. This fixes 4 OVERRUN and 2 NEGATIVE_RETURNS issues reported by Coverity. Reviewed-by: Rodrigo Siqueira <[email protected]> Acked-by: Wayne Lin <[email protected]> Signed-off-by: Alex Hung <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amd/display: Do not return negative stream id for arrayAlex Hung1-0/+7
[WHY] resource_stream_to_stream_idx returns an array index and it return -1 when not found; however, -1 is not a valid array index number. [HOW] When this happens, call ASSERT(), and return a zero instead. This fixes an OVERRUN and an NEGATIVE_RETURNS issues reported by Coverity. Reviewed-by: Rodrigo Siqueira <[email protected]> Acked-by: Wayne Lin <[email protected]> Signed-off-by: Alex Hung <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amd/display: Fix overlapping copy within dml_core_mode_programmingHersen Wu1-1/+3
[WHY] &mode_lib->mp.Watermark and &locals->Watermark are the same address. memcpy may lead to unexpected behavior. [HOW] memmove should be used. Reviewed-by: Rodrigo Siqueira <[email protected]> Acked-by: Wayne Lin <[email protected]> Reviewed-by: Alex Hung <[email protected]> Signed-off-by: Hersen Wu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amd/display: Skip finding free audio for unknown engine_idAlex Hung1-0/+3
[WHY] ENGINE_ID_UNKNOWN = -1 and can not be used as an array index. Plus, it also means it is uninitialized and does not need free audio. [HOW] Skip and return NULL. This fixes 2 OVERRUN issues reported by Coverity. Reviewed-by: Rodrigo Siqueira <[email protected]> Acked-by: Wayne Lin <[email protected]> Signed-off-by: Alex Hung <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amd/display: Check pipe offset before setting vblankAlex Hung1-2/+6
pipe_ctx has a size of MAX_PIPES so checking its index before accessing the array. This fixes an OVERRUN issue reported by Coverity. Reviewed-by: Rodrigo Siqueira <[email protected]> Acked-by: Wayne Lin <[email protected]> Signed-off-by: Alex Hung <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amdkfd: Enable SQ watchpoint for gfx10Lancelot SIX1-13/+58
There are new control registers introduced in gfx10 used to configure hardware watchpoints triggered by SMEM instructions: SQ_WATCH{0,1,2,3}_{CNTL_ADDR_HI,ADDR_L}. Those registers work in a similar way as the TCP_WATCH* registers currently used for gfx9 and above. This patch adds support to program the SQ_WATCH registers for gfx10. The SQ_WATCH?_CNTL.MASK field has one bit more than TCP_WATCH?_CNTL.MASK, so SQ watchpoints can have a finer granularity than TCP_WATCH watchpoints. In this patch, we keep the capabilities advertised to the debugger unchanged (HSA_DBG_WATCH_ADDR_MASK_*_BIT_GFX10) as this reflects what both TCP and SQ watchpoints can do and both watchpoints are configured together. Signed-off-by: Lancelot SIX <[email protected]> Reviewed-by: Jonathan Kim <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amd/display: Check index msg_id before read or writeAlex Hung1-0/+8
[WHAT] msg_id is used as an array index and it cannot be a negative value, and therefore cannot be equal to MOD_HDCP_MESSAGE_ID_INVALID (-1). [HOW] Check whether msg_id is valid before reading and setting. This fixes 4 OVERRUN issues reported by Coverity. Reviewed-by: Rodrigo Siqueira <[email protected]> Acked-by: Wayne Lin <[email protected]> Signed-off-by: Alex Hung <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amdgpu: Fix buffer size in gfx_v9_4_3_init_ cp_compute_microcode() and ↵Srinivasan Shanmugam1-1/+1
rlc_microcode() The function gfx_v9_4_3_init_microcode in gfx_v9_4_3.c was generating about potential truncation of output when using the snprintf function. The issue was due to the size of the buffer 'ucode_prefix' being too small to accommodate the maximum possible length of the string being written into it. The string being written is "amdgpu/%s_mec.bin" or "amdgpu/%s_rlc.bin", where %s is replaced by the value of 'chip_name'. The length of this string without the %s is 16 characters. The warning message indicated that 'chip_name' could be up to 29 characters long, resulting in a total of 45 characters, which exceeds the buffer size of 30 characters. To resolve this issue, the size of the 'ucode_prefix' buffer has been reduced from 30 to 15. This ensures that the maximum possible length of the string being written into the buffer will not exceed its size, thus preventing potential buffer overflow and truncation issues. Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c: In function ‘gfx_v9_4_3_early_init’: drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:379:52: warning: ‘%s’ directive output may be truncated writing up to 29 bytes into a region of size 23 [-Wformat-truncation=] 379 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_rlc.bin", chip_name); | ^~ ...... 439 | r = gfx_v9_4_3_init_rlc_microcode(adev, ucode_prefix); | ~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:379:9: note: ‘snprintf’ output between 16 and 45 bytes into a destination of size 30 379 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_rlc.bin", chip_name); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:413:52: warning: ‘%s’ directive output may be truncated writing up to 29 bytes into a region of size 23 [-Wformat-truncation=] 413 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mec.bin", chip_name); | ^~ ...... 443 | r = gfx_v9_4_3_init_cp_compute_microcode(adev, ucode_prefix); | ~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c:413:9: note: ‘snprintf’ output between 16 and 45 bytes into a destination of size 30 413 | snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_mec.bin", chip_name); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fixes: 86301129698b ("drm/amdgpu: split gc v9_4_3 functionality from gc v9_0") Cc: Hawking Zhang <[email protected]> Cc: Christian König <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Lijo Lazar <[email protected]> Signed-off-by: Srinivasan Shanmugam <[email protected]> Suggested-by: Lijo Lazar <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amd/display: Add NULL pointer check for kzallocHersen Wu11-0/+44
[Why & How] Check return pointer of kzalloc before using it. Reviewed-by: Alex Hung <[email protected]> Acked-by: Wayne Lin <[email protected]> Signed-off-by: Hersen Wu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amd/display: Handle Y carry-over in VCP X.Y calculationGeorge Shen1-0/+6
Theoretically rare corner case where ceil(Y) results in rounding up to an integer. If this happens, the 1 should be carried over to the X value. CC: [email protected] Reviewed-by: Rodrigo Siqueira <[email protected]> Signed-off-by: George Shen <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amdgpu: Fix ras mode2 reset failure in ras aca modeYiPeng Chai1-0/+4
Fix ras mode2 reset failure in ras aca mode. Signed-off-by: YiPeng Chai <[email protected]> Reviewed-by: Yang Wang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amdgpu: fix double free err_addr pointer warningsBob Zhou1-0/+1
In amdgpu_umc_bad_page_polling_timeout, the amdgpu_umc_handle_bad_pages will be run many times so that double free err_addr in some special case. So set the err_addr to NULL to avoid the warnings. Signed-off-by: Bob Zhou <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amdgpu: initialize the last_jump_jiffies in atom_exec_contextJesse Zhang1-0/+1
The parameter "last_jump_jiffies" should be initialized before being used in the function atom_op_jump. Signed-off-by: Jesse Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amdgpu: add check before free wb entryJesse Zhang3-3/+6
Check if ring is not a mes queue before freeing the wb entry, because we only allocate a wb entry when it's not a mes queue. Signed-off-by: Jesse Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amdgpu: add return result for amdgpu_i2c_{get/put}_byteBob Zhou1-19/+28
After amdgpu_i2c_get_byte fail, amdgpu_i2c_put_byte shouldn't be conducted to put wrong value. So return and check the i2c transfer result. Signed-off-by: Bob Zhou <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amdgpu: add error handle to avoid out-of-boundsBob Zhou1-0/+3
if the sdma_v4_0_irq_id_to_seq return -EINVAL, the process should be stop to avoid out-of-bounds read, so directly return -EINVAL. Signed-off-by: Bob Zhou <[email protected]> Acked-by: Christian König <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-04-26drm/amdgpu: Initialize timestamp for some legacy SOCsMa Jun1-0/+8
Initialize the interrupt timestamp for some legacy SOCs to fix the coverity issue "Uninitialized scalar variable" Signed-off-by: Ma Jun <[email protected]> Suggested-by: Christian König <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>