aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/nouveau/nvkm/subdev/instmem
AgeCommit message (Collapse)AuthorFilesLines
2024-04-15nouveau: fix instmem race condition around ptr storesDave Airlie1-1/+6
Running a lot of VK CTS in parallel against nouveau, once every few hours you might see something like this crash. BUG: kernel NULL pointer dereference, address: 0000000000000008 PGD 8000000114e6e067 P4D 8000000114e6e067 PUD 109046067 PMD 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 7 PID: 53891 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27 Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021 RIP: 0010:gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] Code: c7 48 01 c8 49 89 45 58 85 d2 0f 84 95 00 00 00 41 0f b7 46 12 49 8b 7e 08 89 da 42 8d 2c f8 48 8b 47 08 41 83 c7 01 48 89 ee <48> 8b 40 08 ff d0 0f 1f 00 49 8b 7e 08 48 89 d9 48 8d 75 04 48 c1 RSP: 0000:ffffac20c5857838 EFLAGS: 00010202 RAX: 0000000000000000 RBX: 00000000004d8001 RCX: 0000000000000001 RDX: 00000000004d8001 RSI: 00000000000006d8 RDI: ffffa07afe332180 RBP: 00000000000006d8 R08: ffffac20c5857ad0 R09: 0000000000ffff10 R10: 0000000000000001 R11: ffffa07af27e2de0 R12: 000000000000001c R13: ffffac20c5857ad0 R14: ffffa07a96fe9040 R15: 000000000000001c FS: 00007fe395eed7c0(0000) GS:ffffa07e2c980000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 000000011febe001 CR4: 00000000003706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ... ? gp100_vmm_pgt_mem+0xe3/0x180 [nouveau] ? gp100_vmm_pgt_mem+0x37/0x180 [nouveau] nvkm_vmm_iter+0x351/0xa20 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] ? __lock_acquire+0x3ed/0x2170 ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_ptes_get_map+0xc2/0x100 [nouveau] ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau] ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau] nvkm_vmm_map_locked+0x224/0x3a0 [nouveau] Adding any sort of useful debug usually makes it go away, so I hand wrote the function in a line, and debugged the asm. Every so often pt->memory->ptrs is NULL. This ptrs ptr is set in the nv50_instobj_acquire called from nvkm_kmap. If Thread A and Thread B both get to nv50_instobj_acquire around the same time, and Thread A hits the refcount_set line, and in lockstep thread B succeeds at refcount_inc_not_zero, there is a chance the ptrs value won't have been stored since refcount_set is unordered. Force a memory barrier here, I picked smp_mb, since we want it on all CPUs and it's write followed by a read. v2: use paired smp_rmb/smp_wmb. Cc: <[email protected]> Fixes: be55287aa5ba ("drm/nouveau/imem/nv50: embed nvkm_instobj directly into nv04_instobj") Signed-off-by: Dave Airlie <[email protected]> Signed-off-by: Danilo Krummrich <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-15drm/nouveau: Fixup gk20a instobj hierarchyThierry Reding1-9/+9
Commit 12c9b05da918 ("drm/nouveau/imem: support allocations not preserved across suspend") uses container_of() to cast from struct nvkm_memory to struct nvkm_instobj, assuming that all instance objects are derived from struct nvkm_instobj. For the gk20a family that's not the case and they are derived from struct nvkm_memory instead. This causes some subtle data corruption (nvkm_instobj.preserve ends up mapping to gk20a_instobj.vaddr) that causes a NULL pointer dereference in gk20a_instobj_acquire_iommu() (and possibly elsewhere) and also prevents suspend/resume from working. Fix this by making struct gk20a_instobj derive from struct nvkm_instobj instead. Fixes: 12c9b05da918 ("drm/nouveau/imem: support allocations not preserved across suspend") Reported-by: Jonathan Hunter <[email protected]> Signed-off-by: Thierry Reding <[email protected]> Tested-by: Jon Hunter <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-11-03nouveau/gsp: move to 535.113.01Dave Airlie1-6/+6
This moves the initial effort to the latest 535 firmware. The gsp msg structs have changed, and the message passing also. The wpr also seems to have some struct changes. This version of the firmware will be what we are stuck on for a while, until we can refactor the driver and work out a better path forward. Reviewed-by: Danilo Krummrich <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
2023-10-31drm/nouveau/mmu/r535: initial supportBen Skeggs4-1/+339
- Valid VRAM regions are read from GSP-RM, and used to construct our MM - BAR1/BAR2 VMMs modified to be shared with RM - Client VMMs have RM VASPACE objects created for them - Adds FBSR to backup system objects in VRAM across suspend Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-10-31drm/nouveau/imem/tu102-: prepare for GSP-RMBen Skeggs5-30/+94
- move suspend/resume paths to HW-specific code - allow (future) RM paths to be based on nv50_instmem Signed-off-by: Ben Skeggs <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-09-19drm/nouveau/imem: support allocations not preserved across suspendBen Skeggs2-5/+15
Will initially be used to tag some large grctx allocations which don't need to be saved, to speedup suspend/resume. Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Lyude Paul <[email protected]> Acked-by: Danilo Krummrich <[email protected]> Signed-off-by: Lyude Paul <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-01-25iommu: Add a gfp parameter to iommu_map()Jason Gunthorpe1-1/+2
The internal mechanisms support this, but instead of exposting the gfp to the caller it wrappers it into iommu_map() and iommu_map_atomic() Fix this instead of adding more variants for GFP_KERNEL_ACCOUNT. Reviewed-by: Kevin Tian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Reviewed-by: Mathieu Poirier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2022-11-09drm/nouveau/imem: allow bar2 mapping of user allocationsBen Skeggs3-5/+35
Will be used to init client-allocated USERD to default values. Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Lyude Paul <[email protected]>
2022-03-03drm/nouveau/instmem: fix uninitialized_var.cocci warningGuo Zhengkui1-1/+1
Fix following coccicheck warning: drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c:316:11-12: WARNING this kind of initialization is deprecated. `void *map = map` has the same form of uninitialized_var() macro. I remove the redundant assignement. It has been tested with gcc (Debian 8.3.0-6) 8.3.0. The patch which removed uninitialized_var() is: https://lore.kernel.org/all/[email protected]/ And there is very few "/* GCC */" comments in the Linux kernel code now. Signed-off-by: Guo Zhengkui <[email protected]> Reviewed-by: Lyude Paul <[email protected]> Signed-off-by: Lyude Paul <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-02-11drm/nouveau/instmem: switch to instanced constructorBen Skeggs6-13/+12
Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Lyude Paul <[email protected]>
2021-02-11drm/nouveau/instmem: protect mm/lru with private mutexBen Skeggs4-29/+30
nvkm_subdev.mutex is going away. Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Lyude Paul <[email protected]>
2020-09-25drm/nouveau/gk20a: stop setting DMA_ATTR_NON_CONSISTENTChristoph Hellwig1-2/+1
DMA_ATTR_NON_CONSISTENT is a no-op except on PA-RISC and a few MIPS configs, so don't set it in this ARM specific driver part. Signed-off-by: Christoph Hellwig <[email protected]>
2019-07-19drm/nouveau: fix bogus GPL-2 license headerBen Skeggs1-1/+1
The bulk SPDX addition made all these files into GPL-2.0 licensed files. However the remainder of the project is MIT-licensed, these files were simply missing the boiler plate and got caught up in the global update. Fixes: 96ac6d4351004 (treewide: Add SPDX license identifier - Kbuild) Signed-off-by: Ben Skeggs <[email protected]>
2019-07-19drm/nouveau: fix bogus GPL-2 license headerIlia Mirkin1-1/+1
The bulk SPDX addition made all these files into GPL-2.0 licensed files. However the remainder of the project is MIT-licensed, these files (primarily header files) were simply missing the boiler plate and got caught up in the global update. Fixes: b24413180f5 (License cleanup: add SPDX GPL-2.0 license identifier to files with no license) Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Emil Velikov <[email protected]> Acked-by: Karol Herbst <[email protected]> Signed-off-by: Ben Skeggs <[email protected]>
2019-05-30treewide: Add SPDX license identifier - KbuildGreg Kroah-Hartman1-0/+1
Add SPDX license identifiers to all Make/Kconfig files which: - Have no license information of any form These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0 Reported-by: Masahiro Yamada <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> Reviewed-by: Kate Stewart <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2018-12-11drm/nouveau/imem/nv50: support pinning objects in BAR2 and returning addressBen Skeggs1-1/+15
Various structures are accessed by the GPU through BAR2 for some reason on newer GPUs. This commit makes it more convenient to handle. Will be used for GP100- fault buffers, and GV100- fault method buffers. Signed-off-by: Ben Skeggs <[email protected]>
2017-12-19Merge branch 'linux-4.15' of git://github.com/skeggsb/linux into drm-fixesDave Airlie1-1/+1
nouveau regression fixes, and some minor fixes. * 'linux-4.15' of git://github.com/skeggsb/linux: drm/nouveau: use alternate memory type for system-memory buffers with kind != 0 drm/nouveau: avoid GPU page sizes > PAGE_SIZE for buffer objects in host memory drm/nouveau/mmu/gp10b: use correct implementation drm/nouveau/pci: do a msi rearm on init drm/nouveau/imem/nv50: fix refcount_t warning drm/nouveau/bios/dp: support DP Info Table 2.0 drm/nouveau/fbcon: fix NULL pointer access in nouveau_fbcon_destroy
2017-12-19drm/nouveau/imem/nv50: fix refcount_t warningBen Skeggs1-1/+1
Signed-off-by: Ben Skeggs <[email protected]>
2017-11-15Merge tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds6-359/+452
Pull drm updates from Dave Airlie: "This is the main drm pull request for v4.15. Core: - Atomic object lifetime fixes - Atomic iterator improvements - Sparse/smatch fixes - Legacy kms ioctls to be interruptible - EDID override improvements - fb/gem helper cleanups - Simple outreachy patches - Documentation improvements - Fix dma-buf rcu races - DRM mode object leasing for improving VR use cases. - vgaarb improvements for non-x86 platforms. New driver: - tve200: Faraday Technology TVE200 block. This "TV Encoder" encodes a ITU-T BT.656 stream and can be found in the StorLink SL3516 (later Cortina Systems CS3516) as well as the Grain Media GM8180. New bridges: - SiI9234 support New panels: - S6E63J0X03, OTM8009A, Seiko 43WVF1G, 7" rpi touch panel, Toshiba LT089AC19000, Innolux AT043TN24 i915: - Remove Coffeelake from alpha support - Cannonlake workarounds - Infoframe refactoring for DisplayPort - VBT updates - DisplayPort vswing/emph/buffer translation refactoring - CCS fixes - Restore GPU clock boost on missed vblanks - Scatter list updates for userptr allocations - Gen9+ transition watermarks - Display IPC (Isochronous Priority Control) - Private PAT management - GVT: improved error handling and pci config sanitizing - Execlist refactoring - Transparent Huge Page support - User defined priorities support - HuC/GuC firmware refactoring - DP MST fixes - eDP power sequencing fixes - Use RCU instead of stop_machine - PSR state tracking support - Eviction fixes - BDW DP aux channel timeout fixes - LSPCON fixes - Cannonlake PLL fixes amdgpu: - Per VM BO support - Powerplay cleanups - CI powerplay support - PASID mgr for kfd - SR-IOV fixes - initial GPU reset for vega10 - Prime mmap support - TTM updates - Clock query interface for Raven - Fence to handle ioctl - UVD encode ring support on Polaris - Transparent huge page DMA support - Compute LRU pipe tweaks - BO flag to allow buffers to opt out of implicit sync - CTX priority setting API - VRAM lost infrastructure plumbing qxl: - fix flicker since atomic rework amdkfd: - Further improvements from internal AMD tree - Usermode events - Drop radeon support nouveau: - Pascal temperature sensor support - Improved BAR2 handling - MMU rework to support Pascal MMU exynos: - Improved HDMI/mixer support - HDMI audio interface support tegra: - Prep work for tegra186 - Cleanup/fixes msm: - Preemption support for a5xx - Display fixes for 8x96 (snapdragon 820) - Async cursor plane fixes - FW loading rework - GPU debugging improvements vc4: - Prep for DSI panels - fix T-format tiling scanout - New madvise ioctl Rockchip: - LVDS support omapdrm: - omap4 HDMI CEC support etnaviv: - GPU performance counters groundwork sun4i: - refactor driver load + TCON backend - HDMI improvements - A31 support - Misc fixes udl: - Probe/EDID read fixes. tilcdc: - Misc fixes. pl111: - Support more variants adv7511: - Improve EDID handling. - HDMI CEC support sii8620: - Add remote control support" * tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux: (1480 commits) drm/rockchip: analogix_dp: Use mutex rather than spinlock drm/mode_object: fix documentation for object lookups. drm/i915: Reorder context-close to avoid calling i915_vma_close() under RCU drm/i915: Move init_clock_gating() back to where it was drm/i915: Prune the reservation shared fence array drm/i915: Idle the GPU before shinking everything drm/i915: Lock llist_del_first() vs llist_del_all() drm/i915: Calculate ironlake intermediate watermarks correctly, v2. drm/i915: Disable lazy PPGTT page table optimization for vGPU drm/i915/execlists: Remove the priority "optimisation" drm/i915: Filter out spurious execlists context-switch interrupts drm/amdgpu: use irq-safe lock for kiq->ring_lock drm/amdgpu: bypass lru touch for KIQ ring submission drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories() drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs() drm/amd/powerplay: initialize a variable before using it drm/amd/powerplay: suppress KASAN out of bounds warning in vega10_populate_all_memory_levels drm/amd/amdgpu: fix evicted VRAM bo adjudgement condition drm/vblank: Tune drm_crtc_accurate_vblank_count() WARN down to a debug drm/rockchip: add CONFIG_OF dependency for lvds ...
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman1-0/+1
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <[email protected]> Reviewed-by: Philippe Ombredanne <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2017-11-02drm/nouveau/mmu: remove old vmm frontendBen Skeggs1-11/+0
Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/imem/nv50-: use new interfaces for vmm operationsBen Skeggs2-32/+41
Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/mmu: implement new vmm frontendBen Skeggs1-0/+1
These are the new priviledged interfaces to the VMM backends, and expose some functionality that wasn't previously available. It's now possible to allocate a chunk of address-space (even all of it), without causing page tables to be allocated up-front, and then map into it at arbitrary locations. This is the basic primitive used to support features such as sparse mapping, or to allow userspace control over its own address-space, or HMM (where the GPU driver isn't in control of the address-space layout). Rather than being tied to a subtle combination of memory object and VMA properties, arguments that control map flags (ro, kind, etc) are passed explicitly at map time. The compatibility hacks to implement the old frontend on top of the new driver backends have been replaced with something similar to implement the old frontend's interfaces on top of the new frontend. Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau: wrap nvkm_mem objects in nvkm_memory interfacesBen Skeggs1-0/+9
This is a transition step, to enable finer-grained commits while transitioning to new MMU interfaces. Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/imem/nv50: allocate memory with nvkm_ram_get()Ben Skeggs1-23/+14
Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/core/memory: add reference countingBen Skeggs3-7/+7
We need to be able to prevent memory from being freed while it's still mapped in a GPU's address-space. Will be used by upcoming MMU changes. Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/core/memory: change map interface to support upcoming mmu changesBen Skeggs2-7/+10
Map flags (access, kind, etc) are currently defined in either the VMA, or the memory object, which turns out to not be ideal for things like suballocated buffers, etc. These will become per-map flags instead, so we need to support passing these arguments in nvkm_memory_map(). Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/core/mm: have users explicitly define heap identifiersBen Skeggs2-2/+2
Different sections of VRAM may have different properties (ie. can't be used for compression/display, can't be mapped, etc). We currently already support this, but it's a bit magic. This change makes it more obvious where we're allocating from. Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau: separate buffer object backing memory from nvkm structuresBen Skeggs2-2/+0
Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/imem: use fast-path for resume restoreBen Skeggs2-4/+12
Before: "imem: init completed in 299277us" After: "imem: init completed in 11574us" Suspend from Fedora 26 gnome desktop on GP102. Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/imem: use fast-path for suspend backupBen Skeggs1-3/+10
Before: "imem: suspend completed in 5540487us" After: "imem: suspend completed in 1871526us" Suspend from Fedora 26 gnome desktop on GP102. Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/imem: separate pre-BAR2-bootstrap objects from the restBen Skeggs3-0/+29
These will require slow-path access during suspend/resume. Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/imem: switch to kvmalloc/kvfree for suspend/resume backupBen Skeggs1-2/+2
Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/imem: separate suspend/resume backup handling into their own ↵Ben Skeggs1-30/+46
functions Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/imem: remove now-unused wrapper for backend objectsBen Skeggs6-170/+2
Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/imem/nv50: support eviction of BAR2 mappingsBen Skeggs1-5/+67
A good deal of the structures we map into here aren't accessed very often at all, and Fedora 26 has exposed an issue where after creating a heap of channels, BAR2 space would run out, and we'd need to make use of the slow path while accessing important structures like page tables. This implements an LRU on BAR2 space, which allows eviction of mappings that aren't currently needed, to make space for other objects. Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/imem/nv50: prevent fast-path for mapped objects when BAR isn't readyBen Skeggs1-3/+5
Another piece of solving the "GP100 BAR2 VMM bootstrap" puzzle. Without doing this, we'd attempt to write PDEs for the lower page table levels through BAR2 before BAR2 access has been fully initialised. Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/imem/nv50: map bar2 write-combinedBen Skeggs1-2/+3
Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/imem/nv50: embed nvkm_instobj directly into nv04_instobjBen Skeggs1-32/+102
This is not as simple as it was for earlier GPUs, due to the need to swap accessor functions depending on whether BAR2 is usable or not. We were previously protected by nvkm_instobj's accessor functions keeping an object mapped permanently, with some unclear magic that managed to hit the slow-path where needed even if an object was marked as mapped. That's been replaced here by reference counting maps (some objects, like page tables can be accessed concurrently), and swapping the functions as necessary. Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/imem/nv50: move slow-path locking into rd/wr functionsBen Skeggs1-8/+6
This is to simplify upcoming changes. The slow-path is something that currently occurs during bootstrap of the BAR2 VMM, while backing up an object during suspend/resume, or when BAR2 address space runs out. The latter is a real problem that can happen at runtime, and occurs in Fedora 26 already (due to some change that causes a lot of channels to be created at login), so ideally we'd prefer not to make it any slower. We'd also like suspend/resume speed to not suffer. Upcoming commits will solve those problems in a better way, making the extra overhead of moving the locking here a non-issue. Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/imem/nv50: split object map out from api functionsBen Skeggs1-25/+32
acquire()/boot() will need different logic in addition to performing the actual mapping. Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/imem/nv40: map bar2 write-combinedBen Skeggs1-2/+3
Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/imem/nv40: embed nvkm_instobj directly into nv04_instobjBen Skeggs1-7/+7
Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/imem/nv04: directly embed nvkm_instobj into nv04_instobjBen Skeggs1-7/+7
Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/imem: allow nvkm_instobj to be directly embedded in backend objectBen Skeggs2-13/+38
This will eliminate a step through the call chain, and give backends more flexibility. Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/core/memory: split info pointers from accessor pointersBen Skeggs5-114/+144
The accessor functions can change as a result of acquire()/release() calls, and are protected by any refcounting done there. Other functions must remain constant, as they can be called any time. Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/imem: add some useful debug outputBen Skeggs1-1/+7
Signed-off-by: Ben Skeggs <[email protected]>
2017-11-02drm/nouveau/bar: modify interface to bar2 vmm mappingBen Skeggs1-2/+1
Match API with the BAR1 version. Signed-off-by: Ben Skeggs <[email protected]>
2017-04-06drm/nouveau/imem/gk20a: Turn instmem lock into mutexThierry Reding1-11/+8
The gk20a implementation of instance memory uses vmap()/vunmap() to map memory regions into the kernel's virtual address space. These functions may sleep, so protecting them by a spin lock is not safe. This triggers a warning if the DEBUG_ATOMIC_SLEEP Kconfig option is enabled. Fix this by using a mutex instead. Signed-off-by: Thierry Reding <[email protected]> Reviewed-by: Alexandre Courbot <[email protected]> Tested-by: Alexandre Courbot <[email protected]> Signed-off-by: Ben Skeggs <[email protected]>
2017-02-17drm/nouveau/core/memory: distinguish between coherent/non-coherent targetsBen Skeggs1-1/+1
Signed-off-by: Ben Skeggs <[email protected]>