aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2019-11-24libbpf: Fix various errors and warning reported by checkpatch.plAndrii Nakryiko1-17/+21
Fix a bunch of warnings and errors reported by checkpatch.pl, to make it easier to spot new problems. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2019-11-24selftests, bpftool: Set EXIT trap after usage functionQuentin Monnet1-13/+13
The trap on EXIT is used to clean up any temporary directory left by the build attempts. It is not needed when the user simply calls the script with its --help option, and may not be needed either if we add checks (e.g. on the availability of bpftool files) before the build attempts. Let's move this trap and related variables lower down in the code, so that we don't accidentally change the value returned from the script on early exits at pre-checks. Signed-off-by: Quentin Monnet <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2019-11-24libbpf: Refactor relocation handlingAndrii Nakryiko1-118/+143
Relocation handling code is convoluted and unnecessarily deeply nested. Split out per-relocation logic into separate function. Also refactor the logic to be more a sequence of per-relocation type checks and processing steps, making it simpler to follow control flow. This makes it easier to further extends it to new kinds of relocations (e.g., support for extern variables). This patch also makes relocation's section verification more robust. Previously relocations against not yet supported externs were silently ignored because of obj->efile.text_shndx was zero, when all BPF programs had custom section names and there was no .text section. Also, invalid LDIMM64 relocations against non-map sections were passed through, if they were pointing to a .text section (or 0, which is invalid section). All these bugs are fixed within this refactoring and checks are made more appropriate for each type of relocation. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2019-11-24tools, bpf: Fix build for 'make -s tools/bpf O=<dir>'Quentin Monnet1-0/+6
Building selftests with 'make TARGETS=bpf kselftest' was fixed in commit 55d554f5d140 ("tools: bpf: Use !building_out_of_srctree to determine srctree"). However, by updating $(srctree) in tools/bpf/Makefile for in-tree builds only, we leave out the case where we pass an output directory to build BPF tools, but $(srctree) is not set. This typically happens for: $ make -s tools/bpf O=/tmp/foo Makefile:40: /tools/build/Makefile.feature: No such file or directory Fix it by updating $(srctree) in the Makefile not only for out-of-tree builds, but also if $(srctree) is empty. Detected with test_bpftool_build.sh. Fixes: 55d554f5d140 ("tools: bpf: Use !building_out_of_srctree to determine srctree") Signed-off-by: Quentin Monnet <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2019-11-24selftests/bpf: Ensure no DWARF relocations for BPF object filesAndrii Nakryiko1-1/+1
Add -mattr=dwarfris attribute to llc to avoid having relocations against DWARF data. These relocations make it impossible to inspect DWARF contents: all strings are invalid. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2019-11-24tools, bpftool: Fix warning on ignored return value for 'read'Quentin Monnet1-3/+3
When building bpftool, a warning was introduced by commit a94364603610 ("bpftool: Allow to read btf as raw data"), because the return value from a call to 'read()' is ignored. Let's address it. Signed-off-by: Quentin Monnet <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2019-11-24xsk: Fix xsk_poll()'s return typeLuc Van Oostenryck1-4/+4
xsk_poll() is defined as returning 'unsigned int' but the .poll method is declared as returning '__poll_t', a bitwise type. Fix this by using the proper return type and using the EPOLL constants instead of the POLL ones, as required for __poll_t. Signed-off-by: Luc Van Oostenryck <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Björn Töpel <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2019-11-24Linux 5.4Linus Torvalds1-1/+1
2019-11-24powerpc: Add const qual to local_read() parameterEric Dumazet1-1/+1
A patch in net-next triggered a compile error on powerpc: include/linux/u64_stats_sync.h: In function 'u64_stats_read': include/asm-generic/local64.h:30:37: warning: passing argument 1 of 'local_read' discards 'const' qualifier from pointer target type This seems reasonable to relax powerpc local_read() requirements. Fixes: 316580b69d0a ("u64_stats: provide u64_stats_t type") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: kbuild test robot <[email protected]> Acked-by: Michael Ellerman <[email protected]> Tested-by: Stephen Rothwell <[email protected]> # build only Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-24Merge branch 'bnxt_en-Updates'Jakub Kicinski7-92/+190
Michael Chan says: ==================== bnxt_en: Updates. This patchset contains these main features: 1. Add the proper logic to support suspend/resume on the new 57500 chips. 2. Allow Phy configurations from user on a Multihost function if supported by fw. 3. devlink NVRAM flashing support. 4. Add a couple of chip IDs, PHY loopback enhancement, and provide more RSS contexts to VFs. v2: Dropped the devlink info patches to address some feedback and resubmit for the 5.6 kernel. ==================== Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-24bnxt_en: Add support for flashing the device via devlinkVasundhara Volam3-2/+24
Use the same bnxt_flash_package_from_file() function to support devlink flash operation. Cc: Jiri Pirko <[email protected]> Signed-off-by: Vasundhara Volam <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-24bnxt_en: Allow PHY settings on multi-function or NPAR PFs if allowed by FW.Michael Chan3-5/+12
Currently, the driver does not allow PHY settings on a multi-function or NPAR NIC whose port is shared by more than one function. Newer firmware now allows PHY settings on some of these NICs. Check for this new firmware setting and allow the user to set the PHY settings accordingly. Signed-off-by: Michael Chan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-24bnxt_en: Add async. event logic for PHY configuration changes.Michael Chan2-0/+11
If the link settings have been changed by another function sharing the port, firmware will send us an async. message. In response, we will call the new bnxt_init_ethtool_link_settings() function to update the current settings. Signed-off-by: Michael Chan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-24bnxt_en: Refactor the initialization of the ethtool link settings.Michael Chan1-20/+26
Refactor this logic in bnxt_probe_phy() into a separate function bnxt_init_ethtool_link_settings(). It used to be that the settable link settings will never be changed without going through ethtool. So we only needed to do this once in bnxt_probe_phy(). Now, another function sharing the port may change it and we may need to re-initialize the ethtool settings again in run-time. Signed-off-by: Michael Chan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-24bnxt_en: Skip disabling autoneg before PHY loopback when appropriate.Michael Chan3-3/+10
New firmware allows PHY loopback to be set without disabling autoneg first. Check this capability and skip disabling autoneg when it is supported by firmware. Using this scheme, loopback will always work even if the PHY only supports autoneg. Signed-off-by: Michael Chan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-24bnxt_en: Assign more RSS context resources to the VFs.Michael Chan1-2/+6
The driver currently only assignes 1 RSS context to each VF. This works for the Linux VF driver. But other drivers, such as DPDK, can make use of additional RSS contexts. Modify the code to divide up and assign RSS contexts to VFs just like other resources. Signed-off-by: Michael Chan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-24bnxt_en: Initialize context memory to the value specified by firmware.Michael Chan2-9/+19
Some chips that need host context memory as a backing store requires the memory to be initialized to a non-zero value. Query the value from firmware and initialize the context memory accordingly. Signed-off-by: Michael Chan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-24bnxt_en: Fix suspend/resume path on 57500 chipsVasundhara Volam1-0/+14
Driver calls HWRM_FUNC_RESET firmware call while resuming the device which clears the context memory backing store. Because of which allocating firmware resources would eventually fail. Fix it by freeing all context memory during suspend and reallocate the memory during resume. Call bnxt_hwrm_queue_qportcfg() in resume path. This firmware call is needed on the 57500 chips so that firmware will set up the proper queue mapping in relation to the context memory. Signed-off-by: Vasundhara Volam <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-24bnxt_en: Send FUNC_RESOURCE_QCAPS command in bnxt_resume()Vasundhara Volam1-2/+10
After driver unregister, firmware is erasing the information that driver supports new resource management. Send FUNC_RESOURCE_QCAPS command to inform the firmware that driver supports new resource management while resuming from hibernation. Otherwise, we fallback to the older resource allocation scheme. Also, move driver register after sending FUNC_RESOURCE_QCAPS command to be consistent with the normal initialization sequence. Signed-off-by: Vasundhara Volam <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-24bnxt_en: Combine 2 functions calling the same HWRM_DRV_RGTR fw command.Vasundhara Volam3-47/+35
Everytime driver registers with firmware, driver is required to register for async event notifications as well. These 2 calls are done using the same firmware command and can be combined. We are also missing the 2nd step to register for async events in the suspend/resume path and this will fix it. Prior to this, we were getting only default notifications. ULP can register for additional async events for the RDMA driver, so we add a parameter to the new function to only do step 2 when it is called from ULP. Signed-off-by: Vasundhara Volam <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-24bnxt_en: Do driver unregister cleanup in bnxt_init_one() failure path.Vasundhara Volam2-3/+11
In the bnxt_init_one() failure path, if the driver has already called firmware to register the driver, it is not undoing the driver registration. Add this missing step to unregister for correctness, so that the firmware knows that the driver has unloaded. Signed-off-by: Vasundhara Volam <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-24bnxt_en: Disable/enable Bus master during suspend/resume.Michael Chan1-0/+8
Disable Bus master during suspend to prevent DMAs after the device goes into D3hot state. The new 57500 devices may continue to DMA from context memory after the system goes into D3hot state. This may cause some PCIe errors on some system. Re-enable it during resume. Signed-off-by: Michael Chan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-24bnxt_en: Add chip IDs for 57452 and 57454 chips.Michael Chan1-1/+6
Fix BNXT_CHIP_NUM_5645X() to include 57452 and 56454 chip IDs, so that these chips will be properly classified as P4 chips to take advantage of the P4 fixes and features. Signed-off-by: Michael Chan <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-24sfc: fix build without CONFIG_RFS_ACCELJakub Kicinski1-0/+2
The rfs members of struct efx_channel are under CONFIG_RFS_ACCEL. Ethtool stats which access those need to be as well. Reported-by: kbuild test robot <[email protected]> Fixes: ca70bd423f10 ("sfc: add statistics for ARFS") Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-24Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds1-2/+2
Pull cramfs fix from Al Viro: "Regression fix, fallen through the cracks" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: cramfs: fix usage on non-MTD device
2019-11-24xen: Fix Kconfig indentationKrzysztof Kozlowski1-29/+29
Adjust indentation from spaces to tab (+optional two spaces) as in coding style with command like: $ sed -e 's/^ /\t/' -i */Kconfig Signed-off-by: Krzysztof Kozlowski <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2019-11-24ALSA: aloop: Fix dependency on timer APITakashi Iwai1-0/+1
An explicit Kconfig dependency is missing for the recent addition of the timer support. CONFIG_SND_TIMER isn't always selected by SND_PCM. Fixes: 26c53379f98d ("ALSA: aloop: Support selection of snd_timer instead of jiffies") Reported-by: kbuild test robot <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2019-11-24erofs: remove unnecessary output in erofs_show_options()Chengguang Xu1-3/+0
We have already handled cache_strategy option carefully, so incorrect setting could not pass option parsing. Meanwhile, print 'cache_strategy=(unknown)' can cause failure on remount. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Chengguang Xu <[email protected]> Reviewed-by: Gao Xiang <[email protected]> Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Gao Xiang <[email protected]>
2019-11-24erofs: drop all vle annotations for runtime namesGao Xiang3-50/+44
VLE was an old informal name of fixed-sized output compression which came from published ATC'19 paper [1]. Drop those old annotations since erofs can handle all encoded clusters in block-aligned basis, which is wider than fixed-sized output compression after larger clustersize feature is fully implemented. Unaligned encoding won't be considered in EROFS since it's not friendly to inplace I/O and perhaps decompression inplace. a) Fixed-sized output compression with 16KB pcluster: ___________________________________ |xxxxxxxx|xxxxxxxx|xxxxxxxx|xxxxxxxx| |___ 0___|___ 1___|___ 2___|___ 3___| physical blocks b) Block-aligned fixed-sized input compression with 16KB pcluster: ___________________________________ |xxxxxxxx|xxxxxxxx|xxxxxxxx|xxx00000| |___ 0___|___ 1___|___ 2___|___ 3___| physical blocks c) Block-unaligned fixed-sized input compression with 16KB compression unit: ____________________________________________ |..xxxxxx|xxxxxxxx|xxxxxxxx|xxxxxxxx|x.......| |___ 0___|___ 1___|___ 2___|___ 3___|___ 4___| physical blocks Refine better names for those as well. [1] https://www.usenix.org/conference/atc19/presentation/gao Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Gao Xiang <[email protected]>
2019-11-24erofs: support superblock checksumPratik Shinde4-3/+38
Introduce superblock checksum feature in order to check at mounting time. Note that the first 1024 bytes are ignore for x86 boot sectors and other oddities. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Pratik Shinde <[email protected]> Reviewed-by: Chao Yu <[email protected]> Cc: Dan Carpenter <[email protected]> Signed-off-by: Gao Xiang <[email protected]>
2019-11-24erofs: set iowait for sync decompressionGao Xiang1-2/+2
For those tasks waiting I/O for sync decompression, they should be better marked as IO wait state. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Gao Xiang <[email protected]>
2019-11-24erofs: clean up decompress queue stuffsGao Xiang2-81/+60
Previously, both z_erofs_unzip_io and z_erofs_unzip_io_sb record decompress queues for backend to use. The only difference is that z_erofs_unzip_io is used for on-stack sync decompression so that it doesn't have a super block field (since the caller can pass it in its context), but it increases complexity with only a pointer saving. Rename z_erofs_unzip_io to z_erofs_decompressqueue with a fixed super_block member and kill the other entirely, and it can fallback to sync decompression if memory allocation failure. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Gao Xiang <[email protected]>
2019-11-24erofs: get rid of __stagingpage_alloc helperGao Xiang4-24/+21
Now open code is much cleaner due to iterative development. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Gao Xiang <[email protected]>
2019-11-23cramfs: fix usage on non-MTD deviceMaxime Bizon1-2/+2
When both CONFIG_CRAMFS_MTD and CONFIG_CRAMFS_BLOCKDEV are enabled, if we fail to mount on MTD, we don't try on block device. Note: this relies upon cramfs_mtd_fill_super() leaving no side effects on fc state in case of failure; in general, failing get_tree_...() does *not* mean "fine to try again"; e.g. parsed options might've been consumed by fill_super callback and freed on failure. Fixes: 74f78fc5ef43 ("vfs: Convert cramfs to use the new mount API") Signed-off-by: Maxime Bizon <[email protected]> Signed-off-by: Nicolas Pitre <[email protected]> Signed-off-by: Al Viro <[email protected]>
2019-11-23hv_netvsc: make recording RSS hash depend on feature flagStephen Hemminger3-2/+4
The recording of RSS hash should be controlled by NETIF_F_RXHASH. Fixes: 1fac7ca4e63b ("hv_netvsc: record hardware hash in skb") Suggested-by: Eric Dumazet <[email protected]> Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: Haiyang Zhang <[email protected]> Reviewed-by: Michael Kelley <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-23sctp: cache netns in sctp_ep_commonXin Long4-2/+7
This patch is to fix a data-race reported by syzbot: BUG: KCSAN: data-race in sctp_assoc_migrate / sctp_hash_obj write to 0xffff8880b67c0020 of 8 bytes by task 18908 on cpu 1: sctp_assoc_migrate+0x1a6/0x290 net/sctp/associola.c:1091 sctp_sock_migrate+0x8aa/0x9b0 net/sctp/socket.c:9465 sctp_accept+0x3c8/0x470 net/sctp/socket.c:4916 inet_accept+0x7f/0x360 net/ipv4/af_inet.c:734 __sys_accept4+0x224/0x430 net/socket.c:1754 __do_sys_accept net/socket.c:1795 [inline] __se_sys_accept net/socket.c:1792 [inline] __x64_sys_accept+0x4e/0x60 net/socket.c:1792 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9 read to 0xffff8880b67c0020 of 8 bytes by task 12003 on cpu 0: sctp_hash_obj+0x4f/0x2d0 net/sctp/input.c:894 rht_key_get_hash include/linux/rhashtable.h:133 [inline] rht_key_hashfn include/linux/rhashtable.h:159 [inline] rht_head_hashfn include/linux/rhashtable.h:174 [inline] head_hashfn lib/rhashtable.c:41 [inline] rhashtable_rehash_one lib/rhashtable.c:245 [inline] rhashtable_rehash_chain lib/rhashtable.c:276 [inline] rhashtable_rehash_table lib/rhashtable.c:316 [inline] rht_deferred_worker+0x468/0xab0 lib/rhashtable.c:420 process_one_work+0x3d4/0x890 kernel/workqueue.c:2269 worker_thread+0xa0/0x800 kernel/workqueue.c:2415 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352 It was caused by rhashtable access asoc->base.sk when sctp_assoc_migrate is changing its value. However, what rhashtable wants is netns from asoc base.sk, and for an asoc, its netns won't change once set. So we can simply fix it by caching netns since created. Fixes: d6c0256a60e6 ("sctp: add the rhashtable apis for sctp global transport hashtable") Reported-by: [email protected] Signed-off-by: Xin Long <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-23sctp: Fix memory leak in sctp_sf_do_5_2_4_dupcookNavid Emamdoost1-1/+3
In the implementation of sctp_sf_do_5_2_4_dupcook() the allocated new_asoc is leaked if security_sctp_assoc_request() fails. Release it via sctp_association_free(). Fixes: 2277c7cd75e3 ("sctp: Add LSM hooks") Signed-off-by: Navid Emamdoost <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-23net: gro: use vlan API instead of accessing directlyTonghao Zhang1-1/+1
Use vlan common api to access the vlan_tag info. Signed-off-by: Tonghao Zhang <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-23Merge tag 'mlx5-updates-2019-11-22' of ↵Jakub Kicinski8-93/+187
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2019-11-22 1) Misc Cleanups 2) Software steering support for Geneve ==================== Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-23net: phylink: rename mac_link_state() op to mac_pcs_get_state()Russell King10-65/+59
Rename the mac_link_state() method to mac_pcs_get_state() to make it clear that it should be returning the MACs PCS current state, which is used for inband negotiation rather than just reading back what the MAC has been configured for. Update the documentation to explicitly mention that this is for inband. We drop the return value as well; most of phylink doesn't check the return value and it is not clear what it should do on error - instead arrange for state->link to be false. Signed-off-by: Russell King <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2019-11-23mm/hmm: remove hmm_range_dma_map and hmm_range_dma_unmapChristoph Hellwig2-170/+0
These two functions have never been used since they were added. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: John Hubbard <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-11-23mm/hmm: make full use of walk_page_range()Ralph Campbell1-64/+56
hmm_range_fault() calls find_vma() and walk_page_range() in a loop. This is unnecessary duplication since walk_page_range() calls find_vma() in a loop already. Simplify hmm_range_fault() by defining a walk_test() callback function to filter unhandled vmas. This also fixes a bug where hmm_range_fault() was not checking start >= vma->vm_start before checking vma->vm_flags so hmm_range_fault() could return an error based on the wrong vma for the requested range. It also fixes a bug when the vma has no read access and the caller did not request a fault, there shouldn't be any error return code. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ralph Campbell <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-11-23xen/gntdev: use mmu_interval_notifier_insertJason Gunthorpe2-138/+49
gntdev simply wants to monitor a specific VMA for any notifier events, this can be done straightforwardly using mmu_interval_notifier_insert() over the VMA's VA range. The notifier should be attached until the original VMA is destroyed. It is unclear if any of this is even sane, but at least a lot of duplicate code is removed. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Boris Ostrovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-11-23mm/hmm: remove hmm_mirror and relatedJason Gunthorpe4-540/+34
The only two users of this are now converted to use mmu_interval_notifier, delete all the code and update hmm.rst. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Jérôme Glisse <[email protected]> Tested-by: Ralph Campbell <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-11-23drm/amdgpu: Use mmu_interval_notifier instead of hmm_mirrorJason Gunthorpe5-237/+94
Convert the collision-retry lock around hmm_range_fault to use the one now provided by the mmu_interval notifier. Although this driver does not seem to use the collision retry lock that hmm provides correctly, it can still be converted over to use the mmu_interval_notifier api instead of hmm_mirror without too much trouble. This also deletes another place where a driver is associating additional data (struct amdgpu_mn) with a mmu_struct. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Philip Yang <[email protected]> Tested-by: Philip Yang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-11-23drm/amdgpu: Use mmu_interval_insert instead of hmm_mirrorJason Gunthorpe6-281/+77
Remove the interval tree in the driver and rely on the tree maintained by the mmu_notifier for delivering mmu_notifier invalidation callbacks. For some reason amdgpu has a very complicated arrangement where it tries to prevent duplicate entries in the interval_tree, this is not necessary, each amdgpu_bo can be its own stand alone entry. interval_tree already allows duplicates and overlaps in the tree. Also, there is no need to remove entries upon a release callback, the mmu_interval API safely allows objects to remain registered beyond the lifetime of the mm. The driver only has to stop touching the pages during release. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Philip Yang <[email protected]> Tested-by: Philip Yang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-11-23drm/amdgpu: Call find_vma under mmap_semJason Gunthorpe1-16/+21
find_vma() must be called under the mmap_sem, reorganize this code to do the vma check after entering the lock. Further, fix the unlocked use of struct task_struct's mm, instead use the mm from hmm_mirror which has an active mm_grab. Also the mm_grab must be converted to a mm_get before acquiring mmap_sem or calling find_vma(). Fixes: 66c45500bfdc ("drm/amdgpu: use new HMM APIs and helpers") Fixes: 0919195f2b0d ("drm/amdgpu: Enable amdgpu_ttm_tt_get_user_pages in worker threads") Link: https://lore.kernel.org/r/[email protected] Acked-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Reviewed-by: Philip Yang <[email protected]> Tested-by: Philip Yang <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-11-23nouveau: use mmu_interval_notifier instead of hmm_mirrorJason Gunthorpe1-80/+99
Remove the hmm_mirror object and use the mmu_interval_notifier API instead for the range, and use the normal mmu_notifier API for the general invalidation callback. While here re-organize the pagefault path so the locking pattern is clear. nouveau is the only driver that uses a temporary range object and instead forwards nearly every invalidation range directly to the HW. While this is not how the mmu_interval_notifier was intended to be used, the overheads on the pagefaulting path are similar to the existing hmm_mirror version. Particularly since the interval tree will be small. Link: https://lore.kernel.org/r/[email protected] Tested-by: Ralph Campbell <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-11-23nouveau: use mmu_notifier directly for invalidate_range_startJason Gunthorpe1-32/+63
There is no reason to get the invalidate_range_start() callback via an indirection through hmm_mirror, just register a normal notifier directly. Link: https://lore.kernel.org/r/[email protected] Tested-by: Ralph Campbell <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-11-23drm/radeon: use mmu_interval_notifier_insertJason Gunthorpe2-176/+51
The new API is an exact match for the needs of radeon. For some reason radeon tries to remove overlapping ranges from the interval tree, but interval trees (and mmu_interval_notifier_insert()) support overlapping ranges directly. Simply delete all this code. Since this driver is missing a invalidate_range_end callback, but still calls get_user_pages(), it cannot be correct against all races. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Christian König <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>