blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2023-07-27	bpf: Support new sign-extension load insns	Yonghong Song	1	-0/+3
	Add interpreter/jit support for new sign-extension load insns which adds a new mode (BPF_MEMSX). Also add verifier support to recognize these insns and to do proper verification with new insns. In verifier, besides to deduce proper bounds for the dst_reg, probed memory access is also properly handled. Acked-by: Eduard Zingerman <[email protected]> Signed-off-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2023-07-27	net: Remove unused declaration dev_restart()	YueHaibing	1	-2/+0
	This is not used, so can remove it. Signed-off-by: YueHaibing <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-07-27	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	2	-8/+1
	Cross-merge networking fixes after downstream PR. No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <[email protected]>
2023-07-27	mm: fix memory ordering for mm_lock_seq and vm_lock_seq	Jann Horn	3	-8/+59
	mm->mm_lock_seq effectively functions as a read/write lock; therefore it must be used with acquire/release semantics. A specific example is the interaction between userfaultfd_register() and lock_vma_under_rcu(). userfaultfd_register() does the following from the point where it changes a VMA's flags to the point where concurrent readers are permitted again (in a simple scenario where only a single private VMA is accessed and no merging/splitting is involved): userfaultfd_register userfaultfd_set_vm_flags vm_flags_reset vma_start_write down_write(&vma->vm_lock->lock) vma->vm_lock_seq = mm_lock_seq [marks VMA as busy] up_write(&vma->vm_lock->lock) vm_flags_init [sets VM_UFFD_* in __vm_flags] vma->vm_userfaultfd_ctx.ctx = ctx mmap_write_unlock vma_end_write_all WRITE_ONCE(mm->mm_lock_seq, mm->mm_lock_seq + 1) [unlocks VMA] There are no memory barriers in between the __vm_flags update and the mm->mm_lock_seq update that unlocks the VMA, so the unlock can be reordered to above the `vm_flags_init()` call, which means from the perspective of a concurrent reader, a VMA can be marked as a userfaultfd VMA while it is not VMA-locked. That's bad, we definitely need a store-release for the unlock operation. The non-atomic write to vma->vm_lock_seq in vma_start_write() is mostly fine because all accesses to vma->vm_lock_seq that matter are always protected by the VMA lock. There is a racy read in vma_start_read() though that can tolerate false-positives, so we should be using WRITE_ONCE() to keep things tidy and data-race-free (including for KCSAN). On the other side, lock_vma_under_rcu() works as follows in the relevant region for locking and userfaultfd check: lock_vma_under_rcu vma_start_read vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq) [early bailout] down_read_trylock(&vma->vm_lock->lock) vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq) [main check] userfaultfd_armed checks vma->vm_flags & __VM_UFFD_FLAGS Here, the interesting aspect is how far down the mm->mm_lock_seq read can be reordered - if this read is reordered down below the vma->vm_flags access, this could cause lock_vma_under_rcu() to partly operate on information that was read while the VMA was supposed to be locked. To prevent this kind of downwards bleeding of the mm->mm_lock_seq read, we need to read it with a load-acquire. Some of the comment wording is based on suggestions by Suren. BACKPORT WARNING: One of the functions changed by this patch (which I've written against Linus' tree) is vma_try_start_write(), but this function no longer exists in mm/mm-everything. I don't know whether the merged version of this patch will be ordered before or after the patch that removes vma_try_start_write(). If you're backporting this patch to a tree with vma_try_start_write(), make sure this patch changes that function. Link: https://lkml.kernel.org/r/[email protected] Fixes: 5e31275cc997 ("mm: add per-VMA lock and helper functions to control it") Signed-off-by: Jann Horn <[email protected]> Reviewed-by: Suren Baghdasaryan <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-07-27	net/mlx5: Allocate command stats with xarray	Shay Drory	1	-1/+1
	Command stats is an array with more than 2K entries, which amounts to ~180KB. This is way more than actually needed, as only ~190 entries are being used. Therefore, replace the array with xarray. Signed-off-by: Shay Drory <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-07-27	net/mlx5: Re-organize mlx5_cmd struct	Shay Drory	1	-10/+11
	Downstream patch will split mlx5_cmd_init() to probe and reload routines. As a preparation, organize mlx5_cmd struct so that any field that will be used in the reload routine are grouped at new nested struct. Signed-off-by: Shay Drory <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-07-27	net/mlx5: Devcom, Infrastructure changes	Roi Dayan	1	-2/+2
	Update devcom infrastructure to be more generic, without depending on max supported ports definition or a device guid, and also more encapsulated so callers don't need to pass the register devcom component id per event call. Signed-off-by: Eli Cohen <[email protected]> Signed-off-by: Roi Dayan <[email protected]> Reviewed-by: Shay Drory <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-07-27	seq_file: seq_show_option_n() is used for precise sizes	Kees Cook	1	-3/+4
	When seq_show_option_n() is used, it is for non-string memory that happens to be printable bytes. As such, we must use memcpy() to copy the bytes and then explicitly NUL-terminate the result. Cc: Andy Shevchenko <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Al Viro <[email protected]> Cc: Muchun Song <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Kees Cook <[email protected]>
2023-07-27	mtd: spi-nor: avoid holes in struct spi_mem_op	Arnd Bergmann	1	-0/+4
	gcc gets confused when -ftrivial-auto-var-init=pattern is used on sparse bit fields such as 'struct spi_mem_op', which caused the previous false positive warning about an uninitialized variable: drivers/mtd/spi-nor/spansion.c: error: 'op' is used uninitialized [-Werror=uninitialized] In fact, the variable is fully initialized and gcc does not see it being used, so the warning is entirely bogus. The problem appears to be a misoptimization in the initialization of single bit fields when the rest of the bytes are not initialized. A previous workaround added another initialization, which ended up shutting up the warning in spansion.c, though it apparently still happens in other files as reported by Peter Foley in the gcc bugzilla. The workaround of adding a fake initialization seems particularly bad because it would set values that can never be correct but prevent the compiler from warning about actually missing initializations. Revert the broken workaround and instead pad the structure to only have bitfields that add up to full bytes, which should avoid this behavior in all drivers. I also filed a new bug against gcc with what I found, so this can hopefully be addressed in future gcc releases. At the moment, only gcc-12 and gcc-13 are affected. Cc: Peter Foley <[email protected]> Cc: Pedro Falcato <[email protected]> Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110743 Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108402 Link: https://godbolt.org/z/efMMsG1Kx Fixes: 420c4495b5e56 ("mtd: spi-nor: spansion: make sure local struct does not contain garbage") Signed-off-by: Arnd Bergmann <[email protected]> Acked-by: Mark Brown <[email protected]> Acked-by: Tudor Ambarus <[email protected]> Signed-off-by: Miquel Raynal <[email protected]> Link: https://lore.kernel.org/linux-mtd/[email protected]
2023-07-27	backlight: corgi_lcd: fix missing prototype	Arnd Bergmann	1	-0/+2
	The corgi_lcd_limit_intensity() function is called from platform and defined in a driver, but the driver does not see the declaration: drivers/video/backlight/corgi_lcd.c:434:6: error: no previous prototype for 'corgi_lcd_limit_intensity' [-Werror=missing-prototypes] 434 \| void corgi_lcd_limit_intensity(int limit) Move the prototype into a header that can be included from both sides to shut up the warning. Reviewed-by: Daniel Thompson <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]>
2023-07-27	fs: Add fchmodat2()	Alexey Gladkov	1	-0/+2
	On the userspace side fchmodat(3) is implemented as a wrapper function which implements the POSIX-specified interface. This interface differs from the underlying kernel system call, which does not have a flags argument. Most implementations require procfs [1][2]. There doesn't appear to be a good userspace workaround for this issue but the implementation in the kernel is pretty straight-forward. The new fchmodat2() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag, unlike existing fchmodat. [1] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35 [2] https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28 Co-developed-by: Palmer Dabbelt <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]> Signed-off-by: Alexey Gladkov <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Message-Id: <f2a846ef495943c5d101011eebcf01179d0c7b61.1689092120.git.legion@kernel.org> [brauner: pre reviews, do flag conversion in do_fchmodat() directly] Signed-off-by: Christian Brauner <[email protected]>
2023-07-27	x86/srso: Add a Speculative RAS Overflow mitigation	Borislav Petkov (AMD)	1	-0/+2
	Add a mitigation for the speculative return address stack overflow vulnerability found on AMD processors. The mitigation works by ensuring all RET instructions speculate to a controlled location, similar to how speculation is controlled in the retpoline sequence. To accomplish this, the __x86_return_thunk forces the CPU to mispredict every function return using a 'safe return' sequence. To ensure the safety of this mitigation, the kernel must ensure that the safe return sequence is itself free from attacker interference. In Zen3 and Zen4, this is accomplished by creating a BTB alias between the untraining function srso_untrain_ret_alias() and the safe return function srso_safe_ret_alias() which results in evicting a potentially poisoned BTB entry and using that safe one for all function returns. In older Zen1 and Zen2, this is accomplished using a reinterpretation technique similar to Retbleed one: srso_untrain_ret() and srso_safe_ret(). Signed-off-by: Borislav Petkov (AMD) <[email protected]>
2023-07-26	net: phy: smsc: add WoL support to LAN8740/LAN8742 PHYs	Tristram Ha	1	-0/+34
	Microchip LAN8740/LAN8742 PHYs support basic unicast, broadcast, and Magic Packet WoL. They have one pattern filter matching up to 128 bytes of frame data, which can be used to implement ARP or multicast WoL. ARP WoL matches any ARP frame with broadcast address. Multicast WoL matches any multicast frame. Signed-off-by: Tristram Ha <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-07-26	of: fix htmldocs build warnings	Stephen Rothwell	1	-4/+3
	Fix these htmldoc build warnings: include/linux/of.h:115: warning: cannot understand function prototype: 'const struct kobj_type of_node_ktype; ' include/linux/of.h:118: warning: Excess function parameter 'phandle_name' description in 'of_node_init' Reported-by: Stephen Rothwell <[email protected]> Reported-by: Randy Dunlap <[email protected]> Fixes: d9194e009efe ("of: dynamic: add lifecycle docbook info to node creation functions") Signed-off-by: Stephen Rothwell <[email protected]> Reviewed-by: Randy Dunlap <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Rob Herring <[email protected]>
2023-07-26	coresight: etm4x: Change etm4_platform_driver driver for MMIO devices	Anshuman Khandual	1	-0/+47
	Add support for handling MMIO based devices via platform driver. We need to make sure that : 1) The APB clock, if present is enabled at probe and via runtime_pm ops 2) Use the ETM4x architecture or CoreSight architecture registers to identify a device as CoreSight ETM4x, instead of relying a white list of "Peripheral IDs" The driver doesn't get to handle the devices yet, until we wire the ACPI changes to move the devices to be handled via platform driver than the etm4_amba driver. Cc: Mathieu Poirier <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Mike Leach <[email protected]> Cc: Leo Yan <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Acked-by: Sudeep Holla <[email protected]> Signed-off-by: Anshuman Khandual <[email protected]> Signed-off-by: Suzuki K Poulose <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-07-26	coresight: etm4x: Drop pid argument from etm4_probe()	Anshuman Khandual	1	-0/+12
	Coresight device pid can be retrieved from its iomem base address, which is stored in 'struct etm4x_drvdata'. This drops pid argument from etm4_probe() and 'struct etm4_init_arg'. Instead etm4_check_arch_features() derives the coresight device pid with a new helper coresight_get_pid(), right before it is consumed in etm4_hisi_match_pid(). Cc: Mathieu Poirier <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Mike Leach <[email protected]> Cc: Leo Yan <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Anshuman Khandual <[email protected]> Signed-off-by: Suzuki K Poulose <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-07-26	iommufd/selftest: Test iommufd_device_replace()	Nicolin Chen	1	-0/+1
	Allow the selftest to call the function on the mock idev, add some tests to exercise it. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Kevin Tian <[email protected]> Tested-by: Nicolin Chen <[email protected]> Signed-off-by: Nicolin Chen <[email protected]> Signed-off-by: Yi Liu <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-07-26	perf: Remove unused extern declaration arch_perf_get_page_size()	YueHaibing	1	-4/+0
	commit 8af26be06272 ("perf/core: Fix arch_perf_get_page_size()") left behind this. Signed-off-by: YueHaibing <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2023-07-26	perf: Remove unused PERF_PMU_CAP_HETEROGENEOUS_CPUS capability	James Clark	1	-4/+3
	Since commit bd2756811766 ("perf: Rewrite core context handling") the relationship between perf_event_context and PMUs has changed so that the error scenario that PERF_PMU_CAP_HETEROGENEOUS_CPUS originally silenced no longer exists. Remove the capability to avoid confusion that it actually influences any perf core behavior and shift down the following capability bits to fill in the unused space. This change should be a no-op. Signed-off-by: James Clark <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Acked-by: Ian Rogers <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-07-26	perf/mem: Add PERF_MEM_LVLNUM_NA to PERF_MEM_NA	Ravi Bangoria	1	-1/+2
	Add PERF_MEM_LVLNUM_NA wherever PERF_MEM_NA is used to set default values. Signed-off-by: Ravi Bangoria <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-07-26	net: skbuff: remove unused HAVE_HW_TIME_STAMP feature define	Peter Seiderer	1	-2/+0
	Remove unused HAVE_HW_TIME_STAMP feature define (introduced by commit ac45f602ee3d ("net: infrastructure for hardware time stamping"). Signed-off-by: Peter Seiderer <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-07-26	usb: phy: add usb phy notify port status API	Stanley Chang	1	-0/+13
	In Realtek SoC, the parameter of usb phy is designed to can dynamic tuning base on port status. Therefore, add a notify callback of phy driver when usb port status change. The Realtek phy driver is designed to dynamically adjust disconnection level and calibrate phy parameters. When the device connected bit changes and when the disconnected bit changes, do port status change notification: Check if portstatus is USB_PORT_STAT_CONNECTION and portchange is USB_PORT_STAT_C_CONNECTION. 1. The device is connected, the driver lowers the disconnection level and calibrates the phy parameters. 2. The device disconnects, the driver increases the disconnect level and calibrates the phy parameters. When controller to notify connect that device is already ready. If we adjust the disconnection level in notify_connect, the disconnect may have been triggered at this stage. So we need to change that as early as possible. The status change of connection is before port reset. Therefore, we add an api to notify phy the port status changes. In this stage, the device is not port enable, and it will not trigger disconnection. Signed-off-by: Stanley Chang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-07-25	serial: make uart_insert_char() accept u8s	Jiri Slaby	1	-1/+1
	Both the character and flag are 8-bit values. So switch from unsigned ints to u8s. The drivers will be cleaned up in the next round. Signed-off-by: Jiri Slaby (SUSE) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-07-25	serial: convert uart sysrq handling to u8	Jiri Slaby	1	-8/+8
	Propagate u8 from the sysrq code further up to serial's uart_handle_sysrq_char() and friends. Signed-off-by: Jiri Slaby (SUSE) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-07-25	tty: sysrq: switch the rest of keys to u8	Jiri Slaby	1	-8/+8
	Propagate u8 more from the bottom to the interface, so that sysrq callers (usually drivers) see that u8 is expected. Signed-off-by: Jiri Slaby (SUSE) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-07-25	tty: sysrq: switch sysrq handlers from int to u8	Jiri Slaby	1	-1/+1
	The passed parameter to sysrq handlers is a key (a character). So change the type from 'int' to 'u8'. Let it specifically be 'u8' for two reasons: * unsigned: unsigned values come from the upper layers (devices) and the tty layer assumes unsigned on most places, and * 8-bit: as that what's supposed to be one day in all the layers built on the top of tty. (Currently, we use mostly 'unsigned char' and somewhere still only 'char'. (But that also translates to the former thanks to -funsigned-char.)) Signed-off-by: Jiri Slaby (SUSE) <[email protected]> Cc: Richard Henderson <[email protected]> Cc: Ivan Kokshaysky <[email protected]> Cc: Matt Turner <[email protected]> Cc: Huacai Chen <[email protected]> Cc: WANG Xuerui <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Maarten Lankhorst <[email protected]> Cc: Maxime Ripard <[email protected]> Cc: Thomas Zimmermann <[email protected]> Cc: David Airlie <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Jason Wessel <[email protected]> Cc: Daniel Thompson <[email protected]> Cc: Douglas Anderson <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Len Brown <[email protected]> Cc: Pavel Machek <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Neeraj Upadhyay <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Zqiang <[email protected]> Acked-by: Thomas Zimmermann <[email protected]> # DRM Acked-by: WANG Xuerui <[email protected]> # loongarch Acked-by: Paul E. McKenney <[email protected]> Acked-by: Daniel Thompson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-07-25	vfio: Compile vfio_group infrastructure optionally	Yi Liu	1	-3/+22
	vfio_group is not needed for vfio device cdev, so with vfio device cdev introduced, the vfio_group infrastructures can be compiled out if only cdev is needed. Reviewed-by: Jason Gunthorpe <[email protected]> Tested-by: Nicolin Chen <[email protected]> Tested-by: Matthew Rosato <[email protected]> Tested-by: Yanting Jiang <[email protected]> Tested-by: Shameer Kolothum <[email protected]> Tested-by: Terrence Xu <[email protected]> Tested-by: Zhenzhong Duan <[email protected]> Signed-off-by: Yi Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2023-07-25	vfio: Add VFIO_DEVICE_BIND_IOMMUFD	Yi Liu	1	-2/+3
	This adds ioctl for userspace to bind device cdev fd to iommufd. VFIO_DEVICE_BIND_IOMMUFD: bind device to an iommufd, hence gain DMA control provided by the iommufd. open_device op is called after bind_iommufd op. Tested-by: Nicolin Chen <[email protected]> Tested-by: Matthew Rosato <[email protected]> Tested-by: Yanting Jiang <[email protected]> Tested-by: Shameer Kolothum <[email protected]> Tested-by: Terrence Xu <[email protected]> Tested-by: Zhenzhong Duan <[email protected]> Signed-off-by: Yi Liu <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2023-07-25	iommufd: Add iommufd_ctx_from_fd()	Yi Liu	1	-0/+1
	It's common to get a reference to the iommufd context from a given file descriptor. So adds an API for it. Existing users of this API are compiled only when IOMMUFD is enabled, so no need to have a stub for the IOMMUFD disabled case. Tested-by: Yanting Jiang <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Signed-off-by: Yi Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2023-07-25	vfio: Add cdev for vfio_device	Yi Liu	1	-0/+4
	This adds cdev support for vfio_device. It allows the user to directly open a vfio device w/o using the legacy container/group interface, as a prerequisite for supporting new iommu features like nested translation and etc. The device fd opened in this manner doesn't have the capability to access the device as the fops open() doesn't open the device until the successful VFIO_DEVICE_BIND_IOMMUFD ioctl which will be added in a later patch. With this patch, devices registered to vfio core would have both the legacy group and the new device interfaces created. - group interface : /dev/vfio/$groupID - device interface: /dev/vfio/devices/vfioX - normal device ("X" is a unique number across vfio devices) For a given device, the user can identify the matching vfioX by searching the vfio-dev folder under the sysfs path of the device. Take PCI device (0000:6a:01.0) as an example, /sys/bus/pci/devices/0000\:6a\:01.0/vfio-dev/vfioX implies the matching vfioX under /dev/vfio/devices/, and vfio-dev/vfioX/dev contains the major:minor number of the matching /dev/vfio/devices/vfioX. The user can get device fd by opening the /dev/vfio/devices/vfioX. The vfio_device cdev logic in this patch: ) __vfio_register_dev() path ends up doing cdev_device_add() for each vfio_device if VFIO_DEVICE_CDEV configured. ) vfio_unregister_group_dev() path does cdev_device_del(); cdev interface does not support noiommu devices, so VFIO only creates the legacy group interface for the physical devices that do not have IOMMU. noiommu users should use the legacy group interface. Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Tested-by: Terrence Xu <[email protected]> Tested-by: Nicolin Chen <[email protected]> Tested-by: Matthew Rosato <[email protected]> Tested-by: Yanting Jiang <[email protected]> Tested-by: Shameer Kolothum <[email protected]> Tested-by: Zhenzhong Duan <[email protected]> Signed-off-by: Yi Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2023-07-25	vfio-iommufd: Add detach_ioas support for emulated VFIO devices	Yi Liu	1	-0/+3
	This prepares for adding DETACH ioctl for emulated VFIO devices. Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Tested-by: Terrence Xu <[email protected]> Tested-by: Nicolin Chen <[email protected]> Tested-by: Matthew Rosato <[email protected]> Tested-by: Yanting Jiang <[email protected]> Tested-by: Shameer Kolothum <[email protected]> Tested-by: Zhenzhong Duan <[email protected]> Signed-off-by: Yi Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2023-07-25	iommufd/device: Add iommufd_access_detach() API	Nicolin Chen	1	-0/+1
	Previously, the detach routine is only done by the destroy(). And it was called by vfio_iommufd_emulated_unbind() when the device runs close(), so all the mappings in iopt were cleaned in that setup, when the call trace reaches this detach() routine. Now, there's a need of a detach uAPI, meaning that it does not only need a new iommufd_access_detach() API, but also requires access->ops->unmap() call as a cleanup. So add one. However, leaving that unprotected can introduce some potential of a race condition during the pin_/unpin_pages() call, where access->ioas->iopt is getting referenced. So, add an ioas_lock to protect the context of iopt referencings. Also, to allow the iommufd_access_unpin_pages() callback to happen via this unmap() call, add an ioas_unpin pointer, so the unpin routine won't be affected by the "access->ioas = NULL" trick. Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Tested-by: Terrence Xu <[email protected]> Tested-by: Nicolin Chen <[email protected]> Tested-by: Matthew Rosato <[email protected]> Tested-by: Yanting Jiang <[email protected]> Tested-by: Shameer Kolothum <[email protected]> Tested-by: Zhenzhong Duan <[email protected]> Signed-off-by: Nicolin Chen <[email protected]> Signed-off-by: Yi Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2023-07-25	vfio-iommufd: Add detach_ioas support for physical VFIO devices	Yi Liu	1	-1/+7
	This prepares for adding DETACH ioctl for physical VFIO devices. Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Tested-by: Terrence Xu <[email protected]> Tested-by: Nicolin Chen <[email protected]> Tested-by: Matthew Rosato <[email protected]> Tested-by: Yanting Jiang <[email protected]> Tested-by: Shameer Kolothum <[email protected]> Tested-by: Zhenzhong Duan <[email protected]> Signed-off-by: Yi Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2023-07-25	vfio: Refine vfio file kAPIs for KVM	Yi Liu	1	-0/+1
	This prepares for making the below kAPIs to accept both group file and device file instead of only vfio group file. bool vfio_file_enforced_coherent(struct file file); void vfio_file_set_kvm(struct file file, struct kvm *kvm); Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Eric Auger <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Tested-by: Terrence Xu <[email protected]> Tested-by: Nicolin Chen <[email protected]> Tested-by: Matthew Rosato <[email protected]> Tested-by: Yanting Jiang <[email protected]> Tested-by: Shameer Kolothum <[email protected]> Tested-by: Zhenzhong Duan <[email protected]> Signed-off-by: Yi Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2023-07-25	vfio/pci: Extend VFIO_DEVICE_GET_PCI_HOT_RESET_INFO for vfio device cdev	Yi Liu	1	-0/+14
	This allows VFIO_DEVICE_GET_PCI_HOT_RESET_INFO ioctl use the iommufd_ctx of the cdev device to check the ownership of the other affected devices. When VFIO_DEVICE_GET_PCI_HOT_RESET_INFO is called on an IOMMUFD managed device, the new flag VFIO_PCI_HOT_RESET_FLAG_DEV_ID is reported to indicate the values returned are IOMMUFD devids rather than group IDs as used when accessing vfio devices through the conventional vfio group interface. Additionally the flag VFIO_PCI_HOT_RESET_FLAG_DEV_ID_OWNED will be reported in this mode if all of the devices affected by the hot-reset are owned by either virtue of being directly bound to the same iommufd context as the calling device, or implicitly owned via a shared IOMMU group. Suggested-by: Jason Gunthorpe <[email protected]> Suggested-by: Alex Williamson <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Tested-by: Yanting Jiang <[email protected]> Tested-by: Zhenzhong Duan <[email protected]> Signed-off-by: Yi Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2023-07-25	vfio: Add helper to search vfio_device in a dev_set	Yi Liu	1	-0/+3
	There are drivers that need to search vfio_device within a given dev_set. e.g. vfio-pci. So add a helper. vfio_pci_is_device_in_set() now returns -EBUSY in commit a882c16a2b7e ("vfio/pci: Change vfio_pci_try_bus_reset() to use the dev_set") where it was trying to preserve the return of vfio_pci_try_zap_and_vma_lock_cb(). However, it makes more sense to return -ENODEV. Suggested-by: Alex Williamson <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Tested-by: Yanting Jiang <[email protected]> Tested-by: Terrence Xu <[email protected]> Tested-by: Zhenzhong Duan <[email protected]> Signed-off-by: Yi Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2023-07-25	vfio: Mark cdev usage in vfio_device	Yi Liu	1	-0/+5
	This can be used to differentiate whether to report group_id or devid in the revised VFIO_DEVICE_GET_PCI_HOT_RESET_INFO ioctl. At this moment, no cdev path yet, so the vfio_device_cdev_opened() helper always returns false. Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Tested-by: Yanting Jiang <[email protected]> Tested-by: Terrence Xu <[email protected]> Tested-by: Zhenzhong Duan <[email protected]> Signed-off-by: Yi Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2023-07-25	iommufd: Add helper to retrieve iommufd_ctx and devid	Yi Liu	1	-0/+3
	This is needed by the vfio-pci driver to report affected devices in the hot-reset for a given device. Reviewed-by: Jason Gunthorpe <[email protected]> Tested-by: Yanting Jiang <[email protected]> Tested-by: Terrence Xu <[email protected]> Tested-by: Zhenzhong Duan <[email protected]> Signed-off-by: Yi Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2023-07-25	iommufd: Add iommufd_ctx_has_group()	Yi Liu	1	-0/+2
	This adds the helper to check if any device within the given iommu_group has been bound with the iommufd_ctx. This is helpful for the checking on device ownership for the devices which have not been bound but cannot be bound to any other iommufd_ctx as the iommu_group has been bound. Reviewed-by: Jason Gunthorpe <[email protected]> Tested-by: Yanting Jiang <[email protected]> Tested-by: Terrence Xu <[email protected]> Tested-by: Zhenzhong Duan <[email protected]> Signed-off-by: Yi Liu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alex Williamson <[email protected]>
2023-07-25	net/mlx5: Add relevant capabilities bits to support NAT-T	Leon Romanovsky	1	-2/+5
	Provide an ability to check if flow steering supports UDP encapsulation and decapsulation of IPsec ESP packets. Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-07-25	net: phylink: strip out pre-March 2020 legacy code	Russell King (Oracle)	1	-39/+6
	Strip out all the pre-March 2020 legacy code from phylink now that the last user of it is gone. Reviewed-by: Daniel Golle <[email protected]> Tested-by: Daniel Golle <[email protected]> Tested-by: Frank Wunderlich <[email protected]> Signed-off-by: Russell King (Oracle) <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2023-07-25	regulator: Use bitfield values for range selectors	Chen-Yu Tsai	1	-5/+6
	Right now the regulator helpers expect raw register values for the range selectors. This is different from the voltage selectors, which are normalized as bitfield values. This leads to a bit of confusion. Also, raw values are harder to copy from datasheets or match up with them, as datasheets will typically have bitfield values. Make the helpers expect bitfield values, and convert existing users. The field in regulator_desc is renamed to \|linear_range_selectors_bitfield\|. This is intended to cause drivers added in the same merge window and out-of-tree drivers using the incorrect variable and values to break, preventing incorrect values being used on actual hardware and potentially producing magic smoke. Also include bitops.h explicitly for ffs(), and reorder the header include statements. While at it, also replace module.h with export.h, since the only use is EXPORT_SYMBOL_GPL. Signed-off-by: Chen-Yu Tsai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2023-07-25	fs/nls: make load_nls() take a const parameter	Winston Wen	1	-1/+1
	load_nls() take a char * parameter, use it to find nls module in list or construct the module name to load it. This change make load_nls() take a const parameter, so we don't need do some cast like this: ses->local_nls = load_nls((char *)ctx->local_nls->charset); Suggested-by: Stephen Rothwell <[email protected]> Signed-off-by: Winston Wen <[email protected]> Reviewed-by: Paulo Alcantara <[email protected]> Reviewed-by: Christian Brauner <[email protected]> Signed-off-by: Steve French <[email protected]>
2023-07-25	iomap: Add per-block dirty state tracking to improve performance	Ritesh Harjani (IBM)	1	-0/+1
	When filesystem blocksize is less than folio size (either with mapping_large_folio_support() or with blocksize < pagesize) and when the folio is uptodate in pagecache, then even a byte write can cause an entire folio to be written to disk during writeback. This happens because we currently don't have a mechanism to track per-block dirty state within struct iomap_folio_state. We currently only track uptodate state. This patch implements support for tracking per-block dirty state in iomap_folio_state->state bitmap. This should help improve the filesystem write performance and help reduce write amplification. Performance testing of below fio workload reveals ~16x performance improvement using nvme with XFS (4k blocksize) on Power (64K pagesize) FIO reported write bw scores improved from around ~28 MBps to ~452 MBps. 1. <test_randwrite.fio> [global] ioengine=psync rw=randwrite overwrite=1 pre_read=1 direct=0 bs=4k size=1G dir=./ numjobs=8 fdatasync=1 runtime=60 iodepth=64 group_reporting=1 [fio-run] 2. Also our internal performance team reported that this patch improves their database workload performance by around ~83% (with XFS on Power) Reported-by: Aravinda Herle <[email protected]> Reported-by: Brian Foster <[email protected]> Signed-off-by: Ritesh Harjani (IBM) <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2023-07-24	net: add missing net_device::xdp_zc_max_segs description	Maciej Fijalkowski	1	-0/+2
	Cited commit under 'Fixes' tag introduced new member to struct net_device without providing description of it - fix it. Reported-by: Stephen Rothwell <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Fixes: 13ce2daa259a ("xsk: add new netlink attribute dedicated for ZC max frags") Signed-off-by: Maciej Fijalkowski <[email protected]> Reviewed-by: Simon Horman <[email protected]> Tested-by: Simon Horman <[email protected]> # build-tested Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-07-24	iomap: Create large folios in the buffered write path	Matthew Wilcox (Oracle)	1	-1/+1
	Use the size of the write as a hint for the size of the folio to create. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2023-07-24	filemap: Allow __filemap_get_folio to allocate large folios	Matthew Wilcox (Oracle)	1	-0/+34
	Allow callers of __filemap_get_folio() to specify a preferred folio order in the FGP flags. This is only honoured in the FGP_CREATE path; if there is already a folio in the page cache that covers the index, we will return it, no matter what its order is. No create-around is attempted; we will only create folios which start at the specified index. Unmodified callers will continue to allocate order 0 folios. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2023-07-24	filemap: Add fgf_t typedef	Matthew Wilcox (Oracle)	1	-11/+37
	Similarly to gfp_t, define fgf_t as its own type to prevent various misuses and confusion. Leave the flags as FGP_* for now to reduce the size of this patch; they will be converted to FGF_* later. Move the documentation to the definition of the type insted of burying it in the __filemap_get_folio() documentation. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Reviewed-by: Kent Overstreet <[email protected]>
2023-07-24	iov_iter: Add copy_folio_from_iter_atomic()	Matthew Wilcox (Oracle)	1	-1/+8
	Add a folio wrapper around copy_page_from_iter_atomic(). Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]>
2023-07-24	modpost, kallsyms: Treat add '$'-prefixed symbols as mapping symbols	Palmer Dabbelt	1	-14/+2
	Trying to restrict the '$'-prefix change to RISC-V caused some fallout, so let's just treat all those symbols as special. Fixes: c05780ef3c190 ("module: Ignore RISC-V mapping symbols too") Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Palmer Dabbelt <[email protected]> Reviewed-by: Masahiro Yamada <[email protected]> Signed-off-by: Luis Chamberlain <[email protected]>