| Age | Commit message (Collapse) | Author | Files | Lines |
|
Implement the uABI of UFFDIO_MOVE ioctl.
UFFDIO_COPY performs ~20% better than UFFDIO_MOVE when the application
needs pages to be allocated [1]. However, with UFFDIO_MOVE, if pages are
available (in userspace) for recycling, as is usually the case in heap
compaction algorithms, then we can avoid the page allocation and memcpy
(done by UFFDIO_COPY). Also, since the pages are recycled in the
userspace, we avoid the need to release (via madvise) the pages back to
the kernel [2].
We see over 40% reduction (on a Google pixel 6 device) in the compacting
thread's completion time by using UFFDIO_MOVE vs. UFFDIO_COPY. This was
measured using a benchmark that emulates a heap compaction implementation
using userfaultfd (to allow concurrent accesses by application threads).
More details of the usecase are explained in [2]. Furthermore,
UFFDIO_MOVE enables moving swapped-out pages without touching them within
the same vma. Today, it can only be done by mremap, however it forces
splitting the vma.
[1] https://lore.kernel.org/all/[email protected]/
[2] https://lore.kernel.org/linux-mm/CA+EESO4uO84SSnBhArH4HvLNhaUQ5nZKNKXqxRCyjniNVjp0Aw@mail.gmail.com/
Update for the ioctl_userfaultfd(2) manpage:
UFFDIO_MOVE
(Since Linux xxx) Move a continuous memory chunk into the
userfault registered range and optionally wake up the blocked
thread. The source and destination addresses and the number of
bytes to move are specified by the src, dst, and len fields of
the uffdio_move structure pointed to by argp:
struct uffdio_move {
__u64 dst; /* Destination of move */
__u64 src; /* Source of move */
__u64 len; /* Number of bytes to move */
__u64 mode; /* Flags controlling behavior of move */
__s64 move; /* Number of bytes moved, or negated error */
};
The following value may be bitwise ORed in mode to change the
behavior of the UFFDIO_MOVE operation:
UFFDIO_MOVE_MODE_DONTWAKE
Do not wake up the thread that waits for page-fault
resolution
UFFDIO_MOVE_MODE_ALLOW_SRC_HOLES
Allow holes in the source virtual range that is being moved.
When not specified, the holes will result in ENOENT error.
When specified, the holes will be accounted as successfully
moved memory. This is mostly useful to move hugepage aligned
virtual regions without knowing if there are transparent
hugepages in the regions or not, but preventing the risk of
having to split the hugepage during the operation.
The move field is used by the kernel to return the number of
bytes that was actually moved, or an error (a negated errno-
style value). If the value returned in move doesn't match the
value that was specified in len, the operation fails with the
error EAGAIN. The move field is output-only; it is not read by
the UFFDIO_MOVE operation.
The operation may fail for various reasons. Usually, remapping of
pages that are not exclusive to the given process fail; once KSM
might deduplicate pages or fork() COW-shares pages during fork()
with child processes, they are no longer exclusive. Further, the
kernel might only perform lightweight checks for detecting whether
the pages are exclusive, and return -EBUSY in case that check fails.
To make the operation more likely to succeed, KSM should be
disabled, fork() should be avoided or MADV_DONTFORK should be
configured for the source VMA before fork().
This ioctl(2) operation returns 0 on success. In this case, the
entire area was moved. On error, -1 is returned and errno is
set to indicate the error. Possible errors include:
EAGAIN The number of bytes moved (i.e., the value returned in
the move field) does not equal the value that was
specified in the len field.
EINVAL Either dst or len was not a multiple of the system page
size, or the range specified by src and len or dst and len
was invalid.
EINVAL An invalid bit was specified in the mode field.
ENOENT
The source virtual memory range has unmapped holes and
UFFDIO_MOVE_MODE_ALLOW_SRC_HOLES is not set.
EEXIST
The destination virtual memory range is fully or partially
mapped.
EBUSY
The pages in the source virtual memory range are either
pinned or not exclusive to the process. The kernel might
only perform lightweight checks for detecting whether the
pages are exclusive. To make the operation more likely to
succeed, KSM should be disabled, fork() should be avoided
or MADV_DONTFORK should be configured for the source virtual
memory area before fork().
ENOMEM Allocating memory needed for the operation failed.
ESRCH
The target process has exited at the time of a UFFDIO_MOVE
operation.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Andrea Arcangeli <[email protected]>
Signed-off-by: Suren Baghdasaryan <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Axel Rasmussen <[email protected]>
Cc: Brian Geffon <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Kalesh Singh <[email protected]>
Cc: Liam R. Howlett <[email protected]>
Cc: Lokesh Gidra <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mike Rapoport (IBM) <[email protected]>
Cc: Nicolas Geoffray <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Ryan Roberts <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: ZhangPeng <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The kernel doc comments for struct ethtool_link_settings includes
documentation for three fields that were never present there, leading to
these docs-build warnings:
./include/uapi/linux/ethtool.h:2207: warning: Excess struct member 'supported' description in 'ethtool_link_settings'
./include/uapi/linux/ethtool.h:2207: warning: Excess struct member 'advertising' description in 'ethtool_link_settings'
./include/uapi/linux/ethtool.h:2207: warning: Excess struct member 'lp_advertising' description in 'ethtool_link_settings'
Remove the entries to make the warnings go away. There was some
information there on how data in >link_mode_masks is formatted; move that
to the body of the comment to preserve it.
Signed-off-by: Jonathan Corbet <[email protected]>
Reviewed-by: Randy Dunlap <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
So far the mirred action has dealt with syntax that handles
mirror/redirection for netdev. A matching packet is redirected or mirrored
to a target netdev.
In this patch we enable mirred to mirror to a tc block as well.
IOW, the new syntax looks as follows:
... mirred <ingress | egress> <mirror | redirect> [index INDEX] < <blockid BLOCKID> | <dev <devname>> >
Examples of mirroring or redirecting to a tc block:
$ tc filter add block 22 protocol ip pref 25 \
flower dst_ip 192.168.0.0/16 action mirred egress mirror blockid 22
$ tc filter add block 22 protocol ip pref 25 \
flower dst_ip 10.10.10.10/32 action mirred egress redirect blockid 22
Co-developed-by: Jamal Hadi Salim <[email protected]>
Signed-off-by: Jamal Hadi Salim <[email protected]>
Co-developed-by: Pedro Tammela <[email protected]>
Signed-off-by: Pedro Tammela <[email protected]>
Signed-off-by: Victor Nogueira <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Virtual ISM devices introduced in SMCv2.1 requires a 128 bit extended
GID vs. the existing ISM 64bit GID. So the 2nd 64 bit of extended GID
should be included in SMC-D linkgroup netlink attribute as well.
Signed-off-by: Wen Gu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The ctx in struct lsm_ctx is an array of size ctx_len, tell the compiler
about this using __counted_by() where supported to improve the ability to
detect overflow issues.
Reported-by: Aishwarya TCV <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Paul Moore <[email protected]>
|
|
UHB AP send supported power type(LPI, SP, VLP)
in beacon and probe response IE and STA should
connect to these AP only if their regulatory support
the AP power type.
Beacon/Probe response are reported to userspace
with reason "STA regulatory not supporting to connect to AP
based on transmitted power type" and it should
not connect to AP.
Signed-off-by: Mukesh Sisodiya <[email protected]>
Reviewed-by: Gregory Greenman <[email protected]>
Signed-off-by: Miri Korenblit <[email protected]>
Link: https://msgid.link/20231220133549.cbfbef9170a9.I432f78438de18aa9f5c9006be12e41dc34cc47c5@changeid
Signed-off-by: Johannes Berg <[email protected]>
|
|
FCC-594280 D01 Section B.3 allows peer-to-peer and ad hoc devices to
operate on DFS channels while they operate under the control of a
concurrent DFS master. For example, it is possible to have a P2P GO on a
DFS channel as long as BSS connection is active on the same channel.
Allow such operation by adding additional regulatory flags to indicate
DFS concurrent channels and capable devices. Add the required
relaxations in DFS regulatory checks.
Signed-off-by: Andrei Otcheretianski <[email protected]>
Reviewed-by: Gregory Greenman <[email protected]>
Signed-off-by: Miri Korenblit <[email protected]>
Link: https://msgid.link/20231220133549.bdfb8a9c7c54.I973563562969a27fea8ec5685b96a3a47afe142f@changeid
Signed-off-by: Johannes Berg <[email protected]>
|
|
The tail of the provided ring buffer is shared between the kernel and
the application, but the head is private to the kernel as the
application doesn't need to see it. However, this also prevents the
application from knowing how many buffers the kernel has consumed.
Usually this is fine, as the information is inherently racy in that
the kernel could be consuming buffers continually, but for cleanup
purposes it may be relevant to know how many buffers are still left
in the ring.
Add IORING_REGISTER_PBUF_STATUS which will return status for a given
provided buffer ring. Right now it just returns the head, but space
is reserved for more information later in, if needed.
Link: https://github.com/axboe/liburing/discussions/1020
Signed-off-by: Jens Axboe <[email protected]>
|
|
We should't be depending on time.h; we should only be pulling in other
uapi headers.
Signed-off-by: Kent Overstreet <[email protected]>
|
|
Patch series "kexec_file: print out debugging message if required", v4.
Currently, specifying '-d' on kexec command will print a lot of debugging
informationabout kexec/kdump loading with kexec_load interface.
However, kexec_file_load prints nothing even though '-d' is specified.
It's very inconvenient to debug or analyze the kexec/kdump loading when
something wrong happened with kexec/kdump itself or develper want to check
the kexec/kdump loading.
In this patchset, a kexec_file flag is KEXEC_FILE_DEBUG added and checked
in code. If it's passed in, debugging message of kexec_file code will be
printed out and can be seen from console and dmesg. Otherwise, the
debugging message is printed like beofre when pr_debug() is taken.
Note:
****
=====
1) The code in kexec-tools utility also need be changed to support
passing KEXEC_FILE_DEBUG to kernel when 'kexec -s -d' is specified.
The patch link is here:
=========
[PATCH] kexec_file: add kexec_file flag to support debug printing
http://lists.infradead.org/pipermail/kexec/2023-November/028505.html
2) s390 also has kexec_file code, while I am not sure what debugging
information is necessary. So leave it to s390 developer.
Test:
****
====
Testing was done in v1 on x86_64 and arm64. For v4, tested on x86_64
again. And on x86_64, the printed messages look like below:
--------------------------------------------------------------
kexec measurement buffer for the loaded kernel at 0x207fffe000.
Loaded purgatory at 0x207fff9000
Loaded boot_param, command line and misc at 0x207fff3000 bufsz=0x1180 memsz=0x1180
Loaded 64bit kernel at 0x207c000000 bufsz=0xc88200 memsz=0x3c4a000
Loaded initrd at 0x2079e79000 bufsz=0x2186280 memsz=0x2186280
Final command line is: root=/dev/mapper/fedora_intel--knightslanding--lb--02-root ro
rd.lvm.lv=fedora_intel-knightslanding-lb-02/root console=ttyS0,115200N81 crashkernel=256M
E820 memmap:
0000000000000000-000000000009a3ff (1)
000000000009a400-000000000009ffff (2)
00000000000e0000-00000000000fffff (2)
0000000000100000-000000006ff83fff (1)
000000006ff84000-000000007ac50fff (2)
......
000000207fff6150-000000207fff615f (128)
000000207fff6160-000000207fff714f (1)
000000207fff7150-000000207fff715f (128)
000000207fff7160-000000207fff814f (1)
000000207fff8150-000000207fff815f (128)
000000207fff8160-000000207fffffff (1)
nr_segments = 5
segment[0]: buf=0x000000004e5ece74 bufsz=0x211 mem=0x207fffe000 memsz=0x1000
segment[1]: buf=0x000000009e871498 bufsz=0x4000 mem=0x207fff9000 memsz=0x5000
segment[2]: buf=0x00000000d879f1fe bufsz=0x1180 mem=0x207fff3000 memsz=0x2000
segment[3]: buf=0x000000001101cd86 bufsz=0xc88200 mem=0x207c000000 memsz=0x3c4a000
segment[4]: buf=0x00000000c6e38ac7 bufsz=0x2186280 mem=0x2079e79000 memsz=0x2187000
kexec_file_load: type:0, start:0x207fff91a0 head:0x109e004002 flags:0x8
---------------------------------------------------------------------------
This patch (of 7):
When specifying 'kexec -c -d', kexec_load interface will print loading
information, e.g the regions where kernel/initrd/purgatory/cmdline are
put, the memmap passed to 2nd kernel taken as system RAM ranges, and
printing all contents of struct kexec_segment, etc. These are very
helpful for analyzing or positioning what's happening when kexec/kdump
itself failed. The debugging printing for kexec_load interface is made in
user space utility kexec-tools.
Whereas, with kexec_file_load interface, 'kexec -s -d' print nothing.
Because kexec_file code is mostly implemented in kernel space, and the
debugging printing functionality is missed. It's not convenient when
debugging kexec/kdump loading and jumping with kexec_file_load interface.
Now add KEXEC_FILE_DEBUG to kexec_file flag to control the debugging
message printing. And add global variable kexec_file_dbg_print and macro
kexec_dprintk() to facilitate the printing.
This is a preparation, later kexec_dprintk() will be used to replace the
existing pr_debug(). Once 'kexec -s -d' is specified, it will print out
kexec/kdump loading information. If '-d' is not specified, it regresses
to pr_debug().
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Baoquan He <[email protected]>
Cc: Conor Dooley <[email protected]>
Cc: Joe Perches <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next
Jonathan writes:
1st set of IIO new device support, features and cleanup for 6.8
New device support
------------------
adi,hmc425a
* Add support for ADRF5740 attenuators. Minor changes to driver needed
alongside new IDs.
aosong,ags02ma
* New driver for this volatile organic compounds sensor.
bosch,bmp280
* Add BMP390 (small amount of refactoring + ID)
bosch,bmi323
* New driver to support the BMI323 6-axis IMU.
honeywell,hsc030pa
* New driver supporting a huge number of SSC and HSC series pressure and
temperature sensors.
isil,isl76682
* New driver for this simple Ambient Light sensor.
liteon,ltr390
* New driver for this ambient and ultraviolet light sensor.
maxim,max34408
* New driver to support the MAX34408 and MAX34409 current monitoring ADCs.
melexis,mlx90635
* New driver for this Infrared contactless temperature sensor.
mirochip,mcp9600
* New driver for this thermocouple EMF convertor.
ti,hdc3020
* New driver for this integrated relative humidity and temperature
sensor.
vishay,veml6075
* New driver for this UVA and UVB light sensor.
General features
----------------
Device properties
* Add fwnode_property_match_property_string() helper to allow matching
single value property against an array of predefined strings.
* Use fwnode_property_string_array_count() inside
fwnode_property_match_string() instead of open coding the same.
checkpatch.pl
* Add exclusion of __aligned() from a warning reducing false positives
on IIO drivers (and hopefully beyond)
IIO Features
------------
core
* New light channel modifiers for UVA and UVB.
* Add IIO_CHAN_INFO_TROUGH as counterpart to IIO_CHAN_INFO_PEAK so that
we can support device that keep running track of the lowest value they
have seen in similar fashion to the existing peak tracking.
adi,adis library
* Use spi cs inactive delay even when a burst reading is performed.
As it's now used every time, can centralize the handling in the SPI
setup code in the driver.
adi,ad2s1210
* Support for fixed-mode to this resolver driver where the A0 and A1
pins are hard wired to config mode in which case position and config
must be read from appropriate config registers.
* Support reset GPIO if present.
adi,ad5791
* Allow configuration of presence of external amplifier in DT binding.
adi,adis16400
* Add spi-cs-inactive-delay-ns to bindings to allow it to be tweaked
if default delays are not quite enough for a specific board.
adi,adis16475
* Add spi-cs-inactive-delay-ns to bindings to allow it to be tweaked
if default delays are not quite enough for a specific board.
bosch,bmp280
* Enable multiple chip IDs per family of devices.
rohm,bu27008
* Add an illuminance channel calculated from RGB and IR data.
Cleanup
-------
Minor white space, typos and tidy up not explicitly called out.
Core
* Check that the available_scan_masks array passed to the IIO core
by a driver is sensible by ensuring the entries are ordered so the
minimum number of channels is enabled in the earlier entries (as they
will be selected if sufficient for the requested channels).
* Document that the available_scan_masks infrastructure doesn't currently
handle masks that don't fit in a long int.
* Improve intensity documentation to reflect that there is no expectation
of sensible units (it's dependent on a frequency sensitivity curve)
Various
* Use new device_property_match_property_string() to replace open coded
versions of the same thing.
* Fix a few MAINTAINERS filenames.
* i2c_get_match_data() and spi_get_device_match_data() pushed into
more drivers reducing boilerplate handling.
* Some unnecessary headers removed.
* ACPI_PTR() removals. It's rarely worth using this.
adi,ad7091r (early part of a series adding device support - useful in
their own right)
* Pass iio_dev directly an event handler rather than relying
on broken use of dev_get_drvdata() as drvdata is never set in this driver.
* Make sure alert is turned on.
adi,ad9467 (general driver fixing up as precursor to iio-backend proposal
which is under review for 6.9)
* Fix reset gpio handling to match expected polarity.
* Always handle error codes from spi_writes.
* Add a driver instance local mutex to avoid some races.
* Fix scale setting to align with available scale values.
* Split array of chip_info structures up into named individual elements.
* Convert to regmap.
honeywell,mprls0025pa
* Drop now unnecessary type references in DT binding for properties in
pascals.
invensense,mpu6050
* Don't eat a potentially useful return value from regmap_bulk_write()
invensense,icm42600
* Use max macro to improve code readability and save a few lines.
liteon,ltrf216a
* Improve prevision of light intensity.
microchip,mcp3911
* Use cleanup.h magic.
qcom,spmi*
* Fix wrong descriptions of SPMI reg fields in bindings.
Other
----
mailmap
* Update for Matt Ranostay
* tag 'iio-for-6.8a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (83 commits)
iio: adc: ad7091r: Align arguments to function call parenthesis
iio: adc: ad7091r: Set alert bit in config register
iio: adc: ad7091r: Pass iio_dev to event handler
scripts: checkpatch: Add __aligned to the list of attribute notes
iio: chemical: add support for Aosong AGS02MA
dt-bindings: iio: chemical: add aosong,ags02ma
dt-bindings: vendor-prefixes: add aosong
iio: accel: bmi088: update comments and Kconfig
dt-bindings: iio: humidity: Add TI HDC302x support
iio: humidity: Add driver for ti HDC302x humidity sensors
iio: ABI: document temperature and humidity peak/trough raw attributes
iio: core: introduce trough info element for minimum values
iio: light: driver for Lite-On ltr390
dt-bindings: iio: light: add ltr390
iio: light: isl76682: remove unreachable code
iio: pressure: driver for Honeywell HSC/SSC series
dt-bindings: iio: pressure: add honeywell,hsc030
doc: iio: Document intensity scale as poorly defined
dt-bindings: iio: temperature: add MLX90635 device
iio: temperature: mlx90635 MLX90635 IR Temperature sensor
...
|
|
Currently, the 'state' field in 'struct br_port_msg' can be set to 1 if
the MDB entry is permanent or 0 if it is temporary. Additional states
might be added in the future.
In a similar fashion to 'NDA_NDM_STATE_MASK', add an MDB state mask uAPI
attribute that will allow the upcoming bulk deletion API to bulk delete
MDB entries with a certain state or any state.
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Acked-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
|
|
Introduces admin commands, as follow:
The "list query" command can be used by the driver to query the
set of admin commands supported by the virtio device.
The "list use" command is used to inform the virtio device which
admin commands the driver will use.
The "legacy common cfg rd/wr" commands are used to read from/write
into the legacy common configuration structure.
The "legacy dev cfg rd/wr" commands are used to read from/write
into the legacy device configuration structure.
The "notify info" command is used to query the notification region
information.
Signed-off-by: Feng Liu <[email protected]>
Reviewed-by: Parav Pandit <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Yishai Hadas <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alex Williamson <[email protected]>
|
|
Add support for sending admin command through admin virtqueue interface.
Abort any inflight admin commands once device reset completes. Activate
admin queue when device becomes ready; deactivate on device reset.
To comply to the below specification statement [1], the admin virtqueue
is activated for upper layer users only after setting DRIVER_OK status.
[1] The driver MUST NOT send any buffer available notifications to the
device before setting DRIVER_OK.
Signed-off-by: Feng Liu <[email protected]>
Reviewed-by: Parav Pandit <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Yishai Hadas <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alex Williamson <[email protected]>
|
|
Introduce support for the admin virtqueue. By negotiating
VIRTIO_F_ADMIN_VQ feature, driver detects capability and creates one
administration virtqueue. Administration virtqueue implementation in
virtio pci generic layer, enables multiple types of upper layer
drivers such as vfio, net, blk to utilize it.
Signed-off-by: Feng Liu <[email protected]>
Reviewed-by: Parav Pandit <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Yishai Hadas <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alex Williamson <[email protected]>
|
|
Introduce VIRTIO_F_ADMIN_VQ which is used for administration virtqueue
support.
Signed-off-by: Feng Liu <[email protected]>
Reviewed-by: Parav Pandit <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Yishai Hadas <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alex Williamson <[email protected]>
|
|
md-faulty has been marked as deprecated for 2.5 years. Remove it.
Cc: Christoph Hellwig <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Neil Brown <[email protected]>
Cc: Guoqing Jiang <[email protected]>
Cc: Mateusz Grzonka <[email protected]>
Cc: Jes Sorensen <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
md-multipath has been marked as deprecated for 2.5 years. Remove it.
Cc: Christoph Hellwig <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Neil Brown <[email protected]>
Cc: Guoqing Jiang <[email protected]>
Cc: Mateusz Grzonka <[email protected]>
Cc: Jes Sorensen <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
md-linear has been marked as deprecated for 2.5 years. Remove it.
Cc: Christoph Hellwig <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Neil Brown <[email protected]>
Cc: Guoqing Jiang <[email protected]>
Cc: Mateusz Grzonka <[email protected]>
Cc: Jes Sorensen <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Hannes Reinecke <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2023-12-19
Hi David, hi Jakub, hi Paolo, hi Eric,
The following pull-request contains BPF updates for your *net-next* tree.
We've added 2 non-merge commits during the last 1 day(s) which contain
a total of 40 files changed, 642 insertions(+), 2926 deletions(-).
The main changes are:
1) Revert all of BPF token-related patches for now as per list discussion [0],
from Andrii Nakryiko.
[0] https://lore.kernel.org/bpf/CAHk-=wg7JuFYwGy=GOMbRCtOL+jwSQsdUaBsRWkDVYbxipbM5A@mail.gmail.com
2) Fix a syzbot-reported use-after-free read in nla_find() triggered from
bpf_skb_get_nlattr_nest() helper, from Jakub Kicinski.
bpf-next-for-netdev
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
Revert BPF token-related functionality
bpf: Use nla_ok() instead of checking nla_len directly
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
This patch includes the following revert (one conflicting BPF FS
patch and three token patch sets, represented by merge commits):
- revert 0f5d5454c723 "Merge branch 'bpf-fs-mount-options-parsing-follow-ups'";
- revert 750e785796bb "bpf: Support uid and gid when mounting bpffs";
- revert 733763285acf "Merge branch 'bpf-token-support-in-libbpf-s-bpf-object'";
- revert c35919dcce28 "Merge branch 'bpf-token-and-bpf-fs-based-delegation'".
Link: https://lore.kernel.org/bpf/CAHk-=wg7JuFYwGy=GOMbRCtOL+jwSQsdUaBsRWkDVYbxipbM5A@mail.gmail.com
Signed-off-by: Andrii Nakryiko <[email protected]>
|
|
Currently the user listening on a socket for devlink notifications
gets always all messages for all existing instances, even if he is
interested only in one of those. That may cause unnecessary overhead
on setups with thousands of instances present.
User is currently able to narrow down the devlink objects replies
to dump commands by specifying select attributes.
Allow similar approach for notifications. Introduce a new devlink
NOTIFY_FILTER_SET which the user passes the select attributes. Store
these per-socket and use them for filtering messages
during multicast send.
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:
====================
pull-request: bpf-next 2023-12-18
This PR is larger than usual and contains changes in various parts
of the kernel.
The main changes are:
1) Fix kCFI bugs in BPF, from Peter Zijlstra.
End result: all forms of indirect calls from BPF into kernel
and from kernel into BPF work with CFI enabled. This allows BPF
to work with CONFIG_FINEIBT=y.
2) Introduce BPF token object, from Andrii Nakryiko.
It adds an ability to delegate a subset of BPF features from privileged
daemon (e.g., systemd) through special mount options for userns-bound
BPF FS to a trusted unprivileged application. The design accommodates
suggestions from Christian Brauner and Paul Moore.
Example:
$ sudo mkdir -p /sys/fs/bpf/token
$ sudo mount -t bpf bpffs /sys/fs/bpf/token \
-o delegate_cmds=prog_load:MAP_CREATE \
-o delegate_progs=kprobe \
-o delegate_attachs=xdp
3) Various verifier improvements and fixes, from Andrii Nakryiko, Andrei Matei.
- Complete precision tracking support for register spills
- Fix verification of possibly-zero-sized stack accesses
- Fix access to uninit stack slots
- Track aligned STACK_ZERO cases as imprecise spilled registers.
It improves the verifier "instructions processed" metric from single
digit to 50-60% for some programs.
- Fix verifier retval logic
4) Support for VLAN tag in XDP hints, from Larysa Zaremba.
5) Allocate BPF trampoline via bpf_prog_pack mechanism, from Song Liu.
End result: better memory utilization and lower I$ miss for calls to BPF
via BPF trampoline.
6) Fix race between BPF prog accessing inner map and parallel delete,
from Hou Tao.
7) Add bpf_xdp_get_xfrm_state() kfunc, from Daniel Xu.
It allows BPF interact with IPSEC infra. The intent is to support
software RSS (via XDP) for the upcoming ipsec pcpu work.
Experiments on AWS demonstrate single tunnel pcpu ipsec reaching
line rate on 100G ENA nics.
8) Expand bpf_cgrp_storage to support cgroup1 non-attach, from Yafang Shao.
9) BPF file verification via fsverity, from Song Liu.
It allows BPF progs get fsverity digest.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (164 commits)
bpf: Ensure precise is reset to false in __mark_reg_const_zero()
selftests/bpf: Add more uprobe multi fail tests
bpf: Fail uprobe multi link with negative offset
selftests/bpf: Test the release of map btf
s390/bpf: Fix indirect trampoline generation
selftests/bpf: Temporarily disable dummy_struct_ops test on s390
x86/cfi,bpf: Fix bpf_exception_cb() signature
bpf: Fix dtor CFI
cfi: Add CFI_NOSEAL()
x86/cfi,bpf: Fix bpf_struct_ops CFI
x86/cfi,bpf: Fix bpf_callback_t CFI
x86/cfi,bpf: Fix BPF JIT call
cfi: Flip headers
selftests/bpf: Add test for abnormal cnt during multi-kprobe attachment
selftests/bpf: Don't use libbpf_get_error() in kprobe_multi_test
selftests/bpf: Add test for abnormal cnt during multi-uprobe attachment
bpf: Limit the number of kprobes when attaching program to multiple kprobes
bpf: Limit the number of uprobes when attaching program to multiple uprobes
bpf: xdp: Register generic_kfunc_set with XDP programs
selftests/bpf: utilize string values for delegate_xxx mount options
...
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Kalle Valo says:
====================
wireless-next patches for v6.8
The second features pull request for v6.8. A bigger one this time with
changes both to stack and drivers. We have a new Wifi band RFI (WBRF)
mitigation feature for which we pulled an immutable branch shared with
other subsystems. And, as always, other new features and bug fixes all
over.
Major changes:
cfg80211/mac80211
* AMD ACPI based Wifi band RFI (WBRF) mitigation feature
* Basic Service Set (BSS) usage reporting
* TID to link mapping support
* mac80211 hardware flag to disallow puncturing
iwlwifi
* new debugfs file fw_dbg_clear
mt76
* NVMEM EEPROM improvements
* mt7996 Extremely High Throughpu (EHT) improvements
* mt7996 Wireless Ethernet Dispatcher (WED) support
* mt7996 36-bit DMA support
ath12k
* support one MSI vector
* WCN7850: support AP mode
* tag 'wireless-next-2023-12-18' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (207 commits)
wifi: mt76: mt7996: Use DECLARE_FLEX_ARRAY() and fix -Warray-bounds warnings
wifi: ath11k: workaround too long expansion sparse warnings
Revert "wifi: ath12k: use ATH12K_PCI_IRQ_DP_OFFSET for DP IRQ"
wifi: rt2x00: remove useless code in rt2x00queue_create_tx_descriptor()
wifi: rtw89: only reset BB/RF for existing WiFi 6 chips while starting up
wifi: rtw89: add DBCC H2C to notify firmware the status
wifi: rtw89: mac: add suffix _ax to MAC functions
wifi: rtw89: mac: add flags to check if CMAC and DMAC are enabled
wifi: rtw89: 8922a: add power on/off functions
wifi: rtw89: add XTAL SI for WiFi 7 chips
wifi: rtw89: phy: print out RFK log with formatted string
wifi: rtw89: parse and print out RFK log from C2H events
wifi: rtw89: add C2H event handlers of RFK log and report
wifi: rtw89: load RFK log format string from firmware file
wifi: rtw89: fw: add version field to BB MCU firmware element
wifi: rtw89: fw: load TX power track tables from fw_element
wifi: mwifiex: configure BSSID consistently when starting AP
wifi: mwifiex: add extra delay for firmware ready
wifi: mac80211: sta_info.c: fix sentence grammar
wifi: mac80211: rx.c: fix sentence grammar
...
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
In the root complex pci endpoint test function driver, change macros and
functions names using the term "legacy" to use "intx" instead to
match the term used in the PCI specifications.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Damien Le Moal <[email protected]>
Signed-off-by: Lorenzo Pieralisi <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
|
|
This Venus capset definition is used by Qemu, and Qemu imports the
kernel protocol header file. Add Venus capset to the VirtIO-GPU protocol.
Signed-off-by: Huang Rui <[email protected]>
[[email protected]: edit commit message]
Signed-off-by: Dmitry Osipenko <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
All of the other constants in this file are defined using enums, so make
the constants more consistent by defining the ioctls in an enum as well.
This is necessary for Rust Binder since the _IO macros are too
complicated for bindgen to see that they expand to integer constants.
Replacing the #defines with an enum forces bindgen to evaluate them
properly, which allows us to access them from Rust.
I originally intended to include this change in the first patch of the
Rust Binder patchset [1], but at plumbers Carlos Llamas told me that
this change has been discussed previously [2] and suggested that I send
it upstream separately.
Link: https://lore.kernel.org/rust-for-linux/[email protected]/ [1]
Link: https://lore.kernel.org/all/[email protected]/ [2]
Signed-off-by: Alice Ryhl <[email protected]>
Acked-by: Carlos Llamas <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
For include/uapi/linux/mei.h, correct spellos reported by codespell.
Signed-off-by: Randy Dunlap <[email protected]>
Cc: Tomas Winkler <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
When compiling with gcc version 14.0.0 20231206 (experimental)
and CONFIG_FORTIFY_SOURCE=y, I've noticed the following warning:
...
In function 'fortify_memcpy_chk',
inlined from '__ffs_func_bind_do_os_desc' at drivers/usb/gadget/function/f_fs.c:2934:3:
./include/linux/fortify-string.h:588:25: warning: call to '__read_overflow2_field'
declared with attribute warning: detected read beyond size of field (2nd parameter);
maybe use struct_group()? [-Wattribute-warning]
588 | __read_overflow2_field(q_size_field, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This call to 'memcpy()' is interpreted as an attempt to copy both
'CompatibleID' and 'SubCompatibleID' of 'struct usb_ext_compat_desc'
from an address of the first one, which causes an overread warning.
Since we actually want to copy both of them at once, use the
convenient 'struct_group()' and 'sizeof_field()' here.
Signed-off-by: Dmitry Antipov <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
We assume in handful of places that the name of the spec is
the same as the name of the family. We could fix that but
it seems like a fair assumption to make. Rename the MPTCP
spec instead.
Reviewed-by: Mat Martineau <[email protected]>
Reviewed-by: Donald Hunter <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
drivers/net/ethernet/intel/iavf/iavf_ethtool.c
3a0b5a2929fd ("iavf: Introduce new state machines for flow director")
95260816b489 ("iavf: use iavf_schedule_aq_request() helper")
https://lore.kernel.org/all/[email protected]/
drivers/net/ethernet/broadcom/bnxt/bnxt.c
c13e268c0768 ("bnxt_en: Fix HWTSTAMP_FILTER_ALL packet timestamp logic")
c2f8063309da ("bnxt_en: Refactor RX VLAN acceleration logic.")
a7445d69809f ("bnxt_en: Add support for new RX and TPA_START completion types for P7")
1c7fd6ee2fe4 ("bnxt_en: Rename some macros for the P5 chips")
https://lore.kernel.org/all/[email protected]/
drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c
bd6781c18cb5 ("bnxt_en: Fix wrong return value check in bnxt_close_nic()")
84793a499578 ("bnxt_en: Skip nic close/open when configuring tstamp filters")
https://lore.kernel.org/all/[email protected]/
drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
3d7a3f2612d7 ("net/mlx5: Nack sync reset request when HotPlug is enabled")
cecf44ea1a1f ("net/mlx5: Allow sync reset flow when BF MGT interface device is present")
https://lore.kernel.org/all/[email protected]/
No adjacent changes.
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Correct spelling as reported by codespell.
Correct run-on sentences and other grammar issues.
Add hyphenation of adjectives.
Correct some punctuation.
Signed-off-by: Randy Dunlap <[email protected]>
Cc: Johannes Berg <[email protected]>
Cc: [email protected]
Cc: Kalle Valo <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Link: https://msgid.link/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
|
|
Make it extensible so that we have the liberty to reuse it in future
mount-id based apis. Treat zero size as the first published struct.
Signed-off-by: Christian Brauner <[email protected]>
|
|
Add way to query the children of a particular mount. This is a more
flexible way to iterate the mount tree than having to parse
/proc/self/mountinfo.
Lookup the mount by the new 64bit mount ID. If a mount needs to be
queried based on path, then statx(2) can be used to first query the
mount ID belonging to the path.
Return an array of new (64bit) mount ID's. Without privileges only
mounts are listed which are reachable from the task's root.
Folded into this patch are several later improvements. Keeping them
separate would make the history pointlessly confusing:
* Recursive listing of mounts is the default now (cf. [1]).
* Remove explicit LISTMOUNT_UNREACHABLE flag (cf. [1]) and fail if mount
is unreachable from current root. This also makes permission checking
consistent with statmount() (cf. [3]).
* Start listing mounts in unique mount ID order (cf. [2]) to allow
continuing listmount() from a midpoint.
* Allow to continue listmount(). The @request_mask parameter is renamed
and to @param to be usable by both statmount() and listmount().
If @param is set to a mount id then listmount() will continue listing
mounts from that id on. This allows listing mounts in multiple
listmount invocations without having to resize the buffer. If @param
is zero then the listing starts from the beginning (cf. [4]).
* Don't return EOVERFLOW, instead return the buffer size which allows to
detect a full buffer as well (cf. [4]).
Signed-off-by: Miklos Szeredi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Ian Kent <[email protected]>
Link: https://lore.kernel.org/r/[email protected] [1] (folded)
Link: https://lore.kernel.org/r/[email protected] [2] (folded)
Link: https://lore.kernel.org/r/[email protected] [3] (folded)
Link: https://lore.kernel.org/r/[email protected] [4] (folded)
[Christian Brauner <[email protected]>: various smaller fixes]
Signed-off-by: Christian Brauner <[email protected]>
|
|
Symmetric RSS hash functions are beneficial in applications that monitor
both Tx and Rx packets of the same flow (IDS, software firewalls, ..etc).
Getting all traffic of the same flow on the same RX queue results in
higher CPU cache efficiency.
A NIC that supports "symmetric-xor" can achieve this RSS hash symmetry
by XORing the source and destination fields and pass the values to the
RSS hash algorithm.
The user may request RSS hash symmetry for a specific algorithm, via:
# ethtool -X eth0 hfunc <hash_alg> symmetric-xor
or turn symmetry off (asymmetric) by:
# ethtool -X eth0 hfunc <hash_alg>
The specific fields for each flow type should then be specified as usual
via:
# ethtool -N|-U eth0 rx-flow-hash <flow_type> s|d|f|n
Reviewed-by: Wojciech Drewek <[email protected]>
Signed-off-by: Ahmed Zaki <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Implement functionality that enables drivers to expose VLAN tag
to XDP code.
VLAN tag is represented by 2 variables:
- protocol ID, which is passed to bpf code in BE
- VLAN TCI, in host byte order
Acked-by: Stanislav Fomichev <[email protected]>
Signed-off-by: Larysa Zaremba <[email protected]>
Acked-by: Jesper Dangaard Brouer <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Due to a historical mishap, the v4l2_subdev_frame_interval structure
is the only part of the V4L2 subdev userspace API that doesn't contain a
'which' field. This prevents trying frame intervals using the subdev
'TRY' state mechanism.
Adding a 'which' field is simple as the structure has 8 reserved fields.
This would however break userspace as the field is currently set to 0,
corresponding to V4L2_SUBDEV_FORMAT_TRY, while the corresponding ioctls
currently operate on the 'ACTIVE' state. We thus need to add a new
subdev client cap, V4L2_SUBDEV_CLIENT_CAP_INTERVAL_USES_WHICH, to
indicate that userspace is aware of this new field.
All drivers that implement the subdev .get_frame_interval() and
.set_frame_interval() operations are updated to return -EINVAL when
operating on the TRY state, preserving the current behaviour.
While at it, fix a bad copy&paste in the documentation of the struct
v4l2_subdev_frame_interval_enum 'which' field.
Signed-off-by: Laurent Pinchart <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]> # for imx-media
Reviewed-by: Hans Verkuil <[email protected]>
Reviewed-by: Luca Ceresoli <[email protected]> # for tegra-video
Reviewed-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
|
|
io_uring can currently open/close regular files or fixed/direct
descriptors. Or you can instantiate a fixed descriptor from a regular
one, and then close the regular descriptor. But you currently can't turn
a purely fixed/direct descriptor into a regular file descriptor.
IORING_OP_FIXED_FD_INSTALL adds support for installing a direct
descriptor into the normal file table, just like receiving a file
descriptor or opening a new file would do. This is all nicely abstracted
into receive_fd(), and hence adding support for this is truly trivial.
Since direct descriptors are only usable within io_uring itself, it can
be useful to turn them into real file descriptors if they ever need to
be accessed via normal syscalls. This can either be a transitory thing,
or just a permanent transition for a given direct descriptor.
By default, new fds are installed with O_CLOEXEC set. The application
can disable O_CLOEXEC by setting IORING_FIXED_FD_NO_CLOEXEC in the
sqe->install_fd_flags member.
Suggested-by: Christian Brauner <[email protected]>
Reviewed-by: Christian Brauner <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Add support for setting the TID to link mapping for a non-AP MLD
station.
This is useful in cases user space needs to restrict the possible
set of active links, e.g., since it got a BSS Transition Management
request forcing to use only a subset of the valid links etc.
Signed-off-by: Ilan Peer <[email protected]>
Reviewed-by: Gregory Greenman <[email protected]>
Signed-off-by: Miri Korenblit <[email protected]>
Link: https://msgid.link/20231211085121.da4d56a5f3ff.Iacf88e943326bf9c169c49b728c4a3445fdedc97@changeid
Signed-off-by: Johannes Berg <[email protected]>
|
|
Sometimes there may be reasons for which a BSS that's
actually found in scan cannot be used to connect to,
for example a nonprimary link of an NSTR mobile AP MLD
cannot be used for normal direct connections to it.
Not indicating these to userspace as we do now of course
avoids being able to connect to them, but it's better if
they're shown to userspace and it can make an appropriate
decision, without e.g. doing an additional ML probe.
Thus add an indication of what a BSS can be used for,
currently "normal" and "MLD link", including a reason
bitmap for it being not usable.
The latter can be extended later for certain BSSes if there
are other reasons they cannot be used.
Signed-off-by: Johannes Berg <[email protected]>
Reviewed-by: Ilan Peer <[email protected]>
Reviewed-by: Gregory Greenman <[email protected]>
Signed-off-by: Miri Korenblit <[email protected]>
Link: https://msgid.link/20231211085121.0464f25e0b1d.I9f70ca9f1440565ad9a5207d0f4d00a20cca67e7@changeid
Signed-off-by: Johannes Berg <[email protected]>
|
|
Current handling of del pmksa with SSID is limited to FILS
security. In the current change the del pmksa support is extended
to SAE/OWE security offloads as well. For OWE/SAE offloads, the
PMK is generated and cached at driver/FW, so user app needs the
capability to request cache deletion based on SSID for drivers
supporting SAE/OWE offload.
Signed-off-by: Vinayak Yadawad <[email protected]>
Link: https://msgid.link/ecdae726459e0944c377a6a6f6cb2c34d2e057d0.1701262123.git.vinayak.yadawad@broadcom.com
[drop whitespace-damaged rdev_ops pointer completely, enabling tracing]
Signed-off-by: Johannes Berg <[email protected]>
|
|
Linux 6.7-rc5
Alex requested this for some amdkfd work relying on the symbols exports.
Signed-off-by: Dave Airlie <[email protected]>
|
|
Add a way to query attributes of a single mount instead of having to parse
the complete /proc/$PID/mountinfo, which might be huge.
Lookup the mount the new 64bit mount ID. If a mount needs to be queried
based on path, then statx(2) can be used to first query the mount ID
belonging to the path.
Design is based on a suggestion by Linus:
"So I'd suggest something that is very much like "statfsat()", which gets
a buffer and a length, and returns an extended "struct statfs" *AND*
just a string description at the end."
The interface closely mimics that of statx.
Handle ASCII attributes by appending after the end of the structure (as per
above suggestion). Pointers to strings are stored in u64 members to make
the structure the same regardless of pointer size. Strings are nul
terminated.
Link: https://lore.kernel.org/all/CAHk-=wh5YifP7hzKSbwJj94+DZ2czjrZsczy6GBimiogZws=rg@mail.gmail.com/
Signed-off-by: Miklos Szeredi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Ian Kent <[email protected]>
[Christian Brauner <[email protected]>: various minor changes]
Signed-off-by: Christian Brauner <[email protected]>
|
|
We need the char/misc fixes in here as well for testing and to build off
of.
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
The PAGEMAP_SCAN ioctl returns information regarding page table entries.
It is more efficient compared to reading pagemap files. CRIU can start to
utilize this ioctl, but it needs info about soft-dirty bits to track
memory changes.
We are aware of a new method for tracking memory changes implemented in
the PAGEMAP_SCAN ioctl. For CRIU, the primary advantage of this method is
its usability by unprivileged users. However, it is not feasible to
transparently replace the soft-dirty tracker with the new one. The main
problem here is userfault descriptors that have to be preserved between
pre-dump iterations. It means criu continues supporting the soft-dirty
method to avoid breakage for current users. The new method will be
implemented as a separate feature.
[[email protected]: update tools/include/uapi/linux/fs.h]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Andrei Vagin <[email protected]>
Reviewed-by: Muhammad Usama Anjum <[email protected]>
Cc: Michał Mirosław <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The deprecated interfaces were removed 15 years ago. KVM's
device assignment was deprecated in 4.2 and removed 6.5 years
ago; the only interest might be in compiling ancient versions
of QEMU, but QEMU has been using its own imported copy of the
kernel headers since June 2011. So again we go into archaeology
territory; just remove the cruft.
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
drivers/net/ethernet/stmicro/stmmac/dwmac5.c
drivers/net/ethernet/stmicro/stmmac/dwmac5.h
drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
drivers/net/ethernet/stmicro/stmmac/hwif.h
37e4b8df27bc ("net: stmmac: fix FPE events losing")
c3f3b97238f6 ("net: stmmac: Refactor EST implementation")
https://lore.kernel.org/all/[email protected]/
Adjacent changes:
net/ipv4/tcp_ao.c
9396c4ee93f9 ("net/tcp: Don't store TCP-AO maclen on reqsk")
7b0f570f879a ("tcp: Move TCP-AO bits from cookie_v[46]_check() to tcp_ao_syncookie().")
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Unless explicitly told to do so (by passing 'clocksource=tsc' and
'tsc=stable:socket', and then jumping through some hoops concerning
potential CPU hotplug) Xen will never use TSC as its clocksource.
Hence, by default, a Xen guest will not see PVCLOCK_TSC_STABLE_BIT set
in either the primary or secondary pvclock memory areas. This has
led to bugs in some guest kernels which only become evident if
PVCLOCK_TSC_STABLE_BIT *is* set in the pvclocks. Hence, to support
such guests, give the VMM a new Xen HVM config flag to tell KVM to
forcibly clear the bit in the Xen pvclocks.
Signed-off-by: Paul Durrant <[email protected]>
Reviewed-by: David Woodhouse <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sean Christopherson <[email protected]>
|
|
Add the call to the UAPI such that userspace may corelate the
timestamps from the device log with system wall time, if, for
example there's any sort of inaccuracy or skew in the device.
Signed-off-by: Davidlohr Bueso <[email protected]>
Reviewed-by: Dave Jiang <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Dan Williams <[email protected]>
|