Age | Commit message (Collapse) | Author | Files | Lines |
|
IEEE 802.1Q-2018 defines the concept of a cycle-time-extension, so the
last entry of a schedule before the start of a new schedule can be
extended, so "too-short" entries can be avoided.
Signed-off-by: Vinicius Costa Gomes <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
IEEE 802.1Q-2018 defines that a the cycle-time of a schedule may be
overridden, so the schedule is truncated to a determined "width".
Signed-off-by: Vinicius Costa Gomes <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The IEEE 802.1Q-2018 defines two "types" of schedules, the "Oper" (from
operational?) and "Admin" ones. Up until now, 'taprio' only had
support for the "Oper" one, added when the qdisc is created. This adds
support for the "Admin" one, which allows the .change() operation to
be supported.
Just for clarification, some quick (and dirty) definitions, the "Oper"
schedule is the currently (as in this instant) running one, and it's
read-only. The "Admin" one is the one that the system configurator has
installed, it can be changed, and it will be "promoted" to "Oper" when
it's 'base-time' is reached.
The idea behing this patch is that calling something like the below,
(after taprio is already configured with an initial schedule):
$ tc qdisc change taprio dev IFACE parent root \
base-time X \
sched-entry <CMD> <GATES> <INTERVAL> \
...
Will cause a new admin schedule to be created and programmed to be
"promoted" to "Oper" at instant X. If an "Admin" schedule already
exists, it will be overwritten with the new parameters.
Up until now, there was some code that was added to ease the support
of changing a single entry of a schedule, but was ultimately unused.
Now, that we have support for "change" with more well thought
semantics, updating a single entry seems to be less useful.
So we remove what is in practice dead code, and return a "not
supported" error if the user tries to use it. If changing a single
entry would make the user's life easier we may ressurrect this idea,
but at this point, removing it simplifies the code.
For now, only the schedule specific bits are allowed to be added for a
new schedule, that means that 'clockid', 'num_tc', 'map' and 'queues'
cannot be modified.
Example:
$ tc qdisc change dev IFACE parent root handle 100 taprio \
base-time $BASE_TIME \
sched-entry S 00 500000 \
sched-entry S 0f 500000 \
clockid CLOCK_TAI
The only change in the netlink API introduced by this change is the
introduction of an "admin" type in the response to a dump request,
that type allows userspace to separate the "oper" schedule from the
"admin" schedule. If userspace doesn't support the "admin" type, it
will only display the "oper" schedule.
Signed-off-by: Vinicius Costa Gomes <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The file already has the correct SPDX header.
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
The 'id' key returns the unique id of the conntrack entry as returned
by nf_ct_get_id().
Signed-off-by: Brett Mastbergen <[email protected]>
Acked-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
The user interface exposes a new capability KVM_CAP_PPC_IRQ_XIVE to
let QEMU connect the vCPU presenters to the XIVE KVM device if
required. The capability is not advertised for now as the full support
for the XIVE native exploitation mode is not yet available. When this
is case, the capability will be advertised on PowerNV Hypervisors
only. Nested guests (pseries KVM Hypervisor) are not supported.
Internally, the interface to the new KVM device is protected with a
new interrupt mode: KVMPPC_IRQ_XIVE.
Signed-off-by: Cédric Le Goater <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
|
|
This is the basic framework for the new KVM device supporting the XIVE
native exploitation mode. The user interface exposes a new KVM device
to be created by QEMU, only available when running on a L0 hypervisor.
Support for nested guests is not available yet.
The XIVE device reuses the device structure of the XICS-on-XIVE device
as they have a lot in common. That could possibly change in the future
if the need arise.
Signed-off-by: Cédric Le Goater <[email protected]>
Reviewed-by: David Gibson <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
|
|
Deduplicate the btrfs file type conversion implementation - file systems
that use the same file types as defined by POSIX do not need to define
their own versions and can use the common helper functions decared in
fs_types.h and implemented in fs_types.c
Common implementation can be found via commit:
bbe7449e2599 "fs: common implementation of file type"
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Amir Goldstein <[email protected]>
Signed-off-by: Phillip Potter <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
|
|
Add a serial driver for the SiFive UART, found on SiFive FU540 devices
(among others).
The underlying serial IP block is relatively basic, and currently does
not support serial break detection. Further information on the IP
block can be found in the documentation and Chisel sources:
https://static.dev.sifive.com/FU540-C000-v1.0.pdf
https://github.com/sifive/sifive-blocks/tree/master/src/main/scala/devices/uart
This driver was written in collaboration with Wesley Terpstra
<[email protected]>.
Tested on a SiFive HiFive Unleashed A00 board, using BBL and the open-
source FSBL (using a DT file based on what's targeted for mainline).
This revision incorporates changes based on comments by Julia Lawall
<[email protected]>, Emil Renner Berthing <[email protected]>, and
Andreas Schwab <[email protected]>. Thanks also to Andreas for testing
the driver with his userspace and reporting a bug with the
set_termios implementation.
Signed-off-by: Paul Walmsley <[email protected]>
Signed-off-by: Paul Walmsley <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Wesley Terpstra <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Julia Lawall <[email protected]>
Cc: Emil Renner Berthing <[email protected]>
Cc: Andreas Schwab <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Daniel Borkmann says:
====================
pull-request: bpf-next 2019-04-28
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) Introduce BPF socket local storage map so that BPF programs can store
private data they associate with a socket (instead of e.g. separate hash
table), from Martin.
2) Add support for bpftool to dump BTF types. This is done through a new
`bpftool btf dump` sub-command, from Andrii.
3) Enable BPF-based flow dissector for skb-less eth_get_headlen() calls which
was currently not supported since skb was used to lookup netns, from Stanislav.
4) Add an opt-in interface for tracepoints to expose a writable context
for attached BPF programs, used here for NBD sockets, from Matt.
5) BPF xadd related arm64 JIT fixes and scalability improvements, from Daniel.
6) Change the skb->protocol for bpf_skb_adjust_room() helper in order to
support tunnels such as sit. Add selftests as well, from Willem.
7) Various smaller misc fixes.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
After allowing a bpf prog to
- directly read the skb->sk ptr
- get the fullsock bpf_sock by "bpf_sk_fullsock()"
- get the bpf_tcp_sock by "bpf_tcp_sock()"
- get the listener sock by "bpf_get_listener_sock()"
- avoid duplicating the fields of "(bpf_)sock" and "(bpf_)tcp_sock"
into different bpf running context.
this patch is another effort to make bpf's network programming
more intuitive to do (together with memory and performance benefit).
When bpf prog needs to store data for a sk, the current practice is to
define a map with the usual 4-tuples (src/dst ip/port) as the key.
If multiple bpf progs require to store different sk data, multiple maps
have to be defined. Hence, wasting memory to store the duplicated
keys (i.e. 4 tuples here) in each of the bpf map.
[ The smallest key could be the sk pointer itself which requires
some enhancement in the verifier and it is a separate topic. ]
Also, the bpf prog needs to clean up the elem when sk is freed.
Otherwise, the bpf map will become full and un-usable quickly.
The sk-free tracking currently could be done during sk state
transition (e.g. BPF_SOCK_OPS_STATE_CB).
The size of the map needs to be predefined which then usually ended-up
with an over-provisioned map in production. Even the map was re-sizable,
while the sk naturally come and go away already, this potential re-size
operation is arguably redundant if the data can be directly connected
to the sk itself instead of proxy-ing through a bpf map.
This patch introduces sk->sk_bpf_storage to provide local storage space
at sk for bpf prog to use. The space will be allocated when the first bpf
prog has created data for this particular sk.
The design optimizes the bpf prog's lookup (and then optionally followed by
an inline update). bpf_spin_lock should be used if the inline update needs
to be protected.
BPF_MAP_TYPE_SK_STORAGE:
-----------------------
To define a bpf "sk-local-storage", a BPF_MAP_TYPE_SK_STORAGE map (new in
this patch) needs to be created. Multiple BPF_MAP_TYPE_SK_STORAGE maps can
be created to fit different bpf progs' needs. The map enforces
BTF to allow printing the sk-local-storage during a system-wise
sk dump (e.g. "ss -ta") in the future.
The purpose of a BPF_MAP_TYPE_SK_STORAGE map is not for lookup/update/delete
a "sk-local-storage" data from a particular sk.
Think of the map as a meta-data (or "type") of a "sk-local-storage". This
particular "type" of "sk-local-storage" data can then be stored in any sk.
The main purposes of this map are mostly:
1. Define the size of a "sk-local-storage" type.
2. Provide a similar syscall userspace API as the map (e.g. lookup/update,
map-id, map-btf...etc.)
3. Keep track of all sk's storages of this "type" and clean them up
when the map is freed.
sk->sk_bpf_storage:
------------------
The main lookup/update/delete is done on sk->sk_bpf_storage (which
is a "struct bpf_sk_storage"). When doing a lookup,
the "map" pointer is now used as the "key" to search on the
sk_storage->list. The "map" pointer is actually serving
as the "type" of the "sk-local-storage" that is being
requested.
To allow very fast lookup, it should be as fast as looking up an
array at a stable-offset. At the same time, it is not ideal to
set a hard limit on the number of sk-local-storage "type" that the
system can have. Hence, this patch takes a cache approach.
The last search result from sk_storage->list is cached in
sk_storage->cache[] which is a stable sized array. Each
"sk-local-storage" type has a stable offset to the cache[] array.
In the future, a map's flag could be introduced to do cache
opt-out/enforcement if it became necessary.
The cache size is 16 (i.e. 16 types of "sk-local-storage").
Programs can share map. On the program side, having a few bpf_progs
running in the networking hotpath is already a lot. The bpf_prog
should have already consolidated the existing sock-key-ed map usage
to minimize the map lookup penalty. 16 has enough runway to grow.
All sk-local-storage data will be removed from sk->sk_bpf_storage
during sk destruction.
bpf_sk_storage_get() and bpf_sk_storage_delete():
------------------------------------------------
Instead of using bpf_map_(lookup|update|delete)_elem(),
the bpf prog needs to use the new helper bpf_sk_storage_get() and
bpf_sk_storage_delete(). The verifier can then enforce the
ARG_PTR_TO_SOCKET argument. The bpf_sk_storage_get() also allows to
"create" new elem if one does not exist in the sk. It is done by
the new BPF_SK_STORAGE_GET_F_CREATE flag. An optional value can also be
provided as the initial value during BPF_SK_STORAGE_GET_F_CREATE.
The BPF_MAP_TYPE_SK_STORAGE also supports bpf_spin_lock. Together,
it has eliminated the potential use cases for an equivalent
bpf_map_update_elem() API (for bpf_prog) in this patch.
Misc notes:
----------
1. map_get_next_key is not supported. From the userspace syscall
perspective, the map has the socket fd as the key while the map
can be shared by pinned-file or map-id.
Since btf is enforced, the existing "ss" could be enhanced to pretty
print the local-storage.
Supporting a kernel defined btf with 4 tuples as the return key could
be explored later also.
2. The sk->sk_lock cannot be acquired. Atomic operations is used instead.
e.g. cmpxchg is done on the sk->sk_bpf_storage ptr.
Please refer to the source code comments for the details in
synchronization cases and considerations.
3. The mem is charged to the sk->sk_omem_alloc as the sk filter does.
Benchmark:
---------
Here is the benchmark data collected by turning on
the "kernel.bpf_stats_enabled" sysctl.
Two bpf progs are tested:
One bpf prog with the usual bpf hashmap (max_entries = 8192) with the
sk ptr as the key. (verifier is modified to support sk ptr as the key
That should have shortened the key lookup time.)
Another bpf prog is with the new BPF_MAP_TYPE_SK_STORAGE.
Both are storing a "u32 cnt", do a lookup on "egress_skb/cgroup" for
each egress skb and then bump the cnt. netperf is used to drive
data with 4096 connected UDP sockets.
BPF_MAP_TYPE_HASH with a modifier verifier (152ns per bpf run)
27: cgroup_skb name egress_sk_map tag 74f56e832918070b run_time_ns 58280107540 run_cnt 381347633
loaded_at 2019-04-15T13:46:39-0700 uid 0
xlated 344B jited 258B memlock 4096B map_ids 16
btf_id 5
BPF_MAP_TYPE_SK_STORAGE in this patch (66ns per bpf run)
30: cgroup_skb name egress_sk_stora tag d4aa70984cc7bbf6 run_time_ns 25617093319 run_cnt 390989739
loaded_at 2019-04-15T13:47:54-0700 uid 0
xlated 168B jited 156B memlock 4096B map_ids 17
btf_id 6
Here is a high-level picture on how are the objects organized:
sk
┌──────┐
│ │
│ │
│ │
│*sk_bpf_storage─────▶ bpf_sk_storage
└──────┘ ┌───────┐
┌───────────┤ list │
│ │ │
│ │ │
│ │ │
│ └───────┘
│
│ elem
│ ┌────────┐
├─▶│ snode │
│ ├────────┤
│ │ data │ bpf_map
│ ├────────┤ ┌─────────┐
│ │map_node│◀─┬─────┤ list │
│ └────────┘ │ │ │
│ │ │ │
│ elem │ │ │
│ ┌────────┐ │ └─────────┘
└─▶│ snode │ │
├────────┤ │
bpf_map │ data │ │
┌─────────┐ ├────────┤ │
│ list ├───────▶│map_node│ │
│ │ └────────┘ │
│ │ │
│ │ elem │
└─────────┘ ┌────────┐ │
┌─▶│ snode │ │
│ ├────────┤ │
│ │ data │ │
│ ├────────┤ │
│ │map_node│◀─┘
│ └────────┘
│
│
│ ┌───────┐
sk └──────────│ list │
┌──────┐ │ │
│ │ │ │
│ │ │ │
│ │ └───────┘
│*sk_bpf_storage───────▶bpf_sk_storage
└──────┘
Signed-off-by: Martin KaFai Lau <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
This is an opt-in interface that allows a tracepoint to provide a safe
buffer that can be written from a BPF_PROG_TYPE_RAW_TRACEPOINT program.
The size of the buffer must be a compile-time constant, and is checked
before allowing a BPF program to attach to a tracepoint that uses this
feature.
The pointer to this buffer will be the first argument of tracepoints
that opt in; the pointer is valid and can be bpf_probe_read() by both
BPF_PROG_TYPE_RAW_TRACEPOINT and BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE
programs that attach to such a tracepoint, but the buffer to which it
points may only be written by the latter.
Signed-off-by: Matt Mullins <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
The HID usage tables define a key to cycle through a set of keyboard
layouts, let's add corresponding keycode.
Signed-off-by: Dmitry Torokhov <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
Various updates, notably:
* extended key ID support (from 802.11-2016)
* per-STA TX power control support
* mac80211 TX performance improvements
* HE (802.11ax) updates
* mesh link probing support
* enhancements of multi-BSSID support (also related to HE)
* OWE userspace processing support
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Adding support to allow mesh HWMP to measure link metrics on unexercised
direct mesh path by sending some data frames to other mesh points which
are not currently selected as a primary traffic path but only 1 hop away.
The absence of the primary path to the chosen node makes it necessary to
apply some form of marking on a chosen packet stream so that the packets
can be properly steered to the selected node for testing, and not by the
regular mesh path lookup.
Tested-by: Pradeep Kumar Chitrapu <[email protected]>
Signed-off-by: Rajkumar Manoharan <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
|
|
This patch adds support to set transmit power setting type and transmit
power level attributes to NL80211_CMD_SET_STATION in order to facilitate
adjusting the transmit power level of a station associated to the AP.
The added attributes allow selection of automatic and limited transmit
power level, with the level defined in dBm format.
Co-developed-by: Balaji Pothunoori <[email protected]>
Signed-off-by: Ashok Raj Nagarajan <[email protected]>
Signed-off-by: Balaji Pothunoori <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
|
|
Add support for IEEE 802.11-2016 "Extended Key ID for Individually
Addressed Frames".
Extend cfg80211 and nl80211 to allow pairwise keys to be installed for
Rx only, enable Tx separately and allow Key ID 1 for pairwise keys.
Signed-off-by: Alexander Wetzel <[email protected]>
[use NLA_POLICY_RANGE() for NL80211_KEY_MODE]
Signed-off-by: Johannes Berg <[email protected]>
|
|
The iwlwifi driver creates one rule per channel, thus it needs more
rules than normal. To solve this, increase NL80211_MAX_SUPP_REG_RULES
so iwlwifi can also fit UHB (ultra high band) channels.
Signed-off-by: Shaul Triebitz <[email protected]>
Signed-off-by: Luca Coelho <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
|
|
Two easy cases of overlapping changes.
Signed-off-by: David S. Miller <[email protected]>
|
|
When the label says "for internal use only", then it doesn't belong
in the 'uapi' subtree.
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
|
|
The ASPEED AST2400, and AST2500 in some configurations include a
PCI-to-AHB MMIO bridge. This bridge allows a server to read and write
in the BMC's physical address space. This feature is especially useful
when using this bridge to send large files to the BMC.
The host may use this to send down a firmware image by staging data at a
specific memory address, and in a coordinated effort with the BMC's
software stack and kernel, transmit the bytes.
This driver enables the BMC to unlock the PCI bridge on demand, and
configure it via ioctl to allow the host to write bytes to an agreed
upon location. In the primary use-case, the region to use is known
apriori on the BMC, and the host requests this information. Once this
request is received, the BMC's software stack will enable the bridge and
the region and then using some software flow control (possibly via IPMI
packets), copy the bytes down. Once the process is complete, the BMC
will disable the bridge and unset any region involved.
The default behavior of this bridge when present is: enabled and all
regions marked read-write. This driver will fix the regions to be
read-only and then disable the bridge entirely.
The memory regions protected are:
* BMC flash MMIO window
* System flash MMIO windows
* SOC IO (peripheral MMIO)
* DRAM
The DRAM region itself is all of DRAM and cannot be further specified.
Once the PCI bridge is enabled, the host can read all of DRAM, and if
the DRAM section is write-enabled, then it can write to all of it.
Signed-off-by: Patrick Venture <[email protected]>
Reviewed-by: Andrew Jeffery <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
The V4L2 API is missing the 16-bit RGB555 formats for the RGBA, RGBX,
ABGR, XBGR, BGRA and BGRX component orders. Add them, using the same
4CCs as DRM.
Signed-off-by: Laurent Pinchart <[email protected]>
Acked-by: Sakari Ailus <[email protected]>
Reviewed-by: Jacopo Mondi <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|
|
The V4L2 API is missing the 16-bit RGB4444 formats for the RGBA, RGBX,
ABGR, XBGR, BGRA and BGRX component orders. Add them, using the same
4CCs as DRM.
Signed-off-by: Laurent Pinchart <[email protected]>
Reviewed-by: Jacopo Mondi <[email protected]>
Acked-by: Sakari Ailus <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|
|
The V4L2 API is missing the 32-bit RGB formats for the ABGR, XBGR, RGBA
and RGBX component orders. Add them, using the same 4CCs as DRM.
Signed-off-by: Laurent Pinchart <[email protected]>
Reviewed-by: Jacopo Mondi <[email protected]>
Acked-by: Sakari Ailus <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|
|
Currently, a CUSE server running on a 64-bit kernel can tell when an ioctl
request comes from a process running a 32-bit ABI, but cannot tell whether
the requesting process is using legacy IA32 emulation or x32 ABI. In
particular, the server does not know the size of the client process's
`time_t` type.
For 64-bit kernels, the `FUSE_IOCTL_COMPAT` and `FUSE_IOCTL_32BIT` flags
are currently set in the ioctl input request (`struct fuse_ioctl_in` member
`flags`) for a 32-bit requesting process. This patch defines a new flag
`FUSE_IOCTL_COMPAT_X32` and sets it if the 32-bit requesting process is
using the x32 ABI. This allows the server process to distinguish between
requests coming from client processes using IA32 emulation or the x32 ABI
and so infer the size of the client process's `time_t` type and any other
IA32/x32 differences.
Signed-off-by: Ian Abbott <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
Retroactively add changelog entry for the atime and mtime "now" flags.
This was an oversight in commit 17637cbaba59 ("fuse: improve utimes
support").
Signed-off-by: Alan Somers <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
This was a mistake in the comment in commit e0a43ddcc08c ("fuse: allow
umask processing in userspace").
Signed-off-by: Alan Somers <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
The FUSE_FSYNC_DATASYNC flag was introduced by commit b6aeadeda22a
("[PATCH] FUSE - file operations") as a magic number. No new values have
been added to fsync_flags since.
Signed-off-by: Alan Somers <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
Starting from commit 9c225f2655e3 ("vfs: atomic f_pos accesses as per
POSIX") files opened even via nonseekable_open gate read and write via lock
and do not allow them to be run simultaneously. This can create read vs
write deadlock if a filesystem is trying to implement a socket-like file
which is intended to be simultaneously used for both read and write from
filesystem client. See commit 10dce8af3422 ("fs: stream_open - opener for
stream-like files so that read and write can run simultaneously without
deadlock") for details and e.g. commit 581d21a2d02a ("xenbus: fix deadlock
on writes to /proc/xen/xenbus") for a similar deadlock example on
/proc/xen/xenbus.
To avoid such deadlock it was tempting to adjust fuse_finish_open to use
stream_open instead of nonseekable_open on just FOPEN_NONSEEKABLE flags,
but grepping through Debian codesearch shows users of FOPEN_NONSEEKABLE,
and in particular GVFS which actually uses offset in its read and write
handlers
https://codesearch.debian.net/search?q=-%3Enonseekable+%3D
https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1080
https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1247-1346
https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1399-1481
so if we would do such a change it will break a real user.
Add another flag (FOPEN_STREAM) for filesystem servers to indicate that the
opened handler is having stream-like semantics; does not use file position
and thus the kernel is free to issue simultaneous read and write request on
opened file handle.
This patch together with stream_open() should be added to stable kernels
starting from v3.14+. This will allow to patch OSSPD and other FUSE
filesystems that provide stream-like files to return FOPEN_STREAM |
FOPEN_NONSEEKABLE in open handler and this way avoid the deadlock on all
kernel versions. This should work because fuse_finish_open ignores unknown
open flags returned from a filesystem and so passing FOPEN_STREAM to a
kernel that is not aware of this flag cannot hurt. In turn the kernel that
is not aware of FOPEN_STREAM will be < v3.14 where just FOPEN_NONSEEKABLE
is sufficient to implement streams without read vs write deadlock.
Cc: [email protected] # v3.14+
Signed-off-by: Kirill Smelkov <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
On networked filesystems file data can be changed externally. FUSE
provides notification messages for filesystem to inform kernel that
metadata or data region of a file needs to be invalidated in local page
cache. That provides the basis for filesystem implementations to invalidate
kernel cache explicitly based on observed filesystem-specific events.
FUSE has also "automatic" invalidation mode(*) when the kernel
automatically invalidates data cache of a file if it sees mtime change. It
also automatically invalidates whole data cache of a file if it sees file
size being changed.
The automatic mode has corresponding capability - FUSE_AUTO_INVAL_DATA.
However, due to probably historical reason, that capability controls only
whether mtime change should be resulting in automatic invalidation or
not. A change in file size always results in invalidating whole data cache
of a file irregardless of whether FUSE_AUTO_INVAL_DATA was negotiated(+).
The filesystem I write[1] represents data arrays stored in networked
database as local files suitable for mmap. It is read-only filesystem -
changes to data are committed externally via database interfaces and the
filesystem only glues data into contiguous file streams suitable for mmap
and traditional array processing. The files are big - starting from
hundreds gigabytes and more. The files change regularly, and frequently by
data being appended to their end. The size of files thus changes
frequently.
If a file was accessed locally and some part of its data got into page
cache, we want that data to stay cached unless there is memory pressure, or
unless corresponding part of the file was actually changed. However current
FUSE behaviour - when it sees file size change - is to invalidate the whole
file. The data cache of the file is thus completely lost even on small size
change, and despite that the filesystem server is careful to accurately
translate database changes into FUSE invalidation messages to kernel.
Let's fix it: if a filesystem, through new FUSE_EXPLICIT_INVAL_DATA
capability, indicates to kernel that it is fully responsible for data cache
invalidation, then the kernel won't invalidate files data cache on size
change and only truncate that cache to new size in case the size decreased.
(*) see 72d0d248ca "fuse: add FUSE_AUTO_INVAL_DATA init flag",
eed2179efe "fuse: invalidate inode mapping if mtime changes"
(+) in writeback mode the kernel does not invalidate data cache on file
size change, but neither it allows the filesystem to set the size due to
external event (see 8373200b12 "fuse: Trust kernel i_size only")
[1] https://lab.nexedi.com/kirr/wendelin.core/blob/a50f1d9f/wcfs/wcfs.go#L20
Signed-off-by: Kirill Smelkov <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
This patch advertises the capability of two cpu feature called address
pointer authentication and generic pointer authentication. These
capabilities depend upon system support for pointer authentication and
VHE mode.
The current arm64 KVM partially implements pointer authentication and
support of address/generic authentication are tied together. However,
separate ABI requirements for both of them is added so that any future
isolated implementation will not require any ABI changes.
Signed-off-by: Amit Daniel Kachhap <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Christoffer Dall <[email protected]>
Cc: [email protected]
Signed-off-by: Marc Zyngier <[email protected]>
|
|
When nfsdcld was released, it was quickly deprecated in favor of the
nfsdcltrack usermodehelper, so as to not require another running daemon.
That prevents NFSv4 clients from reclaiming locks from nfsd's running in
containers, since neither nfsdcltrack nor the legacy client tracking
code work in containers.
This commit un-deprecates the use of nfsdcld, with one twist: we will
populate the reclaim_str_hashtbl on startup.
During client tracking initialization, do an upcall ("GraceStart") to
nfsdcld to get a list of clients from the database. nfsdcld will do
one downcall with a status of -EINPROGRESS for each client record in
the database, which in turn will cause an nfs4_client_reclaim to be
added to the reclaim_str_hashtbl. When complete, nfsdcld will do a
final downcall with a status of 0.
This will save nfsd from having to do an upcall to the daemon during
nfs4_check_open_reclaim() processing.
Even though nfsdcld was quickly deprecated, there is a very small chance
of old nfsdcld daemons running in the wild. These will respond to the
new "GraceStart" upcall with -EOPNOTSUPP, in which case we will log a
message and fall back to the original nfsdcld tracking ops (now called
nfsd4_cld_tracking_ops_v0).
Signed-off-by: Scott Mayhew <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
|
|
Add a region to the vfio-ccw device that can be used to submit
asynchronous I/O instructions. ssch continues to be handled by the
existing I/O region; the new region handles hsch and csch.
Interrupt status continues to be reported through the same channels
as for ssch.
Acked-by: Eric Farman <[email protected]>
Reviewed-by: Farhan Ali <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>
|
|
Allow to extend the regions used by vfio-ccw. The first user will be
handling of halt and clear subchannel.
Reviewed-by: Eric Farman <[email protected]>
Reviewed-by: Farhan Ali <[email protected]>
Signed-off-by: Cornelia Huck <[email protected]>
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v5.2:
UAPI Changes:
- Document which feature flags belong to which command in virtio_gpu.h
- Make the FB_DAMAGE_CLIPS available for atomic userspace only, it's useless for legacy.
Cross-subsystem Changes:
- Add device tree bindings for lg,acx467akm-7 panel and ST-Ericsson Multi Channel Display Engine MCDE
- Add parameters to the device tree bindings for tfp410
- iommu/io-pgtable: Add ARM Mali midgard MMU page table format
- dma-buf: Only do a 64-bits seqno compare when driver explicitly asks for it, else wraparound.
- Use the 64-bits compare for dma-fence-chains
Core Changes:
- Make the fb conversion functions use __iomem dst.
- Rename drm_client_add to drm_client_register
- Move intel_fb_initial_config to core.
- Add a drm_gem_objects_lookup helper
- Add drm_gem_fence_array helpers, and use it in lima.
- Add drm_format_helper.c to kerneldoc.
Driver Changes:
- Add panfrost driver for mali midgard/bitfrost.
- Converts bochs to use the simple display type.
- Small fixes to sun4i, tinydrm, ti-fp410.
- Fid aspeed's Kconfig options.
- Make some symbols/functions static in lima, sun4i and meson.
- Add a driver for the lg,acx467akm-7 panel.
Signed-off-by: Dave Airlie <[email protected]>
From: Maarten Lankhorst <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Alexei Starovoitov says:
====================
pull-request: bpf-next 2019-04-22
The following pull-request contains BPF updates for your *net-next* tree.
The main changes are:
1) allow stack/queue helpers from more bpf program types, from Alban.
2) allow parallel verification of root bpf programs, from Alexei.
3) introduce bpf sysctl hook for trusted root cases, from Andrey.
4) recognize var/datasec in btf deduplication, from Andrii.
5) cpumap performance optimizations, from Jesper.
6) verifier prep for alu32 optimization, from Jiong.
7) libbpf xsk cleanup, from Magnus.
8) other various fixes and cleanups.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
The timestamps in ir-keytable -t output showed that the Xbox DVD
IR dongle decodes scancodes every 64ms. The last scancode of a
longer button press is decodes 64ms after the last-but-one which
indicates the decoder doesn't use a timeout but decodes on the last
edge of the signal.
267.042629: lirc protocol(unknown): scancode = 0xace
267.042665: event type EV_MSC(0x04): scancode = 0xace
267.042665: event type EV_KEY(0x01) key_down: KEY_1(0x0002)
267.042665: event type EV_SYN(0x00).
267.106625: lirc protocol(unknown): scancode = 0xace
267.106643: event type EV_MSC(0x04): scancode = 0xace
267.106643: event type EV_SYN(0x00).
267.170623: lirc protocol(unknown): scancode = 0xace
267.170638: event type EV_MSC(0x04): scancode = 0xace
267.170638: event type EV_SYN(0x00).
267.234621: lirc protocol(unknown): scancode = 0xace
267.234636: event type EV_MSC(0x04): scancode = 0xace
267.234636: event type EV_SYN(0x00).
267.298623: lirc protocol(unknown): scancode = 0xace
267.298638: event type EV_MSC(0x04): scancode = 0xace
267.298638: event type EV_SYN(0x00).
267.543345: event type EV_KEY(0x01) key_down: KEY_1(0x0002)
267.543345: event type EV_SYN(0x00).
267.570015: event type EV_KEY(0x01) key_up: KEY_1(0x0002)
267.570015: event type EV_SYN(0x00).
Add a protocol with the repeat value and set the timeout in the
driver to 10ms (to have a bit of headroom for delays) so the Xbox
DVD remote performs more responsive.
Signed-off-by: Matthias Reichl <[email protected]>
Acked-by: Benjamin Valentin <[email protected]>
Signed-off-by: Sean Young <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|
|
Pull in v5.1-rc6 to resolve two conflicts. One is in BFQ, in just a
comment, and is trivial. The other one is a conflict due to a later fix
in the bio multi-page work, and needs a bit more care.
* tag 'v5.1-rc6': (770 commits)
Linux 5.1-rc6
block: make sure that bvec length can't be overflow
block: kill all_q_node in request_queue
x86/cpu/intel: Lower the "ENERGY_PERF_BIAS: Set to normal" message's log priority
coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping
mm/kmemleak.c: fix unused-function warning
init: initialize jump labels before command line option parsing
kernel/watchdog_hld.c: hard lockup message should end with a newline
kcov: improve CONFIG_ARCH_HAS_KCOV help text
mm: fix inactive list balancing between NUMA nodes and cgroups
mm/hotplug: treat CMA pages as unmovable
proc: fixup proc-pid-vm test
proc: fix map_files test on F29
mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n
mm/memory_hotplug: do not unlock after failing to take the device_hotplug_lock
mm: swapoff: shmem_unuse() stop eviction without igrab()
mm: swapoff: take notice of completion sooner
mm: swapoff: remove too limiting SWAP_UNUSE_MAX_TRIES
mm: swapoff: shmem_find_swap_entries() filter out other types
slab: store tagged freelist for off-slab slabmgmt
...
Signed-off-by: Jens Axboe <[email protected]>
|
|
This patch adds MEDIA_BUS_FMT_BGR888_3X8 used by STM MIPID02 CSI-2 to
PARALLEL bridge driver when input format is MEDIA_BUS_FMT_BGR888_1X24.
Signed-off-by: Mickael Guene <[email protected]>
Signed-off-by: Sakari Ailus <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|
|
Move PCM_CAPTURE, PCM_PLAYBACK, and CONTROL ALSA MEDIA_INTF_T* interface
types back into __KERNEL__ scope to get ready for adding ALSA support for
these to the media controller.
Signed-off-by: Shuah Khan <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|
|
Add following V4L2 QP parameters for H.264:
* V4L2_CID_MPEG_VIDEO_H264_I_FRAME_MIN_QP
* V4L2_CID_MPEG_VIDEO_H264_I_FRAME_MAX_QP
* V4L2_CID_MPEG_VIDEO_H264_P_FRAME_MIN_QP
* V4L2_CID_MPEG_VIDEO_H264_P_FRAME_MAX_QP
These controls will limit QP range for intra and inter frame,
provide more manual control to improve video encode quality.
Signed-off-by: Fish Lin <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|
|
We want the serial/tty fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
We want the fixes, and this resolves a merge error in the fastrpc
driver.
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
When using TIPC_SOCK_RECVQ_DEPTH for getsockopt(), it returns the
number of buffers in receive socket buffer which is not so helpful
for user space applications.
This commit introduces the new option TIPC_SOCK_RECVQ_USED which
returns the current allocated bytes of the receive socket buffer.
This helps user space applications dimension its buffer usage to
avoid buffer overload issue.
Signed-off-by: Tung Nguyen <[email protected]>
Acked-by: Jon Maloy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The 'timeval' and 'timespec' data structures used for socket timestamps
are going to be redefined in user space based on 64-bit time_t in future
versions of the C library to deal with the y2038 overflow problem,
which breaks the ABI definition.
Unlike many modern ioctl commands, SIOCGSTAMP and SIOCGSTAMPNS do not
use the _IOR() macro to encode the size of the transferred data, so it
remains ambiguous whether the application uses the old or new layout.
The best workaround I could find is rather ugly: we redefine the command
code based on the size of the respective data structure with a ternary
operator. This lets it get evaluated as late as possible, hopefully after
that structure is visible to the caller. We cannot use an #ifdef here,
because inux/sockios.h might have been included before any libc header
that could determine the size of time_t.
The ioctl implementation now interprets the new command codes as always
referring to the 64-bit structure on all architectures, while the old
architecture specific command code still refers to the old architecture
specific layout. The new command number is only used when they are
actually different.
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In the case of vlan filtering on bridges, the bridge may also have the
corresponding vlan devices as upper devices. Currently the link state
of vlan devices is transferred from the lower device. So this is up if
the bridge is in admin up state and there is at least one bridge port
that is up, regardless of the vlan that the port is a member of.
The link state of the vlan device may need to track only the state of
the subset of ports that are also members of the corresponding vlan,
rather than that of all ports.
Add a flag to specify a vlan bridge binding mode, by which the link
state is no longer automatically transferred from the lower device,
but is instead determined by the bridge ports that are members of the
vlan.
Signed-off-by: Mike Manning <[email protected]>
Acked-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
- several new key mappings for HID
- a host of new ACPI IDs used to identify Elan touchpads in Lenovo
laptops
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: snvs_pwrkey - initialize necessary driver data before enabling IRQ
HID: input: add mapping for "Toggle Display" key
HID: input: add mapping for "Full Screen" key
HID: input: add mapping for keyboard Brightness Up/Down/Toggle keys
HID: input: add mapping for Expose/Overview key
HID: input: fix mapping of aspect ratio key
[media] doc-rst: switch to new names for Full Screen/Aspect keys
Input: document meanings of KEY_SCREEN and KEY_ZOOM
Input: elan_i2c - add hardware ID for multiple Lenovo laptops
|
|
To make ICMPv6 closer to ICMPv4, add ratemask parameter. Since the ICMP
message types use larger numeric values, a simple bitmask doesn't fit.
I use large bitmap. The input and output are the in form of list of
ranges. Set the default to rate limit all error messages but Packet Too
Big. For Packet Too Big, use ratemask instead of hard-coded.
There are functions where icmpv6_xrlim_allow() and icmpv6_global_allow()
aren't called. This patch only adds them to icmpv6_echo_reply().
Rate limiting error messages is mandated by RFC 4443 but RFC 4890 says
that it is also acceptable to rate limit informational messages. Thus,
I removed the current hard-coded behavior of icmpv6_mask_allow() that
doesn't rate limit informational messages.
v2: Add dummy function proc_do_large_bitmap() if CONFIG_PROC_SYSCTL
isn't defined, expand the description in ip-sysctl.txt and remove
unnecessary conditional before kfree().
v3: Inline the bitmap instead of dynamically allocated. Still is a
pointer to it is needed because of the way proc_do_large_bitmap work.
Signed-off-by: Stephen Suryaputra <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The Switchtec devices supports two PCIe Function Frameworks (PFFs) per
upstream port (one for the port itself and one for the management endoint),
and each PFF may have up to 255 ports. Previously the driver only
supported 48 of those ports, and the SWITCHTEC_IOCTL_EVENT_SUMMARY ioctl
only returned information about those 48.
Increase SWITCHTEC_MAX_PFF_CSR from 48 to 255 so the driver supports all
255 possible ports.
Rename SWITCHTEC_IOCTL_EVENT_SUMMARY and associated struct
switchtec_ioctl_event_summary to SWITCHTEC_IOCTL_EVENT_SUMMARY_LEGACY and
switchtec_ioctl_event_summary_legacy with so existing applications work
unchanged, supporting up to 48 ports.
Add replacement SWITCHTEC_IOCTL_EVENT_SUMMARY and struct
switchtec_ioctl_event_summary that new and recompiled applications support
up to 255 ports.
Signed-off-by: Wesley Sheng <[email protected]>
[bhelgaas: changelog]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Logan Gunthorpe <[email protected]>
|
|
The "Enhanced Allocation (EA) for Memory and I/O Resources" ECN, approved
23 October 2014, sec 6.9.1.2, specifies a second DW in the capability for
type 1 (bridge) functions to describe fixed secondary and subordinate bus
numbers. This ECN was included in the PCIe r4.0 spec, but sec 6.9.1.2 was
omitted, presumably by mistake.
Read fixed bus numbers from the EA capability for bridges.
Signed-off-by: Subbaraya Sundeep <[email protected]>
[bhelgaas: add pci_ea_fixed_busnrs() return value]
Signed-off-by: Bjorn Helgaas <[email protected]>
|