aboutsummaryrefslogtreecommitdiff
path: root/include/uapi/linux
AgeCommit message (Collapse)AuthorFilesLines
2019-05-01taprio: Add support for cycle-time-extensionVinicius Costa Gomes1-0/+1
IEEE 802.1Q-2018 defines the concept of a cycle-time-extension, so the last entry of a schedule before the start of a new schedule can be extended, so "too-short" entries can be avoided. Signed-off-by: Vinicius Costa Gomes <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-01taprio: Add support for setting the cycle-time manuallyVinicius Costa Gomes1-0/+1
IEEE 802.1Q-2018 defines that a the cycle-time of a schedule may be overridden, so the schedule is truncated to a determined "width". Signed-off-by: Vinicius Costa Gomes <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-01taprio: Add support adding an admin scheduleVinicius Costa Gomes1-0/+11
The IEEE 802.1Q-2018 defines two "types" of schedules, the "Oper" (from operational?) and "Admin" ones. Up until now, 'taprio' only had support for the "Oper" one, added when the qdisc is created. This adds support for the "Admin" one, which allows the .change() operation to be supported. Just for clarification, some quick (and dirty) definitions, the "Oper" schedule is the currently (as in this instant) running one, and it's read-only. The "Admin" one is the one that the system configurator has installed, it can be changed, and it will be "promoted" to "Oper" when it's 'base-time' is reached. The idea behing this patch is that calling something like the below, (after taprio is already configured with an initial schedule): $ tc qdisc change taprio dev IFACE parent root \ base-time X \ sched-entry <CMD> <GATES> <INTERVAL> \ ... Will cause a new admin schedule to be created and programmed to be "promoted" to "Oper" at instant X. If an "Admin" schedule already exists, it will be overwritten with the new parameters. Up until now, there was some code that was added to ease the support of changing a single entry of a schedule, but was ultimately unused. Now, that we have support for "change" with more well thought semantics, updating a single entry seems to be less useful. So we remove what is in practice dead code, and return a "not supported" error if the user tries to use it. If changing a single entry would make the user's life easier we may ressurrect this idea, but at this point, removing it simplifies the code. For now, only the schedule specific bits are allowed to be added for a new schedule, that means that 'clockid', 'num_tc', 'map' and 'queues' cannot be modified. Example: $ tc qdisc change dev IFACE parent root handle 100 taprio \ base-time $BASE_TIME \ sched-entry S 00 500000 \ sched-entry S 0f 500000 \ clockid CLOCK_TAI The only change in the netlink API introduced by this change is the introduction of an "admin" type in the response to a dump request, that type allows userspace to separate the "oper" schedule from the "admin" schedule. If userspace doesn't support the "admin" type, it will only display the "oper" schedule. Signed-off-by: Vinicius Costa Gomes <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-30sed-opal.h: remove redundant licence boilerplateChristoph Hellwig1-9/+0
The file already has the correct SPDX header. Reviewed-by: Chaitanya Kulkarni <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-04-30netfilter: nft_ct: Add ct id supportBrett Mastbergen1-0/+2
The 'id' key returns the unique id of the conntrack entry as returned by nf_ct_get_id(). Signed-off-by: Brett Mastbergen <[email protected]> Acked-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2019-04-30KVM: PPC: Book3S HV: XIVE: Introduce a new capability KVM_CAP_PPC_IRQ_XIVECédric Le Goater1-0/+1
The user interface exposes a new capability KVM_CAP_PPC_IRQ_XIVE to let QEMU connect the vCPU presenters to the XIVE KVM device if required. The capability is not advertised for now as the full support for the XIVE native exploitation mode is not yet available. When this is case, the capability will be advertised on PowerNV Hypervisors only. Nested guests (pseries KVM Hypervisor) are not supported. Internally, the interface to the new KVM device is protected with a new interrupt mode: KVMPPC_IRQ_XIVE. Signed-off-by: Cédric Le Goater <[email protected]> Reviewed-by: David Gibson <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2019-04-30KVM: PPC: Book3S HV: Add a new KVM device for the XIVE native exploitation modeCédric Le Goater1-0/+2
This is the basic framework for the new KVM device supporting the XIVE native exploitation mode. The user interface exposes a new KVM device to be created by QEMU, only available when running on a L0 hypervisor. Support for nested guests is not available yet. The XIVE device reuses the device structure of the XICS-on-XIVE device as they have a lot in common. That could possibly change in the future if the need arise. Signed-off-by: Cédric Le Goater <[email protected]> Reviewed-by: David Gibson <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2019-04-29btrfs: use common file type conversionPhillip Potter1-0/+2
Deduplicate the btrfs file type conversion implementation - file systems that use the same file types as defined by POSIX do not need to define their own versions and can use the common helper functions decared in fs_types.h and implemented in fs_types.c Common implementation can be found via commit: bbe7449e2599 "fs: common implementation of file type" Reviewed-by: Jan Kara <[email protected]> Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Phillip Potter <[email protected]> Reviewed-by: David Sterba <[email protected]> Signed-off-by: David Sterba <[email protected]>
2019-04-29tty: serial: add driver for the SiFive UARTPaul Walmsley1-0/+3
Add a serial driver for the SiFive UART, found on SiFive FU540 devices (among others). The underlying serial IP block is relatively basic, and currently does not support serial break detection. Further information on the IP block can be found in the documentation and Chisel sources: https://static.dev.sifive.com/FU540-C000-v1.0.pdf https://github.com/sifive/sifive-blocks/tree/master/src/main/scala/devices/uart This driver was written in collaboration with Wesley Terpstra <[email protected]>. Tested on a SiFive HiFive Unleashed A00 board, using BBL and the open- source FSBL (using a DT file based on what's targeted for mainline). This revision incorporates changes based on comments by Julia Lawall <[email protected]>, Emil Renner Berthing <[email protected]>, and Andreas Schwab <[email protected]>. Thanks also to Andreas for testing the driver with his userspace and reporting a bug with the set_termios implementation. Signed-off-by: Paul Walmsley <[email protected]> Signed-off-by: Paul Walmsley <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Jiri Slaby <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Wesley Terpstra <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Julia Lawall <[email protected]> Cc: Emil Renner Berthing <[email protected]> Cc: Andreas Schwab <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-04-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller1-1/+44
Daniel Borkmann says: ==================== pull-request: bpf-next 2019-04-28 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Introduce BPF socket local storage map so that BPF programs can store private data they associate with a socket (instead of e.g. separate hash table), from Martin. 2) Add support for bpftool to dump BTF types. This is done through a new `bpftool btf dump` sub-command, from Andrii. 3) Enable BPF-based flow dissector for skb-less eth_get_headlen() calls which was currently not supported since skb was used to lookup netns, from Stanislav. 4) Add an opt-in interface for tracepoints to expose a writable context for attached BPF programs, used here for NBD sockets, from Matt. 5) BPF xadd related arm64 JIT fixes and scalability improvements, from Daniel. 6) Change the skb->protocol for bpf_skb_adjust_room() helper in order to support tunnels such as sit. Add selftests as well, from Willem. 7) Various smaller misc fixes. ==================== Signed-off-by: David S. Miller <[email protected]>
2019-04-27bpf: Introduce bpf sk local storageMartin KaFai Lau1-1/+43
After allowing a bpf prog to - directly read the skb->sk ptr - get the fullsock bpf_sock by "bpf_sk_fullsock()" - get the bpf_tcp_sock by "bpf_tcp_sock()" - get the listener sock by "bpf_get_listener_sock()" - avoid duplicating the fields of "(bpf_)sock" and "(bpf_)tcp_sock" into different bpf running context. this patch is another effort to make bpf's network programming more intuitive to do (together with memory and performance benefit). When bpf prog needs to store data for a sk, the current practice is to define a map with the usual 4-tuples (src/dst ip/port) as the key. If multiple bpf progs require to store different sk data, multiple maps have to be defined. Hence, wasting memory to store the duplicated keys (i.e. 4 tuples here) in each of the bpf map. [ The smallest key could be the sk pointer itself which requires some enhancement in the verifier and it is a separate topic. ] Also, the bpf prog needs to clean up the elem when sk is freed. Otherwise, the bpf map will become full and un-usable quickly. The sk-free tracking currently could be done during sk state transition (e.g. BPF_SOCK_OPS_STATE_CB). The size of the map needs to be predefined which then usually ended-up with an over-provisioned map in production. Even the map was re-sizable, while the sk naturally come and go away already, this potential re-size operation is arguably redundant if the data can be directly connected to the sk itself instead of proxy-ing through a bpf map. This patch introduces sk->sk_bpf_storage to provide local storage space at sk for bpf prog to use. The space will be allocated when the first bpf prog has created data for this particular sk. The design optimizes the bpf prog's lookup (and then optionally followed by an inline update). bpf_spin_lock should be used if the inline update needs to be protected. BPF_MAP_TYPE_SK_STORAGE: ----------------------- To define a bpf "sk-local-storage", a BPF_MAP_TYPE_SK_STORAGE map (new in this patch) needs to be created. Multiple BPF_MAP_TYPE_SK_STORAGE maps can be created to fit different bpf progs' needs. The map enforces BTF to allow printing the sk-local-storage during a system-wise sk dump (e.g. "ss -ta") in the future. The purpose of a BPF_MAP_TYPE_SK_STORAGE map is not for lookup/update/delete a "sk-local-storage" data from a particular sk. Think of the map as a meta-data (or "type") of a "sk-local-storage". This particular "type" of "sk-local-storage" data can then be stored in any sk. The main purposes of this map are mostly: 1. Define the size of a "sk-local-storage" type. 2. Provide a similar syscall userspace API as the map (e.g. lookup/update, map-id, map-btf...etc.) 3. Keep track of all sk's storages of this "type" and clean them up when the map is freed. sk->sk_bpf_storage: ------------------ The main lookup/update/delete is done on sk->sk_bpf_storage (which is a "struct bpf_sk_storage"). When doing a lookup, the "map" pointer is now used as the "key" to search on the sk_storage->list. The "map" pointer is actually serving as the "type" of the "sk-local-storage" that is being requested. To allow very fast lookup, it should be as fast as looking up an array at a stable-offset. At the same time, it is not ideal to set a hard limit on the number of sk-local-storage "type" that the system can have. Hence, this patch takes a cache approach. The last search result from sk_storage->list is cached in sk_storage->cache[] which is a stable sized array. Each "sk-local-storage" type has a stable offset to the cache[] array. In the future, a map's flag could be introduced to do cache opt-out/enforcement if it became necessary. The cache size is 16 (i.e. 16 types of "sk-local-storage"). Programs can share map. On the program side, having a few bpf_progs running in the networking hotpath is already a lot. The bpf_prog should have already consolidated the existing sock-key-ed map usage to minimize the map lookup penalty. 16 has enough runway to grow. All sk-local-storage data will be removed from sk->sk_bpf_storage during sk destruction. bpf_sk_storage_get() and bpf_sk_storage_delete(): ------------------------------------------------ Instead of using bpf_map_(lookup|update|delete)_elem(), the bpf prog needs to use the new helper bpf_sk_storage_get() and bpf_sk_storage_delete(). The verifier can then enforce the ARG_PTR_TO_SOCKET argument. The bpf_sk_storage_get() also allows to "create" new elem if one does not exist in the sk. It is done by the new BPF_SK_STORAGE_GET_F_CREATE flag. An optional value can also be provided as the initial value during BPF_SK_STORAGE_GET_F_CREATE. The BPF_MAP_TYPE_SK_STORAGE also supports bpf_spin_lock. Together, it has eliminated the potential use cases for an equivalent bpf_map_update_elem() API (for bpf_prog) in this patch. Misc notes: ---------- 1. map_get_next_key is not supported. From the userspace syscall perspective, the map has the socket fd as the key while the map can be shared by pinned-file or map-id. Since btf is enforced, the existing "ss" could be enhanced to pretty print the local-storage. Supporting a kernel defined btf with 4 tuples as the return key could be explored later also. 2. The sk->sk_lock cannot be acquired. Atomic operations is used instead. e.g. cmpxchg is done on the sk->sk_bpf_storage ptr. Please refer to the source code comments for the details in synchronization cases and considerations. 3. The mem is charged to the sk->sk_omem_alloc as the sk filter does. Benchmark: --------- Here is the benchmark data collected by turning on the "kernel.bpf_stats_enabled" sysctl. Two bpf progs are tested: One bpf prog with the usual bpf hashmap (max_entries = 8192) with the sk ptr as the key. (verifier is modified to support sk ptr as the key That should have shortened the key lookup time.) Another bpf prog is with the new BPF_MAP_TYPE_SK_STORAGE. Both are storing a "u32 cnt", do a lookup on "egress_skb/cgroup" for each egress skb and then bump the cnt. netperf is used to drive data with 4096 connected UDP sockets. BPF_MAP_TYPE_HASH with a modifier verifier (152ns per bpf run) 27: cgroup_skb name egress_sk_map tag 74f56e832918070b run_time_ns 58280107540 run_cnt 381347633 loaded_at 2019-04-15T13:46:39-0700 uid 0 xlated 344B jited 258B memlock 4096B map_ids 16 btf_id 5 BPF_MAP_TYPE_SK_STORAGE in this patch (66ns per bpf run) 30: cgroup_skb name egress_sk_stora tag d4aa70984cc7bbf6 run_time_ns 25617093319 run_cnt 390989739 loaded_at 2019-04-15T13:47:54-0700 uid 0 xlated 168B jited 156B memlock 4096B map_ids 17 btf_id 6 Here is a high-level picture on how are the objects organized: sk ┌──────┐ │ │ │ │ │ │ │*sk_bpf_storage─────▶ bpf_sk_storage └──────┘ ┌───────┐ ┌───────────┤ list │ │ │ │ │ │ │ │ │ │ │ └───────┘ │ │ elem │ ┌────────┐ ├─▶│ snode │ │ ├────────┤ │ │ data │ bpf_map │ ├────────┤ ┌─────────┐ │ │map_node│◀─┬─────┤ list │ │ └────────┘ │ │ │ │ │ │ │ │ elem │ │ │ │ ┌────────┐ │ └─────────┘ └─▶│ snode │ │ ├────────┤ │ bpf_map │ data │ │ ┌─────────┐ ├────────┤ │ │ list ├───────▶│map_node│ │ │ │ └────────┘ │ │ │ │ │ │ elem │ └─────────┘ ┌────────┐ │ ┌─▶│ snode │ │ │ ├────────┤ │ │ │ data │ │ │ ├────────┤ │ │ │map_node│◀─┘ │ └────────┘ │ │ │ ┌───────┐ sk └──────────│ list │ ┌──────┐ │ │ │ │ │ │ │ │ │ │ │ │ └───────┘ │*sk_bpf_storage───────▶bpf_sk_storage └──────┘ Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2019-04-26bpf: add writable context for raw tracepointsMatt Mullins1-0/+1
This is an opt-in interface that allows a tracepoint to provide a safe buffer that can be written from a BPF_PROG_TYPE_RAW_TRACEPOINT program. The size of the buffer must be a compile-time constant, and is checked before allowing a BPF program to attach to a tracepoint that uses this feature. The pointer to this buffer will be the first argument of tracepoints that opt in; the pointer is valid and can be bpf_probe_read() by both BPF_PROG_TYPE_RAW_TRACEPOINT and BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE programs that attach to such a tracepoint, but the buffer to which it points may only be written by the latter. Signed-off-by: Matt Mullins <[email protected]> Acked-by: Yonghong Song <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2019-04-26Input: add KEY_KBD_LAYOUT_NEXTDmitry Torokhov1-0/+1
The HID usage tables define a key to cycle through a set of keyboard layouts, let's add corresponding keycode. Signed-off-by: Dmitry Torokhov <[email protected]>
2019-04-26Merge tag 'mac80211-next-for-davem-2019-04-26' of ↵David S. Miller1-2/+84
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== Various updates, notably: * extended key ID support (from 802.11-2016) * per-STA TX power control support * mac80211 TX performance improvements * HE (802.11ax) updates * mesh link probing support * enhancements of multi-BSSID support (also related to HE) * OWE userspace processing support ==================== Signed-off-by: David S. Miller <[email protected]>
2019-04-26cfg80211: add support to probe unexercised mesh linkRajkumar Manoharan1-0/+17
Adding support to allow mesh HWMP to measure link metrics on unexercised direct mesh path by sending some data frames to other mesh points which are not currently selected as a primary traffic path but only 1 hop away. The absence of the primary path to the chosen node makes it necessary to apply some form of marking on a chosen packet stream so that the packets can be properly steered to the selected node for testing, and not by the regular mesh path lookup. Tested-by: Pradeep Kumar Chitrapu <[email protected]> Signed-off-by: Rajkumar Manoharan <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2019-04-26cfg80211: Add support to set tx power for a station associatedAshok Raj Nagarajan1-0/+15
This patch adds support to set transmit power setting type and transmit power level attributes to NL80211_CMD_SET_STATION in order to facilitate adjusting the transmit power level of a station associated to the AP. The added attributes allow selection of automatic and limited transmit power level, with the level defined in dBm format. Co-developed-by: Balaji Pothunoori <[email protected]> Signed-off-by: Ashok Raj Nagarajan <[email protected]> Signed-off-by: Balaji Pothunoori <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2019-04-26nl80211/cfg80211: Extended Key ID supportAlexander Wetzel1-0/+28
Add support for IEEE 802.11-2016 "Extended Key ID for Individually Addressed Frames". Extend cfg80211 and nl80211 to allow pairwise keys to be installed for Rx only, enable Tx separately and allow Key ID 1 for pairwise keys. Signed-off-by: Alexander Wetzel <[email protected]> [use NLA_POLICY_RANGE() for NL80211_KEY_MODE] Signed-off-by: Johannes Berg <[email protected]>
2019-04-26nl80211: increase NL80211_MAX_SUPP_REG_RULESShaul Triebitz1-2/+2
The iwlwifi driver creates one rule per channel, thus it needs more rules than normal. To solve this, increase NL80211_MAX_SUPP_REG_RULES so iwlwifi can also fit UHB (ultra high band) channels. Signed-off-by: Shaul Triebitz <[email protected]> Signed-off-by: Luca Coelho <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2019-04-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-2/+4
Two easy cases of overlapping changes. Signed-off-by: David S. Miller <[email protected]>
2019-04-25NFS: Move internal constants out of uapi/linux/nfs_mount.hTrond Myklebust1-9/+0
When the label says "for internal use only", then it doesn't belong in the 'uapi' subtree. Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2019-04-25drivers/misc: Add Aspeed P2A control driverPatrick Venture1-0/+62
The ASPEED AST2400, and AST2500 in some configurations include a PCI-to-AHB MMIO bridge. This bridge allows a server to read and write in the BMC's physical address space. This feature is especially useful when using this bridge to send large files to the BMC. The host may use this to send down a firmware image by staging data at a specific memory address, and in a coordinated effort with the BMC's software stack and kernel, transmit the bytes. This driver enables the BMC to unlock the PCI bridge on demand, and configure it via ioctl to allow the host to write bytes to an agreed upon location. In the primary use-case, the region to use is known apriori on the BMC, and the host requests this information. Once this request is received, the BMC's software stack will enable the bridge and the region and then using some software flow control (possibly via IPMI packets), copy the bytes down. Once the process is complete, the BMC will disable the bridge and unset any region involved. The default behavior of this bridge when present is: enabled and all regions marked read-write. This driver will fix the regions to be read-only and then disable the bridge entirely. The memory regions protected are: * BMC flash MMIO window * System flash MMIO windows * SOC IO (peripheral MMIO) * DRAM The DRAM region itself is all of DRAM and cannot be further specified. Once the PCI bridge is enabled, the host can read all of DRAM, and if the DRAM section is write-enabled, then it can write to all of it. Signed-off-by: Patrick Venture <[email protected]> Reviewed-by: Andrew Jeffery <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-04-25media: v4l: Add definitions for missing 16-bit RGB555 formatsLaurent Pinchart1-0/+6
The V4L2 API is missing the 16-bit RGB555 formats for the RGBA, RGBX, ABGR, XBGR, BGRA and BGRX component orders. Add them, using the same 4CCs as DRM. Signed-off-by: Laurent Pinchart <[email protected]> Acked-by: Sakari Ailus <[email protected]> Reviewed-by: Jacopo Mondi <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2019-04-25media: v4l: Add definitions for missing 16-bit RGB4444 formatsLaurent Pinchart1-0/+6
The V4L2 API is missing the 16-bit RGB4444 formats for the RGBA, RGBX, ABGR, XBGR, BGRA and BGRX component orders. Add them, using the same 4CCs as DRM. Signed-off-by: Laurent Pinchart <[email protected]> Reviewed-by: Jacopo Mondi <[email protected]> Acked-by: Sakari Ailus <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2019-04-25media: v4l: Add definitions for missing 32-bit RGB formatsLaurent Pinchart1-0/+4
The V4L2 API is missing the 32-bit RGB formats for the ABGR, XBGR, RGBA and RGBX component orders. Add them, using the same 4CCs as DRM. Signed-off-by: Laurent Pinchart <[email protected]> Reviewed-by: Jacopo Mondi <[email protected]> Acked-by: Sakari Ailus <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2019-04-24fuse: Add ioctl flag for x32 compat ioctlIan Abbott1-0/+3
Currently, a CUSE server running on a 64-bit kernel can tell when an ioctl request comes from a process running a 32-bit ABI, but cannot tell whether the requesting process is using legacy IA32 emulation or x32 ABI. In particular, the server does not know the size of the client process's `time_t` type. For 64-bit kernels, the `FUSE_IOCTL_COMPAT` and `FUSE_IOCTL_32BIT` flags are currently set in the ioctl input request (`struct fuse_ioctl_in` member `flags`) for a 32-bit requesting process. This patch defines a new flag `FUSE_IOCTL_COMPAT_X32` and sets it if the 32-bit requesting process is using the x32 ABI. This allows the server process to distinguish between requests coming from client processes using IA32 emulation or the x32 ABI and so infer the size of the client process's `time_t` type and any other IA32/x32 differences. Signed-off-by: Ian Abbott <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2019-04-24fuse: fix changelog entry for protocol 7.9Alan Somers1-0/+1
Retroactively add changelog entry for the atime and mtime "now" flags. This was an oversight in commit 17637cbaba59 ("fuse: improve utimes support"). Signed-off-by: Alan Somers <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2019-04-24fuse: fix changelog entry for protocol 7.12Alan Somers1-1/+1
This was a mistake in the comment in commit e0a43ddcc08c ("fuse: allow umask processing in userspace"). Signed-off-by: Alan Somers <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2019-04-24fuse: document fuse_fsync_in.fsync_flagsAlan Somers1-0/+7
The FUSE_FSYNC_DATASYNC flag was introduced by commit b6aeadeda22a ("[PATCH] FUSE - file operations") as a magic number. No new values have been added to fsync_flags since. Signed-off-by: Alan Somers <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2019-04-24fuse: Add FOPEN_STREAM to use stream_open()Kirill Smelkov1-0/+2
Starting from commit 9c225f2655e3 ("vfs: atomic f_pos accesses as per POSIX") files opened even via nonseekable_open gate read and write via lock and do not allow them to be run simultaneously. This can create read vs write deadlock if a filesystem is trying to implement a socket-like file which is intended to be simultaneously used for both read and write from filesystem client. See commit 10dce8af3422 ("fs: stream_open - opener for stream-like files so that read and write can run simultaneously without deadlock") for details and e.g. commit 581d21a2d02a ("xenbus: fix deadlock on writes to /proc/xen/xenbus") for a similar deadlock example on /proc/xen/xenbus. To avoid such deadlock it was tempting to adjust fuse_finish_open to use stream_open instead of nonseekable_open on just FOPEN_NONSEEKABLE flags, but grepping through Debian codesearch shows users of FOPEN_NONSEEKABLE, and in particular GVFS which actually uses offset in its read and write handlers https://codesearch.debian.net/search?q=-%3Enonseekable+%3D https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1080 https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1247-1346 https://gitlab.gnome.org/GNOME/gvfs/blob/1.40.0-6-gcbc54396/client/gvfsfusedaemon.c#L1399-1481 so if we would do such a change it will break a real user. Add another flag (FOPEN_STREAM) for filesystem servers to indicate that the opened handler is having stream-like semantics; does not use file position and thus the kernel is free to issue simultaneous read and write request on opened file handle. This patch together with stream_open() should be added to stable kernels starting from v3.14+. This will allow to patch OSSPD and other FUSE filesystems that provide stream-like files to return FOPEN_STREAM | FOPEN_NONSEEKABLE in open handler and this way avoid the deadlock on all kernel versions. This should work because fuse_finish_open ignores unknown open flags returned from a filesystem and so passing FOPEN_STREAM to a kernel that is not aware of this flag cannot hurt. In turn the kernel that is not aware of FOPEN_STREAM will be < v3.14 where just FOPEN_NONSEEKABLE is sufficient to implement streams without read vs write deadlock. Cc: [email protected] # v3.14+ Signed-off-by: Kirill Smelkov <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2019-04-24fuse: allow filesystems to have precise control over data cacheKirill Smelkov1-1/+6
On networked filesystems file data can be changed externally. FUSE provides notification messages for filesystem to inform kernel that metadata or data region of a file needs to be invalidated in local page cache. That provides the basis for filesystem implementations to invalidate kernel cache explicitly based on observed filesystem-specific events. FUSE has also "automatic" invalidation mode(*) when the kernel automatically invalidates data cache of a file if it sees mtime change. It also automatically invalidates whole data cache of a file if it sees file size being changed. The automatic mode has corresponding capability - FUSE_AUTO_INVAL_DATA. However, due to probably historical reason, that capability controls only whether mtime change should be resulting in automatic invalidation or not. A change in file size always results in invalidating whole data cache of a file irregardless of whether FUSE_AUTO_INVAL_DATA was negotiated(+). The filesystem I write[1] represents data arrays stored in networked database as local files suitable for mmap. It is read-only filesystem - changes to data are committed externally via database interfaces and the filesystem only glues data into contiguous file streams suitable for mmap and traditional array processing. The files are big - starting from hundreds gigabytes and more. The files change regularly, and frequently by data being appended to their end. The size of files thus changes frequently. If a file was accessed locally and some part of its data got into page cache, we want that data to stay cached unless there is memory pressure, or unless corresponding part of the file was actually changed. However current FUSE behaviour - when it sees file size change - is to invalidate the whole file. The data cache of the file is thus completely lost even on small size change, and despite that the filesystem server is careful to accurately translate database changes into FUSE invalidation messages to kernel. Let's fix it: if a filesystem, through new FUSE_EXPLICIT_INVAL_DATA capability, indicates to kernel that it is fully responsible for data cache invalidation, then the kernel won't invalidate files data cache on size change and only truncate that cache to new size in case the size decreased. (*) see 72d0d248ca "fuse: add FUSE_AUTO_INVAL_DATA init flag", eed2179efe "fuse: invalidate inode mapping if mtime changes" (+) in writeback mode the kernel does not invalidate data cache on file size change, but neither it allows the filesystem to set the size due to external event (see 8373200b12 "fuse: Trust kernel i_size only") [1] https://lab.nexedi.com/kirr/wendelin.core/blob/a50f1d9f/wcfs/wcfs.go#L20 Signed-off-by: Kirill Smelkov <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2019-04-24KVM: arm64: Add capability to advertise ptrauth for guestAmit Daniel Kachhap1-0/+2
This patch advertises the capability of two cpu feature called address pointer authentication and generic pointer authentication. These capabilities depend upon system support for pointer authentication and VHE mode. The current arm64 KVM partially implements pointer authentication and support of address/generic authentication are tied together. However, separate ABI requirements for both of them is added so that any future isolated implementation will not require any ABI changes. Signed-off-by: Amit Daniel Kachhap <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Christoffer Dall <[email protected]> Cc: [email protected] Signed-off-by: Marc Zyngier <[email protected]>
2019-04-24nfsd: un-deprecate nfsdcldScott Mayhew1-0/+1
When nfsdcld was released, it was quickly deprecated in favor of the nfsdcltrack usermodehelper, so as to not require another running daemon. That prevents NFSv4 clients from reclaiming locks from nfsd's running in containers, since neither nfsdcltrack nor the legacy client tracking code work in containers. This commit un-deprecates the use of nfsdcld, with one twist: we will populate the reclaim_str_hashtbl on startup. During client tracking initialization, do an upcall ("GraceStart") to nfsdcld to get a list of clients from the database. nfsdcld will do one downcall with a status of -EINPROGRESS for each client record in the database, which in turn will cause an nfs4_client_reclaim to be added to the reclaim_str_hashtbl. When complete, nfsdcld will do a final downcall with a status of 0. This will save nfsd from having to do an upcall to the daemon during nfs4_check_open_reclaim() processing. Even though nfsdcld was quickly deprecated, there is a very small chance of old nfsdcld daemons running in the wild. These will respond to the new "GraceStart" upcall with -EOPNOTSUPP, in which case we will log a message and fall back to the original nfsdcld tracking ops (now called nfsd4_cld_tracking_ops_v0). Signed-off-by: Scott Mayhew <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2019-04-24vfio-ccw: add handling for async channel instructionsCornelia Huck2-0/+14
Add a region to the vfio-ccw device that can be used to submit asynchronous I/O instructions. ssch continues to be handled by the existing I/O region; the new region handles hsch and csch. Interrupt status continues to be reported through the same channels as for ssch. Acked-by: Eric Farman <[email protected]> Reviewed-by: Farhan Ali <[email protected]> Signed-off-by: Cornelia Huck <[email protected]>
2019-04-24vfio-ccw: add capabilities chainCornelia Huck1-0/+2
Allow to extend the regions used by vfio-ccw. The first user will be handling of halt and clear subchannel. Reviewed-by: Eric Farman <[email protected]> Reviewed-by: Farhan Ali <[email protected]> Signed-off-by: Cornelia Huck <[email protected]>
2019-04-24Merge tag 'drm-misc-next-2019-04-18' of ↵Dave Airlie1-2/+10
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for v5.2: UAPI Changes: - Document which feature flags belong to which command in virtio_gpu.h - Make the FB_DAMAGE_CLIPS available for atomic userspace only, it's useless for legacy. Cross-subsystem Changes: - Add device tree bindings for lg,acx467akm-7 panel and ST-Ericsson Multi Channel Display Engine MCDE - Add parameters to the device tree bindings for tfp410 - iommu/io-pgtable: Add ARM Mali midgard MMU page table format - dma-buf: Only do a 64-bits seqno compare when driver explicitly asks for it, else wraparound. - Use the 64-bits compare for dma-fence-chains Core Changes: - Make the fb conversion functions use __iomem dst. - Rename drm_client_add to drm_client_register - Move intel_fb_initial_config to core. - Add a drm_gem_objects_lookup helper - Add drm_gem_fence_array helpers, and use it in lima. - Add drm_format_helper.c to kerneldoc. Driver Changes: - Add panfrost driver for mali midgard/bitfrost. - Converts bochs to use the simple display type. - Small fixes to sun4i, tinydrm, ti-fp410. - Fid aspeed's Kconfig options. - Make some symbols/functions static in lima, sun4i and meson. - Add a driver for the lg,acx467akm-7 panel. Signed-off-by: Dave Airlie <[email protected]> From: Maarten Lankhorst <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-04-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller1-5/+149
Alexei Starovoitov says: ==================== pull-request: bpf-next 2019-04-22 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) allow stack/queue helpers from more bpf program types, from Alban. 2) allow parallel verification of root bpf programs, from Alexei. 3) introduce bpf sysctl hook for trusted root cases, from Andrey. 4) recognize var/datasec in btf deduplication, from Andrii. 5) cpumap performance optimizations, from Jesper. 6) verifier prep for alu32 optimization, from Jiong. 7) libbpf xsk cleanup, from Magnus. 8) other various fixes and cleanups. ==================== Signed-off-by: David S. Miller <[email protected]>
2019-04-22media: rc: xbox_remote: add protocol and set timeoutMatthias Reichl1-0/+2
The timestamps in ir-keytable -t output showed that the Xbox DVD IR dongle decodes scancodes every 64ms. The last scancode of a longer button press is decodes 64ms after the last-but-one which indicates the decoder doesn't use a timeout but decodes on the last edge of the signal. 267.042629: lirc protocol(unknown): scancode = 0xace 267.042665: event type EV_MSC(0x04): scancode = 0xace 267.042665: event type EV_KEY(0x01) key_down: KEY_1(0x0002) 267.042665: event type EV_SYN(0x00). 267.106625: lirc protocol(unknown): scancode = 0xace 267.106643: event type EV_MSC(0x04): scancode = 0xace 267.106643: event type EV_SYN(0x00). 267.170623: lirc protocol(unknown): scancode = 0xace 267.170638: event type EV_MSC(0x04): scancode = 0xace 267.170638: event type EV_SYN(0x00). 267.234621: lirc protocol(unknown): scancode = 0xace 267.234636: event type EV_MSC(0x04): scancode = 0xace 267.234636: event type EV_SYN(0x00). 267.298623: lirc protocol(unknown): scancode = 0xace 267.298638: event type EV_MSC(0x04): scancode = 0xace 267.298638: event type EV_SYN(0x00). 267.543345: event type EV_KEY(0x01) key_down: KEY_1(0x0002) 267.543345: event type EV_SYN(0x00). 267.570015: event type EV_KEY(0x01) key_up: KEY_1(0x0002) 267.570015: event type EV_SYN(0x00). Add a protocol with the repeat value and set the timeout in the driver to 10ms (to have a bit of headroom for delays) so the Xbox DVD remote performs more responsive. Signed-off-by: Matthias Reichl <[email protected]> Acked-by: Benjamin Valentin <[email protected]> Signed-off-by: Sean Young <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2019-04-22Merge tag 'v5.1-rc6' into for-5.2/blockJens Axboe2-3/+5
Pull in v5.1-rc6 to resolve two conflicts. One is in BFQ, in just a comment, and is trivial. The other one is a conflict due to a later fix in the bio multi-page work, and needs a bit more care. * tag 'v5.1-rc6': (770 commits) Linux 5.1-rc6 block: make sure that bvec length can't be overflow block: kill all_q_node in request_queue x86/cpu/intel: Lower the "ENERGY_PERF_BIAS: Set to normal" message's log priority coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping mm/kmemleak.c: fix unused-function warning init: initialize jump labels before command line option parsing kernel/watchdog_hld.c: hard lockup message should end with a newline kcov: improve CONFIG_ARCH_HAS_KCOV help text mm: fix inactive list balancing between NUMA nodes and cgroups mm/hotplug: treat CMA pages as unmovable proc: fixup proc-pid-vm test proc: fix map_files test on F29 mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n mm/memory_hotplug: do not unlock after failing to take the device_hotplug_lock mm: swapoff: shmem_unuse() stop eviction without igrab() mm: swapoff: take notice of completion sooner mm: swapoff: remove too limiting SWAP_UNUSE_MAX_TRIES mm: swapoff: shmem_find_swap_entries() filter out other types slab: store tagged freelist for off-slab slabmgmt ... Signed-off-by: Jens Axboe <[email protected]>
2019-04-22media: uapi: Add MEDIA_BUS_FMT_BGR888_3X8 media bus formatMickael Guene1-1/+2
This patch adds MEDIA_BUS_FMT_BGR888_3X8 used by STM MIPID02 CSI-2 to PARALLEL bridge driver when input format is MEDIA_BUS_FMT_BGR888_1X24. Signed-off-by: Mickael Guene <[email protected]> Signed-off-by: Sakari Ailus <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2019-04-22media: media.h: Enable ALSA MEDIA_INTF_T* interface typesShuah Khan1-10/+15
Move PCM_CAPTURE, PCM_PLAYBACK, and CONTROL ALSA MEDIA_INTF_T* interface types back into __KERNEL__ scope to get ready for adding ALSA support for these to the media controller. Signed-off-by: Shuah Khan <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2019-04-22media: v4l: add I / P frame min max QP definitionsFish Lin1-0/+4
Add following V4L2 QP parameters for H.264: * V4L2_CID_MPEG_VIDEO_H264_I_FRAME_MIN_QP * V4L2_CID_MPEG_VIDEO_H264_I_FRAME_MAX_QP * V4L2_CID_MPEG_VIDEO_H264_P_FRAME_MIN_QP * V4L2_CID_MPEG_VIDEO_H264_P_FRAME_MAX_QP These controls will limit QP range for intra and inter frame, provide more manual control to improve video encode quality. Signed-off-by: Fish Lin <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2019-04-21Merge 5.1-rc6 into tty-nextGreg Kroah-Hartman2-3/+5
We want the serial/tty fixes in here as well. Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-04-21Merge 5.1-rc6 into char-misc-nextGreg Kroah-Hartman2-3/+5
We want the fixes, and this resolves a merge error in the fastrpc driver. Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-04-19tipc: introduce new socket option TIPC_SOCK_RECVQ_USEDTung Nguyen1-0/+1
When using TIPC_SOCK_RECVQ_DEPTH for getsockopt(), it returns the number of buffers in receive socket buffer which is not so helpful for user space applications. This commit introduces the new option TIPC_SOCK_RECVQ_USED which returns the current allocated bytes of the receive socket buffer. This helps user space applications dimension its buffer usage to avoid buffer overload issue. Signed-off-by: Tung Nguyen <[email protected]> Acked-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-19net: socket: implement 64-bit timestampsArnd Bergmann1-0/+21
The 'timeval' and 'timespec' data structures used for socket timestamps are going to be redefined in user space based on 64-bit time_t in future versions of the C library to deal with the y2038 overflow problem, which breaks the ABI definition. Unlike many modern ioctl commands, SIOCGSTAMP and SIOCGSTAMPNS do not use the _IOR() macro to encode the size of the transferred data, so it remains ambiguous whether the application uses the old or new layout. The best workaround I could find is rather ugly: we redefine the command code based on the size of the respective data structure with a ternary operator. This lets it get evaluated as late as possible, hopefully after that structure is visible to the caller. We cannot use an #ifdef here, because inux/sockios.h might have been included before any libc header that could determine the size of time_t. The ioctl implementation now interprets the new command codes as always referring to the 64-bit structure on all architectures, while the old architecture specific command code still refers to the old architecture specific layout. The new command number is only used when they are actually different. Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-19vlan: support binding link state to vlan member bridge portsMike Manning1-4/+5
In the case of vlan filtering on bridges, the bridge may also have the corresponding vlan devices as upper devices. Currently the link state of vlan devices is transferred from the lower device. So this is up if the bridge is in admin up state and there is at least one bridge port that is up, regardless of the vlan that the port is a member of. The link state of the vlan device may need to track only the state of the subset of ports that are also members of the corresponding vlan, rather than that of all ports. Add a flag to specify a vlan bridge binding mode, by which the link state is no longer automatically transferred from the lower device, but is instead determined by the bridge ports that are members of the vlan. Signed-off-by: Mike Manning <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-19Merge branch 'for-linus' of ↵Linus Torvalds1-2/+4
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: - several new key mappings for HID - a host of new ACPI IDs used to identify Elan touchpads in Lenovo laptops * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: snvs_pwrkey - initialize necessary driver data before enabling IRQ HID: input: add mapping for "Toggle Display" key HID: input: add mapping for "Full Screen" key HID: input: add mapping for keyboard Brightness Up/Down/Toggle keys HID: input: add mapping for Expose/Overview key HID: input: fix mapping of aspect ratio key [media] doc-rst: switch to new names for Full Screen/Aspect keys Input: document meanings of KEY_SCREEN and KEY_ZOOM Input: elan_i2c - add hardware ID for multiple Lenovo laptops
2019-04-18ipv6: Add rate limit mask for ICMPv6 messagesStephen Suryaputra1-0/+4
To make ICMPv6 closer to ICMPv4, add ratemask parameter. Since the ICMP message types use larger numeric values, a simple bitmask doesn't fit. I use large bitmap. The input and output are the in form of list of ranges. Set the default to rate limit all error messages but Packet Too Big. For Packet Too Big, use ratemask instead of hard-coded. There are functions where icmpv6_xrlim_allow() and icmpv6_global_allow() aren't called. This patch only adds them to icmpv6_echo_reply(). Rate limiting error messages is mandated by RFC 4443 but RFC 4890 says that it is also acceptable to rate limit informational messages. Thus, I removed the current hard-coded behavior of icmpv6_mask_allow() that doesn't rate limit informational messages. v2: Add dummy function proc_do_large_bitmap() if CONFIG_PROC_SYSCTL isn't defined, expand the description in ip-sysctl.txt and remove unnecessary conditional before kfree(). v3: Inline the bitmap instead of dynamically allocated. Still is a pointer to it is needed because of the way proc_do_large_bitmap work. Signed-off-by: Stephen Suryaputra <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-04-17switchtec: Increase PFF limit from 48 to 255Wesley Sheng1-1/+12
The Switchtec devices supports two PCIe Function Frameworks (PFFs) per upstream port (one for the port itself and one for the management endoint), and each PFF may have up to 255 ports. Previously the driver only supported 48 of those ports, and the SWITCHTEC_IOCTL_EVENT_SUMMARY ioctl only returned information about those 48. Increase SWITCHTEC_MAX_PFF_CSR from 48 to 255 so the driver supports all 255 possible ports. Rename SWITCHTEC_IOCTL_EVENT_SUMMARY and associated struct switchtec_ioctl_event_summary to SWITCHTEC_IOCTL_EVENT_SUMMARY_LEGACY and switchtec_ioctl_event_summary_legacy with so existing applications work unchanged, supporting up to 48 ports. Add replacement SWITCHTEC_IOCTL_EVENT_SUMMARY and struct switchtec_ioctl_event_summary that new and recompiled applications support up to 255 ports. Signed-off-by: Wesley Sheng <[email protected]> [bhelgaas: changelog] Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Logan Gunthorpe <[email protected]>
2019-04-17PCI: Assign bus numbers present in EA capability for bridgesSubbaraya Sundeep1-0/+6
The "Enhanced Allocation (EA) for Memory and I/O Resources" ECN, approved 23 October 2014, sec 6.9.1.2, specifies a second DW in the capability for type 1 (bridge) functions to describe fixed secondary and subordinate bus numbers. This ECN was included in the PCIe r4.0 spec, but sec 6.9.1.2 was omitted, presumably by mistake. Read fixed bus numbers from the EA capability for bridges. Signed-off-by: Subbaraya Sundeep <[email protected]> [bhelgaas: add pci_ea_fixed_busnrs() return value] Signed-off-by: Bjorn Helgaas <[email protected]>