aboutsummaryrefslogtreecommitdiff
path: root/include/uapi/linux
AgeCommit message (Collapse)AuthorFilesLines
2018-07-12Merge tag 'fsi-updates-2018-07-12' of ↵Greg Kroah-Hartman1-0/+58
git://git.kernel.org/pub/scm/linux/kernel/git/benh/linux-fsi into char-misc-next Ben writes: FSI fixes and updates: - Reported build fixes - Add configuration of send/echo delayus - Object lifetime fix - Re-arrange some definitions in preparation for adding the CF master
2018-07-11drm/amd: Add kfd ioctl defines for hw_exception eventShaoyun Liu1-1/+19
Signed-off-by: Shaoyun Liu <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2018-07-11drm/amdkfd: Handle VM faults in KFDshaoyunl1-1/+1
1. Pre-GFX9 the amdgpu ISR saves the vm-fault status and address per per-vmid. amdkfd needs to get the information from amdgpu through the new get_vm_fault_info interface. On GFX9 and later, all the required information is in the IH ring 2. amdkfd unmaps all queues from the faulting process and create new run-list without the guilty process 3. amdkfd notifies the runtime of the vm fault trap via EVENT_TYPE_MEMORY Signed-off-by: shaoyun liu <[email protected]> Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2018-07-10sched: Add Common Applications Kept Enhanced (cake) qdiscToke Høiland-Jørgensen1-0/+114
sch_cake targets the home router use case and is intended to squeeze the most bandwidth and latency out of even the slowest ISP links and routers, while presenting an API simple enough that even an ISP can configure it. Example of use on a cable ISP uplink: tc qdisc add dev eth0 cake bandwidth 20Mbit nat docsis ack-filter To shape a cable download link (ifb and tc-mirred setup elided) tc qdisc add dev ifb0 cake bandwidth 200mbit nat docsis ingress wash CAKE is filled with: * A hybrid Codel/Blue AQM algorithm, "Cobalt", tied to an FQ_Codel derived Flow Queuing system, which autoconfigures based on the bandwidth. * A novel "triple-isolate" mode (the default) which balances per-host and per-flow FQ even through NAT. * An deficit based shaper, that can also be used in an unlimited mode. * 8 way set associative hashing to reduce flow collisions to a minimum. * A reasonable interpretation of various diffserv latency/loss tradeoffs. * Support for zeroing diffserv markings for entering and exiting traffic. * Support for interacting well with Docsis 3.0 shaper framing. * Extensive support for DSL framing types. * Support for ack filtering. * Extensive statistics for measuring, loss, ecn markings, latency variation. A paper describing the design of CAKE is available at https://arxiv.org/abs/1804.07617, and will be published at the 2018 IEEE International Symposium on Local and Metropolitan Area Networks (LANMAN). This patch adds the base shaper and packet scheduler, while subsequent commits add the optional (configurable) features. The full userspace API and most data structures are included in this commit, but options not understood in the base version will be ignored. Various versions baking have been available as an out of tree build for kernel versions going back to 3.10, as the embedded router world has been running a few years behind mainline Linux. A stable version has been generally available on lede-17.01 and later. sch_cake replaces a combination of iptables, tc filter, htb and fq_codel in the sqm-scripts, with sane defaults and vastly simpler configuration. CAKE's principal author is Jonathan Morton, with contributions from Kevin Darbyshire-Bryant, Toke Høiland-Jørgensen, Sebastian Moeller, Ryan Mounce, Tony Ambardar, Dean Scarff, Nils Andreas Svee, Dave Täht, and Loganaden Velvindron. Testing from Pete Heist, Georgios Amanakis, and the many other members of the [email protected] mailing list. tc -s qdisc show dev eth2 qdisc cake 8017: root refcnt 2 bandwidth 1Gbit diffserv3 triple-isolate split-gso rtt 100.0ms noatm overhead 38 mpu 84 Sent 51504294511 bytes 37724591 pkt (dropped 6, overlimits 64958695 requeues 12) backlog 0b 0p requeues 12 memory used: 1053008b of 15140Kb capacity estimate: 970Mbit min/max network layer size: 28 / 1500 min/max overhead-adjusted size: 84 / 1538 average network hdr offset: 14 Bulk Best Effort Voice thresh 62500Kbit 1Gbit 250Mbit target 5.0ms 5.0ms 5.0ms interval 100.0ms 100.0ms 100.0ms pk_delay 5us 5us 6us av_delay 3us 2us 2us sp_delay 2us 1us 1us backlog 0b 0b 0b pkts 3164050 25030267 9530280 bytes 3227519915 35396974782 12879808898 way_inds 0 8 0 way_miss 21 366 25 way_cols 0 0 0 drops 5 0 1 marks 0 0 0 ack_drop 0 0 0 sp_flows 1 3 0 bk_flows 0 1 1 un_flows 0 0 0 max_len 68130 68130 68130 Tested-by: Pete Heist <[email protected]> Tested-by: Georgios Amanakis <[email protected]> Signed-off-by: Dave Taht <[email protected]> Signed-off-by: Toke Høiland-Jørgensen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-10rseq: Remove unused types_32_64.h uapi headerMathieu Desnoyers1-50/+0
This header was introduced in the 4.18 merge window, and rseq does not need it anymore. Nuke it before the final release. Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: Peter Zijlstra <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Dave Watson <[email protected]> Cc: Paul Turner <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Russell King <[email protected]> Cc: "H . Peter Anvin" <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Chris Lameter <[email protected]> Cc: Ben Maurer <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-07-10rseq: uapi: Declare rseq_cs field as union, update includesMathieu Desnoyers1-8/+19
Declaring the rseq_cs field as a union between __u64 and two __u32 allows both 32-bit and 64-bit kernels to read the full __u64, and therefore validate that a 32-bit user-space cleared the upper 32 bits, thus ensuring a consistent behavior between native 32-bit kernels and 32-bit compat tasks on 64-bit kernels. Check that the rseq_cs value read is < TASK_SIZE. The asm/byteorder.h header needs to be included by rseq.h, now that it is not using linux/types_32_64.h anymore. Considering that only __32 and __u64 types are declared in linux/rseq.h, the linux/types.h header should always be included for both kernel and user-space code: including stdint.h is just for u64 and u32, which are not used in this header at all. Use copy_from_user()/clear_user() to interact with a 64-bit field, because arm32 does not implement 64-bit __get_user, and ppc32 does not 64-bit get_user. Considering that the rseq_cs pointer does not need to be loaded/stored with single-copy atomicity from the kernel anymore, we can simply use copy_from_user()/clear_user(). Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: Peter Zijlstra <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Dave Watson <[email protected]> Cc: Paul Turner <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Russell King <[email protected]> Cc: "H . Peter Anvin" <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Chris Lameter <[email protected]> Cc: Ben Maurer <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-07-10rseq: uapi: Update uapi commentsMathieu Desnoyers1-33/+36
Update rseq uapi header comments to reflect that user-space need to do thread-local loads/stores from/to the struct rseq fields. As a consequence of this added requirement, the kernel does not need to perform loads/stores with single-copy atomicity. Update the comment associated to the "flags" fields to describe more accurately that it's only useful to facilitate single-stepping through rseq critical sections with debuggers. Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: Peter Zijlstra <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Dave Watson <[email protected]> Cc: Paul Turner <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Russell King <[email protected]> Cc: "H . Peter Anvin" <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Chris Lameter <[email protected]> Cc: Ben Maurer <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-07-10rseq: Use __u64 for rseq_cs fields, validate user inputsMathieu Desnoyers1-3/+3
Change the rseq ABI so rseq_cs start_ip, post_commit_offset and abort_ip fields are seen as 64-bit fields by both 32-bit and 64-bit kernels rather that ignoring the 32 upper bits on 32-bit kernels. This ensures we have a consistent behavior for a 32-bit binary executed on 32-bit kernels and in compat mode on 64-bit kernels. Validating the value of abort_ip field to be below TASK_SIZE ensures the kernel don't return to an invalid address when returning to userspace after an abort. I don't fully trust each architecture code to consistently deal with invalid return addresses. Validating the value of the start_ip and post_commit_offset fields prevents overflow on arithmetic performed on those values, used to check whether abort_ip is within the rseq critical section. If validation fails, the process is killed with a segmentation fault. When the signature encountered before abort_ip does not match the expected signature, return -EINVAL rather than -EPERM to be consistent with other input validation return codes from rseq_get_rseq_cs(). Signed-off-by: Mathieu Desnoyers <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: Peter Zijlstra <[email protected]> Cc: "Paul E . McKenney" <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Dave Watson <[email protected]> Cc: Paul Turner <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Russell King <[email protected]> Cc: "H . Peter Anvin" <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Chris Lameter <[email protected]> Cc: Ben Maurer <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Josh Triplett <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Michael Kerrisk <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-07-09net: Use __u32 in uapi net_stamp.hJesus Sanchez-Palencia1-2/+2
We are not supposed to use u32 in uapi, so change the flags member of struct sock_txtime from u32 to __u32 instead. Fixes: 80b14dee2bea ("net: Add a new socket option for a future transmit time") Reported-by: Eric Dumazet <[email protected]> Signed-off-by: Jesus Sanchez-Palencia <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-09include/uapi/linux/blkzoned.h: Remove a superfluous __packed directiveBart Van Assche1-1/+1
Using the __packed directive for a structure that does not need it is wrong because it makes gcc generate suboptimal code on some architectures. Hence remove the __packed directive from the blk_zone_report structure definition. See also http://digitalvampire.org/blog/index.php/2006/07/31/why-you-shouldnt-use-__attribute__packed/. Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Cc: Matias Bjorling <[email protected]> Cc: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-07-08openvswitch: kernel datapath clone actionYifeng Sun1-0/+3
Add 'clone' action to kernel datapath by using existing functions. When actions within clone don't modify the current flow, the flow key is not cloned before executing clone actions. This is a follow up patch for this incomplete work: https://patchwork.ozlabs.org/patch/722096/ v1 -> v2: Refactor as advised by reviewer. Signed-off-by: Yifeng Sun <[email protected]> Signed-off-by: Andy Zhou <[email protected]> Acked-by: Pravin B Shelar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-07kbd: complete dead keys definitionsSamuel Thibault1-1/+22
This completes dead keys definitions for internationalization completeness on the console. The representatives have been chosen coherently with libx11 compose sequences, which avoid symetry conflicts (e.g. there is U with caron, but no c with breve). Signed-off-by: Samuel Thibault <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2018-07-07net/sched: flower: Add supprt for matching on QinQ vlan headersJianbo Liu1-0/+4
As support dissecting of QinQ inner and outer vlan headers, user can add rules to match on QinQ vlan headers. Signed-off-by: Jianbo Liu <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-05devlink: Add devlink notifications support for paramsMoshe Shemesh1-0/+2
Add devlink_param_notify() function to support devlink param notifications. Add notification call to devlink param set, register and unregister functions. Add devlink_param_value_changed() function to enable the driver notify devlink on value change. Driver should use this function after value was changed on any configuration mode part to driverinit. Signed-off-by: Moshe Shemesh <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-05devlink: Add param set commandMoshe Shemesh1-0/+1
Add param set command to set value for a parameter. Value can be set to any of the supported configuration modes. Signed-off-by: Moshe Shemesh <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-05devlink: Add param get commandMoshe Shemesh1-0/+11
Add param get command which gets data per parameter. Option to dump the parameters data per device. Signed-off-by: Moshe Shemesh <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-05devlink: Add devlink_param register and unregisterMoshe Shemesh1-0/+10
Define configuration parameters data structure. Add functions to register and unregister the driver supported configuration parameters table. For each parameter registered, the driver should fill all the parameter's fields. In case the only supported configuration mode is "driverinit" the parameter's get()/set() functions are not required and should be set to NULL, for any other configuration mode, these functions are required and should be set by the driver. Signed-off-by: Moshe Shemesh <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-04media: v4l2-ctrl: Add control for VP9 profileKeiichi Watanabe1-0/+7
Add a new control V4L2_CID_MPEG_VIDEO_VP9_PROFILE for VP9 profiles. This control allows selecting the desired profile for VP9 encoder and querying for supported profiles by VP9 encoder/decoder. Though this control is similar to V4L2_CID_MPEG_VIDEO_VP8_PROFILE, we need to separate this control from it because supported profiles usually differ between VP8 and VP9. Signed-off-by: Keiichi Watanabe <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2018-07-04net/sched: Make etf report drops on error_queueJesus Sanchez-Palencia2-1/+8
Use the socket error queue for reporting dropped packets if the socket has enabled that feature through the SO_TXTIME API. Packets are dropped either on enqueue() if they aren't accepted by the qdisc or on dequeue() if the system misses their deadline. Those are reported as different errors so applications can react accordingly. Userspace can retrieve the errors through the socket error queue and the corresponding cmsg interfaces. A struct sock_extended_err* is used for returning the error data, and the packet's timestamp can be retrieved by adding both ee_data and ee_info fields as e.g.: ((__u64) serr->ee_data << 32) + serr->ee_info This feature is disabled by default and must be explicitly enabled by applications. Enabling it can bring some overhead for the Tx cycles of the application. Signed-off-by: Jesus Sanchez-Palencia <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-04net/sched: Add HW offloading capability to ETFJesus Sanchez-Palencia1-0/+1
Add infra so etf qdisc supports HW offload of time-based transmission. For hw offload, the time sorted list is still used, so packets are dequeued always in order of txtime. Example: $ tc qdisc replace dev enp2s0 parent root handle 100 mqprio num_tc 3 \ map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 0 $ tc qdisc add dev enp2s0 parent 100:1 etf offload delta 100000 \ clockid CLOCK_REALTIME In this example, the Qdisc will use HW offload for the control of the transmission time through the network adapter. The hrtimer used for packets scheduling inside the qdisc will use the clockid CLOCK_REALTIME as reference and packets leave the Qdisc "delta" (100000) nanoseconds before their transmission time. Because this will be using HW offload and since dynamic clocks are not supported by the hrtimer, the system clock and the PHC clock must be synchronized for this mode to behave as expected. Signed-off-by: Jesus Sanchez-Palencia <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-04net/sched: Introduce the ETF QdiscVinicius Costa Gomes1-0/+17
The ETF (Earliest TxTime First) qdisc uses the information added earlier in this series (the socket option SO_TXTIME and the new role of sk_buff->tstamp) to schedule packets transmission based on absolute time. For some workloads, just bandwidth enforcement is not enough, and precise control of the transmission of packets is necessary. Example: $ tc qdisc replace dev enp2s0 parent root handle 100 mqprio num_tc 3 \ map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 0 $ tc qdisc add dev enp2s0 parent 100:1 etf delta 100000 \ clockid CLOCK_TAI In this example, the Qdisc will provide SW best-effort for the control of the transmission time to the network adapter, the time stamp in the socket will be in reference to the clockid CLOCK_TAI and packets will leave the qdisc "delta" (100000) nanoseconds before its transmission time. The ETF qdisc will buffer packets sorted by their txtime. It will drop packets on enqueue() if their skbuff clockid does not match the clock reference of the Qdisc. Moreover, on dequeue(), a packet will be dropped if it expires while being enqueued. The qdisc also supports the SO_TXTIME deadline mode. For this mode, it will dequeue a packet as soon as possible and change the skb timestamp to 'now' during etf_dequeue(). Note that both the qdisc's and the SO_TXTIME ABIs allow for a clockid to be configured, but it's been decided that usage of CLOCK_TAI should be enforced until we decide to allow for other clockids to be used. The rationale here is that PTP times are usually in the TAI scale, thus no other clocks should be necessary. For now, the qdisc will return EINVAL if any clocks other than CLOCK_TAI are used. Signed-off-by: Jesus Sanchez-Palencia <[email protected]> Signed-off-by: Vinicius Costa Gomes <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-04net: Add a new socket option for a future transmit time.Richard Cochran1-0/+15
This patch introduces SO_TXTIME. User space enables this option in order to pass a desired future transmit time in a CMSG when calling sendmsg(2). The argument to this socket option is a 8-bytes long struct provided by the uapi header net_tstamp.h defined as: struct sock_txtime { clockid_t clockid; u32 flags; }; Note that new fields were added to struct sock by filling a 2-bytes hole found in the struct. For that reason, neither the struct size or number of cachelines were altered. Signed-off-by: Richard Cochran <[email protected]> Signed-off-by: Jesus Sanchez-Palencia <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-04media: v4l2-ctrl: Change control for VP8 profile to menu controlKeiichi Watanabe1-1/+10
Add a menu control V4L2_CID_MPEG_VIDEO_VP8_PROFILE for VP8 profile and make V4L2_CID_MPEG_VIDEO_VPX_PROFILE an alias of it. This new control is used to select the desired profile for VP8 encoder and query for supported profiles by VP8 encoder/decoder. Though we have originally a control V4L2_CID_MPEG_VIDEO_VPX_PROFILE and its name contains 'VPX', it works only for VP8 because supported profiles usually differ between VP8 and VP9. In addition, this control cannot be used for querying since it is not a menu control but an integer control, which cannot return an arbitrary set of supported profiles. The new control V4L2_CID_MPEG_VIDEO_VP8_PROFILE is a menu control as with controls for other codec profiles. (e.g. H264) In addition, this patch also fixes the use of V4L2_CID_MPEG_VIDEO_VPX_PROFILE in drivers of Qualcomm's venus and Samsung's s5p-mfc. Signed-off-by: Keiichi Watanabe <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2018-07-04net:sched: add action inheritdsfield to skbeditQiaobin Fu1-0/+2
The new action inheritdsfield copies the field DS of IPv4 and IPv6 packets into skb->priority. This enables later classification of packets based on the DS field. v5: *Update the drop counter for TC_ACT_SHOT v4: *Not allow setting flags other than the expected ones. *Allow dumping the pure flags. v3: *Use optional flags, so that it won't break old versions of tc. *Allow users to set both SKBEDIT_F_PRIORITY and SKBEDIT_F_INHERITDSFIELD flags. v2: *Fix the style issue *Move the code from skbmod to skbedit Original idea by Jamal Hadi Salim <[email protected]> Signed-off-by: Qiaobin Fu <[email protected]> Reviewed-by: Michel Machado <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Reviewed-by: Marcelo Ricardo Leitner <[email protected]> Acked-by: Davide Caratti <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-04media: v4l2-ctrls: Fix CID base conflict between MAX217X and IMXSteve Longerbeam1-1/+1
When the imx-media driver was initially merged, there was a conflict with 8d67ae25 ("media: v4l2-ctrls: Reserve controls for MAX217X") which was not fixed up correctly, resulting in V4L2_CID_USER_MAX217X_BASE and V4L2_CID_USER_IMX_BASE taking on the same value. Fix by assigning imx CID base the next available range at 0x10b0. Signed-off-by: Steve Longerbeam <[email protected]> Acked-by: Sakari Ailus <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2018-07-04sctp: add spp_ipv6_flowlabel and spp_dscp for sctp_paddrparamsXin Long1-0/+4
spp_ipv6_flowlabel and spp_dscp are added in sctp_paddrparams in this patch so that users could set sctp_sock/asoc/transport dscp and flowlabel with spp_flags SPP_IPV6_FLOWLABEL or SPP_DSCP by SCTP_PEER_ADDR_PARAMS , as described section 8.1.12 in RFC6458. As said in last patch, it uses '| 0x100000' or '|0x1' to mark flowlabel or dscp is set, so that their values could be set to 0. Note that to guarantee that an old app built with old kernel headers could work on the newer kernel, the param's check in sctp_g/setsockopt_peer_addr_params() is also improved, which follows the way that sctp_g/setsockopt_delayed_ack() or some other sockopts' process that accept two types of params does. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-07-03Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/netDavid S. Miller3-8/+30
Simple overlapping changes in stmmac driver. Adjust skb_gro_flush_final_remcsum function signature to make GRO list changes in net-next, as per Stephen Rothwell's example merge resolution. Signed-off-by: David S. Miller <[email protected]>
2018-07-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds1-5/+23
Pull networking fixes from David Miller: 1) Verify netlink attributes properly in nf_queue, from Eric Dumazet. 2) Need to bump memory lock rlimit for test_sockmap bpf test, from Yonghong Song. 3) Fix VLAN handling in lan78xx driver, from Dave Stevenson. 4) Fix uninitialized read in nf_log, from Jann Horn. 5) Fix raw command length parsing in mlx5, from Alex Vesker. 6) Cleanup loopback RDS connections upon netns deletion, from Sowmini Varadhan. 7) Fix regressions in FIB rule matching during create, from Jason A. Donenfeld and Roopa Prabhu. 8) Fix mpls ether type detection in nfp, from Pieter Jansen van Vuuren. 9) More bpfilter build fixes/adjustments from Masahiro Yamada. 10) Fix XDP_{TX,REDIRECT} flushing in various drivers, from Jesper Dangaard Brouer. 11) fib_tests.sh file permissions were broken, from Shuah Khan. 12) Make sure BH/preemption is disabled in data path of mac80211, from Denis Kenzior. 13) Don't ignore nla_parse_nested() return values in nl80211, from Johannes berg. 14) Properly account sock objects ot kmemcg, from Shakeel Butt. 15) Adjustments to setting bpf program permissions to read-only, from Daniel Borkmann. 16) TCP Fast Open key endianness was broken, it always took on the host endiannness. Whoops. Explicitly make it little endian. From Yuching Cheng. 17) Fix prefix route setting for link local addresses in ipv6, from David Ahern. 18) Potential Spectre v1 in zatm driver, from Gustavo A. R. Silva. 19) Various bpf sockmap fixes, from John Fastabend. 20) Use after free for GRO with ESP, from Sabrina Dubroca. 21) Passing bogus flags to crypto_alloc_shash() in ipv6 SR code, from Eric Biggers. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits) qede: Adverstise software timestamp caps when PHC is not available. qed: Fix use of incorrect size in memcpy call. qed: Fix setting of incorrect eswitch mode. qed: Limit msix vectors in kdump kernel to the minimum required count. ipvlan: call dev_change_flags when ipvlan mode is reset ipv6: sr: fix passing wrong flags to crypto_alloc_shash() net: fix use-after-free in GRO with ESP tcp: prevent bogus FRTO undos with non-SACK flows bpf: sockhash, add release routine bpf: sockhash fix omitted bucket lock in sock_close bpf: sockmap, fix smap_list_map_remove when psock is in many maps bpf: sockmap, fix crash when ipv6 sock is added net: fib_rules: bring back rule_exists to match rule during add hv_netvsc: split sub-channel setup into async and sync net: use dev_change_tx_queue_len() for SIOCSIFTXQLEN atm: zatm: Fix potential Spectre v1 s390/qeth: consistently re-enable device features s390/qeth: don't clobber buffer on async TX completion s390/qeth: avoid using is_multicast_ether_addr_64bits on (u8 *)[6] s390/qeth: fix race when setting MAC address ...
2018-07-02Merge 4.18-rc3 into staging-nextGreg Kroah-Hartman3-3/+10
We want the staging/iio fixes in here as well. Signed-off-by: Greg Kroah-Hartman <[email protected]>
2018-07-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller1-5/+23
Daniel Borkmann says: ==================== pull-request: bpf 2018-07-01 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) A bpf_fib_lookup() helper fix to change the API before freeze to return an encoding of the FIB lookup result and return the nexthop device index in the params struct (instead of device index as return code that we had before), from David. 2) Various BPF JIT fixes to address syzkaller fallout, that is, do not reject progs when set_memory_*() fails since it could still be RO. Also arm32 JIT was not using bpf_jit_binary_lock_ro() API which was an issue, and a memory leak in s390 JIT found during review, from Daniel. 3) Multiple fixes for sockmap/hash to address most of the syzkaller triggered bugs. Usage with IPv6 was crashing, a GPF in bpf_tcp_close(), a missing sock_map_release() routine to hook up to callbacks, and a fix for an omitted bucket lock in sock_close(), from John. 4) Two bpftool fixes to remove duplicated error message on program load, and another one to close the libbpf object after program load. One additional fix for nfp driver's BPF offload to avoid stopping offload completely if replace of program failed, from Jakub. 5) Couple of BPF selftest fixes that bail out in some of the test scripts if the user does not have the right privileges, from Jeffrin. 6) Fixes in test_bpf for s390 when CONFIG_BPF_JIT_ALWAYS_ON is set where we need to set the flag that some of the test cases are expected to fail, from Kleber. 7) Fix to detangle BPF_LIRC_MODE2 dependency from CONFIG_CGROUP_BPF since it has no relation to it and lirc2 users often have configs without cgroups enabled and thus would not be able to use it, from Sean. 8) Fix a selftest failure in sockmap by removing a useless setrlimit() call that would set a too low limit where at the same time we are already including bpf_rlimit.h that does the job, from Yonghong. 9) Fix BPF selftest config with missing missing NET_SCHED, from Anders. ==================== Signed-off-by: David S. Miller <[email protected]>
2018-06-30PCI: Enable PASID only if entire path supports End-End TLP prefixesSinan Kaya1-0/+1
A PCIe endpoint carries the process address space identifier (PASID) in the TLP prefix as part of the memory read/write transaction. The address information in the TLP is relevant only for a given PASID context. An IOMMU takes PASID value and the address information from the TLP to look up the physical address in the system. PASID is an End-End TLP Prefix (PCIe r4.0, sec 6.20). Sec 2.2.10.2 says It is an error to receive a TLP with an End-End TLP Prefix by a Receiver that does not support End-End TLP Prefixes. A TLP in violation of this rule is handled as a Malformed TLP. This is a reported error associated with the Receiving Port (see Section 6.2). Prevent error condition by proactively requiring End-End TLP prefix to be supported on the entire data path between the endpoint and the root port before enabling PASID. Signed-off-by: Sinan Kaya <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2018-06-30Merge tag 'mac80211-next-for-davem-2018-06-29' of ↵David S. Miller1-1/+101
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Small merge conflict in net/mac80211/scan.c, I preserved the kcalloc() conversion. -DaveM Johannes Berg says: ==================== This round's updates: * finally some of the promised HE code, but it turns out to be small - but everything kept changing, so one part I did in the driver was >30 patches for what was ultimately <200 lines of code ... similar here for this code. * improved scan privacy support - can now specify scan flags for randomizing the sequence number as well as reducing the probe request element content * rfkill cleanups * a timekeeping cleanup from Arnd * various other cleanups ==================== Signed-off-by: David S. Miller <[email protected]>
2018-06-30tipc: extend sock diag for group communicationGhantaKrishnamurthy MohanKrishna1-0/+14
This commit extends the existing TIPC socket diagnostics framework for information related to TIPC group communication. Acked-by: Ying Xue <[email protected]> Acked-by: Jon Maloy <[email protected]> Signed-off-by: GhantaKrishnamurthy MohanKrishna <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-30net/smc: add SMC-D diag supportHans Wippel1-0/+10
This patch adds diag support for SMC-D. Signed-off-by: Hans Wippel <[email protected]> Signed-off-by: Ursula Braun <[email protected]> Suggested-by: Thomas Richter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-30tcp: add new SNMP counter for drops when try to queue in rcv queueYafang Shao1-0/+1
When sk_rmem_alloc is larger than the receive buffer and we can't schedule more memory for it, the skb will be dropped. In above situation, if this skb is put into the ofo queue, LINUX_MIB_TCPOFODROP is incremented to track it. While if this skb is put into the receive queue, there's no record. So a new SNMP counter is introduced to track this behavior. LINUX_MIB_TCPRCVQDROP: Number of packets meant to be queued in rcv queue but dropped because socket rcvbuf limit hit. Signed-off-by: Yafang Shao <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-29PCI: Cleanup PCI_REBAR_CTRL_BAR_SHIFT handlingChristian König1-1/+2
Cleanup PCI_REBAR_CTRL_BAR_SHIFT handling. That was hard coded instead of properly defined in the header for some reason. Signed-off-by: Christian König <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2018-06-29net/sched: add tunnel option support to act_tunnel_keySimon Horman1-0/+26
Allow setting tunnel options using the act_tunnel_key action. Options are expressed as class:type:data and multiple options may be listed using a comma delimiter. # ip link add name geneve0 type geneve dstport 0 external # tc qdisc add dev eth0 ingress # tc filter add dev eth0 protocol ip parent ffff: \ flower indev eth0 \ ip_proto udp \ action tunnel_key \ set src_ip 10.0.99.192 \ dst_ip 10.0.99.193 \ dst_port 6081 \ id 11 \ geneve_opts 0102:80:00800022,0102:80:00800022 \ action mirred egress redirect dev geneve0 Signed-off-by: Simon Horman <[email protected]> Signed-off-by: Pieter Jansen van Vuuren <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-29aio: mark __aio_sigset::sigmask constAvi Kivity1-1/+1
io_pgetevents() will not change the signal mask. Mark it const to make it clear and to reduce the need for casts in user code. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Avi Kivity <[email protected]> Signed-off-by: Al Viro <[email protected]> [hch: reapply the patch that got incorrectly reverted] Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-29sctp: add support for SCTP_REUSE_PORT sockoptXin Long1-0/+1
This feature is actually already supported by sk->sk_reuse which can be set by socket level opt SO_REUSEADDR. But it's not working exactly as RFC6458 demands in section 8.1.27, like: - This option only supports one-to-one style SCTP sockets - This socket option must not be used after calling bind() or sctp_bindx(). Besides, SCTP_REUSE_PORT sockopt should be provided for user's programs. Otherwise, the programs with SCTP_REUSE_PORT from other systems will not work in linux. To separate it from the socket level version, this patch adds 'reuse' in sctp_sock and it works pretty much as sk->sk_reuse, but with some extra setup limitations that are needed when it is being enabled. "It should be noted that the behavior of the socket-level socket option to reuse ports and/or addresses for SCTP sockets is unspecified", so it leaves SO_REUSEADDR as is for the compatibility. Note that the name SCTP_REUSE_PORT is somewhat confusing, as its functionality is nearly identical to SO_REUSEADDR, but with some extra restrictions. Here it uses 'reuse' in sctp_sock instead of 'reuseport'. As for sk->sk_reuseport support for SCTP, it will be added in another patch. Thanks to Neil to make this clear. v1->v2: - add sctp_sk->reuse to separate it from the socket level version. v2->v3: - improve changelog according to Marcelo's suggestion. Acked-by: Neil Horman <[email protected]> Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-29ila: Flush netlink command to clear xlat tableTom Herbert1-0/+1
Add ILA_CMD_FLUSH netlink command to clear the ILA translation table. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-29bpf: Change bpf_fib_lookup to return lookup statusDavid Ahern1-5/+23
For ACLs implemented using either FIB rules or FIB entries, the BPF program needs the FIB lookup status to be able to drop the packet. Since the bpf_fib_lookup API has not reached a released kernel yet, change the return code to contain an encoding of the FIB lookup result and return the nexthop device index in the params struct. In addition, inform the BPF program of any post FIB lookup reason as to why the packet needs to go up the stack. The fib result for unicast routes must have an egress device, so remove the check that it is non-NULL. Signed-off-by: David Ahern <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2018-06-28Revert changes to convert to ->poll_mask() and aio IOCB_CMD_POLLLinus Torvalds1-3/+5
The poll() changes were not well thought out, and completely unexplained. They also caused a huge performance regression, because "->poll()" was no longer a trivial file operation that just called down to the underlying file operations, but instead did at least two indirect calls. Indirect calls are sadly slow now with the Spectre mitigation, but the performance problem could at least be largely mitigated by changing the "->get_poll_head()" operation to just have a per-file-descriptor pointer to the poll head instead. That gets rid of one of the new indirections. But that doesn't fix the new complexity that is completely unwarranted for the regular case. The (undocumented) reason for the poll() changes was some alleged AIO poll race fixing, but we don't make the common case slower and more complex for some uncommon special case, so this all really needs way more explanations and most likely a fundamental redesign. [ This revert is a revert of about 30 different commits, not reverted individually because that would just be unnecessarily messy - Linus ] Cc: Al Viro <[email protected]> Cc: Christoph Hellwig <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-06-28net sched actions: fix coding style in pedit headersRoman Mashak1-2/+7
Fix coding style issues in tc pedit headers detected by the checkpatch script. Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Roman Mashak <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-28netem: slotting with non-uniform distributionYousuk Seung1-0/+3
Extend slotting with support for non-uniform distributions. This is similar to netem's non-uniform distribution delay feature. Commit f043efeae2f1 ("netem: support delivering packets in delayed time slots") added the slotting feature to approximate the behaviors of media with packet aggregation but only supported a uniform distribution for delays between transmission attempts. Tests with TCP BBR with emulated wifi links with non-uniform distributions produced more useful results. Syntax: slot dist DISTRIBUTION DELAY JITTER [packets MAX_PACKETS] \ [bytes MAX_BYTES] The syntax and use of the distribution table is the same as in the non-uniform distribution delay feature. A file DISTRIBUTION must be present in TC_LIB_DIR (e.g. /usr/lib/tc) containing numbers scaled by NETEM_DIST_SCALE. A random value x is selected from the table and it takes DELAY + ( x * JITTER ) as delay. Correlation between values is not supported. Examples: Normal distribution delay with mean = 800us and stdev = 100us. > tc qdisc add dev eth0 root netem slot dist normal 800us 100us Optionally set the max slot size in bytes and/or packets. > tc qdisc add dev eth0 root netem slot dist normal 800us 100us \ bytes 64k packets 42 Signed-off-by: Yousuk Seung <[email protected]> Acked-by: Eric Dumazet <[email protected]> Acked-by: Neal Cardwell <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-28media: media.h: remove __NEED_MEDIA_LEGACY_APIHans Verkuil1-1/+1
The __NEED_MEDIA_LEGACY_API define is 1) ugly and 2) dangerous since it is all too easy for drivers to define it to get hold of legacy defines. Instead just define what we need in media-device.c which is the only place where we need the legacy define (MEDIA_ENT_T_DEVNODE_UNKNOWN). Signed-off-by: Hans Verkuil <[email protected]> Acked-by: Mauro Carvalho Chehab <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2018-06-27inotify: Add flag IN_MASK_CREATE for inotify_add_watch()Henry Wilson1-0/+1
The flag IN_MASK_CREATE is introduced as a flag for inotiy_add_watch() which prevents inotify from modifying any existing watches when invoked. If the pathname specified in the call has a watched inode associated with it and IN_MASK_CREATE is specified, fail with an errno of EEXIST. Use of IN_MASK_CREATE with IN_MASK_ADD is reserved for future use and will return EINVAL. RATIONALE In the current implementation, there is no way to prevent inotify_add_watch() from modifying existing watch descriptors. Even if the caller keeps a record of all watch descriptors collected, this is only sufficient to detect that an existing watch descriptor may have been modified. The assumption that a particular path will map to the same inode over multiple calls to inotify_add_watch() cannot be made as files can be renamed or deleted. It is also not possible to assume that two distinct paths do no map to the same inode, due to hard-links or a dereferenced symbolic link. Further uses of inotify_add_watch() to revert the change may cause other watch descriptors to be modified or created, merely compunding the problem. There is currently no system call such as inotify_modify_watch() to explicity modify a watch descriptor, which would be able to revert unwanted changes. Thus the caller cannot guarantee to be able to revert any changes to existing watch decriptors. Additionally the caller cannot assume that the events that are associated with a watch descriptor are within the set requested, as any future calls to inotify_add_watch() may unintentionally modify a watch descriptor's mask. Thus it cannot currently be guaranteed that a watch descriptor will only generate events which have been requested. The program must filter events which come through its watch descriptor to within its expected range. Reviewed-by: Amir Goldstein <[email protected]> Signed-off-by: Henry Wilson <[email protected]> Signed-off-by: Jan Kara <[email protected]>
2018-06-27Merge tag 'scsi-fixes' of ↵Linus Torvalds1-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Three small bug fixes (barrier elimination, memory leak on unload, spinlock recursion) and a technical enhancement left over from the merge window: the TCMU read length support is required for tape devices read when the length of the read is greater than the tape block size" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: scsi_debug: Fix memory leak on module unload scsi: qla2xxx: Spinlock recursion in qla_target scsi: ipr: Eliminate duplicate barriers scsi: target: tcmu: add read length support
2018-06-26tcp: add SNMP counter for zero-window dropsYafang Shao1-0/+1
It will be helpful if we could display the drops due to zero window or no enough window space. So a new SNMP MIB entry is added to track this behavior. This entry is named LINUX_MIB_TCPZEROWINDOWDROP and published in /proc/net/netstat in TcpExt line as TCPZeroWindowDrop. Signed-off-by: Yafang Shao <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-06-26Merge tag 'iio-for-4.19a' of ↵Greg Kroah-Hartman1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next Jonathan writes: First set of IIO new device support, features and cleanups in the 4.19 cycle The usual mixed bunch. Particular good to see is the generic touch screen driver. Will be interesting to see if this works for other ADCs without major changes. Core features * Channel types - New position relative channel type primarily for touch screen sensors to feed the generic touchscreen driver. New device support * ad5586 - Add support for the AD5311R DAC. * Generic touch screen driver as an IIO consumer. - Note this is in input, but due to dependencies is coming through the IIO tree. - Specific support for this added to the at91-sama5d2 ADC. - Various necessary DT bindings added. Staging Drops * ADIS16060 gyro - A device with a very odd interface that was never cleanly supported. It's now very difficult to get, so unlikely it'll ever be fixed up. Cleanups and minor features and fixes * core - Fix y2038 timestamp issues now the core support is in place. * 104-quad-8 - Provide some defines for magic numbers to help readability. - Fix an off by one error in register selection * ad7606 - Put in a missing function parameter name in a prototype. * adis16023 - Use generic sign_extend function rather than local version. * adis16240 - Use generic sign_extend funciton rather than local version. * at91-sama5d2 - Drop dependency on HAS_DMA now this is handled elsewhere. Will improve build test coverage. - Add oversampling ratio control. Note there is a minor ABI change here to increase the apparent depth to 14 bits so as to allow for transparent provision of different oversampling ratios that drop the actual bit depth to 13 or 12 bits. * hx711 - Add a MAINTAINERS entry for this device. * inv_mpu6050 - Replace the timestamp fifo 'special' code with generic timestamp handling. - Switch to using local store of timestamp divider rather than rate as that is more helpful for accurate time measurement. - Fix an unaligned access that didn't seem to be causing any trouble. - Use the fifo overflow bit to track the overflow status rather than a software counter. - New timestamping mechanism to deal with missed sample interrupts. * stm32-adc - Drop HAS_DMA build dependency. * sun4i-gpadc - Select REGMAP_IRQ a very rarely hit build issue fix.
2018-06-24time: Introduce struct __kernel_itimerspecDeepa Dinamani1-0/+7
struct itimerspec is not y2038-safe. Introduce a new struct __kernel_itimerspec based on the kernel internal y2038-safe struct itimerspec64. The definition of struct __kernel_itimerspec includes two struct __kernel_timespec. Since struct __kernel_timespec has the same representation in native and compat modes, so does struct __kernel_itimerspec. This helps have a common entry point for syscalls using struct __kernel_itimerspec. New y2038-safe syscalls will use this new type. Since most of the new syscalls are just an update to the native syscalls with the type update, place the new definition under CONFIG_64BIT_TIME. This helps architectures that do not support the above config to keep using the old definition of struct itimerspec. Also change the get/put_itimerspec64 to use struct__kernel_itimerspec. This will help 32 bit architectures to use the new syscalls when architectures select CONFIG_64BIT_TIME. Signed-off-by: Deepa Dinamani <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected]