path: root/include
Age | Commit message | Author | Files | Lines
2020-01-27netfilter: nf_tables: add NFTA_SET_ELEM_KEY_END attributePablo Neira Ayuso2-1/+15
Add NFTA_SET_ELEM_KEY_END attribute to convey the closing element of the interval between kernel and userspace. This patch also adds the NFT_SET_EXT_KEY_END extension to store the closing element value in this interval.
v4: No changes
v3: New patch [sbrivio: refactor error paths and labels; add corresponding nft_set_ext_type for new key; rebase]
Signed-off-by: Pablo Neira Ayuso <[email protected]>
2020-01-26Merge tag 'block-5.5-2020-01-26' of git://git.kernel.dk/linux-blockLinus Torvalds1-0/+12
Pull block fix from Jens Axboe:
 "Unfortunately this weekend we had a few last minute reports, one was for block. The partition disable for zoned devices was overly restrictive, it can work (and be supported) just fine for host-aware variants. Here's a fix ensuring that's the case so we don't break existing users of that"
* tag 'block-5.5-2020-01-26' of git://git.kernel.dk/linux-block:
  block: allow partitions on host aware zone devices
2020-01-26block: allow partitions on host aware zone devicesChristoph Hellwig1-0/+12
Host-aware SMR drives can be used with the commands to explicitly manage zone state, but they can also be used as normal disks. In the former case it makes perfect sense to allow partitions on them, in the latter it does not, just like for host managed devices. Add a check to add_partition to allow partitions on host aware devices, but give up any zone management capabilities in that case, which also catches the previously missed case of adding a partition vs just scanning it. Because sd can rescan the attribute at runtime it needs to check if a disk has partitions, for which a new helper is added to genhd.h. Fixes: 5eac3eb30c9a ("block: Remove partition support for zoned block devices") Reported-by: Borislav Petkov <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Tested-by: Damien Le Moal <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
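The new genhd.h helper mentioned above could look roughly like the following sketch; the helper name disk_has_partitions() and the part_tbl field layout are assumptions for illustration, not a quote of the merged code.

    /* Sketch: "does this disk have partitions?" helper for genhd.h.
     * Names and fields are illustrative assumptions.
     */
    static inline bool disk_has_partitions(struct gendisk *disk)
    {
        bool ret = false;

        rcu_read_lock();
        /* slot 0 is the whole-disk entry, so more than one slot means partitions */
        if (rcu_dereference(disk->part_tbl)->len > 1)
            ret = true;
        rcu_read_unlock();

        return ret;
    }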
2020-01-26tcp: export count for rehash attemptsAbdul Kabbani3-0/+5
Using the IPv6 flow label to swiftly route around congested or disconnected network paths can greatly improve TCP reliability. This patch adds SNMP counters and an OPT_STATS counter to track both host-level and connection-level statistics. Network administrators can use these counters to better evaluate the impact of this new capability. Export the count of rehash attempts to 1) two SNMP counters: TcpTimeoutRehash (rehash due to timeouts) and TcpDuplicateDataRehash (rehash due to receiving duplicate packets), and 2) the timestamping API SOF_TIMESTAMPING_OPT_STATS. Signed-off-by: Abdul Kabbani <[email protected]> Signed-off-by: Neal Cardwell <[email protected]> Signed-off-by: Yuchung Cheng <[email protected]> Signed-off-by: Kevin(Yudong) Yang <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-01-26sched.h: Annotate sighand_struct with __rcuMadhuparna Bhowmik1-1/+1
This patch fixes the following sparse errors by annotating sighand_struct with __rcu:
kernel/fork.c:1511:9: error: incompatible types in comparison expression
kernel/exit.c:100:19: error: incompatible types in comparison expression
kernel/signal.c:1370:27: error: incompatible types in comparison expression
This fix introduces the following sparse error in signal.c due to checking the sighand pointer without RCU primitives:
kernel/signal.c:1386:21: error: incompatible types in comparison expression
This new sparse error is also fixed in this patch. Signed-off-by: Madhuparna Bhowmik <[email protected]> Acked-by: Paul E. McKenney <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christian Brauner <[email protected]>
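Concretely, the annotation amounts to marking the task_struct member with __rcu so that sparse checks the accesses; a sketch of the annotated field (surrounding members elided):

    /* include/linux/sched.h, sketch of the annotated member */
    struct task_struct {
        /* ... */
        struct sighand_struct __rcu *sighand;
        /* ... */
    };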
2020-01-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller7-15/+51
Minor conflict in mlx5 because changes happened to code that has moved meanwhile. Signed-off-by: David S. Miller <[email protected]>
2020-01-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds4-8/+4
Pull networking fixes from David Miller:
 1) Off by one in mt76 airtime calculation, from Dan Carpenter.
 2) Fix TLV fragment allocation loop condition in iwlwifi, from Luca Coelho.
 3) Don't confirm neigh entries when doing ipsec pmtu updates, from Xu Wang.
 4) More checks to make sure we only send TSO packets to lan78xx chips that they can actually handle. From James Hughes.
 5) Fix ip_tunnel namespace move, from William Dauchy.
 6) Fix unintended packet reordering due to cooperation between listification done by GRO and non-GRO paths. From Maxim Mikityanskiy.
 7) Add Jakub Kicinski formally as networking co-maintainer.
 8) Info leak in airo ioctls, from Michael Ellerman.
 9) IFLA_MTU attribute needs validation during rtnl_create_link(), from Eric Dumazet.
10) Use after free during reload in mlxsw, from Ido Schimmel.
11) Dangling pointers are possible in tp->highest_sack, fix from Eric Dumazet.
12) Missing *pos++ in various networking seq_next handlers, from Vasily Averin.
13) CHELSIO_GET_MEM operation needs CAP_NET_ADMIN check, from Michael Ellerman.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (109 commits)
  firestream: fix memory leaks
  net: cxgb3_main: Add CAP_NET_ADMIN check to CHELSIO_GET_MEM
  net: bcmgenet: Use netif_tx_napi_add() for TX NAPI
  tipc: change maintainer email address
  net: stmmac: platform: fix probe for ACPI devices
  net/mlx5e: kTLS, Do not send decrypted-marked SKBs via non-accel path
  net/mlx5e: kTLS, Remove redundant posts in TX resync flow
  net/mlx5e: kTLS, Fix corner-case checks in TX resync flow
  net/mlx5e: Clear VF config when switching modes
  net/mlx5: DR, use non preemptible call to get the current cpu number
  net/mlx5: E-Switch, Prevent ingress rate configuration of uplink rep
  net/mlx5: DR, Enable counter on non-fwd-dest objects
  net/mlx5: Update the list of the PCI supported devices
  net/mlx5: Fix lowest FDB pool size
  net: Fix skb->csum update in inet_proto_csum_replace16().
  netfilter: nf_tables: autoload modules from the abort path
  netfilter: nf_tables: add __nft_chain_type_get()
  netfilter: nf_tables_offload: fix check the chain offload flag
  netfilter: conntrack: sctp: use distinct states for new SCTP connections
  ipv6_route_seq_next should increase position index
  ...
2020-01-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller3-8/+2
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Missing netlink attribute sanity check for NFTA_OSF_DREG, from Florian Westphal.
2) Use bitmap infrastructure in ipset to fix KASAN slab-out-of-bounds reads, from Jozsef Kadlecsik.
3) Missing initial CLOSED state in new sctp connection through ctnetlink events, from Jiri Wiesner.
4) Missing check for NFT_CHAIN_HW_OFFLOAD in nf_tables offload indirect block infrastructure, from wenxu.
5) Add __nft_chain_type_get() to sanity check family and chain type.
6) Autoload modules from the nf_tables abort path to fix races reported by syzbot.
7) Remove unnecessary skb->csum update on inet_proto_csum_replace16(), from Praveen Chaudhary.
====================
Signed-off-by: David S. Miller <[email protected]>
2020-01-25bpf: Allow to resolve bpf trampoline and dispatcher in unwindJiri Olsa1-1/+11
When unwinding the stack we need to identify each address to successfully continue. Add a latch tree to keep trampolines for quick lookup during the unwind. The patch uses the first 48 bytes for the latch tree node, leaving 4048 bytes of the rest of the page for trampoline or dispatcher generated code. That is still enough not to affect the maximum counts of trampoline and dispatcher programs. Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
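As a rough sketch of the lookup side, the unwinder can ask a latch tree whether an address falls inside generated code; the identifiers bpf_tree, bpf_tree_ops and is_bpf_image_address() below are illustrative assumptions, not the patch's exact names.

    /* Sketch: check whether @addr lies in trampoline/dispatcher generated code. */
    static struct latch_tree_root bpf_tree;           /* assumed name */
    static const struct latch_tree_ops bpf_tree_ops;  /* .less/.comp callbacks elided */

    static bool is_bpf_image_address(unsigned long addr)
    {
        struct latch_tree_node *n;

        rcu_read_lock();
        n = latch_tree_find((void *)addr, &bpf_tree, &bpf_tree_ops);
        rcu_read_unlock();

        return n != NULL;
    }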
2020-01-25net: sched: Make TBF Qdisc offloadablePetr Machata2-0/+23
Invoke ndo_setup_tc as appropriate to signal init / replacement, destroying and dumping of TBF Qdisc. Signed-off-by: Petr Machata <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-01-25Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcuIngo Molnar10-95/+206
Pull RCU updates from Paul E. McKenney:
 - Expedited grace-period updates
 - kfree_rcu() updates
 - RCU list updates
 - Preemptible RCU updates
 - Torture-test updates
 - Miscellaneous fixes
 - Documentation updates
Signed-off-by: Ingo Molnar <[email protected]>
2020-01-25jbd2: clean __jbd2_journal_abort_hard() and __journal_abort_soft()zhangyi (F)1-1/+0
__jbd2_journal_abort_hard() is no longer used, so we can now fold __jbd2_journal_abort_hard() and __journal_abort_soft() into jbd2_journal_abort() and remove both functions. Signed-off-by: zhangyi (F) <[email protected]> Reviewed-by: Jan Kara <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2020-01-25mptcp: do not inherit inet proto opsFlorian Westphal1-0/+3
We need to initialise the struct ourselves, else we expose tcp-specific callbacks such as tcp_splice_read, which will then trigger a splat because the socket is an mptcp one:
BUG: KASAN: slab-out-of-bounds in tcp_mstamp_refresh+0x80/0xa0 net/ipv4/tcp_output.c:57
Write of size 8 at addr ffff888116aa21d0 by task syz-executor.0/5478
CPU: 1 PID: 5478 Comm: syz-executor.0 Not tainted 5.5.0-rc6 #3
Call Trace:
 tcp_mstamp_refresh+0x80/0xa0 net/ipv4/tcp_output.c:57
 tcp_rcv_space_adjust+0x72/0x7f0 net/ipv4/tcp_input.c:612
 tcp_read_sock+0x622/0x990 net/ipv4/tcp.c:1674
 tcp_splice_read+0x20b/0xb40 net/ipv4/tcp.c:791
 do_splice+0x1259/0x1560 fs/splice.c:1205
To prevent a build error with ipv6, add the recv/sendmsg function declarations to ipv6.h. The functions are already accessible "thanks" to retpoline-related work, but they are currently only made visible by socket.c-specific INDIRECT_CALLABLE macros. Reported-by: Christoph Paasch <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-01-25ext4: fix extent_status trace pointsDmitry Monakhov1-4/+4
Show pblock only if it has meaningful value.
# before
ext4:ext4_es_lookup_extent_exit: dev 253,0 ino 12 found 1 [1/4294967294) 576460752303423487 H
ext4:ext4_es_lookup_extent_exit: dev 253,0 ino 12 found 1 [2/4294967293) 576460752303423487 HR
# after
ext4:ext4_es_lookup_extent_exit: dev 253,0 ino 12 found 1 [1/4294967294) 0 H
ext4:ext4_es_lookup_extent_exit: dev 253,0 ino 12 found 1 [2/4294967293) 0 HR
Signed-off-by: Dmitry Monakhov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2020-01-25ext4: fix symbolic enum printing in trace outputDmitry Monakhov1-1/+18
Trace's __print_flags() macro produces a raw event declaration without knowing the actual flag values:
cat /sys/kernel/debug/tracing/events/ext4/ext4_ext_map_blocks_exit/format
..
__print_flags(REC->mflags, "", { (1 << BH_New),
For that reason we have to explicitly define them via the special macro TRACE_DEFINE_ENUM(). Also add the missed EXTENT_STATUS_REFERENCED flag.
#Before patch
ext4:ext4_ext_map_blocks_exit: dev 253,0 ino 2 flags lblk 0 pblk 4177 len 1 mflags 0x20 ret 1
ext4:ext4_ext_map_blocks_exit: dev 253,0 ino 12 flags CREATE lblk 0 pblk 34304 len 1 mflags 0x60 ret 1
#With patch
ext4:ext4_ext_map_blocks_exit: dev 253,0 ino 2 flags lblk 0 pblk 4177 len 1 mflags M ret 1
ext4:ext4_ext_map_blocks_exit: dev 253,0 ino 12 flags CREATE lblk 0 pblk 34816 len 1 mflags NM ret 1
Signed-off-by: Dmitry Monakhov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Theodore Ts'o <[email protected]>
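The pattern described above looks roughly like this: export each flag value with TRACE_DEFINE_ENUM() so __print_flags() can resolve it in the format file. The flag list and display letters below are abridged and illustrative.

    /* Sketch: make the flag values visible to trace parsers. */
    TRACE_DEFINE_ENUM(EXTENT_STATUS_WRITTEN);
    TRACE_DEFINE_ENUM(EXTENT_STATUS_UNWRITTEN);
    TRACE_DEFINE_ENUM(EXTENT_STATUS_DELAYED);
    TRACE_DEFINE_ENUM(EXTENT_STATUS_HOLE);
    TRACE_DEFINE_ENUM(EXTENT_STATUS_REFERENCED);

    #define show_extent_status(status) __print_flags(status, "",  \
        { EXTENT_STATUS_WRITTEN,    "W" },                         \
        { EXTENT_STATUS_UNWRITTEN,  "U" },                         \
        { EXTENT_STATUS_DELAYED,    "D" },                         \
        { EXTENT_STATUS_HOLE,       "H" },                         \
        { EXTENT_STATUS_REFERENCED, "R" })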
2020-01-24alarmtimer: Make alarmtimer_get_rtcdev() a stub when CONFIG_RTC_CLASS=nStephen Boyd1-0/+4
The stubbed version of alarmtimer_get_rtcdev() is not exported, so this won't work if the function is used in a module when CONFIG_RTC_CLASS=n. Move the stub function to the header file and make it inline so that callers don't have to worry about linking against this symbol. rtcdev isn't used outside of this ifdef, so it's not required to be redefined to NULL; drop that while touching this area. Signed-off-by: Stephen Boyd <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
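The shape of the change is roughly as follows, a sketch of an inline stub in the header when CONFIG_RTC_CLASS is disabled (exact header layout may differ):

    /* include/linux/alarmtimer.h (sketch) */
    #ifdef CONFIG_RTC_CLASS
    struct rtc_device *alarmtimer_get_rtcdev(void);
    #else
    static inline struct rtc_device *alarmtimer_get_rtcdev(void)
    {
        return NULL;
    }
    #endif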
2020-01-24netfilter: nf_tables: autoload modules from the abort pathPablo Neira Ayuso2-1/+2
This patch introduces a list of pending module requests. This new module list is composed of nft_module_request objects that contain the module name and a status field that tells whether the module has already been loaded (the 'done' field).
In the first pass, from the preparation phase, the netlink command finds that a module is missing on this list. Then, a module request is allocated and added to this list and nft_request_module() returns -EAGAIN. This triggers the abort path with the autoload parameter set on from nfnetlink, request_module() is called and the module request enters the 'done' state. Since the mutex is released when loading modules from the abort phase, the module list is zapped so this iteration occurs over a local list. Therefore, the request_module() calls happen when the object lists are in a consistent state (after fully aborting the transaction) and the commit list is empty.
On the second pass, the netlink command will find that it already tried to load the module, so it does not request it again and nft_request_module() returns 0. Then, there is a lookup to find the object that the command was missing. If the module was successfully loaded, the command proceeds normally since it finds the missing object in place; otherwise -ENOENT is reported to userspace.
This patch also updates nfnetlink to include the reason for entering the abort phase, which is required for this new module autoload rationale.
Fixes: ec7470b834fe ("netfilter: nf_tables: store transaction list locally while requesting module") Reported-by: [email protected] Signed-off-by: Pablo Neira Ayuso <[email protected]>
2020-01-24smp: Remove allocation mask from on_each_cpu_cond.*()Sebastian Andrzej Siewior1-3/+2
The allocation mask is no longer used by on_each_cpu_cond() and on_each_cpu_cond_mask() and can be removed. Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-01-24smp: Use smp_cond_func_t as type for the conditional functionSebastian Andrzej Siewior1-6/+6
Use a typedef for the conditional function instead of defining it each time in the function prototype. Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
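For reference, a sketch of the shared type and one of the affected prototypes (argument list abridged; treat details as illustrative):

    /* Sketch: one shared type for the "should func run on this CPU?" predicate. */
    typedef bool (*smp_cond_func_t)(int cpu, void *info);

    void on_each_cpu_cond_mask(smp_cond_func_t cond_func, smp_call_func_t func,
                               void *info, bool wait, const struct cpumask *mask);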
2020-01-24Merge tag 'irqchip-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/coreThomas Gleixner4-9/+101
Pull irqchip updates from Marc Zyngier:
 - Conversion of the SiFive PLIC to hierarchical domains
 - New SiFive GPIO irqchip driver
 - New Aspeed SCI irqchip driver
 - New NXP INTMUX irqchip driver
 - Additional support for the Meson A1 GPIO irqchip
 - First part of the GICv4.1 support
 - Assorted fixes
2020-01-24Merge branches 'doc.2019.12.10a', 'exp.2019.12.09a', 'fixes.2020.01.24a', 'kfree_rcu.2020.01.24a', 'list.2020.01.10a', 'preempt.2020.01.24a' and 'torture.2019.12.09a' into HEADPaul E. McKenney10-95/+206
doc.2019.12.10a: Documentation updates
exp.2019.12.09a: Expedited grace-period updates
fixes.2020.01.24a: Miscellaneous fixes
kfree_rcu.2020.01.24a: Batch kfree_rcu() work
list.2020.01.10a: RCU-protected-list updates
preempt.2020.01.24a: Preemptible RCU updates
torture.2019.12.09a: Torture-test updates
2020-01-24rcu: Move rcu_{expedited,normal} definitions into rcupdate.hBen Dooks1-0/+4
This commit moves the rcu_{expedited,normal} definitions from kernel/rcu/update.c to include/linux/rcupdate.h to make sure they are in sync, and also to avoid the following warning from sparse: kernel/ksysfs.c:150:5: warning: symbol 'rcu_expedited' was not declared. Should it be static? kernel/ksysfs.c:167:5: warning: symbol 'rcu_normal' was not declared. Should it be static? Signed-off-by: Ben Dooks <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
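In effect the change boils down to a pair of shared declarations so that kernel/ksysfs.c and kernel/rcu/update.c see the same prototype; a sketch:

    /* include/linux/rcupdate.h (sketch) */
    extern int rcu_expedited;
    extern int rcu_normal;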
2020-01-24rcu: Remove kfree_call_rcu_nobatch()Joel Fernandes (Google)2-6/+0
Now that the kfree_rcu() special-casing has been removed from tree RCU, this commit removes kfree_call_rcu_nobatch() since it is no longer needed. Signed-off-by: Joel Fernandes (Google) <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2020-01-24rcu: Remove kfree_rcu() special casing and lazy-callback handlingJoel Fernandes (Google)2-22/+12
This commit removes kfree_rcu() special-casing and the lazy-callback handling from Tree RCU. It moves some of this special casing to Tiny RCU, the removal of which will be the subject of later commits. This results in a nice negative delta. Suggested-by: Paul E. McKenney <[email protected]> Signed-off-by: Joel Fernandes (Google) <[email protected]> [ paulmck: Add slab.h #include, thanks to kbuild test robot <[email protected]>. ] Signed-off-by: Paul E. McKenney <[email protected]>
2020-01-24rcu: Add basic support for kfree_rcu() batchingByungchul Park2-0/+8
Recently a discussion about stability and performance of a system involving a high rate of kfree_rcu() calls surfaced on the list [1], which led to another discussion on how to prepare for this situation.
This patch adds basic batching support for kfree_rcu(). It is "basic" because we do none of the slab management, dynamic allocation, code moving or any of the other things, some of which previous attempts did [2]. These fancier improvements can be follow-up patches, and there are different ideas being discussed in those regards. This is an effort to start simple and build up from there. In the future, an extension to use kfree_bulk and possibly per-slab batching could be done to further improve performance due to cache locality and slab-specific bulk-free optimizations. By using an array of pointers, the worker thread processing the work would need to read less data since it no longer needs to deal with large rcu_head(s).
Torture tests follow in the next patch and show improvements of around 5x reduction in number of grace periods on a 16 CPU system. More details and test data are in that patch.
There is an implication for rcu_barrier() with this patch. Since the kfree_rcu() calls can be batched and may not yet have been handed to the RCU machinery (in fact, the monitor may not even have run yet to do the queue_rcu_work()), there seems to be no easy way of implementing rcu_barrier() to wait for kfree_rcu()s that have already been made. So this means a kfree_rcu() followed by an rcu_barrier() does not imply that memory will be freed once rcu_barrier() returns.
Another implication is higher active memory usage (although not run-away..) until the kfree_rcu() flooding ends, in comparison to without batching. More details about this are in the second patch, which adds an rcuperf test.
Finally, in the near future we will get rid of kfree_rcu() special casing within RCU such as in rcu_do_batch and switch everything to just batching. Currently we don't do that since the timer subsystem is not yet up and we cannot schedule the kfree_rcu() monitor as the timer subsystem's locks are not initialized. That would also mean getting rid of kfree_call_rcu_nobatch() entirely.
[1] http://lore.kernel.org/lkml/[email protected] [2] https://lkml.org/lkml/2017/12/19/824
Cc: [email protected] Cc: [email protected] Co-developed-by: Byungchul Park <[email protected]> Signed-off-by: Byungchul Park <[email protected]> Signed-off-by: Joel Fernandes (Google) <[email protected]> [ paulmck: Applied 0day and Paul Walmsley feedback on ->monitor_todo. ] [ paulmck: Make it work during early boot. ] [ paulmck: Add a crude early boot self-test. ] [ paulmck: Style adjustments and experimental docbook structure header. ] Link: https://lore.kernel.org/lkml/[email protected]/T/#me9956f66cb611b95d26ae92700e1d901f46e8c59 Signed-off-by: Paul E. McKenney <[email protected]>
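For context, the call site that benefits from batching is the usual kfree_rcu() pattern, which is unchanged for callers; a small sketch:

    /* Sketch: a kfree_rcu() caller; batching is transparent to it. */
    struct foo {
        int data;
        struct rcu_head rcu;
    };

    static void release_foo(struct foo *fp)
    {
        /* Queued for freeing after a grace period; with batching, many such
         * requests are coalesced before reaching the RCU machinery.
         */
        kfree_rcu(fp, rcu);
    }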
2020-01-24mptcp: parse and emit MP_CAPABLE option according to v1 specChristoph Paasch2-8/+12
This implements MP_CAPABLE option parsing and writing according to RFC 6824bis / RFC 8684: MPTCP v1. The local key is sent on syn/ack, and both keys are sent on the 3rd ack. MP_CAPABLE message lengths are updated accordingly. We need the skbuff to correctly emit the above, so we push the skbuff struct as an argument all the way from the tcp code to the relevant mptcp callbacks. When processing incoming MP_CAPABLE + data, build a full-blown DSS-like map info to simplify later processing. On child socket creation, we need to record the remote key, if available. Signed-off-by: Christoph Paasch <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-01-24mptcp: Implement MPTCP receive pathMat Martineau2-0/+18
Parses incoming DSS options and populates outgoing MPTCP ACK fields. MPTCP fields are parsed from the TCP option header and placed in an skb extension, allowing the upper MPTCP layer to access MPTCP options after the skb has gone through the TCP stack. The subflow implements its own data_ready() ops, which ensures that the pending data is in sequence - according to MPTCP seq number - dropping out-of-seq skbs. The DATA_READY bit flag is set if this is the case. This allows the MPTCP socket layer to determine if more data is available without having to consult the individual subflows. It additionally validates the current mapping and propagates EoF events to the connection socket. Co-developed-by: Paolo Abeni <[email protected]> Signed-off-by: Paolo Abeni <[email protected]> Co-developed-by: Peter Krystad <[email protected]> Signed-off-by: Peter Krystad <[email protected]> Co-developed-by: Davide Caratti <[email protected]> Signed-off-by: Davide Caratti <[email protected]> Co-developed-by: Matthieu Baerts <[email protected]> Signed-off-by: Matthieu Baerts <[email protected]> Co-developed-by: Florian Westphal <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: Christoph Paasch <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-01-24mptcp: Write MPTCP DSS headers to outgoing data packetsMat Martineau1-0/+1
Per-packet metadata required to write the MPTCP DSS option is written to the skb_ext area. One write to the socket may contain more than one packet of data, which is copied to page fragments and mapped in to MPTCP DSS segments with size determined by the available page fragments and the maximum mapping length allowed by the MPTCP specification. If do_tcp_sendpages() splits a DSS segment in to multiple skbs, that's ok - the later skbs can either have duplicated DSS mapping information or none at all, and the receiver can handle that. The current implementation uses the subflow frag cache and tcp sendpages to avoid excessive code duplication. More work is required to ensure that it works correctly under memory pressure and to support MPTCP-level retransmissions. The MPTCP DSS checksum is not yet implemented. Co-developed-by: Paolo Abeni <[email protected]> Signed-off-by: Paolo Abeni <[email protected]> Co-developed-by: Peter Krystad <[email protected]> Signed-off-by: Peter Krystad <[email protected]> Co-developed-by: Florian Westphal <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: Christoph Paasch <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-01-24mptcp: Handle MP_CAPABLE options for outgoing connectionsPeter Krystad2-0/+60
Add hooks to tcp_output.c to add MP_CAPABLE to an outgoing SYN request, to capture the MP_CAPABLE in the received SYN-ACK, and to add MP_CAPABLE to the final ACK of the three-way handshake. Use the .sk_rx_dst_set() handler in the subflow proto to capture when the responding SYN-ACK is received and notify the MPTCP connection layer. Co-developed-by: Paolo Abeni <[email protected]> Signed-off-by: Paolo Abeni <[email protected]> Co-developed-by: Florian Westphal <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Peter Krystad <[email protected]> Signed-off-by: Christoph Paasch <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-01-24mptcp: Associate MPTCP context with TCP socketPeter Krystad1-0/+3
Use ULP to associate a subflow_context structure with each TCP subflow socket. Creating these sockets requires new bind and connect functions to make sure ULP is set up immediately when the subflow sockets are created. Co-developed-by: Florian Westphal <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Co-developed-by: Matthieu Baerts <[email protected]> Signed-off-by: Matthieu Baerts <[email protected]> Co-developed-by: Davide Caratti <[email protected]> Signed-off-by: Davide Caratti <[email protected]> Co-developed-by: Paolo Abeni <[email protected]> Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Peter Krystad <[email protected]> Signed-off-by: Christoph Paasch <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-01-24mptcp: Handle MPTCP TCP optionsPeter Krystad2-0/+36
Add hooks to parse and format the MP_CAPABLE option. This option is handled according to MPTCP version 0 (RFC6824). MPTCP version 1 MP_CAPABLE (RFC6824bis/RFC8684) will be added later in coordination with related code changes. Co-developed-by: Matthieu Baerts <[email protected]> Signed-off-by: Matthieu Baerts <[email protected]> Co-developed-by: Florian Westphal <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Co-developed-by: Davide Caratti <[email protected]> Signed-off-by: Davide Caratti <[email protected]> Signed-off-by: Peter Krystad <[email protected]> Signed-off-by: Christoph Paasch <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-01-24mptcp: Add MPTCP socket stubsMat Martineau1-0/+16
Implements the infrastructure for MPTCP sockets. MPTCP sockets open one in-kernel TCP socket per subflow. These subflow sockets are only managed by the MPTCP socket that owns them and are not visible from userspace. This commit allows a userspace program to open an MPTCP socket with: sock = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP); The resulting socket is simply a wrapper around a single regular TCP socket, without any of the MPTCP protocol implemented over the wire. Co-developed-by: Florian Westphal <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Co-developed-by: Peter Krystad <[email protected]> Signed-off-by: Peter Krystad <[email protected]> Co-developed-by: Matthieu Baerts <[email protected]> Signed-off-by: Matthieu Baerts <[email protected]> Co-developed-by: Paolo Abeni <[email protected]> Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: Christoph Paasch <[email protected]> Signed-off-by: David S. Miller <[email protected]>
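A minimal userspace sketch of the call shown above; IPPROTO_MPTCP may be missing from older libc headers, so it is defined by hand here, and its value (262) is an assumption taken from the MPTCP series.

    /* Sketch: open an MPTCP socket from userspace. */
    #include <stdio.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    #ifndef IPPROTO_MPTCP
    #define IPPROTO_MPTCP 262   /* assumed value from the MPTCP patches */
    #endif

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);

        if (fd < 0) {
            perror("socket(IPPROTO_MPTCP)");
            return 1;
        }
        /* At this stage of the series the socket behaves like plain TCP. */
        return 0;
    }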
2020-01-24net: bridge: vlan: add per-vlan stateNikolay Aleksandrov1-0/+1
The first per-vlan option added is state; it is needed for EVPN and for per-vlan STP. The state allows controlling forwarding on a per-vlan basis. The vlan state is considered only if the port state is forwarding, in order to avoid conflicts and be consistent. br_allowed_egress is called only when the state is forwarding, but the ingress case is a bit more complicated due to the fact that we may have the transition port:BR_STATE_FORWARDING -> vlan:BR_STATE_LEARNING, which should still allow the bridge to learn from the packet after vlan filtering; it will be dropped after that. Also, to optimize the pvid state check we keep a copy in the vlan group to avoid one lookup. The state members are modified with *_ONCE() to annotate the lockless access. Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-01-24net: bridge: vlan: add basic option setting supportNikolay Aleksandrov1-0/+1
This patch adds support for option modification of single vlans and ranges. It allows modifying only the options, i.e. skipping create/delete, by using the BRIDGE_VLAN_INFO_ONLY_OPTS flag. When working with range option changes we try to pack the notifications as much as possible.
v2: do full port (all vlans) notification only when creating/deleting vlans for compatibility, rework the range detection when changing options, add more verbose extack errors and check if a vlan should be used (br_vlan_should_use checks)
Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-01-24KVM: Introduce kvm_vcpu_destroy()Sean Christopherson1-0/+1
Add kvm_vcpu_destroy() and wire up all architectures to call the common function instead of their arch specific implementation. The common destruction function will be used by future patches to move allocation and initialization of vCPUs to common KVM code, i.e. to free resources that are allocated by arch agnostic code. No functional change intended. Acked-by: Christoffer Dall <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Reviewed-by: Cornelia Huck <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-01-24KVM: Add kvm_arch_vcpu_precreate() to handle pre-allocation issuesSean Christopherson1-0/+1
Add a pre-allocation arch hook to handle checks that are currently done by arch specific code prior to allocating the vCPU object. This paves the way for moving the allocation to common KVM code. Acked-by: Christoffer Dall <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Reviewed-by: Cornelia Huck <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-01-24KVM: Remove kvm_arch_vcpu_free() declarationSean Christopherson1-1/+0
Remove KVM's declaration of kvm_arch_vcpu_free() now that the function is gone from all architectures (several architectures were relying on the forward declaration). Acked-by: Christoffer Dall <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-01-24dmaengine: Create symlinks between DMA channels and slavesGeert Uytterhoeven1-0/+4
Currently it is not easy to find out which DMA channels are in use, and which slave devices are using which channels. Fix this by creating two symlinks between the DMA channel and the actual slave device when a channel is requested:
1. A "slave" symlink from the DMA channel to the slave device,
2. A "dma:<name>" symlink from the slave device to the DMA channel.
When the channel is released, the symlinks are removed again. The latter requires keeping track of the slave device and the channel name in the dma_chan structure. Note that this is limited to channel request functions for requesting an exclusive slave channel that take a device pointer (dma_request_chan() and dma_request_slave_channel*()). Signed-off-by: Geert Uytterhoeven <[email protected]> Tested-by: Niklas Söderlund <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Vinod Koul <[email protected]>
2020-01-24dmaengine: idxd: Init and probe for Intel data acceleratorsDave Jiang1-0/+228
The idxd driver introduces the Intel Data Streaming Accelerator [1] that will be available on future Intel Xeon CPUs. One of the kernel access points for the driver is the dmaengine subsystem. It will initially provide the DMA copy service to the kernel. Some of the main features introduced with this accelerator are: shared virtual memory (SVM) support, and descriptor submission using the Intel CPU instructions movdir64b and enqcmds. There will be additional accelerator devices that share the same driver, with variations to capabilities. This commit introduces the probe and initialization component of the driver. [1]: https://software.intel.com/en-us/download/intel-data-streaming-accelerator-preliminary-architecture-specification Signed-off-by: Dave Jiang <[email protected]> Link: https://lore.kernel.org/r/157965023991.73301.6186843973135311580.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <[email protected]>
2020-01-24dmaengine: add support to dynamic register/unregister of channelsDave Jiang1-0/+4
With the channel registration routines broken out, now add support code to allow independent registering and unregistering of channels in a hotplug fashion. Signed-off-by: Dave Jiang <[email protected]> Link: https://lore.kernel.org/r/157965023364.73301.7821862091077299040.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <[email protected]>
2020-01-23hwmon: (pmbus) Detect if chip is write protectedGuenter Roeck1-1/+10
If a chip is write protected, we can not change any limits, and we can not clear status flags. This may be the reason why clearing status flags is reported to not work for some chips. Detect the condition in the pmbus core. If the chip is write protected, set limit attributes as read-only, and set the flag indicating that the status flag should be ignored. Signed-off-by: Guenter Roeck <[email protected]>
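A sketch of how such detection could look, reading the standard PMBus WRITE_PROTECT command; the command code and bit masks follow the PMBus specification, but the identifiers below are assumptions, not the driver's exact ones.

    /* Sketch: detect write protection via the PMBus WRITE_PROTECT command. */
    #define PMBUS_WRITE_PROTECT   0x10  /* standard PMBus command code */
    #define PB_WP_DISABLE_ALL     0x80  /* all writes disabled */
    #define PB_WP_OP_ONLY         0x40  /* only OPERATION writable */
    #define PB_WP_OP_VOUT_ONLY    0x20  /* OPERATION/ON_OFF_CONFIG/VOUT_COMMAND writable */

    static bool chip_is_write_protected(struct i2c_client *client)
    {
        int ret = i2c_smbus_read_byte_data(client, PMBUS_WRITE_PROTECT);

        return ret > 0 &&
               (ret & (PB_WP_DISABLE_ALL | PB_WP_OP_ONLY | PB_WP_OP_VOUT_ONLY));
    }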
2020-01-23hwmon: Add support for enable attributes to hwmon coreGuenter Roeck1-3/+15
The hwmon ABI supports enable attributes since commit fb41a710f84e ("hwmon: Document the sensor enable attribute"), but did not add support for those attributes to the hwmon core. Do that now. Since the enable attributes are logically the most important attributes, they are added as first attribute to the attribute list. Move hwmon_in_enable from last to first place for consistency. Signed-off-by: Guenter Roeck <[email protected]>
2020-01-23hwmon: Add intrusion templatesDr. David Alan Gilbert1-0/+8
Add templates for intrusion%d_alarm and intrusion%d_beep. Note, these start at 0. Signed-off-by: Dr. David Alan Gilbert <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Guenter Roeck <[email protected]>
2020-01-23Merge tag 'xarray-5.5' of git://git.infradead.org/users/willy/linux-daxLinus Torvalds1-5/+40
Pull XArray fixes from Matthew Wilcox:
 "Primarily bugfixes, mostly around handling index wrap-around correctly. A couple of doc fixes and adding missing APIs.
 I had an oops live on stage at linux.conf.au this year, and it turned out to be a bug in xas_find() which I can't prove isn't triggerable in the current codebase. Then in looking for the bug, I spotted two more bugs.
 The bots have had a few days to chew on this with no problems reported, and it passes the test-suite (which now has more tests to make sure these problems don't come back)"
* tag 'xarray-5.5' of git://git.infradead.org/users/willy/linux-dax:
  XArray: Add xa_for_each_range
  XArray: Fix xas_find returning too many entries
  XArray: Fix xa_find_after with multi-index entries
  XArray: Fix infinite loop with entry at ULONG_MAX
  XArray: Add wrappers for nested spinlocks
  XArray: Improve documentation of search marks
  XArray: Fix xas_pause at ULONG_MAX
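Of the added APIs, xa_for_each_range() is the most visible; a small usage sketch (the array, bounds and process() helper are made up for illustration):

    /* Sketch: visit present entries whose index falls within [16, 255]. */
    void *entry;
    unsigned long index;

    xa_for_each_range(&my_xarray, index, entry, 16, 255) {
        /* entry is non-NULL here; index is its position in the XArray */
        process(entry);
    }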
2020-01-23Merge tag 'trace-v5.5-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-traceLinus Torvalds1-1/+5
Pull tracing fixes from Steven Rostedt:
 "Various tracing fixes:
 - Fix a function comparison warning for a xen trace event macro
 - Fix a double perf_event linking to a trace_uprobe_filter for multiple events
 - Fix suspicious RCU warnings in trace event code for using list_for_each_entry_rcu() when the "_rcu" portion wasn't needed.
 - Fix a bug in the histogram code when using the same variable
 - Fix a NULL pointer dereference when tracefs lockdown enabled and calling trace_set_default_clock()
 - A fix to a bug found with the double perf_event linking patch"
* tag 'trace-v5.5-rc6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing/uprobe: Fix to make trace_uprobe_filter alignment safe
  tracing: Do not set trace clock if tracefs lockdown is in effect
  tracing: Fix histogram code when expression has same var as value
  tracing: trigger: Replace unneeded RCU-list traversals
  tracing/uprobe: Fix double perf_event linking on multiprobe uprobe
  tracing: xen: Ordered comparison of function pointers
2020-01-23mm: remove arch_bprm_mm_init() hookDave Hansen1-5/+0
From: Dave Hansen <[email protected]> MPX is being removed from the kernel due to a lack of support in the toolchain going forward (gcc). arch_bprm_mm_init() is used at execve() time. The only non-stub implementation is on x86 for MPX. Remove the hook entirely from all architectures and generic code. Cc: Peter Zijlstra (Intel) <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: [email protected] Cc: Linus Torvalds <[email protected]> Cc: [email protected] Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Jeff Dike <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Anton Ivanov <[email protected]> Cc: Guan Xuetao <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Dave Hansen <[email protected]>
2020-01-23bcache: print written and keys in trace_bcache_btree_writeGuoju Fang1-1/+2
It's useful to dump the written block and keys on btree write; this patch adds them to trace_bcache_btree_write. Signed-off-by: Guoju Fang <[email protected]> Signed-off-by: Coly Li <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2020-01-23bcache: use read_cache_page_gfp to read the superblockChristoph Hellwig1-0/+1
Avoid a pointless dependency on buffer heads in bcache by simply open coding reading a single page. Also add a SB_OFFSET define for the byte offset of the superblock instead of using magic numbers. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Coly Li <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
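The open-coded read described above might look roughly like the fragment below; SB_OFFSET is the new byte-offset define mentioned in the message, and error handling is abridged:

    /* Sketch: read the page containing the superblock without buffer heads. */
    page = read_cache_page_gfp(bdev->bd_inode->i_mapping,
                               SB_OFFSET >> PAGE_SHIFT, GFP_KERNEL);
    if (IS_ERR(page))
        return "IO error";
    sb = page_address(page) + offset_in_page(SB_OFFSET);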
2020-01-23bcache: use a separate data structure for the on-disk super blockChristoph Hellwig1-0/+51
Split out an on-disk version struct cache_sb with the proper endianness annotations. This fixes a fair chunk of sparse warnings, but there are some left due to the way the checksum is defined. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Coly Li <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
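The split roughly means keeping an on-disk struct with fixed-endian member types alongside the in-memory struct cache_sb; an illustrative fragment (field list abridged, not authoritative):

    /* Sketch: on-disk layout with explicit little-endian annotations. */
    struct cache_sb_disk {
        __le64  csum;
        __le64  offset;     /* sector where this superblock was read from */
        __le64  version;
        __u8    magic[16];
        __u8    uuid[16];
        /* ... remaining fields mirror the in-memory struct cache_sb ... */
    };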
2020-01-23usb: phy: phy-gpio-vbus-usb: Convert to GPIO descriptorsLinus Walleij1-33/+0
Instead of using the legacy GPIO API and keeping track of polarity inversion semantics in the driver, switch to using GPIO descriptors for this driver and change all consumers in the process. This makes it possible to retire platform data completely: the only remaining platform data member was "wakeup", which was intended to make the vbus interrupt wakeup capable, but it was not set by any users and thus remained unused. VBUS was not waking any devices up. Leave a comment about it so later developers using the platform can consider setting it to always enabled, so plugging in USB wakes up the platform. Cc: Daniel Mack <[email protected]> Cc: Haojian Zhuang <[email protected]> Acked-by: Robert Jarzmik <[email protected]> Acked-by: Felipe Balbi <[email protected]> Acked-by: Sylwester Nawrocki <[email protected]> Acked-by: Philipp Zabel <[email protected]> Signed-off-by: Linus Walleij <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
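The gist of the conversion, sketched below: request a descriptor by function name and let gpiolib handle active-low polarity, instead of juggling raw GPIO numbers and manual inversion (identifiers are illustrative, not the driver's exact names).

    /* Sketch: descriptor-based VBUS GPIO handling in probe(). */
    struct gpio_desc *vbus_gpiod;
    int vbus_present;

    vbus_gpiod = devm_gpiod_get(&pdev->dev, "vbus", GPIOD_IN);
    if (IS_ERR(vbus_gpiod))
        return PTR_ERR(vbus_gpiod);

    /* Polarity (active high/low) is carried by the descriptor, so no
     * open-coded inversion is needed when reading the line.
     */
    vbus_present = gpiod_get_value(vbus_gpiod);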