aboutsummaryrefslogtreecommitdiff
path: root/include/net
AgeCommit message (Collapse)AuthorFilesLines
2020-10-09netlink: export policy in extended ACKJohannes Berg1-0/+4
Add a new attribute NLMSGERR_ATTR_POLICY to the extended ACK to advertise the policy, e.g. if an attribute was out of range, you'll know the range that's permissible. Add new NL_SET_ERR_MSG_ATTR_POL() and NL_SET_ERR_MSG_ATTR_POL() macros to set this, since realistically it's only useful to do this when the bad attribute (offset) is also returned. Use it in lib/nlattr.c which practically does all the policy validation. v2: - add and use netlink_policy_dump_attr_size_estimate() v3: - remove redundant break v4: - really remove redundant break ... sorry Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-10-09net/tls: remove a duplicate function prototypeRandy Dunlap1-4/+0
Remove one of the two instances of the function prototype for tls_validate_xmit_skb(). Signed-off-by: Randy Dunlap <[email protected]> Cc: Boris Pismenny <[email protected]> Cc: Aviad Yehezkel <[email protected]> Cc: John Fastabend <[email protected]> Cc: Daniel Borkmann <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-10-09devlink: Add enable_remote_dev_reset generic parameterMoshe Shemesh1-0/+4
The enable_remote_dev_reset devlink param flags that the host admin allows device resets that can be initiated by other hosts. This parameter is useful for setups where a device is shared by different hosts, such as multi-host setup. Once the user set this parameter to false, the driver should NACK any attempt to reset the device while the driver is loaded. Signed-off-by: Moshe Shemesh <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-10-09devlink: Add remote reload statsMoshe Shemesh1-0/+4
Add remote reload stats to hold the history of actions performed due devlink reload commands initiated by remote host. For example, in case firmware activation with reset finished successfully but was initiated by remote host. The function devlink_remote_reload_actions_performed() is exported to enable drivers update on remote reload actions performed as it was not initiated by their own devlink instance. Expose devlink remote reload stats to the user through devlink dev get command. Examples: $ devlink dev show pci/0000:82:00.0: stats: reload: driver_reinit 2 fw_activate 1 fw_activate_no_reset 0 remote_reload: driver_reinit 0 fw_activate 0 fw_activate_no_reset 0 pci/0000:82:00.1: stats: reload: driver_reinit 1 fw_activate 0 fw_activate_no_reset 0 remote_reload: driver_reinit 1 fw_activate 1 fw_activate_no_reset 0 $ devlink dev show -jp { "dev": { "pci/0000:82:00.0": { "stats": { "reload": { "driver_reinit": 2, "fw_activate": 1, "fw_activate_no_reset": 0 }, "remote_reload": { "driver_reinit": 0, "fw_activate": 0, "fw_activate_no_reset": 0 } } }, "pci/0000:82:00.1": { "stats": { "reload": { "driver_reinit": 1, "fw_activate": 0, "fw_activate_no_reset": 0 }, "remote_reload": { "driver_reinit": 1, "fw_activate": 1, "fw_activate_no_reset": 0 } } } } } Signed-off-by: Moshe Shemesh <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-10-09devlink: Add reload statsMoshe Shemesh1-0/+8
Add reload stats to hold the history per reload action type and limit. For example, the number of times fw_activate has been performed on this device since the driver module was added or if the firmware activation was performed with or without reset. Add devlink notification on stats update. Expose devlink reload stats to the user through devlink dev get command. Examples: $ devlink dev show pci/0000:82:00.0: stats: reload: driver_reinit 2 fw_activate 1 fw_activate_no_reset 0 pci/0000:82:00.1: stats: reload: driver_reinit 1 fw_activate 0 fw_activate_no_reset 0 $ devlink dev show -jp { "dev": { "pci/0000:82:00.0": { "stats": { "reload": { "driver_reinit": 2, "fw_activate": 1, "fw_activate_no_reset": 0 } } }, "pci/0000:82:00.1": { "stats": { "reload": { "driver_reinit": 1, "fw_activate": 0, "fw_activate_no_reset": 0 } } } } } Signed-off-by: Moshe Shemesh <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-10-09devlink: Add devlink reload limit optionMoshe Shemesh1-2/+6
Add reload limit to demand restrictions on reload actions. Reload limits supported: no_reset: No reset allowed, no down time allowed, no link flap and no configuration is lost. By default reload limit is unspecified and so no constraints on reload actions are required. Some combinations of action and limit are invalid. For example, driver can not reinitialize its entities without any downtime. The no_reset reload limit will have usecase in this patchset to implement restricted fw_activate on mlx5. Have the uapi parameter of reload limit ready for future support of multiselection. Signed-off-by: Moshe Shemesh <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-10-09devlink: Add reload action option to devlink reload commandMoshe Shemesh1-3/+4
Add devlink reload action to allow the user to request a specific reload action. The action parameter is optional, if not specified then devlink driver re-init action is used (backward compatible). Note that when required to do firmware activation some drivers may need to reload the driver. On the other hand some drivers may need to reset the firmware to reinitialize the driver entities. Therefore, the devlink reload command returns the actions which were actually performed. Reload actions supported are: driver_reinit: driver entities re-initialization, applying devlink-param and devlink-resource values. fw_activate: firmware activate. command examples: $devlink dev reload pci/0000:82:00.0 action driver_reinit reload_actions_performed: driver_reinit $devlink dev reload pci/0000:82:00.0 action fw_activate reload_actions_performed: driver_reinit fw_activate Signed-off-by: Moshe Shemesh <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Reviewed-by: Jacob Keller <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-10-09net/sched: get rid of qdisc->paddedEric Dumazet2-5/+5
kmalloc() of sufficiently big portion of memory is cache-aligned in regular conditions. If some debugging options are used, there is no reason qdisc structures would need 64-byte alignment if most other kernel structures are not aligned. This get rid of QDISC_ALIGN and QDISC_ALIGNTO. Addition of privdata field will help implementing the reverse of qdisc_priv() and documents where the private data is. Signed-off-by: Eric Dumazet <[email protected]> Cc: Allen Pais <[email protected]> Acked-by: Cong Wang <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-10-08mac80211: copy configured beacon tx rate to driverRajkumar Manoharan1-0/+3
The user is allowed to change beacon tx rate (HT/VHT/HE) from hostapd. This information needs to be passed to the driver when the rate control is offloaded to the firmware. The driver capability of allowing beacon rate is already validated in cfg80211, so simply passing the rate information to the driver is enough. Signed-off-by: Rajkumar Manoharan <[email protected]> Link: https://lore.kernel.org/r/[email protected] [adjust commit message slightly] Signed-off-by: Johannes Berg <[email protected]>
2020-10-06netlink: add mask validationJakub Kicinski1-0/+10
We don't have good validation policy for existing unsigned int attrs which serve as flags (for new ones we could use NLA_BITFIELD32). With increased use of policy dumping having the validation be expressed as part of the policy is important. Add validation policy in form of a mask of supported/valid bits. Support u64 in the uAPI to be future-proof, but really for now the embedded mask member can only hold 32 bits, so anything with bit 32+ set will always fail validation. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-06netlink: create helpers for checking type is an intJakub Kicinski1-8/+9
There's a number of policies which check if type is a uint or sint. Factor the checking against the list of value sizes to a helper for easier reuse. v2: - new patch Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller4-12/+13
Rejecting non-native endian BTF overlapped with the addition of support for it. The rest were more simple overlapping changes, except the renesas ravb binding update, which had to follow a file move as well as a YAML conversion. Signed-off-by: David S. Miller <[email protected]>
2020-10-05net: dsa: propagate switchdev vlan_filtering prepare phase to driversVladimir Oltean1-1/+2
A driver may refuse to enable VLAN filtering for any reason beyond what the DSA framework cares about, such as: - having tc-flower rules that rely on the switch being VLAN-aware - the particular switch does not support VLAN, even if the driver does (the DSA framework just checks for the presence of the .port_vlan_add and .port_vlan_del pointers) - simply not supporting this configuration to be toggled at runtime Currently, when a driver rejects a configuration it cannot support, it does this from the commit phase, which triggers various warnings in switchdev. So propagate the prepare phase to drivers, to give them the ability to refuse invalid configurations cleanly and avoid the warnings. Since we need to modify all function prototypes and check for the prepare phase from within the drivers, take that opportunity and move the existing driver restrictions within the prepare phase where that is possible and easy. Cc: Florian Fainelli <[email protected]> Cc: Martin Blumenstingl <[email protected]> Cc: Hauke Mehrtens <[email protected]> Cc: Woojung Huh <[email protected]> Cc: Microchip Linux Driver Support <[email protected]> Cc: Sean Wang <[email protected]> Cc: Landen Chao <[email protected]> Cc: Andrew Lunn <[email protected]> Cc: Vivien Didelot <[email protected]> Cc: Jonathan McDowell <[email protected]> Cc: Linus Walleij <[email protected]> Cc: Alexandre Belloni <[email protected]> Cc: Claudiu Manoil <[email protected]> Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-04net: dsa: Add helper for converting devlink port to ds and portAndrew Lunn1-0/+14
Hide away from DSA drivers how devlink works. Signed-off-by: Andrew Lunn <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-04net: dsa: Add devlink port regions support to DSAAndrew Lunn1-0/+5
Allow DSA drivers to make use of devlink port regions, via simple wrappers. Reviewed-by: Vladimir Oltean <[email protected]> Tested-by: Vladimir Oltean <[email protected]> Signed-off-by: Andrew Lunn <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-04net: devlink: Add support for port regionsAndrew Lunn1-0/+27
Allow regions to be registered to a devlink port. The same netlink API is used, but the port index is provided to indicate when a region is a port region as opposed to a device region. Reviewed-by: Vladimir Oltean <[email protected]> Tested-by: Vladimir Oltean <[email protected]> Signed-off-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-04net: dsa: Register devlink ports before calling DSA driver setup()Andrew Lunn1-0/+1
DSA drivers want to create regions on devlink ports as well as the devlink device instance, in order to export registers and other tables per port. To keep all this code together in the drivers, have the devlink ports registered early, so the setup() method can setup both device and port devlink regions. v3: Remove dp->setup Move common code out of switch statement. Fix wrong goto Signed-off-by: Andrew Lunn <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Tested-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller2-7/+13
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next: 1) Rename 'searched' column to 'clashres' in conntrack /proc/ stats to amend a recent patch, from Florian Westphal. 2) Remove unused nft_data_debug(), from YueHaibing. 3) Remove unused definitions in IPVS, also from YueHaibing. 4) Fix user data memleak in tables and objects, this is also amending a recent patch, from Jose M. Guisado. 5) Use nla_memdup() to allocate user data in table and objects, also from Jose M. Guisado 6) User data support for chains, from Jose M. Guisado 7) Remove unused definition in nf_tables_offload, from YueHaibing. 8) Use kvzalloc() in ip_set_alloc(), from Vasily Averin. 9) Fix false positive reported by lockdep in nfnetlink mutexes, from Florian Westphal. 10) Extend fast variant of cmp for neq operation, from Phil Sutter. 11) Implement fast bitwise variant, also from Phil Sutter. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-10-04netfilter: nf_tables: Implement fast bitwise expressionPhil Sutter1-0/+9
A typical use of bitwise expression is to mask out parts of an IP address when matching on the network part only. Optimize for this common use with a fast variant for NFT_BITWISE_BOOL-type expressions operating on 32bit-sized values. Signed-off-by: Phil Sutter <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2020-10-04netfilter: nf_tables: Enable fast nft_cmp for inverted matchesPhil Sutter1-0/+2
Add a boolean indicating NFT_CMP_NEQ. To include it into the match decision, it is sufficient to XOR it with the data comparison's result. While being at it, store the mask that is calculated during expression init and free the eval routine from having to recalculate it each time. Signed-off-by: Phil Sutter <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2020-10-03net/sched: act_vlan: Add {POP,PUSH}_ETH actionsGuillaume Nault1-0/+2
Implement TCA_VLAN_ACT_POP_ETH and TCA_VLAN_ACT_PUSH_ETH, to respectively pop and push a base Ethernet header at the beginning of a frame. POP_ETH is just a matter of pulling ETH_HLEN bytes. VLAN tags, if any, must be stripped before calling POP_ETH. PUSH_ETH is restricted to skbs with no mac_header, and only the MAC addresses can be configured. The Ethertype is automatically set from skb->protocol. These restrictions ensure that all skb's fields remain consistent, so that this action can't confuse other part of the networking stack (like GSO). Since openvswitch already had these actions, consolidate the code in skbuff.c (like for vlan and mpls push/pop). Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-03netlink: rework policy dump to support multiple policiesJohannes Berg1-3/+6
Rework the policy dump code a bit to support adding multiple policies to a single dump, in order to e.g. support per-op policies in generic netlink. v2: - move kernel-doc to implementation [Jakub] - squash the first patch to not flip-flop on the prototype [Jakub] - merge netlink_policy_dump_get_policy_idx() with the old get_policy_idx() we already had - rebase without Jakub's patch to have per-op dump Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-02genetlink: bring back per op policyJakub Kicinski1-0/+4
Add policy to the struct genl_ops structure, this time with maxattr, so it can be used properly. Propagate .policy and .maxattr from the family in genl_get_cmd() if needed, this way the rest of the code does not have to worry if the policy is per op or global. Signed-off-by: Jakub Kicinski <[email protected]> Reviewed-by: Johannes Berg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-02genetlink: add a structure for dump stateJakub Kicinski1-4/+7
Whenever netlink dump uses more than 2 cb->args[] entries code gets hard to read. We're about to add more state to ctrl_dumppolicy() so create a structure. Since the structure is typed and clearly named we can remove the local fam_id variable and use ctx->fam_id directly. v3: - rebase onto explicit free fix v1: - s/nl_policy_dump/netlink_policy_dump_state/ - forward declare struct netlink_policy_dump_state, and move from passing unsigned long to actual pointer type - add build bug on - u16 fam_id - s/args/ctx/ Signed-off-by: Jakub Kicinski <[email protected]> Reviewed-by: Johannes Berg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-02genetlink: add small version of opsJakub Kicinski1-14/+39
We want to add maxattr and policy back to genl_ops, to enable dumping per command policy to user space. This, however, would cause bloat for all the families with global policies. Introduce smaller version of ops (half the size of genl_ops). Translate these smaller ops into a full blown struct before use in the core. v1: - use struct assignment - put a full copy of the op in struct genl_dumpit_info - s/light/small/ Signed-off-by: Jakub Kicinski <[email protected]> Reviewed-by: Johannes Berg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-02genetlink: reorg struct genl_familyJakub Kicinski1-5/+5
There are holes and oversized members in struct genl_family. Before: /* size: 104, cachelines: 2, members: 16 */ After: /* size: 88, cachelines: 2, members: 16 */ The command field in struct genlmsghdr is a u8, so no point in the operation count being 32 bit. Also operation 0 is usually undefined, so we only need 255 entries. netnsok and parallel_ops are only ever initialized to true. We can grow the fields as needed, compiler should warn us if someone tries to assign larger constants. Signed-off-by: Jakub Kicinski <[email protected]> Reviewed-by: Johannes Berg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-02devlink: add .trap_group_action_set() callbackIoana Ciornei1-0/+10
Add a new devlink callback, .trap_group_action_set(), which can be used by device drivers which do not support controlling the action (drop, trap) on each trap but rather on the entire group trap. If this new callback is populated, it will take precedence over the .trap_action_set() callback when the user requests a change of all the traps in a group. Signed-off-by: Ioana Ciornei <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-02devlink: add parser error drop packet trapsIoana Ciornei1-0/+52
Add parser error drop packet traps, so that capable device driver could register them with devlink. The new packet trap group holds any drops of packets which were marked by the device as erroneous during header parsing. Add documentation for every added packet trap and packet trap group. Signed-off-by: Ioana Ciornei <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-02Merge tag 'mac80211-next-for-net-next-2020-10-02' of ↵David S. Miller2-1/+52
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== Another set of changes, this time with: * lots more S1G band support * 6 GHz scanning, finally * kernel-doc fixes * non-split wiphy dump fixes in nl80211 * various other small cleanups/features ==================== Signed-off-by: David S. Miller <[email protected]>
2020-10-02net: dsa: Call dsa_untag_bridge_pvid() from dsa_switch_rcv()Florian Fainelli1-0/+8
When a DSA switch driver needs to call dsa_untag_bridge_pvid(), it can set dsa_switch::untag_brige_pvid to indicate this is necessary. This is a pre-requisite to making sure that we are always calling dsa_untag_bridge_pvid() after eth_type_trans() has been called. Signed-off-by: Florian Fainelli <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-02Merge branch 'master' of ↵David S. Miller1-0/+33
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2020-10-02 1) Add a full xfrm compatible layer for 32-bit applications on 64-bit kernels. From Dmitry Safonov. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-10-02netlink: fix policy dump leakJohannes Berg1-1/+2
[ Upstream commit a95bc734e60449e7b073ff7ff70c35083b290ae9 ] If userspace doesn't complete the policy dump, we leak the allocated state. Fix this. Fixes: d07dcf9aadd6 ("netlink: add infrastructure to expose policies to userspace") Signed-off-by: Johannes Berg <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-02netlink: fix policy dump leakJohannes Berg1-1/+2
If userspace doesn't complete the policy dump, we leak the allocated state. Fix this. Fixes: d07dcf9aadd6 ("netlink: add infrastructure to expose policies to userspace") Signed-off-by: Johannes Berg <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-02bpf: tcp: Do not limit cb_flags when creating child sk from listen skMartin KaFai Lau1-33/+0
The commit 0813a841566f ("bpf: tcp: Allow bpf prog to write and parse TCP header option") unnecessarily introduced bpf_skops_init_child() which limited the child sk from inheriting all bpf_sock_ops_cb_flags of the listen sk. That breaks existing user expectation. This patch removes the bpf_skops_init_child() and just allows sock_copy() to do its job to copy everything from listen sk to the child sk. Fixes: 0813a841566f ("bpf: tcp: Allow bpf prog to write and parse TCP header option") Reported-by: Stanislav Fomichev <[email protected]> Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-10-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2-3/+1
Daniel Borkmann says: ==================== pull-request: bpf-next 2020-10-01 The following pull-request contains BPF updates for your *net-next* tree. We've added 90 non-merge commits during the last 8 day(s) which contain a total of 103 files changed, 7662 insertions(+), 1894 deletions(-). Note that once bpf(/net) tree gets merged into net-next, there will be a small merge conflict in tools/lib/bpf/btf.c between commit 1245008122d7 ("libbpf: Fix native endian assumption when parsing BTF") from the bpf tree and the commit 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness") from the bpf-next tree. Correct resolution would be to stick with bpf-next, it should look like: [...] /* check BTF magic */ if (fread(&magic, 1, sizeof(magic), f) < sizeof(magic)) { err = -EIO; goto err_out; } if (magic != BTF_MAGIC && magic != bswap_16(BTF_MAGIC)) { /* definitely not a raw BTF */ err = -EPROTO; goto err_out; } /* get file size */ [...] The main changes are: 1) Add bpf_snprintf_btf() and bpf_seq_printf_btf() helpers to support displaying BTF-based kernel data structures out of BPF programs, from Alan Maguire. 2) Speed up RCU tasks trace grace periods by a factor of 50 & fix a few race conditions exposed by it. It was discussed to take these via BPF and networking tree to get better testing exposure, from Paul E. McKenney. 3) Support multi-attach for freplace programs, needed for incremental attachment of multiple XDP progs using libxdp dispatcher model, from Toke Høiland-Jørgensen. 4) libbpf support for appending new BTF types at the end of BTF object, allowing intrusive changes of prog's BTF (useful for future linking), from Andrii Nakryiko. 5) Several BPF helper improvements e.g. avoid atomic op in cookie generator and add a redirect helper into neighboring subsys, from Daniel Borkmann. 6) Allow map updates on sockmaps from bpf_iter context in order to migrate sockmaps from one to another, from Lorenz Bauer. 7) Fix 32 bit to 64 bit assignment from latest alu32 bounds tracking which caused a verifier issue due to type downgrade to scalar, from John Fastabend. 8) Follow-up on tail-call support in BPF subprogs which optimizes x64 JIT prologue and epilogue sections, from Maciej Fijalkowski. 9) Add an option to perf RB map to improve sharing of event entries by avoiding remove- on-close behavior. Also, add BPF_PROG_TEST_RUN for raw_tracepoint, from Song Liu. 10) Fix a crash in AF_XDP's socket_release when memory allocation for UMEMs fails, from Magnus Karlsson. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-09-30drop_monitor: Filter control packets in drop monitorIdo Schimmel1-0/+2
Previously, devlink called into drop monitor in order to report hardware originated drops / exceptions. devlink intentionally filtered control packets and did not pass them to drop monitor as they were not dropped by the underlying hardware. Now drop monitor registers its probe on a generic 'devlink_trap_report' tracepoint and should therefore perform this filtering itself instead of having devlink do that. Add the trap type as metadata and have drop monitor ignore control packets. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-30drop_monitor: Convert to using devlink tracepointIdo Schimmel1-36/+0
Convert drop monitor to use the recently introduced 'devlink_trap_report' tracepoint instead of having devlink call into drop monitor. This is both consistent with software originated drops ('kfree_skb' tracepoint) and also allows drop monitor to be built as a module and still report hardware originated drops. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-30devlink: Add a tracepoint for trap reportsIdo Schimmel1-0/+14
Add a tracepoint for trap reports so that drop monitor could register its probe on it. Use trace_devlink_trap_report_enabled() to avoid wasting cycles setting the trap metadata if the tracepoint is not enabled. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-30tcp: add exponential backoff in __tcp_send_ack()Eric Dumazet1-1/+2
Whenever host is under very high memory pressure, __tcp_send_ack() skb allocation fails, and we setup a 200 ms (TCP_DELACK_MAX) timer before retrying. On hosts with high number of TCP sockets, we can spend considerable amount of cpu cycles in these attempts, add high pressure on various spinlocks in mm-layer, ultimately blocking threads attempting to free space from making any progress. This patch adds standard exponential backoff to avoid adding fuel to the fire. Signed-off-by: Eric Dumazet <[email protected]> Acked-by: Soheil Hassas Yeganeh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-30inet: remove icsk_ack.blockedEric Dumazet1-2/+2
TCP has been using it to work around the possibility of tcp_delack_timer() finding the socket owned by user. After commit 6f458dfb4092 ("tcp: improve latencies of timer triggered events") we added TCP_DELACK_TIMER_DEFERRED atomic bit for more immediate recovery, so we can get rid of icsk_ack.blocked This frees space that following patch will reuse. Signed-off-by: Eric Dumazet <[email protected]> Acked-by: Soheil Hassas Yeganeh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-30bpf, net: Rework cookie generator as per-cpu oneDaniel Borkmann1-1/+1
With its use in BPF, the cookie generator can be called very frequently in particular when used out of cgroup v2 hooks (e.g. connect / sendmsg) and attached to the root cgroup, for example, when used in v1/v2 mixed environments. In particular, when there's a high churn on sockets in the system there can be many parallel requests to the bpf_get_socket_cookie() and bpf_get_netns_cookie() helpers which then cause contention on the atomic counter. As similarly done in f991bd2e1421 ("fs: introduce a per-cpu last_ino allocator"), add a small helper library that both can use for the 64 bit counters. Given this can be called from different contexts, we also need to deal with potential nested calls even though in practice they are considered extremely rare. One idea as suggested by Eric Dumazet was to use a reverse counter for this situation since we don't expect 64 bit overflows anyways; that way, we can avoid bigger gaps in the 64 bit counter space compared to just batch-wise increase. Even on machines with small number of cores (e.g. 4) the cookie generation shrinks from min/max/med/avg (ns) of 22/50/40/38.9 down to 10/35/14/17.3 when run in parallel from multiple CPUs. Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Cc: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/bpf/8a80b8d27d3c49f9a14e1d5213c19d8be87d1dc8.1601477936.git.daniel@iogearbox.net
2020-09-30netfilter: nf_tables: add userdata attributes to nft_chainJose M. Guisado Gomez1-0/+2
Enables storing userdata for nft_chain. Field udata points to user data and udlen stores its length. Adds new attribute flag NFTA_CHAIN_USERDATA. Signed-off-by: Jose M. Guisado Gomez <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2020-09-29net: caif: Remove unused caif SPI driverThomas Gleixner1-155/+0
While chasing in_interrupt() (ab)use in drivers it turned out that the caif_spi driver has never been in use since the driver was merged 10 years ago. There never was any matching code which provides a platform device. The driver has not seen any update (asided of treewide changes and cleanups) since 8 years and the maintainers vanished from the planet. So analysing the potential contexts and the (in)correctness of in_interrupt() usage is just a pointless exercise. Remove the cruft. Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-29Merge branch 'for-upstream' of ↵David S. Miller3-0/+26
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2020-09-29 Here's the main bluetooth-next pull request for 5.10: - Multiple fixes to suspend/resume handling - Added mgmt events for controller suspend/resume state - Improved extended advertising support - btintel: Enhanced support for next generation controllers - Added Qualcomm Bluetooth SoC WCN6855 support - Several other smaller fixes & improvements ==================== Signed-off-by: David S. Miller <[email protected]>
2020-09-28genetlink: add missing kdoc for validation flagsJakub Kicinski1-0/+1
Validation flags are missing kdoc, add it. Fixes: ef6243acb478 ("genetlink: optionally validate strictly/dumps") Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-28net/smc: introduce CHID callback for ISM devicesUrsula Braun1-0/+1
With SMCD version 2 the CHIDs of ISM devices are needed for the CLC handshake. This patch provides the new callback to retrieve the CHID of an ISM device. Signed-off-by: Ursula Braun <[email protected]> Signed-off-by: Karsten Graul <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-28net/smc: introduce System Enterprise ID (SEID)Ursula Braun1-0/+3
SMCD version 2 defines a System Enterprise ID (short SEID). This patch contains the SEID creation and adds the callback to retrieve the created SEID. Signed-off-by: Ursula Braun <[email protected]> Signed-off-by: Karsten Graul <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-28udp_tunnel: add the ability to share port tablesJakub Kicinski1-0/+24
Unfortunately recent Intel NIC designs share the UDP port table across netdevs. So far the UDP tunnel port state was maintained per netdev, we need to extend that to cater to Intel NICs. Expect NICs to allocate the info structure dynamically and link to the state from there. All the shared NICs will record port offload information in the one instance of the table so we need to make sure that the use count can accommodate larger numbers. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-28Merge branch 'master' of ↵David S. Miller1-10/+6
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2020-09-28 1) Fix a build warning in ip_vti if CONFIG_IPV6 is not set. From YueHaibing. 2) Restore IPCB on espintcp before handing the packet to xfrm as the information there is still needed. From Sabrina Dubroca. 3) Fix pmtu updating for xfrm interfaces. From Sabrina Dubroca. 4) Some xfrm state information was not cloned with xfrm_do_migrate. Fixes to clone the full xfrm state, from Antony Antony. 5) Use the correct address family in xfrm_state_find. The struct flowi must always be interpreted along with the original address family. This got lost over the years. Fix from Herbert Xu. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-09-28nl80211: extend support to config spatial reuse parameter setRajkumar Manoharan1-0/+10
Allow the user to configure below Spatial Reuse Parameter Set element. * Non-SRG OBSS PD Max Offset * SRG BSS Color Bitmap * SRG Partial BSSID Bitmap Signed-off-by: Rajkumar Manoharan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Johannes Berg <[email protected]>