aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2016-04-20rtnetlink: add new RTM_GETSTATS message to dump link statsRoopa Prabhu2-0/+28
This patch adds a new RTM_GETSTATS message to query link stats via netlink from the kernel. RTM_NEWLINK also dumps stats today, but RTM_NEWLINK returns a lot more than just stats and is expensive in some cases when frequent polling for stats from userspace is a common operation. RTM_GETSTATS is an attempt to provide a light weight netlink message to explicity query only link stats from the kernel on an interface. The idea is to also keep it extensible so that new kinds of stats can be added to it in the future. This patch adds the following attribute for NETDEV stats: struct nla_policy ifla_stats_policy[IFLA_STATS_MAX + 1] = { [IFLA_STATS_LINK_64] = { .len = sizeof(struct rtnl_link_stats64) }, }; Like any other rtnetlink message, RTM_GETSTATS can be used to get stats of a single interface or all interfaces with NLM_F_DUMP. Future possible new types of stat attributes: link af stats: - IFLA_STATS_LINK_IPV6 (nested. for ipv6 stats) - IFLA_STATS_LINK_MPLS (nested. for mpls/mdev stats) extended stats: - IFLA_STATS_LINK_EXTENDED (nested. extended software netdev stats like bridge, vlan, vxlan etc) - IFLA_STATS_LINK_HW_EXTENDED (nested. extended hardware stats which are available via ethtool today) This patch also declares a filter mask for all stat attributes. User has to provide a mask of stats attributes to query. filter mask can be specified in the new hdr 'struct if_stats_msg' for stats messages. Other important field in the header is the ifindex. This api can also include attributes for global stats (eg tcp) in the future. When global stats are included in a stats msg, the ifindex in the header must be zero. A single stats message cannot contain both global and netdev specific stats. To easily distinguish them, netdev specific stat attributes name are prefixed with IFLA_STATS_LINK_ Without any attributes in the filter_mask, no stats will be returned. This patch has been tested with mofified iproute2 ifstat. Suggested-by: Jamal Hadi Salim <[email protected]> Signed-off-by: Roopa Prabhu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-20net: nla_align_64bit() needs to test the right pointer.David S. Miller1-1/+1
Netlink messages are appended, one object at a time, to the end of the SKB. Therefore we need to test skb_tail_pointer() not skb->data for alignment. Fixes: 35c5845957c7 ("net: Add helpers for 64-bit aligning netlink attributes.") Signed-off-by: David S. Miller <[email protected]>
2016-04-20net: fix HAVE_EFFICIENT_UNALIGNED_ACCESS typosEric Dumazet1-8/+11
HAVE_EFFICIENT_UNALIGNED_ACCESS needs CONFIG_ prefix. Also add a comment in nla_align_64bit() explaining we have to add a padding if current skb->data is aligned, as it certainly can be confusing. Fixes: 35c5845957c7 ("net: Add helpers for 64-bit aligning netlink attributes.") Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-20net/hsr: Fixed version field in ENUMPeter Heise1-1/+1
New field (IFLA_HSR_VERSION) was added in the middle of an existing ENUM and would break kernel ABI, therefore moved to the end. Reported by Stephen Hemminger. Signed-off-by: Peter Heise <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-19bpf: add event output helper for notifications/sampling/loggingDaniel Borkmann1-0/+2
This patch adds a new helper for cls/act programs that can push events to user space applications. For networking, this can be f.e. for sampling, debugging, logging purposes or pushing of arbitrary wake-up events. The idea is similar to a43eec304259 ("bpf: introduce bpf_perf_event_output() helper") and 39111695b1b8 ("samples: bpf: add bpf_perf_event_output example"). The eBPF program utilizes a perf event array map that user space populates with fds from perf_event_open(), the eBPF program calls into the helper f.e. as skb_event_output(skb, &my_map, BPF_F_CURRENT_CPU, raw, sizeof(raw)) so that the raw data is pushed into the fd f.e. at the map index of the current CPU. User space can poll/mmap/etc on this and has a data channel for receiving events that can be post-processed. The nice thing is that since the eBPF program and user space application making use of it are tightly coupled, they can define their own arbitrary raw data format and what/when they want to push. While f.e. packet headers could be one part of the meta data that is being pushed, this is not a substitute for things like packet sockets as whole packet is not being pushed and push is only done in a single direction. Intention is more of a generically usable, efficient event pipe to applications. Workflow is that tc can pin the map and applications can attach themselves e.g. after cls/act setup to one or multiple map slots, demuxing is done by the eBPF program. Adding this facility is with minimal effort, it reuses the helper introduced in a43eec304259 ("bpf: introduce bpf_perf_event_output() helper") and we get its functionality for free by overloading its BPF_FUNC_ identifier for cls/act programs, ctx is currently unused, but will be made use of in future. Example will be added to iproute2's BPF example files. Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-19bpf, trace: add BPF_F_CURRENT_CPU flag for bpf_perf_event_outputDaniel Borkmann1-0/+4
Add a BPF_F_CURRENT_CPU flag to optimize the use-case where user space has per-CPU ring buffers and the eBPF program pushes the data into the current CPU's ring buffer which saves us an extra helper function call in eBPF. Also, make sure to properly reserve the remaining flags which are not used. Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-19net/ipv6/addrconf: simplify sysctl registrationKonstantin Khlebnikov1-1/+2
Struct ctl_table_header holds pointer to sysctl table which could be used for freeing it after unregistration. IPv4 sysctls already use that. Remove redundant NULL assignment: ndev allocated using kzalloc. This also saves some bytes: sysctl table could be shorter than DEVCONF_MAX+1 if some options are disable in config. Signed-off-by: Konstantin Khlebnikov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-19net: Add helpers for 64-bit aligning netlink attributes.David S. Miller1-0/+37
Suggested-by: Eric Dumazet <[email protected]> Suggested-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-19net: Align IFLA_STATS64 attributes properly on architectures that need it.David S. Miller1-0/+1
Since the nlattr header is 4 bytes in size, it can cause the netlink attribute payload to not be 8-byte aligned. This is particularly troublesome for IFLA_STATS64 which contains 64-bit statistic values. Solve this by creating a dummy IFLA_PAD attribute which has a payload which is zero bytes in size. When HAVE_EFFICIENT_UNALIGNED_ACCESS is false, we insert an IFLA_PAD attribute into the netlink response when necessary such that the IFLA_STATS64 payload will be properly aligned. With help and suggestions from Eric Dumazet. Signed-off-by: David S. Miller <[email protected]>
2016-04-18phy: add generic function to support ksetting supportPhilippe Reynes1-0/+4
The old ethtool api (get_setting and set_setting) has generic phy functions phy_ethtool_sset and phy_ethtool_gset. To supprt the new ethtool api (get_link_ksettings and set_link_ksettings), we add generic phy function phy_ethtool_ksettings_get and phy_ethtool_ksettings_set. Signed-off-by: Philippe Reynes <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-18net: ethtool: export conversion function between u32 and link modePhilippe Reynes1-0/+7
The function convert_legacy_u32_to_link_mode and convert_link_mode_to_legacy_u32 may be used outside of ethtool.c. We rename them to ethtool_convert_... and export them, so we could use them in others drivers and modules. Signed-off-by: Philippe Reynes <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-17net: dsa: constify probed nameVivien Didelot1-2/+3
Change the dsa_switch_driver.probe function to return a const char *. Signed-off-by: Vivien Didelot <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-16netdev_features: Add NETIF_F_TSO_MANGLEID to NETIF_F_ALL_TSOAlexander Duyck1-1/+2
I realized that when I added NETIF_F_TSO_MANGLEID as a TSO type I forgot to add it to NETIF_F_ALL_TSO. This patch corrects that so the flag will be included correctly. The result should be minor as it was only used by a few drivers and in a few specific cases such as when NETIF_F_SG was not supported on a device so the TSO flags were cleared. Signed-off-by: Alexander Duyck <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-16ip_tunnel_core: iptunnel_handle_offloads returns int and doesn't free skbAlexander Duyck2-3/+2
This patch updates the IP tunnel core function iptunnel_handle_offloads so that we return an int and do not free the skb inside the function. This actually allows us to clean up several paths in several tunnels so that we can free the skb at one point in the path without having to have a secondary path if we are supporting tunnel offloads. In addition it should resolve some double-free issues I have found in the tunnels paths as I believe it is possible for us to end up triggering such an event in the case of fou or gue. Signed-off-by: Alexander Duyck <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-16vxlan: synchronously and race-free destruction of vxlan socketsHannes Frederic Sowa1-2/+0
Due to the fact that the udp socket is destructed asynchronously in a work queue, we have some nondeterministic behavior during shutdown of vxlan tunnels and creating new ones. Fix this by keeping the destruction process synchronous in regards to the user space process so IFF_UP can be reliably set. udp_tunnel_sock_release destroys vs->sock->sk if reference counter indicates so. We expect to have the same lifetime of vxlan_sock and vxlan_sock->sock->sk even in fast paths with only rcu locks held. So only destruct the whole socket after we can be sure it cannot be found by searching vxlan_net->sock_list. Cc: Eric Dumazet <[email protected]> Cc: Jiri Benc <[email protected]> Cc: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-15sctp: add the sctp_diag.c fileXin Long1-0/+2
This one will implement all the interface of inet_diag, inet_diag_handler. which includes sctp_diag_dump, sctp_diag_dump_one and sctp_diag_get_info. It will work as a module, and register inet_diag_handler when loading. v2->v3: - fix the mistake in inet_assoc_attr_size(). - change inet_diag_msg_laddrs_fill() name to inet_diag_msg_sctpladdrs_fill. - change inet_diag_msg_paddrs_fill() name to inet_diag_msg_sctpaddrs_fill. - add inet_diag_msg_sctpinfo_fill() to make asoc/ep fill code clearer. - add inet_diag_msg_sctpasoc_fill() to make asoc fill code clearer. - merge inet_asoc_diag_fill() and inet_ep_diag_fill() to inet_sctp_diag_fill(). - call sctp_diag_get_info() directly, instead by handler, cause the caller is in the same file with it. - call lock_sock in sctp_tsp_dump_one() to make sure we call get sctp info safely. - after lock_sock(sk), we should check sk != assoc->base.sk. - change mem[SK_MEMINFO_WMEM_ALLOC] to asoc->sndbuf_used for asoc dump when asoc->ep->sndbuf_policy is set. don't use INET_DIAG_MEMINFO attr any more. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-15sctp: export some apis or variables for sctp_diag and reuse some for procXin Long1-0/+13
For some main variables in sctp.ko, we couldn't export it to other modules, so we have to define some api to access them. It will include sctp transport and endpoint's traversal. There are some transport traversal functions for sctp_diag, we can also use it for sctp_proc. cause they have the similar situation to traversal transport. v2->v3: - rhashtable_walk_init need the parameter gfp, because of recent upstrem update Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-15sctp: add sctp_info dump api for sctp_diagXin Long2-0/+70
sctp_diag will dump some important details of sctp's assoc or ep, we use sctp_info to describe them, sctp_get_sctp_info to get them, and export it to sctp_diag.ko. v2->v3: - we will not use list_for_each_safe in sctp_get_sctp_info, cause all the callers of it will use lock_sock. - fix the holes in struct sctp_info with __reserved* field. because sctp_diag is a new feature, and sctp_info is just for now, it may be changed in the future. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-15sctp: simplify sk_receive_queue lockingMarcelo Ricardo Leitner1-15/+0
SCTP already serializes access to rcvbuf through its sock lock: sctp_recvmsg takes it right in the start and release at the end, while rx path will also take the lock before doing any socket processing. On sctp_rcv() it will check if there is an user using the socket and, if there is, it will queue incoming packets to the backlog. The backlog processing will do the same. Even timers will do such check and re-schedule if an user is using the socket. Simplifying this will allow us to remove sctp_skb_list_tail and get ride of some expensive lockings. The lists that it is used on are also mangled with functions like __skb_queue_tail and __skb_unlink in the same context, like on sctp_ulpq_tail_event() and sctp_clear_pd(). sctp_close() will also purge those while using only the sock lock. Therefore the lockings performed by sctp_skb_list_tail() are not necessary. This patch removes this function and replaces its calls with just skb_queue_splice_tail_init() instead. The biggest gain is at sctp_ulpq_tail_event(), because the events always contain a list, even if it's queueing a single skb and this was triggering expensive calls to spin_lock_irqsave/_irqrestore for every data chunk received. As SCTP will deliver each data chunk on a corresponding recvmsg, the more effective the change will be. Before this patch, with chunks with 30 bytes: netperf -t SCTP_STREAM -H 192.168.1.2 -cC -l 60 -- -m 30 -S 400000 400000 -s 400000 400000 on a 10Gbit link with 1500 MTU: SCTP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.1 () port 0 AF_INET Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 425984 425984 30 60.00 137.45 7.34 7.36 52.504 52.608 With it: SCTP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.1 () port 0 AF_INET Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 425984 425984 30 60.00 179.10 7.97 6.70 43.740 36.788 Signed-off-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-15net/mlx5: Update mlx5_ifc hardware featuresSaeed Mahameed1-22/+124
Adding the needed mlx5_ifc hardware bits and structs for the following feature: * Add vport to steering commands for SRIOV ACL support * Add mlcr, pcmr and mcia registers for dump module EEPROM * Add support for FCS, baeacon led and disable_link bits to hca caps * Add CQE period mode bit in CQ context for CQE based CQ moderation support * Add umr SQ bit for fragmented memory registration * Add needed bits and caps for Striding RQ support Signed-off-by: Saeed Mahameed <[email protected]> Signed-off-by: Matan Barak <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-15net/mlx5: Fix mlx5 ifc cmd_hca_cap bad offsetsTariq Toukan1-52/+55
All reserved fields after early_vf_enable are off by 1, since early_vf_enable was not explicitly declared as array of size 1. Reserved field before cqe_zip had a wrong size, it should be 0x80 + 0x3f. Fixes: b0844444590e ("net/mlx5_core: Introduce access function to read internal timer ") Fixes: b4ff3a36d3e4 ("net/mlx5: Use offset based reserved field names in the IFC header file") Signed-off-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]> Signed-off-by: Matan Barak <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-15qed: Add infrastructure support for tunnelingManish Chopra1-0/+10
This patch adds various structure/APIs needed to configure/enable different tunnel [VXLAN/GRE/GENEVE] parameters on the adapter. Signed-off-by: Manish Chopra <[email protected]> Signed-off-by: Yuval Mintz <[email protected]> Signed-off-by: Ariel Elior <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-15net/hsr: Added support for HSR v1Peter Heise2-0/+2
This patch adds support for the newer version 1 of the HSR networking standard. Version 0 is still default and the new version has to be selected via iproute2. Main changes are in the supervision frame handling and its ethertype field. Signed-off-by: Peter Heise <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-15tcp: do not mess with listener sk_wmem_allocEric Dumazet1-2/+7
When removing sk_refcnt manipulation on synflood, I missed that using skb_set_owner_w() was racy, if sk->sk_wmem_alloc had already transitioned to 0. We should hold sk_refcnt instead, but this is a big deal under attack. (Doing so increase performance from 3.2 Mpps to 3.8 Mpps only) In this patch, I chose to not attach a socket to syncookies skb. Performance is now 5 Mpps instead of 3.2 Mpps. Following patch will remove last known false sharing in tcp_rcv_state_process() Fixes: 3b24d854cb35 ("tcp/dccp: do not touch listener sk_refcnt under synflood") Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-15devlink: fix sb register stub in case devlink is disabledJiri Pirko1-1/+3
Reported-by: kbuild test robot <[email protected]> Fixes: bf7974710a40 ("devlink: add shared buffer configuration") Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-14bpf, verifier: add ARG_PTR_TO_RAW_STACK typeDaniel Borkmann1-0/+5
When passing buffers from eBPF stack space into a helper function, we have ARG_PTR_TO_STACK argument type for helpers available. The verifier makes sure that such buffers are initialized, within boundaries, etc. However, the downside with this is that we have a couple of helper functions such as bpf_skb_load_bytes() that fill out the passed buffer in the expected success case anyway, so zero initializing them prior to the helper call is unneeded/wasted instructions in the eBPF program that can be avoided. Therefore, add a new helper function argument type called ARG_PTR_TO_RAW_STACK. The idea is to skip the STACK_MISC check in check_stack_boundary() and color the related stack slots as STACK_MISC after we checked all call arguments. Helper functions using ARG_PTR_TO_RAW_STACK must make sure that every path of the helper function will fill the provided buffer area, so that we cannot leak any uninitialized stack memory. This f.e. means that error paths need to memset() the buffers, but the expected fast-path doesn't have to do this anymore. Since there's no such helper needing more than at most one ARG_PTR_TO_RAW_STACK argument, we can keep it simple and don't need to check for multiple areas. Should in future such a use-case really appear, we have check_raw_mode() that will make sure we implement support for it first. Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-14GSO: Support partial segmentation offloadAlexander Duyck3-2/+14
This patch adds support for something I am referring to as GSO partial. The basic idea is that we can support a broader range of devices for segmentation if we use fixed outer headers and have the hardware only really deal with segmenting the inner header. The idea behind the naming is due to the fact that everything before csum_start will be fixed headers, and everything after will be the region that is handled by hardware. With the current implementation it allows us to add support for the following GSO types with an inner TSO_MANGLEID or TSO6 offload: NETIF_F_GSO_GRE NETIF_F_GSO_GRE_CSUM NETIF_F_GSO_IPIP NETIF_F_GSO_SIT NETIF_F_UDP_TUNNEL NETIF_F_UDP_TUNNEL_CSUM In the case of hardware that already supports tunneling we may be able to extend this further to support TSO_TCPV4 without TSO_MANGLEID if the hardware can support updating inner IPv4 headers. Signed-off-by: Alexander Duyck <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-14GRO: Add support for TCP with fixed IPv4 ID field, limit tunnel IP ID valuesAlexander Duyck1-1/+4
This patch does two things. First it allows TCP to aggregate TCP frames with a fixed IPv4 ID field. As a result we should now be able to aggregate flows that were converted from IPv6 to IPv4. In addition this allows us more flexibility for future implementations of segmentation as we may be able to use a fixed IP ID when segmenting the flow. The second thing this does is that it places limitations on the outer IPv4 ID header in the case of tunneled frames. Specifically it forces the IP ID to be incrementing by 1 unless the DF bit is set in the outer IPv4 header. This way we can avoid creating overlapping series of IP IDs that could possibly be fragmented if the frame goes through GRO and is then resegmented via GSO. Signed-off-by: Alexander Duyck <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-14GSO: Add GSO type for fixed IPv4 IDAlexander Duyck3-9/+15
This patch adds support for TSO using IPv4 headers with a fixed IP ID field. This is meant to allow us to do a lossless GRO in the case of TCP flows that use a fixed IP ID such as those that convert IPv6 header to IPv4 headers. In addition I am adding a feature that for now I am referring to TSO with IP ID mangling. Basically when this flag is enabled the device has the option to either output the flow with incrementing IP IDs or with a fixed IP ID regardless of what the original IP ID ordering was. This is useful in cases where the DF bit is set and we do not care if the original IP ID value is maintained. Signed-off-by: Alexander Duyck <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-14devlink: implement shared buffer occupancy monitoring interfaceJiri Pirko2-0/+18
User needs to monitor shared buffer occupancy. For that, he issues a snapshot command in order to instruct hardware to catch current and maximal occupancy values, and clear command in order to clear the historical maximal values. Also port-pool and tc-pool-bind command response messages are extended to carry occupancy values. Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-14devlink: add shared buffer configurationJiri Pirko2-0/+104
Define userspace API and drivers API for configuration of shared buffers. Four basic objects are defined: shared buffer - attributes are size, number of pools and TCs pool - chunk of sharedbuffer definition, it has some size and either static or dynamic threshold port pool threshold - to set per-port threshold for each pool port tc threshold bind - to bind port and TC to specified pool with threshold. Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-14gre: eliminate holes in ip_tunnelstephen hemminger1-4/+3
The structure can be packed denser by doing minor rearrangement of existing elements. Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-14qed: add Rx flow hash/indirection support.Sudarsana Reddy Kalluru2-0/+12
Adds the required API for passing RSS-related configuration from qede. Signed-off-by: Sudarsana Reddy Kalluru <[email protected]> Signed-off-by: Yuval Mintz <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-14qed*: remove version dependencyRahul Verma2-10/+1
Inbox drivers don't need versioning scheme in order to guarantee compatibility, as both qed and qede are compiled from same codebase. Signed-off-by: Rahul Verma <[email protected]> Signed-off-by: Yuval Mintz <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-14Merge branch 'for-davem' of ↵David S. Miller1-2/+1
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
2016-04-14net: remove netdevice gso_min_segsEric Dumazet1-3/+1
After introduction of ndo_features_check(), we believe that very specific checks for rare features should not be done in core networking stack. No driver uses gso_min_segs yet, so we revert this feature and save few instructions per tx packet in fast path. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-13sctp: delay calls to sk_data_ready() as much as possibleMarcelo Ricardo Leitner1-1/+2
Currently processing of multiple chunks in a single SCTP packet leads to multiple calls to sk_data_ready, causing multiple wake up signals which are costy and doesn't make it wake up any faster. With this patch it will note that the wake up is pending and will do it before leaving the state machine interpreter, latest place possible to do it realiably and cleanly. Note that sk_data_ready events are not dependent on asocs, unlike waking up writers. v2: series re-checked v3: use local vars to cleanup the code, suggested by Jakub Sitnicki Signed-off-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-13sctp: compress bit-wide flags to a bitfield on sctp_sockMarcelo Ricardo Leitner1-6/+6
It wastes space and gets worse as we add new flags, so convert bit-wide flags to a bitfield. Currently it already saves 4 bytes in sctp_sock, which are left as holes in it for now. The whole struct needs packing, which should be done in another patch. Note that do_auto_asconf cannot be merged, as explained in the comment before it. Signed-off-by: Marcelo Ricardo Leitner <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-13net: force inlining of netif_tx_start/stop_queue, sock_hold, __sock_putDenys Vlasenko2-4/+4
Sometimes gcc mysteriously doesn't inline very small functions we expect to be inlined. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122 Arguably, gcc should do better, but gcc people aren't willing to invest time into it, asking to use __always_inline instead. With this .config: http://busybox.net/~vda/kernel_config_OPTIMIZE_INLINING_and_Os, the following functions get deinlined many times. netif_tx_stop_queue: 207 copies, 590 calls: 55 push %rbp 48 89 e5 mov %rsp,%rbp f0 80 8f e0 01 00 00 01 lock orb $0x1,0x1e0(%rdi) 5d pop %rbp c3 retq netif_tx_start_queue: 47 copies, 111 calls 55 push %rbp 48 89 e5 mov %rsp,%rbp f0 80 a7 e0 01 00 00 fe lock andb $0xfe,0x1e0(%rdi) 5d pop %rbp c3 retq sock_hold: 39 copies, 124 calls 55 push %rbp 48 89 e5 mov %rsp,%rbp f0 ff 87 80 00 00 00 lock incl 0x80(%rdi) 5d pop %rbp c3 retq __sock_put: 6 copies, 13 calls 55 push %rbp 48 89 e5 mov %rsp,%rbp f0 ff 8f 80 00 00 00 lock decl 0x80(%rdi) 5d pop %rbp c3 retq This patch fixes this via s/inline/__always_inline/. Code size decrease after the patch is ~2.5k: text data bss dec hex filename 56719876 56364551 36196352 149280779 8e5d80b vmlinux_before 56717440 56364551 36196352 149278343 8e5ce87 vmlinux Signed-off-by: Denys Vlasenko <[email protected]> CC: David S. Miller <[email protected]> CC: [email protected] CC: [email protected] CC: [email protected] Signed-off-by: David S. Miller <[email protected]>
2016-04-13sock: tigthen lockdep checks for sock_owned_by_userHannes Frederic Sowa1-15/+29
sock_owned_by_user should not be used without socket lock held. It seems to be a common practice to check .owned before lock reclassification, so provide a little help to abstract this check away. Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-13dsa: Rename phys_port_mask to enabled_port_maskAndrew Lunn1-2/+2
The phys in phys_port_mask suggests this mask is about PHYs. In fact, it means physical ports. Rename to enabled_port_mask, indicating external enabled ports of the switch, which is hopefully less confusing. Signed-off-by: Andrew Lunn <[email protected]> Tested-by: Vivien Didelot <[email protected]> Acked-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-13net: dsa: Remove allocation of driver private memoryAndrew Lunn1-1/+0
The drivers now allocate their own memory for private usage. Remove the allocation from the core code. Signed-off-by: Andrew Lunn <[email protected]> Acked-by: Florian Fainelli <[email protected]> Tested-by: Vivien Didelot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-13net: dsa: Have the switch driver allocate there own private memoryAndrew Lunn1-2/+8
Now the switch devices have a dev pointer, make use of it for allocating the drivers private data structures using a devm_kzalloc(). Signed-off-by: Andrew Lunn <[email protected]> Acked-by: Florian Fainelli <[email protected]> Tested-by: Vivien Didelot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-13net: dsa: Pass the dsa device to the switch driversAndrew Lunn1-1/+2
By passing a device structure to the switch devices, it allows them to use devm_* methods for resource management. Signed-off-by: Andrew Lunn <[email protected]> Acked-by: Florian Fainelli <[email protected]> Tested-by: Vivien Didelot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-13Merge tag 'mac80211-next-for-davem-2016-04-13' of ↵David S. Miller3-44/+58
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== To synchronize with Kalle, here's just a big change that affects all drivers - removing the duplicated enum ieee80211_band and replacing it by enum nl80211_band. On top of that, just a small documentation update. ==================== Signed-off-by: David S. Miller <[email protected]>
2016-04-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller2-93/+25
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains the first batch of Netfilter updates for your net-next tree. 1) Define pr_fmt() in nf_conntrack, from Weongyo Jeong. 2) Define and register netfilter's afinfo for the bridge family, this comes in preparation for native nfqueue's bridge for nft, from Stephane Bryant. 3) Add new attributes to store layer 2 and VLAN headers to nfqueue, also from Stephane Bryant. 4) Parse new NFQA_VLAN and NFQA_L2HDR nfqueue netlink attributes coming from userspace, from Stephane Bryant. 5) Use net->ipv6.devconf_all->hop_limit instead of hardcoded hop_limit in IPv6 SYNPROXY, from Liping Zhang. 6) Remove unnecessary check for dst == NULL in nf_reject_ipv6, from Haishuang Yan. 7) Deinline ctnetlink event report functions, from Florian Westphal. ==================== Signed-off-by: David S. Miller <[email protected]>
2016-04-12netfilter: conntrack: move expectation event helper to ecache.cFlorian Westphal1-39/+3
Not performance critical, it is only invoked when an expectation is added/destroyed. While at it, kill unused nf_ct_expect_event() wrapper. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2016-04-12netfilter: conntrack: de-inline nf_conntrack_eventmask_reportFlorian Westphal1-54/+12
Way too large; move it to nf_conntrack_ecache.c. Reduces total object size by 1216 byte on my machine. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2016-04-12cfg80211: remove enum ieee80211_bandJohannes Berg3-38/+22
This enum is already perfectly aliased to enum nl80211_band, and the only reason for it is that we get IEEE80211_NUM_BANDS out of it. There's no really good reason to not declare the number of bands in nl80211 though, so do that and remove the cfg80211 one. Signed-off-by: Johannes Berg <[email protected]>
2016-04-12cfg80211: Improve Connect/Associate command documentationJouni Malinen2-6/+36
The roaming cases for the Connect command were not fully covered and neither Connect nor Associate command uses of the prev_bssid parameter were very clear. Add details to describe how the prev_bssid argument is supposed to be used and when the driver should use association or reassociation. Signed-off-by: Jouni Malinen <[email protected]> Signed-off-by: Johannes Berg <[email protected]>