aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2016-05-02net: rtnetlink: add linkxstats callbacks and attributeNikolay Aleksandrov1-0/+30
Add callbacks to calculate the size and fill link extended statistics which can be split into multiple messages and are dumped via the new rtnl stats API (RTM_GETSTATS) with the IFLA_STATS_LINK_XSTATS attribute. Also add that attribute to the idx mask check since it is expected to be able to save state and resume dumping (e.g. future bridge per-vlan stats will be dumped via this attribute and callbacks). Each link type should nest its private attributes under the per-link type attribute. This allows to have any number of separated private attributes and to avoid one call to get the dev link type. Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-02net: rtnetlink: allow rtnl_fill_statsinfo to save private state counterNikolay Aleksandrov1-13/+31
The new prividx argument allows the current dumping device to save a private state counter which would enable it to continue dumping from where it left off. And the idxattr is used to save the current idx user so multiple prividx using attributes can be requested at the same time as suggested by Roopa Prabhu. Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-02Merge getxattr prototype change into work.lookupsAl Viro1-1/+1
The rest of work.xattr stuff isn't needed for this branch
2016-05-02gre6: Cleanup GREv6 transmit path, call common GRE functionsTom Herbert1-202/+50
Changes in GREv6 transmit path: - Call gre_checksum, remove gre6_checksum - Rename ip6gre_xmit2 to __gre6_xmit - Call gre_build_header utility function - Call ip6_tnl_xmit common function - Call ip6_tnl_change_mtu, eliminate ip6gre_tunnel_change_mtu Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-02ipv6: Generic tunnel cleanupTom Herbert1-2/+5
A few generic changes to generalize tunnels in IPv6: - Export ip6_tnl_change_mtu so that it can be called by ip6_gre - Add tun_hlen to ip6_tnl structure. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-02gre: Create common functions for transmitTom Herbert1-47/+5
Create common functions for both IPv4 and IPv6 GRE in transmit. These are put into gre.h. Common functions are for: - GRE checksum calculation. Move gre_checksum to gre.h. - Building a GRE header. Move GRE build_header and rename gre_build_header. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-02ipv6: Create ip6_tnl_xmitTom Herbert1-17/+30
This patch renames ip6_tnl_xmit2 to ip6_tnl_xmit and exports it. Other users like GRE will be able to call this. The original ip6_tnl_xmit function is renamed to ip6_tnl_start_xmit (this is an ndo_start_xmit function). Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-02gre6: Cleanup GREv6 receive path, call common GRE functionsTom Herbert1-117/+23
- Create gre_rcv function. This calls gre_parse_header and ip6gre_rcv. - Call ip6_tnl_rcv. Doing this and using gre_parse_header eliminates most of the code in ip6gre_rcv. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-02gre: Move utility functions to common headersTom Herbert2-129/+84
Several of the GRE functions defined in net/ipv4/ip_gre.c are usable for IPv6 GRE implementation (that is they are protocol agnostic). These include: - GRE flag handling functions are move to gre.h - GRE build_header is moved to gre.h and renamed gre_build_header - parse_gre_header is moved to gre_demux.c and renamed gre_parse_header - iptunnel_pull_header is taken out of gre_parse_header. This is now done by caller. The header length is returned from gre_parse_header in an int* argument. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-02ipv6: Cleanup IPv6 tunnel receive pathTom Herbert1-70/+142
Some basic changes to make IPv6 tunnel receive path look more like IPv4 path: - Make ip6_tnl_rcv non-static so that GREv6 and others can call it - Make ip6_tnl_rcv look like ip_tunnel_rcv - Switch to gro_cells_receive - Make ip6_tnl_rcv non-static and export it Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-02tcp: make tcp_sendmsg() aware of socket backlogEric Dumazet2-2/+13
Large sendmsg()/write() hold socket lock for the duration of the call, unless sk->sk_sndbuf limit is hit. This is bad because incoming packets are parked into socket backlog for a long time. Critical decisions like fast retransmit might be delayed. Receivers have to maintain a big out of order queue with additional cpu overhead, and also possible stalls in TX once windows are full. Bidirectional flows are particularly hurt since the backlog can become quite big if the copy from user space triggers IO (page faults) Some applications learnt to use sendmsg() (or sendmmsg()) with small chunks to avoid this issue. Kernel should know better, right ? Add a generic sk_flush_backlog() helper and use it right before a new skb is allocated. Typically we put 64KB of payload per skb (unless MSG_EOR is requested) and checking socket backlog every 64KB gives good results. As a matter of fact, tests with TSO/GSO disabled give very nice results, as we manage to keep a small write queue and smaller perceived rtt. Note that sk_flush_backlog() maintains socket ownership, so is not equivalent to a {release_sock(sk); lock_sock(sk);}, to ensure implicit atomicity rules that sendmsg() was giving to (possibly buggy) applications. In this simple implementation, I chose to not call tcp_release_cb(), but we might consider this later. Signed-off-by: Eric Dumazet <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Marcelo Ricardo Leitner <[email protected]> Acked-by: Soheil Hassas Yeganeh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-02net: do not block BH while processing socket backlogEric Dumazet1-14/+8
Socket backlog processing is a major latency source. With current TCP socket sk_rcvbuf limits, I have sampled __release_sock() holding cpu for more than 5 ms, and packets being dropped by the NIC once ring buffer is filled. All users are now ready to be called from process context, we can unblock BH and let interrupts be serviced faster. cond_resched_softirq() could be removed, as it has no more user. Signed-off-by: Eric Dumazet <[email protected]> Acked-by: Soheil Hassas Yeganeh <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-02sctp: prepare for socket backlog behavior changeEric Dumazet1-0/+2
sctp_inq_push() will soon be called without BH being blocked when generic socket code flushes the socket backlog. It is very possible SCTP can be converted to not rely on BH, but this needs to be done by SCTP experts. Signed-off-by: Eric Dumazet <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-02udp: prepare for non BH masking at backlog processingEric Dumazet2-4/+4
UDP uses the generic socket backlog code, and this will soon be changed to not disable BH when protocol is called back. We need to use appropriate SNMP accessors. Signed-off-by: Eric Dumazet <[email protected]> Acked-by: Soheil Hassas Yeganeh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-02dccp: do not assume DCCP code is non preemptibleEric Dumazet4-6/+6
DCCP uses the generic backlog code, and this will soon be changed to not disable BH when protocol is called back. Signed-off-by: Eric Dumazet <[email protected]> Acked-by: Soheil Hassas Yeganeh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-02tcp: do not block bh during prequeue processingEric Dumazet2-32/+2
AFAIK, nothing in current TCP stack absolutely wants BH being disabled once socket is owned by a thread running in process context. As mentioned in my prior patch ("tcp: give prequeue mode some care"), processing a batch of packets might take time, better not block BH at all. Signed-off-by: Eric Dumazet <[email protected]> Acked-by: Soheil Hassas Yeganeh <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-02tcp: do not assume TCP code is non preemptibleEric Dumazet11-99/+104
We want to to make TCP stack preemptible, as draining prequeue and backlog queues can take lot of time. Many SNMP updates were assuming that BH (and preemption) was disabled. Need to convert some __NET_INC_STATS() calls to NET_INC_STATS() and some __TCP_INC_STATS() to TCP_INC_STATS() Before using this_cpu_ptr(net->ipv4.tcp_sk) in tcp_v4_send_reset() and tcp_v4_send_ack(), we add an explicit preempt disabled section. Signed-off-by: Eric Dumazet <[email protected]> Acked-by: Soheil Hassas Yeganeh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-02wext: remove a/b/g/n from SIOCGIWNAMEJohannes Berg1-35/+0
Since a/b/g/n no longer exist as spec amendements and VHT (ex 802.11ac) wasn't handled at all, it's better to just remove the amendment strings to avoid confusion. Signed-off-by: Johannes Berg <[email protected]> Reviewed-by: Luca Coelho <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2016-05-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds11-19/+57
Pull networking fixes from David Miller: 1) MODULE_FIRMWARE firmware string not correct for iwlwifi 8000 chips, from Sara Sharon. 2) Fix SKB size checks in batman-adv stack on receive, from Sven Eckelmann. 3) Leak fix on mac80211 interface add error paths, from Johannes Berg. 4) Cannot invoke napi_disable() with BH disabled in myri10ge driver, fix from Stanislaw Gruszka. 5) Fix sign extension problem when computing feature masks in net_gso_ok(), from Marcelo Ricardo Leitner. 6) lan78xx driver doesn't count packets and packet lengths in its statistics properly, fix from Woojung Huh. 7) Fix the buffer allocation sizes in pegasus USB driver, from Petko Manolov. 8) Fix refcount overflows in bpf, from Alexei Starovoitov. 9) Unified dst cache handling introduced a preempt warning in ip_tunnel, fix by resetting rather then setting the cached route. From Paolo Abeni. 10) Listener hash collision test fix in soreuseport, from Craig Gallak * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits) gre: do not pull header in ICMP error processing net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case tipc: only process unicast on intended node cxgb3: fix out of bounds read net/smscx5xx: use the device tree for mac address soreuseport: Fix TCP listener hash collision net: l2tp: fix reversed udp6 checksum flags ip_tunnel: fix preempt warning in ip tunnel creation/updating samples/bpf: fix trace_output example bpf: fix check_map_func_compatibility logic bpf: fix refcnt overflow drivers: net: cpsw: use of_phy_connect() in fixed-link case dt: cpsw: phy-handle, phy_id, and fixed-link are mutually exclusive drivers: net: cpsw: don't ignore phy-mode if phy-handle is used drivers: net: cpsw: fix segfault in case of bad phy-handle drivers: net: cpsw: fix parsing of phy-handle DT property in dual_emac config MAINTAINERS: net: Change maintainer for GRETH 10/100/1G Ethernet MAC device driver gre: reject GUE and FOU in collect metadata mode pegasus: fixes reported packet length pegasus: fixes URB buffer allocation size; ...
2016-05-02gre: do not pull header in ICMP error processingJiri Benc1-3/+8
iptunnel_pull_header expects that IP header was already pulled; with this expectation, it pulls the tunnel header. This is not true in gre_err. Furthermore, ipv4_update_pmtu and ipv4_redirect expect that skb->data points to the IP header. We cannot pull the tunnel header in this path. It's just a matter of not calling iptunnel_pull_header - we don't need any of its effects. Fixes: bda7bb463436 ("gre: Allow multiple protocol listener for gre protocol.") Signed-off-by: Jiri Benc <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-01sctp: signal sk_data_ready earlier on data chunks receptionMarcelo Ricardo Leitner2-13/+19
Dave Miller pointed out that fb586f25300f ("sctp: delay calls to sk_data_ready() as much as possible") may insert latency specially if the receiving application is running on another CPU and that it would be better if we signalled as early as possible. This patch thus basically inverts the logic on fb586f25300f and signals it as early as possible, similar to what we had before. Fixes: fb586f25300f ("sctp: delay calls to sk_data_ready() as much as possible") Reported-by: Dave Miller <[email protected]> Signed-off-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-01tipc: only process unicast on intended nodeHamish Martin1-0/+5
We have observed complete lock up of broadcast-link transmission due to unacknowledged packets never being removed from the 'transmq' queue. This is traced to nodes having their ack field set beyond the sequence number of packets that have actually been transmitted to them. Consider an example where node 1 has sent 10 packets to node 2 on a link and node 3 has sent 20 packets to node 2 on another link. We see examples of an ack from node 2 destined for node 3 being treated as an ack from node 2 at node 1. This leads to the ack on the node 1 to node 2 link being increased to 20 even though we have only sent 10 packets. When node 1 does get around to sending further packets, none of the packets with sequence numbers less than 21 are actually removed from the transmq. To resolve this we reinstate some code lost in commit d999297c3dbb ("tipc: reduce locking scope during packet reception") which ensures that only messages destined for the receiving node are processed by that node. This prevents the sequence numbers from getting out of sync and resolves the packet leakage, thereby resolving the broadcast-link transmission lock-ups we observed. While we are aware that this change only patches over a root problem that we still haven't identified, this is a sanity test that it is always legitimate to do. It will remain in the code even after we identify and fix the real problem. Reviewed-by: Chris Packham <[email protected]> Reviewed-by: John Thompson <[email protected]> Signed-off-by: Hamish Martin <[email protected]> Signed-off-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-01tipc: set 'active' state correctly for first established linkJon Paul Maloy1-0/+1
When we are displaying statistics for the first link established between two peers, it will always be presented as STANDBY although it in reality is ACTIVE. This happens because we forget to set the 'active' flag in the link instance at the moment it is established. Although this is a bug, it only has impact on the presentation view of the link, not on its actual functionality. Signed-off-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-01soreuseport: Fix TCP listener hash collisionCraig Gallek1-0/+2
I forgot to include a check for listener port equality when deciding if two sockets should belong to the same reuseport group. This was not caught previously because it's only necessary when two listening sockets for the same user happen to hash to the same listener bucket. The same error does not exist in the UDP path. Fixes: c125e80b8868("soreuseport: fast reuseport TCP socket selection") Signed-off-by: Craig Gallek <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-01net: l2tp: fix reversed udp6 checksum flagsWang Shanker1-2/+2
This patch fixes a bug which causes the behavior of whether to ignore udp6 checksum of udp6 encapsulated l2tp tunnel contrary to what userspace program requests. When the flag `L2TP_ATTR_UDP_ZERO_CSUM6_RX` is set by userspace, it is expected that udp6 checksums of received packets of the l2tp tunnel to create should be ignored. In `l2tp_netlink.c`: `l2tp_nl_cmd_tunnel_create()`, `cfg.udp6_zero_rx_checksums` is set according to the flag, and then passed to `l2tp_core.c`: `l2tp_tunnel_create()` and then `l2tp_tunnel_sock_create()`. In `l2tp_tunnel_sock_create()`, `udp_conf.use_udp6_rx_checksums` is set the same to `cfg.udp6_zero_rx_checksums`. However, if we want the checksum to be ignored, `udp_conf.use_udp6_rx_checksums` should be set to `false`, i.e. be set to the contrary. Similarly, the same should be done to `udp_conf.use_udp6_tx_checksums`. Signed-off-by: Miao Wang <[email protected]> Acked-by: James Chapman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-30tty: Replace ASYNC_INITIALIZED bit and update atomicallyPeter Hurley2-8/+9
Replace ASYNC_INITIALIZED bit in the tty_port::flags field with TTY_PORT_INITIALIZED bit in the tty_port::iflags field. Introduce helpers tty_port_set_initialized() and tty_port_initialized() to abstract atomic bit ops. Note: the transforms for test_and_set_bit() and test_and_clear_bit() are unnecessary as the state transitions are already mutually exclusive; the tty lock prevents concurrent open/close/hangup. Signed-off-by: Peter Hurley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2016-04-30tty: Replace ASYNC_CHECK_CD and update atomicallyPeter Hurley3-7/+4
Replace ASYNC_CHECK_CD bit in the tty_port::flags field with TTY_PORT_CHECK_CD bit in the tty_port::iflags field. Introduce helpers tty_port_set_check_carrier() and tty_port_check_carrier() to abstract the atomic bit ops. Signed-off-by: Peter Hurley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2016-04-30tty: Replace ASYNC_NORMAL_ACTIVE bit and update atomicallyPeter Hurley1-5/+5
Replace ASYNC_NORMAL_ACTIVE bit in the tty_port::flags field with TTY_PORT_ACTIVE bit in the tty_port::iflags field. Introduce helpers tty_port_set_active() and tty_port_active() to abstract atomic bit ops. Extract state changes from port lock sections, as this usage is broken and confused; the state transitions are protected by the tty lock (which mutually excludes parallel open/close/hangup), and no user tests the active state while holding the port lock. Signed-off-by: Peter Hurley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2016-04-30tty: Replace ASYNC_CTS_FLOW bit and update atomicallyPeter Hurley1-2/+1
Replace ASYNC_CTS_FLOW bit in the tty_port::flags field with TTY_PORT_CTS_FLOW bit in the tty_port::iflags field. Add tty_port_set_cts_flow() helper to abstract the atomic bit ops. Signed-off-by: Peter Hurley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2016-04-30tty: Replace TTY_THROTTLED bit tests with tty_throttled()Peter Hurley1-1/+1
Abstract TTY_THROTTLED bit tests with tty_throttled(). Signed-off-by: Peter Hurley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2016-04-30tty: Replace TTY_IO_ERROR bit tests with tty_io_error()Peter Hurley2-4/+4
Abstract TTY_IO_ERROR status test treewide with tty_io_error(). NB: tty->flags uses atomic bit ops; replace non-atomic bit test with test_bit(). Signed-off-by: Peter Hurley <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2016-04-29net: constify is_skb_forwardable's argumentsNikolay Aleksandrov1-1/+1
is_skb_forwardable is not supposed to change anything so constify its arguments Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-29ip_tunnel: fix preempt warning in ip tunnel creation/updatingPaolo Abeni1-2/+2
After the commit e09acddf873b ("ip_tunnel: replace dst_cache with generic implementation"), a preemption debug warning is triggered on ip4 tunnels updating; the dst cache helper needs to be invoked in unpreemptible context. We don't need to load the cache on tunnel update, so this commit fixes the warning replacing the load with a dst cache reset, which is preempt safe. Fixes: e09acddf873b ("ip_tunnel: replace dst_cache with generic implementation") Reported-by: Eric Dumazet <[email protected]> Signed-off-by: Paolo Abeni <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-29netfilter: IDLETIMER: fix race condition when destroy the targetLiping Zhang1-0/+1
Workqueue maybe still in running while we destroy the IDLETIMER target, thus cause a use after free error, add cancel_work_sync() to avoid such situation. Signed-off-by: Liping Zhang <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2016-04-29batman-adv: Fix reference counting of hardif_neigh_node object for neigh_nodeSven Eckelmann2-11/+7
The batadv_neigh_node was specific to a batadv_hardif_neigh_node and held an implicit reference to it. But this reference was never stored in form of a pointer in the batadv_neigh_node itself. Instead batadv_neigh_node_release depends on a consistent state of hard_iface->neigh_list and that batadv_hardif_neigh_get always returns the batadv_hardif_neigh_node object which it has a reference for. But batadv_hardif_neigh_get cannot guarantee that because it is working only with rcu_read_lock on this list. It can therefore happen that a neigh_addr is in this list twice or that batadv_hardif_neigh_get cannot find the batadv_hardif_neigh_node for an neigh_addr due to some other list operations taking place at the same time. Instead add a batadv_hardif_neigh_node pointer directly in batadv_neigh_node which will be used for the reference counter decremented on release of batadv_neigh_node. Fixes: cef63419f7db ("batman-adv: add list of unique single hop neighbors per hard-interface") Signed-off-by: Sven Eckelmann <[email protected]> Signed-off-by: Marek Lindner <[email protected]> Signed-off-by: Antonio Quartulli <[email protected]>
2016-04-29batman-adv: Fix reference counting of vlan object for tt_local_entrySven Eckelmann2-38/+6
The batadv_tt_local_entry was specific to a batadv_softif_vlan and held an implicit reference to it. But this reference was never stored in form of a pointer in the tt_local_entry itself. Instead batadv_tt_local_remove, batadv_tt_local_table_free and batadv_tt_local_purge_pending_clients depend on a consistent state of bat_priv->softif_vlan_list and that batadv_softif_vlan_get always returns the batadv_softif_vlan object which it has a reference for. But batadv_softif_vlan_get cannot guarantee that because it is working only with rcu_read_lock on this list. It can therefore happen that an vid is in this list twice or that batadv_softif_vlan_get cannot find the batadv_softif_vlan for an vid due to some other list operations taking place at the same time. Instead add a batadv_softif_vlan pointer directly in batadv_tt_local_entry which will be used for the reference counter decremented on release of batadv_tt_local_entry. Fixes: 35df3b298fc8 ("batman-adv: fix TT VLAN inconsistency on VLAN re-add") Signed-off-by: Sven Eckelmann <[email protected]> Acked-by: Antonio Quartulli <[email protected]> Signed-off-by: Marek Lindner <[email protected]> Signed-off-by: Antonio Quartulli <[email protected]>
2016-04-29batman-adv: B.A.T.M.A.N V - make sure iface is reactivated upon NETDEV_UP eventAntonio Quartulli3-0/+18
At the moment there is no explicit reactivation of an hard-interface upon NETDEV_UP event. In case of B.A.T.M.A.N. IV the interface is reactivated as soon as the next OGM is scheduled for sending, but this mechanism does not work with B.A.T.M.A.N. V. The latter does not rely on the same scheduling mechanism as its predecessor and for this reason the hard-interface remains deactivated forever after being brought down once. This patch fixes the reactivation mechanism by adding a new routing API which explicitly allows each algorithm to perform any needed operation upon interface re-activation. Such API is optional and is implemented by B.A.T.M.A.N. V only and it just takes care of setting the iface status to ACTIVE Signed-off-by: Antonio Quartulli <[email protected]> Signed-off-by: Marek Lindner <[email protected]>
2016-04-29batman-adv: fix DAT candidate selection (must use vid)Antonio Quartulli1-7/+10
Now that DAT is VLAN aware, it must use the VID when computing the DHT address of the candidate nodes where an entry is going to be stored/retrieved. Fixes: be1db4f6615b ("batman-adv: make the Distributed ARP Table vlan aware") Signed-off-by: Antonio Quartulli <[email protected]> [[email protected]: fix conflicts with current version] Signed-off-by: Sven Eckelmann <[email protected]> Signed-off-by: Marek Lindner <[email protected]>
2016-04-29netfilter: conntrack: init all_locks to avoid debug warningFlorian Westphal1-1/+1
Else we get 'BUG: spinlock bad magic on CPU#' on resize when spin lock debugging is enabled. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2016-04-29netfilter: fix IS_ERR_VALUE usagePablo Neira Ayuso3-6/+12
This is a forward-port of the original patch from Andrzej Hajda, he said: "IS_ERR_VALUE should be used only with unsigned long type. Otherwise it can work incorrectly. To achieve this function xt_percpu_counter_alloc is modified to return unsigned long, and its result is assigned to temporary variable to perform error checking, before assigning to .pcnt field. The patch follows conclusion from discussion on LKML [1][2]. [1]: http://permalink.gmane.org/gmane.linux.kernel/2120927 [2]: http://permalink.gmane.org/gmane.linux.kernel/2150581" Original patch from Andrzej is here: http://patchwork.ozlabs.org/patch/582970/ This patch has clashed with input validation fixes for x_tables. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2016-04-28Merge branch 'for-linus' of ↵Linus Torvalds6-55/+55
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fixes from Sage Weil: "There is a lifecycle fix in the auth code, a fix for a narrow race condition on map, and a helpful message in the log when there is a feature mismatch (which happens frequently now that the default server-side options have changed)" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: rbd: report unsupported features to syslog rbd: fix rbd map vs notify races libceph: make authorizer destruction independent of ceph_auth_client
2016-04-28net: dsa: Provide CPU port statistics to master netdevFlorian Fainelli1-0/+88
This patch overloads the DSA master netdev, aka CPU Ethernet MAC to also include switch-side statistics, which is useful for debugging purposes, when the switch is not properly connected to the Ethernet MAC (duplex mismatch, (RG)MII electrical issues etc.). We accomplish this by retaining the original copy of the master netdev's ethtool_ops, and just overload the 3 operations we care about: get_sset_count, get_strings and get_ethtool_stats so as to intercept these calls and call into the original master_netdev ethtool_ops, plus our own. We take this approach as opposed to providing a set of DSA helper functions that would retrive the CPU port's statistics, because the entire purpose of DSA is to allow unmodified Ethernet MAC drivers to be used as CPU conduit interfaces, therefore, statistics overlay in such drivers would simply not scale. The new ethtool -S <iface> output would therefore look like this now: <iface> statistics p<2 digits cpu port number>_<switch MIB counter names> Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-28tcp: give prequeue mode some careEric Dumazet1-5/+5
TCP prequeue goal is to defer processing of incoming packets to user space thread currently blocked in a recvmsg() system call. Intent is to spend less time processing these packets on behalf of softirq handler, as softirq handler is unfair to normal process scheduler decisions, as it might interrupt threads that do not even use networking. Current prequeue implementation has following issues : 1) It only checks size of the prequeue against sk_rcvbuf It was fine 15 years ago when sk_rcvbuf was in the 64KB vicinity. But we now have ~8MB values to cope with modern networking needs. We have to add sk_rmem_alloc in the equation, since out of order packets can definitely use up to sk_rcvbuf memory themselves. 2) Even with a fixed memory truesize check, prequeue can be filled by thousands of packets. When prequeue needs to be flushed, either from sofirq context (in tcp_prequeue() or timer code), or process context (in tcp_prequeue_process()), this adds a latency spike which is often not desirable. I added a fixed limit of 32 packets, as this translated to a max flush time of 60 us on my test hosts. Also note that all packets in prequeue are not accounted for tcp_mem, since they are not charged against sk_forward_alloc at this point. This is probably not a big deal. Note that this might increase LINUX_MIB_TCPPREQUEUEDROPPED counts, which is misnamed, as packets are not dropped at all, but rather pushed to the stack (where they can be either consumed or dropped) Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-28gre: reject GUE and FOU in collect metadata modeJiri Benc1-0/+5
The collect metadata mode does not support GUE nor FOU. This might be implemented later; until then, we should reject such config. I think this is okay to be changed. It's unlikely anyone has such configuration (as it doesn't work anyway) and we may need a way to distinguish whether it's supported or not by the kernel later. For backwards compatibility with iproute2, it's not possible to just check the attribute presence (iproute2 always includes the attribute), the actual value has to be checked, too. Fixes: 2e15ea390e6f4 ("ip_gre: Add support to collect tunnel metadata.") Signed-off-by: Jiri Benc <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-28gre: build header correctly for collect metadata tunnelsJiri Benc1-4/+5
In ipgre (i.e. not gretap) + collect metadata mode, the skb was assumed to contain Ethernet header and was encapsulated as ETH_P_TEB. This is not the case, the interface is ARPHRD_IPGRE and the protocol to be used for encapsulation is skb->protocol. Fixes: 2e15ea390e6f4 ("ip_gre: Add support to collect tunnel metadata.") Signed-off-by: Jiri Benc <[email protected]> Acked-by: Pravin B Shelar <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-28gre: do not assign header_ops in collect metadata modeJiri Benc1-2/+3
In ipgre mode (i.e. not gretap) with collect metadata flag set, the tunnel is incorrectly assumed to be mGRE in NBMA mode (see commit 6a5f44d7a048c). This is not the case, we're controlling the encapsulation addresses by lwtunnel metadata. And anyway, assigning dev->header_ops in collect metadata mode does not make sense. Although it would be more user firendly to reject requests that specify both the collect metadata flag and a remote/local IP address, this would break current users of gretap or introduce ugly code and differences in handling ipgre and gretap configuration. Keep the current behavior of remote/local IP address being ignored in such case. v3: Back to v1, added explanation paragraph. v2: Reject configuration specifying both remote/local address and collect metadata flag. Fixes: 2e15ea390e6f4 ("ip_gre: Add support to collect tunnel metadata.") Signed-off-by: Jiri Benc <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-28Merge tag 'mac80211-for-davem-2016-04-27' of ↵David S. Miller1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Just a single fix, for a per-CPU memory leak in a (root user triggerable) error case. ==================== Signed-off-by: David S. Miller <[email protected]>
2016-04-28tipc: remove an unnecessary NULL checkDan Carpenter1-2/+1
This is never called with a NULL "buf" and anyway, we dereference 's' on the lines before so it would Oops before we reach the check. Signed-off-by: Dan Carpenter <[email protected]> Acked-by: Ying Xue <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-28Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller5-4/+23
Antonio Quartulli says: ==================== In this patchset you can find the following fixes: 1) check skb size to avoid reading beyond its border when delivering payloads, by Sven Eckelmann 2) initialize last_seen time in neigh_node object to prevent cleanup routine from accidentally purge it, by Marek Lindner 3) release "recently added" slave interfaces upon virtual/batman interface shutdown, by Sven Eckelmann 4) properly decrease router object reference counter upon routing table update, by Sven Eckelmann 5) release queue slots when purging OGM packets of deactivating slave interface, by Linus Lüssing Patch 2 and 3 have no "Fixes:" tag because the offending commits date back to when batman-adv was not yet officially in the net tree. ==================== Signed-off-by: David S. Miller <[email protected]>
2016-04-28tuntap: calculate rps hash only when neededJason Wang1-0/+1
There's no need to calculate rps hash if it was not enabled. So this patch export rps_needed and check it before trying to get rps hash. Tests (using pktgen to inject packets to guest) shows this can improve pps about 13% (when rps is disabled). Before: ~1150000 pps After: ~1300000 pps Cc: Michael S. Tsirkin <[email protected]> Signed-off-by: Jason Wang <[email protected]> ---- Changes from V1: - Fix build when CONFIG_RPS is not set Signed-off-by: David S. Miller <[email protected]>