aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-10-22tun: avoid extra timer schedule in tun_flow_cleanup()Eric Dumazet1-3/+6
If tun_flow_cleanup() deleted all flows, no need to arm the timer again. It will be armed next time tun_flow_update() is called. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22tun: do not block BH again in tun_flow_cleanup()Eric Dumazet1-2/+2
tun_flow_cleanup() being a timer callback, it is already running in BH context. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22Merge branch 'bpf-BASE_RTT'David S. Miller7-7/+213
Lawrence Brakmo says: ==================== bpf: add support for BASE_RTT This patch set adds the following functionality to socket_ops BPF programs. 1) Add bpf helper function bpf_getsocketops. Currently only supports TCP_CONGESTION 2) Add BPF_SOCKET_OPS_BASE_RTT op to get the base RTT of the connection. In general, the base RTT indicates the threshold such that RTTs above it indicate congestion. More details in the relevant patches. Consists of the following patches: [PATCH net-next 1/5] bpf: add support for BPF_SOCK_OPS_BASE_RTT [PATCH net-next 2/5] bpf: Adding helper function bpf_getsockops [PATCH net-next 3/5] bpf: Add BPF_SOCKET_OPS_BASE_RTT support to [PATCH net-next 4/5] bpf: sample BPF_SOCKET_OPS_BASE_RTT program [PATCH net-next 5/5] bpf: create samples/bpf/tcp_bpf.readme ==================== Signed-off-by: David S. Miller <[email protected]>
2017-10-22bpf: create samples/bpf/tcp_bpf.readmeLawrence Brakmo1-0/+26
Readme file explaining how to create a cgroupv2 and attach one of the tcp_*_kern.o socket_ops BPF program. Signed-off-by: Lawrence Brakmo <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Acked_by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22bpf: sample BPF_SOCKET_OPS_BASE_RTT programLawrence Brakmo2-0/+79
Sample socket_ops BPF program to test the BPF helper function bpf_getsocketops and the new socket_ops op BPF_SOCKET_OPS_BASE_RTT. The program provides a base RTT of 80us when the calling flow is within a DC (as determined by the IPV6 prefix) and the congestion algorithm is "nv". Signed-off-by: Lawrence Brakmo <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Acked_by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22bpf: Add BPF_SOCKET_OPS_BASE_RTT support to tcp_nvLawrence Brakmo1-2/+38
TCP_NV will try to get the base RTT from a socket_ops BPF program if one is loaded. NV will then use the base RTT to bound its min RTT (its notion of the base RTT). It uses the base RTT as an upper bound and 80% of the base RTT as its lower bound. In other words, NV will consider filtered RTTs larger than base RTT as a sign of congestion. As a result, there is no minRTT inflation when there is a lot of congestion. For example, in a DC where the RTTs are less than 40us when there is no congestion, a base RTT value of 80us improves the performance of NV. The difference between the uncongested RTT and the base RTT provided represents how much queueing we are willing to have (in practice it can be higher). NV has been tunned to reduce congestion when there are many flows at the cost of one flow not achieving full bandwith utilization. When a reasonable base RTT is provided, one NV flow can now fully utilize the full bandwidth. In addition, the performance is also improved when there are many flows. In the following examples the NV results are using a kernel with this patch set (i.e. both NV results are using the new nv_loss_dec_factor). With one host sending to another host and only one flow the goodputs are: Cubic: 9.3 Gbps, NV: 5.5 Gbps, NV (baseRTT=80us): 9.2 Gbps With 2 hosts sending to one host (1 flow per host, the goodput per flow is: Cubic: 4.6 Gbps, NV: 4.5 Gbps, NV (baseRTT=80us)L 4.6 Gbps But the RTTs seen by a ping process in the sender is: Cubic: 3.3ms NV: 97us, NV (baseRTT=80us): 146us With a lot of flows things look even better for NV with baseRTT. Here we have 3 hosts sending to one host. Each sending host has 6 flows: 1 stream, 4x1MB RPC, 1x10KB RPC. Cubic, NV and NV with baseRTT all fully utilize the full available bandwidth. However, the distribution of bandwidth among the flows is very different. For the 10KB RPC flow: Cubic: 27Mbps, NV: 111Mbps, NV (baseRTT=80us): 222Mbps The 99% latencies for the 10KB flows are: Cubic: 26ms, NV: 1ms, NV (baseRTT=80us): 500us The RTT seen by a ping process at the senders: Cubic: 3.2ms NV: 720us, NV (baseRTT=80us): 330us Signed-off-by: Lawrence Brakmo <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22bpf: Adding helper function bpf_getsockopsLawrence Brakmo3-5/+63
Adding support for helper function bpf_getsockops to socket_ops BPF programs. This patch only supports TCP_CONGESTION. Signed-off-by: Vlad Vysotsky <[email protected]> Acked-by: Lawrence Brakmo <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22bpf: add support for BPF_SOCK_OPS_BASE_RTTLawrence Brakmo1-0/+7
A congestion control algorithm can make a call to the BPF socket_ops program to request the base RTT. The base RTT can be congestion control dependent and is meant to represent a congestion threshold such that RTTs above it indicate congestion. This is especially useful for flows within a DC where the base RTT is easy to obtain. Being provided a base RTT solves a basic problem in RTT based congestion avoidance algorithms (such as Vegas, NV and BBR). Although it is easy to get the base RTT when the network is not congested, it is very diffcult to do when it is very congested. Newer connections get an inflated value of the base RTT leading to unfariness (newer flows with a larger base RTT get more bandwidth). As a result, RTT based congestion avoidance algorithms tend to update their base RTTs to improve fairness. In very congested networks this can lead to base RTT inflation, reducing the ability of these RTT based congestion control algorithms to prevent congestion. Note that in my experiments with TCP-NV, the base RTT provided can be much larger than the actual hardware RTT. For example, experimenting with hosts within a rack where the hardware RTT is 16-20us, I've used base RTTs up to 150us. The effect of using a larger base RTT is that the congestion avoidance algorithm will allow more queueing. When there are only a few flows the main effect is larger measured RTTs and RPC latencies due to the increased queueing. When there are a lot of flows, a larger base RTT can lead to more congestion and more packet drops. For this case, where the hardware RTT is 20us, a base RTT of 80us produces good results. This patch only introduces BPF_SOCK_OPS_BASE_RTT, a later patch in this set adds support for using it in TCP-NV. Further study and testing is needed before support can be added to other delay based congestion avoidance algorithms. Signed-off-by: Lawrence Brakmo <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22nfp: use struct fields for 8 bit-wide accessPieter Jansen van Vuuren2-74/+39
Use direct access struct fields rather than PREP_FIELD() macros to manipulate the jump ID and length, both of which are exactly 8-bits wide. This simplifies the code somewhat. Signed-off-by: Simon Horman <[email protected]> Signed-off-by: Pieter Jansen van Vuuren <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22net: x25: mark expected switch fall-throughsGustavo A. R. Silva2-1/+2
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Signed-off-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22net: af_unix: mark expected switch fall-throughGustavo A. R. Silva1-0/+1
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Signed-off-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22rxrpc: Don't release call mutex on error pointerDavid Howells1-2/+3
Don't release call mutex at the end of rxrpc_kernel_begin_call() if the call pointer actually holds an error value. Fixes: 540b1c48c37a ("rxrpc: Fix deadlock between call creation and sendmsg/recvmsg") Reported-by: Marc Dionne <[email protected]> Signed-off-by: David Howells <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22Merge branch 'stmmac-hw-tstamp-fixes'David S. Miller2-8/+8
Jose Abreu says: ==================== net: stmmac: Fix HW timestamping Three fixes for HW timestamping feature, all of them for RX side. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-10-22net: stmmac: Prevent infinite loop in get_rx_timestamp_status()Jose Abreu1-1/+1
Prevent infinite loop by correctly setting the loop condition to break when i == 10. Signed-off-by: Jose Abreu <[email protected]> Cc: David S. Miller <[email protected]> Cc: Joao Pinto <[email protected]> Cc: Giuseppe Cavallaro <[email protected]> Cc: Alexandre Torgue <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22net: stmmac: Fix stmmac_get_rx_hwtstamp()Jose Abreu1-7/+6
When using GMAC4 the valid timestamp is from CTX next desc but we are passing the previous desc to get_rx_timestamp_status() callback. Fix this and while at it rework a little bit the function logic. Signed-off-by: Jose Abreu <[email protected]> Cc: David S. Miller <[email protected]> Cc: Joao Pinto <[email protected]> Cc: Giuseppe Cavallaro <[email protected]> Cc: Alexandre Torgue <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22net: stmmac: Add missing call to dev_kfree_skb()Jose Abreu1-0/+1
When RX HW timestamp is enabled and a frame is discarded we are not freeing the skb but instead only setting to NULL the entry. Add a call to dev_kfree_skb_any() so that skb entry is correctly freed. Signed-off-by: Jose Abreu <[email protected]> Cc: David S. Miller <[email protected]> Cc: Joao Pinto <[email protected]> Cc: Giuseppe Cavallaro <[email protected]> Cc: Alexandre Torgue <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-21Merge branch 'for-linus' of ↵Linus Torvalds13-97/+199
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: - joydev now implements a blacklist to avoid creating joystick nodes for accelerometers found in composite devices such as PlaStation controllers - assorted driver fixes * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: ims-psu - check if CDC union descriptor is sane Input: joydev - blacklist ds3/ds4/udraw motion sensors Input: allow matching device IDs on property bits Input: factor out and export input_device_id matching code Input: goodix - poll the 'buffer status' bit before reading data Input: axp20x-pek - fix module not auto-loading for axp221 pek Input: tca8418 - enable interrupt after it has been requested Input: stmfts - fix setting ABS_MT_POSITION_* maximum size Input: ti_am335x_tsc - fix incorrect step config for 5 wire touchscreen Input: synaptics - disable kernel tracking on SMBus devices
2017-10-22geneve: Get rid of is_all_zero(), streamline is_tnl_info_zero()Stefano Brivio1-16/+3
No need to re-invent memchr_inv() with !is_all_zero(). While at it, replace conditional and return clauses with a single return clause in is_tnl_info_zero(). Signed-off-by: Stefano Brivio <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22Merge branch 'dsa-lan9303-Add-fdb-mdb-methods'David S. Miller2-0/+344
Egil Hjelmeland says: ==================== net: dsa: lan9303: Add fdb/mdb methods This series add support for accessing and managing the lan9303 ALR (Address Logic Resolution). The first patch add low level functions for accessing the ALR, along with port_fast_age and port_fdb_dump methods. The second patch add functions for managing ALR entires, along with remaining fdb/mdb methods. Note that to complete STP support, a special ALR entry with the STP eth address must be added too. This must be addressed later. Comments welcome! Changes v2 -> v3: - Whitespace polishing. Removed some "section" comments. - Prefixed ALR constants with LAN9303_ for consistency. - Patch 2: lan9303_port_fast_age() wrap the "port" into a struct for passing as context to alr_loop_cb_del_port_learned. Safer in event of type change. - Patch 2: Reviewed-by: Vivien Didelot <[email protected]> Changes v1 -> v2: - Patch 2: Removed question comment ==================== Signed-off-by: David S. Miller <[email protected]>
2017-10-22net: dsa: lan9303: Add fdb/mdb manipulationEgil Hjelmeland2-0/+182
Add functions for managing the lan9303 ALR (Address Logic Resolution). Implement DSA methods: port_fdb_add, port_fdb_del, port_mdb_prepare, port_mdb_add and port_mdb_del. Since the lan9303 do not offer reading specific ALR entry, the driver caches all static entries - in a flat table. Signed-off-by: Egil Hjelmeland <[email protected]> Reviewed-by: Vivien Didelot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22net: dsa: lan9303: Add port_fast_age and port_fdb_dump methodsEgil Hjelmeland2-0/+162
Add DSA method port_fast_age as a step to STP support. Add low level functions for accessing the lan9303 ALR (Address Logic Resolution). Added DSA method port_fdb_dump Signed-off-by: Egil Hjelmeland <[email protected]> Reviewed-by: Vivien Didelot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-21Merge branch 'for-linus' of ↵Linus Torvalds5-6/+7
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: "MS_I_VERSION fixes - Mimi's fix + missing bits picked from Matthew (his patch contained a duplicate of the fs/namespace.c fix as well, but by that point the original fix had already been applied)" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: Convert fs/*/* to SB_I_VERSION vfs: fix mounting a filesystem with i_version
2017-10-22tipc: refactor tipc_sk_timeout() functionJon Maloy1-26/+23
The function tipc_sk_timeout() is more complex than necessary, and even seems to contain an undetected bug. At one of the occurences where we renew the timer we just order it with (HZ / 20), instead of (jiffies + HZ / 20); In this commit we clean up the function. Acked-by: Ying Xue <[email protected]> Signed-off-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22Merge branch 'net-driver-refcont_t'David S. Miller20-96/+105
Elena Reshetova says: ==================== networking drivers refcount_t conversions Note: these are the last patches related to networking that perform conversion of refcounters from atomic_t to refcount_t. In contrast to the core network refcounter conversions that were merged earlier, these are much more straightforward ones. This series, for various networking drivers, replaces atomic_t reference counters with the new refcount_t type and API (see include/linux/refcount.h). By doing this we prevent intentional or accidental underflows or overflows that can led to use-after-free vulnerabilities. The patches are fully independent and can be cherry-picked separately. Patches are based on top of net-next. If there are no objections to the patches, please merge them via respective trees ==================== Signed-off-by: David S. Miller <[email protected]>
2017-10-22drivers, connector: convert cn_callback_entry.refcnt from atomic_t to refcount_tElena Reshetova3-5/+5
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable cn_callback_entry.refcnt is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <[email protected]> Reviewed-by: David Windsor <[email protected]> Reviewed-by: Hans Liljestrand <[email protected]> Signed-off-by: Elena Reshetova <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22drivers, net, ppp: convert syncppp.refcnt from atomic_t to refcount_tElena Reshetova1-5/+6
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable syncppp.refcnt is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <[email protected]> Reviewed-by: David Windsor <[email protected]> Reviewed-by: Hans Liljestrand <[email protected]> Signed-off-by: Elena Reshetova <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22drivers, net, ppp: convert ppp_file.refcnt from atomic_t to refcount_tElena Reshetova1-10/+11
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable ppp_file.refcnt is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <[email protected]> Reviewed-by: David Windsor <[email protected]> Reviewed-by: Hans Liljestrand <[email protected]> Signed-off-by: Elena Reshetova <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22drivers, net, ppp: convert asyncppp.refcnt from atomic_t to refcount_tElena Reshetova1-5/+5
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable asyncppp.refcnt is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <[email protected]> Reviewed-by: David Windsor <[email protected]> Reviewed-by: Hans Liljestrand <[email protected]> Signed-off-by: Elena Reshetova <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22drivers, net: convert masces_tx_sa.refcnt from atomic_t to refcount_tElena Reshetova1-4/+4
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable masces_tx_sa.refcnt is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <[email protected]> Reviewed-by: David Windsor <[email protected]> Reviewed-by: Hans Liljestrand <[email protected]> Signed-off-by: Elena Reshetova <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22drivers, net: convert masces_rx_sc.refcnt from atomic_t to refcount_tElena Reshetova1-4/+4
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable masces_rx_sc.refcnt is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <[email protected]> Reviewed-by: David Windsor <[email protected]> Reviewed-by: Hans Liljestrand <[email protected]> Signed-off-by: Elena Reshetova <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22drivers, net: convert masces_rx_sa.refcnt from atomic_t to refcount_tElena Reshetova1-4/+5
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable masces_rx_sa.refcnt is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <[email protected]> Reviewed-by: David Windsor <[email protected]> Reviewed-by: Hans Liljestrand <[email protected]> Signed-off-by: Elena Reshetova <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22drivers, net, hamradio: convert sixpack.refcnt from atomic_t to refcount_tElena Reshetova1-6/+6
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable sixpack.refcnt is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <[email protected]> Reviewed-by: David Windsor <[email protected]> Reviewed-by: Hans Liljestrand <[email protected]> Signed-off-by: Elena Reshetova <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22drivers, net, mlx5: convert fs_node.refcount from atomic_t to refcount_tElena Reshetova2-15/+16
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable fs_node.refcount is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <[email protected]> Reviewed-by: David Windsor <[email protected]> Reviewed-by: Hans Liljestrand <[email protected]> Signed-off-by: Elena Reshetova <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22drivers, net, mlx5: convert mlx5_cq.refcount from atomic_t to refcount_tElena Reshetova2-10/+10
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable mlx5_cq.refcount is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <[email protected]> Reviewed-by: David Windsor <[email protected]> Reviewed-by: Hans Liljestrand <[email protected]> Signed-off-by: Elena Reshetova <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22drivers, net, mlx4: convert mlx4_srq.refcount from atomic_t to refcount_tElena Reshetova2-5/+5
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable mlx4_srq.refcount is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <[email protected]> Reviewed-by: David Windsor <[email protected]> Reviewed-by: Hans Liljestrand <[email protected]> Signed-off-by: Elena Reshetova <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22drivers, net, mlx4: convert mlx4_qp.refcount from atomic_t to refcount_tElena Reshetova2-5/+5
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable mlx4_qp.refcount is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <[email protected]> Reviewed-by: David Windsor <[email protected]> Reviewed-by: Hans Liljestrand <[email protected]> Signed-off-by: Elena Reshetova <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22drivers, net, mlx4: convert mlx4_cq.refcount from atomic_t to refcount_tElena Reshetova2-6/+6
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable mlx4_cq.refcount is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <[email protected]> Reviewed-by: David Windsor <[email protected]> Reviewed-by: Hans Liljestrand <[email protected]> Signed-off-by: Elena Reshetova <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22drivers, net, ethernet: convert mtk_eth.dma_refcnt from atomic_t to refcount_tElena Reshetova2-4/+8
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable mtk_eth.dma_refcnt is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <[email protected]> Reviewed-by: David Windsor <[email protected]> Reviewed-by: Hans Liljestrand <[email protected]> Signed-off-by: Elena Reshetova <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22drivers, net, ethernet: convert clip_entry.refcnt from atomic_t to refcount_tElena Reshetova2-8/+9
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable clip_entry.refcnt is used as pure reference counter. Convert it to refcount_t and fix up the operations. Suggested-by: Kees Cook <[email protected]> Reviewed-by: David Windsor <[email protected]> Reviewed-by: Hans Liljestrand <[email protected]> Signed-off-by: Elena Reshetova <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22Merge branch 'mlxsw-fixes'David S. Miller2-1/+41
Jiri Pirko says: ==================== mlxsw: spectrum: Configure TTL of "inherit" for offloaded tunnels Petr says: Currently mlxsw only offloads tunnels that are configured with TTL of "inherit" (which is the default). However, Spectrum defaults to 255 and the driver neglects to change the configuration. Thus the tunnel packets from offloaded tunnels always have TTL of 255, even though tunnels with explicit TTL of 255 are never actually offloaded. To fix this, introduce support for TIGCR, the register that keeps the related bits of global tunnel configuration, and use it on first offload to properly configure inheritance of TTL of tunnel packets from overlay packets. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-10-22mlxsw: spectrum_router: Configure TIGCR on initPetr Machata1-1/+10
Spectrum tunnels do not default to ttl of "inherit" like the Linux ones do. Configure TIGCR on router init so that the TTL of tunnel packets is copied from the overlay packets. Fixes: ee954d1a91b2 ("mlxsw: spectrum_router: Support GRE tunnels") Signed-off-by: Petr Machata <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22mlxsw: reg: Add Tunneling IPinIP General Configuration RegisterPetr Machata1-0/+31
The TIGCR register is used for setting up the IPinIP Tunnel configuration. Fixes: ee954d1a91b2 ("mlxsw: spectrum_router: Support GRE tunnels") Signed-off-by: Petr Machata <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22Merge branch 'hns3-loopback-selftest'David S. Miller4-5/+343
Yunsheng Lin says: ==================== Add mac loopback selftest support in hns3 driver This patchset refactors the skb receiving and transmitting function before adding mac loopback selftest support in hns3 driver. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-10-22net: hns3: Add mac loopback selftest support in hns3 driverYunsheng Lin2-0/+327
This patch adds mac loopback selftest support for ethtool cmd by checking if a transmitted packet can be received correctly when mac loopback is enabled. Signed-off-by: Yunsheng Lin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22net: hns3: Refactor the skb receiving and transmitting functionYunsheng Lin2-5/+16
This patch refactors the skb receiving and transmitting functions and export them in order to support the ethtool's mac loopback selftest. Signed-off-by: Yunsheng Lin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22net: ethtool: remove error check for legacy setting transceiver typeNiklas Söderlund1-3/+2
Commit 9cab88726929605 ("net: ethtool: Add back transceiver type") restores the transceiver type to struct ethtool_link_settings and convert_link_ksettings_to_legacy_settings() but forgets to remove the error check for the same in convert_legacy_settings_to_link_ksettings(). This prevents older versions of ethtool to change link settings. # ethtool --version ethtool version 3.16 # ethtool -s eth0 autoneg on speed 100 duplex full Cannot set new settings: Invalid argument not setting speed not setting duplex not setting autoneg While newer versions of ethtool works. # ethtool --version ethtool version 4.10 # ethtool -s eth0 autoneg on speed 100 duplex full [ 57.703268] sh-eth ee700000.ethernet eth0: Link is Down [ 59.618227] sh-eth ee700000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx Fixes: 19cab88726929605 ("net: ethtool: Add back transceiver type") Signed-off-by: Niklas Söderlund <[email protected]> Reported-by: Renjith R V <[email protected]> Tested-by: Geert Uytterhoeven <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22Merge branch 'bpftool-add-a-version-command-and-fix-several-items'David S. Miller6-30/+56
Jakub Kicinski says: ==================== tools: bpftool: add a "version" command, and fix several items Quentin says: The first seven patches of this series bring several minor fixes to bpftool. Please see individual commit logs for details. Last patch adds a "version" commands to bpftool, which is in fact the version of the kernel from which it was compiled. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-10-22tools: bpftool: add a command to display bpftool versionQuentin Monnet2-1/+15
This command can be used to print the version of the tool, which is in fact the version from Linux taken from usr/include/linux/version.h. Example usage: $ bpftool version bpftool v4.14.0 Signed-off-by: Quentin Monnet <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22tools: bpftool: show that `opcodes` or `file FILE` should be exclusiveQuentin Monnet2-6/+6
For the `bpftool prog dump { jited | xlated } ...` command, adding `opcodes` keyword (to request opcodes to be printed) will have no effect if `file FILE` (to write binary output to FILE) is provided. The manual page and the help message to be displayed in the terminal should reflect that, and indicate that these options should be mutually exclusive. Signed-off-by: Quentin Monnet <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-10-22tools: bpftool: print all relevant byte opcodes for "load double word"Quentin Monnet1-3/+12
The eBPF instruction permitting to load double words (8 bytes) into a register need 8-byte long "immediate" field, and thus occupy twice the space of other instructions. bpftool was aware of this and would increment the instruction counter only once on meeting such instruction, but it would only print the first four bytes of the immediate value to load. Make it able to dump the whole 16 byte-long double instruction instead (as would `llvm-objdump -d <program>`). Signed-off-by: Quentin Monnet <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>