path: root/net
Age  Commit message  Author  Files  Lines
2014-01-16tipc: standardize sendmsg routine of connected socketYing Xue1-19/+41
Standardize the behaviour of waiting for events in TIPC send_packet() so that all variables of socket or port structures are protected within socket lock, allowing the process of calling sendmsg() to be woken up at appropriate time. Signed-off-by: Ying Xue <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-16tipc: standardize sendmsg routine of connectionless socketYing Xue1-10/+29
Compared with how other stacks wait for events, the TIPC sendmsg() implementation might be perceived as different, and sometimes even incorrect. For instance, sk_sleep() and the tport->congested variable associated with the socket are exposed without socket lock protection while wait_event_interruptible_timeout() accesses them. Standardizing it on the implementation used by other stacks corrects these errors, in which the process calling sendmsg() either cannot be woken up even if an expected event arrives at the socket, or is woken up improperly although the wake condition doesn't match. Signed-off-by: Ying Xue <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-16tipc: standardize accept routineYing Xue1-13/+41
Comparing the behaviour of how to wait for events in TIPC accept() with other stacks, the TIPC implementation might be perceived as different, and sometimes even incorrect. As sk_sleep() and sk->sk_receive_queue variables associated with socket are not protected by socket lock, the process of calling accept() may be woken up improperly or sometimes cannot be woken up at all. After standardizing it with inet_csk_wait_for_connect routine, we can get benefits including: avoiding 'thundering herd' phenomenon, adding a timeout mechanism for accept(), coping with a pending signal, and having sk_sleep() and sk->sk_receive_queue being always protected within socket lock scope and so on. Signed-off-by: Ying Xue <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-16tipc: standardize connect routineYing Xue1-30/+33
Comparing the behaviour of how to wait for events in TIPC connect() with other stacks, the TIPC implementation might be perceived as different, and sometimes even incorrect. For instance, as both sock->state and sk_sleep() are directly fed to wait_event_interruptible_timeout() as its arguments, and socket lock has to be released before we call wait_event_interruptible_timeout(), the two variables associated with socket are exposed out of socket lock protection, thereby probably getting stale values so that the process of calling connect() cannot be woken up exactly even if correct event arrives or it is woken up improperly even if the wake condition is not satisfied in practice. Therefore, standardizing its behaviour with sk_stream_wait_connect routine can avoid these risks. Additionally the implementation of connect routine is simplified as a whole, allowing it to return correct values in all different cases. Signed-off-by: Ying Xue <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-16sctp: remove the unnecessary assignmentwangweidong1-1/+0
On the success path, status is already 0, so there is no need to assign it again. Just remove the assignment. Signed-off-by: Wang Weidong <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-16net_sched: act: pick a different type for act_xtWANG Cong1-1/+1
In tcf_register_action() we check either ->type or ->kind to see if there is an existing action registered, but ipt action registers two actions with same type but different kinds. They should have different types too. Cc: Jamal Hadi Salim <[email protected]> Cc: David S. Miller <[email protected]> Signed-off-by: Cong Wang <[email protected]> Signed-off-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-16Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller4-13/+31
Included change: - properly format already existing kerneldoc Signed-off-by: David S. Miller <[email protected]>
2014-01-16net_sched: act: use tcf_hash_release() in net/sched/act_police.cWANG Cong1-27/+3
Cc: Jamal Hadi Salim <[email protected]> Cc: David S. Miller <[email protected]> Signed-off-by: Cong Wang <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-16Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller1-1/+1
Included change: - properly compute the batman-adv header overhead. Such result is later used to initialize the hard_header_len member of the soft-interface netdev object Signed-off-by: David S. Miller <[email protected]>
2014-01-16net: add NETDEV_PRECHANGEMTU to notify before mtu change happensVeaceslav Falico1-0/+5
Currently, if a device changes its mtu, first the change happens (involving all the side effects), and after that the NETDEV_CHANGEMTU is sent so that other devices can catch up with the new mtu. However, if they return NOTIFY_BAD, then the change is reverted and an error is returned. This is a really long and costly operation (sometimes). To fix this, add a NETDEV_PRECHANGEMTU notification which is called prior to any change actually happening, and if any callee returns NOTIFY_BAD, the change is aborted. This way we skip all the applying and reverting of the mtu. CC: "David S. Miller" <[email protected]> CC: Jiri Pirko <[email protected]> CC: Eric Dumazet <[email protected]> CC: Nicolas Dichtel <[email protected]> CC: Cong Wang <[email protected]> Signed-off-by: Veaceslav Falico <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
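The "ask first, change only if nobody objects" flow described above can be sketched in plain userspace C. This is only an illustration, not the kernel notifier API: toy_netdev, mtu_listener, toy_set_mtu() and cap_at_1500() are hypothetical names, and the two verdicts merely stand in for NOTIFY_OK and NOTIFY_BAD.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical verdicts, loosely modelled on NOTIFY_OK / NOTIFY_BAD. */
enum verdict { VERDICT_OK, VERDICT_BAD };

/* A listener inspects the proposed MTU and may veto it. */
typedef enum verdict (*mtu_listener)(int new_mtu);

struct toy_netdev {
	int mtu;
	const mtu_listener *listeners;
	size_t n_listeners;
};

/* Ask every listener first; touch dev->mtu only if nobody objects. */
int toy_set_mtu(struct toy_netdev *dev, int new_mtu)
{
	assert(dev != NULL);

	for (size_t i = 0; i < dev->n_listeners; i++)
		if (dev->listeners[i](new_mtu) == VERDICT_BAD)
			return -1;	/* aborted before any side effect */

	dev->mtu = new_mtu;
	return 0;
}

/* Example listener: refuses anything above 1500. */
enum verdict cap_at_1500(int new_mtu)
{
	return new_mtu > 1500 ? VERDICT_BAD : VERDICT_OK;
}
```

Because the veto arrives before dev->mtu is written, a rejected request leaves the device untouched and there is nothing to roll back.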
2014-01-16net: Check skb->rxhash in gro_receiveTom Herbert1-1/+8
When initializing a gro_list for a packet, first check the rxhash of the incoming skb against that of the skbs in the list. This should be a very strong indicator of whether the flow is going to be matched, and potentially allows a lot of other checks to be short circuited. Use skb_hash_raw so that we don't force the hash to be calculated. Tested by running netperf 200 TCP_STREAMs between two machines with GRO, HW rxhash, and 1G. Saw no performance degradation, slight reduction of time in dev_gro_receive. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-16packet: use percpu mmap tx frame pending refcountDaniel Borkmann3-7/+62
In PF_PACKET's packet mmap(), we can avoid using one atomic_inc() and one atomic_dec() call in skb destructor and use a percpu reference count instead in order to determine if packets are still pending to be sent out. Micro-benchmark with [1] that has been slightly modified (that is, protocol = 0 in socket(2) and bind(2)), example on a rather crappy testing machine; I expect it to scale and have even better results on bigger machines: ./packet_mm_tx -s7000 -m7200 -z700000 em1, avg over 2500 runs: With patch: 4,022,015 cyc Without patch: 4,812,994 cyc time ./packet_mm_tx -s64 -c10000000 em1 > /dev/null, stable: With patch: real 1m32.241s user 0m0.287s sys 1m29.316s Without patch: real 1m38.386s user 0m0.265s sys 1m35.572s In function tpacket_snd(), it is okay to use packet_read_pending() since in fast-path we short-circuit the condition already with ph != NULL, since we have next frames to process. In case we have MSG_DONTWAIT, we also do not execute this path as need_wait is false here anyway, and in case of _no_ MSG_DONTWAIT flag, it is okay to call a packet_read_pending(), because when we ever reach that path, we're done processing outgoing frames anyway and only look if there are skbs still outstanding to be orphaned. We can stay lockless in this percpu counter since it's acceptable when we reach this path for the sum to be imprecise first, but we'll level out at 0 after all pending frames have reached the skb destructor eventually through tx reclaim. When people pin a tx process to particular CPUs, we expect overflows to happen in the reference counter as on one CPU we expect heavy increase; and distributed through ksoftirqd on all CPUs a decrease, for example. As David Laight points out, since the C language doesn't define the result of signed int overflow (i.e. rather than wrap, it is allowed to saturate as a possible outcome), we have to use unsigned int as reference count. The sum over all CPUs when tx is complete will result in 0 again.
The BUG_ON() in tpacket_destruct_skb() we can remove as well. It can _only_ be set from inside tpacket_snd() path and we made sure to increase tx_ring.pending in any case before we called po->xmit(skb). So testing for tx_ring.pending == 0 is not too useful. Instead, it would rather have been useful to test if lower layers didn't orphan the skb so that we're missing ring slots being put back to TP_STATUS_AVAILABLE. But such a bug will be caught in user space already as we end up realizing that we do not have any TP_STATUS_AVAILABLE slots left anymore. Therefore, we're all set. Btw, in case of RX_RING path, we do not make use of the pending member, therefore we also don't need to use up any percpu memory here. Also note that __alloc_percpu() already returns a zero-filled percpu area, so initialization is done already. [1] http://wiki.ipxwarzone.com/index.php5?title=Linux_packet_mmap Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
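The argument about unsigned wraparound can be checked with a small userspace model. This is a sketch only: toy_pending, NR_TOY_CPUS and the helper names are hypothetical, a plain array stands in for real percpu memory, and no concurrency is modelled.

```c
#include <assert.h>
#include <limits.h>

#define NR_TOY_CPUS 4	/* hypothetical CPU count for the sketch */

/* One unsigned slot per CPU; each slot is only touched by "its" CPU. */
struct toy_pending {
	unsigned int slot[NR_TOY_CPUS];
};

void toy_pending_inc(struct toy_pending *p, int cpu)
{
	assert(cpu >= 0 && cpu < NR_TOY_CPUS);
	p->slot[cpu]++;
}

void toy_pending_dec(struct toy_pending *p, int cpu)
{
	assert(cpu >= 0 && cpu < NR_TOY_CPUS);
	p->slot[cpu]--;	/* may wrap below zero; defined for unsigned */
}

/* The sum wraps modulo UINT_MAX + 1, so balanced inc/dec totals 0. */
unsigned int toy_pending_read(const struct toy_pending *p)
{
	unsigned int sum = 0;

	for (int cpu = 0; cpu < NR_TOY_CPUS; cpu++)
		sum += p->slot[cpu];
	return sum;
}
```

Incrementing three times on CPU 0 and decrementing once each on CPUs 1, 2 and 3 leaves individual slots at 3 and UINT_MAX, yet the sum reads 0, which is the property the commit relies on.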
2014-01-16packet: don't unconditionally schedule() in case of MSG_DONTWAITDaniel Borkmann1-7/+6
In tpacket_snd(), when we've discovered a first frame that is not in status TP_STATUS_SEND_REQUEST, and return a NULL buffer, we exit the send routine in case of MSG_DONTWAIT, since we've finished traversing the mmaped send ring buffer and don't care about pending frames. While doing so, we still unconditionally call an expensive schedule() in the packet_current_frame() "error" path, which is unnecessary in this case since it's enough to just quit the function. Also, in case MSG_DONTWAIT is not set, we should rather test for need_resched() first and do schedule() only if necessary since meanwhile pending frames could already have finished processing and called skb destructor. Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-16packet: improve socket create/bind latency in some casesDaniel Borkmann1-11/+22
Most people acquire PF_PACKET sockets with a protocol argument in the socket call, e.g. libpcap does so with htons(ETH_P_ALL) for all its sockets. Most likely, at some point in time a subsequent bind() call will follow, e.g. in libpcap with ... memset(&sll, 0, sizeof(sll)); sll.sll_family = AF_PACKET; sll.sll_ifindex = ifindex; sll.sll_protocol = htons(ETH_P_ALL); ... as arguments. What happens in the kernel is that already in socket() syscall, we install a proto hook via register_prot_hook() if our protocol argument is != 0. Yet, in bind() we're almost doing the same work by doing a unregister_prot_hook() with an expensive synchronize_net() call in case during socket() the proto was != 0, plus follow-up register_prot_hook() with a bound device to it this time, in order to limit traffic we get. In the case when the protocol and user supplied device index (== 0) does not change from socket() to bind(), we can spare us doing the same work twice. Similarly for re-binding to the same device and protocol. For these scenarios, we can decrease create/bind latency from ~7447us (sock-bind-2 case) to ~89us (sock-bind-1 case) with this patch. Alternatively, for the first case, if people care, they should simply create their sockets with proto == 0 argument and define the protocol during bind() as this saves a call to synchronize_net() as well (sock-bind-3 case). In all other cases, we're tied to user space behaviour we must not change, also since a bind() is not strictly required. Thus, we need the synchronize_net() to make sure no asynchronous packet processing paths still refer to the previous elements of po->prot_hook. In case of mmap()ed sockets, the workflow that includes bind() is socket() -> setsockopt(<ring>) -> bind(). In that case, a pair of {__unregister, register}_prot_hook is being called from setsockopt() in order to install the new protocol receive handler. Thus, when we call bind and can skip a re-hook, we have already previously installed the new handler. 
For fanout, this is handled entirely differently, so we should be good. Timings on an i7-3520M machine: * sock-bind-1: 89 us * sock-bind-2: 7447 us * sock-bind-3: 75 us sock-bind-1: socket(PF_PACKET, SOCK_RAW, htons(ETH_P_IP)) = 3 bind(3, {sa_family=AF_PACKET, proto=htons(ETH_P_IP), if=all(0), pkttype=PACKET_HOST, addr(0)={0, }, 20) = 0 sock-bind-2: socket(PF_PACKET, SOCK_RAW, htons(ETH_P_IP)) = 3 bind(3, {sa_family=AF_PACKET, proto=htons(ETH_P_IP), if=lo(1), pkttype=PACKET_HOST, addr(0)={0, }, 20) = 0 sock-bind-3: socket(PF_PACKET, SOCK_RAW, 0) = 3 bind(3, {sa_family=AF_PACKET, proto=htons(ETH_P_IP), if=lo(1), pkttype=PACKET_HOST, addr(0)={0, }, 20) = 0 Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
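The sock-bind-3 sequence from the timings above (protocol 0 at socket() time, protocol named only at bind()) might look as follows from user space. This is a minimal sketch: fill_sll() and open_packet_socket_late_proto() are hypothetical helper names, error reporting is reduced to -1, and opening the socket needs CAP_NET_RAW.

```c
#include <arpa/inet.h>		/* htons */
#include <assert.h>
#include <linux/if_ether.h>	/* ETH_P_IP */
#include <linux/if_packet.h>	/* struct sockaddr_ll */
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill a link-layer address the way the bind() calls above do. */
void fill_sll(struct sockaddr_ll *sll, int ifindex, unsigned short proto)
{
	assert(sll != NULL);
	memset(sll, 0, sizeof(*sll));
	sll->sll_family = AF_PACKET;
	sll->sll_ifindex = ifindex;
	sll->sll_protocol = htons(proto);
}

/*
 * sock-bind-3 style: pass protocol 0 to socket() and name the
 * protocol only in bind(). Returns an fd, or -1 on failure.
 */
int open_packet_socket_late_proto(int ifindex, unsigned short proto)
{
	struct sockaddr_ll sll;
	int fd = socket(AF_PACKET, SOCK_RAW, 0);

	if (fd < 0)
		return -1;
	fill_sll(&sll, ifindex, proto);
	if (bind(fd, (struct sockaddr *)&sll, sizeof(sll)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A caller would use something like open_packet_socket_late_proto(if_nametoindex("lo"), ETH_P_IP); the interface name here is just an example.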
2014-01-16net/ipv4: don't use module_init in non-modular gre_offloadPaul Gortmaker1-8/+2
Recent commit 438e38fadca2f6e57eeecc08326c8a95758594d4 ("gre_offload: statically build GRE offloading support") added new module_init/module_exit calls to the gre_offload.c file. The file is obj-y and can't be anything other than built-in. Currently it can never be built modular, so using module_init as an alias for __initcall can be somewhat misleading. Fix this up now, so that we can relocate module_init from init.h into module.h in the future. If we don't do this, we'd have to add module.h to obviously non-modular code, and that would be a worse thing. We also make the inclusion explicit. Note that direct use of __initcall is discouraged, vs. one of the priority categorized subgroups. As __initcall gets mapped onto device_initcall, our use of device_initcall directly in this change means that the runtime impact is zero -- it will remain at level 6 in initcall ordering. As for the module_exit, rather than replace it with __exitcall, we simply remove it, since it appears only UML does anything with those, and even for UML, there is no relevant cleanup to be done here. Cc: Eric Dumazet <[email protected]> Signed-off-by: Paul Gortmaker <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-16net: eth_type_trans() should use skb_header_pointer()Eric Dumazet1-2/+5
eth_type_trans() can read uninitialized memory as drivers do not necessarily pull more than 14 bytes in skb->head before calling it. As David suggested, we can use skb_header_pointer() to fix this without breaking some drivers that might not expect eth_type_trans() pulling 2 additional bytes. Signed-off-by: Eric Dumazet <[email protected]> Cc: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]>
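The idea behind reading through a header-pointer helper, rather than dereferencing past what is known to be in the linear area, can be modelled in userspace. This is a toy sketch and not the skb API: toy_pkt and toy_header_pointer() are hypothetical, and a single tail fragment stands in for non-linear data.

```c
#include <assert.h>
#include <stddef.h>

/* Toy packet: a linear head plus one non-linear tail fragment. */
struct toy_pkt {
	const unsigned char *head;
	size_t head_len;
	const unsigned char *frag;
	size_t frag_len;
};

/*
 * Return a pointer to len bytes at offset off. Point straight into the
 * head when the range is fully linear; otherwise gather the bytes into
 * buf. Returns NULL when the range runs past the end of the packet.
 */
const void *toy_header_pointer(const struct toy_pkt *p, size_t off,
			       size_t len, void *buf)
{
	unsigned char *out = buf;

	assert(p != NULL && buf != NULL);
	if (off + len <= p->head_len)
		return p->head + off;
	if (off + len > p->head_len + p->frag_len)
		return NULL;
	for (size_t i = 0; i < len; i++) {
		size_t pos = off + i;

		out[i] = pos < p->head_len ? p->head[pos]
					   : p->frag[pos - p->head_len];
	}
	return buf;
}
```

With a 14-byte head, asking for the two bytes at offset 14 never reads beyond the head buffer: the bytes are copied from the fragment, and the head itself is not modified or extended.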
2014-01-16Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nftablesDavid S. Miller3-2/+4
Pablo Neira Ayuso says: ==================== This small batch contains several Netfilter fixes for your net-next tree, more specifically: * Fix compilation warning in nft_ct if NF_CONNTRACK_MARK is not set, from Kristian Evensen. * Add dependency to IPV6 for NF_TABLES_INET. This one has been reported by the several robots that are testing .config combinations, from Paul Gortmaker. * Fix default base chain policy setting in nf_tables, from myself. ==================== Signed-off-by: David S. Miller <[email protected]>
2014-01-16neigh: use NEIGH_VAR_INIT in ndo_neigh_setup functions.Jiri Pirko1-2/+2
When ndo_neigh_setup is called, the bitfield used by NEIGH_VAR_SET is not initialized yet. This might cause confusion for the people who use NEIGH_VAR_SET in ndo_neigh_setup. So rather introduce NEIGH_VAR_INIT for usage in ndo_neigh_setup. Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-15bpf: do not use reciprocal divideEric Dumazet1-28/+2
At first Jakub Zawadzki noticed that some divisions by reciprocal_divide were not correct (off by one in some cases): http://www.wireshark.org/~darkjames/reciprocal-buggy.c He could also show this with BPF: http://www.wireshark.org/~darkjames/set-and-dump-filter-k-bug.c The reciprocal divide in the Linux kernel is not generic enough, so let's remove its use in BPF, as it is not worth the pain with current CPUs. Signed-off-by: Eric Dumazet <[email protected]> Reported-by: Jakub Zawadzki <[email protected]> Cc: Mircea Gherzan <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Hannes Frederic Sowa <[email protected]> Cc: Matt Evans <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: David S. Miller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
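An off-by-one of this kind is easy to reproduce in userspace. The sketch below assumes the multiply-and-shift scheme with a rounded-up 32-bit reciprocal, which is my recollection of the helper in use at the time; the toy_ names are hypothetical and this is not the reproducer linked above.

```c
#include <assert.h>
#include <stdint.h>

/* Precompute R = ceil(2^32 / k), truncated to 32 bits (assumed form). */
uint32_t toy_reciprocal_value(uint32_t k)
{
	uint64_t val;

	assert(k != 0);
	val = (1ULL << 32) + (k - 1);
	return (uint32_t)(val / k);
}

/* Approximate a / k as the high 32 bits of the 64-bit product a * R. */
uint32_t toy_reciprocal_divide(uint32_t a, uint32_t r)
{
	return (uint32_t)(((uint64_t)a * r) >> 32);
}
```

For k = 7 the reciprocal is 613566757. Small dividends come out right (100 gives 14), but 1431655770 / 7 is 204522252 while the multiply-and-shift yields 204522253, because the rounding error in R, scaled by a large dividend, pushes the product across the next multiple of 2^32.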
2014-01-15ipv6 addrconf: don't cleanup prefix route for IFA_F_NOPREFIXROUTEThomas Haller1-75/+109
Refactor the deletion/update of prefix routes when removing an address. Now also consider IFA_F_NOPREFIXROUTE: if an address with this flag is present, do not clean up the route. Instead, assume that userspace is taking care of this route. Also perform the same cleanup when userspace changes an existing address to add NOPREFIXROUTE (to an address that didn't have this flag). This is done because when the address was added, a prefix route was created for it. Since the user now wants to handle this route themselves, we clean up this route. This cleanup of the route is not totally robust. There is no guarantee that the route we are about to delete was really the one added by the kernel. This behavior is not changed by the patch, and in practice it should work just fine. Signed-off-by: Thomas Haller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-15ipv6 addrconf: add IFA_F_NOPREFIXROUTE flag to suppress creation of IP6 routesThomas Haller1-6/+13
When adding/modifying an IPv6 address, the userspace application needs a way to suppress adding a prefix route. This is for example relevant together with IFA_F_MANAGETEMPADDR, where userspace creates autoconf generated addresses, but depending on on-link, no route for the prefix should be added. Signed-off-by: Thomas Haller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-15Revert "batman-adv: drop dependency against CRC16"David S. Miller1-0/+1
This reverts commit 12afc36e38b3b6a0ec9bda71632c2285e7fdbab2. The dependency is actually still necessary. Signed-off-by: David S. Miller <[email protected]>
2014-01-15sctp: create helper function to enable|disable sackdelaywangweidong1-18/+19
Add sctp_spp_sackdelay_{enable|disable} helper functions to avoid code duplication. Signed-off-by: Wang Weidong <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-15ipv6: move IPV6_TCLASS_SHIFT into ipv6.h and define a helperLi RongQing4-6/+3
Two places defined IPV6_TCLASS_SHIFT, so move it into ipv6.h and use this macro wherever possible. Also define an ip6_tclass helper that returns the traffic class. Signed-off-by: Li RongQing <[email protected]> Signed-off-by: David S. Miller <[email protected]>
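What such a helper computes follows from the IPv6 header layout: the first 32-bit word holds 4 bits of version, 8 bits of traffic class and 20 bits of flow label. A userspace sketch, with hypothetical TOY_ names rather than the kernel's own macros, could be:

```c
#include <arpa/inet.h>	/* ntohl, htonl */
#include <assert.h>
#include <stdint.h>

/* Traffic class sits above the 20-bit flow label in the first word. */
#define TOY_TCLASS_MASK  0x0FF00000u
#define TOY_TCLASS_SHIFT 20

static_assert((TOY_TCLASS_MASK >> TOY_TCLASS_SHIFT) == 0xFF,
	      "mask must cover exactly the 8 traffic class bits");

/* Extract the traffic class from a network-order first header word. */
uint8_t toy_ip6_tclass(uint32_t flowinfo_be)
{
	return (uint8_t)((ntohl(flowinfo_be) & TOY_TCLASS_MASK)
			 >> TOY_TCLASS_SHIFT);
}
```

For the word 0x6B812345 (version 6, flow label 0x12345) the helper returns 0xB8.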
2014-01-15net: move 6lowpan compression code to separate moduleDmitry Eremin-Solenikov4-2/+11
IEEE 802.15.4 and Bluetooth networking stacks share 6lowpan compression code. Instead of introducing Makefile/Kconfig hacks, build this code as a separate module referenced from both ieee802154 and bluetooth modules. This fixes the following build error observed in some kernel configurations: net/built-in.o: In function `header_create': 6lowpan.c:(.text+0x166149): undefined reference to `lowpan_header_compress' net/built-in.o: In function `bt_6lowpan_recv': (.text+0x166b3c): undefined reference to `lowpan_process_data' Reported-by: Randy Dunlap <[email protected]> Signed-off-by: Dmitry Eremin-Solenikov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-15net: rename sysfs symlinks on device name changeVeaceslav Falico1-0/+22
Currently, we don't rename the upper/lower_ifc symlinks in /sys/class/net/*/, which might result in stale/duplicate links/names. Fix this by adding netdev_adjacent_rename_links(dev, oldname), which renames all the upper/lower interfaces' links to dev from the upper/lower_oldname to the new name. We don't need a rollback because only we control these symlinks, and if we fail to rename them, sysfs will complain anyway. Reported-by: Ding Tianhong <[email protected]> CC: Ding Tianhong <[email protected]> CC: "David S. Miller" <[email protected]> CC: Eric Dumazet <[email protected]> CC: Nicolas Dichtel <[email protected]> CC: Cong Wang <[email protected]> Signed-off-by: Veaceslav Falico <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-15net: add sysfs helpers for netdev_adjacent logicVeaceslav Falico1-27/+30
They clean up the code a bit and can be used further. CC: Ding Tianhong <[email protected]> CC: "David S. Miller" <[email protected]> CC: Eric Dumazet <[email protected]> CC: Nicolas Dichtel <[email protected]> CC: Cong Wang <[email protected]> Signed-off-by: Veaceslav Falico <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-16batman-adv: use consistent kerneldoc styleSimon Wunderlich4-13/+31
Reported-by: Antonio Quartulli <[email protected]> Signed-off-by: Simon Wunderlich <[email protected]> Signed-off-by: Marek Lindner <[email protected]> Signed-off-by: Antonio Quartulli <[email protected]>
2014-01-15batman-adv: fix batman-adv header overhead calculationMarek Lindner1-1/+1
Batman-adv prepends a full ethernet header in addition to its own header. This has to be reflected in the MTU calculation, especially since the value is used to set dev->hard_header_len. Introduced by 411d6ed93a5d0601980d3e5ce75de07c98e3a7de ("batman-adv: consider network coding overhead when calculating required mtu") Reported-by: cmsv <[email protected]> Reported-by: Martin Hundebøll <[email protected]> Signed-off-by: Marek Lindner <[email protected]> Signed-off-by: Antonio Quartulli <[email protected]>
2014-01-15neigh: split lines for NEIGH_VAR_SET so they are not too longJiri Pirko1-3/+6
introduced by: commit 1f9248e5606afc6485255e38ad57bdac08fa7711 "neigh: convert parms to an array" Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-15netfilter: nft_ct: fix compilation warning if NF_CONNTRACK_MARK is not setKristian Evensen1-0/+2
net/netfilter/nft_ct.c: In function 'nft_ct_set_eval': net/netfilter/nft_ct.c:136:6: warning: unused variable 'value' [-Wunused-variable] Reported-by: kbuild test robot <[email protected]> Signed-off-by: Kristian Evensen <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2014-01-14tcp: do not export tcp_gso_segment() and tcp_gro_receive()Eric Dumazet1-2/+0
tcp_gso_segment() and tcp_gro_receive() no longer need to be exported. IPv4 and IPv6 offloads are statically linked. Note that tcp_gro_complete() is still used by bnx2x, unfortunately. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-14net: nl80211: __dev_get_by_index instead of dev_get_by_index to find interfaceYing Xue1-65/+37
As __cfg80211_rdev_from_attrs(), nl80211_dump_wiphy_parse() and nl80211_set_wiphy() are all under rtnl_lock protection, __dev_get_by_index() should be used instead of dev_get_by_index() to find the interface in them, which avoids changing the interface reference counter. Cc: Johannes Berg <[email protected]> Signed-off-by: Ying Xue <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-14can: use __dev_get_by_index instead of dev_get_by_index to find interfaceYing Xue1-10/+5
As cgw_create_job() is always under rtnl_lock protection, __dev_get_by_index() should be used instead of dev_get_by_index() to find the interface in it, which avoids changing the interface reference counter. Cc: Oliver Hartkopp <[email protected]> Signed-off-by: Ying Xue <[email protected]> Acked-by: Oliver Hartkopp <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-14caif: __dev_get_by_index instead of dev_get_by_index to find interfaceYing Xue1-2/+1
The following call chain indicates that chnl_net_open() is under rtnl_lock protection, as __dev_open() is protected by rtnl_lock. So using __dev_get_by_index() instead of dev_get_by_index() to find the interface in it avoids changing the interface reference counter. __dev_open() chnl_net_open() Cc: Dmitry Tarnyagin <[email protected]> Signed-off-by: Ying Xue <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-14batman-adv: use __dev_get_by_index instead of dev_get_by_index to find interfaceYing Xue1-3/+1
The following call chain indicates that batadv_is_on_batman_iface() is always under rtnl_lock protection, as call_netdevice_notifier() is protected by rtnl_lock. So using __dev_get_by_index() rather than dev_get_by_index() to find the interface in it avoids changing the interface reference counter. call_netdevice_notifier() batadv_hard_if_event() batadv_hardif_add_interface() batadv_is_valid_iface() batadv_is_on_batman_iface() Cc: Antonio Quartulli <[email protected]> Signed-off-by: Ying Xue <[email protected]> Acked-by: Antonio Quartulli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-14decnet: use __dev_get_by_index instead of dev_get_by_index to find interfaceYing Xue1-8/+2
The following call chain shows that dn_cache_getroute() is protected by rtnl_lock. So using __dev_get_by_index() instead of dev_get_by_index() to find the interface in it avoids changing the interface reference counter. rtnetlink_rcv() rtnl_lock() netlink_rcv_skb() dn_cache_getroute() rtnl_unlock() Signed-off-by: Ying Xue <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-14dcb: use __dev_get_by_name instead of dev_get_by_name to find interfaceYing Xue1-10/+5
The following call chain indicates that dcb_doit() is protected by rtnl_lock. So using __dev_get_by_name() instead of dev_get_by_name() to find the interface in it avoids changing the interface reference counter. rtnetlink_rcv() rtnl_lock() netlink_rcv_skb() dcb_doit() rtnl_unlock() Cc: John Fastabend <[email protected]> Signed-off-by: Ying Xue <[email protected]> Signed-off-by: David S. Miller <[email protected]>
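The difference that this series of commits relies on, between a lookup that takes a reference and one that is valid only while a lock is held, can be modelled with a toy registry. Everything here is hypothetical (toy_dev, toy_registry and the helper names); real locking is left out and the caller is simply assumed to hold the equivalent of rtnl_lock.

```c
#include <assert.h>
#include <stddef.h>

struct toy_dev {
	int ifindex;
	int refcnt;
};

/* Hypothetical registry; the caller is assumed to hold the lock. */
struct toy_registry {
	struct toy_dev *devs;
	size_t n;
};

/* Plain lookup: only valid while the lock keeps devices from vanishing. */
struct toy_dev *toy_lookup_locked(struct toy_registry *reg, int ifindex)
{
	for (size_t i = 0; i < reg->n; i++)
		if (reg->devs[i].ifindex == ifindex)
			return &reg->devs[i];
	return NULL;
}

/* Refcounted lookup: the caller must pair it with toy_put(). */
struct toy_dev *toy_lookup_hold(struct toy_registry *reg, int ifindex)
{
	struct toy_dev *dev = toy_lookup_locked(reg, ifindex);

	if (dev)
		dev->refcnt++;
	return dev;
}

void toy_put(struct toy_dev *dev)
{
	assert(dev->refcnt > 0);
	dev->refcnt--;
}
```

When the whole use of the device happens under the lock, the plain lookup saves the increment, the matching decrement, and every error path that would otherwise have to remember the put.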
2014-01-14IPv6: move the anycast_src_echo_reply sysctl to netns_sysctl_ipv6FX Le Bail2-3/+3
This change moves the anycast_src_echo_reply sysctl in with the other ipv6 sysctls. Suggested-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: Francois-Xavier Le Bail <[email protected]> Acked-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-14sctp: remove a redundant NULL checkDan Carpenter1-1/+1
It confuses Smatch when we check "sinit" for NULL and then non-NULL and that causes a false positive warning later. Signed-off-by: Dan Carpenter <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-14tipc: spelling fixesstephen hemminger3-3/+3
Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-14ipv6: addrconf spelling fixesstephen hemminger1-5/+5
Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-14net: avoid reference counter overflows on fib_rules in multicast forwardingHannes Frederic Sowa2-4/+10
Bob Falken reported that after 4G packets, multicast forwarding stopped working. This was because of a rule reference counter overflow which freed the rule as soon as the overflow happened. This patch solves this by adding the FIB_LOOKUP_NOREF flag to fib_rules_lookup calls. This is safe even from non-rcu locked sections as in this case the flag only implies not taking a reference to the rule, which we don't need at all. Rules only hold references to the namespace, which are guaranteed to be available during the call of the non-rcu protected function reg_vif_xmit because of the interface reference which itself holds a reference to the net namespace. Fixes: f0ad0860d01e47 ("ipv4: ipmr: support multiple tables") Fixes: d1db275dd3f6e4 ("ipv6: ip6mr: support multiple tables") Reported-by: Bob Falken <[email protected]> Cc: Patrick McHardy <[email protected]> Cc: Thomas Graf <[email protected]> Cc: Julian Anastasov <[email protected]> Cc: Eric Dumazet <[email protected]> Signed-off-by: Hannes Frederic Sowa <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-14net: Spelling s/transmition/transmission/Geert Uytterhoeven1-1/+1
Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-14ieee802154: Fix memory leak in ieee802154_add_iface()Christian Engelmayer1-2/+4
Fix a memory leak in the ieee802154_add_iface() error handling path. Detected by Coverity: CID 710490. Signed-off-by: Christian Engelmayer <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-14net: replace macros net_random and net_srandom with direct calls to prandomAruna-Hewapathirane27-46/+48
This patch removes the net_random and net_srandom macros and replaces them with direct calls to the prandom ones. As new commits only seem to use prandom_u32 there is no use to keep them around. This change makes it easier to grep for users of prandom_u32. Signed-off-by: Aruna-Hewapathirane <[email protected]> Suggested-by: Hannes Frederic Sowa <[email protected]> Acked-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-14ipv6: copy traffic class from ping request to replyHannes Frederic Sowa1-1/+4
Suggested-by: Simon Schneider <[email protected]> Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-14ipv4: register igmp_notifier even when !CONFIG_PROC_FSWANG Cong2-4/+8
We still need this notifier even when we don't config PROC_FS. It should be rare to have a kernel without PROC_FS, so just for completeness. Cc: Stephen Hemminger <[email protected]> Cc: David S. Miller <[email protected]> Cc: Patrick McHardy <[email protected]> Signed-off-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-14net: Add trace events for all receive entry points, exposing more skb fieldsBen Hutchings1-39/+61
The existing net/netif_rx and net/netif_receive_skb trace events provide little information about the skb, nor do they indicate how it entered the stack. Add trace events at entry of each of the exported functions, including most fields that are likely to be interesting for debugging driver datapath behaviour. Split netif_rx() and netif_receive_skb() so that internal calls are not traced. Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-01-14net: Add net_dev_start_xmit trace event, exposing more skb fieldsBen Hutchings1-0/+2
The existing net/net_dev_xmit trace event provides little information about the skb that has been passed to the driver, and it is not simple to add more since the skb may already have been freed at the point the event is emitted. Add a separate trace event before the skb is passed to the driver, including most fields that are likely to be interesting for debugging driver datapath behaviour. Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]>