aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2010-07-02Merge branch 'master' of ↵David S. Miller1-2/+4
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6
2010-07-02net: decreasing real_num_tx_queues needs to flush qdiscJohn Fastabend1-0/+18
Reducing real_num_queues needs to flush the qdisc otherwise skbs with queue_mappings greater then real_num_tx_queues can be sent to the underlying driver. The flow for this is, dev_queue_xmit() dev_pick_tx() skb_tx_hash() => hash using real_num_tx_queues skb_set_queue_mapping() ... qdisc_enqueue_root() => enqueue skb on txq from hash ... dev->real_num_tx_queues -= n ... sch_direct_xmit() dev_hard_start_xmit() ndo_start_xmit(skb,dev) => skb queue set with old hash skbs are enqueued on the qdisc with skb->queue_mapping set 0 < queue_mappings < real_num_tx_queues. When the driver decreases real_num_tx_queues skb's may be dequeued from the qdisc with a queue_mapping greater then real_num_tx_queues. This fixes a case in ixgbe where this was occurring with DCB and FCoE. Because the driver is using queue_mapping to map skbs to tx descriptor rings we can potentially map skbs to rings that no longer exist. Signed-off-by: John Fastabend <[email protected]> Tested-by: Ross Brattain <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-07-02minstrel_ht: fix check for downgrading of top2 rateMing Lei1-2/+2
The check should be against current top2 rate, instead of current top rate. Signed-off-by: Ming Lei <[email protected]> Acked-by: Felix Fietkau <[email protected]> Signed-off-by: John W. Linville <[email protected]>
2010-07-02minstrel_ht: fix updating rate with best probabilityMing Lei1-0/+2
The throughput should be considered when updating rate with best probability. Signed-off-by: Ming Lei <[email protected]> Acked-by: Felix Fietkau <[email protected]> Signed-off-by: John W. Linville <[email protected]>
2010-07-02netfilter: ip6t_REJECT: fix a dst leak in ipv6 REJECTEric Dumazet1-2/+4
We should release dst if dst->error is set. Bug introduced in 2.6.14 by commit e104411b82f5c ([XFRM]: Always release dst_entry on error in xfrm_lookup) Signed-off-by: Eric Dumazet <[email protected]> Cc: [email protected] Signed-off-by: Patrick McHardy <[email protected]>
2010-07-02bridge: add per bridge device controls for invoking iptablesPatrick McHardy3-9/+97
Support more fine grained control of bridge netfilter iptables invocation by adding seperate brnf_call_*tables parameters for each device using the sysfs interface. Packets are passed to layer 3 netfilter when either the global parameter or the per bridge parameter is enabled. Acked-by: Stephen Hemminger <[email protected]> Acked-by: David S. Miller <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2010-07-01Merge branch 'master' of ↵David S. Miller19-62/+178
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 Conflicts: drivers/net/wireless/libertas/host.h
2010-06-30ethtool: Add support for control of RX flow hash indirectionBen Hutchings1-0/+80
Many NICs use an indirection table to map an RX flow hash value to one of an arbitrary number of queues (not necessarily a power of 2). It can be useful to remove some queues from this indirection table so that they are only used for flows that are specifically filtered there. It may also be useful to weight the mapping to account for user processes with the same CPU-affinity as the RX interrupts. Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-30ethtool: Change ethtool_op_set_flags to validate flagsBen Hutchings1-23/+5
ethtool_op_set_flags() does not check for unsupported flags, and has no way of doing so. This means it is not suitable for use as a default implementation of ethtool_ops::set_flags. Add a 'supported' parameter specifying the flags that the driver and hardware support, validate the requested flags against this, and change all current callers to pass this parameter. Change some other trivial implementations of ethtool_ops::set_flags to call ethtool_op_set_flags(). Signed-off-by: Ben Hutchings <[email protected]> Reviewed-by: Stanislaw Gruszka <[email protected]> Acked-by: Jeff Garzik <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-30fragment: add fast path for in-order fragmentsChangli Gao2-0/+23
add fast path for in-order fragments As the fragments are sent in order in most of OSes, such as Windows, Darwin and FreeBSD, it is likely the new fragments are at the end of the inet_frag_queue. In the fast path, we check if the skb at the end of the inet_frag_queue is the prev we expect. Signed-off-by: Changli Gao <[email protected]> ---- include/net/inet_frag.h | 1 + net/ipv4/ip_fragment.c | 12 ++++++++++++ net/ipv6/reassembly.c | 11 +++++++++++ 3 files changed, 24 insertions(+) Signed-off-by: David S. Miller <[email protected]>
2010-06-30snmp: 64bit ipstats_mib for all archesEric Dumazet4-10/+76
/proc/net/snmp and /proc/net/netstat expose SNMP counters. Width of these counters is either 32 or 64 bits, depending on the size of "unsigned long" in kernel. This means user program parsing these files must already be prepared to deal with 64bit values, regardless of user program being 32 or 64 bit. This patch introduces 64bit snmp values for IPSTAT mib, where some counters can wrap pretty fast if they are 32bit wide. # netstat -s|egrep "InOctets|OutOctets" InOctets: 244068329096 OutOctets: 244069348848 Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-30act_nat: use stack variableChangli Gao1-21/+10
act_nat: use stack variable structure tc_nat isn't too big for stack, so we can put it in stack. Signed-off-by: Changli Gao <[email protected]> ---- net/sched/act_nat.c | 31 ++++++++++--------------------- 1 file changed, 10 insertions(+), 21 deletions(-) Signed-off-by: David S. Miller <[email protected]>
2010-06-30act_mirred: combine duplicate codeChangli Gao1-4/+2
act_mirred: combine duplicate code tcf_bstats is updated in any way, so we can do it earlier to reduce the size of the code. Signed-off-by: Changli Gao <[email protected]> Signed-off-by: Jamal Hadi Salim <[email protected]> ---- net/sched/act_mirred.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) Signed-off-by: David S. Miller <[email protected]>
2010-06-30mac82011: Allow selection of minstrel_ht as default rc algorithmHelmut Schaa1-0/+1
Allow selection of minstrel_ht as default rate control algorithm. At the moment minstrel_ht can only be requested by the driver code but not selected as default in make menuconfig. Fix this by using minstrel_ht when minstrel was selected as default and minstrel_ht is available. This change won't affect legacy devices as minstrel_ht falls back to minstrel in that case. Signed-off-by: Helmut Schaa <[email protected]> Signed-off-by: John W. Linville <[email protected]>
2010-06-30net/core: use ntohs for skb->protocolSebastian Andrzej Siewior1-1/+2
This is only noticed by people that are not doing everything correct in the first place. Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-30ipv6: Use interface max_desync_factor instead of static defaultBen Hutchings1-4/+4
max_desync_factor can be configured per-interface, but nothing is using the value. Reported-by: Piotr Lewandowski <[email protected]> Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-30ipv6: Clamp reported valid_lft to a minimum of 0Ben Hutchings1-2/+6
Since addresses are only revalidated every 2 minutes, the reported valid_lft can underflow shortly before the address is deleted. Clamp it to a minimum of 0, as for prefered_lft. Reported-by: Piotr Lewandowski <[email protected]> Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-29net/Makefile: conditionally descend to wireless and ieee802154Nicolas Kaiser1-2/+2
Don't descend to wireless and ieee802154 unless they are actually used. Signed-off-by: Nicolas Kaiser <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-29mac80211: add basic tracing to drv_get_surveyJohn W. Linville2-1/+28
Reported-by: Johannes Berg <[email protected]> Signed-off-by: John W. Linville <[email protected]>
2010-06-29mac80211: remove unnecessary check in ieee80211_dump_surveyJohn W. Linville1-3/+0
This check is duplicated in drv_get_survey. Reported-by: Johannes Berg <[email protected]> Signed-off-by: John W. Linville <[email protected]>
2010-06-29ethtool: Fix potential user buffer overflow for ETHTOOL_{G, S}RXFHBen Hutchings1-9/+27
struct ethtool_rxnfc was originally defined in 2.6.27 for the ETHTOOL_{G,S}RXFH command with only the cmd, flow_type and data fields. It was then extended in 2.6.30 to support various additional commands. These commands should have been defined to use a new structure, but it is too late to change that now. Since user-space may still be using the old structure definition for the ETHTOOL_{G,S}RXFH commands, and since they do not need the additional fields, only copy the originally defined fields to and from user-space. Signed-off-by: Ben Hutchings <[email protected]> Cc: [email protected] Signed-off-by: David S. Miller <[email protected]>
2010-06-29ethtool: Fix potential kernel buffer overflow in ETHTOOL_GRXCLSRLALLBen Hutchings1-2/+3
On a 32-bit machine, info.rule_cnt >= 0x40000000 leads to integer overflow and the buffer may be smaller than needed. Since ETHTOOL_GRXCLSRLALL is unprivileged, this can presumably be used for at least denial of service. Signed-off-by: Ben Hutchings <[email protected]> Cc: [email protected] Signed-off-by: David S. Miller <[email protected]>
2010-06-29caif: Kconfig and Makefile fixesSjur Braendeland2-17/+4
Use "depends on" instead of "if" in Kconfig files. Fixed CAIF debug flag, and removed unnecessary clean-* options. Signed-off-by: Sjur Braendeland <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-28act_mirred: don't clone skb when skb isn't sharedChangli Gao1-3/+3
don't clone skb when skb isn't shared When the tcf_action is TC_ACT_STOLEN, and the skb isn't shared, we don't need to clone a new skb. As the skb will be freed after this function returns, we can use it freely once we get a reference to it. Signed-off-by: Changli Gao <[email protected]> ---- include/net/sch_generic.h | 11 +++++++++-- net/sched/act_mirred.c | 6 +++--- 2 files changed, 12 insertions(+), 5 deletions(-) Signed-off-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-28tcp: tso_fragment() might avoid GFP_ATOMICEric Dumazet1-3/+3
We can pass a gfp argument to tso_fragment() and avoid GFP_ATOMIC allocations sometimes. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-28vlan: 64 bit rx countersEric Dumazet3-25/+41
Use u64_stats_sync infrastructure to implement 64bit rx stats. (tx stats are addressed later) Signed-off-by: Eric Dumazet <[email protected]> Acked-by: Patrick McHardy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-28net: use this_cpu_ptr()Eric Dumazet3-4/+4
use this_cpu_ptr(p) instead of per_cpu_ptr(p, smp_processor_id()) Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-28mac80211: fix the for_each_sta_info macroFelix Fietkau1-8/+8
Because of an ambiguity in the for_each_sta_info macro, it can currently only be used if the third parameter is set to 'sta'. Fix this by renaming the parameter to '_sta'. Signed-off-by: Felix Fietkau <[email protected]> Signed-off-by: John W. Linville <[email protected]>
2010-06-28mac80211: use netif_receive_skb in ieee80211_tx_status callpathJohn W. Linville1-2/+2
This avoids the extra queueing from calling netif_rx. Signed-off-by: John W. Linville <[email protected]>
2010-06-28mac80211: use netif_receive_skb in ieee80211_rx callpathJohn W. Linville1-5/+5
This avoids the extra queueing from calling netif_rx. Signed-off-by: John W. Linville <[email protected]>
2010-06-28netfilter: ipt_LOG/ip6t_LOG: add option to print decoded MAC headerPatrick McHardy2-42/+93
The LOG targets print the entire MAC header as one long string, which is not readable very well: IN=eth0 OUT= MAC=00:15:f2:24:91:f8:00:1b:24:dc:61:e6:08:00 ... Add an option to decode known header formats (currently just ARPHRD_ETHER devices) in their individual fields: IN=eth0 OUT= MACSRC=00:1b:24:dc:61:e6 MACDST=00:15:f2:24:91:f8 MACPROTO=0800 ... IN=eth0 OUT= MACSRC=00:1b:24:dc:61:e6 MACDST=00:15:f2:24:91:f8 MACPROTO=86dd ... The option needs to be explicitly enabled by userspace to avoid breaking existing parsers. Signed-off-by: Patrick McHardy <[email protected]>
2010-06-28netfilter: ipt_LOG/ip6t_LOG: remove comparison within loopPatrick McHardy2-9/+9
Remove the comparison within the loop to print the macheader by prepending the colon to all but the first printk. Based on suggestion by Jan Engelhardt <[email protected]>. Signed-off-by: Patrick McHardy <[email protected]>
2010-06-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds10-13/+22
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (52 commits) phylib: Add autoload support for the LXT973 phy. ISDN: hysdn, fix potential NULL dereference vxge: fix memory leak in vxge_alloc_msix() error path isdn/gigaset: correct CAPI connection state storage isdn/gigaset: encode HLC and BC together isdn/gigaset: correct CAPI DATA_B3 Delivery Confirmation isdn/gigaset: correct CAPI voice connection encoding isdn/gigaset: honor CAPI application's buffer size request cpmac: do not leak struct net_device on phy_connect errors smc91c92_cs: fix the problem that lan & modem does not work simultaneously ipv6: fix NULL reference in proxy neighbor discovery Bluetooth: Bring back var 'i' increment xfrm: check bundle policy existance before dereferencing it sky2: enable rx/tx in sky2_phy_reinit() cnic: Disable statistics initialization for eth clients that do not support statistics net: add dependency on fw class module to qlcnic and netxen_nic snmp: fix SNMP_ADD_STATS() hso: remove setting of low_latency flag udp: Fix bogus UFO packet generation lasi82596: fix netdev_mc_count conversion ...
2010-06-26syncookies: add support for ECNFlorian Westphal4-12/+16
Allows use of ECN when syncookies are in effect by encoding ecn_ok into the syn-ack tcp timestamp. While at it, remove a uneeded #ifdef CONFIG_SYN_COOKIES. With CONFIG_SYN_COOKIES=nm want_cookie is ifdef'd to 0 and gcc removes the "if (0)". Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-26syncookies: do not store rcv_wscale in tcp timestampFlorian Westphal2-22/+14
As pointed out by Fernando Gont there is no need to encode rcv_wscale into the cookie. We did not use the restored rcv_wscale anyway; it is recomputed via tcp_select_initial_window(). Thus we can save 4 bits in the ts option space by removing rcv_wscale. In case window scaling was not supported, we set the (invalid) wscale value 0xf. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-25ipv6: remove ipv6_statisticsEric Dumazet1-2/+0
commit 9261e5370112 (ipv6: making ip and icmp statistics per/namespace) forgot to remove ipv6_statistics variable. commit bc417d99bf27 (ipv6: remove stale MIB definitions) took care of icmpv6_statistics & icmpv6msg_statistics Signed-off-by: Eric Dumazet <[email protected]> CC: Denis V. Lunev <[email protected]> CC: Alexey Dobriyan <[email protected]> CC: Hideaki YOSHIFUJI <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-25snmp: add align parameter to snmp_mib_init()Eric Dumazet6-21/+39
In preparation for 64bit snmp counters for some mibs, add an 'align' parameter to snmp_mib_init(), instead of assuming mibs only contain 'unsigned long' fields. Callers can use __alignof__(type) to provide correct alignment. Signed-off-by: Eric Dumazet <[email protected]> CC: Herbert Xu <[email protected]> CC: Arnaldo Carvalho de Melo <[email protected]> CC: Hideaki YOSHIFUJI <[email protected]> CC: Vlad Yasevich <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-25arp: RCU change in arp_solicit()Eric Dumazet1-5/+7
Avoid two atomic ops in arp_solicit() Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-25dccp: make implementation of Syn-RTT symmetricGerrit Renker2-3/+16
This patch is thanks to Andre Noll who reported the issue and helped testing. The Syn-RTT sampled during the initial handshake currently only works for the client sending the DCCP-Request. TFRC penalizes the absence of an RTT sample with a very slow initial speed (1 packet per second), which delays slow-start significantly, resulting in sluggish performance. This patch mirrors the "Syn RTT" principle by adding a timestamp also onto the DCCP-Response, producing an RTT sample when the (Data)Ack completing the handshake arrives. Also changed the documentation to 'TFRC' since Syn RTTs are also used by CCID-4. Signed-off-by: Gerrit Renker <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-25dccp: remove unused function argumentGerrit Renker4-19/+13
This removes an unused 'sk' argument from several option-inserting functions. Signed-off-by: Gerrit Renker <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-25net/core/pktgen.c: Use pr_<level>Joe Perches1-78/+64
Add pr_fmt(fmt) KBUILD_MODNAME ": " fmt Remove "pktgen: " from formats Convert printks to pr_<level> Added func_enter() for debugging Moved version to end of string at module_init Coalesced long formats Signed-off-by: Joe Perches <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-25net: optimize Berkeley Packet Filter (BPF) processingHagen Paul Pfeifer1-51/+161
Gcc is currenlty not in the ability to optimize the switch statement in sk_run_filter() because of dense case labels. This patch replace the OR'd labels with ordered sequenced case labels. The sk_chk_filter() function is modified to patch/replace the original OPCODES in a ordered but equivalent form. gcc is now in the ability to transform the switch statement in sk_run_filter into a jump table of complexity O(1). Until this patch gcc generates a sequence of conditional branches (O(n) of 567 byte .text segment size (arch x86_64): 7ff: 8b 06 mov (%rsi),%eax 801: 66 83 f8 35 cmp $0x35,%ax 805: 0f 84 d0 02 00 00 je adb <sk_run_filter+0x31d> 80b: 0f 87 07 01 00 00 ja 918 <sk_run_filter+0x15a> 811: 66 83 f8 15 cmp $0x15,%ax 815: 0f 84 c5 02 00 00 je ae0 <sk_run_filter+0x322> 81b: 77 73 ja 890 <sk_run_filter+0xd2> 81d: 66 83 f8 04 cmp $0x4,%ax 821: 0f 84 17 02 00 00 je a3e <sk_run_filter+0x280> 827: 77 29 ja 852 <sk_run_filter+0x94> 829: 66 83 f8 01 cmp $0x1,%ax [...] With the modification the compiler translate the switch statement into the following jump table fragment: 7ff: 66 83 3e 2c cmpw $0x2c,(%rsi) 803: 0f 87 1f 02 00 00 ja a28 <sk_run_filter+0x26a> 809: 0f b7 06 movzwl (%rsi),%eax 80c: ff 24 c5 00 00 00 00 jmpq *0x0(,%rax,8) 813: 44 89 e3 mov %r12d,%ebx 816: e9 43 03 00 00 jmpq b5e <sk_run_filter+0x3a0> 81b: 41 89 dc mov %ebx,%r12d 81e: e9 3b 03 00 00 jmpq b5e <sk_run_filter+0x3a0> Furthermore, I reordered the instructions to reduce cache line misses by order the most common instruction to the start. Signed-off-by: Hagen Paul Pfeifer <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-25ipv6: fix NULL reference in proxy neighbor discoverystephen hemminger1-1/+1
The addition of TLLAO option created a kernel OOPS regression for the case where neighbor advertisement is being sent via proxy path. When using proxy, ipv6_get_ifaddr() returns NULL causing the NULL dereference. Change causing the bug was: commit f7734fdf61ec6bb848e0bafc1fb8bad2c124bb50 Author: Octavian Purdila <[email protected]> Date: Fri Oct 2 11:39:15 2009 +0000 make TLLAO option for NA packets configurable Signed-off-by: Stephen Hemminger <[email protected]> Acked-by: YOSHIFUJI Hideaki <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-25netfilter: complete the deprecation of CONFIG_NF_CT_ACCTTim Gardner2-35/+1
CONFIG_NF_CT_ACCT has been deprecated for awhile and was originally scheduled for removal by 2.6.29. Removing support for this config option also stops this deprecation warning message in the kernel log. [ 61.669627] nf_conntrack version 0.5.0 (16384 buckets, 65536 max) [ 61.669850] CONFIG_NF_CT_ACCT is deprecated and will be removed soon. Please use [ 61.669852] nf_conntrack.acct=1 kernel parameter, acct=1 nf_conntrack module option or [ 61.669853] sysctl net.netfilter.nf_conntrack_acct=1 to enable it. Signed-off-by: Tim Gardner <[email protected]> [Patrick: changed default value to 0] Signed-off-by: Patrick McHardy <[email protected]>
2010-06-25netfilter: xt_connbytes: Force CT accounting to be enabledTim Gardner1-0/+10
Check at rule install time that CT accounting is enabled. Force it to be enabled if not while also emitting a warning since this is not the default state. This is in preparation for deprecating CONFIG_NF_CT_ACCT upon which CONFIG_NETFILTER_XT_MATCH_CONNBYTES depended being set. Added 2 CT accounting support functions: nf_ct_acct_enabled() - Get CT accounting state. nf_ct_set_acct() - Enable/disable CT accountuing. Signed-off-by: Tim Gardner <[email protected]> Acked-by: Jan Engelhardt <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2010-06-24Bluetooth: Bring back var 'i' incrementGustavo F. Padovan1-0/+2
commit ff6e2163f28a1094fb5ca5950fe2b43c3cf6bc7a accidentally added a regression on the bnep code. Fixing it. Signed-off-by: Gustavo F. Padovan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-24tcp: do not send reset to already closed socketsKonstantin Khorenko1-0/+4
i've found that tcp_close() can be called for an already closed socket, but still sends reset in this case (tcp_send_active_reset()) which seems to be incorrect. Moreover, a packet with reset is sent with different source port as original port number has been already cleared on socket. Besides that incrementing stat counter for LINUX_MIB_TCPABORTONCLOSE also does not look correct in this case. Initially this issue was found on 2.6.18-x RHEL5 kernel, but the same seems to be true for the current mainstream kernel (checked on 2.6.35-rc3). Please, correct me if i missed something. How that happens: 1) the server receives a packet for socket in TCP_CLOSE_WAIT state that triggers a tcp_reset(): Call Trace: <IRQ> [<ffffffff8025b9b9>] tcp_reset+0x12f/0x1e8 [<ffffffff80046125>] tcp_rcv_state_process+0x1c0/0xa08 [<ffffffff8003eb22>] tcp_v4_do_rcv+0x310/0x37a [<ffffffff80028bea>] tcp_v4_rcv+0x74d/0xb43 [<ffffffff8024ef4c>] ip_local_deliver_finish+0x0/0x259 [<ffffffff80037131>] ip_local_deliver+0x200/0x2f4 [<ffffffff8003843c>] ip_rcv+0x64c/0x69f [<ffffffff80021d89>] netif_receive_skb+0x4c4/0x4fa [<ffffffff80032eca>] process_backlog+0x90/0xec [<ffffffff8000cc50>] net_rx_action+0xbb/0x1f1 [<ffffffff80012d3a>] __do_softirq+0xf5/0x1ce [<ffffffff8001147a>] handle_IRQ_event+0x56/0xb0 [<ffffffff8006334c>] call_softirq+0x1c/0x28 [<ffffffff80070476>] do_softirq+0x2c/0x85 [<ffffffff80070441>] do_IRQ+0x149/0x152 [<ffffffff80062665>] ret_from_intr+0x0/0xa <EOI> [<ffffffff80008a2e>] __handle_mm_fault+0x6cd/0x1303 [<ffffffff80008903>] __handle_mm_fault+0x5a2/0x1303 [<ffffffff80033a9d>] cache_free_debugcheck+0x21f/0x22e [<ffffffff8006a263>] do_page_fault+0x49a/0x7dc [<ffffffff80066487>] thread_return+0x89/0x174 [<ffffffff800c5aee>] audit_syscall_exit+0x341/0x35c [<ffffffff80062e39>] error_exit+0x0/0x84 tcp_rcv_state_process() ... // (sk_state == TCP_CLOSE_WAIT here) ... /* step 2: check RST bit */ if(th->rst) { tcp_reset(sk); goto discard; } ... --------------------------------- tcp_rcv_state_process tcp_reset tcp_done tcp_set_state(sk, TCP_CLOSE); inet_put_port __inet_put_port inet_sk(sk)->num = 0; sk->sk_shutdown = SHUTDOWN_MASK; 2) After that the process (socket owner) tries to write something to that socket and "inet_autobind" sets a _new_ (which differs from the original!) port number for the socket: Call Trace: [<ffffffff80255a12>] inet_bind_hash+0x33/0x5f [<ffffffff80257180>] inet_csk_get_port+0x216/0x268 [<ffffffff8026bcc9>] inet_autobind+0x22/0x8f [<ffffffff80049140>] inet_sendmsg+0x27/0x57 [<ffffffff8003a9d9>] do_sock_write+0xae/0xea [<ffffffff80226ac7>] sock_writev+0xdc/0xf6 [<ffffffff800680c7>] _spin_lock_irqsave+0x9/0xe [<ffffffff8001fb49>] __pollwait+0x0/0xdd [<ffffffff8008d533>] default_wake_function+0x0/0xe [<ffffffff800a4f10>] autoremove_wake_function+0x0/0x2e [<ffffffff800f0b49>] do_readv_writev+0x163/0x274 [<ffffffff80066538>] thread_return+0x13a/0x174 [<ffffffff800145d8>] tcp_poll+0x0/0x1c9 [<ffffffff800c56d3>] audit_syscall_entry+0x180/0x1b3 [<ffffffff800f0dd0>] sys_writev+0x49/0xe4 [<ffffffff800622dd>] tracesys+0xd5/0xe0 3) sendmsg fails at last with -EPIPE (=> 'write' returns -EPIPE in userspace): F: tcp_sendmsg1 -EPIPE: sk=ffff81000bda00d0, sport=49847, old_state=7, new_state=7, sk_err=0, sk_shutdown=3 Call Trace: [<ffffffff80027557>] tcp_sendmsg+0xcb/0xe87 [<ffffffff80033300>] release_sock+0x10/0xae [<ffffffff8016f20f>] vgacon_cursor+0x0/0x1a7 [<ffffffff8026bd32>] inet_autobind+0x8b/0x8f [<ffffffff8003a9d9>] do_sock_write+0xae/0xea [<ffffffff80226ac7>] sock_writev+0xdc/0xf6 [<ffffffff800680c7>] _spin_lock_irqsave+0x9/0xe [<ffffffff8001fb49>] __pollwait+0x0/0xdd [<ffffffff8008d533>] default_wake_function+0x0/0xe [<ffffffff800a4f10>] autoremove_wake_function+0x0/0x2e [<ffffffff800f0b49>] do_readv_writev+0x163/0x274 [<ffffffff80066538>] thread_return+0x13a/0x174 [<ffffffff800145d8>] tcp_poll+0x0/0x1c9 [<ffffffff800c56d3>] audit_syscall_entry+0x180/0x1b3 [<ffffffff800f0dd0>] sys_writev+0x49/0xe4 [<ffffffff800622dd>] tracesys+0xd5/0xe0 tcp_sendmsg() ... /* Wait for a connection to finish. */ if ((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) { int old_state = sk->sk_state; if ((err = sk_stream_wait_connect(sk, &timeo)) != 0) { if (f_d && (err == -EPIPE)) { printk("F: tcp_sendmsg1 -EPIPE: sk=%p, sport=%u, old_state=%d, new_state=%d, " "sk_err=%d, sk_shutdown=%d\n", sk, ntohs(inet_sk(sk)->sport), old_state, sk->sk_state, sk->sk_err, sk->sk_shutdown); dump_stack(); } goto out_err; } } ... 4) Then the process (socket owner) understands that it's time to close that socket and does that (and thus triggers sending reset packet): Call Trace: ... [<ffffffff80032077>] dev_queue_xmit+0x343/0x3d6 [<ffffffff80034698>] ip_output+0x351/0x384 [<ffffffff80251ae9>] dst_output+0x0/0xe [<ffffffff80036ec6>] ip_queue_xmit+0x567/0x5d2 [<ffffffff80095700>] vprintk+0x21/0x33 [<ffffffff800070f0>] check_poison_obj+0x2e/0x206 [<ffffffff80013587>] poison_obj+0x36/0x45 [<ffffffff8025dea6>] tcp_send_active_reset+0x15/0x14d [<ffffffff80023481>] dbg_redzone1+0x1c/0x25 [<ffffffff8025dea6>] tcp_send_active_reset+0x15/0x14d [<ffffffff8000ca94>] cache_alloc_debugcheck_after+0x189/0x1c8 [<ffffffff80023405>] tcp_transmit_skb+0x764/0x786 [<ffffffff8025df8a>] tcp_send_active_reset+0xf9/0x14d [<ffffffff80258ff1>] tcp_close+0x39a/0x960 [<ffffffff8026be12>] inet_release+0x69/0x80 [<ffffffff80059b31>] sock_release+0x4f/0xcf [<ffffffff80059d4c>] sock_close+0x2c/0x30 [<ffffffff800133c9>] __fput+0xac/0x197 [<ffffffff800252bc>] filp_close+0x59/0x61 [<ffffffff8001eff6>] sys_close+0x85/0xc7 [<ffffffff800622dd>] tracesys+0xd5/0xe0 So, in brief: * a received packet for socket in TCP_CLOSE_WAIT state triggers tcp_reset() which clears inet_sk(sk)->num and put socket into TCP_CLOSE state * an attempt to write to that socket forces inet_autobind() to get a new port (but the write itself fails with -EPIPE) * tcp_close() called for socket in TCP_CLOSE state sends an active reset via socket with newly allocated port This adds an additional check in tcp_close() for already closed sockets. We do not want to send anything to closed sockets. Signed-off-by: Konstantin Khorenko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-24net: fix "netpoll: Allow netpoll_setup/cleanup recursion"Andrew Morton1-1/+0
Remove rtnl_unlock() which had no corresponding rtnl_lock(). Signed-off-by: Andrew Morton <[email protected]> Acked-by: Herbert Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-24xfrm: check bundle policy existance before dereferencing itTimo Teräs1-1/+2
Fix the bundle validation code to not assume having a valid policy. When we have multiple transformations for a xfrm policy, the bundle instance will be a chain of bundles with only the first one having the policy reference. When policy_genid is bumped it will expire the first bundle in the chain which is equivalent of expiring the whole chain. Reported-bisected-and-tested-by: Justin P. Mattock <[email protected]> Signed-off-by: Timo Teräs <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-06-24nl80211: Add option to adjust transmit powerJuuso Oikarinen1-0/+31
This patch adds transmit power setting type and transmit power level attributes to NL80211_CMD_SET_WIPHY in order to facilitate adjusting of the transmit power level of the device. The added attributes allow selection of automatic, limited or fixed transmit power level, with the level definable in signed mBm format. Signed-off-by: Juuso Oikarinen <[email protected]> Signed-off-by: John W. Linville <[email protected]>