aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2009-09-02net: drop_monitor: make last_rx timestamp privateNeil Horman1-3/+12
It was recently pointed out to me that the last_rx field of the net_device structure wasn't updated regularly. In fact only the bonding driver really uses it currently. Since the drop_monitor code relies on the last_rx field to detect drops on recevie in hardware, We need to find a more reliable way to rate limit our drop checks (so that we don't check for drops on every frame recevied, which would be inefficient. This patch makes a last_rx timestamp that is private to the drop monitor code and is updated for every device that we track. Signed-off-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-02Merge branch 'master' of ↵David S. Miller3-4/+7
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6
2009-09-02cfg80211: fix looping soft lockup in find_ie()Bob Copeland1-1/+1
The find_ie() function uses a size_t for the len parameter, and directly uses len as a loop variable. If any received packets are malformed, it is possible for the decrease of len to overflow, and since the result is unsigned, the loop will not terminate. Change it to a signed int so the loop conditional works for negative values. This fixes the following soft lockup: [38573.102007] BUG: soft lockup - CPU#0 stuck for 61s! [phy0:2230] [38573.102007] Modules linked in: aes_i586 aes_generic fuse af_packet ipt_REJECT xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state iptable_filter ip_tables x_tables acpi_cpufreq binfmt_misc dm_mirror dm_region_hash dm_log dm_multipath dm_mod kvm_intel kvm uinput i915 arc4 ecb drm snd_hda_codec_idt ath5k snd_hda_intel hid_apple mac80211 usbhid appletouch snd_hda_codec snd_pcm ath cfg80211 snd_timer i2c_algo_bit ohci1394 video snd processor ieee1394 rfkill ehci_hcd sg sky2 backlight snd_page_alloc uhci_hcd joydev output ac thermal button battery sr_mod applesmc cdrom input_polldev evdev unix [last unloaded: scsi_wait_scan] [38573.102007] irq event stamp: 2547724535 [38573.102007] hardirqs last enabled at (2547724534): [<c1002ffc>] restore_all_notrace+0x0/0x18 [38573.102007] hardirqs last disabled at (2547724535): [<c10038f4>] apic_timer_interrupt+0x28/0x34 [38573.102007] softirqs last enabled at (92950144): [<c103ab48>] __do_softirq+0x108/0x210 [38573.102007] softirqs last disabled at (92950274): [<c1348e74>] _spin_lock_bh+0x14/0x80 [38573.102007] [38573.102007] Pid: 2230, comm: phy0 Tainted: G W (2.6.31-rc7-wl #8) MacBook1,1 [38573.102007] EIP: 0060:[<f8ea2d50>] EFLAGS: 00010292 CPU: 0 [38573.102007] EIP is at cmp_ies+0x30/0x180 [cfg80211] [38573.102007] EAX: 00000082 EBX: 00000000 ECX: ffffffc1 EDX: d8efd014 [38573.102007] ESI: ffffff7c EDI: 0000004d EBP: eee2dc50 ESP: eee2dc3c [38573.102007] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [38573.102007] CR0: 8005003b CR2: d8efd014 CR3: 01694000 CR4: 000026d0 [38573.102007] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 [38573.102007] DR6: ffff0ff0 DR7: 00000400 [38573.102007] Call Trace: [38573.102007] [<f8ea2f8d>] cmp_bss+0xed/0x100 [cfg80211] [38573.102007] [<f8ea33e4>] cfg80211_bss_update+0x84/0x410 [cfg80211] [38573.102007] [<f8ea3884>] cfg80211_inform_bss_frame+0x114/0x180 [cfg80211] [38573.102007] [<f97255ff>] ieee80211_bss_info_update+0x4f/0x180 [mac80211] [38573.102007] [<f972b118>] ieee80211_rx_bss_info+0x88/0xf0 [mac80211] [38573.102007] [<f9739297>] ? ieee802_11_parse_elems+0x27/0x30 [mac80211] [38573.102007] [<f972b224>] ieee80211_rx_mgmt_probe_resp+0xa4/0x1c0 [mac80211] [38573.102007] [<f972bc59>] ieee80211_sta_rx_queued_mgmt+0x919/0xc50 [mac80211] [38573.102007] [<c1009707>] ? sched_clock+0x27/0xa0 [38573.102007] [<c1009707>] ? sched_clock+0x27/0xa0 [38573.102007] [<c105ffd0>] ? mark_held_locks+0x60/0x80 [38573.102007] [<c1348be5>] ? _spin_unlock_irqrestore+0x55/0x70 [38573.102007] [<c134baa5>] ? sub_preempt_count+0x85/0xc0 [38573.102007] [<c1348bce>] ? _spin_unlock_irqrestore+0x3e/0x70 [38573.102007] [<c12c1c0f>] ? skb_dequeue+0x4f/0x70 [38573.102007] [<f972c021>] ieee80211_sta_work+0x91/0xb80 [mac80211] [38573.102007] [<c1009707>] ? sched_clock+0x27/0xa0 [38573.102007] [<c134baa5>] ? sub_preempt_count+0x85/0xc0 [38573.102007] [<c10479af>] worker_thread+0x18f/0x320 [38573.102007] [<c104794e>] ? worker_thread+0x12e/0x320 [38573.102007] [<c1348be5>] ? _spin_unlock_irqrestore+0x55/0x70 [38573.102007] [<f972bf90>] ? ieee80211_sta_work+0x0/0xb80 [mac80211] [38573.102007] [<c104cbb0>] ? autoremove_wake_function+0x0/0x50 [38573.102007] [<c1047820>] ? worker_thread+0x0/0x320 [38573.102007] [<c104c854>] kthread+0x84/0x90 [38573.102007] [<c104c7d0>] ? kthread+0x0/0x90 [38573.102007] [<c1003ab7>] kernel_thread_helper+0x7/0x10 Cc: [email protected] Signed-off-by: Bob Copeland <[email protected]> Signed-off-by: John W. Linville <[email protected]>
2009-09-02wireless: remove mac80211 rate selection extra menuLuis R. Rodriguez1-3/+2
We can just display this upon enabling mac80211 with an 'if MAC80211 != n' check. Cc: Johannes Berg <[email protected]> Signed-off-by: Luis R. Rodriguez <[email protected]> Signed-off-by: John W. Linville <[email protected]>
2009-09-02wireless: update reg debug kconfig entryLuis R. Rodriguez1-0/+4
Refer to the wireless wiki for more information. Signed-off-by: Luis R. Rodriguez <[email protected]> Signed-off-by: John W. Linville <[email protected]>
2009-09-02net: file_operations should be constStephen Hemminger7-14/+14
All instances of file_operations should be const. Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-02inet: inet_connection_sock_af_ops constStephen Hemminger4-10/+10
The function block inet_connect_sock_af_ops contains no data make it constant. Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-02tcp: MD5 operations should be constStephen Hemminger2-7/+7
Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-02net: seq_operations should be constStephen Hemminger2-2/+2
Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-02Merge branch 'master' of ↵David S. Miller15-28/+44
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/yellowfin.c
2009-09-01ipv6: ip6_push_pending_frames() should increment IPSTATS_MIB_OUTDISCARDSEric Dumazet1-0/+1
qdisc drops should be notified to IP_RECVERR enabled sockets, as done in IPV4. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-01drop_monitor: fix trace_napi_poll_hit()Xiao Guangrong1-1/+2
The net_dev of backlog napi is NULL, like below: __get_cpu_var(softnet_data).backlog.dev == NULL So, we should check it in napi tracepoint's probe function Acked-by: Neil Horman <[email protected]> Signed-off-by: Xiao Guangrong <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-01pkt_sched: Revert tasklet_hrtimer changes.David S. Miller2-19/+16
These are full of unresolved problems, mainly that conversions don't work 1-1 from hrtimers to tasklet_hrtimers because unlike hrtimers tasklets can't be killed from softirq context. And when a qdisc gets reset, that's exactly what we need to do here. We'll work this out in the net-next-2.6 tree and if warranted we'll backport that work to -stable. This reverts the following 3 changesets: a2cb6a4dd470d7a64255a10b843b0d188416b78f ("pkt_sched: Fix bogon in tasklet_hrtimer changes.") 38acce2d7983632100a9ff3fd20295f6e34074a8 ("pkt_sched: Convert CBQ to tasklet_hrtimer.") ee5f9757ea17759e1ce5503bdae2b07e48e32af9 ("pkt_sched: Convert qdisc_watchdog to tasklet_hrtimer") Signed-off-by: David S. Miller <[email protected]>
2009-09-01net: sk_free() should be allowed right after sk_alloc()Jarek Poplawski1-1/+1
After commit 2b85a34e911bf483c27cfdd124aeb1605145dc80 (net: No more expensive sock_hold()/sock_put() on each tx) sk_free() frees socks conditionally and depends on sk_wmem_alloc being set e.g. in sock_init_data(). But in some cases sk_free() is called earlier, usually after other alloc errors. Fix is to move sk_wmem_alloc initialization from sock_init_data() to sk_alloc() itself. Signed-off-by: Jarek Poplawski <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-01net: make neigh_ops constantStephen Hemminger4-11/+11
These tables are never modified at runtime. Move to read-only section. Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-01netns: embed ip6_dst_ops directlyAlexey Dobriyan1-21/+13
struct net::ipv6.ip6_dst_ops is separatedly dynamically allocated, but there is no fundamental reason for it. Embed it directly into struct netns_ipv6. For that: * move struct dst_ops into separate header to fix circular dependencies I honestly tried not to, it's pretty impossible to do other way * drop dynamical allocation, allocate together with netns For a change, remove struct dst_ops::dst_net, it's deducible by using container_of() given dst_ops pointer. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-01netfilter: ebt_ulog: fix checkentry return valuePatrick McHardy1-1/+1
Commit 19eda87 (netfilter: change return types of check functions for Ebtables extensions) broke the ebtables ulog module by missing a return value conversion. Signed-off-by: Patrick McHardy <[email protected]>
2009-09-01Revert Backoff [v3]: Calculate TCP's connection close threshold as a time value.Damian Lukowski1-4/+7
RFC 1122 specifies two threshold values R1 and R2 for connection timeouts, which may represent a number of allowed retransmissions or a timeout value. Currently linux uses sysctl_tcp_retries{1,2} to specify the thresholds in number of allowed retransmissions. For any desired threshold R2 (by means of time) one can specify tcp_retries2 (by means of number of retransmissions) such that TCP will not time out earlier than R2. This is the case, because the RTO schedule follows a fixed pattern, namely exponential backoff. However, the RTO behaviour is not predictable any more if RTO backoffs can be reverted, as it is the case in the draft "Make TCP more Robust to Long Connectivity Disruptions" (http://tools.ietf.org/html/draft-zimmermann-tcp-lcd). In the worst case TCP would time out a connection after 3.2 seconds, if the initial RTO equaled MIN_RTO and each backoff has been reverted. This patch introduces a function retransmits_timed_out(N), which calculates the timeout of a TCP connection, assuming an initial RTO of MIN_RTO and N unsuccessful, exponentially backed-off retransmissions. Whenever timeout decisions are made by comparing the retransmission counter to some value N, this function can be used, instead. The meaning of tcp_retries2 will be changed, as many more RTO retransmissions can occur than the value indicates. However, it yields a timeout which is similar to the one of an unpatched, exponentially backing off TCP in the same scenario. As no application could rely on an RTO greater than MIN_RTO, there should be no risk of a regression. Signed-off-by: Damian Lukowski <[email protected]> Acked-by: Ilpo Järvinen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-01Revert Backoff [v3]: Revert RTO on ICMP destination unreachableDamian Lukowski3-4/+40
Here, an ICMP host/network unreachable message, whose payload fits to TCP's SND.UNA, is taken as an indication that the RTO retransmission has not been lost due to congestion, but because of a route failure somewhere along the path. With true congestion, a router won't trigger such a message and the patched TCP will operate as standard TCP. This patch reverts one RTO backoff, if an ICMP host/network unreachable message, whose payload fits to TCP's SND.UNA, arrives. Based on the new RTO, the retransmission timer is reset to reflect the remaining time, or - if the revert clocked out the timer - a retransmission is sent out immediately. Backoffs are only reverted, if TCP is in RTO loss recovery, i.e. if there have been retransmissions and reversible backoffs, already. Changes from v2: 1) Renaming of skb in tcp_v4_err() moved to another patch. 2) Reintroduced tcp_bound_rto() and __tcp_set_rto(). 3) Fixed code comments. Signed-off-by: Damian Lukowski <[email protected]> Acked-by: Ilpo Järvinen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-01Revert Backoff [v3]: Rename skb to icmp_skb in tcp_v4_err()Damian Lukowski1-8/+8
This supplementary patch renames skb to icmp_skb in tcp_v4_err() in order to disambiguate from another sk_buff variable, which will be introduced in a separate patch. Signed-off-by: Damian Lukowski <[email protected]> Acked-by: Ilpo Järvinen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-01dcbnl: Add implementations of dcbnl setapp/getapp commandsYi Zou1-0/+122
Implements the dcbnl netlink setapp/getapp pair. When a setapp/getapp is received, dcbnl would just pass on to dcbnl_rtnl_op.setapp/getapp that are supposed to be implemented by the low level drivers. Signed-off-by: Yi Zou <[email protected]> Acked-by: Peter P Waskiewicz Jr <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-01dcbnl: Add netlink attributes for setapp/getapp to dcbnlYi Zou1-0/+8
Add defines for dcbnl netlink attributes to support netlink message passing of setapp/getapp in dcbnl. Signed-off-by: Yi Zou <[email protected]> Acked-by: Peter P Waskiewicz Jr <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-01vlan: Add support for net_devices_ops.ndo_fcoe_enable/_disable to VLANYi Zou1-0/+26
This adds implementation of the net_devices_ops.ndo_fcoe_enable/_disable to the VLAN driver. It checks if the real_dev has support for ndo_fcoe_enable/ ndo_fcoe_disable and if so, passes on to call the associated real_dev. Signed-off-by: Yi Zou <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-01wireless: convert drivers to netdev_tx_tStephen Hemminger2-6/+8
Mostly just simple conversions: * ray_cs had bogus return of NET_TX_LOCKED but driver was not using NETIF_F_LLTX * hostap and ipw2x00 had some code that returned value from a called function that also had to change to return netdev_tx_t Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-01netdev: convert pseudo drivers to netdev_tx_tStephen Hemminger1-1/+1
These are all drivers that don't touch real hardware. Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-01irda: convert to netdev_tx_tStephen Hemminger1-2/+4
Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-01convert hamradio drivers to netdev_txreturnt_tStephen Hemminger2-2/+2
Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-01convert ATM drivers to netdev_tx_tStephen Hemminger4-6/+12
Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-09-01netdev: convert pseudo-devices to netdev_tx_tStephen Hemminger16-21/+26
Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-08-31IPVS: Add handling of incoming ICMPV6 messagesJulius Volz1-6/+17
Add handling of incoming ICMPv6 messages. This follows the handling of IPv4 ICMP messages. Amongst ther things this problem allows IPVS to behave sensibly when an ICMPV6_PKT_TOOBIG message is received: This message is received when a realserver sends a packet >PMTU to the client. The hop on this path with insufficient MTU will generate an ICMPv6 Packet Too Big message back to the VIP. The LVS server receives this message, but the call to the function handling this has been missing. Thus, IPVS fails to forward the message to the real server, which then does not adjust the path MTU. This patch adds the missing call to ip_vs_in_icmp_v6() in ip_vs_in() to handle this situation. Thanks to Rob Gallagher from HEAnet for reporting this issue and for testing this patch in production (with direct routing mode). [[email protected]: tweaked changelog] Signed-off-by: Julius Volz <[email protected]> Tested-by: Rob Gallagher <[email protected]> Signed-off-by: Simon Horman <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2009-08-31netfilter: ip6t_eui: fix read outside array boundsPatrick McHardy1-7/+2
Use memcmp() instead of open coded comparison that reads one byte past the intended end. Based on patch from Roel Kluin <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2009-08-31netfilter: nf_conntrack: netns fix re reliable conntrack event deliveryAlexey Dobriyan1-3/+3
Conntracks in netns other than init_net dying list were never killed. Signed-off-by: Alexey Dobriyan <[email protected]> Acked-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2009-08-31ipvs: Use atomic operations atomiclySimon Horman2-6/+7
A pointed out by Shin Hong, IPVS doesn't always use atomic operations in an atomic manner. While this seems unlikely to be manifest in strange behaviour, it seems appropriate to clean this up. Cc: shin hong <[email protected]> Signed-off-by: Simon Horman <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2009-08-30pkt_sched: Fix resource limiting in pfifo_fastKrishna Kumar1-4/+4
pfifo_fast_enqueue has this check: if (skb_queue_len(list) < qdisc_dev(qdisc)->tx_queue_len) { which allows each band to enqueue upto tx_queue_len skbs for a total of 3*tx_queue_len skbs. I am not sure if this was the intention of limiting in qdisc. Patch compiled and 32 simultaneous netperf testing ran fine. Also: # tc -s qdisc show dev eth2 qdisc pfifo_fast 0: root bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Sent 16835026752 bytes 373116 pkt (dropped 0, overlimits 0 requeues 25) rate 0bit 0pps backlog 0b 0p requeues 25 Signed-off-by: Krishna Kumar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-08-30net: convert remaining non-symbolic return values in dev_queue_xmitKrishna Kumar1-1/+1
Patch compiled and 32 simultaneous netperf testing ran fine. Signed-off-by: Krishna Kumar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-08-30can: use correct NET_RX_ return valuesOliver Hartkopp1-2/+2
Dropped skb's should be documented by an appropriate return value. Use the correct NET_RX_DROP and NET_RX_SUCCESS values for that reason. Signed-off-by: Oliver Hartkopp <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-08-30Merge branch 'master' of ↵David S. Miller9-146/+1419
git://git.kernel.org/pub/scm/linux/kernel/git/holtmann/bluetooth-next-2.6
2009-08-29tipc: fix test of bearer_priority range in tipc_register_media()roel kluin1-1/+1
For the bearer_priority to be less than TIPC_MIN_LINK_PRI and greater than TIPC_MAX_LINK_PRI is logically impossible. Signed-off-by: Roel Kluin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-08-29can: switch to seq_fileAlexey Dobriyan2-199/+167
create_proc_read_entry() is going to be removed soon. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-08-29tcp: Remove redundant copy of MD5 authentication keyJohn Dykstra1-23/+0
Remove the copy of the MD5 authentication key from tcp_check_req(). This key has already been copied by tcp_v4_syn_recv_sock() or tcp_v6_syn_recv_sock(). Signed-off-by: John Dykstra <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-08-29Speed-up pfifo_fast lookup using a private bitmapKrishna Kumar1-22/+48
Maintain a per-qdisc bitmap for pfifo_fast giving availability of skbs for each band. This allows faster lookup for a skb when there are no high priority skbs. Also, it helps in (rare) cases when there are no skbs on the list, where an immediate lookup is faster than iterating through the three bands. Signed-off-by: Krishna Kumar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-08-29ipv6: Update Neighbor Cache when IPv6 RA is received on a routerDavid Ward1-6/+8
When processing a received IPv6 Router Advertisement, the kernel creates or updates an IPv6 Neighbor Cache entry for the sender -- but presently this does not occur if IPv6 forwarding is enabled (net.ipv6.conf.*.forwarding = 1), or if IPv6 Router Advertisements are not accepted (net.ipv6.conf.*.accept_ra = 0), because in these cases processing of the Router Advertisement has already halted. This patch allows the Neighbor Cache to be updated in these cases, while still avoiding any modification to routes or link parameters. This continues to satisfy RFC 4861, since any entry created in the Neighbor Cache as the result of a received Router Advertisement is still placed in the STALE state. Signed-off-by: David Ward <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-08-29tcp: fix premature termination of FIN_WAIT2 time-wait socketsOctavian Purdila1-1/+1
There is a race condition in the time-wait sockets code that can lead to premature termination of FIN_WAIT2 and, subsequently, to RST generation when the FIN,ACK from the peer finally arrives: Time TCP header 0.000000 30755 > http [SYN] Seq=0 Win=2920 Len=0 MSS=1460 TSV=282912 TSER=0 0.000008 http > 30755 aSYN, ACK] Seq=0 Ack=1 Win=2896 Len=0 MSS=1460 TSV=... 0.136899 HEAD /1b.html?n1Lg=v1 HTTP/1.0 [Packet size limited during capture] 0.136934 HTTP/1.0 200 OK [Packet size limited during capture] 0.136945 http > 30755 [FIN, ACK] Seq=187 Ack=207 Win=2690 Len=0 TSV=270521... 0.136974 30755 > http [ACK] Seq=207 Ack=187 Win=2734 Len=0 TSV=283049 TSER=... 0.177983 30755 > http [ACK] Seq=207 Ack=188 Win=2733 Len=0 TSV=283089 TSER=... 0.238618 30755 > http [FIN, ACK] Seq=207 Ack=188 Win=2733 Len=0 TSV=283151... 0.238625 http > 30755 [RST] Seq=188 Win=0 Len=0 Say twdr->slot = 1 and we are running inet_twdr_hangman and in this instance inet_twdr_do_twkill_work returns 1. At that point we will mark slot 1 and schedule inet_twdr_twkill_work. We will also make twdr->slot = 2. Next, a connection is closed and tcp_time_wait(TCP_FIN_WAIT2, timeo) is called which will create a new FIN_WAIT2 time-wait socket and will place it in the last to be reached slot, i.e. twdr->slot = 1. At this point say inet_twdr_twkill_work will run which will start destroying the time-wait sockets in slot 1, including the just added TCP_FIN_WAIT2 one. To avoid this issue we increment the slot only if all entries in the slot have been purged. This change may delay the slots cleanup by a time-wait death row period but only if the worker thread didn't had the time to run/purge the current slot in the next period (6 seconds with default sysctl settings). However, on such a busy system even without this change we would probably see delays... Signed-off-by: Octavian Purdila <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-08-28fib_trie: resize reworkJens Låås1-72/+23
Here is rework and cleanup of the resize function. Some bugs we had. We were using ->parent when we should use node_parent(). Also we used ->parent which is not assigned by inflate in inflate loop. Also a fix to set thresholds to power 2 to fit halve and double strategy. max_resize is renamed to max_work which better indicates it's function. Reaching max_work is not an error, so warning is removed. max_work only limits amount of work done per resize. (limits CPU-usage, outstanding memory etc). The clean-up makes it relatively easy to add fixed sized root-nodes if we would like to decrease the memory pressure on routers with large routing tables and dynamic routing. If we'll need that... Its been tested with 280k routes. Work done together with Robert Olsson. Signed-off-by: Jens Låås <[email protected]> Signed-off-by: Robert Olsson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-08-28sit: allow ip fragmentation when using nopmtudisc to fix package lossSascha Hlusiak1-1/+1
if tunnel parameters have frag_off set to IP_DF, pmtudisc on the ipv4 link will be performed by deriving the mtu from the ipv4 link and setting the DF-Flag of the encapsulating IPv4 Header. If fragmentation is needed on the way, the IPv4 pmtu gets adjusted, the ipv6 package will be resent eventually, using the new and lower mtu and everyone is happy. If the frag_off parameter is unset, the mtu for the tunnel will be derived from the tunnel device or the ipv6 pmtu, which might be higher than the ipv4 pmtu. In that case we must allow the fragmentation of the IPv4 packet because the IPv6 mtu wouldn't 'learn' from the adjusted IPv4 pmtu, resulting in frequent icmp_frag_needed and package loss on the IPv6 layer. This patch allows fragmentation when tunnel was created with parameter nopmtudisc, like in ipip/gre tunnels. Signed-off-by: Sascha Hlusiak <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-08-28net: ip_rt_send_redirect() optimizationEric Dumazet1-9/+11
While doing some forwarding benchmarks, I noticed ip_rt_send_redirect() is rather expensive, even if send_redirects is false for the device. Fix is to avoid two atomic ops, we dont really need to take a reference on in_dev Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-08-28tcp: keepalive cleanupsEric Dumazet2-5/+4
Introduce keepalive_probes(tp) helper, and use it, like keepalive_time_when(tp) and keepalive_intvl_when(tp) Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-08-28ipv4: af_inet.c cleanupsEric Dumazet1-57/+55
Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-08-28pktgen: use proc_create_data()Alexey Dobriyan1-2/+4
It looks like after rename device proc entry is unusable, because of no ->read_proc or ->proc_fops. And create_proc_entry() is deprecated. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2009-08-28pktgen: increase versionStephen Hemminger1-6/+10
Increase module version, and cleanup module info. Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>