path: root/net
Age  Commit message  Author  Files  Lines
2008-10-08netfilter: netns nf_conntrack: cleanup after L3 and L4 proto unregister in every netnsAlexey Dobriyan1-2/+8
Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: netns nf_conntrack: unregister helper in every netnsAlexey Dobriyan1-16/+24
Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netns: export netns listAlexey Dobriyan1-0/+1
Conntrack code will use it for a) removing expectations and helpers when corresponding module is removed, and b) removing conntracks when L3 protocol conntrack module is removed. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: netns nf_conntrack: per-netns /proc/net/ip_conntrack, /proc/net/stat/ip_conntrack, /proc/net/ip_conntrack_expectAlexey Dobriyan1-19/+38
Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: netns nf_conntrack: per-netns /proc/net/nf_conntrack_expectAlexey Dobriyan1-10/+11
Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: netns nf_conntrack: per-netns /proc/net/nf_conntrack, /proc/net/stat/nf_conntrackAlexey Dobriyan1-20/+31
Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: netns nf_conntrack: pass netns pointer to L4 protocol's ->error hookAlexey Dobriyan7-19/+25
Again, it's deducible from skb, but we're going to use it for nf_conntrack_checksum and statistics, so just pass it from upper layer. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: netns nf_conntrack: pass netns pointer to nf_conntrack_in()Alexey Dobriyan3-17/+26
It's deducible from skb->dev or skb->dst->dev, but we know netns at the moment of call, so pass it down and use for finding and creating conntracks. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: netns nf_conntrack: per-netns unconfirmed listAlexey Dobriyan2-4/+5
What is confirmed connection in one netns can very well be unconfirmed in another one. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: netns nf_conntrack: per-netns expectationsAlexey Dobriyan9-45/+49
Make per-netns a) expectation hash and b) expectations count. An expectation always belongs to the netns to which its master conntrack belongs. This is natural and doesn't bloat the expectation. Proc files and leaf users are stubbed to init_net; this is temporary. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: netns: fix {ip,6}_route_me_harder() in netnsAlexey Dobriyan2-4/+5
Take netns from skb->dst->dev. It should be safe because they are called from the LOCAL_OUT hook, where dst is valid (though I'm not exactly sure about IPVS and queueing packets to userspace). [Patrick: it's safe everywhere since they already expect skb->dst to be set] Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: netns nf_conntrack: per-netns conntrack hashAlexey Dobriyan13-58/+61
* Make the conntrack hash per-netns. The other solution is to add a ->ct_net pointer to tuplehashes and still have one hash; I tried that, and it's ugly and requires more code deep down in protocol modules et al.
* Propagate the netns pointer to where it is needed, e.g. to conntrack iterators.
Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
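The per-netns hash design described in this commit can be illustrated with a toy userspace model. This is only a sketch in Python, not kernel code; `Net`, `create_conntrack` and `find_conntrack` are hypothetical names chosen for illustration. The point it shows is that the same tuple can exist in one namespace and be absent in another, without storing a namespace pointer in each hash entry.

```python
class Net:
    """Toy stand-in for a network namespace: it owns its own conntrack hash."""

    def __init__(self, name):
        self.name = name
        self.ct_hash = {}   # tuple -> conntrack, private to this namespace
        self.ct_count = 0   # per-namespace conntrack count


def create_conntrack(net, tuple_):
    """Create a conntrack in the given namespace and hash it there."""
    ct = {"tuple": tuple_, "ct_net": net}  # ct_net is set once, at creation
    net.ct_hash[tuple_] = ct
    net.ct_count += 1
    return ct


def find_conntrack(net, tuple_):
    """Lookups take the namespace explicitly and never see other namespaces."""
    return net.ct_hash.get(tuple_)
```

With a single shared hash, every lookup would instead have to compare a per-entry namespace pointer, which is the alternative the message calls ugly.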
2008-10-08netfilter: netns nf_conntrack: per-netns conntrack countAlexey Dobriyan4-14/+12
Sysctls and proc files are stubbed to init_net's one. This is temporary. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: netns nf_conntrack: add ->ct_net -- pointer from conntrack to netnsAlexey Dobriyan2-5/+14
Conntrack (struct nf_conn) gets a pointer to its netns: ->ct_net -- the netns in which it was created. It comes from the netdevice. ->ct_net is a write-once field. Every conntrack in the system has ->ct_net initialized, no exceptions. ->ct_net doesn't pin the netns: conntracks are recycled after timeouts, and pinning by background traffic would prevent the netns from even starting its shutdown sequence. Right now every conntrack is created in init_net. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: netns nf_conntrack: add netns boilerplateAlexey Dobriyan3-7/+22
One comment: the #ifdefs around #include are necessary to overcome amazing compile breakages in the NOTRACK-in-netns patch (see below). Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: netns: ip6t_REJECT in netns for realAlexey Dobriyan1-10/+12
Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: netns: ip6table_mangle in netns for realAlexey Dobriyan1-9/+22
Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: netns: ip6table_raw in netns for realAlexey Dobriyan1-4/+16
Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: netns: remove nf_*_net() wrappersAlexey Dobriyan6-19/+19
Now that dev_net() exists, they are even less useful. They are also a big problem in resolving the circular header dependencies necessary for the NOTRACK-in-netns patch. See below. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: implement NFPROTO_UNSPEC as a wildcard for extensionsJan Engelhardt14-269/+124
When a match or target is looked up using xt_find_{match,target}, Xtables will also search the NFPROTO_UNSPEC module list. This allows for protocol-independent extensions (like xt_time) to be reused from other components (e.g. arptables, ebtables). Extensions that take different codepaths depending on match->family or target->family of course cannot use NFPROTO_UNSPEC within the registration structure (e.g. xt_pkttype). Signed-off-by: Jan Engelhardt <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
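The wildcard lookup described above can be sketched as a two-step search: first the list for the requested family, then the NFPROTO_UNSPEC list. The following is a toy Python model, not the Xtables implementation; `registry`, `register_match` and `find_match` are hypothetical names, and the numeric constants are included only for illustration.

```python
# Illustrative family constants (assumed values; only their distinctness matters here).
NFPROTO_UNSPEC = 0
NFPROTO_IPV4 = 2
NFPROTO_ARP = 3

registry = {}  # family -> {extension name -> extension object}


def register_match(family, name, match):
    """Register an extension under one family (or under the wildcard family)."""
    registry.setdefault(family, {})[name] = match


def find_match(family, name):
    """Search the family's own list first, then fall back to the wildcard list."""
    match = registry.get(family, {}).get(name)
    if match is None and family != NFPROTO_UNSPEC:
        match = registry.get(NFPROTO_UNSPEC, {}).get(name)
    return match
```

A protocol-independent extension registered once under the wildcard family is then found from any caller, while a family-specific extension stays invisible to other families.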
2008-10-08netfilter: x_tables: use NFPROTO_* in extensionsJan Engelhardt74-223/+225
Signed-off-by: Jan Engelhardt <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: Introduce NFPROTO_* constantsJan Engelhardt4-23/+25
The netfilter subsystem only supports a handful of protocols (far fewer than PF_*), including non-PF protocols like ARP and pseudo-protocols like PF_BRIDGE. By creating NFPROTO_*, we can save some memory on arrays that previously were always PF_MAX-sized and keep the pseudo-protocols to ourselves. Signed-off-by: Jan Engelhardt <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: xt_recent: IPv6 supportJan Engelhardt2-54/+253
This updates xt_recent to support the IPv6 address family. The new /proc/net/xt_recent directory must be used for this. The old proc interface can also be configured out. Signed-off-by: Jan Engelhardt <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: rename ipt_recent to xt_recentJan Engelhardt5-32/+31
Like with other modules (such as ipt_state), ipt_recent.h is changed to forward definitions to (IOW include) xt_recent.h, and xt_recent.c is changed to use the new constant names. Signed-off-by: Jan Engelhardt <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-08netfilter: Use unsigned types for hooknum and pf varsJan Engelhardt28-82/+86
and (try to) consistently use u_int8_t for the L3 family. Signed-off-by: Jan Engelhardt <[email protected]> Signed-off-by: Patrick McHardy <[email protected]>
2008-10-07Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6David S. Miller14-40/+941
2008-10-07tcp: Fix tcp_hybla zero congestion window growth with small rho and large cwnd.Daniele Lacamera1-1/+5
Because of rounding, in certain conditions, i.e. when in congestion avoidance state rho is smaller than 1/128 of the current cwnd, TCP Hybla congestion control starves and the cwnd is kept constant forever. This patch forces an increment by one segment after snd_cwnd calls without increments (NewReno behavior). Signed-off-by: Daniele Lacamera <[email protected]> Signed-off-by: David S. Miller <[email protected]>
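The starvation described above is a truncation effect in fixed-point arithmetic. The sketch below is a simplified Python model, not the kernel's tcp_hybla code: it assumes 7 fractional bits (inferred from the 1/128 threshold in the message), and the names `hybla_ca_step`, `rho2_7ls`, `cents` and `idle_calls` are hypothetical. In this model the per-call increment truncates to zero once cwnd exceeds `rho2_7ls`, and the fallback adds one segment after cwnd consecutive calls without any increment.

```python
def hybla_ca_step(cwnd, cents, idle_calls, rho2_7ls, force_after_idle=True):
    """One congestion-avoidance step; increments are in units of 1/128 segment."""
    increment = rho2_7ls // cwnd        # integer division: 0 when cwnd > rho2_7ls
    if increment == 0:
        idle_calls += 1                 # count calls that added nothing
    cwnd += increment >> 7              # whole segments
    cents += increment & 127            # accumulated fractions
    while cents >= 128:                 # carry fractions into a full segment
        cwnd += 1
        cents -= 128
        idle_calls = 0
    # Fallback modelled on the fix: grow by one segment after cwnd idle calls.
    if force_after_idle and increment == 0 and idle_calls >= cwnd:
        cwnd += 1
        idle_calls = 0
    return cwnd, cents, idle_calls


def run(cwnd, rho2_7ls, calls, force_after_idle=True):
    """Apply the step function repeatedly and return the final window."""
    cents = idle_calls = 0
    for _ in range(calls):
        cwnd, cents, idle_calls = hybla_ca_step(
            cwnd, cents, idle_calls, rho2_7ls, force_after_idle)
    return cwnd
```

Calling `run(200, 128, 1000, force_after_idle=False)` models the old behaviour, where the window never moves; with the fallback enabled the window grows by one segment every cwnd calls.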
2008-10-07net: Fix netdev_run_todo dead-lockHerbert Xu2-22/+7
Benjamin Thery tracked down a bug that explains many instances of the error "unregister_netdevice: waiting for %s to become free. Usage count = %d". It turns out that netdev_run_todo can dead-lock with itself if a second instance of it is run in a thread that will then free a reference to the device waited on by the first instance. The problem is really quite silly. We were trying to create parallelism where none was required. As netdev_run_todo always follows an RTNL section, and todo tasks can only be added with the RTNL held, by definition you should only need to wait for the very ones that you've added and be done with it. There is no need for a second mutex or spinlock. This is exactly what the following patch does. Signed-off-by: Herbert Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
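The idea in the message, that each caller only needs to process the todo entries it queued itself, can be modelled as "snapshot the list while still holding the lock, then release and process the snapshot". This is a hedged Python sketch of that pattern only; the function names and the use of `threading.Lock` as a stand-in for the RTNL are assumptions, and the kernel code differs in detail.

```python
import threading

rtnl = threading.Lock()   # stand-in for the RTNL
todo_list = []            # entries may only be added while rtnl is held


def queue_todo(dev):
    """Queue a device for deferred teardown; the caller must hold rtnl."""
    assert rtnl.locked()
    todo_list.append(dev)


def rtnl_unlock_and_run_todo(process):
    """Take a private snapshot under the lock, drop the lock, run the snapshot.

    No second lock is needed: nothing can be appended to the snapshot after
    the lock is released, so each caller waits only for its own entries.
    """
    snapshot = todo_list[:]
    todo_list.clear()
    rtnl.release()
    for dev in snapshot:
        process(dev)
    return snapshot
```

A caller would acquire `rtnl`, call `queue_todo()` for each device being unregistered, and then leave the locked section through `rtnl_unlock_and_run_todo()`.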
2008-10-07ipv4: add mc_count to in_device.Rami Rosen1-2/+5
This patch adds mc_count to struct in_device and updates the increment/decrement/initialize sites of this field in IPv4 and in IPv6. Printing of the /proc entry (/proc/net/igmp) is also adjusted to use the new mc_count. Signed-off-by: Rami Rosen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2008-10-07tcp: Fix possible double-ack w/ user dmaAli Saidi1-1/+2
From: Ali Saidi <[email protected]> When TCP receive copy offload is enabled it's possible that tcp_rcv_established() will cause two acks to be sent for a single packet. In the case that a tcp_dma_early_copy() is successful, copied_early is set to true which causes tcp_cleanup_rbuf() to be called early which can send an ack. Further along in tcp_rcv_established(), __tcp_ack_snd_check() is called and will schedule a delayed ACK. If no packets are processed before the delayed ack timer expires the packet will be acked twice. Signed-off-by: David S. Miller <[email protected]>
2008-10-07net: only invoke dev->change_rx_flags when device is UPPatrick McHardy1-6/+10
Jesper Dangaard Brouer <[email protected]> reported a bug when setting a VLAN device down that is in promiscuous mode: When the VLAN device is set down, the promiscuous count on the real device is decremented by one by vlan_dev_stop(). When removing the promiscuous flag from the VLAN device afterwards, the promiscuous count on the real device is decremented a second time by the vlan_change_rx_flags() callback. The root cause for this is that the ->change_rx_flags() callback is invoked while the device is down. The synchronization is meant to mirror the behaviour of the ->set_rx_mode callbacks, meaning the ->open function is responsible for doing a full sync on open, the ->close() function is responsible for doing full cleanup on ->stop() and ->change_rx_flags() is meant to do incremental changes while the device is UP. Only invoke ->change_rx_flags() while the device is UP to provide the intended behaviour. Tested-by: Jesper Dangaard Brouer <[email protected]> Signed-off-by: Patrick McHardy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
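The double decrement described above can be reproduced with a small model. The sketch below is Python, not kernel code; the classes `RealDev` and `VlanDev` and the helper `set_promisc` are hypothetical stand-ins for the real device, the VLAN device and the core flag-change path. The `only_when_up` switch models the fix: the incremental callback is skipped while the device is down.

```python
IFF_UP, IFF_PROMISC = 0x1, 0x100


class RealDev:
    def __init__(self):
        self.promiscuity = 0


class VlanDev:
    def __init__(self, real):
        self.real = real
        self.flags = 0

    def open(self):
        # Full sync on open.
        self.flags |= IFF_UP
        if self.flags & IFF_PROMISC:
            self.real.promiscuity += 1

    def stop(self):
        # Full cleanup on stop.
        if self.flags & IFF_PROMISC:
            self.real.promiscuity -= 1
        self.flags &= ~IFF_UP

    def change_rx_flags(self, changed):
        # Incremental change, only meaningful while the device is UP.
        if changed & IFF_PROMISC:
            self.real.promiscuity += 1 if self.flags & IFF_PROMISC else -1


def set_promisc(dev, on, only_when_up=True):
    old = dev.flags
    dev.flags = (dev.flags | IFF_PROMISC) if on else (dev.flags & ~IFF_PROMISC)
    changed = old ^ dev.flags
    if changed and (not only_when_up or dev.flags & IFF_UP):
        dev.change_rx_flags(changed)


def scenario(only_when_up):
    """Up, promisc on, down, promisc off; return the real device's count."""
    real = RealDev()
    vlan = VlanDev(real)
    vlan.open()
    set_promisc(vlan, True, only_when_up)
    vlan.stop()
    set_promisc(vlan, False, only_when_up)
    return real.promiscuity
```

In this model `scenario(False)` ends with a negative count (the reported bug), while `scenario(True)` ends balanced at zero.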
2008-10-07netns: make uplitev6 mib per/namespaceDenis V. Lunev3-9/+10
Signed-off-by: Denis V. Lunev <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2008-10-07netns: make udpv6 mib per/namespaceDenis V. Lunev3-9/+6
Signed-off-by: Denis V. Lunev <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2008-10-07netns: add stub functions for per/namespace mibs allocationDenis V. Lunev1-2/+16
The content of init_ipv6_mibs/cleanup_ipv6_mibs will be moved to new calls one by one next. Signed-off-by: Denis V. Lunev <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2008-10-07netns: allow per device ipv6 snmp statistics in non-initial namespaceDenis V. Lunev1-3/+0
Signed-off-by: Denis V. Lunev <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2008-10-07netns: register global ipv6 mibs statistics in each namespaceDenis V. Lunev1-2/+4
Unused net variable will become used very soon. Signed-off-by: Denis V. Lunev <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2008-10-07ipv6: separate seq_ops for global & per/device ipv6 statisticsDenis V. Lunev1-16/+32
Until now, idev has been stored in seq->private, with NULL stored for global statistics. The situation changes with net namespaces: we need to store a pointer to struct net, and the only place is seq->private. So, for /proc/net/dev_snmp6/* and for /proc/net/snmp6 we'll have pointers of two different types stored in the same field. This effectively requires separate seq_ops for these files. Signed-off-by: Denis V. Lunev <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2008-10-07ipv6: consolidate ipv6 sock_stat code at the beginning of net/ipv6/proc.cDenis V. Lunev1-13/+13
Simply consolidate the sockstat6 stuff in one place, at the beginning of the file. Right now sockstat6_seq_open/sockstat6_seq_fops looks like an intrusion in the middle of the snmp6 code. Signed-off-by: Denis V. Lunev <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2008-10-07netns: register /proc/net/dev_snmp6/* in each nsDenis V. Lunev1-24/+16
Do the same for /proc/net/snmp6. Signed-off-by: Denis V. Lunev <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2008-10-07netns: move /proc/net/dev_snmp6 to struct netDenis V. Lunev1-9/+11
Signed-off-by: Denis V. Lunev <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2008-10-07tcp: cleanup messy initializerIlpo Järvinen1-2/+2
I'm quite sure that if I give this function in its old format for you to inspect, you start to wonder what is the type of demanded or if it's a global variable. Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2008-10-07tcp: kill pointless urg_modeIlpo Järvinen4-14/+20
It all started from me noticing that this urgent check in tcp_clean_rtx_queue is unnecessarily inside the loop. Then I took a longer look to it and found out that the users of urg_mode can trivially do without, well almost, there was one gotcha. Bonus: those funny people who use urg with >= 2^31 write_seq - snd_una could now rejoice too (that's the only purpose for the between being there, otherwise a simple compare would have done the thing). Not that I assume that the rest of the tcp code happily lives with such mind-boggling numbers :-). Alas, it turned out to be impossible to set wmem to such numbers anyway, yes I really tried a big sendfile after setting some wmem but nothing happened :-). ...Tcp_wmem is int and so is sk_sndbuf... So I hacked a bit variable to long and found out that it seems to work... :-) Signed-off-by: Ilpo Järvinen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2008-10-07net: packet split receive apiPeter Zijlstra1-0/+20
Add some packet-split receive hooks. For one, this allows NUMA-node-affine page allocations. Later on these hooks will be extended to do emergency reserve allocations for fragments. Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2008-10-07net: wrap sk->sk_backlog_rcv()Peter Zijlstra3-4/+4
Wrap calling sk->sk_backlog_rcv() in a function. This will allow extending the generic sk_backlog_rcv behaviour. Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2008-10-07ipv6: initialize ip6_route sysctl vars in ip6_route_net_init()Peter Zijlstra2-8/+9
This makes ip6_route_net_init() do all of the route init work. There used to be a race between ip6_route_net_init() and ip6_net_init(), and someone relying on the combined result was left out cold. Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2008-10-07ipv6: clean up ip6_route_net_init() error handlingPeter Zijlstra1-9/+10
ip6_route_net_init() error handling looked less than solid, fix 'er up. Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2008-10-07inet: Don't lookup the socket if there's a socket attached to the skbKOVACS Krisztian2-6/+14
Use the socket cached in the skb if it's present. Signed-off-by: KOVACS Krisztian <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2008-10-07inet: Add udplib_lookup_skb() helpersKOVACS Krisztian2-4/+24
To be able to use the cached socket reference in the skb during input processing we add a new set of lookup functions that receive the skb on their argument list. Signed-off-by: KOVACS Krisztian <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2008-10-07inet_hashtables: Add inet_lookup_skb helpersArnaldo Carvalho de Melo4-14/+6
To be able to use the cached socket reference in the skb during input processing we add a new set of lookup functions that receive the skb on their argument list. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: KOVACS Krisztian <[email protected]> Signed-off-by: David S. Miller <[email protected]>
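The three inet commits above share one pattern: a lookup helper receives the whole skb, returns the socket already attached to it if there is one, and only otherwise falls back to the hash lookup. A toy Python sketch of that pattern follows; the dict-based `skb`, the `table` keyed by address and port, and the name `lookup_skb` are all assumptions made for illustration, not the kernel interfaces.

```python
def lookup_skb(table, skb):
    """Prefer the socket cached on the skb; fall back to the table lookup."""
    sk = skb.get("sk")
    if sk is not None:
        return sk                       # cached reference, no lookup needed
    return table.get((skb["daddr"], skb["dport"]))
```

Passing the skb itself, rather than the already-extracted addresses and ports, is what lets the helper see the cached reference at all.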
2008-10-06mac80211: avoid "Wireless Event too big" message for assoc responseJohn W. Linville1-3/+5
The association response IEs are sent to userland with an IWEVCUSTOM event, which unfortunately is limited to a little more than 100 bytes of IE information with the encoding used. Many APs send so much IE information that this message overflows. When the IWEVCUSTOM event is too large, the kernel doesn't send it to userland anyway -- better just not to send it. An attempt was made by Jouni Malinen to correct this issue by converting to use IWEVASSOCREQIE and IWEVASSOCRESPIE messages instead ("mac80211: Use IWEVASSOCREQIE instead of IWEVCUSTOM"). Unfortunately, that caused a problem due to 32-/64-bit interactions on some systems and was reverted after the 'userland ABI' rule was invoked. That leaves us with this option instead of a proper fix, at least until we move to a cfg80211-based solution. Signed-off-by: John W. Linville <[email protected]>