aboutsummaryrefslogtreecommitdiff
path: root/net/ipv6/ip6_output.c
AgeCommit message (Collapse)AuthorFilesLines
2011-12-22net: introduce DST_NOPEER dst flagEric Dumazet1-1/+1
Chris Boot reported crashes occurring in ipv6_select_ident(). [ 461.457562] RIP: 0010:[<ffffffff812dde61>] [<ffffffff812dde61>] ipv6_select_ident+0x31/0xa7 [ 461.578229] Call Trace: [ 461.580742] <IRQ> [ 461.582870] [<ffffffff812efa7f>] ? udp6_ufo_fragment+0x124/0x1a2 [ 461.589054] [<ffffffff812dbfe0>] ? ipv6_gso_segment+0xc0/0x155 [ 461.595140] [<ffffffff812700c6>] ? skb_gso_segment+0x208/0x28b [ 461.601198] [<ffffffffa03f236b>] ? ipv6_confirm+0x146/0x15e [nf_conntrack_ipv6] [ 461.608786] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.614227] [<ffffffff81271d64>] ? dev_hard_start_xmit+0x357/0x543 [ 461.620659] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.626440] [<ffffffffa0379745>] ? br_parse_ip_options+0x19a/0x19a [bridge] [ 461.633581] [<ffffffff812722ff>] ? dev_queue_xmit+0x3af/0x459 [ 461.639577] [<ffffffffa03747d2>] ? br_dev_queue_push_xmit+0x72/0x76 [bridge] [ 461.646887] [<ffffffffa03791e3>] ? br_nf_post_routing+0x17d/0x18f [bridge] [ 461.653997] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.659473] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.665485] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.671234] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.677299] [<ffffffffa0379215>] ? nf_bridge_update_protocol+0x20/0x20 [bridge] [ 461.684891] [<ffffffffa03bb0e5>] ? nf_ct_zone+0xa/0x17 [nf_conntrack] [ 461.691520] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.697572] [<ffffffffa0374812>] ? NF_HOOK.constprop.8+0x3c/0x56 [bridge] [ 461.704616] [<ffffffffa0379031>] ? nf_bridge_push_encap_header+0x1c/0x26 [bridge] [ 461.712329] [<ffffffffa037929f>] ? br_nf_forward_finish+0x8a/0x95 [bridge] [ 461.719490] [<ffffffffa037900a>] ? nf_bridge_pull_encap_header+0x1c/0x27 [bridge] [ 461.727223] [<ffffffffa0379974>] ? br_nf_forward_ip+0x1c0/0x1d4 [bridge] [ 461.734292] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.739758] [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge] [ 461.746203] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.751950] [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge] [ 461.758378] [<ffffffffa037533a>] ? NF_HOOK.constprop.4+0x56/0x56 [bridge] This is caused by bridge netfilter special dst_entry (fake_rtable), a special shared entry, where attaching an inetpeer makes no sense. Problem is present since commit 87c48fa3b46 (ipv6: make fragment identifications less predictable) Introduce DST_NOPEER dst flag and make sure ipv6_select_ident() and __ip_select_ident() fallback to the 'no peer attached' handling. Reported-by: Chris Boot <[email protected]> Tested-by: Chris Boot <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-12-05net: Rename dst_get_neighbour{, _raw} to dst_get_neighbour_noref{, _raw}.David Miller1-3/+3
To reflect the fact that a refrence is not obtained to the resulting neighbour entry. Signed-off-by: David S. Miller <[email protected]> Acked-by: Roland Dreier <[email protected]>
2011-12-03ipv6: Add fragment reporting to ipv6_skip_exthdr().Jesse Gross1-1/+2
While parsing through IPv6 extension headers, fragment headers are skipped making them invisible to the caller. This reports the fragment offset of the last header in order to make it possible to determine whether the packet is fragmented and, if so whether it is a first or last fragment. Signed-off-by: Jesse Gross <[email protected]>
2011-11-22net: remove ipv6_addr_copy()Alexey Dobriyan1-9/+9
C assignment can handle struct in6_addr copying. Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-11-18ipv6: Remove all uses of LL_ALLOCATED_SPACEHerbert Xu1-2/+6
ipv6: Remove all uses of LL_ALLOCATED_SPACE The macro LL_ALLOCATED_SPACE was ill-conceived. It applies the alignment to the sum of needed_headroom and needed_tailroom. As the amount that is then reserved for head room is needed_headroom with alignment, this means that the tail room left may be too small. This patch replaces all uses of LL_ALLOCATED_SPACE in net/ipv6 with the macro LL_RESERVED_SPACE and direct reference to needed_tailroom. This also fixes the problem with needed_headroom changing between allocating the skb and reserving the head room. Signed-off-by: Herbert Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-10-28ipv6: fix error propagation in ip6_ufo_append_data()Zheng Yan1-1/+1
We should return errcode from sock_alloc_send_skb() Signed-off-by: Zheng Yan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-10-27ipv6: tcp: fix TCLASS value in ACK messages sent from TIME_WAITEric Dumazet1-5/+2
commit 66b13d99d96a (ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT) fixed IPv4 only. This part is for the IPv6 side, adding a tclass param to ip6_xmit() We alias tw_tclass and tw_tos, if socket family is INET6. [ if sockets is ipv4-mapped, only IP_TOS socket option is used to fill TOS field, TCLASS is not taken into account ] Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-10-19net: add skb frag size accessorsEric Dumazet1-2/+3
To ease skb->truesize sanitization, its better to be able to localize all references to skb frags size. Define accessors : skb_frag_size() to fetch frag size, and skb_frag_size_{set|add|sub}() to manipulate it. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-10-18ipv6: Fix IPsec slowpath fragmentation problemSteffen Klassert1-6/+12
ip6_append_data() builds packets based on the mtu from dst_mtu(rt->dst.path). On IPsec the effective mtu is lower because we need to add the protocol headers and trailers later when we do the IPsec transformations. So after the IPsec transformations the packet might be too big, which leads to a slowpath fragmentation then. This patch fixes this by building the packets based on the lower IPsec mtu from dst_mtu(&rt->dst) and adapts the exthdr handling to this. Signed-off-by: Steffen Klassert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-08-24net: ipv6: convert to SKB frag APIsIan Campbell1-3/+4
Signed-off-by: Ian Campbell <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Alexey Kuznetsov <[email protected]> Cc: "Pekka Savola (ipv6)" <[email protected]> Cc: James Morris <[email protected]> Cc: Hideaki YOSHIFUJI <[email protected]> Cc: Patrick McHardy <[email protected]> Cc: [email protected] Signed-off-by: David S. Miller <[email protected]>
2011-08-03net: fix NULL dereferences in check_peer_redir()Eric Dumazet1-2/+11
Gergely Kalman reported crashes in check_peer_redir(). It appears commit f39925dbde778 (ipv4: Cache learned redirect information in inetpeer.) added a race, leading to possible NULL ptr dereference. Since we can now change dst neighbour, we should make sure a reader can safely use a neighbour. Add RCU protection to dst neighbour, and make sure check_peer_redir() can be called safely by different cpus in parallel. As neighbours are already freed after one RCU grace period, this patch should not add typical RCU penalty (cache cold effects) Many thanks to Gergely for providing a pretty report pointing to the bug. Reported-by: Gergely Kalman <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-07-21ipv6: make fragment identifications less predictableEric Dumazet1-5/+31
IPv6 fragment identification generation is way beyond what we use for IPv4 : It uses a single generator. Its not scalable and allows DOS attacks. Now inetpeer is IPv6 aware, we can use it to provide a more secure and scalable frag ident generator (per destination, instead of system wide) This patch : 1) defines a new secure_ipv6_id() helper 2) extends inet_getid() to provide 32bit results 3) extends ipv6_select_ident() with a new dest parameter Reported-by: Fernando Gont <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-07-17net: Abstract dst->neighbour accesses behind helpers.David S. Miller1-6/+10
dst_{get,set}_neighbour() Signed-off-by: David S. Miller <[email protected]>
2011-07-16net: Create and use new helper, neigh_output().David S. Miller1-7/+3
Signed-off-by: David S. Miller <[email protected]>
2011-07-16ipv6: Use calculated 'neigh' instead of re-evaluating dst->neighbourDavid S. Miller1-1/+1
Signed-off-by: David S. Miller <[email protected]>
2011-07-14net: Embed hh_cache inside of struct neighbour.David S. Miller1-5/+9
Now that there is a one-to-one correspondance between neighbour and hh_cache entries, we no longer need: 1) dynamic allocation 2) attachment to dst->hh 3) refcounting Initialization of the hh_cache entry is indicated by hh_len being non-zero, and such initialization is always done with the neighbour's lock held as a writer. Signed-off-by: David S. Miller <[email protected]>
2011-05-06inet: Decrease overhead of on-stack inet_cork.David S. Miller1-16/+18
When we fast path datagram sends to avoid locking by putting the inet_cork on the stack we use up lots of space that isn't necessary. This is because inet_cork contains a "struct flowi" which isn't used in these code paths. Split inet_cork to two parts, "inet_cork" and "inet_cork_full". Only the latter of which has the "struct flowi" and is what is stored in inet_sock. Signed-off-by: David S. Miller <[email protected]> Acked-by: Eric Dumazet <[email protected]>
2011-04-22inet: constify ip headers and in6_addrEric Dumazet1-4/+4
Add const qualifiers to structs iphdr, ipv6hdr and in6_addr pointers where possible, to make code intention more obvious. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-04-15ipv6: RTA_PREFSRC support for ipv6 route source address selectionDaniel Walter1-4/+4
[ipv6] Add support for RTA_PREFSRC This patch allows a user to select the preferred source address for a specific IPv6-Route. It can be set via a netlink message setting RTA_PREFSRC to a valid IPv6 address which must be up on the device the route will be bound to. Signed-off-by: Daniel Walter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-03-31Fix common misspellingsLucas De Marchi1-1/+1
Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <[email protected]>
2011-03-12ipv6: Convert to use flowi6 where applicable.David S. Miller1-45/+45
Signed-off-by: David S. Miller <[email protected]>
2011-03-12net: Put flowi_* prefix on AF independent members of struct flowiDavid S. Miller1-5/+5
I intend to turn struct flowi into a union of AF specific flowi structs. There will be a common structure that each variant includes first, much like struct sock_common. This is the first step to move in that direction. Signed-off-by: David S. Miller <[email protected]>
2011-03-02xfrm: Return dst directly from xfrm_lookup()David S. Miller1-8/+2
Instead of on the stack. Signed-off-by: David S. Miller <[email protected]>
2011-03-01xfrm: Handle blackhole route creation via afinfo.David S. Miller1-22/+10
That way we don't have to potentially do this in every xfrm_lookup() caller. Signed-off-by: David S. Miller <[email protected]>
2011-03-01ipv6: Normalize arguments to ip6_dst_blackhole().David S. Miller1-2/+2
Return a dst pointer which is potentitally error encoded. Don't pass original dst pointer by reference, pass a struct net instead of a socket, and elide the flow argument since it is unnecessary. Signed-off-by: David S. Miller <[email protected]>
2011-03-01xfrm: Kill XFRM_LOOKUP_WAIT flag.David S. Miller1-2/+2
This can be determined from the flow flags instead. Signed-off-by: David S. Miller <[email protected]>
2011-03-01ipv6: Change final dst lookup arg name to "can_sleep"David S. Miller1-6/+6
Since it indicates whether we are invoked from a sleepable context or not. Signed-off-by: David S. Miller <[email protected]>
2011-03-01net: Add FLOWI_FLAG_CAN_SLEEP.David S. Miller1-0/+2
And set is in contexts where the route resolution can sleep. Signed-off-by: David S. Miller <[email protected]>
2011-03-01ipv6: Consolidate route lookup sequences.David S. Miller1-11/+69
Route lookups follow a general pattern in the ipv6 code wherein we first find the non-IPSEC route, potentially override the flow destination address due to ipv6 options settings, and then finally make an IPSEC search using either xfrm_lookup() or __xfrm_lookup(). __xfrm_lookup() is used when we want to generate a blackhole route if the key manager needs to resolve the IPSEC rules (in this case -EREMOTE is returned and the original 'dst' is left unchanged). Otherwise plain xfrm_lookup() is used and when asynchronous IPSEC resolution is necessary, we simply fail the lookup completely. All of these cases are encapsulated into two routines, ip6_dst_lookup_flow and ip6_sk_dst_lookup_flow. The latter of which handles unconnected UDP datagram sockets. Signed-off-by: David S. Miller <[email protected]>
2011-03-01inet: Remove unused sk_sndmsg_* from UFOHerbert Xu1-1/+0
UFO doesn't really use the sk_sndmsg_* parameters so touching them is pointless. It can't use them anyway since the whole point of UFO is to use the original pages without copying. Signed-off-by: Herbert Xu <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-02-28net: TX timestamps for IPv6 UDP packetsAnders Berggren1-0/+17
Enabling TX timestamps (SO_TIMESTAMPING) for IPv6 UDP packets, in the same fashion as for IPv4. Necessary in order for NICs such as Intel 82580 to timestamp IPv6 packets. Signed-off-by: Anders Berggren <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-02-25ipv6: totlen is declared and assigned but not usedHagen Paul Pfeifer1-3/+0
Signed-off-by: Hagen Paul Pfeifer <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-02-04inetpeer: Move ICMP rate limiting state into inet_peer entries.David S. Miller1-1/+4
Like metrics, the ICMP rate limiting bits are cached state about a destination. So move it into the inet_peer entries. If an inet_peer cannot be bound (the reason is memory allocation failure or similar), the policy is to allow. Signed-off-by: David S. Miller <[email protected]>
2011-01-12inet6: prevent network storms caused by linux IPv6 routersAlexey Kuznetsov1-0/+3
Linux IPv6 forwards unicast packets, which are link layer multicasts... The hole was present since day one. I was 100% this check is there, but it is not. The problem shows itself, f.e. when Microsoft Network Load Balancer runs on a network. This software resolves IPv6 unicast addresses to multicast MAC addresses. Signed-off-by: Alexey Kuznetsov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-12-19ipv6: Fragment locally generated tunnel-mode IPSec6 packets as needed.David Stevens1-10/+2
This patch modifies IPsec6 to fragment IPv6 packets that are locally generated as needed. This version of the patch only fragments in tunnel mode, so that fragment headers will not be obscured by ESP in transport mode. Signed-off-by: David L Stevens <[email protected]> Acked-by: Herbert Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-09-27Merge branch 'master' of ↵David S. Miller1-5/+13
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/qlcnic/qlcnic_init.c net/ipv4/ip_output.c
2010-09-23net: return operator cleanupEric Dumazet1-2/+2
Change "return (EXPR);" to "return EXPR;" return is not a function, parentheses are not required. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-09-21ip: fix truesize mismatch in ip fragmentationEric Dumazet1-5/+13
Special care should be taken when slow path is hit in ip_fragment() : When walking through frags, we transfert truesize ownership from skb to frags. Then if we hit a slow_path condition, we must undo this or risk uncharging frags->truesize twice, and in the end, having negative socket sk_wmem_alloc counter, or even freeing socket sooner than expected. Many thanks to Nick Bowler, who provided a very clean bug report and test program. Thanks to Jarek for reviewing my first patch and providing a V2 While Nick bisection pointed to commit 2b85a34e911 (net: No more expensive sock_hold()/sock_put() on each tx), underlying bug is older (2.6.12-rc5) A side effect is to extend work done in commit b2722b1c3a893e (ip_fragment: also adjust skb->truesize for packets not owned by a socket) to ipv6 as well. Reported-and-bisected-by: Nick Bowler <[email protected]> Tested-by: Nick Bowler <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> CC: Jarek Poplawski <[email protected]> CC: Patrick McHardy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-08-23net: Rename skb_has_frags to skb_has_frag_listDavid S. Miller1-1/+1
SKBs can be "fragmented" in two ways, via a page array (called skb_shinfo(skb)->frags[]) and via a list of SKBs (called skb_shinfo(skb)->frag_list). Since skb_has_frags() tests the latter, it's name is confusing since it sounds more like it's testing the former. Signed-off-by: David S. Miller <[email protected]>
2010-06-10net-next: remove useless union keywordChangli Gao1-19/+19
remove useless union keyword in rtable, rt6_info and dn_route. Since there is only one member in a union, the union keyword isn't useful. Signed-off-by: Changli Gao <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-05-28ipv6: Add GSO support on forwarding pathHerbert Xu1-1/+1
Currently we disallow GSO packets on the IPv6 forward path. This patch fixes this. Note that I discovered that our existing GSO MTU checks (e.g., IPv4 forwarding) are buggy in that they skip the check altogether, when they really should be checking gso_size + header instead. I have also been lazy here in that I haven't bothered to segment the GSO packet by hand before generating an ICMP message. Someone should add that to be 100% correct. Reported-by: Ralf Baechle <[email protected]> Signed-off-by: Herbert Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-05-11ipv6: ip6mr: support multiple tablesPatrick McHardy1-1/+1
This patch adds support for multiple independant multicast routing instances, named "tables". Userspace multicast routing daemons can bind to a specific table instance by issuing a setsockopt call using a new option MRT6_TABLE. The table number is stored in the raw socket data and affects all following ip6mr setsockopt(), getsockopt() and ioctl() calls. By default, a single table (RT6_TABLE_DFLT) is created with a default routing rule pointing to it. Newly created pim6reg devices have the table number appended ("pim6regX"), with the exception of devices created in the default table, which are named just "pim6reg" for compatibility reasons. Packets are directed to a specific table instance using routing rules, similar to how regular routing rules work. Currently iif, oif and mark are supported as keys, source and destination addresses could be supported additionally. Example usage: - bind pimd/xorp/... to a specific table: uint32_t table = 123; setsockopt(fd, SOL_IPV6, MRT6_TABLE, &table, sizeof(table)); - create routing rules directing packets to the new table: # ip -6 mrule add iif eth0 lookup 123 # ip -6 mrule add oif eth0 lookup 123 Signed-off-by: Patrick McHardy <[email protected]>
2010-05-10Merge branch 'master' of /repos/git/net-next-2.6Patrick McHardy1-13/+20
Conflicts: net/bridge/br_device.c net/bridge/br_forward.c Signed-off-by: Patrick McHardy <[email protected]>
2010-04-30ipv6: cleanup: remove unneeded null checkDan Carpenter1-2/+1
We dereference "sk" unconditionally elsewhere in the function. This was left over from: b30bd282 "ip6_xmit: remove unnecessary NULL ptr check". According to that commit message, "the sk argument to ip6_xmit is never NULL nowadays since the skb->priority assigment expects a valid socket." Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-04-27Merge branch 'master' of ↵David S. Miller1-1/+1
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/e100.c drivers/net/e1000e/netdev.c
2010-04-23IPv6: Complete IPV6_DONTFRAG supportBrian Haley1-8/+16
Finally add support to detect a local IPV6_DONTFRAG event and return the relevant data to the user if they've enabled IPV6_RECVPATHMTU on the socket. The next recvmsg() will return no data, but have an IPV6_PATHMTU as ancillary data. Signed-off-by: Brian Haley <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-04-23IPv6: Add dontfrag argument to relevant functionsBrian Haley1-1/+1
Add dontfrag argument to relevant functions for IPV6_DONTFRAG support, as well as allowing the value to be passed-in via ancillary cmsg data. Signed-off-by: Brian Haley <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-04-21ipv6: allow to send packet after receiving ICMPv6 Too Big message with MTU ↵Shan Wei1-1/+1
field less than IPV6_MIN_MTU According to RFC2460, PMTU is set to the IPv6 Minimum Link MTU (1280) and a fragment header should always be included after a node receiving Too Big message reporting PMTU is less than the IPv6 Minimum Link MTU. After receiving a ICMPv6 Too Big message reporting PMTU is less than the IPv6 Minimum Link MTU, sctp *can't* send any data/control chunk that total length including IPv6 head and IPv6 extend head is less than IPV6_MIN_MTU(1280 bytes). The failure occured in p6_fragment(), about reason see following(take SHUTDOWN chunk for example): sctp_packet_transmit (SHUTDOWN chunk, len=16 byte) |------sctp_v6_xmit (local_df=0) |------ip6_xmit |------ip6_output (dst_allfrag is ture) |------ip6_fragment In ip6_fragment(), for local_df=0, drops the the packet and returns EMSGSIZE. The patch fixes it with adding check length of skb->len. In this case, Ipv6 not to fragment upper protocol data, just only add a fragment header before it. Signed-off-by: Shan Wei <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-04-21Merge branch 'master' of ↵David S. Miller1-1/+1
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl-6000.c net/core/dev.c
2010-04-20Merge branch 'master' of /repos/git/net-next-2.6Patrick McHardy1-6/+3
Conflicts: Documentation/feature-removal-schedule.txt net/ipv6/netfilter/ip6t_REJECT.c net/netfilter/xt_limit.c Signed-off-by: Patrick McHardy <[email protected]>