aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2015-01-19net: ipv4: handle DSA enabled master network devicesFlorian Fainelli1-3/+3
The logic to configure a network interface for kernel IP auto-configuration is very simplistic, and does not handle the case where a device is stacked onto another such as with DSA. This causes the kernel not to open and configure the master network device in a DSA switch tree, and therefore slave network devices using this master network devices as conduit device cannot be open. This restriction comes from a check in net/dsa/slave.c, which is basically checking the master netdev flags for IFF_UP and returns -ENETDOWN if it is not the case. Automatically bringing-up DSA master network devices allows DSA slave network devices to be used as valid interfaces for e.g: NFS root booting by allowing kernel IP autoconfiguration to succeed on these interfaces. On the reverse path, make sure we do not attempt to close a DSA-enabled device as this would implicitely prevent the slave DSA network device from operating. Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-19ipv6: stop sending PTB packets for MTU < 1280Hagen Paul Pfeifer1-5/+2
Reduce the attack vector and stop generating IPv6 Fragment Header for paths with an MTU smaller than the minimum required IPv6 MTU size (1280 byte) - called atomic fragments. See IETF I-D "Deprecating the Generation of IPv6 Atomic Fragments" [1] for more information and how this "feature" can be misused. [1] https://tools.ietf.org/html/draft-ietf-6man-deprecate-atomfrag-generation-00 Signed-off-by: Fernando Gont <[email protected]> Signed-off-by: Hagen Paul Pfeifer <[email protected]> Acked-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-19rtnl: allow to create device with IFLA_LINK_NETNSID setNicolas Dichtel1-3/+22
This patch adds the ability to create a netdevice in a specified netns and then move it into the final netns. In fact, it allows to have a symetry between get and set rtnl messages. Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-19tunnels: advertise link netns via netlinkNicolas Dichtel8-0/+24
Implement rtnl_link_ops->get_link_net() callback so that IFLA_LINK_NETNSID is added to rtnetlink messages. Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-19rtnl: add link netns id to interface messagesNicolas Dichtel1-0/+13
This patch adds a new attribute (IFLA_LINK_NETNSID) which contains the 'link' netns id when this netns is different from the netns where the interface stands (for example for x-net interfaces like ip tunnels). With this attribute, it's possible to interpret correctly all advertised information (like IFLA_LINK, etc.). Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-19netns: add rtnl cmd to add and get peer netns idsNicolas Dichtel1-0/+211
With this patch, a user can define an id for a peer netns by providing a FD or a PID. These ids are local to the netns where it is added (ie valid only into this netns). The main function (ie the one exported to other module), peernet2id(), allows to get the id of a peer netns. If no id has been assigned by the user, this function allocates one. These ids will be used in netlink messages to point to a peer netns, for example in case of a x-netns interface. Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-19mac80211: delete the assoc/auth timer upon suspendEmmanuel Grumbach1-0/+12
While suspending, we destroy the authentication / association that might be taking place. While doing so, we forgot to delete the timer which can be firing after local->suspended is already set, producing the warning below. Fix that by deleting the timer. [66722.825487] WARNING: CPU: 2 PID: 5612 at net/mac80211/util.c:755 ieee80211_can_queue_work.isra.18+0x32/0x40 [mac80211]() [66722.825487] queueing ieee80211 work while going to suspend [66722.825529] CPU: 2 PID: 5612 Comm: kworker/u16:69 Tainted: G W O 3.16.1+ #24 [66722.825537] Workqueue: events_unbound async_run_entry_fn [66722.825545] Call Trace: [66722.825552] <IRQ> [<ffffffff817edbb2>] dump_stack+0x4d/0x66 [66722.825556] [<ffffffff81075cad>] warn_slowpath_common+0x7d/0xa0 [66722.825572] [<ffffffffa06b5b90>] ? ieee80211_sta_bcn_mon_timer+0x50/0x50 [mac80211] [66722.825573] [<ffffffff81075d1c>] warn_slowpath_fmt+0x4c/0x50 [66722.825586] [<ffffffffa06977a2>] ieee80211_can_queue_work.isra.18+0x32/0x40 [mac80211] [66722.825598] [<ffffffffa06977d5>] ieee80211_queue_work+0x25/0x50 [mac80211] [66722.825611] [<ffffffffa06b5bac>] ieee80211_sta_timer+0x1c/0x20 [mac80211] [66722.825614] [<ffffffff8108655a>] call_timer_fn+0x8a/0x300 Signed-off-by: Emmanuel Grumbach <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2015-01-19Revert "wireless: Support of IFLA_INFO_KIND rtnl attribute"Johannes Berg1-6/+0
This reverts commit ba1debdfed974f25aa598c283567878657b292ee. Oliver reported that it breaks network-manager, for some reason with this patch NM decides that the device isn't wireless but "generic" (ethernet), sees no carrier (as expected with wifi) and fails to do anything else with it. Revert this to unbreak userspace. Reported-by: Oliver Hartkopp <[email protected]> Tested-by: Oliver Hartkopp <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2015-01-19netfilter: nf_tables: validate hooks in NAT expressionsPablo Neira Ayuso5-50/+88
The user can crash the kernel if it uses any of the existing NAT expressions from the wrong hook, so add some code to validate this when loading the rule. This patch introduces nft_chain_validate_hooks() which is based on an existing function in the bridge version of the reject expression. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-01-19bridge: remove oflags from setlink/dellink.Rosen, Rami1-6/+2
Commit 02dba4388d16 ("bridge: fix setlink/dellink notifications") removed usage of oflags in both rtnl_bridge_setlink() and rtnl_bridge_dellink() methods. This patch removes this variable as it is no longer needed. Signed-off-by: Rami Rosen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-18netlink: Fix bugs in nlmsg_end() conversions.David S. Miller6-14/+11
Commit 053c095a82cf ("netlink: make nlmsg_end() and genlmsg_end() void") didn't catch all of the cases where callers were breaking out on the return value being equal to zero, which they no longer should when zero means success. Fix all such cases. Reported-by: Marcel Holtmann <[email protected]> Reported-by: Scott Feldman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-18netlink: make nlmsg_end() and genlmsg_end() voidJohannes Berg40-110/+187
Contrary to common expectations for an "int" return, these functions return only a positive value -- if used correctly they cannot even return 0 because the message header will necessarily be in the skb. This makes the very common pattern of if (genlmsg_end(...) < 0) { ... } be a whole bunch of dead code. Many places also simply do return nlmsg_end(...); and the caller is expected to deal with it. This also commonly (at least for me) causes errors, because it is very common to write if (my_function(...)) /* error condition */ and if my_function() does "return nlmsg_end()" this is of course wrong. Additionally, there's not a single place in the kernel that actually needs the message length returned, and if anyone needs it later then it'll be very easy to just use skb->len there. Remove this, and make the functions void. This removes a bunch of dead code as described above. The patch adds lines because I did - return nlmsg_end(...); + nlmsg_end(...); + return 0; I could have preserved all the function's return values by returning skb->len, but instead I've audited all the places calling the affected functions and found that none cared. A few places actually compared the return value with <= 0 in dump functionality, but that could just be changed to < 0 with no change in behaviour, so I opted for the more efficient version. One instance of the error I've made numerous times now is also present in net/phonet/pn_netlink.c in the route_dumpit() function - it didn't check for <0 or <=0 and thus broke out of the loop every single time. I've preserved this since it will (I think) have caused the messages to userspace to be formatted differently with just a single message for every SKB returned to userspace. It's possible that this isn't needed for the tools that actually use this, but I don't even know what they are so couldn't test that changing this behaviour would be acceptable. Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-18tipc: fix socket list regression in new nl apiRichard Alpe1-12/+18
Commit 07f6c4bc (tipc: convert tipc reference table to use generic rhashtable) introduced a problem with port listing in the new netlink API. It broke the resume functionality resulting in a never ending loop. This was caused by starting with the first hash table every time subsequently never returning an empty skb (terminating). This patch fixes the resume mechanism by keeping a logical reference to the last hash table along with a logical reference to the socket (port) that didn't fit in the previous message. Signed-off-by: Richard Alpe <[email protected]> Reviewed-by: Erik Hugne <[email protected]> Reviewed-by: Ying Xue <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-18Merge branch 'for-upstream' of ↵David S. Miller32-2377/+2385
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2015-01-16 Here are some more bluetooth & ieee802154 patches intended for 3.20: - Refactoring & cleanups of ieee802154 & 6lowpan code - Various fixes to the btmrvl driver - Fixes for Bluetooth Low Energy Privacy feature handling - Added build-time sanity checks for sockaddr sizes - Fixes for Security Manager registration on LE-only controllers - Refactoring of broken inquiry mode handling to a generic quirk Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <[email protected]>
2015-01-18net: replace br_fdb_external_learn_* calls with switchdev notifier eventsJiri Pirko3-35/+59
This patch benefits from newly introduced switchdev notifier and uses it to propagate fdb learn events from rocker driver to bridge. That avoids direct function calls and possible use by other listeners (ovs). Suggested-by: Thomas Graf <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: Scott Feldman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-18switchdev: introduce switchdev notifierJiri Pirko1-0/+65
This patch introduces new notifier for purposes of exposing events which happen on switch driver side. The consumers of the event messages are mainly involved masters, namely bridge and ovs. Suggested-by: Thomas Graf <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: Scott Feldman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-17socket: use ki_nbytes instead of iov_length()Nicolas Dichtel1-6/+4
This field already contains the length of the iovec, no need to calculate it again. Suggested-by: Al Viro <[email protected]> Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-17net: sctp: fix race for one-to-many sockets in sendmsg's auto associateDaniel Borkmann1-1/+7
I.e. one-to-many sockets in SCTP are not required to explicitly call into connect(2) or sctp_connectx(2) prior to data exchange. Instead, they can directly invoke sendmsg(2) and the SCTP stack will automatically trigger connection establishment through 4WHS via sctp_primitive_ASSOCIATE(). However, this in its current implementation is racy: INIT is being sent out immediately (as it cannot be bundled anyway) and the rest of the DATA chunks are queued up for later xmit when connection is established, meaning sendmsg(2) will return successfully. This behaviour can result in an undesired side-effect that the kernel made the application think the data has already been transmitted, although none of it has actually left the machine, worst case even after close(2)'ing the socket. Instead, when the association from client side has been shut down e.g. first gracefully through SCTP_EOF and then close(2), the client could afterwards still receive the server's INIT_ACK due to a connection with higher latency. This INIT_ACK is then considered out of the blue and hence responded with ABORT as there was no alive assoc found anymore. This can be easily reproduced f.e. with sctp_test application from lksctp. One way to fix this race is to wait for the handshake to actually complete. The fix defers waiting after sctp_primitive_ASSOCIATE() and sctp_primitive_SEND() succeeded, so that DATA chunks cooked up from sctp_sendmsg() have already been placed into the output queue through the side-effect interpreter, and therefore can then be bundeled together with COOKIE_ECHO control chunks. strace from example application (shortened): socket(PF_INET, SOCK_SEQPACKET, IPPROTO_SCTP) = 3 sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")}, msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5 sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")}, msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5 sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")}, msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5 sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")}, msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5 sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")}, msg_iov(0)=[], msg_controllen=48, {cmsg_len=48, cmsg_level=0x84 /* SOL_??? */, cmsg_type=, ...}, msg_flags=0}, 0) = 0 // graceful shutdown for SOCK_SEQPACKET via SCTP_EOF close(3) = 0 tcpdump before patch (fooling the application): 22:33:36.306142 IP 192.168.1.114.41462 > 192.168.1.115.8888: sctp (1) [INIT] [init tag: 3879023686] [rwnd: 106496] [OS: 10] [MIS: 65535] [init TSN: 3139201684] 22:33:36.316619 IP 192.168.1.115.8888 > 192.168.1.114.41462: sctp (1) [INIT ACK] [init tag: 3345394793] [rwnd: 106496] [OS: 10] [MIS: 10] [init TSN: 3380109591] 22:33:36.317600 IP 192.168.1.114.41462 > 192.168.1.115.8888: sctp (1) [ABORT] tcpdump after patch: 14:28:58.884116 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [INIT] [init tag: 438593213] [rwnd: 106496] [OS: 10] [MIS: 65535] [init TSN: 3092969729] 14:28:58.888414 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [INIT ACK] [init tag: 381429855] [rwnd: 106496] [OS: 10] [MIS: 10] [init TSN: 2141904492] 14:28:58.888638 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [COOKIE ECHO] , (2) [DATA] (B)(E) [TSN: 3092969729] [...] 14:28:58.893278 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [COOKIE ACK] , (2) [SACK] [cum ack 3092969729] [a_rwnd 106491] [#gap acks 0] [#dup tsns 0] 14:28:58.893591 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [DATA] (B)(E) [TSN: 3092969730] [...] 14:28:59.096963 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [SACK] [cum ack 3092969730] [a_rwnd 106496] [#gap acks 0] [#dup tsns 0] 14:28:59.097086 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [DATA] (B)(E) [TSN: 3092969731] [...] , (2) [DATA] (B)(E) [TSN: 3092969732] [...] 14:28:59.103218 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [SACK] [cum ack 3092969732] [a_rwnd 106486] [#gap acks 0] [#dup tsns 0] 14:28:59.103330 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [SHUTDOWN] 14:28:59.107793 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [SHUTDOWN ACK] 14:28:59.107890 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [SHUTDOWN COMPLETE] Looks like this bug is from the pre-git history museum. ;) Fixes: 08707d5482df ("lksctp-2_5_31-0_5_1.patch") Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Vlad Yasevich <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-17tc: cls_bpf: rename bpf_len to bpf_num_opsJiri Pirko1-9/+9
It was suggested by DaveM to change the name as "len" might indicate unit bytes. Suggested-by: David Miller <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-17tc: add BPF based actionJiri Pirko3-0/+218
This action provides a possibility to exec custom BPF code. Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-17bridge: fix setlink/dellink notificationsRoopa Prabhu2-24/+26
problems with bridge getlink/setlink notifications today: - bridge setlink generates two notifications to userspace - one from the bridge driver - one from rtnetlink.c (rtnl_bridge_notify) - dellink generates one notification from rtnetlink.c. Which means bridge setlink and dellink notifications are not consistent - Looking at the code it appears, If both BRIDGE_FLAGS_MASTER and BRIDGE_FLAGS_SELF were set, the size calculation in rtnl_bridge_notify can be wrong. Example: if you set both BRIDGE_FLAGS_MASTER and BRIDGE_FLAGS_SELF in a setlink request to rocker dev, rtnl_bridge_notify will allocate skb for one set of bridge attributes, but, both the bridge driver and rocker dev will try to add attributes resulting in twice the number of attributes being added to the skb. (rocker dev calls ndo_dflt_bridge_getlink) There are multiple options: 1) Generate one notification including all attributes from master and self: But, I don't think it will work, because both master and self may use the same attributes/policy. Cannot pack the same set of attributes in a single notification from both master and slave (duplicate attributes). 2) Generate one notification from master and the other notification from self (This seems to be ideal): For master: the master driver will send notification (bridge in this example) For self: the self driver will send notification (rocker in the above example. It can use helpers from rtnetlink.c to do so. Like the ndo_dflt_bridge_getlink api). This patch implements 2) (leaving the 'rtnl_bridge_notify' around to be used with 'self'). v1->v2 : - rtnl_bridge_notify is now called only for self, so, remove 'BRIDGE_FLAGS_SELF' check and cleanup a few things - rtnl_bridge_dellink used to always send a RTM_NEWLINK msg earlier. So, I have changed the notification from br_dellink to go as RTM_NEWLINK Signed-off-by: Roopa Prabhu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-16genetlink: synchronize socket closing and family removalJohannes Berg3-14/+27
In addition to the problem Jeff Layton reported, I looked at the code and reproduced the same warning by subscribing and removing the genl family with a socket still open. This is a fairly tricky race which originates in the fact that generic netlink allows the family to go away while sockets are still open - unlike regular netlink which has a module refcount for every open socket so in general this cannot be triggered. Trying to resolve this issue by the obvious locking isn't possible as it will result in deadlocks between unregistration and group unbind notification (which incidentally lockdep doesn't find due to the home grown locking in the netlink table.) To really resolve this, introduce a "closing socket" reference counter (for generic netlink only, as it's the only affected family) in the core netlink code and use that in generic netlink to wait for all the sockets that are being closed at the same time as a generic netlink family is removed. This fixes the race that when a socket is closed, it will should call the unbind, but if the family is removed at the same time the unbind will not find it, leading to the warning. The real problem though is that in this case the unbind could actually find a new family that is registered to have a multicast group with the same ID, and call its mcast_unbind() leading to confusing. Also remove the warning since it would still trigger, but is now no longer a problem. This also moves the code in af_netlink.c to before unreferencing the module to avoid having the same problem in the normal non-genl case. Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-16genetlink: disallow subscribing to unknown mcast groupsJohannes Berg1-1/+1
Jeff Layton reported that he could trigger the multicast unbind warning in generic netlink using trinity. I originally thought it was a race condition between unregistering the generic netlink family and closing the socket, but there's a far simpler explanation: genetlink currently allows subscribing to groups that don't (yet) exist, and the warning is triggered when unsubscribing again while the group still doesn't exist. Originally, I had a warning in the subscribe case and accepted it out of userspace API concerns, but the warning was of course wrong and removed later. However, I now think that allowing userspace to subscribe to groups that don't exist is wrong and could possibly become a security problem: Consider a (new) genetlink family implementing a permission check in the mcast_bind() function similar to the like the audit code does today; it would be possible to bypass the permission check by guessing the ID and subscribing to the group it exists. This is only possible in case a family like that would be dynamically loaded, but it doesn't seem like a huge stretch, for example wireless may be loaded when you plug in a USB device. To avoid this reject such subscription attempts. If this ends up causing userspace issues we may need to add a workaround in af_netlink to deny such requests but not return an error. Reported-by: Jeff Layton <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-16cfg80211: fix checking nl80211_send_station() return valueJohannes Berg1-1/+1
The return value from nl80211_send_station() is the length of the skb, or a negative error, so abort sending the message only when the return value was negative. This fixes the ibss_rsn wpa_supplicant test case. Reported-by: Jouni Malinen <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2015-01-16mac80211: remove doubled semicolonJohannes Berg1-1/+1
Signed-off-by: Johannes Berg <[email protected]>
2015-01-16Bluetooth: Remove unused functionRickard Strandqvist1-15/+0
Remove the function hci_conn_change_link_key() that is not used anywhere. This was partially found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <[email protected]> Signed-off-by: Johan Hedberg <[email protected]>
2015-01-16netlink: Fix netlink_insert EADDRINUSE errorHerbert Xu1-4/+7
The patch c5adde9468b0714a051eac7f9666f23eb10b61f7 ("netlink: eliminate nl_sk_hash_lock") introduced a bug where the EADDRINUSE error has been replaced by ENOMEM. This patch rectifies that problem. Signed-off-by: Herbert Xu <[email protected]> Acked-by: Ying Xue <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-16net: rps: fix cpu unplugEric Dumazet1-5/+15
softnet_data.input_pkt_queue is protected by a spinlock that we must hold when transferring packets from victim queue to an active one. This is because other cpus could still be trying to enqueue packets into victim queue. A second problem is that when we transfert the NAPI poll_list from victim to current cpu, we absolutely need to special case the percpu backlog, because we do not want to add complex locking to protect process_queue : Only owner cpu is allowed to manipulate it, unless cpu is offline. Based on initial patch from Prasad Sodagudi & Subash Abhinov Kasiviswanathan. This version is better because we do not slow down packet processing, only make migration safer. Reported-by: Prasad Sodagudi <[email protected]> Reported-by: Subash Abhinov Kasiviswanathan <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Cc: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-15ip: zero sockaddr returned on error queueWillem de Bruijn2-13/+5
The sockaddr is returned in IP(V6)_RECVERR as part of errhdr. That structure is defined and allocated on the stack as struct { struct sock_extended_err ee; struct sockaddr_in(6) offender; } errhdr; The second part is only initialized for certain SO_EE_ORIGIN values. Always initialize it completely. An MTU exceeded error on a SOCK_RAW/IPPROTO_RAW is one example that would return uninitialized bytes. Signed-off-by: Willem de Bruijn <[email protected]> ---- Also verified that there is no padding between errhdr.ee and errhdr.offender that could leak additional kernel data. Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-15bridge: use MDBA_SET_ENTRY_MAX for maxtype in nlmsg_parse()Nicolas Dichtel1-1/+1
This is just a cleanup, because in the current code MDBA_SET_ENTRY_MAX == MDBA_SET_ENTRY. Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-15Merge tag 'mac80211-for-davem-2015-01-15' of ↵David S. Miller2-23/+35
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Just two fixes - one for an uninialized variable and one for a deadlock in regulatory processing. Signed-off-by: David S. Miller <[email protected]>
2015-01-15Merge tag 'mac80211-next-for-davem-2015-01-15' of ↵David S. Miller36-476/+1101
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Here's a big pile of changes for this round. We have * a lot of regulatory code changes to deal with the way newer Intel devices handle this * a change to drop packets while disconnecting from an AP instead of trying to wait for them * a new attempt at improving the tailroom accounting to not kick in too much for performance reasons * improvements in wireless link statistics * many other small improvements and small fixes that didn't seem necessary for 3.19 (e.g. in hwsim which is testing only code) Conflicts: drivers/staging/rtl8723au/os_dep/ioctl_cfg80211.c Minor overlapping changes. Signed-off-by: David S. Miller <[email protected]>
2015-01-15ipv4: per cpu uncached listEric Dumazet1-13/+33
RAW sockets with hdrinc suffer from contention on rt_uncached_lock spinlock. One solution is to use percpu lists, since most routes are destroyed by the cpu that created them. It is unclear why we even have to put these routes in uncached_list, as all outgoing packets should be freed when a device is dismantled. Signed-off-by: Eric Dumazet <[email protected]> Fixes: caacf05e5ad1 ("ipv4: Properly purge netdev references on uncached routes.") Signed-off-by: David S. Miller <[email protected]>
2015-01-15cfg80211: change bandwidth reporting to explicit fieldJohannes Berg4-36/+84
For some reason, we made the bandwidth separate flags, which is rather confusing - a single rate cannot have different bandwidths at the same time. Change this to no longer be flags but use a separate field for the bandwidth ('bw') instead. While at it, add support for 5 and 10 MHz rates - these are reported as regular legacy rates with their real bitrate, but tagged as 5/10 now to make it easier to distinguish them. In the nl80211 API, the flags are preserved, but the code now can also clearly only set a single one of the flags. Signed-off-by: Johannes Berg <[email protected]>
2015-01-15svcrdma: Handle additional inline contentChuck Lever1-0/+55
Most NFS RPCs place their large payload argument at the end of the RPC header (eg, NFSv3 WRITE). For NFSv3 WRITE and SYMLINK, RPC/RDMA sends the complete RPC header inline, and the payload argument in the read list. Data in the read list is the last part of the XDR stream. One important case is not like this, however. NFSv4 COMPOUND is a counted array of operations. A WRITE operation, with its large data payload, can appear in the middle of the compound's operations array. Thus NFSv4 WRITE compounds can have header content after the WRITE payload. The Linux client, for example, performs an NFSv4 WRITE like this: { PUTFH, WRITE, GETATTR } Though RFC 5667 is not precise about this, the proper way to convey this compound is to place the GETATTR inline, _after_ the front of the RPC header. The receiver inserts the read list payload into the XDR stream after the initial WRITE arguments, and before the GETATTR operation, thanks to the value of the read list "position" field. The Linux client currently sends the GETATTR at the end of the RPC/RDMA read list, which is incorrect. It will be corrected in the future. The Linux server currently rejects NFSv4 compounds with inline content after the read list. For the above NFSv4 WRITE compound, the NFS compound header indicates there are three operations, but the server finds nonsense when it looks in the XDR stream for the third operation, and the compound fails with OP_ILLEGAL. Move trailing inline content to the end of the XDR buffer's page list. This presents incoming NFSv4 WRITE compounds to NFSD in the same way the socket transport does. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2015-01-15svcrdma: Move read list XDR round-up logicChuck Lever1-28/+9
This is a pre-requisite for a subsequent patch. Read list XDR round-up needs to be done _before_ additional inline content is copied to the end of the XDR buffer's page list. Move the logic added by commit e560e3b510d2 ("svcrdma: Add zero padding if the client doesn't send it"). Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2015-01-15svcrdma: Support RDMA_NOMSG requestsChuck Lever1-3/+36
Currently the Linux server can not decode RDMA_NOMSG type requests. Operations whose length exceeds the fixed size of RDMA SEND buffers, like large NFSv4 CREATE(NF4LNK) operations, must be conveyed via RDMA_NOMSG. For an RDMA_MSG type request, the client sends the RPC/RDMA, RPC headers, and some or all of the NFS arguments via RDMA SEND. For an RDMA_NOMSG type request, the client sends just the RPC/RDMA header via RDMA SEND. The request's read list contains elements for the entire RPC message, including the RPC header. NFSD expects the RPC/RMDA header and RPC header to be contiguous in page zero of the XDR buffer. Add logic in the RDMA READ path to make the read list contents land where the server prefers, when the incoming message is a type RDMA_NOMSG message. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2015-01-15svcrdma: rc_position sanity checkingChuck Lever1-4/+12
An RPC/RDMA client may send large RPC arguments via a read list. This is a list of scatter/gather elements which convey RPC call arguments too large to fit in a small RDMA SEND. Each entry in the read list has a "position" field, whose value is the byte offset in the XDR stream where the data in that entry is to be inserted. Entries which share the same "position" value make up the same RPC argument. The receiver inserts entries with the same position field value in list order into the XDR stream. Currently the Linux NFS/RDMA server cannot handle receiving read chunks in more than one position, mostly because no current client sends read lists with elements in more than one position. As a sanity check, ensure that all received chunks have the same "rc_position." Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2015-01-15svcrdma: Plant reader function in struct svcxprt_rdmaChuck Lever2-44/+29
The RDMA reader function doesn't change once an svcxprt_rdma is instantiated. Instead of checking sc_devcap during every incoming RPC, set the reader function once when the connection is accepted. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2015-01-15svcrdma: Find rmsgp more reliablyChuck Lever1-14/+4
xdr_start() can return the wrong rmsgp address if an assumption about how the xdr_buf was constructed changes. When it gets it wrong, the client receives a reply that has gibberish in the RPC/RDMA header, preventing it from matching a waiting RPC request. Instead, make (and document) just one assumption: that the RDMA header for the client's RPC call is at the start of the first page in rq_pages. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2015-01-15svcrdma: Scrub BUG_ON() and WARN_ON() call sitesChuck Lever3-33/+49
Current convention is to avoid using BUG_ON() in places where an oops could cause complete system failure. Replace BUG_ON() call sites in svcrdma with an assertion error message and allow execution to continue safely. Some BUG_ON() calls are removed because they have never fired in production (that we are aware of). Some WARN_ON() calls are also replaced where a back trace is not helpful; e.g., in a workqueue task. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2015-01-15svcrdma: Clean up read chunk countingChuck Lever2-19/+12
The byte_count argument is not used, and the function is called only from one place. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2015-01-15svcrdma: Remove unused variableChuck Lever1-2/+0
Nit: remove an unused variable to squelch a compiler warning. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2015-01-15svcrdma: Clean up dprintkChuck Lever1-4/+4
Nit: Fix inconsistent white space in dprintk messages. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2015-01-15Bluetooth: Add paranoid check for existing LE and BR/EDR SMP channelsMarcel Holtmann1-0/+12
When the SMP channels have been already registered, then print out a clear WARN_ON message that something went wrong. Also unregister the existing channels in this case before trying to register new ones. Signed-off-by: Marcel Holtmann <[email protected]> Signed-off-by: Johan Hedberg <[email protected]>
2015-01-15socket: use iov_length()Nicolas Dichtel1-10/+2
Better to use available helpers. Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-15Bluetooth: Fix lookup of fixed channels by local bdaddrJohan Hedberg1-8/+7
The comparing of chan->src should always be done against the local identity address, represented by hcon->src and hcon->src_type. This patch modifies l2cap_global_fixed_chan() to take the full hci_conn so that we can easily compare against hcon->src and hcon->src_type. Signed-off-by: Johan Hedberg <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2015-01-15Bluetooth: Add helpers for src/dst bdaddr type conversionJohan Hedberg1-12/+22
The current bdaddr_type() usage in l2cap_core.c is a bit funny in that it's always passed a hci_conn + a hci_conn member. Because of this only the hci_conn is really needed. Since the second parameter is always either hcon->src_type or hcon->dst type this patch adds two helper functions for each purpose: bdaddr_src_type() and bdaddr_dst_type(). Signed-off-by: Johan Hedberg <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2015-01-15cfg80211: remove 80+80 MHz rate reportingJohannes Berg2-5/+1
These rates are treated the same as 160 MHz in the spec, so it makes no sense to distinguish them. As no driver uses them yet, this is also not a problem, just remove them. In the userspace API the field remains reserved to preserve API and ABI. Signed-off-by: Johannes Berg <[email protected]>
2015-01-15mac80211: remove 80+80 MHz rate reportingJohannes Berg3-9/+0
These rates are treated the same as 160 MHz in the spec, so it makes no sense to distinguish them. As no driver uses them yet, this is also not a problem, just remove them. Signed-off-by: Johannes Berg <[email protected]>