aboutsummaryrefslogtreecommitdiff
path: root/net/core
AgeCommit message (Collapse)AuthorFilesLines
2012-07-18skbuff: Use correct allocation in skb_copy_ubufsKrishna Kumar1-5/+5
Use correct allocation flags during copy of user space fragments to the kernel. Also "improve" couple of for loops. Signed-off-by: Krishna Kumar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-07-17netpoll: move np->dev and np->dev_name init into __netpoll_setup()Jiri Pirko1-5/+5
Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-07-16net: cgroup: fix access the unallocated memory in netprio cgroupGao feng1-17/+54
there are some out of bound accesses in netprio cgroup. now before accessing the dev->priomap.priomap array,we only check if the dev->priomap exist.and because we don't want to see additional bound checkings in fast path, so we should make sure that dev->priomap is null or array size of dev->priomap.priomap is equal to max_prioidx + 1; so in write_priomap logic,we should call extend_netdev_table when dev->priomap is null and dev->priomap.priomap_len < max_len. and in cgrp_create->update_netdev_tables logic,we should call extend_netdev_table only when dev->priomap exist and dev->priomap.priomap_len < max_len. and it's not needed to call update_netdev_tables in write_priomap, we can only allocate the net device's priomap which we change through net_prio.ifpriomap. this patch also add a return value for update_netdev_tables & extend_netdev_table, so when new_priomap is allocated failed, write_priomap will stop to access the priomap,and return -ENOMEM back to the userspace to tell the user what happend. Change From v3: 1. add rtnl protect when reading max_prioidx in write_priomap. 2. only call extend_netdev_table when map->priomap_len < max_len, this will make sure array size of dev->map->priomap always bigger than any prioidx. 3. add a function write_update_netdev_table to make codes clear. Change From v2: 1. protect extend_netdev_table by RTNL. 2. when extend_netdev_table failed,call dev_put to reduce device's refcount. Signed-off-by: Gao feng <[email protected]> Cc: Neil Horman <[email protected]> Cc: Eric Dumazet <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-07-16net: make sock diag per-namespaceAndrey Vagin1-7/+20
Before this patch sock_diag works for init_net only and dumps information about sockets from all namespaces. This patch expands sock_diag for all name-spaces. It creates a netlink kernel socket for each netns and filters data during dumping. v2: filter accoding with netns in all places remove an unused variable. Cc: "David S. Miller" <[email protected]> Cc: Alexey Kuznetsov <[email protected]> Cc: James Morris <[email protected]> Cc: Hideaki YOSHIFUJI <[email protected]> Cc: Patrick McHardy <[email protected]> Cc: Pavel Emelyanov <[email protected]> CC: Eric Dumazet <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Andrew Vagin <[email protected]> Acked-by: Pavel Emelyanov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-07-16net: respect GFP_DMA in __netdev_alloc_skb()Eric Dumazet1-1/+1
Few drivers use GFP_DMA allocations, and netdev_alloc_frag() doesn't allocate pages in DMA zone. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-07-14net: feed /dev/random with the MAC address when registering a deviceTheodore Ts'o2-0/+4
Cc: David Miller <[email protected]> Cc: Linus Torvalds <[email protected]> Signed-off-by: "Theodore Ts'o" <[email protected]> Cc: [email protected]
2012-07-13net: Update alloc frag to reduce get/put page usage and recycle pagesAlexander Duyck1-8/+20
This patch is meant to help improve performance by reducing the number of locked operations required to allocate a frag on x86 and other platforms. This is accomplished by using atomic_set operations on the page count instead of calling get_page and put_page. It is based on work originally provided by Eric Dumazet. In addition it also helps to reduce memory overhead when using TCP. This is done by recycling the page if the only holder of the frame is the netdev_alloc_frag call itself. This can occur when skb heads are stolen by either GRO or TCP and the driver providing the packets is using paged frags to store all of the data for the packets. Cc: Eric Dumazet <[email protected]> Signed-off-by: Alexander Duyck <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-07-11tcp: TCP Small QueuesEric Dumazet1-0/+4
This introduce TSQ (TCP Small Queues) TSQ goal is to reduce number of TCP packets in xmit queues (qdisc & device queues), to reduce RTT and cwnd bias, part of the bufferbloat problem. sk->sk_wmem_alloc not allowed to grow above a given limit, allowing no more than ~128KB [1] per tcp socket in qdisc/dev layers at a given time. TSO packets are sized/capped to half the limit, so that we have two TSO packets in flight, allowing better bandwidth use. As a side effect, setting the limit to 40000 automatically reduces the standard gso max limit (65536) to 40000/2 : It can help to reduce latencies of high prio packets, having smaller TSO packets. This means we divert sock_wfree() to a tcp_wfree() handler, to queue/send following frames when skb_orphan() [2] is called for the already queued skbs. Results on my dev machines (tg3/ixgbe nics) are really impressive, using standard pfifo_fast, and with or without TSO/GSO. Without reduction of nominal bandwidth, we have reduction of buffering per bulk sender : < 1ms on Gbit (instead of 50ms with TSO) < 8ms on 100Mbit (instead of 132 ms) I no longer have 4 MBytes backlogged in qdisc by a single netperf session, and both side socket autotuning no longer use 4 Mbytes. As skb destructor cannot restart xmit itself ( as qdisc lock might be taken at this point ), we delegate the work to a tasklet. We use one tasklest per cpu for performance reasons. If tasklet finds a socket owned by the user, it sets TSQ_OWNED flag. This flag is tested in a new protocol method called from release_sock(), to eventually send new segments. [1] New /proc/sys/net/ipv4/tcp_limit_output_bytes tunable [2] skb_orphan() is usually called at TX completion time, but some drivers call it in their start_xmit() handler. These drivers should at least use BQL, or else a single TCP session can still fill the whole NIC TX ring, since TSQ will have no effect. Signed-off-by: Eric Dumazet <[email protected]> Cc: Dave Taht <[email protected]> Cc: Tom Herbert <[email protected]> Cc: Matt Mathis <[email protected]> Cc: Yuchung Cheng <[email protected]> Cc: Nandita Dukkipati <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-07-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2-5/+10
Conflicts: net/batman-adv/bridge_loop_avoidance.c net/batman-adv/bridge_loop_avoidance.h net/batman-adv/soft-interface.c net/mac80211/mlme.c With merge help from Antonio Quartulli (batman-adv) and Stephen Rothwell (drivers/net/usb/qmi_wwan.c). The net/mac80211/mlme.c conflict seemed easy enough, accounting for a conversion to some new tracing macros. Signed-off-by: David S. Miller <[email protected]>
2012-07-10net: Fix (nearly-)kernel-doc comments for various functionsBen Hutchings3-6/+9
Fix incorrect start markers, wrapped summary lines, missing section breaks, incorrect separators, and some name mismatches. Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-07-10net: Properly define functions with no parametersBen Hutchings1-1/+1
Defining a function with no parameters as 'T foo()' is the deprecated K&R style, and is not strictly equivalent to defining it as 'T foo(void)'. Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-07-10rtnetlink: Remove ts/tsage args to rtnl_put_cacheinfo().David S. Miller1-3/+1
Nobody provides non-zero values any longer. Signed-off-by: David S. Miller <[email protected]>
2012-07-10netprio_cgroup.c: fix comment typoLiu Bo1-1/+1
poitner -> pointer. Signed-off-by: Liu Bo <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2012-07-09net: cgroup: fix out of bounds accessesEric Dumazet2-4/+8
dev->priomap is allocated by extend_netdev_table() called from update_netdev_tables(). And this is only called if write_priomap() is called. But if write_priomap() is not called, it seems we can have out of bounds accesses in cgrp_destroy(), read_priomap() & skb_update_prio() With help from Gao Feng Signed-off-by: Eric Dumazet <[email protected]> Cc: Neil Horman <[email protected]> Cc: Gao feng <[email protected]> Acked-by: Gao feng <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-07-09cgroup: fix panic in netprio_cgroupGao feng1-1/+2
we set max_prioidx to the first zero bit index of prioidx_map in function get_prioidx. So when we delete the low index netprio cgroup and adding a new netprio cgroup again,the max_prioidx will be set to the low index. when we set the high index cgroup's net_prio.ifpriomap,the function write_priomap will call update_netdev_tables to alloc memory which size is sizeof(struct netprio_map) + sizeof(u32) * (max_prioidx + 1), so the size of array that map->priomap point to is max_prioidx +1, which is low than what we actually need. fix this by adding check in get_prioidx,only set max_prioidx when max_prioidx low than the new prioidx. Signed-off-by: Gao feng <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-07-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-0/+1
2012-07-05net-next: Add netif_get_num_default_rss_queuesYuval Mintz1-0/+11
Most multi-queue networking driver consider the number of online cpus when configuring RSS queues. This patch adds a wrapper to the number of cpus, setting an upper limit on the number of cpus a driver should consider (by default) when allocating resources for his queues. Signed-off-by: Yuval Mintz <[email protected]> Signed-off-by: Eilon Greenstein <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-07-05net: Kill dst->_neighbour, accessors, and final uses.David S. Miller1-18/+0
No longer used. Signed-off-by: David S. Miller <[email protected]>
2012-07-05neigh: Convert over to dst_neigh_lookup_skb().David S. Miller1-3/+16
Signed-off-by: David S. Miller <[email protected]>
2012-07-05net: Do delayed neigh confirmation.David S. Miller1-1/+2
When a dst_confirm() happens, mark the confirmation as pending in the dst. Then on the next packet out, when we have the neigh in-hand, do the update. This removes the dependency in dst_confirm() of dst's having an attached neigh. While we're here, remove the explicit 'dst' NULL check, all except 2 or 3 call sites ensure it's not NULL. So just fix those cases up. Signed-off-by: David S. Miller <[email protected]>
2012-07-05ipv4: Make neigh lookups directly in output packet path.David S. Miller1-5/+7
Do not use the dst cached neigh, we'll be getting rid of that. Signed-off-by: David S. Miller <[email protected]>
2012-07-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds1-2/+2
Pull networking update from David Miller: 1) Fix RX sequence number handling in mwifiex, from Stone Piao. 2) Netfilter ipset mis-compares device names, fix from Florian Westphal. 3) Fix route leak in ipv6 IPVS, from Eric Dumazet. 4) NFS fixes. Several buffer overflows in NCI layer from Dan Rosenberg, and release sock OOPS'er fix from Eric Dumazet. 5) Fix WEP handling ath9k, we started using a bit the chip provides to indicate undecrypted packets but that bit turns out to be unreliable in certain configurations. Fix from Felix Fietkau. 6) Fix Kconfig dependency bug in wlcore, from Randy Dunlap. 7) New USB IDs for rtlwifi driver from Larry Finger. 8) Fix crashes in qmi_wwan usbnet driver when disconnecting, from Bjørn Mork. 9) Gianfar driver programs coalescing settings properly in single queue mode, but does not do so in multi-queue mode. Fix from Claudiu Manoil. 10) Missing module.h include in davinci_cpdma.c, from Daniel Mack. 11) Need dummy handler for IPSET_CMD_NONE otherwise we crash in ipset if we get this via nfnetlink, fix from Tomasz Bursztyka. 12) Missing RCU unlock in nfnetlink error path, also from Tomasz. 13) Fix divide by zero in igbvf when the user tries to set an RX coalescing value of 0 usecs, from Mitch A Williams. 14) We can process SCTP sacks for the wrong transport, oops. Fix from Neil Horman. 15) Remove hw IP payload checksumming from e1000e driver. This has zery value in our stack, and turning it on creates a very unintuitive restriction for users when using jumbo MTUs. Specifically, when IP payload checksums are on you cannot use both receive hashing offload and jumbo MTU. Fix from Bruce Allan. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits) e1000e: remove use of IP payload checksum sctp: be more restrictive in transport selection on bundled sacks igbvf: fix divide by zero netfilter: nfnetlink: fix missing rcu_read_unlock in nfnetlink_rcv_msg netfilter: ipset: fix crash if IPSET_CMD_NONE command is sent davinci_cpdma: include linux/module.h gianfar: Fix RXICr/TXICr programming for multi-queue mode net: Downgrade CAP_SYS_MODULE deprecated message from error to warning. net: qmi_wwan: fix Oops while disconnecting mwifiex: fix memory leak associated with IE manamgement ath9k: fix panic caused by returning a descriptor we have queued for reuse mac80211: correct behaviour on unrecognised action frames ath9k: enable serialize_regmode for non-PCIE AR9287 rtlwifi: rtl8192cu: New USB IDs NFC: Return from rawsock_release when sk is NULL iwlwifi: fix activating inactive stations wlcore: drop INET dependency ath9k: fix dynamic WEP related regression NFC: Prevent multiple buffer overflows in NCI netfilter: update location of my trees ...
2012-07-03Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds1-0/+1
Pull block bits from Jens Axboe: "As vacation is coming up, thought I'd better get rid of my pending changes in my for-linus branch for this iteration. It contains: - Two patches for mtip32xx. Killing a non-compliant sysfs interface and moving it to debugfs, where it belongs. - A few patches from Asias. Two legit bug fixes, and one killing an interface that is no longer in use. - A patch from Jan, making the annoying partition ioctl warning a bit less annoying, by restricting it to !CAP_SYS_RAWIO only. - Three bug fixes for drbd from Lars Ellenberg. - A fix for an old regression for umem, it hasn't really worked since the plugging scheme was changed in 3.0. - A few fixes from Tejun. - A splice fix from Eric Dumazet, fixing an issue with pipe resizing." * 'for-linus' of git://git.kernel.dk/linux-block: scsi: Silence unnecessary warnings about ioctl to partition block: Drop dead function blk_abort_queue() block: Mitigate lock unbalance caused by lock switching block: Avoid missed wakeup in request waitqueue umem: fix up unplugging splice: fix racy pipe->buffers uses drbd: fix null pointer dereference with on-congestion policy when diskless drbd: fix list corruption by failing but already aborted reads drbd: fix access of unallocated pages and kernel panic xen/blkfront: Add WARN to deal with misbehaving backends. blkcg: drop local variable @q from blkg_destroy() mtip32xx: Create debugfs entries for troubleshooting mtip32xx: Remove 'registers' and 'flags' from sysfs blkcg: fix blkg_alloc() failure path block: blkcg_policy_cfq shouldn't be used if !CONFIG_CFQ_GROUP_IOSCHED block: fix return value on cfq_init() failure mtip32xx: Remove version.h header file inclusion xen/blkback: Copy id field when doing BLKIF_DISCARD.
2012-06-29netlink: add netlink_kernel_cfg parameter to netlink_kernel_createPablo Neira Ayuso2-4/+13
This patch adds the following structure: struct netlink_kernel_cfg { unsigned int groups; void (*input)(struct sk_buff *skb); struct mutex *cb_mutex; }; That can be passed to netlink_kernel_create to set optional configurations for netlink kernel sockets. I've populated this structure by looking for NULL and zero parameters at the existing code. The remaining parameters that always need to be set are still left in the original interface. That includes optional parameters for the netlink socket creation. This allows easy extensibility of this interface in the future. This patch also adapts all callers to use this new interface. Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-06-29ipv4: Elide fib_validate_source() completely when possible.David S. Miller1-0/+4
If rpfilter is off (or the SKB has an IPSEC path) and there are not tclassid users, we don't have to do anything at all when fib_validate_source() is invoked besides setting the itag to zero. We monitor tclassid uses with a counter (modified only under RTNL and marked __read_mostly) and we protect the fib_validate_source() real work with a test against this counter and whether rpfilter is to be done. Having a way to know whether we need no tclassid processing or not also opens the door for future optimized rpfilter algorithms that do not perform full FIB lookups. Signed-off-by: David S. Miller <[email protected]>
2012-06-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-2/+2
Conflicts: drivers/net/caif/caif_hsi.c drivers/net/usb/qmi_wwan.c The qmi_wwan merge was trivial. The caif_hsi.c, on the other hand, was not. It's a conflict between 1c385f1fdf6f9c66d982802cd74349c040980b50 ("caif-hsi: Replace platform device with ops structure.") in the net-next tree and commit 39abbaef19cd0a30be93794aa4773c779c3eb1f3 ("caif-hsi: Postpone init of HIS until open()") in the net tree. I did my best with that one and will ask Sjur to check it out. Signed-off-by: David S. Miller <[email protected]>
2012-06-28net: Downgrade CAP_SYS_MODULE deprecated message from error to warning.Vinson Lee1-2/+2
Make logging level consistent with other deprecation messages in net subsystem. Signed-off-by: Vinson Lee <[email protected]> Cc: David Mackey <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-06-27net: skb_free_datagram_locked() doesnt drop all packetsEric Dumazet1-1/+0
dropwatch wrongly diagnose all received UDP packets as drops. This patch removes trace_kfree_skb() done in skb_free_datagram_locked(). Locations calling skb_free_datagram_locked() should do it on their own. As a result, drops are accounted on the right function. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-06-27netlink: Get rid of obsolete rtnetlink macrosThomas Graf1-13/+0
Removes all RTA_GET*() and RTA_PUT*() variations, as well as the the unused rtattr_strcmp(). Get rid of rtm_get_table() by moving it to its only user decnet. Signed-off-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-06-27sock_diag: Do not use RTA_PUT() macrosThomas Graf1-9/+3
Signed-off-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-06-19ipv4: Early TCP socket demux.David S. Miller1-0/+5
Input packet processing for local sockets involves two major demuxes. One for the route and one for the socket. But we can optimize this down to one demux for certain kinds of local sockets. Currently we only do this for established TCP sockets, but it could at least in theory be expanded to other kinds of connections. If a TCP socket is established then it's identity is fully specified. This means that whatever input route was used during the three-way handshake must work equally well for the rest of the connection since the keys will not change. Once we move to established state, we cache the receive packet's input route to use later. Like the existing cached route in sk->sk_dst_cache used for output packets, we have to check for route invalidations using dst->obsolete and dst->ops->check(). Early demux occurs outside of a socket locked section, so when a route invalidation occurs we defer the fixup of sk->sk_rx_dst until we are actually inside of established state packet processing and thus have the socket locked. Signed-off-by: David S. Miller <[email protected]>
2012-06-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2-27/+7
Conflicts: net/ipv6/route.c This deals with a merge conflict between the net-next addition of the inetpeer network namespace ops, and Thomas Graf's bug fix in 2a0c451ade8e1783c5d453948289e4a978d417c9 which makes sure we don't register /proc/net/ipv6_route before it is actually safe to do so. Signed-off-by: David S. Miller <[email protected]>
2012-06-15net: remove skb_orphan_try()Eric Dumazet1-22/+1
Orphaning skb in dev_hard_start_xmit() makes bonding behavior unfriendly for applications sending big UDP bursts : Once packets pass the bonding device and come to real device, they might hit a full qdisc and be dropped. Without orphaning, the sender is automatically throttled because sk->sk_wmemalloc reaches sk->sk_sndbuf (assuming sk_sndbuf is not too big) We could try to defer the orphaning adding another test in dev_hard_start_xmit(), but all this seems of little gain, now that BQL tends to make packets more likely to be parked in Qdisc queues instead of NIC TX ring, in cases where performance matters. Reverts commits : fc6055a5ba31 net: Introduce skb_orphan_try() 87fd308cfc6b net: skb_tx_hash() fix relative to skb_orphan_try() and removes SKBTX_DRV_NEEDS_SK_REF flag Reported-and-bisected-by: Jean-Michel Hautbois <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Tested-by: Oliver Hartkopp <[email protected]> Acked-by: Oliver Hartkopp <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-06-13netpoll: fix netpoll_send_udp() bugsEric Dumazet1-5/+6
Bogdan Hamciuc diagnosed and fixed following bug in netpoll_send_udp() : "skb->len += len;" instead of "skb_put(skb, len);" Meaning that _if_ a network driver needs to call skb_realloc_headroom(), only packet headers would be copied, leaving garbage in the payload. However the skb_realloc_headroom() must be avoided as much as possible since it requires memory and netpoll tries hard to work even if memory is exhausted (using a pool of preallocated skbs) It appears netpoll_send_udp() reserved 16 bytes for the ethernet header, which happens to work for typicall drivers but not all. Right thing is to use LL_RESERVED_SPACE(dev) (And also add dev->needed_tailroom of tailroom) This patch combines both fixes. Many thanks to Bogdan for raising this issue. Reported-by: Bogdan Hamciuc <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Tested-by: Bogdan Hamciuc <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Neil Horman <[email protected]> Reviewed-by: Neil Horman <[email protected]> Reviewed-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-06-13splice: fix racy pipe->buffers usesEric Dumazet1-0/+1
Dave Jones reported a kernel BUG at mm/slub.c:3474! triggered by splice_shrink_spd() called from vmsplice_to_pipe() commit 35f3d14dbbc5 (pipe: add support for shrinking and growing pipes) added capability to adjust pipe->buffers. Problem is some paths don't hold pipe mutex and assume pipe->buffers doesn't change for their duration. Fix this by adding nr_pages_max field in struct splice_pipe_desc, and use it in place of pipe->buffers where appropriate. splice_shrink_spd() loses its struct pipe_inode_info argument. Reported-by: Dave Jones <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Tom Herbert <[email protected]> Cc: stable <[email protected]> # 2.6.35 Tested-by: Dave Jones <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2012-06-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller3-11/+9
Conflicts: MAINTAINERS drivers/net/wireless/iwlwifi/pcie/trans.c The iwlwifi conflict was resolved by keeping the code added in 'net' that turns off the buggy chip feature. The MAINTAINERS conflict was merely overlapping changes, one change updated all the wireless web site URLs and the other changed some GIT trees to be Johannes's instead of John's. Signed-off-by: David S. Miller <[email protected]>
2012-06-12ethtool: Make more commands available to unprivileged processesBen Hutchings1-0/+5
'Get' commands should generally not require CAP_NET_ADMIN, with the exception of those that expose internal state. Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-06-12net-next: add dev_loopback_xmit() to avoid duplicate codeMichel Machado1-0/+17
Add dev_loopback_xmit() in order to deduplicate functions ip_dev_loopback_xmit() (in net/ipv4/ip_output.c) and ip6_dev_loopback_xmit() (in net/ipv6/ip6_output.c). I was about to reinvent the wheel when I noticed that ip_dev_loopback_xmit() and ip6_dev_loopback_xmit() do exactly what I need and are not IP-only functions, but they were not available to reuse elsewhere. ip6_dev_loopback_xmit() does not have line "skb_dst_force(skb);", but I understand that this is harmless, and should be in dev_loopback_xmit(). Signed-off-by: Michel Machado <[email protected]> CC: "David S. Miller" <[email protected]> CC: Alexey Kuznetsov <[email protected]> CC: James Morris <[email protected]> CC: Hideaki YOSHIFUJI <[email protected]> CC: Patrick McHardy <[email protected]> CC: Eric Dumazet <[email protected]> CC: Jiri Pirko <[email protected]> CC: "Michał Mirosław" <[email protected]> CC: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-06-12Merge branch 'master' of ↵John W. Linville1-74/+0
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
2012-06-08net/core: fix kernel-doc warningsRandy Dunlap2-3/+3
Fix kernel-doc warnings in net/core: Warning(net/core/skbuff.c:3368): No description found for parameter 'delta_truesize' Warning(net/core/filter.c:628): No description found for parameter 'pfp' Warning(net/core/filter.c:628): Excess function parameter 'sk' description in 'sk_unattached_filter_create' Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-06-07Added kernel support in EEE Ethtool commandsYuval Mintz1-0/+40
This patch extends the kernel's ethtool interface by adding support for 2 new EEE commands - get_eee and set_eee. Thanks goes to Giuseppe Cavallaro for his original patch adding this support. Signed-off-by: Yuval Mintz <[email protected]> Signed-off-by: Eilon Greenstein <[email protected]> Reviewed-by: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-06-07net: Update kernel-doc for __alloc_skb()Ben Hutchings1-2/+2
__alloc_skb() now extends tailroom to allow the use of padding added by the heap allocator. Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-06-07net: neighbour: fix neigh_dump_info()Eric Dumazet1-8/+6
Denys found out "ip neigh" output was truncated to about 54 neighbours. Signed-off-by: Eric Dumazet <[email protected]> Reported-by: Denys Fedoryshchenko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-06-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-69/+33
2012-06-05wireless: remove wext sysfsJohannes Berg1-74/+0
The only user of this was hal prior to its 0.5.12 release which happened over two years ago, so I'm sure this can be removed without issues. Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: John W. Linville <[email protected]>
2012-06-04drop_monitor: dont sleep in atomic contextEric Dumazet1-69/+33
drop_monitor calls several sleeping functions while in atomic context. BUG: sleeping function called from invalid context at mm/slub.c:943 in_atomic(): 1, irqs_disabled(): 0, pid: 2103, name: kworker/0:2 Pid: 2103, comm: kworker/0:2 Not tainted 3.5.0-rc1+ #55 Call Trace: [<ffffffff810697ca>] __might_sleep+0xca/0xf0 [<ffffffff811345a3>] kmem_cache_alloc_node+0x1b3/0x1c0 [<ffffffff8105578c>] ? queue_delayed_work_on+0x11c/0x130 [<ffffffff815343fb>] __alloc_skb+0x4b/0x230 [<ffffffffa00b0360>] ? reset_per_cpu_data+0x160/0x160 [drop_monitor] [<ffffffffa00b022f>] reset_per_cpu_data+0x2f/0x160 [drop_monitor] [<ffffffffa00b03ab>] send_dm_alert+0x4b/0xb0 [drop_monitor] [<ffffffff810568e0>] process_one_work+0x130/0x4c0 [<ffffffff81058249>] worker_thread+0x159/0x360 [<ffffffff810580f0>] ? manage_workers.isra.27+0x240/0x240 [<ffffffff8105d403>] kthread+0x93/0xa0 [<ffffffff816be6d4>] kernel_thread_helper+0x4/0x10 [<ffffffff8105d370>] ? kthread_freezable_should_stop+0x80/0x80 [<ffffffff816be6d0>] ? gs_change+0xb/0xb Rework the logic to call the sleeping functions in right context. Use standard timer/workqueue api to let system chose any cpu to perform the allocation and netlink send. Also avoid a loop if reset_per_cpu_data() cannot allocate memory : use mod_timer() to wait 1/10 second before next try. Signed-off-by: Eric Dumazet <[email protected]> Cc: Neil Horman <[email protected]> Reviewed-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-06-04sock_diag: add SK_MEMINFO_BACKLOGEric Dumazet1-0/+1
Adding socket backlog len in INET_DIAG_SKMEMINFO is really useful to diagnose various TCP problems. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-05-31net: sock: validate data_len before allocating skb in sock_alloc_send_pskb()Jason Wang1-2/+5
We need to validate the number of pages consumed by data_len, otherwise frags array could be overflowed by userspace. So this patch validate data_len and return -EMSGSIZE when data_len may occupies more frags than MAX_SKB_FRAGS. Signed-off-by: Jason Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-05-29drop_monitor: Add module alias to enable automatic module loadingNeil Horman1-0/+1
Now that we have module alias macros for generic netlink families, lets use those to mark modules with the appropriate family names for loading Signed-off-by: Neil Horman <[email protected]> CC: Eric Dumazet <[email protected]> CC: David Miller <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2012-05-23Merge branch 'for-linus' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace enhancements from Eric Biederman: "This is a course correction for the user namespace, so that we can reach an inexpensive, maintainable, and reasonably complete implementation. Highlights: - Config guards make it impossible to enable the user namespace and code that has not been converted to be user namespace safe. - Use of the new kuid_t type ensures the if you somehow get past the config guards the kernel will encounter type errors if you enable user namespaces and attempt to compile in code whose permission checks have not been updated to be user namespace safe. - All uids from child user namespaces are mapped into the initial user namespace before they are processed. Removing the need to add an additional check to see if the user namespace of the compared uids remains the same. - With the user namespaces compiled out the performance is as good or better than it is today. - For most operations absolutely nothing changes performance or operationally with the user namespace enabled. - The worst case performance I could come up with was timing 1 billion cache cold stat operations with the user namespace code enabled. This went from 156s to 164s on my laptop (or 156ns to 164ns per stat operation). - (uid_t)-1 and (gid_t)-1 are reserved as an internal error value. Most uid/gid setting system calls treat these value specially anyway so attempting to use -1 as a uid would likely cause entertaining failures in userspace. - If setuid is called with a uid that can not be mapped setuid fails. I have looked at sendmail, login, ssh and every other program I could think of that would call setuid and they all check for and handle the case where setuid fails. - If stat or a similar system call is called from a context in which we can not map a uid we lie and return overflowuid. The LFS experience suggests not lying and returning an error code might be better, but the historical precedent with uids is different and I can not think of anything that would break by lying about a uid we can't map. - Capabilities are localized to the current user namespace making it safe to give the initial user in a user namespace all capabilities. My git tree covers all of the modifications needed to convert the core kernel and enough changes to make a system bootable to runlevel 1." Fix up trivial conflicts due to nearby independent changes in fs/stat.c * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits) userns: Silence silly gcc warning. cred: use correct cred accessor with regards to rcu read lock userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq userns: Convert cgroup permission checks to use uid_eq userns: Convert tmpfs to use kuid and kgid where appropriate userns: Convert sysfs to use kgid/kuid where appropriate userns: Convert sysctl permission checks to use kuid and kgids. userns: Convert proc to use kuid/kgid where appropriate userns: Convert ext4 to user kuid/kgid where appropriate userns: Convert ext3 to use kuid/kgid where appropriate userns: Convert ext2 to use kuid/kgid where appropriate. userns: Convert devpts to use kuid/kgid where appropriate userns: Convert binary formats to use kuid/kgid where appropriate userns: Add negative depends on entries to avoid building code that is userns unsafe userns: signal remove unnecessary map_cred_ns userns: Teach inode_capable to understand inodes whose uids map to other namespaces. userns: Fail exec for suid and sgid binaries with ids outside our user namespace. userns: Convert stat to return values mapped from kuids and kgids userns: Convert user specfied uids and gids in chown into kuids and kgid userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs ...