aboutsummaryrefslogtreecommitdiff
path: root/net/core
AgeCommit message (Collapse)AuthorFilesLines
2011-01-24Merge branch 'master' of ↵David S. Miller2-4/+4
master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6
2011-01-24net: arp_ioctl() must hold RTNLEric Dumazet1-1/+2
Commit 941666c2e3e0 "net: RCU conversion of dev_getbyhwaddr() and arp_ioctl()" introduced a regression, reported by Jamie Heilman. "arp -Ds 192.168.2.41 eth0 pub" triggered the ASSERT_RTNL() assert in pneigh_lookup() Removing RTNL requirement from arp_ioctl() was a mistake, just revert that part. Reported-by: Jamie Heilman <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-21net: netif_setup_tc() is staticEric Dumazet1-1/+1
Signed-off-by: Eric Dumazet <[email protected]> Acked-by: John Fastabend <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-20rtnetlink: fix link attribute validation with IFLA_GROUPPatrick McHardy1-5/+8
rtnl_group_changelink() is invoked by rtnl_newlink() before the link attributes have been validated. Additionally the group changes are performed even if NLM_F_CREATE is specified and a new link is created, while more reasonable semantics would be to set the group value on the newly created link. Fix both problems by moving the rtnl_group_changelink() invocation down to the handling of non-existant links without NLM_F_CREATE() and add a dev_set_group() call to rtnl_create_link(). Signed-off-by: Patrick McHardy <[email protected]> Acked-by: Vlad Dogaru <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-20neigh: __rcu annotationsEric Dumazet1-6/+7
fix some minor issues and sparse (__rcu) warnings Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-20net_sched: RCU conversion of stabEric Dumazet1-3/+5
This patch converts stab qdisc management to RCU, so that we can perform the qdisc_calculate_pkt_len() call before getting qdisc lock. This shortens the lock's held time in __dev_xmit_skb(). This permits more qdiscs to get TCQ_F_CAN_BYPASS status, avoiding lot of cache misses and so reducing latencies. Signed-off-by: Eric Dumazet <[email protected]> CC: Patrick McHardy <[email protected]> CC: Jesper Dangaard Brouer <[email protected]> CC: Jarek Poplawski <[email protected]> CC: Jamal Hadi Salim <[email protected]> CC: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-20net: dev_close_many() is staticEric Dumazet1-1/+1
Signed-off-by: Eric Dumazet <[email protected]> CC: Octavian Purdila <[email protected]> Reviewed-by: Octavian Purdila <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-19net: implement mechanism for HW based QOSJohn Fastabend1-1/+54
This patch provides a mechanism for lower layer devices to steer traffic using skb->priority to tx queues. This allows for hardware based QOS schemes to use the default qdisc without incurring the penalties related to global state and the qdisc lock. While reliably receiving skbs on the correct tx ring to avoid head of line blocking resulting from shuffling in the LLD. Finally, all the goodness from txq caching and xps/rps can still be leveraged. Many drivers and hardware exist with the ability to implement QOS schemes in the hardware but currently these drivers tend to rely on firmware to reroute specific traffic, a driver specific select_queue or the queue_mapping action in the qdisc. By using select_queue for this drivers need to be updated for each and every traffic type and we lose the goodness of much of the upstream work. Firmware solutions are inherently inflexible. And finally if admins are expected to build a qdisc and filter rules to steer traffic this requires knowledge of how the hardware is currently configured. The number of tx queues and the queue offsets may change depending on resources. Also this approach incurs all the overhead of a qdisc with filters. With the mechanism in this patch users can set skb priority using expected methods ie setsockopt() or the stack can set the priority directly. Then the skb will be steered to the correct tx queues aligned with hardware QOS traffic classes. In the normal case with single traffic class and all queues in this class everything works as is until the LLD enables multiple tcs. To steer the skb we mask out the lower 4 bits of the priority and allow the hardware to configure upto 15 distinct classes of traffic. This is expected to be sufficient for most applications at any rate it is more then the 8021Q spec designates and is equal to the number of prio bands currently implemented in the default qdisc. This in conjunction with a userspace application such as lldpad can be used to implement 8021Q transmission selection algorithms one of these algorithms being the extended transmission selection algorithm currently being used for DCB. Signed-off-by: John Fastabend <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-19netlink: support setting devgroup parametersVlad Dogaru1-4/+28
If a rtnetlink request specifies a negative or zero ifindex and has no interface name attribute, but has a group attribute, then the chenges are made to all the interfaces belonging to the specified group. Signed-off-by: Vlad Dogaru <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-19net_device: add support for network device groupsVlad Dogaru2-0/+18
Net devices can now be grouped, enabling simpler manipulation from userspace. This patch adds a group field to the net_device structure, as well as rtnetlink support to query and modify it. Signed-off-by: Vlad Dogaru <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2-4/+4
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (41 commits) sctp: user perfect name for Delayed SACK Timer option net: fix can_checksum_protocol() arguments swap Revert "netlink: test for all flags of the NLM_F_DUMP composite" gianfar: Fix misleading indentation in startup_gfar() net/irda/sh_irda: return to RX mode when TX error net offloading: Do not mask out NETIF_F_HW_VLAN_TX for vlan. USB CDC NCM: tx_fixup() race condition fix ns83820: Avoid bad pointer deref in ns83820_init_one(). ipv6: Silence privacy extensions initialization bnx2x: Update bnx2x version to 1.62.00-4 bnx2x: Fix AER setting for BCM57712 bnx2x: Fix BCM84823 LED behavior bnx2x: Mark full duplex on some external PHYs bnx2x: Fix BCM8073/BCM8727 microcode loading bnx2x: LED fix for BCM8727 over BCM57712 bnx2x: Common init will be executed only once after POR bnx2x: Swap BCM8073 PHY polarity if required iwlwifi: fix valid chain reading from EEPROM ath5k: fix locking in tx_complete_poll_work ath9k_hw: do PA offset calibration only on longcal interval ...
2011-01-19net: fix can_checksum_protocol() arguments swapEric Dumazet1-1/+1
commit 0363466866d901fbc (net offloading: Convert checksums to use centrally computed features.) mistakenly swapped can_checksum_protocol() arguments. This broke IPv6 on bnx2 for instance, on NIC without TCPv6 checksum offloads. Reported-by: Hans de Bruin <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Acked-by: Jesse Gross <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-19Revert "netlink: test for all flags of the NLM_F_DUMP composite"David S. Miller1-1/+1
This reverts commit 0ab03c2b1478f2438d2c80204f7fef65b1bca9cf. It breaks several things including the avahi daemon. Signed-off-by: David S. Miller <[email protected]>
2011-01-18net: filter: dont block softirqs in sk_run_filter()Eric Dumazet1-3/+3
Packet filter (BPF) doesnt need to disable softirqs, being fully re-entrant and lock-less. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-18Merge branch 'master' of ↵David S. Miller1-2/+2
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
2011-01-18net offloading: Do not mask out NETIF_F_HW_VLAN_TX for vlan.Jesse Gross1-2/+2
In netif_skb_features() we return only the features that are valid for vlans if we have a vlan packet. However, we should not mask out NETIF_F_HW_VLAN_TX since it enables transmission of vlan tags and is obviously valid. Reported-by: Eric Dumazet <[email protected]> Signed-off-by: Jesse Gross <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2-29/+2
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (47 commits) GRETH: resolve SMP issues and other problems GRETH: handle frame error interrupts GRETH: avoid writing bad speed/duplex when setting transfer mode GRETH: fixed skb buffer memory leak on frame errors GRETH: GBit transmit descriptor handling optimization GRETH: fix opening/closing GRETH: added raw AMBA vendor/device number to match against. cassini: Fix build bustage on x86. e1000e: consistent use of Rx/Tx vs. RX/TX/rx/tx in comments/logs e1000e: update Copyright for 2011 e1000: Avoid unhandled IRQ r8169: keep firmware in memory. netdev: tilepro: Use is_unicast_ether_addr helper etherdevice.h: Add is_unicast_ether_addr function ks8695net: Use default implementation of ethtool_ops::get_link ks8695net: Disable non-working ethtool operations USB CDC NCM: Don't deref NULL in cdc_ncm_rx_fixup() and don't use uninitialized variable. vxge: Remember to release firmware after upgrading firmware netdev: bfin_mac: Remove is_multicast_ether_addr use in netdev_for_each_mc_addr ipsec: update MAX_AH_AUTH_LEN to support sha512 ...
2011-01-13net: remove dev_txq_stats_fold()Eric Dumazet1-29/+0
After recent changes, (percpu stats on vlan/tunnels...), we dont need anymore per struct netdev_queue tx_bytes/tx_packets/tx_dropped counters. Only remaining users are ixgbe, sch_teql, gianfar & macvlan : 1) ixgbe can be converted to use existing tx_ring counters. 2) macvlan incremented txq->tx_dropped, it can use the dev->stats.tx_dropped counter. 3) sch_teql : almost revert ab35cd4b8f42 (Use net_device internal stats) Now we have ndo_get_stats64(), use it, even for "unsigned long" fields (No need to bring back a struct net_device_stats) 4) gianfar adds a stats structure per tx queue to hold tx_bytes/tx_packets This removes a lockdep warning (and possible lockup) in rndis gadget, calling dev_get_stats() from hard IRQ context. Ref: http://www.spinics.net/lists/netdev/msg149202.html Reported-by: Neil Jones <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> CC: Jarek Poplawski <[email protected]> CC: Alexander Duyck <[email protected]> CC: Jeff Kirsher <[email protected]> CC: Sandeep Gopalpet <[email protected]> CC: Michal Nazarewicz <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds1-3/+3
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (46 commits) hwrng: via_rng - Fix memory scribbling on some CPUs crypto: padlock - Move padlock.h into include/crypto hwrng: via_rng - Fix asm constraints crypto: n2 - use __devexit not __exit in n2_unregister_algs crypto: mark crypto workqueues CPU_INTENSIVE crypto: mv_cesa - dont return PTR_ERR() of wrong pointer crypto: ripemd - Set module author and update email address crypto: omap-sham - backlog handling fix crypto: gf128mul - Remove experimental tag crypto: af_alg - fix af_alg memory_allocated data type crypto: aesni-intel - Fixed build with binutils 2.16 crypto: af_alg - Make sure sk_security is initialized on accept()ed sockets net: Add missing lockdep class names for af_alg include: Install linux/if_alg.h for user-space crypto API crypto: omap-aes - checkpatch --file warning fixes crypto: omap-aes - initialize aes module once per request crypto: omap-aes - unnecessary code removed crypto: omap-aes - error handling implementation improved crypto: omap-aes - redundant locking is removed crypto: omap-aes - DMA initialization fixes for OMAP off mode ...
2011-01-13Merge branch 'for-next' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits) Documentation/trace/events.txt: Remove obsolete sched_signal_send. writeback: fix global_dirty_limits comment runtime -> real-time ppc: fix comment typo singal -> signal drivers: fix comment typo diable -> disable. m68k: fix comment typo diable -> disable. wireless: comment typo fix diable -> disable. media: comment typo fix diable -> disable. remove doc for obsolete dynamic-printk kernel-parameter remove extraneous 'is' from Documentation/iostats.txt Fix spelling milisec -> ms in snd_ps3 module parameter description Fix spelling mistakes in comments Revert conflicting V4L changes i7core_edac: fix typos in comments mm/rmap.c: fix comment sound, ca0106: Fix assignment to 'channel'. hrtimer: fix a typo in comment init/Kconfig: fix typo anon_inodes: fix wrong function name in comment fix comment typos concerning "consistent" poll: fix a typo in comment ... Fix up trivial conflicts in: - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c) - fs/ext4/ext4.h Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.
2011-01-12Merge branch 'master' of git://1984.lsi.us.es/net-2.6David S. Miller1-0/+2
2011-01-12netfilter: fix compilation when conntrack is disabled but tproxy is enabledKOVACS Krisztian1-0/+2
The IPv6 tproxy patches split IPv6 defragmentation off of conntrack, but failed to update the #ifdef stanzas guarding the defragmentation related fields and code in skbuff and conntrack related code in nf_defrag_ipv6.c. This patch adds the required #ifdefs so that IPv6 tproxy can truly be used without connection tracking. Original report: http://marc.info/?l=linux-netdev&m=129010118516341&w=2 Reported-by: Randy Dunlap <[email protected]> Acked-by: Randy Dunlap <[email protected]> Signed-off-by: KOVACS Krisztian <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2011-01-10net_sched: factorize qdisc stats handlingEric Dumazet1-1/+4
HTB takes into account skb is segmented in stats updates. Generalize this to all schedulers. They should use qdisc_bstats_update() helper instead of manipulating bstats.bytes and bstats.packets Add bstats_update() helper too for classes that use gnet_stats_basic_packed fields. Note : Right now, TCQ_F_CAN_BYPASS shortcurt can be taken only if no stab is setup on qdisc. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-10net: Add alloc_netdev_mqs functionTom Herbert1-11/+21
Added alloc_netdev_mqs function which allows the number of transmit and receive queues to be specified independenty. alloc_netdev_mq was changed to a macro to call the new function. Also added alloc_etherdev_mqs with same purpose. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-09net offloading: Convert checksums to use centrally computed features.Jesse Gross1-28/+12
In order to compute the features for other offloads (primarily scatter/gather), we need to first check the ability of the NIC to offload the checksum for the packet. Since we have already computed this, we can directly use the result instead of figuring it out again. Signed-off-by: Jesse Gross <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-09net offloading: Convert skb_need_linearize() to use precomputed features.Jesse Gross1-15/+6
This switches skb_need_linearize() to use the features that have been centrally computed. In doing so, this fixes a problem where scatter/gather should not be used because the card does not support checksum offloading on that type of packet. On device registration we only check that some form of checksum offloading is available if scatter/gatther is enabled but we must also check at transmission time. Examples of this include IPv6 or vlan packets on a NIC that only supports IPv4 offloading. Signed-off-by: Jesse Gross <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-09net offloading: Convert dev_gso_segment() to use precomputed features.Jesse Gross1-5/+3
This switches dev_gso_segment() to use the device features computed by the centralized routine. In doing so, it fixes a problem where it would always use dev->features, instead of those appropriate to the number of vlan tags if any are present. Signed-off-by: Jesse Gross <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-09net offloading: Pass features into netif_needs_gso().Jesse Gross1-2/+6
Now that there is a single function that can compute the device features relevant to a packet, we don't want to run it for each offload. This converts netif_needs_gso() to take the features of the device, rather than computing them itself. Signed-off-by: Jesse Gross <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-09net offloading: Generalize netif_get_vlan_features().Jesse Gross1-8/+27
netif_get_vlan_features() is currently only used by netif_needs_gso(), so it only concerns itself with GSO features. However, several other places also should take into account the contents of the packet when deciding whether to offload to hardware. This generalizes the function to return features about all of the various forms of offloading. Since offloads tend to be linked together, this avoids duplicating the logic in each location (i.e. the scatter/gather code also needs the checksum logic). Suggested-by: Michał Mirosław <[email protected]> Signed-off-by: Jesse Gross <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-09net offloading: Accept NETIF_F_HW_CSUM for all protocols.Jesse Gross1-1/+1
We currently only have software fallback for one type of checksum: the TCP/UDP one's complement. This means that a protocol that uses hardware offloading for a different type of checksum (FCoE, SCTP) must directly check the device's features and do the right thing ahead of time. By the time we get to dev_can_checksum(), we're only deciding whether to apply the one algorithm in software or hardware. NETIF_F_HW_CSUM has the same capabilities as the software version, so we should always use it if present. The primary advantage of this is multiply tagged vlans can use hardware checksumming. Signed-off-by: Jesse Gross <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-09net: fix kernel-doc warning in core/filter.cRandy Dunlap1-1/+1
Fix new kernel-doc notation warning in net/core/filter.c: Warning(net/core/filter.c:172): No description found for parameter 'fentry' Warning(net/core/filter.c:172): Excess function parameter 'filter' description in 'sk_run_filter' Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-09netlink: test for all flags of the NLM_F_DUMP compositeJan Engelhardt1-1/+1
Due to NLM_F_DUMP is composed of two bits, NLM_F_ROOT | NLM_F_MATCH, when doing "if (x & NLM_F_DUMP)", it tests for _either_ of the bits being set. Because NLM_F_MATCH's value overlaps with NLM_F_EXCL, non-dump requests with NLM_F_EXCL set are mistaken as dump requests. Substitute the condition to test for _all_ bits being set. Signed-off-by: Jan Engelhardt <[email protected]> Acked-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-01-07Merge branch 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds1-1/+1
* 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (33 commits) usb: don't use flush_scheduled_work() speedtch: don't abuse struct delayed_work media/video: don't use flush_scheduled_work() media/video: explicitly flush request_module work ioc4: use static work_struct for ioc4_load_modules() init: don't call flush_scheduled_work() from do_initcalls() s390: don't use flush_scheduled_work() rtc: don't use flush_scheduled_work() mmc: update workqueue usages mfd: update workqueue usages dvb: don't use flush_scheduled_work() leds-wm8350: don't use flush_scheduled_work() mISDN: don't use flush_scheduled_work() macintosh/ams: don't use flush_scheduled_work() vmwgfx: don't use flush_scheduled_work() tpm: don't use flush_scheduled_work() sonypi: don't use flush_scheduled_work() hvsi: don't use flush_scheduled_work() xen: don't use flush_scheduled_work() gdrom: don't use flush_scheduled_work() ... Fixed up trivial conflict in drivers/media/video/bt8xx/bttv-input.c as per Tejun.
2011-01-06net: add POLLPRI to sock_def_readable()Eric Dumazet1-1/+1
Leonardo Chiquitto found poll() could block forever on tcp sockets and Urgent data was received, if the event flag only contains POLLPRI. He did a bisection and found commit 4938d7e0233 (poll: avoid extra wakeups in select/poll) was the source of the problem. Problem is TCP sockets use standard sock_def_readable() function for their sk_data_ready() handler, and sock_def_readable() doesnt signal POLLPRI. Only TCP is affected by the problem. Adding POLLPRI to the list of flags might trigger unnecessary schedules, but URGENT handling is such a seldom used feature this seems a good compromise. Thanks a lot to Leonardo for providing the bisection result and a test program as well. Reference : http://www.spinics.net/lists/netdev/msg151793.html Reported-and-bisected-by: Leonardo Chiquitto <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Tested-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-12-26Merge branch 'master' of ↵David S. Miller1-2/+1
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/ipv4/fib_frontend.c
2010-12-23Revert "ipv4: Allow configuring subnets as local addresses"David S. Miller1-2/+1
This reverts commit 4465b469008bc03b98a1b8df4e9ae501b6c69d4b. Conflicts: net/ipv4/fib_frontend.c As reported by Ben Greear, this causes regressions: > Change 4465b469008bc03b98a1b8df4e9ae501b6c69d4b caused rules > to stop matching the input device properly because the > FLOWI_FLAG_MATCH_ANY_IIF is always defined in ip_dev_find(). > > This breaks rules such as: > > ip rule add pref 512 lookup local > ip rule del pref 0 lookup local > ip link set eth2 up > ip -4 addr add 172.16.0.102/24 broadcast 172.16.0.255 dev eth2 > ip rule add to 172.16.0.102 iif eth2 lookup local pref 10 > ip rule add iif eth2 lookup 10001 pref 20 > ip route add 172.16.0.0/24 dev eth2 table 10001 > ip route add unreachable 0/0 table 10001 > > If you had a second interface 'eth0' that was on a different > subnet, pinging a system on that interface would fail: > > [root@ct503-60 ~]# ping 192.168.100.1 > connect: Invalid argument Reported-by: Ben Greear <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-12-22Merge branch 'master' into for-nextJiri Kosina9-64/+71
Conflicts: MAINTAINERS arch/arm/mach-omap2/pm24xx.c drivers/scsi/bfa/bfa_fcpim.c Needed to update to apply fixes for which the old branch was too outdated.
2010-12-21filter: optimize accesses to ancillary dataEric Dumazet1-28/+44
We can translate pseudo load instructions at filter check time to dedicated instructions to speed up filtering and avoid one switch(). libpcap currently uses SKF_AD_PROTOCOL, but custom filters probably use other ancillary accesses. Note : I made the assertion that ancillary data was always accessed with BPF_LD|BPF_?|BPF_ABS instructions, not with BPF_LD|BPF_?|BPF_IND ones (offset given by K constant, not by K + X register) On x86_64, this saves a few bytes of text : # size net/core/filter.o.* text data bss dec hex filename 4864 0 0 4864 1300 net/core/filter.o.new 4944 0 0 4944 1350 net/core/filter.o.old Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-12-21net: timestamp cloned packet in dev_queue_xmit_nitEric Dumazet1-7/+2
Le vendredi 17 décembre 2010 à 10:26 +0100, Eric Dumazet a écrit : > > I think we can add this after latest Changli patch : > > He does one skb_clone() before calling the sniffers. > We could set timestamp on this clone, instead of original skb. > > Problem solved. > [PATCH net-next-2.6] net: timestamp cloned packet in dev_queue_xmit_nit Now we do one clone of skb if at least one sniffer might take packet, we also can do the skb timestamping on the clone and let original packet unchanged. This is a generalization of commit 8caf153974f2 (net: sch_netem: Fix an inconsistency in ingress netem timestamps.) This way, we can have a good idea when packets are delivered to our stack (tcpdump -i ifb0), while a tcpdump on original device gives timestamps right before ingressing. This also speedup our stack, avoiding taking timestamps if not needed. Signed-off-by: Eric Dumazet <[email protected]> Cc: Changli Gao <[email protected]> Cc: Patrick McHardy <[email protected]> Cc: Jarek Poplawski <[email protected]> Acked-by: Changli Gao <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-12-20pktgen: Remove unnecessary prefix from pr_<level>Joe Perches1-2/+1
Remove "pktgen: " prefix string from one pr_info. pr_fmt adds it, so this is a duplicate. Signed-off-by: Joe Perches <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-12-19net: kill unused macrosShan Wei2-2/+0
These macros never be used, so remove them. Signed-off-by: Shan Wei <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-12-19net: increase skb->users instead of skb_clone()Changli Gao1-10/+20
In dev_queue_xmit_nit(), we have to clone skbs as we need to mangle skbs, however, we don't need to clone skbs for all the packet_types. Except for the first packet_type, we increase skb->users instead of skb_clone(). Signed-off-by: Changli Gao <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-12-17Merge branch 'master' of ↵David S. Miller2-14/+39
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/bnx2x/bnx2x.h drivers/net/wireless/iwlwifi/iwl-1000.c drivers/net/wireless/iwlwifi/iwl-6000.c drivers/net/wireless/iwlwifi/iwl-core.h drivers/vhost/vhost.c
2010-12-16net: Use skb_checksum_start_offset()Michał Mirosław2-4/+4
Replace skb->csum_start - skb_headroom(skb) with skb_checksum_start_offset(). Note for usb/smsc95xx: skb->data - skb->head == skb_headroom(skb). Signed-off-by: Michał Mirosław <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-12-16net: fix nulls list corruptions in sk_prot_allocOctavian Purdila1-12/+35
Special care is taken inside sk_port_alloc to avoid overwriting skc_node/skc_nulls_node. We should also avoid overwriting skc_bind_node/skc_portaddr_node. The patch fixes the following crash: BUG: unable to handle kernel paging request at fffffffffffffff0 IP: [<ffffffff812ec6dd>] udp4_lib_lookup2+0xad/0x370 [<ffffffff812ecc22>] __udp4_lib_lookup+0x282/0x360 [<ffffffff812ed63e>] __udp4_lib_rcv+0x31e/0x700 [<ffffffff812bba45>] ? ip_local_deliver_finish+0x65/0x190 [<ffffffff812bbbf8>] ? ip_local_deliver+0x88/0xa0 [<ffffffff812eda35>] udp_rcv+0x15/0x20 [<ffffffff812bba45>] ip_local_deliver_finish+0x65/0x190 [<ffffffff812bbbf8>] ip_local_deliver+0x88/0xa0 [<ffffffff812bb2cd>] ip_rcv_finish+0x32d/0x6f0 [<ffffffff8128c14c>] ? netif_receive_skb+0x99c/0x11c0 [<ffffffff812bb94b>] ip_rcv+0x2bb/0x350 [<ffffffff8128c14c>] netif_receive_skb+0x99c/0x11c0 Signed-off-by: Leonard Crestez <[email protected]> Signed-off-by: Octavian Purdila <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-12-16net: factorize sync-rcu call in unregister_netdevice_manyOctavian Purdila1-42/+76
Add dev_close_many and dev_deactivate_many to factorize another sync-rcu operation on the netdevice unregister path. $ modprobe dummy numdummies=10000 $ ip link set dev dummy* up $ time rmmod dummy Without the patch With the patch real 0m 24.63s real 0m 5.15s user 0m 0.00s user 0m 0.00s sys 0m 6.05s sys 0m 5.14s Signed-off-by: Octavian Purdila <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-12-16net: use NUMA_NO_NODE instead of the magic number -1Changli Gao2-2/+3
Signed-off-by: Changli Gao <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-12-16bnx2x: Take the distribution range definition out of skb_tx_hash()Vladislav Zolotarov1-5/+10
Move the calcualation of the Tx hash for a given hash range into a separate function and define the skb_tx_hash(), which calculates a Tx hash for a [0; dev->real_num_tx_queues - 1] hash values range, using this function (__skb_tx_hash()). Signed-off-by: Vladislav Zolotarov <[email protected]> Signed-off-by: Eilon Greenstein <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-12-15workqueue: convert cancel_rearming_delayed_work[queue]() users to ↵Tejun Heo1-1/+1
cancel_delayed_work_sync() cancel_rearming_delayed_work[queue]() has been superceded by cancel_delayed_work_sync() quite some time ago. Convert all the in-kernel users. The conversions are completely equivalent and trivial. Signed-off-by: Tejun Heo <[email protected]> Acked-by: "David S. Miller" <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Acked-by: Evgeniy Polyakov <[email protected]> Cc: Jeff Garzik <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Mauro Carvalho Chehab <[email protected]> Cc: [email protected] Cc: Anton Vorontsov <[email protected]> Cc: David Woodhouse <[email protected]> Cc: "J. Bruce Fields" <[email protected]> Cc: Neil Brown <[email protected]> Cc: Alex Elder <[email protected]> Cc: [email protected] Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Andrew Morton <[email protected]> Cc: [email protected] Cc: Trond Myklebust <[email protected]> Cc: [email protected]
2010-12-10net: fix skb_defer_rx_timestamp()Eric Dumazet1-2/+4
After commit c1f19b51d1d8 (net: support time stamping in phy devices.), kernel might crash if CONFIG_NETWORK_PHY_TIMESTAMPING=y and skb_defer_rx_timestamp() handles a packet without an ethernet header. Fixes kernel bugzilla #24102 Reference: https://bugzilla.kernel.org/show_bug.cgi?id=24102 Reported-and-tested-by: Andrew Watts <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>