aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2019-07-08selftests: forwarding: Test multipath hashing on inner IP pkts for GRE tunnelStephen Suryaputra4-0/+1220
Add selftest scripts for multipath hashing on inner IP pkts when there is a single GRE tunnel but there are multiple underlay routes to reach the other end of the tunnel. Four cases are covered in these scripts: - IPv4 inner, IPv4 outer - IPv6 inner, IPv4 outer - IPv4 inner, IPv6 outer - IPv6 inner, IPv6 outer Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: Stephen Suryaputra <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08ipv6: Support multipath hashing on inner IP pktsStephen Suryaputra2-0/+37
Make the same support as commit 363887a2cdfe ("ipv4: Support multipath hashing on inner IP pkts for GRE tunnel") for outer IPv6. The hashing considers both IPv4 and IPv6 pkts when they are tunneled by IPv6 GRE. Signed-off-by: Stephen Suryaputra <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08ipv4: Multipath hashing on inner L3 needs to consider inner IPv6 pktsStephen Suryaputra1-4/+17
Commit 363887a2cdfe ("ipv4: Support multipath hashing on inner IP pkts for GRE tunnel") supports multipath policy value of 2, Layer 3 or inner Layer 3 if present, but it only considers inner IPv4. There is a use case of IPv6 is tunneled by IPv4 GRE, thus add the ability to hash on inner IPv6 addresses. Fixes: 363887a2cdfe ("ipv4: Support multipath hashing on inner IP pkts for GRE tunnel") Signed-off-by: Stephen Suryaputra <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08net: pasemi: fix an use-after-free in pasemi_mac_phy_init()Wen Yang1-1/+1
The phy_dn variable is still being used in of_phy_connect() after the of_node_put() call, which may result in use-after-free. Fixes: 1dd2d06c0459 ("net: Rework pasemi_mac driver to use of_mdio infrastructure") Signed-off-by: Wen Yang <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Luis Chamberlain <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: David S. Miller <[email protected]>
2019-07-08Merge branch 'ras-core-for-linus' of ↵Linus Torvalds7-205/+269
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS updates from Ingo Molnar: "Boris is on vacation so I'm sending the RAS bits this time. The main changes were: - Various RAS/CEC improvements and fixes by Borislav Petkov: - error insertion fixes - offlining latency fix - memory leak fix - additional sanity checks - cleanups - debug output improvements - More SMCA enhancements by Yazen Ghannam: - make banks truly per-CPU which they are in the hardware - don't over-cache certain registers - make the number of MCA banks per-CPU variable The long term goal with these changes is to support future heterogenous SMCA extensions. - Misc fixes and improvements" * 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Do not check return value of debugfs_create functions x86/MCE: Determine MCA banks' init state properly x86/MCE: Make the number of MCA banks a per-CPU variable x86/MCE/AMD: Don't cache block addresses on SMCA systems x86/MCE: Make mce_banks a per-CPU array x86/MCE: Make struct mce_banks[] static RAS/CEC: Add copyright RAS/CEC: Add CONFIG_RAS_CEC_DEBUG and move CEC debug features there RAS/CEC: Dump the different array element sections RAS/CEC: Rename count_threshold to action_threshold RAS/CEC: Sanity-check array on every insertion RAS/CEC: Fix potential memory leak RAS/CEC: Do not set decay value on error RAS/CEC: Check count_threshold unconditionally RAS/CEC: Fix pfn insertion
2019-07-08net: axienet: fix a potential double free in axienet_probe()Wen Yang1-1/+0
There is a possible use-after-free issue in the axienet_probe(): 1701: np = of_parse_phandle(pdev->dev.of_node, "axistream-connected", 0); 1702: if (np) { ... 1787: of_node_put(np); ---> released here 1788: lp->eth_irq = platform_get_irq(pdev, 0); 1789: } else { ... 1801: } 1802: if (IS_ERR(lp->dma_regs)) { ... 1805: of_node_put(np); ---> double released here 1806: goto free_netdev; 1807: } We solve this problem by removing the unnecessary of_node_put(). Fixes: 28ef9ebdb64c ("net: axienet: make use of axistream-connected attribute optional") Signed-off-by: Wen Yang <[email protected]> Cc: Anirudha Sarangi <[email protected]> Cc: John Linn <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Michal Simek <[email protected]> Cc: Robert Hancock <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Reviewed-by: Robert Hancock <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08sunrpc/cache: remove the exporting of cache_seq_nextDenis Efremov1-1/+0
The function cache_seq_next is declared static and marked EXPORT_SYMBOL_GPL, which is at best an odd combination. Because the function is not used outside of the net/sunrpc/cache.c file it is defined in, this commit removes the EXPORT_SYMBOL_GPL() marking. Fixes: d48cf356a130 ("SUNRPC: Remove non-RCU protected lookup") Signed-off-by: Denis Efremov <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2019-07-08Merge branch 'locking-core-for-linus' of ↵Linus Torvalds55-2020/+2788
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in this cycle are: - rwsem scalability improvements, phase #2, by Waiman Long, which are rather impressive: "On a 2-socket 40-core 80-thread Skylake system with 40 reader and writer locking threads, the min/mean/max locking operations done in a 5-second testing window before the patchset were: 40 readers, Iterations Min/Mean/Max = 1,807/1,808/1,810 40 writers, Iterations Min/Mean/Max = 1,807/50,344/151,255 After the patchset, they became: 40 readers, Iterations Min/Mean/Max = 30,057/31,359/32,741 40 writers, Iterations Min/Mean/Max = 94,466/95,845/97,098" There's a lot of changes to the locking implementation that makes it similar to qrwlock, including owner handoff for more fair locking. Another microbenchmark shows how across the spectrum the improvements are: "With a locking microbenchmark running on 5.1 based kernel, the total locking rates (in kops/s) on a 2-socket Skylake system with equal numbers of readers and writers (mixed) before and after this patchset were: # of Threads Before Patch After Patch ------------ ------------ ----------- 2 2,618 4,193 4 1,202 3,726 8 802 3,622 16 729 3,359 32 319 2,826 64 102 2,744" The changes are extensive and the patch-set has been through several iterations addressing various locking workloads. There might be more regressions, but unless they are pathological I believe we want to use this new implementation as the baseline going forward. - jump-label optimizations by Daniel Bristot de Oliveira: the primary motivation was to remove IPI disturbance of isolated RT-workload CPUs, which resulted in the implementation of batched jump-label updates. Beyond the improvement of the real-time characteristics kernel, in one test this patchset improved static key update overhead from 57 msecs to just 1.4 msecs - which is a nice speedup as well. - atomic64_t cross-arch type cleanups by Mark Rutland: over the last ~10 years of atomic64_t existence the various types used by the APIs only had to be self-consistent within each architecture - which means they became wildly inconsistent across architectures. Mark puts and end to this by reworking all the atomic64 implementations to use 's64' as the base type for atomic64_t, and to ensure that this type is consistently used for parameters and return values in the API, avoiding further problems in this area. - A large set of small improvements to lockdep by Yuyang Du: type cleanups, output cleanups, function return type and othr cleanups all around the place. - A set of percpu ops cleanups and fixes by Peter Zijlstra. - Misc other changes - please see the Git log for more details" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (82 commits) locking/lockdep: increase size of counters for lockdep statistics locking/atomics: Use sed(1) instead of non-standard head(1) option locking/lockdep: Move mark_lock() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING x86/jump_label: Make tp_vec_nr static x86/percpu: Optimize raw_cpu_xchg() x86/percpu, sched/fair: Avoid local_clock() x86/percpu, x86/irq: Relax {set,get}_irq_regs() x86/percpu: Relax smp_processor_id() x86/percpu: Differentiate this_cpu_{}() and __this_cpu_{}() locking/rwsem: Guard against making count negative locking/rwsem: Adaptive disabling of reader optimistic spinning locking/rwsem: Enable time-based spinning on reader-owned rwsem locking/rwsem: Make rwsem->owner an atomic_long_t locking/rwsem: Enable readers spinning on writer locking/rwsem: Clarify usage of owner's nonspinaable bit locking/rwsem: Wake up almost all readers in wait queue locking/rwsem: More optimal RT task handling of null owner locking/rwsem: Always release wait_lock before waking up tasks locking/rwsem: Implement lock handoff to prevent lock starvation locking/rwsem: Make rwsem_spin_on_owner() return owner state ...
2019-07-09selftests/bpf: fix test_reuseport_array on s390Ilya Leoshkevich1-6/+15
Fix endianness issue: passing a pointer to 64-bit fd as a 32-bit key does not work on big-endian architectures. So cast fd to 32-bits when necessary. Signed-off-by: Ilya Leoshkevich <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2019-07-08net: stmmac: enable clause 45 mdio supportKweh Hock Leong2-8/+37
DWMAC4 is capable to support clause 45 mdio communication. This patch enable the feature on stmmac_mdio_write() and stmmac_mdio_read() by following phy_write_mmd() and phy_read_mmd() mdiobus read write implementation format. Reviewed-by: Li, Yifan <[email protected]> Signed-off-by: Kweh Hock Leong <[email protected]> Signed-off-by: Ong Boon Leong <[email protected]> Signed-off-by: Voon Weifeng <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08net: openvswitch: use netif_ovs_is_port() instead of opencodeTaehee Yoo2-4/+4
Use netif_ovs_is_port() function instead of open code. This patch doesn't change logic. Signed-off-by: Taehee Yoo <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08MAINTAINERS: Add page_pool maintainer entryJesper Dangaard Brouer1-0/+8
In this release cycle the number of NIC drivers using page_pool will likely reach 4 drivers. It is about time to add a maintainer entry. Add myself and Ilias. Signed-off-by: Jesper Dangaard Brouer <[email protected]> Acked-by: Ilias Apalodimas <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08Merge branch 'mvpp2-cls-ether'David S. Miller1-0/+6
Maxime Chevallier says: ==================== net: mvpp2: Add classification based on the ETHER flow This series adds support for classification of the ETHER flow in the mvpp2 driver. The first patch allows detecting when a user specifies a flow_type that isn't supported by the driver, while the second adds support for this flow_type by adding the mapping between the ETHER_FLOW enum value and the relevant classifier flow entries. ==================== Signed-off-by: David S. Miller <[email protected]>
2019-07-08net: mvpp2: cls: Add support for ETHER_FLOWMaxime Chevallier1-0/+2
Users can specify classification actions based on the 'ether' flow type. In that case, this will apply to all ethernet traffic, superseeding flows such as 'udp4' or 'tcp6'. Add support for this flow type in the PPv2 classifier, by mapping the ETHER_FLOW value to the corresponding entries in the classifier. Signed-off-by: Maxime Chevallier <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08net: mvpp2: cls: Report an error for unsupported flow typesMaxime Chevallier1-0/+4
Add a missing check to detect flow types that we don't support, so that user can be informed of this. Signed-off-by: Maxime Chevallier <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds61-466/+845
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The changes in this cycle are: - RCU flavor consolidation cleanups and optmizations - Documentation updates - Miscellaneous fixes - SRCU updates - RCU-sync flavor consolidation - Torture-test updates - Linux-kernel memory-consistency-model updates, most notably the addition of plain C-language accesses" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (61 commits) tools/memory-model: Improve data-race detection tools/memory-model: Change definition of rcu-fence tools/memory-model: Expand definition of barrier tools/memory-model: Do not use "herd" to refer to "herd7" tools/memory-model: Fix comment in MP+poonceonces.litmus Documentation: atomic_t.txt: Explain ordering provided by smp_mb__{before,after}_atomic() rcu: Don't return a value from rcu_assign_pointer() rcu: Force inlining of rcu_read_lock() rcu: Fix irritating whitespace error in rcu_assign_pointer() rcu: Upgrade sync_exp_work_done() to smp_mb() rcutorture: Upper case solves the case of the vanishing NULL pointer torture: Suppress propagating trace_printk() warning rcutorture: Dump trace buffer for callback pipe drain failures torture: Add --trust-make to suppress "make clean" torture: Make --cpus override idleness calculations torture: Run kernel build in source directory torture: Add function graph-tracing cheat sheet torture: Capture qemu output rcutorture: Tweak kvm options rcutorture: Add trivial RCU implementation ...
2019-07-08selftests: txring_overwrite: fix incorrect test of mmap() return valueFrank de Brabander1-1/+1
If mmap() fails it returns MAP_FAILED, which is defined as ((void *) -1). The current if-statement incorrectly tests if *ring is NULL. Fixes: 358be656406d ("selftests/net: add txring_overwrite") Signed-off-by: Frank de Brabander <[email protected]> Acked-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08Merge branch 'vsock-virtio-fixes'David S. Miller1-30/+104
Stefano Garzarella says: ==================== vsock/virtio: several fixes in the .probe() and .remove() During the review of "[PATCH] vsock/virtio: Initialize core virtio vsock before registering the driver", Stefan pointed out some possible issues in the .probe() and .remove() callbacks of the virtio-vsock driver. This series tries to solve these issues: - Patch 1 adds RCU critical sections to avoid use-after-free of 'the_virtio_vsock' pointer. - Patch 2 stops workers before to call vdev->config->reset(vdev) to be sure that no one is accessing the device. - Patch 3 moves the works flush at the end of the .remove() to avoid use-after-free of 'vsock' object. v3: - Patch 1: use rcu_dereference_protected() to get the_virtio_vosck value in the virtio_vsock_probe() [Jason] v2: https://patchwork.kernel.org/cover/11022343/ v1: https://patchwork.kernel.org/cover/10964733/ Before this series the guest crashes in a few second. After this series the test runs (~12h) without issues. Tested on an SMP guest (-smp 4 -monitor tcp:127.0.0.1:1234,server,nowait) with these scripts to stress the .probe()/.remove() path: - guest while true; do cat /dev/urandom | nc-vsock -l 4321 > /dev/null & cat /dev/urandom | nc-vsock -l 5321 > /dev/null & cat /dev/urandom | nc-vsock -l 6321 > /dev/null & cat /dev/urandom | nc-vsock -l 7321 > /dev/null & wait done - host while true; do cat /dev/urandom | nc-vsock 3 4321 > /dev/null & cat /dev/urandom | nc-vsock 3 5321 > /dev/null & cat /dev/urandom | nc-vsock 3 6321 > /dev/null & cat /dev/urandom | nc-vsock 3 7321 > /dev/null & sleep 2 echo "device_del v1" | nc 127.0.0.1 1234 sleep 1 echo "device_add vhost-vsock-pci,id=v1,guest-cid=3" | nc 127.0.0.1 1234 sleep 1 done ==================== Signed-off-by: David S. Miller <[email protected]>
2019-07-08vsock/virtio: fix flush of works during the .remove()Stefano Garzarella1-6/+9
This patch moves the flush of works after vdev->config->del_vqs(vdev), because we need to be sure that no workers run before to free the 'vsock' object. Since we stopped the workers using the [tx|rx|event]_run flags, we are sure no one is accessing the device while we are calling vdev->config->reset(vdev), so we can safely move the workers' flush. Before the vdev->config->del_vqs(vdev), workers can be scheduled by VQ callbacks, so we must flush them after del_vqs(), to avoid use-after-free of 'vsock' object. Suggested-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Stefano Garzarella <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08vsock/virtio: stop workers during the .remove()Stefano Garzarella1-1/+50
Before to call vdev->config->reset(vdev) we need to be sure that no one is accessing the device, for this reason, we add new variables in the struct virtio_vsock to stop the workers during the .remove(). This patch also add few comments before vdev->config->reset(vdev) and vdev->config->del_vqs(vdev). Suggested-by: Stefan Hajnoczi <[email protected]> Suggested-by: Michael S. Tsirkin <[email protected]> Signed-off-by: Stefano Garzarella <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08vsock/virtio: use RCU to avoid use-after-free on the_virtio_vsockStefano Garzarella1-24/+46
Some callbacks used by the upper layers can run while we are in the .remove(). A potential use-after-free can happen, because we free the_virtio_vsock without knowing if the callbacks are over or not. To solve this issue we move the assignment of the_virtio_vsock at the end of .probe(), when we finished all the initialization, and at the beginning of .remove(), before to release resources. For the same reason, we do the same also for the vdev->priv. We use RCU to be sure that all callbacks that use the_virtio_vsock ended before freeing it. This is not required for callbacks that use vdev->priv, because after the vdev->config->del_vqs() we are sure that they are ended and will no longer be invoked. We also take the mutex during the .remove() to avoid that .probe() can run while we are resetting the device. Signed-off-by: Stefano Garzarella <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08Merge branch 'b53-docs'David S. Miller3-0/+477
Benedikt Spranger says: ==================== Document the configuration of b53 this is the third round to document the configuration of a b53 supported switch. v3..v2: - fix a typo - improve b53 configuration in DSA_TAG_PROTO_NONE showcase. - grade up from RFC to patch for mainline inclusion. v1..v2: - split out generic parts of the configuration. - target comments by Andrew Lunn and Florian Fainelli. - make changes visible to build system ==================== Signed-off-by: David S. Miller <[email protected]>
2019-07-08Documentation: net: dsa: b53: Describe b53 configurationBenedikt Spranger2-0/+184
Document the different needs of documentation for the b53 driver. Signed-off-by: Benedikt Spranger <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08Documentation: net: dsa: Describe DSA switch configurationBenedikt Spranger2-0/+293
Document DSA tagged and VLAN based switch configuration by showcases. Signed-off-by: Benedikt Spranger <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08nfp: tls: fix error return code in nfp_net_tls_add()Wei Yongjun1-0/+1
Fix to return negative error code -EINVAL from the error handling case instead of 0, as done elsewhere in this function. Fixes: 1f35a56cf586 ("nfp: tls: add/delete TLS TX connections") Signed-off-by: Wei Yongjun <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08Merge branch 'bnxt_en-XDP_REDIRECT'David S. Miller6-29/+214
Michael Chan says: ==================== bnxt_en: Add XDP_REDIRECT support. This patch series adds XDP_REDIRECT support by Andy Gospodarek. ==================== Signed-off-by: David S. Miller <[email protected]>
2019-07-08bnxt_en: add page_pool supportAndy Gospodarek4-7/+47
This removes contention over page allocation for XDP_REDIRECT actions by adding page_pool support per queue for the driver. The performance for XDP_REDIRECT actions scales linearly with the number of cores performing redirect actions when using the page pools instead of the standard page allocator. v2: Fix up the error path from XDP registration, noted by Ilias Apalodimas. Signed-off-by: Andy Gospodarek <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08bnxt_en: optimized XDP_REDIRECT supportAndy Gospodarek4-10/+140
This adds basic support for XDP_REDIRECT in the bnxt_en driver. Next patch adds the more optimized page pool support. Signed-off-by: Andy Gospodarek <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08bnxt_en: Refactor __bnxt_xmit_xdp().Michael Chan4-13/+28
__bnxt_xmit_xdp() is used by XDP_TX and ethtool loopback packet transmit. Refactor it so that it can be re-used by the XDP_REDIRECT logic. Restructure the TX interrupt handler logic to cleanly separate XDP_TX logic in preparation for XDP_REDIRECT. Acked-by: Andy Gospodarek <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08bnxt_en: rename some xdp functionsAndy Gospodarek3-7/+7
Renaming bnxt_xmit_xdp to __bnxt_xmit_xdp to get ready for XDP_REDIRECT support and reduce confusion/namespace collision. Signed-off-by: Andy Gospodarek <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08Merge branch 'cpsw-Add-XDP-support'David S. Miller11-88/+640
Ivan Khoronzhuk says: ==================== net: ethernet: ti: cpsw: Add XDP support This patchset adds XDP support for TI cpsw driver and base it on page_pool allocator. It was verified on af_xdp socket drop, af_xdp l2f, ebpf XDP_DROP, XDP_REDIRECT, XDP_PASS, XDP_TX. It was verified with following configs enabled: CONFIG_JIT=y CONFIG_BPFILTER=y CONFIG_BPF_SYSCALL=y CONFIG_XDP_SOCKETS=y CONFIG_BPF_EVENTS=y CONFIG_HAVE_EBPF_JIT=y CONFIG_BPF_JIT=y CONFIG_CGROUP_BPF=y Link on previous v7: https://lkml.org/lkml/2019/7/4/715 Also regular tests with iperf2 were done in order to verify impact on regular netstack performance, compared with base commit: https://pastebin.com/JSMT0iZ4 v8..v9: - fix warnings on arm64 caused by typos in type casting v7..v8: - corrected dma calculation based on headroom instead of hard start - minor comment changes v6..v7: - rolled back to v4 solution but with small modification - picked up patch: https://www.spinics.net/lists/netdev/msg583145.html - added changes related to netsec fix and cpsw v5..v6: - do changes that is rx_dev while redirect/flush cycle is kept the same - dropped net: ethernet: ti: davinci_cpdma: return handler status - other changes desc in patches v4..v5: - added two plreliminary patches: net: ethernet: ti: davinci_cpdma: allow desc split while down net: ethernet: ti: cpsw_ethtool: allow res split while down - added xdp alocator refcnt on xdp level, avoiding page pool refcnt - moved flush status as separate argument for cpdma_chan_process - reworked cpsw code according to last changes to allocator - added missed statistic counter v3..v4: - added page pool user counter - use same pool for ndevs in dual mac - restructured page pool create/destroy according to the last changes in API v2..v3: - each rxq and ndev has its own page pool v1..v2: - combined xdp_xmit functions - used page allocation w/o refcnt juggle - unmapped page for skb netstack - moved rxq/page pool allocation to open/close pair - added several preliminary patches: net: page_pool: add helper function to retrieve dma addresses net: page_pool: add helper function to unmap dma addresses net: ethernet: ti: cpsw: use cpsw as drv data net: ethernet: ti: cpsw_ethtool: simplify slave loops ==================== Signed-off-by: David S. Miller <[email protected]>
2019-07-08net: ethernet: ti: cpsw: add XDP supportIvan Khoronzhuk4-59/+488
Add XDP support based on rx page_pool allocator, one frame per page. Page pool allocator is used with assumption that only one rx_handler is running simultaneously. DMA map/unmap is reused from page pool despite there is no need to map whole page. Due to specific of cpsw, the same TX/RX handler can be used by 2 network devices, so special fields in buffer are added to identify an interface the frame is destined to. Thus XDP works for both interfaces, that allows to test xdp redirect between two interfaces easily. Also, each rx queue have own page pools, but common for both netdevs. XDP prog is common for all channels till appropriate changes are added in XDP infrastructure. Also, once page_pool recycling becomes part of skb netstack some simplifications can be added, like removing page_pool_release_page() before skb receive. In order to keep rx_dev while redirect, that can be somehow used in future, do flush in rx_handler, that allows to keep rx dev the same while redirect. It allows to conform with tracing rx_dev pointed by Jesper. Also, there is probability, that XDP generic code can be extended to support multi ndev drivers like this one, using same rx queue for several ndevs, based on switchdev for instance or else. In this case, driver can be modified like exposed here: https://lkml.org/lkml/2019/7/3/243 Acked-by: Jesper Dangaard Brouer <[email protected]> Signed-off-by: Ivan Khoronzhuk <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08net: ethernet: ti: cpsw_ethtool: allow res split while downIvan Khoronzhuk1-2/+1
That's possible to set channel num while interfaces are down. When interface gets up it should resplit budget. This resplit can happen after phy is up but only if speed is changed, so should be set before this, for this allow it to happen while changing number of channels, when interfaces are down. Signed-off-by: Ivan Khoronzhuk <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08net: ethernet: ti: davinci_cpdma: allow desc split while downIvan Khoronzhuk3-9/+28
That's possible to set ring params while interfaces are down. When interface gets up it uses number of descs to fill rx queue and on later on changes to create rx pools. Usually, this resplit can happen after phy is up, but it can be needed before this, so allow it to happen while setting number of rx descs, when interfaces are down. Also, if no dependency on intf state, move it to cpdma layer, where it should be. Signed-off-by: Ivan Khoronzhuk <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08net: ethernet: ti: davinci_cpdma: add dma mapped submitIvan Khoronzhuk2-10/+83
In case if dma mapped packet needs to be sent, like with XDP page pool, the "mapped" submit can be used. This patch adds dma mapped submit based on regular one. Signed-off-by: Ivan Khoronzhuk <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08net: core: page_pool: add user refcnt and reintroduce page_pool_destroyIvan Khoronzhuk5-8/+40
Jesper recently removed page_pool_destroy() (from driver invocation) and moved shutdown and free of page_pool into xdp_rxq_info_unreg(), in-order to handle in-flight packets/pages. This created an asymmetry in drivers create/destroy pairs. This patch reintroduce page_pool_destroy and add page_pool user refcnt. This serves the purpose to simplify drivers error handling as driver now drivers always calls page_pool_destroy() and don't need to track if xdp_rxq_info_reg_mem_model() was unsuccessful. This could be used for a special cases where a single RX-queue (with a single page_pool) provides packets for two net_device'es, and thus needs to register the same page_pool twice with two xdp_rxq_info structures. This patch is primarily to ease API usage for drivers. The recently merged netsec driver, actually have a bug in this area, which is solved by this API change. This patch is a modified version of Ivan Khoronzhuk's original patch. Link: https://lore.kernel.org/netdev/[email protected]/ Fixes: 5c67bf0ec4d0 ("net: netsec: Use page_pool API") Signed-off-by: Jesper Dangaard Brouer <[email protected]> Reviewed-by: Ilias Apalodimas <[email protected]> Acked-by: Jesper Dangaard Brouer <[email protected]> Reviewed-by: Saeed Mahameed <[email protected]> Signed-off-by: Ivan Khoronzhuk <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08docs: automarkup.py: ignore exceptions when seeking for xrefsMauro Carvalho Chehab1-2/+10
When using the automarkup extension with: make pdfdocs without passing an specific book, the code will raise an exception: File "/devel/v4l/docs/Documentation/sphinx/automarkup.py", line 86, in auto_markup node.parent.replace(node, markup_funcs(name, app, node)) File "/devel/v4l/docs/Documentation/sphinx/automarkup.py", line 59, in markup_funcs 'function', target, pxref, lit_text) File "/devel/v4l/docs/sphinx_2.0/lib/python3.7/site-packages/sphinx/domains/c.py", line 308, in resolve_xref contnode, target) File "/devel/v4l/docs/sphinx_2.0/lib/python3.7/site-packages/sphinx/util/nodes.py", line 450, in make_refnode '#' + targetid) File "/devel/v4l/docs/sphinx_2.0/lib/python3.7/site-packages/sphinx/builders/latex/__init__.py", line 159, in get_relative_uri return self.get_target_uri(to, typ) File "/devel/v4l/docs/sphinx_2.0/lib/python3.7/site-packages/sphinx/builders/latex/__init__.py", line 152, in get_target_uri raise NoUri sphinx.environment.NoUri This happens because not all references will belong to a single PDF/LaTeX document. Better to just ignore those than breaking Sphinx build. Fixes: d74b0d31ddde ("Docs: An initial automarkup extension for sphinx") Signed-off-by: Mauro Carvalho Chehab <[email protected]> [jc: Narrowed the "except" and tweaked the comment] Signed-off-by: Jonathan Corbet <[email protected]>
2019-07-08docs: Move binderfs to admin-guideMatthew Wilcox (Oracle)3-10/+1
The documentation is more appropriate for the administrator than for the internal kernel API section it is currently in. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Acked-by: Christian Brauner <[email protected]> Signed-off-by: Jonathan Corbet <[email protected]>
2019-07-08nfc: fix potential illegal memory accessYang Wei1-1/+1
The frags_q is not properly initialized, it may result in illegal memory access when conn_info is NULL. The "goto free_exit" should be replaced by "goto exit". Signed-off-by: Yang Wei <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08macb: fix build warning for !CONFIG_OFArnd Bergmann1-2/+2
When CONFIG_OF is disabled, we get a harmless warning about the newly added variable: drivers/net/ethernet/cadence/macb_main.c:48:39: error: 'mgmt' defined but not used [-Werror=unused-variable] static struct sifive_fu540_macb_mgmt *mgmt; Move the variable closer to its use inside of the #ifdef. Fixes: c218ad559020 ("macb: Add support for SiFive FU540-C000") Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08gve: fix unused variable/label warningsArnd Bergmann1-33/+33
On unusual page sizes, we get harmless warnings: drivers/net/ethernet/google/gve/gve_rx.c:283:6: error: unused variable 'pagecount' [-Werror,-Wunused-variable] drivers/net/ethernet/google/gve/gve_rx.c:336:1: error: unused label 'have_skb' [-Werror,-Wunused-label] Change the preprocessor #if to regular if() to avoid this. Fixes: f5cedc84a30d ("gve: Add transmit and receive support") Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08net: stmmac: Re-work the queue selection for TSO packetsJose Abreu1-10/+18
Ben Hutchings says: "This is the wrong place to change the queue mapping. stmmac_xmit() is called with a specific TX queue locked, and accessing a different TX queue results in a data race for all of that queue's state. I think this commit should be reverted upstream and in all stable branches. Instead, the driver should implement the ndo_select_queue operation and override the queue mapping there." Fixes: c5acdbee22a1 ("net: stmmac: Send TSO packets always from Queue 0") Suggested-by: Ben Hutchings <[email protected]> Signed-off-by: Jose Abreu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds3-5/+12
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 pti updates from Thomas Gleixner: "The speculative paranoia departement delivers a few more plugs for possible (probably theoretical) spectre/mds leaks" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tls: Fix possible spectre-v1 in do_get_thread_area() x86/ptrace: Fix possible spectre-v1 in ptrace_get_debugreg() x86/speculation/mds: Eliminate leaks by trace_hardirqs_on()
2019-07-08sfc: Remove 'PCIE error reporting unavailable'Martin Habets1-5/+1
This is only at notice level but it was pointed out that no other driver does this. Also there is no action the user can take as it is really a property of the server. Signed-off-by: Martin Habets <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08Merge branch 'x86-timers-for-linus' of ↵Linus Torvalds3-519/+427
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 timer updates from Thomas Gleixner: "A rather large series consolidating the HPET code, which was triggered by the attempt to bolt HPET NMI watchdog support on to the existing maze with the usual duct tape and super glue approach. This mainly removes two separate partially redundant storage layers and consolidates them into a single one which provides a consistent view of the different HPET channels and their usage and allows to integrate HPET NMI watchdog support (if it turns out to be feasible) in a non intrusive way" * 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits) x86/hpet: Use channel for legacy clockevent storage x86/hpet: Use common init for legacy clockevent x86/hpet: Carve out shareable parts of init_one_hpet_msi_clockevent() x86/hpet: Consolidate clockevent functions x86/hpet: Wrap legacy clockevent in hpet_channel x86/hpet: Use cached info instead of extra flags x86/hpet: Move clockevents into channels x86/hpet: Rename variables to prepare for switching to channels x86/hpet: Add function to select a /dev/hpet channel x86/hpet: Add mode information to struct hpet_channel x86/hpet: Use cached channel data x86/hpet: Introduce struct hpet_base and struct hpet_channel x86/hpet: Coding style cleanup x86/hpet: Clean up comments x86/hpet: Make naming consistent x86/hpet: Remove not required includes x86/hpet: Decapitalize and rename EVT_TO_HPET_DEV x86/hpet: Simplify counter validation x86/hpet: Separate counter check out of clocksource register code x86/hpet: Shuffle code around for readability sake ...
2019-07-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller27-84/+757
Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter/IPVS updates for net-next: 1) Move bridge keys in nft_meta to nft_meta_bridge, from wenxu. 2) Support for bridge pvid matching, from wenxu. 3) Support for bridge vlan protocol matching, also from wenxu. 4) Add br_vlan_get_pvid_rcu(), to fetch the bridge port pvid from packet path. 5) Prefer specific family extension in nf_tables. 6) Autoload specific family extension in case it is missing. 7) Add synproxy support to nf_tables, from Fernando Fernandez Mancera. 8) Support for GRE encapsulation in IPVS, from Vadim Fedorenko. 9) ICMP handling for GRE encapsulation, from Julian Anastasov. 10) Remove unused parameter in nf_queue, from Florian Westphal. 11) Replace seq_printf() by seq_puts() in nf_log, from Markus Elfring. 12) Rename nf_SYNPROXY.h => nf_synproxy.h before this header becomes public. ==================== Signed-off-by: David S. Miller <[email protected]>
2019-07-08Merge branch 'x86-cpu-for-linus' of ↵Linus Torvalds29-88/+871
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 CPU feature updates from Thomas Gleixner: "Updates for x86 CPU features: - Support for UMWAIT/UMONITOR, which allows to use MWAIT and MONITOR instructions in user space to save power e.g. in HPC workloads which spin wait on synchronization points. The maximum time a MWAIT can halt in userspace is controlled by the kernel and can be adjusted by the sysadmin. - Speed up the MTRR handling code on CPUs which support cache self-snooping correctly. On those CPUs the wbinvd() invocations can be omitted which speeds up the MTRR setup by a factor of 50. - Support for the new x86 vendor Zhaoxin who develops processors based on the VIA Centaur technology. - Prevent 'cat /proc/cpuinfo' from affecting isolated NOHZ_FULL CPUs by sending IPIs to retrieve the CPU frequency and use the cached values instead. - The addition and late revert of the FSGSBASE support. The revert was required as it turned out that the code still has hard to diagnose issues. Yet another engineering trainwreck... - Small fixes, cleanups, improvements and the usual new Intel CPU family/model addons" * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits) x86/fsgsbase: Revert FSGSBASE support selftests/x86/fsgsbase: Fix some test case bugs x86/entry/64: Fix and clean up paranoid_exit x86/entry/64: Don't compile ignore_sysret if 32-bit emulation is enabled selftests/x86: Test SYSCALL and SYSENTER manually with TF set x86/mtrr: Skip cache flushes on CPUs with cache self-snooping x86/cpu/intel: Clear cache self-snoop capability in CPUs with known errata Documentation/ABI: Document umwait control sysfs interfaces x86/umwait: Add sysfs interface to control umwait maximum time x86/umwait: Add sysfs interface to control umwait C0.2 state x86/umwait: Initialize umwait control values x86/cpufeatures: Enumerate user wait instructions x86/cpu: Disable frequency requests via aperfmperf IPI for nohz_full CPUs x86/acpi/cstate: Add Zhaoxin processors support for cache flush policy in C3 ACPI, x86: Add Zhaoxin processors support for NONSTOP TSC x86/cpu: Create Zhaoxin processors architecture support file x86/cpu: Split Tremont based Atoms from the rest Documentation/x86/64: Add documentation for GS/FS addressing mode x86/elf: Enumerate kernel FSGSBASE capability in AT_HWCAP2 x86/cpu: Enable FSGSBASE on 64bit by default and add a chicken bit ...
2019-07-08net: netsec: Sync dma for device on buffer allocationIlias Apalodimas1-3/+1
cd1973a9215a ("net: netsec: Sync dma for device on buffer allocation") was merged on it's v1 instead of the v3. Merge the proper patch version Signed-off-by: Ilias Apalodimas <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-07-08Merge branch 'x86-fpu-for-linus' of ↵Linus Torvalds5-59/+29
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 FPU updates from Thomas Gleixner: "A small set of updates for the FPU code: - Make the no387/nofxsr command line options useful by restricting them to 32bit and actually clearing all dependencies to prevent random crashes and malfunction. - Simplify and cleanup the kernel_fpu_*() helpers" * 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/fpu: Inline fpu__xstate_clear_all_cpu_caps() x86/fpu: Make 'no387' and 'nofxsr' command line options useful x86/fpu: Remove the fpu__save() export x86/fpu: Simplify kernel_fpu_begin() x86/fpu: Simplify kernel_fpu_end()
2019-07-08Merge branch 'x86-entry-for-linus' of ↵Linus Torvalds6-53/+174
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 vsyscall updates from Thomas Gleixner: "Further hardening of the legacy vsyscall by providing support for execute only mode and switching the default to it. This prevents a certain class of attacks which rely on the vsyscall page being accessible at a fixed address in the canonical kernel address space" * 'x86-entry-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: selftests/x86: Add a test for process_vm_readv() on the vsyscall page x86/vsyscall: Add __ro_after_init to global variables x86/vsyscall: Change the default vsyscall mode to xonly selftests/x86/vsyscall: Verify that vsyscall=none blocks execution x86/vsyscall: Document odd SIGSEGV error code for vsyscalls x86/vsyscall: Show something useful on a read fault x86/vsyscall: Add a new vsyscall=xonly mode Documentation/admin: Remove the vsyscall=native documentation