aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-07-24mailmap: add entry for Mike RapoportMike Rapoport1-0/+3
Add an entry to correct my email addresses. Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-07-24khugepaged: fix null-pointer dereference due to raceKirill A. Shutemov1-0/+3
khugepaged has to drop mmap lock several times while collapsing a page. The situation can change while the lock is dropped and we need to re-validate that the VMA is still in place and the PMD is still subject for collapse. But we miss one corner case: while collapsing an anonymous pages the VMA could be replaced with file VMA. If the file VMA doesn't have any private pages we get NULL pointer dereference: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] anon_vma_lock_write include/linux/rmap.h:120 [inline] collapse_huge_page mm/khugepaged.c:1110 [inline] khugepaged_scan_pmd mm/khugepaged.c:1349 [inline] khugepaged_scan_mm_slot mm/khugepaged.c:2110 [inline] khugepaged_do_scan mm/khugepaged.c:2193 [inline] khugepaged+0x3bba/0x5a10 mm/khugepaged.c:2238 The fix is to make sure that the VMA is anonymous in hugepage_vma_revalidate(). The helper is only used for collapsing anonymous pages. Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS") Reported-by: [email protected] Signed-off-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Acked-by: Yang Shi <[email protected]> Cc: <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-07-24mm/hugetlb: avoid hardcoding while checking if cma is enabledBarry Song1-5/+10
hugetlb_cma[0] can be NULL due to various reasons, for example, node0 has no memory. so NULL hugetlb_cma[0] doesn't necessarily mean cma is not enabled. gigantic pages might have been reserved on other nodes. This patch fixes possible double reservation and CMA leak. [[email protected]: fix CONFIG_CMA=n warning] [[email protected]: better checks before using hugetlb_cma] Link: http://lkml.kernel.org/r/[email protected] Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma") Signed-off-by: Barry Song <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Acked-by: Roman Gushchin <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-07-24mm: memcg/slab: fix memory leak at non-root kmem_cache destroyMuchun Song1-7/+28
If the kmem_cache refcount is greater than one, we should not mark the root kmem_cache as dying. If we mark the root kmem_cache dying incorrectly, the non-root kmem_cache can never be destroyed. It resulted in memory leak when memcg was destroyed. We can use the following steps to reproduce. 1) Use kmem_cache_create() to create a new kmem_cache named A. 2) Coincidentally, the kmem_cache A is an alias for kmem_cache B, so the refcount of B is just increased. 3) Use kmem_cache_destroy() to destroy the kmem_cache A, just decrease the B's refcount but mark the B as dying. 4) Create a new memory cgroup and alloc memory from the kmem_cache B. It leads to create a non-root kmem_cache for allocating memory. 5) When destroy the memory cgroup created in the step 4), the non-root kmem_cache can never be destroyed. If we repeat steps 4) and 5), this will cause a lot of memory leak. So only when refcount reach zero, we mark the root kmem_cache as dying. Fixes: 92ee383f6daa ("mm: fix race between kmem_cache destroy, create and deactivate") Signed-off-by: Muchun Song <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Acked-by: Roman Gushchin <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-07-24mm/memcg: fix refcount error while moving and swappingHugh Dickins1-2/+2
It was hard to keep a test running, moving tasks between memcgs with move_charge_at_immigrate, while swapping: mem_cgroup_id_get_many()'s refcount is discovered to be 0 (supposedly impossible), so it is then forced to REFCOUNT_SATURATED, and after thousands of warnings in quick succession, the test is at last put out of misery by being OOM killed. This is because of the way moved_swap accounting was saved up until the task move gets completed in __mem_cgroup_clear_mc(), deferred from when mem_cgroup_move_swap_account() actually exchanged old and new ids. Concurrent activity can free up swap quicker than the task is scanned, bringing id refcount down 0 (which should only be possible when offlining). Just skip that optimization: do that part of the accounting immediately. Fixes: 615d66c37c75 ("mm: memcontrol: fix memcg id ref counter on swap charge move") Signed-off-by: Hugh Dickins <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Alex Shi <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Alex Shi <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Michal Hocko <[email protected]> Cc: <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-07-24mm/memcontrol: fix OOPS inside mem_cgroup_get_nr_swap_pages()Bhupesh Sharma1-1/+8
Prabhakar reported an OOPS inside mem_cgroup_get_nr_swap_pages() function in a corner case seen on some arm64 boards when kdump kernel runs with "cgroup_disable=memory" passed to the kdump kernel via bootargs. The root-cause behind the same is that currently mem_cgroup_swap_init() function is implemented as a subsys_initcall() call instead of a core_initcall(), this means 'cgroup_memory_noswap' still remains set to the default value (false) even when memcg is disabled via "cgroup_disable=memory" boot parameter. This may result in premature OOPS inside mem_cgroup_get_nr_swap_pages() function in corner cases: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000188 Mem abort info: ESR = 0x96000006 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000006 CM = 0, WnR = 0 [0000000000000188] user address but active_mm is swapper Internal error: Oops: 96000006 [#1] SMP Modules linked in: <..snip..> Call trace: mem_cgroup_get_nr_swap_pages+0x9c/0xf4 shrink_lruvec+0x404/0x4f8 shrink_node+0x1a8/0x688 do_try_to_free_pages+0xe8/0x448 try_to_free_pages+0x110/0x230 __alloc_pages_slowpath.constprop.106+0x2b8/0xb48 __alloc_pages_nodemask+0x2ac/0x2f8 alloc_page_interleave+0x20/0x90 alloc_pages_current+0xdc/0xf8 atomic_pool_expand+0x60/0x210 __dma_atomic_pool_init+0x50/0xa4 dma_atomic_pool_init+0xac/0x158 do_one_initcall+0x50/0x218 kernel_init_freeable+0x22c/0x2d0 kernel_init+0x18/0x110 ret_from_fork+0x10/0x18 Code: aa1403e3 91106000 97f82a27 14000011 (f940c663) ---[ end trace 9795948475817de4 ]--- Kernel panic - not syncing: Fatal exception Rebooting in 10 seconds.. Fixes: eccb52e78809 ("mm: memcontrol: prepare swap controller setup for integration") Reported-by: Prabhakar Kushwaha <[email protected]> Signed-off-by: Bhupesh Sharma <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: James Morse <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Will Deacon <[email protected]> Cc: Catalin Marinas <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-07-24mm: initialize return of vm_insert_pagesTom Rix1-1/+1
clang static analysis reports a garbage return In file included from mm/memory.c:84: mm/memory.c:1612:2: warning: Undefined or garbage value returned to caller [core.uninitialized.UndefReturn] return err; ^~~~~~~~~~ The setting of err depends on a loop executing. So initialize err. Signed-off-by: Tom Rix <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-07-24vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right wayChengguang Xu2-2/+3
After commit fdc85222d58e ("kernfs: kvmalloc xattr value instead of kmalloc"), simple xattr entry is allocated with kvmalloc() instead of kmalloc(), so we should release it with kvfree() instead of kfree(). Fixes: fdc85222d58e ("kernfs: kvmalloc xattr value instead of kmalloc") Signed-off-by: Chengguang Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Hugh Dickins <[email protected]> Acked-by: Tejun Heo <[email protected]> Cc: Daniel Xu <[email protected]> Cc: Chris Down <[email protected]> Cc: Andreas Dilger <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Al Viro <[email protected]> Cc: <[email protected]> [5.7] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-07-24mm/mmap.c: close race between munmap() and expand_upwards()/downwards()Kirill A. Shutemov1-2/+14
VMA with VM_GROWSDOWN or VM_GROWSUP flag set can change their size under mmap_read_lock(). It can lead to race with __do_munmap(): Thread A Thread B __do_munmap() detach_vmas_to_be_unmapped() mmap_write_downgrade() expand_downwards() vma->vm_start = address; // The VMA now overlaps with // VMAs detached by the Thread A // page fault populates expanded part // of the VMA unmap_region() // Zaps pagetables partly // populated by Thread B Similar race exists for expand_upwards(). The fix is to avoid downgrading mmap_lock in __do_munmap() if detached VMAs are next to VM_GROWSDOWN or VM_GROWSUP VMA. [[email protected]: s/mmap_sem/mmap_lock/ in comment] Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap") Reported-by: Jann Horn <[email protected]> Signed-off-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Yang Shi <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: <[email protected]> [4.20+] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-07-24uprobes: Change handle_swbp() to send SIGTRAP with si_code=SI_KERNEL, to fix ↵Oleg Nesterov1-1/+1
GDB regression If a tracee is uprobed and it hits int3 inserted by debugger, handle_swbp() does send_sig(SIGTRAP, current, 0) which means si_code == SI_USER. This used to work when this code was written, but then GDB started to validate si_code and now it simply can't use breakpoints if the tracee has an active uprobe: # cat test.c void unused_func(void) { } int main(void) { return 0; } # gcc -g test.c -o test # perf probe -x ./test -a unused_func # perf record -e probe_test:unused_func gdb ./test -ex run GNU gdb (GDB) 10.0.50.20200714-git ... Program received signal SIGTRAP, Trace/breakpoint trap. 0x00007ffff7ddf909 in dl_main () from /lib64/ld-linux-x86-64.so.2 (gdb) The tracee hits the internal breakpoint inserted by GDB to monitor shared library events but GDB misinterprets this SIGTRAP and reports a signal. Change handle_swbp() to use force_sig(SIGTRAP), this matches do_int3_user() and fixes the problem. This is the minimal fix for -stable, arch/x86/kernel/uprobes.c is equally wrong; it should use send_sigtrap(TRAP_TRACE) instead of send_sig(SIGTRAP), but this doesn't confuse GDB and needs another x86-specific patch. Reported-by: Aaron Merey <[email protected]> Signed-off-by: Oleg Nesterov <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Srikar Dronamraju <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected]
2020-07-24sched: Warn if garbage is passed to default_wake_function()Chris Wilson1-0/+1
Since the default_wake_function() passes its flags onto try_to_wake_up(), warn if those flags collide with internal values. Given that the supplied flags are garbage, no repair can be done but at least alert the user to the damage they are causing. In the belief that these errors should be picked up during testing, the warning is only compiled in under CONFIG_SCHED_DEBUG. Signed-off-by: Chris Wilson <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-07-23vrf: Handle CONFIG_SYSCTL not setDavid Ahern2-61/+83
Randy reported compile failure when CONFIG_SYSCTL is not set/enabled: ERROR: modpost: "sysctl_vals" [drivers/net/vrf.ko] undefined! Fix by splitting out the sysctl init and cleanup into helpers that can be set to do nothing when CONFIG_SYSCTL is disabled. In addition, move vrf_strict_mode and vrf_strict_mode_change to above vrf_shared_table_handler (code move only) and wrap all of it in the ifdef CONFIG_SYSCTL. Update the strict mode tests to check for the existence of the /proc/sys entry. Fixes: 33306f1aaf82 ("vrf: add sysctl parameter for strict mode") Cc: Andrea Mayer <[email protected]> Reported-by: Randy Dunlap <[email protected]> Signed-off-by: David Ahern <[email protected]> Acked-by: Randy Dunlap <[email protected]> # build-tested Signed-off-by: David S. Miller <[email protected]>
2020-07-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller2-31/+22
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for net: 1) Fix NAT hook deletion when table is dormant, from Florian Westphal. 2) Fix IPVS sync stalls, from guodeqing. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-07-23ice: add 1G SGMII PHY typePaul M Stillwell Jr2-3/+15
There isn't a case for 1G SGMII in ice_get_media_type() so add the handling for it. Also handle the special case where some direct attach cables may report that they support 1G SGMII, but that is erroneous since SGMII is supposed to be a backplane media type (between a MAC and a PHY). If the driver doesn't handle this special case then a user could see the 'Port' in ethtool change from 'Direct attach Copper' to 'Backplane' when they have forced the speed to 1G, but the cable hasn't changed. Lastly, change ice_aq_get_phy_caps() to save the module_type info if the function was called with ICE_AQC_REPORT_TOPO_CAP. This call uses the media information to populate the module_type. If no media is present then the values in module_type will be 0. Signed-off-by: Paul M Stillwell Jr <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2020-07-23ice: Report AOC PHY Types as FiberDoug Dziggel1-0/+11
Report AOC types as fiber instead of unknown. Signed-off-by: Doug Dziggel <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2020-07-23ice: add AQC get link topology handle supportPaul Greenwalt2-1/+118
Add AQC get link topology handle support. This is needed to determine Direct Attach (DA) or backplane media type for PHY types that support either. Get link topology handle cage node type request can be used to determine if a cage is present or not. If a cage is present for PHY types that supports both DA and backplane media type, then the media type is DA, else the media type is backplane. Signed-off-by: Paul Greenwalt <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2020-07-23ice: Rename low_power_ctrlLev Faerman2-11/+11
Rename the low_power_ctrl field to low_power_ctrl_an to be properly descriptive of it being an autoneg field. Signed-off-by: Lev Faerman <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2020-07-23ice: update reporting of autoneg capabilitiesPaul Greenwalt5-6/+29
Firmware now reports AN28, AN32, and AN73. Add a helper and check these new values and report PHY autoneg capability. Signed-off-by: Paul Greenwalt <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2020-07-23ice: add ice_aq_get_phy_caps() debug logsPaul Greenwalt1-18/+50
Add debug logs for ice_aq_get_phy_caps(), and format ice_aq_set_phy_cfg() and ice_aq_get_link_info() debug logs to make them more readable. Signed-off-by: Paul Greenwalt <[email protected]> Signed-off-by: Paul M Stillwell Jr <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2020-07-23ice: support Total Port Shutdown on devices that support itBruce Allan3-1/+26
When the Port Disable bit is set in the Link Default Override Mask TLV PFA module in the NVM, Total Port Shutdown mode is supported and enabled. In this mode, the driver should act as if the link-down-on-close ethtool private flag is always enabled and dis-allow any change to that flag. Signed-off-by: Bruce Allan <[email protected]> Signed-off-by: Paul Greenwalt <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2020-07-23ice: add link lenient and default override supportPaul Greenwalt9-216/+628
Adds functions to check for link override firmware support and get the override settings for a port. The previously supported/default link mode was strict mode. In strict mode link is configured based on get PHY capabilities PHY types with media. Lenient mode is now the default link mode. In lenient mode the link is configured based on get PHY capabilities PHY types without media. This allows the user to configure link that the media does not report. Limit the minimum supported link mode to 25G for devices that support 100G, and 1G for devices that support less than 100G. Default override is only supported in lenient mode. If default override is supported and enabled, then default override values are used for configuring speed and FEC. Default override provide persistent link settings in the NVM. Signed-off-by: Paul Greenwalt <[email protected]> Signed-off-by: Evan Swanson <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2020-07-23geneve: fix an uninitialized value in geneve_changelink()Cong Wang1-1/+1
geneve_nl2info() sets 'df' conditionally, so we have to initialize it by copying the value from existing geneve device in geneve_changelink(). Fixes: 56c09de347e4 ("geneve: allow changing DF behavior after creation") Reported-by: [email protected] Cc: Sabrina Dubroca <[email protected]> Signed-off-by: Cong Wang <[email protected]> Reviewed-by: Sabrina Dubroca <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-23bonding: check return value of register_netdevice() in bond_newlink()Cong Wang1-2/+1
Very similar to commit 544f287b8495 ("bonding: check error value of register_netdevice() immediately"), we should immediately check the return value of register_netdevice() before doing anything else. Fixes: 005db31d5f5f ("bonding: set carrier off for devices created through netlink") Reported-and-tested-by: [email protected] Cc: Beniamino Galvani <[email protected]> Cc: Taehee Yoo <[email protected]> Cc: Jay Vosburgh <[email protected]> Signed-off-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-23ice: restore PHY settings on media insertionPaul Greenwalt7-95/+518
After the transition from no media to media FW will clear the set-phy-cfg data set by the user. Save initial PHY settings and any settings later requested by the user and use that data to restore PHY settings on media insertion. Since PHY configuration is now being stored, replace calls that were calling FW to get the configuration with the saved copy. Signed-off-by: Paul Greenwalt <[email protected]> Signed-off-by: Chinh T Cao <[email protected]> Signed-off-by: Paul M Stillwell Jr <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2020-07-23net: dsa: stop overriding master's ndo_get_phys_port_nameVladimir Oltean3-40/+0
The purpose of this override is to give the user an indication of what the number of the CPU port is (in DSA, the CPU port is a hardware implementation detail and not a network interface capable of traffic). However, it has always failed (by design) at providing this information to the user in a reliable fashion. Prior to commit 3369afba1e46 ("net: Call into DSA netdevice_ops wrappers"), the behavior was to only override this callback if it was not provided by the DSA master. That was its first failure: if the DSA master itself was a DSA port or a switchdev, then the user would not see the number of the CPU port in /sys/class/net/eth0/phys_port_name, but the number of the DSA master port within its respective physical switch. But that was actually ok in a way. The commit mentioned above changed that behavior, and now overrides the master's ndo_get_phys_port_name unconditionally. That comes with problems of its own, which are worse in a way. The idea is that it's typical for switchdev users to have udev rules for consistent interface naming. These are based, among other things, on the phys_port_name attribute. If we let the DSA switch at the bottom to start randomly overriding ndo_get_phys_port_name with its own CPU port, we basically lose any predictability in interface naming, or even uniqueness, for that matter. So, there are reasons to let DSA override the master's callback (to provide a consistent interface, a number which has a clear meaning and must not be interpreted according to context), and there are reasons to not let DSA override it (it breaks udev matching for the DSA master). But, there is an alternative method for users to retrieve the number of the CPU port of each DSA switch in the system: $ devlink port pci/0000:00:00.5/0: type eth netdev swp0 flavour physical port 0 pci/0000:00:00.5/2: type eth netdev swp2 flavour physical port 2 pci/0000:00:00.5/4: type notset flavour cpu port 4 spi/spi2.0/0: type eth netdev sw0p0 flavour physical port 0 spi/spi2.0/1: type eth netdev sw0p1 flavour physical port 1 spi/spi2.0/2: type eth netdev sw0p2 flavour physical port 2 spi/spi2.0/4: type notset flavour cpu port 4 spi/spi2.1/0: type eth netdev sw1p0 flavour physical port 0 spi/spi2.1/1: type eth netdev sw1p1 flavour physical port 1 spi/spi2.1/2: type eth netdev sw1p2 flavour physical port 2 spi/spi2.1/3: type eth netdev sw1p3 flavour physical port 3 spi/spi2.1/4: type notset flavour cpu port 4 So remove this duplicated, unreliable and troublesome method. From this patch on, the phys_port_name attribute of the DSA master will only contain information about itself (if at all). If the users need reliable information about the CPU port they're probably using devlink anyway. Signed-off-by: Vladimir Oltean <[email protected]> Acked-by: florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-23ice: move auto FEC checks into ice_cfg_phy_fec()Paul Greenwalt3-39/+40
The call to ice_cfg_phy_fec() requires the caller to perform certain actions before calling it. Instead of imposing these preconditions move the operations into the function and perform them ourselves. Also, fix some style issues in nearby touched code. Signed-off-by: Paul Greenwalt <[email protected]> Signed-off-by: Chinh T Cao <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2020-07-23ice: refactor FC functionsPaul Greenwalt1-29/+46
Create a helper function for configuring requested flow control so that it can be utilized by other functions looking to configure flow control settings. Utilize the existing helper ice_copy_phy_caps_to_cfg() to copy a PHY capability to configuration instead duplicating the code for it. Signed-off-by: Paul Greenwalt <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2020-07-23ice: Add advanced power mgmt for WoLAkeem G Abodunrin10-29/+458
Add callbacks needed to support advanced power management for Wake on LAN. Also make ice_pf_state_is_nominal function available for all configurations not just CONFIG_PCI_IOV. Signed-off-by: Akeem G Abodunrin <[email protected]> Signed-off-by: Jesse Brandeburg <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2020-07-23ice: split ice_discover_caps into two functionsJacob Keller1-55/+39
Using the new ice_aq_list_caps and ice_parse_(dev|func)_caps functions, replace ice_discover_caps with two functions that each take a pointer to the dev_caps and func_caps structures respectively. This makes the side effect of updating the hw->dev_caps and hw->func_caps obvious from reading the implementation of the function. Additionally, it opens the way for enabling reading of device capabilities outside of the initialization flow. By passing in a pointer, another caller will be able to read the capabilities without modifying the HW capabilities structures. As there are no other callers, it is safe to now remove ice_aq_discover_caps and ice_parse_caps. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2020-07-23ice: split ice_parse_caps into separate functionsJacob Keller1-170/+378
The ice_parse_caps function is used to convert the capability block data coming from firmware into a structured format used by other parts of the code. The current implementation directly updates the hw->func_caps and hw->dev_caps structures. It is directly called from within ice_aq_discover_caps. This causes the discover_caps function to have the side effect of modifying the HW capability structures, which is not intuitive. Split this function into ice_parse_dev_caps and ice_parse_func_caps. These functions will take a pointer to the dev_caps and func_caps respectively. Also create an ice_parse_common_caps for sharing the capability logic that is common to device and function. Doing so enables a future refactor to allow reading and parsing capabilities into a local caps structure instead of modifying the members of the HW structure directly. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2020-07-23ice: refactor ice_discover_caps to avoid need to retryJacob Keller1-31/+12
The ice_discover_caps function is used to read the device and function capabilities, updating the hardware capabilities structures with relevant data. The exact number of capabilities returned by the hardware is unknown ahead of time. The AdminQ command will report the total number of capabilities in the return buffer. The current implementation involves requesting capabilities once, reading this returned size, and then re-requested with that size. This isn't really necessary. The firmware interface has a maximum size of ICE_AQ_MAX_BUF_LEN. Firmware can never return more than ICE_AQ_MAX_BUF_LEN / sizeof(struct ice_aqc_list_caps_elem) capabilities. Avoid the retry loop by simply allocating a buffer of size ICE_AQ_MAX_BUF_LEN. This is significantly simpler than retrying. The extra allocation isn't a big deal, as it will be released after we finish parsing the capabilities. Signed-off-by: Jacob Keller <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2020-07-23Revert "cifs: Fix the target file was deleted when rename failed."Steve French1-8/+2
This reverts commit 9ffad9263b467efd8f8dc7ae1941a0a655a2bab2. Upon additional testing with older servers, it was found that the original commit introduced a regression when using the old SMB1 dialect and rsyncing over an existing file. The patch will need to be respun to address this, likely including a larger refactoring of the SMB1 and SMB3 rename code paths to make it less confusing and also to address some additional rename error cases that SMB3 may be able to workaround. Signed-off-by: Steve French <[email protected]> Reported-by: Patrick Fernie <[email protected]> CC: Stable <[email protected]> Acked-by: Ronnie Sahlberg <[email protected]> Acked-by: Pavel Shilovsky <[email protected]> Acked-by: Zhang Xiaoxu <[email protected]>
2020-07-23Merge tag 's390-5.8-6' of ↵Linus Torvalds3-3/+4
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux into master Pull s390 fixes from Heiko Carstens: - Change cpum_cf/perf counter name from DFLT_CCERROR to DFLT_CCFINISH to reflect reality and avoid further confusion. This is a user space visible change therefore the commit has also a stable tag for 5.7, where this counter was introduced. - Add Matthew Rosato as s390 IOMMU maintainer. * tag 's390-5.8-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: MAINTAINERS: add Matthew for s390 IOMMU s390/cpum_cf,perf: change DFLT_CCERROR counter name
2020-07-23i2c: i2c-qcom-geni: Fix DMA transfer raceDouglas Anderson1-2/+4
When I have KASAN enabled on my kernel and I start stressing the touchscreen my system tends to hang. The touchscreen is one of the only things that does a lot of big i2c transfers and ends up hitting the DMA paths in the geni i2c driver. It appears that KASAN adds enough delay in my system to tickle a race condition in the DMA setup code. When the system hangs, I found that it was running the geni_i2c_irq() over and over again. It had these: m_stat = 0x04000080 rx_st = 0x30000011 dm_tx_st = 0x00000000 dm_rx_st = 0x00000000 dma = 0x00000001 Notably we're in DMA mode but are getting M_RX_IRQ_EN and M_RX_FIFO_WATERMARK_EN over and over again. Putting some traces in geni_i2c_rx_one_msg() showed that when we failed we were getting to the start of geni_i2c_rx_one_msg() but were never executing geni_se_rx_dma_prep(). I believe that the problem here is that we are starting the geni command before we run geni_se_rx_dma_prep(). If a transfer makes it far enough before we do that then we get into the state I have observed. Let's change the order, which seems to work fine. Although problems were seen on the RX path, code inspection suggests that the TX should be changed too. Change it as well. Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller") Signed-off-by: Douglas Anderson <[email protected]> Tested-by: Sai Prakash Ranjan <[email protected]> Reviewed-by: Akash Asthana <[email protected]> Reviewed-by: Stephen Boyd <[email protected]> Reviewed-by: Mukesh Kumar Savaliya <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2020-07-23i2c: rcar: always clear ICSAR to avoid side effectsWolfram Sang1-0/+3
On R-Car Gen2, we get a timeout when reading from the address set in ICSAR, even though the slave interface is disabled. Clearing it fixes this situation. Note that Gen3 is not affected. To reproduce: bind and undbind an I2C slave on some bus, run 'i2cdetect' on that bus. Fixes: de20d1857dd6 ("i2c: rcar: add slave support") Signed-off-by: Wolfram Sang <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2020-07-23tcp: allow at most one TLP probe per flightYuchung Cheng3-12/+18
Previously TLP may send multiple probes of new data in one flight. This happens when the sender is cwnd limited. After the initial TLP containing new data is sent, the sender receives another ACK that acks partial inflight. It may re-arm another TLP timer to send more, if no further ACK returns before the next TLP timeout (PTO) expires. The sender may send in theory a large amount of TLP until send queue is depleted. This only happens if the sender sees such irregular uncommon ACK pattern. But it is generally undesirable behavior during congestion especially. The original TLP design restrict only one TLP probe per inflight as published in "Reducing Web Latency: the Virtue of Gentle Aggression", SIGCOMM 2013. This patch changes TLP to send at most one probe per inflight. Note that if the sender is app-limited, TLP retransmits old data and did not have this issue. Signed-off-by: Yuchung Cheng <[email protected]> Signed-off-by: Neal Cardwell <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-23AX.25: Prevent integer overflows in connect and sendmsgDan Carpenter1-1/+4
We recently added some bounds checking in ax25_connect() and ax25_sendmsg() and we so we removed the AX25_MAX_DIGIS checks because they were no longer required. Unfortunately, I believe they are required to prevent integer overflows so I have added them back. Fixes: 8885bb0621f0 ("AX.25: Prevent out-of-bounds read in ax25_sendmsg()") Fixes: 2f2a7ffad5c6 ("AX.25: Fix out-of-bounds read in ax25_connect()") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-23cxgb4: add loopback ethtool self-testVishal Kulkarni3-1/+166
In this test, loopback pkt is created and sent on default queue. The packet goes until the Multi Port Switch (MPS) just before the MAC and based on the specified channel number, it either goes outside the wire on one of the physical ports or looped back to Rx path by MPS. In this case, we're specifying loopback channel, instead of physical ports, so the packet gets looped back to Rx path, instead of getting transmitted on the wire. v3: - Modify commit message to include test details. v2: - Add only loopback self-test. Signed-off-by: Vishal Kulkarni <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-23Merge branch 'l2tp-further-checkpatch-pl-cleanups'David S. Miller6-150/+169
Tom Parkin says: ==================== l2tp: further checkpatch.pl cleanups l2tp hasn't been kept up to date with the static analysis checks offered by checkpatch.pl. This patchset builds on the series "l2tp: cleanup checkpatch.pl warnings". It includes small refactoring changes which improve code quality and resolve a subset of the checkpatch warnings for the l2tp codebase. ==================== Reviewed-by: James Chapman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-23l2tp: cleanup kzalloc callsTom Parkin1-2/+2
Passing "sizeof(struct blah)" in kzalloc calls is less readable, potentially prone to future bugs if the type of the pointer is changed, and triggers checkpatch warnings. Tweak the kzalloc calls in l2tp which use this form to avoid the warning. Signed-off-by: Tom Parkin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-23l2tp: cleanup netlink tunnel create address handlingTom Parkin1-24/+33
When creating an L2TP tunnel using the netlink API, userspace must either pass a socket FD for the tunnel to use (for managed tunnels), or specify the tunnel source/destination address (for unmanaged tunnels). Since source/destination addresses may be AF_INET or AF_INET6, the l2tp netlink code has conditionally compiled blocks to support IPv6. Rather than embedding these directly into l2tp_nl_cmd_tunnel_create (where it makes the code difficult to read and confuses checkpatch to boot) split the handling of address-related attributes into a separate function. Signed-off-by: Tom Parkin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-23l2tp: cleanup netlink send of tunnel address informationTom Parkin1-56/+70
l2tp_nl_tunnel_send has conditionally compiled code to support AF_INET6, which makes the code difficult to follow and triggers checkpatch warnings. Split the code out into functions to handle the AF_INET v.s. AF_INET6 cases, which both improves readability and resolves the checkpatch warnings. Signed-off-by: Tom Parkin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-23l2tp: check socket address type in l2tp_dfs_seq_tunnel_showTom Parkin1-3/+5
checkpatch warns about indentation and brace balancing around the conditionally compiled code for AF_INET6 support in l2tp_dfs_seq_tunnel_show. By adding another check on the socket address type we can make the code more readable while removing the checkpatch warning. Signed-off-by: Tom Parkin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-23l2tp: cleanup unnecessary braces in if statementsTom Parkin2-17/+12
These checks are all simple and don't benefit from extra braces to clarify intent. Remove them for easier-reading code. Signed-off-by: Tom Parkin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-23l2tp: cleanup comparisons to NULLTom Parkin6-48/+47
checkpatch warns about comparisons to NULL, e.g. CHECK: Comparison to NULL could be written "!rt" #474: FILE: net/l2tp/l2tp_ip.c:474: + if (rt == NULL) { These sort of comparisons are generally clearer and more readable the way checkpatch suggests, so update l2tp accordingly. Signed-off-by: Tom Parkin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-23net/ncsi: use eth_zero_addr() to clear mac addressMiaohe Lin1-1/+1
Use eth_zero_addr() to clear mac address insetad of memset(). Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-23cxgb4: use eth_zero_addr() to clear mac addressMiaohe Lin1-1/+1
Use eth_zero_addr() to clear mac address insetad of memset(). Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-23Merge branch 'mptcp-non-backup-subflows-pre-reqs'David S. Miller4-46/+81
Paolo Abeni says: ==================== mptcp: non backup subflows pre-reqs This series contains a bunch of MPTCP improvements loosely related to concurrent subflows xmit usage, currently under development. The first 3 patches are actually bugfixes for issues that will become apparent as soon as we will enable the above feature. The later patches improve the handling of incoming additional subflows, improving significantly the performances in stress tests based on a high new connection rate. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-07-23subflow: introduce and use mptcp_can_accept_new_subflow()Paolo Abeni1-0/+7
So that we can easily perform some basic PM-related adimission checks before creating the child socket. Reviewed-by: Mat Martineau <[email protected]> Tested-by: Christoph Paasch <[email protected]> Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-23subflow: use rsk_ops->send_reset()Paolo Abeni1-1/+1
tcp_send_active_reset() is more prone to transient errors (memory allocation or xmit queue full): in stress conditions the kernel may drop the egress packet, and the client will be stuck. Reviewed-by: Mat Martineau <[email protected]> Tested-by: Christoph Paasch <[email protected]> Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: David S. Miller <[email protected]>