aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-12-06mm/mmap.c: fix mmap return value when vma is merged after call_mmap()Liu Zixian1-14/+12
On success, mmap should return the begin address of newly mapped area, but patch "mm: mmap: merge vma after call_mmap() if possible" set vm_start of newly merged vma to return value addr. Users of mmap will get wrong address if vma is merged after call_mmap(). We fix this by moving the assignment to addr before merging vma. We have a driver which changes vm_flags, and this bug is found by our testcases. Fixes: d70cec898324 ("mm: mmap: merge vma after call_mmap() if possible") Signed-off-by: Liu Zixian <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Cc: Miaohe Lin <[email protected]> Cc: Hongxiang Lou <[email protected]> Cc: Hu Shiyuan <[email protected]> Cc: Matthew Wilcox <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-12-06hugetlb_cgroup: fix offline of hugetlb cgroup with reservationsMike Kravetz1-5/+3
Adrian Moreno was ruuning a kubernetes 1.19 + containerd/docker workload using hugetlbfs. In this environment the issue is reproduced by: - Start a simple pod that uses the recently added HugePages medium feature (pod yaml attached) - Start a DPDK app. It doesn't need to run successfully (as in transfer packets) nor interact with real hardware. It seems just initializing the EAL layer (which handles hugepage reservation and locking) is enough to trigger the issue - Delete the Pod (or let it "Complete"). This would result in a kworker thread going into a tight loop (top output): 1425 root 20 0 0 0 0 R 99.7 0.0 5:22.45 kworker/28:7+cgroup_destroy 'perf top -g' reports: - 63.28% 0.01% [kernel] [k] worker_thread - 49.97% worker_thread - 52.64% process_one_work - 62.08% css_killed_work_fn - hugetlb_cgroup_css_offline 41.52% _raw_spin_lock - 2.82% _cond_resched rcu_all_qs 2.66% PageHuge - 0.57% schedule - 0.57% __schedule We are spinning in the do-while loop in hugetlb_cgroup_css_offline. Worse yet, we are holding the master cgroup lock (cgroup_mutex) while infinitely spinning. Little else can be done on the system as the cgroup_mutex can not be acquired. Do note that the issue can be reproduced by simply offlining a hugetlb cgroup containing pages with reservation counts. The loop in hugetlb_cgroup_css_offline is moving page counts from the cgroup being offlined to the parent cgroup. This is done for each hstate, and is repeated until hugetlb_cgroup_have_usage returns false. The routine moving counts (hugetlb_cgroup_move_parent) is only moving 'usage' counts. The routine hugetlb_cgroup_have_usage is checking for both 'usage' and 'reservation' counts. Discussion about what to do with reservation counts when reparenting was discussed here: https://lore.kernel.org/linux-kselftest/CAHS8izMFAYTgxym-Hzb_JmkTK1N_S9tGN71uS6MFV+R7swYu5A@mail.gmail.com/ The decision was made to leave a zombie cgroup for with reservation counts. Unfortunately, the code checking reservation counts was incorrectly added to hugetlb_cgroup_have_usage. To fix the issue, simply remove the check for reservation counts. While fixing this issue, a related bug in hugetlb_cgroup_css_offline was noticed. The hstate index is not reinitialized each time through the do-while loop. Fix this as well. Fixes: 1adc4d419aa2 ("hugetlb_cgroup: add interface for charge/uncharge hugetlb reservations") Reported-by: Adrian Moreno <[email protected]> Signed-off-by: Mike Kravetz <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Adrian Moreno <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Cc: Mina Almasry <[email protected]> Cc: David Rientjes <[email protected]> Cc: Greg Thelen <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Shuah Khan <[email protected]> Cc: <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-12-06mm/filemap: add static for function __add_to_page_cache_lockedAlex Shi1-1/+1
mm/filemap.c:830:14: warning: no previous prototype for `__add_to_page_cache_locked' [-Wmissing-prototypes] Signed-off-by: Alex Shi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Souptick Joarder <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-12-06userfaultfd: selftests: fix SIGSEGV if huge mmap failsAxel Rasmussen1-9/+16
The error handling in hugetlb_allocate_area() was incorrect for the hugetlb_shared test case. Previously the behavior was: - mmap a hugetlb area - If this fails, set the pointer to NULL, and carry on - mmap an alias of the same hugetlb fd - If this fails, munmap the original area If the original mmap failed, it's likely the second one did too. If both failed, we'd blindly try to munmap a NULL pointer, causing a SIGSEGV. Instead, "goto fail" so we return before trying to mmap the alias. This issue can be hit "in real life" by forgetting to set /proc/sys/vm/nr_hugepages (leaving it at 0), and then trying to run the hugetlb_shared test. Another small improvement is, when the original mmap fails, don't just print "it failed": perror(), so we can see *why*. :) Signed-off-by: Axel Rasmussen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Peter Xu <[email protected]> Cc: Joe Perches <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: David Alan Gilbert <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-12-06tools/testing/selftests/vm: fix build errorXingxing Su1-0/+4
Only x86 and PowerPC implement the pkey-xxx.h, and an error was reported when compiling protection_keys.c. Add a Arch judgment to compile "protection_keys" in the Makefile. If other arch implement this, add the arch name to the Makefile. eg: ifneq (,$(findstring $(ARCH),powerpc mips ... )) Following build errors: pkey-helpers.h:93:2: error: #error Architecture not supported #error Architecture not supported pkey-helpers.h:96:20: error: `PKEY_DISABLE_ACCESS' undeclared #define PKEY_MASK (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE) ^ protection_keys.c:218:45: error: `PKEY_DISABLE_WRITE' undeclared pkey_assert(flags & (PKEY_DISABLE_ACCESS | PKEY_DISABLE_WRITE)); ^ Signed-off-by: Xingxing Su <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Sandipan Das <[email protected]> Cc: John Hubbard <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Brian Geffon <[email protected]> Cc: Mina Almasry <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-12-06mailmap: add two more addresses of Uwe Kleine-KönigUwe Kleine-König1-0/+2
This fixes attribution for the commits (among others) - d4097456cd1d ("video/framebuffer: move the probe func into .devinit.text in Blackfin LCD driver") - 0312e024d6cd ("mfd: mc13xxx: Add support for mc34708") Signed-off-by: Uwe Kleine-König <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-12-06mm/swapfile: do not sleep with a spin lock heldQian Cai1-1/+3
We can't call kvfree() with a spin lock held, so defer it. Fixes a might_sleep() runtime warning. Fixes: 873d7bcfd066 ("mm/swapfile.c: use kvzalloc for swap_info_struct allocation") Signed-off-by: Qian Cai <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-12-06mm/zsmalloc.c: drop ZSMALLOC_PGTABLE_MAPPINGMinchan Kim4-69/+0
While I was doing zram testing, I found sometimes decompression failed since the compression buffer was corrupted. With investigation, I found below commit calls cond_resched unconditionally so it could make a problem in atomic context if the task is reschedule. BUG: sleeping function called from invalid context at mm/vmalloc.c:108 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 946, name: memhog 3 locks held by memhog/946: #0: ffff9d01d4b193e8 (&mm->mmap_lock#2){++++}-{4:4}, at: __mm_populate+0x103/0x160 #1: ffffffffa3d53de0 (fs_reclaim){+.+.}-{0:0}, at: __alloc_pages_slowpath.constprop.0+0xa98/0x1160 #2: ffff9d01d56b8110 (&zspage->lock){.+.+}-{3:3}, at: zs_map_object+0x8e/0x1f0 CPU: 0 PID: 946 Comm: memhog Not tainted 5.9.3-00011-gc5bfc0287345-dirty #316 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014 Call Trace: unmap_kernel_range_noflush+0x2eb/0x350 unmap_kernel_range+0x14/0x30 zs_unmap_object+0xd5/0xe0 zram_bvec_rw.isra.0+0x38c/0x8e0 zram_rw_page+0x90/0x101 bdev_write_page+0x92/0xe0 __swap_writepage+0x94/0x4a0 pageout+0xe3/0x3a0 shrink_page_list+0xb94/0xd60 shrink_inactive_list+0x158/0x460 We can fix this by removing the ZSMALLOC_PGTABLE_MAPPING feature (which contains the offending calling code) from zsmalloc. Even though this option showed some amount improvement(e.g., 30%) in some arm32 platforms, it has been headache to maintain since it have abused APIs[1](e.g., unmap_kernel_range in atomic context). Since we are approaching to deprecate 32bit machines and already made the config option available for only builtin build since v5.8, lastly it has been not default option in zsmalloc, it's time to drop the option for better maintenance. [1] http://lore.kernel.org/linux-mm/[email protected] Fixes: e47110e90584 ("mm/vunmap: add cond_resched() in vunmap_pmd_range") Signed-off-by: Minchan Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Sergey Senozhatsky <[email protected]> Cc: Tony Lindgren <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Harish Sriram <[email protected]> Cc: Uladzislau Rezki <[email protected]> Cc: <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-12-06mm: list_lru: set shrinker map bit when child nr_items is not zeroYang Shi1-5/+5
When investigating a slab cache bloat problem, significant amount of negative dentry cache was seen, but confusingly they neither got shrunk by reclaimer (the host has very tight memory) nor be shrunk by dropping cache. The vmcore shows there are over 14M negative dentry objects on lru, but tracing result shows they were even not scanned at all. Further investigation shows the memcg's vfs shrinker_map bit is not set. So the reclaimer or dropping cache just skip calling vfs shrinker. So we have to reboot the hosts to get the memory back. I didn't manage to come up with a reproducer in test environment, and the problem can't be reproduced after rebooting. But it seems there is race between shrinker map bit clear and reparenting by code inspection. The hypothesis is elaborated as below. The memcg hierarchy on our production environment looks like: root / \ system user The main workloads are running under user slice's children, and it creates and removes memcg frequently. So reparenting happens very often under user slice, but no task is under user slice directly. So with the frequent reparenting and tight memory pressure, the below hypothetical race condition may happen: CPU A CPU B reparent dst->nr_items == 0 shrinker: total_objects == 0 add src->nr_items to dst set_bit return SHRINK_EMPTY clear_bit child memcg offline replace child's kmemcg_id with parent's (in memcg_offline_kmem()) list_lru_del() between shrinker runs see parent's kmemcg_id dec dst->nr_items reparent again dst->nr_items may go negative due to concurrent list_lru_del() The second run of shrinker: read nr_items without any synchronization, so it may see intermediate negative nr_items then total_objects may return 0 coincidently keep the bit cleared dst->nr_items != 0 skip set_bit add scr->nr_item to dst After this point dst->nr_item may never go zero, so reparenting will not set shrinker_map bit anymore. And since there is no task under user slice directly, so no new object will be added to its lru to set the shrinker map bit either. That bit is kept cleared forever. How does list_lru_del() race with reparenting? It is because reparenting replaces children's kmemcg_id to parent's without protecting from nlru->lock, so list_lru_del() may see parent's kmemcg_id but actually deleting items from child's lru, but dec'ing parent's nr_items, so the parent's nr_items may go negative as commit 2788cf0c401c ("memcg: reparent list_lrus and free kmemcg_id on css offline") says. Since it is impossible that dst->nr_items goes negative and src->nr_items goes zero at the same time, so it seems we could set the shrinker map bit iff src->nr_items != 0. We could synchronize list_lru_count_one() and reparenting with nlru->lock, but it seems checking src->nr_items in reparenting is the simplest and avoids lock contention. Fixes: fae91d6d8be5 ("mm/list_lru.c: set bit in memcg shrinker bitmap on first list_lru item appearance") Suggested-by: Roman Gushchin <[email protected]> Signed-off-by: Yang Shi <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Roman Gushchin <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Acked-by: Kirill Tkhai <[email protected]> Cc: Vladimir Davydov <[email protected]> Cc: <[email protected]> [4.19] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-12-06mm: memcg/slab: fix obj_cgroup_charge() return value handlingRoman Gushchin1-16/+24
Commit 10befea91b61 ("mm: memcg/slab: use a single set of kmem_caches for all allocations") introduced a regression into the handling of the obj_cgroup_charge() return value. If a non-zero value is returned (indicating of exceeding one of memory.max limits), the allocation should fail, instead of falling back to non-accounted mode. To make the code more readable, move memcg_slab_pre_alloc_hook() and memcg_slab_post_alloc_hook() calling conditions into bodies of these hooks. Fixes: 10befea91b61 ("mm: memcg/slab: use a single set of kmem_caches for all allocations") Signed-off-by: Roman Gushchin <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Shakeel Butt <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-12-06coredump: fix core_pattern parse errorMenglong Dong1-1/+2
'format_corename()' will splite 'core_pattern' on spaces when it is in pipe mode, and take helper_argv[0] as the path to usermode executable. It works fine in most cases. However, if there is a space between '|' and '/file/path', such as '| /usr/lib/systemd/systemd-coredump %P %u %g', then helper_argv[0] will be parsed as '', and users will get a 'Core dump to | disabled'. It is not friendly to users, as the pattern above was valid previously. Fix this by ignoring the spaces between '|' and '/file/path'. Fixes: 315c69261dd3 ("coredump: split pipe command whitespace before expanding template") Signed-off-by: Menglong Dong <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Paul Wise <[email protected]> Cc: Jakub Wilk <[email protected]> [https://bugs.debian.org/924398] Cc: Neil Horman <[email protected]> Cc: <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-12-06zlib: export S390 symbols for zlib modulesRandy Dunlap1-0/+3
Fix build errors when ZLIB_INFLATE=m and ZLIB_DEFLATE=m and ZLIB_DFLTCC=y by exporting the 2 needed symbols in dfltcc_inflate.c. Fixes these build errors: ERROR: modpost: "dfltcc_inflate" [lib/zlib_inflate/zlib_inflate.ko] undefined! ERROR: modpost: "dfltcc_can_inflate" [lib/zlib_inflate/zlib_inflate.ko] undefined! Fixes: 126196100063 ("lib/zlib: add s390 hardware support for kernel zlib_inflate") Reported-by: kernel test robot <[email protected]> Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Ilya Leoshkevich <[email protected]> Cc: Mikhail Zaslonko <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Christian Borntraeger <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-12-06kbuild: avoid split lines in .mod filesMasahiro Yamada1-8/+4
"xargs echo" is not a safe way to remove line breaks because the input may exceed the command line limit and xargs may break it up into multiple invocations of echo. This should never happen because scripts/gen_autoksyms.sh expects all undefined symbols are placed in the second line of .mod files. One possible way is to replace "xargs echo" with "sed ':x;N;$!bx;s/\n/ /g'" or something, but I rewrote the code by using awk because it is more readable. This issue was reported by Sami Tolvanen; in his Clang LTO patch set, $(multi-used-m) is no longer an ELF object, but a thin archive that contains LLVM bitcode files. llvm-nm prints out symbols for each archive member separately, which results a lot of dupications, in some places, beyond the system-defined limit. This problem must be fixed irrespective of LTO, and we must ensure zero possibility of having this issue. Link: https://lkml.org/lkml/2020/12/1/1658 Reported-by: Sami Tolvanen <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]> Reviewed-by: Sami Tolvanen <[email protected]>
2020-12-06Revert "mei: virtio: virtualization frontend driver"Michael S. Tsirkin3-887/+0
This reverts commit d162219c655c8cf8003128a13840d6c1e183fb80. The device uses a VIRTIO device ID out of a not-for-production range. Releasing Linux using an ID out of this range will make it conflict with development setups. An official request to reserve an ID for an MEI device is yet to be submitted to the virtio TC, thus there's no chance it will be reserved and fixed in time before the next release. Once requested it usually takes 2-3 weeks to land in the spec, which means the device can be supported with the official ID in the next Linux version if contributors act quickly. Signed-off-by: Michael S. Tsirkin <[email protected]> Cc: Tomas Winkler <[email protected]> Cc: Alexander Usyskin <[email protected]> Cc: Wang Yu <[email protected]> Cc: Liu Shuo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2020-12-06x86/sev-es: Use new for_each_insn_prefix() macro to loop over prefixes bytesMasami Hiramatsu1-3/+2
Since insn.prefixes.nbytes can be bigger than the size of insn.prefixes.bytes[] when a prefix is repeated, the proper check must be: insn.prefixes.bytes[i] != 0 and i < 4 instead of using insn.prefixes.nbytes. Use the new for_each_insn_prefix() macro which does it correctly. Debugged by Kees Cook <[email protected]>. [ bp: Massage commit message. ] Fixes: 25189d08e516 ("x86/sev-es: Add support for handling IOIO exceptions") Reported-by: [email protected] Signed-off-by: Masami Hiramatsu <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/160697106089.3146288.2052422845039649176.stgit@devnote2
2020-12-06x86/insn-eval: Use new for_each_insn_prefix() macro to loop over prefixes bytesMasami Hiramatsu1-5/+5
Since insn.prefixes.nbytes can be bigger than the size of insn.prefixes.bytes[] when a prefix is repeated, the proper check must be insn.prefixes.bytes[i] != 0 and i < 4 instead of using insn.prefixes.nbytes. Use the new for_each_insn_prefix() macro which does it correctly. Debugged by Kees Cook <[email protected]>. [ bp: Massage commit message. ] Fixes: 32d0b95300db ("x86/insn-eval: Add utility functions to get segment selector") Reported-by: [email protected] Signed-off-by: Masami Hiramatsu <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/160697104969.3146288.16329307586428270032.stgit@devnote2
2020-12-06x86/uprobes: Do not use prefixes.nbytes when looping over prefixes.bytesMasami Hiramatsu3-4/+36
Since insn.prefixes.nbytes can be bigger than the size of insn.prefixes.bytes[] when a prefix is repeated, the proper check must be insn.prefixes.bytes[i] != 0 and i < 4 instead of using insn.prefixes.nbytes. Introduce a for_each_insn_prefix() macro for this purpose. Debugged by Kees Cook <[email protected]>. [ bp: Massage commit message, sync with the respective header in tools/ and drop "we". ] Fixes: 2b1444983508 ("uprobes, mm, x86: Add the ability to install and remove uprobes breakpoints") Reported-by: [email protected] Signed-off-by: Masami Hiramatsu <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Srikar Dronamraju <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/160697103739.3146288.7437620795200799020.stgit@devnote2
2020-12-05Merge branch 'for-linus' of ↵Linus Torvalds5-3/+11
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: "A fix for 'RETRIGEN' handling in Atmel touch controllers that was causing lost interrupts on systems using edge-triggered interrupts, a quirk for i8042 driver, and a couple more fixes." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: atmel_mxt_ts - fix lost interrupts Input: xpad - support Ardwiino Controllers Input: i8042 - add ByteSpeed touchpad to noloop table Input: i8042 - fix error return code in i8042_setup_aux() Input: soc_button_array - add missing include
2020-12-05net: mscc: ocelot: fix dropping of unknown IPv4 multicast on SevilleVladimir Oltean6-11/+11
The current assumption is that the felix DSA driver has flooding knobs per traffic class, while ocelot switchdev has a single flooding knob. This was correct for felix VSC9959 and ocelot VSC7514, but with the introduction of seville VSC9953, we see a switch driven by felix.c which has a single flooding knob. So it is clear that we must do what should have been done from the beginning, which is not to overwrite the configuration done by ocelot.c in felix, but instead to teach the common ocelot library about the differences in our switches, and set up the flooding PGIDs centrally. The effect that the bogus iteration through FELIX_NUM_TC has upon seville is quite dramatic. ANA_FLOODING is located at 0x00b548, and ANA_FLOODING_IPMC is located at 0x00b54c. So the bogus iteration will actually overwrite ANA_FLOODING_IPMC when attempting to write ANA_FLOODING[1]. There is no ANA_FLOODING[1] in sevile, just ANA_FLOODING. And when ANA_FLOODING_IPMC is overwritten with a bogus value, the effect is that ANA_FLOODING_IPMC gets the value of 0x0003CF7D: MC6_DATA = 61, MC6_CTRL = 61, MC4_DATA = 60, MC4_CTRL = 0. Because MC4_CTRL is zero, this means that IPv4 multicast control packets are not flooded, but dropped. An invalid configuration, and this is how the issue was actually spotted. Reported-by: Eldar Gasanov <[email protected]> Reported-by: Maxim Kochetkov <[email protected]> Tested-by: Eldar Gasanov <[email protected]> Fixes: 84705fc16552 ("net: dsa: felix: introduce support for Seville VSC9953 switch") Fixes: 3c7b51bd39b2 ("net: dsa: felix: allow flooding for all traffic classes") Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Alexandre Belloni <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-12-05Merge branch 'i2c/for-current' of ↵Linus Torvalds5-17/+47
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "Some more I2C driver updates. IMX updates are a tad bigger, but not exceptionally big" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: mlxbf: Fix the return check of devm_ioremap and ioremap i2c: mlxbf: select CONFIG_I2C_SLAVE i2c: imx: Don't generate STOP condition if arbitration has been lost i2c: imx: Check for I2SR_IAL after every byte i2c: imx: Fix reset of I2SR_IAL flag i2c: qcom: Fix IRQ error misassignement i2c: qup: Fix error return code in qup_i2c_bam_schedule_desc()
2020-12-05Merge tag 'block-5.10-2020-12-05' of git://git.kernel.dk/linux-blockLinus Torvalds1-1/+4
Pull block fix from Jens Axboe: "Single fix for an issue with chunk_sectors and stacked devices" * tag 'block-5.10-2020-12-05' of git://git.kernel.dk/linux-block: block: use gcd() to fix chunk_sectors limit stacking
2020-12-05Merge tag 'io_uring-5.10-2020-12-05' of git://git.kernel.dk/linux-blockLinus Torvalds1-1/+2
Pull io_uring fix from Jens Axboe: "Just a small fix this time, for an issue with 32-bit compat apps and buffer selection with recvmsg" * tag 'io_uring-5.10-2020-12-05' of git://git.kernel.dk/linux-block: io_uring: fix recvmsg setup with compat buf-select
2020-12-05net: marvell: prestera: Fix error return code in prestera_port_create()Zhang Changzhong1-1/+3
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: 501ef3066c89 ("net: marvell: prestera: Add driver for Prestera family ASIC devices") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Zhang Changzhong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-12-05vrf: packets with lladdr src needs dst at input with orig_iif when needs strictStephen Suryaputra2-2/+103
Depending on the order of the routes to fe80::/64 are installed on the VRF table, the NS for the source link-local address of the originator might be sent to the wrong interface. This patch ensures that packets with link-local addr source is doing a lookup with the orig_iif when the destination addr indicates that it is strict. Add the reproducer as a use case in self test script fcnal-test.sh. Fixes: b4869aa2f881 ("net: vrf: ipv6 support for local traffic to local addresses") Signed-off-by: Stephen Suryaputra <[email protected]> Reviewed-by: David Ahern <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-12-05can: softing: softing_netdev_open(): fix error handlingZhang Qilong1-2/+7
If softing_netdev_open() fails, we should call close_candev() to avoid reference leak. Fixes: 03fd3cf5a179d ("can: add driver for Softing card") Signed-off-by: Zhang Qilong <[email protected]> Acked-by: Kurt Van Dijck <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Marc Kleine-Budde <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-12-05ch_ktls: fix build warning for ipv4-only configArnd Bergmann1-5/+1
When CONFIG_IPV6 is disabled, clang complains that a variable is uninitialized for non-IPv4 data: drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:1046:6: error: variable 'cntrl1' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] if (tx_info->ip_family == AF_INET) { ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/chelsio/inline_crypto/ch_ktls/chcr_ktls.c:1059:2: note: uninitialized use occurs here cntrl1 |= T6_TXPKT_ETHHDR_LEN_V(maclen - ETH_HLEN) | ^~~~~~ Replace the preprocessor conditional with the corresponding C version, and make the ipv4 case unconditional in this configuration to improve readability and avoid the warning. Fixes: 86716b51d14f ("ch_ktls: Update cheksum information") Signed-off-by: Arnd Bergmann <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-12-05Merge tag 'powerpc-5.10-5' of ↵Linus Torvalds10-18/+70
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Some more powerpc fixes for 5.10: - Three commits fixing possible missed TLB invalidations for multi-threaded processes when CPUs are hotplugged in and out. - A fix for a host crash triggerable by host userspace (qemu) in KVM on Power9. - A fix for a host crash in machine check handling when running HPT guests on a HPT host. - One commit fixing potential missed TLB invalidations when using the hash MMU on Power9 or later. - A regression fix for machines with CPUs on node 0 but no memory. Thanks to Aneesh Kumar K.V, Cédric Le Goater, Greg Kurz, Milan Mohanty, Milton Miller, Nicholas Piggin, Paul Mackerras, and Srikar Dronamraju" * tag 'powerpc-5.10-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64s/powernv: Fix memory corruption when saving SLB entries on MCE KVM: PPC: Book3S HV: XIVE: Fix vCPU id sanity check powerpc/numa: Fix a regression on memoryless node 0 powerpc/64s: Trim offlined CPUs from mm_cpumasks kernel/cpu: add arch override for clear_tasks_mm_cpumask() mm handling powerpc/64s/pseries: Fix hash tlbiel_all_isa300 for guest kernels powerpc/64s: Fix hash ISA v3.0 TLBIEL instruction generation
2020-12-05Merge tag '5.10-rc6-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds4-38/+42
Pull cifs fixes from Steve French: "Three smb3 fixes (two for stable) fixing - a null pointer issue in a DFS error path - a problem with excessive padding when mounted with "idsfromsid" causing owner fields to get corrupted - a more recent problem with compounded reparse point query found in testing to the Linux kernel server" * tag '5.10-rc6-smb3-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6: cifs: refactor create_sd_buf() and and avoid corrupting the buffer cifs: add NULL check for ses->tcon_ipc smb3: set COMPOUND_FID to FileID field of subsequent compound request
2020-12-05Merge tag 'scsi-fixes' of ↵Linus Torvalds3-3/+10
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Four small fixes in two drivers. The mpt3sas fixes are all problems with timeout under unusual conditions, and the storvsc is a missed incoming packet validation and a missed error return" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: mpt3sas: Increase IOCInit request timeout to 30s scsi: mpt3sas: Fix ioctl timeout scsi: storvsc: Validate length of incoming packet in storvsc_on_channel_callback() scsi: storvsc: Fix error return in storvsc_probe()
2020-12-05Merge tag 'for-5.10/dm-fixes-2' of ↵Linus Torvalds1-4/+6
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull fix for device mapper fixes from Mike Snitzer: "Apologies for the glaring bug I introduced with my previous pull request! Fix incorrect branching at top of blk_max_size_offset()" * tag 'for-5.10/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: block: fix incorrect branching in blk_max_size_offset()
2020-12-05i2c: mlxbf: Fix the return check of devm_ioremap and ioremapWang Xiaojun1-6/+6
devm_ioremap and ioremap may return NULL which cannot be checked by IS_ERR. Signed-off-by: Wang Xiaojun <[email protected]> Reported-by: Hulk Robot <[email protected]> Acked-by: Khalil Blaiech <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2020-12-05i2c: mlxbf: select CONFIG_I2C_SLAVEArnd Bergmann1-0/+1
If this is not enabled, the interfaces used in this driver do not work: drivers/i2c/busses/i2c-mlxbf.c:1888:3: error: implicit declaration of function 'i2c_slave_event' [-Werror,-Wimplicit-function-declaration] i2c_slave_event(slave, I2C_SLAVE_WRITE_REQUESTED, &value); ^ drivers/i2c/busses/i2c-mlxbf.c:1888:26: error: use of undeclared identifier 'I2C_SLAVE_WRITE_REQUESTED' i2c_slave_event(slave, I2C_SLAVE_WRITE_REQUESTED, &value); ^ drivers/i2c/busses/i2c-mlxbf.c:1890:32: error: use of undeclared identifier 'I2C_SLAVE_WRITE_RECEIVED' ret = i2c_slave_event(slave, I2C_SLAVE_WRITE_RECEIVED, ^ drivers/i2c/busses/i2c-mlxbf.c:1892:26: error: use of undeclared identifier 'I2C_SLAVE_STOP' i2c_slave_event(slave, I2C_SLAVE_STOP, &value); ^ Fixes: b5b5b32081cd ("i2c: mlxbf: I2C SMBus driver for Mellanox BlueField SoC") Signed-off-by: Arnd Bergmann <[email protected]> Acked-by: Khalil Blaiech <[email protected]> Signed-off-by: Wolfram Sang <[email protected]>
2020-12-04mac80211: mesh: fix mesh_pathtbl_init() error pathEric Dumazet1-3/+1
If tbl_mpp can not be allocated, we call mesh_table_free(tbl_path) while tbl_path rhashtable has not yet been initialized, which causes panics. Simply factorize the rhashtable_init() call into mesh_table_alloc() WARNING: CPU: 1 PID: 8474 at kernel/workqueue.c:3040 __flush_work kernel/workqueue.c:3040 [inline] WARNING: CPU: 1 PID: 8474 at kernel/workqueue.c:3040 __cancel_work_timer+0x514/0x540 kernel/workqueue.c:3136 Modules linked in: CPU: 1 PID: 8474 Comm: syz-executor663 Not tainted 5.10.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__flush_work kernel/workqueue.c:3040 [inline] RIP: 0010:__cancel_work_timer+0x514/0x540 kernel/workqueue.c:3136 Code: 5d c3 e8 bf ae 29 00 0f 0b e9 f0 fd ff ff e8 b3 ae 29 00 0f 0b 43 80 3c 3e 00 0f 85 31 ff ff ff e9 34 ff ff ff e8 9c ae 29 00 <0f> 0b e9 dc fe ff ff 89 e9 80 e1 07 80 c1 03 38 c1 0f 8c 7d fd ff RSP: 0018:ffffc9000165f5a0 EFLAGS: 00010293 RAX: ffffffff814b7064 RBX: 0000000000000001 RCX: ffff888021c80000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff888024039ca0 R08: dffffc0000000000 R09: fffffbfff1dd3e64 R10: fffffbfff1dd3e64 R11: 0000000000000000 R12: 1ffff920002cbebd R13: ffff888024039c88 R14: 1ffff11004807391 R15: dffffc0000000000 FS: 0000000001347880(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000140 CR3: 000000002cc0a000 CR4: 00000000001506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: rhashtable_free_and_destroy+0x25/0x9c0 lib/rhashtable.c:1137 mesh_table_free net/mac80211/mesh_pathtbl.c:69 [inline] mesh_pathtbl_init+0x287/0x2e0 net/mac80211/mesh_pathtbl.c:785 ieee80211_mesh_init_sdata+0x2ee/0x530 net/mac80211/mesh.c:1591 ieee80211_setup_sdata+0x733/0xc40 net/mac80211/iface.c:1569 ieee80211_if_add+0xd5c/0x1cd0 net/mac80211/iface.c:1987 ieee80211_add_iface+0x59/0x130 net/mac80211/cfg.c:125 rdev_add_virtual_intf net/wireless/rdev-ops.h:45 [inline] nl80211_new_interface+0x563/0xb40 net/wireless/nl80211.c:3855 genl_family_rcv_msg_doit net/netlink/genetlink.c:739 [inline] genl_family_rcv_msg net/netlink/genetlink.c:783 [inline] genl_rcv_msg+0xe4e/0x1280 net/netlink/genetlink.c:800 netlink_rcv_skb+0x190/0x3a0 net/netlink/af_netlink.c:2494 genl_rcv+0x24/0x40 net/netlink/genetlink.c:811 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline] netlink_unicast+0x780/0x930 net/netlink/af_netlink.c:1330 netlink_sendmsg+0x9a8/0xd40 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg net/socket.c:671 [inline] ____sys_sendmsg+0x519/0x800 net/socket.c:2353 ___sys_sendmsg net/socket.c:2407 [inline] __sys_sendmsg+0x2b1/0x360 net/socket.c:2440 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 60854fd94573 ("mac80211: mesh: convert path table to rhashtable") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: syzbot <[email protected]> Reviewed-by: Johannes Berg <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-12-04[SECURITY] fix namespaced fscaps when !CONFIG_SECURITYSerge Hallyn1-1/+1
Namespaced file capabilities were introduced in 8db6c34f1dbc . When userspace reads an xattr for a namespaced capability, a virtualized representation of it is returned if the caller is in a user namespace owned by the capability's owning rootid. The function which performs this virtualization was not hooked up if CONFIG_SECURITY=n. Therefore in that case the original xattr was shown instead of the virtualized one. To test this using libcap-bin (*1), $ v=$(mktemp) $ unshare -Ur setcap cap_sys_admin-eip $v $ unshare -Ur setcap -v cap_sys_admin-eip $v /tmp/tmp.lSiIFRvt8Y: OK "setcap -v" verifies the values instead of setting them, and will check whether the rootid value is set. Therefore, with this bug un-fixed, and with CONFIG_SECURITY=n, setcap -v will fail: $ v=$(mktemp) $ unshare -Ur setcap cap_sys_admin=eip $v $ unshare -Ur setcap -v cap_sys_admin=eip $v nsowner[got=1000, want=0],/tmp/tmp.HHDiOOl9fY differs in [] Fix this bug by calling cap_inode_getsecurity() in security_inode_getsecurity() instead of returning -EOPNOTSUPP, when CONFIG_SECURITY=n. *1 - note, if libcap is too old for getcap to have the '-n' option, then use verify-caps instead. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=209689 Cc: Hervé Guillemet <[email protected]> Acked-by: Casey Schaufler <[email protected]> Signed-off-by: Serge Hallyn <[email protected]> Signed-off-by: Andrew G. Morgan <[email protected]> Signed-off-by: James Morris <[email protected]>
2020-12-04openvswitch: fix error return code in validate_and_copy_dec_ttl()Wang Hai1-1/+1
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Changing 'return start' to 'return action_start' can fix this bug. Fixes: 69929d4c49e1 ("net: openvswitch: fix TTL decrement action netlink message format") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wang Hai <[email protected]> Reviewed-by: Eelco Chaudron <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-12-04net: bridge: vlan: fix error return code in __vlan_add()Zhang Changzhong1-1/+3
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: f8ed289fab84 ("bridge: vlan: use br_vlan_(get|put)_master to deal with refcounts") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Zhang Changzhong <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-12-04ipv4: fix error return code in rtm_to_fib_config()Zhang Changzhong1-1/+1
Fix to return a negative error code from the error handling case instead of 0, as done elsewhere in this function. Fixes: d15662682db2 ("ipv4: Allow ipv6 gateway with ipv4 routes") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Zhang Changzhong <[email protected]> Reviewed-by: David Ahern <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-12-04ethernet: select CONFIG_CRC32 as neededArnd Bergmann10-0/+10
A number of ethernet drivers require crc32 functionality to be avaialable in the kernel, causing a link error otherwise: arm-linux-gnueabi-ld: drivers/net/ethernet/agere/et131x.o: in function `et1310_setup_device_for_multicast': et131x.c:(.text+0x5918): undefined reference to `crc32_le' arm-linux-gnueabi-ld: drivers/net/ethernet/cadence/macb_main.o: in function `macb_start_xmit': macb_main.c:(.text+0x4b88): undefined reference to `crc32_le' arm-linux-gnueabi-ld: drivers/net/ethernet/faraday/ftgmac100.o: in function `ftgmac100_set_rx_mode': ftgmac100.c:(.text+0x2b38): undefined reference to `crc32_le' arm-linux-gnueabi-ld: drivers/net/ethernet/freescale/fec_main.o: in function `set_multicast_list': fec_main.c:(.text+0x6120): undefined reference to `crc32_le' arm-linux-gnueabi-ld: drivers/net/ethernet/freescale/fman/fman_dtsec.o: in function `dtsec_add_hash_mac_address': fman_dtsec.c:(.text+0x830): undefined reference to `crc32_le' arm-linux-gnueabi-ld: drivers/net/ethernet/freescale/fman/fman_dtsec.o:fman_dtsec.c:(.text+0xb68): more undefined references to `crc32_le' follow arm-linux-gnueabi-ld: drivers/net/ethernet/netronome/nfp/nfpcore/nfp_hwinfo.o: in function `nfp_hwinfo_read': nfp_hwinfo.c:(.text+0x250): undefined reference to `crc32_be' arm-linux-gnueabi-ld: nfp_hwinfo.c:(.text+0x288): undefined reference to `crc32_be' arm-linux-gnueabi-ld: drivers/net/ethernet/netronome/nfp/nfpcore/nfp_resource.o: in function `nfp_resource_acquire': nfp_resource.c:(.text+0x144): undefined reference to `crc32_be' arm-linux-gnueabi-ld: nfp_resource.c:(.text+0x158): undefined reference to `crc32_be' arm-linux-gnueabi-ld: drivers/net/ethernet/nxp/lpc_eth.o: in function `lpc_eth_set_multicast_list': lpc_eth.c:(.text+0x1934): undefined reference to `crc32_le' arm-linux-gnueabi-ld: drivers/net/ethernet/rocker/rocker_ofdpa.o: in function `ofdpa_flow_tbl_do': rocker_ofdpa.c:(.text+0x2e08): undefined reference to `crc32_le' arm-linux-gnueabi-ld: drivers/net/ethernet/rocker/rocker_ofdpa.o: in function `ofdpa_flow_tbl_del': rocker_ofdpa.c:(.text+0x3074): undefined reference to `crc32_le' arm-linux-gnueabi-ld: drivers/net/ethernet/rocker/rocker_ofdpa.o: in function `ofdpa_port_fdb': arm-linux-gnueabi-ld: drivers/net/ethernet/mellanox/mlx5/core/steering/dr_ste.o: in function `mlx5dr_ste_calc_hash_index': dr_ste.c:(.text+0x354): undefined reference to `crc32_le' arm-linux-gnueabi-ld: drivers/net/ethernet/microchip/lan743x_main.o: in function `lan743x_netdev_set_multicast': lan743x_main.c:(.text+0x5dc4): undefined reference to `crc32_le' Add the missing 'select CRC32' entries in Kconfig for each of them. Signed-off-by: Arnd Bergmann <[email protected]> Acked-by: Nicolas Ferre <[email protected]> Acked-by: Madalin Bucur <[email protected]> Acked-by: Mark Einon <[email protected]> Acked-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-12-04net: ipa: pass the correct size when freeing DMA memoryAlex Elder1-1/+6
When the coherent memory is freed in gsi_trans_pool_exit_dma(), we are mistakenly passing the size of a single element in the pool rather than the actual allocated size. Fix this bug. Fixes: 9dd441e4ed575 ("soc: qcom: ipa: GSI transactions") Reported-by: Stephen Boyd <[email protected]> Tested-by: Sujit Kautkar <[email protected]> Signed-off-by: Alex Elder <[email protected]> Reviewed-by: Bjorn Andersson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-12-04block: fix incorrect branching in blk_max_size_offset()Mike Snitzer1-4/+6
If non-zero 'chunk_sectors' is passed in to blk_max_size_offset() that override will be incorrectly ignored. Old blk_max_size_offset() branching, prior to commit 3ee16db390b4, must be used only if passed 'chunk_sectors' override is zero. Fixes: 3ee16db390b4 ("dm: fix IO splitting") Cc: [email protected] # 5.9 Reported-by: John Dorminy <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2020-12-04net/sched: fq_pie: initialize timer earlier in fq_pie_init()Davide Caratti1-1/+1
with the following tdc testcase: 83be: (qdisc, fq_pie) Create FQ-PIE with invalid number of flows as fq_pie_init() fails, fq_pie_destroy() is called to clean up. Since the timer is not yet initialized, it's possible to observe a splat like this: INFO: trying to register non-static key. the code is fine but needs lockdep annotation. turning off the locking correctness validator. CPU: 0 PID: 975 Comm: tc Not tainted 5.10.0-rc4+ #298 Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014 Call Trace: dump_stack+0x99/0xcb register_lock_class+0x12dd/0x1750 __lock_acquire+0xfe/0x3970 lock_acquire+0x1c8/0x7f0 del_timer_sync+0x49/0xd0 fq_pie_destroy+0x3f/0x80 [sch_fq_pie] qdisc_create+0x916/0x1160 tc_modify_qdisc+0x3c4/0x1630 rtnetlink_rcv_msg+0x346/0x8e0 netlink_unicast+0x439/0x630 netlink_sendmsg+0x719/0xbf0 sock_sendmsg+0xe2/0x110 ____sys_sendmsg+0x5ba/0x890 ___sys_sendmsg+0xe9/0x160 __sys_sendmsg+0xd3/0x170 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 [...] ODEBUG: assert_init not available (active state 0) object type: timer_list hint: 0x0 WARNING: CPU: 0 PID: 975 at lib/debugobjects.c:508 debug_print_object+0x162/0x210 [...] Call Trace: debug_object_assert_init+0x268/0x380 try_to_del_timer_sync+0x6a/0x100 del_timer_sync+0x9e/0xd0 fq_pie_destroy+0x3f/0x80 [sch_fq_pie] qdisc_create+0x916/0x1160 tc_modify_qdisc+0x3c4/0x1630 rtnetlink_rcv_msg+0x346/0x8e0 netlink_rcv_skb+0x120/0x380 netlink_unicast+0x439/0x630 netlink_sendmsg+0x719/0xbf0 sock_sendmsg+0xe2/0x110 ____sys_sendmsg+0x5ba/0x890 ___sys_sendmsg+0xe9/0x160 __sys_sendmsg+0xd3/0x170 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 fix it moving timer_setup() before any failure, like it was done on 'red' with former commit 608b4adab178 ("net_sched: initialize timer earlier in red_init()"). Fixes: ec97ecf1ebe4 ("net: sched: add Flow Queue PIE packet scheduler") Signed-off-by: Davide Caratti <[email protected]> Reviewed-by: Cong Wang <[email protected]> Link: https://lore.kernel.org/r/2e78e01c504c633ebdff18d041833cf2e079a3a4.1607020450.git.dcaratti@redhat.com Signed-off-by: Jakub Kicinski <[email protected]>
2020-12-04tracing: Fix userstacktrace option for instancesSteven Rostedt (VMware)1-5/+8
When the instances were able to use their own options, the userstacktrace option was left hardcoded for the top level. This made the instance userstacktrace option bascially into a nop, and will confuse users that set it, but nothing happens (I was confused when it happened to me!) Cc: [email protected] Fixes: 16270145ce6b ("tracing: Add trace options for core options to instances") Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2020-12-04Merge tag 'for-5.10/dm-fixes' of ↵Linus Torvalds7-39/+28
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - Fix DM's bio splitting changes that were made during v5.9. This restores splitting in terms of varied per-target ti->max_io_len rather than use block core's single stacked 'chunk_sectors' limit. - Like DM crypt, update DM integrity to not use crypto drivers that have CRYPTO_ALG_ALLOCATES_MEMORY set. - Fix DM writecache target's argument parsing and status display. - Remove needless BUG() from dm writecache's persistent_memory_claim() - Remove old gcc workaround in DM cache target's block_div() for ARM link errors now that gcc >= 4.9 is required. - Fix RCU locking in dm_blk_report_zones and dm_dax_zero_page_range. - Remove old, and now frowned upon, BUG_ON(in_interrupt()) in dm_table_event(). - Remove invalid sparse annotations from dm_prepare_ioctl() and dm_unprepare_ioctl(). * tag 'for-5.10/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm: remove invalid sparse __acquires and __releases annotations dm: fix double RCU unlock in dm_dax_zero_page_range() error path dm: fix IO splitting dm writecache: remove BUG() and fail gracefully instead dm table: Remove BUG_ON(in_interrupt()) dm: fix bug with RCU locking in dm_blk_report_zones Revert "dm cache: fix arm link errors with inline" dm writecache: fix the maximum number of arguments dm writecache: advance the number of arguments when reporting max_age dm integrity: don't use drivers that have CRYPTO_ALG_ALLOCATES_MEMORY
2020-12-04dm: remove invalid sparse __acquires and __releases annotationsMike Snitzer1-2/+0
Fixes sparse warnings: drivers/md/dm.c:508:12: warning: context imbalance in 'dm_prepare_ioctl' - wrong count at exit drivers/md/dm.c:543:13: warning: context imbalance in 'dm_unprepare_ioctl' - wrong count at exit Fixes: 971888c46993f ("dm: hold DM table for duration of ioctl rather than use blkdev_get") Cc: [email protected] Signed-off-by: Mike Snitzer <[email protected]>
2020-12-04dm: fix double RCU unlock in dm_dax_zero_page_range() error pathMike Snitzer1-2/+0
Remove redundant dm_put_live_table() in dm_dax_zero_page_range() error path to fix sparse warning: drivers/md/dm.c:1208:9: warning: context imbalance in 'dm_dax_zero_page_range' - unexpected unlock Fixes: cdf6cdcd3b99a ("dm,dax: Add dax zero_page_range operation") Cc: [email protected] Signed-off-by: Mike Snitzer <[email protected]>
2020-12-04dm: fix IO splittingMike Snitzer4-19/+18
Commit 882ec4e609c1 ("dm table: stack 'chunk_sectors' limit to account for target-specific splitting") caused a couple regressions: 1) Using lcm_not_zero() when stacking chunk_sectors was a bug because chunk_sectors must reflect the most limited of all devices in the IO stack. 2) DM targets that set max_io_len but that do _not_ provide an .iterate_devices method no longer had there IO split properly. And commit 5091cdec56fa ("dm: change max_io_len() to use blk_max_size_offset()") also caused a regression where DM no longer supported varied (per target) IO splitting. The implication being the potential for severely reduced performance for IO stacks that use a DM target like dm-cache to hide performance limitations of a slower device (e.g. one that requires 4K IO splitting). Coming full circle: Fix all these issues by discontinuing stacking chunk_sectors up using ti->max_io_len in dm_calculate_queue_limits(), add optional chunk_sectors override argument to blk_max_size_offset() and update DM's max_io_len() to pass ti->max_io_len to its blk_max_size_offset() call. Passing in an optional chunk_sectors override to blk_max_size_offset() allows for code reuse of block's centralized calculation for max IO size based on provided offset and split boundary. Fixes: 882ec4e609c1 ("dm table: stack 'chunk_sectors' limit to account for target-specific splitting") Fixes: 5091cdec56fa ("dm: change max_io_len() to use blk_max_size_offset()") Cc: [email protected] Reported-by: John Dorminy <[email protected]> Reported-by: Bruce Johnston <[email protected]> Reported-by: Kirill Tkhai <[email protected]> Reviewed-by: John Dorminy <[email protected]> Signed-off-by: Mike Snitzer <[email protected]> Reviewed-by: Jens Axboe <[email protected]>
2020-12-04Merge tag 'mac80211-for-net-2020-12-04' of ↵Jakub Kicinski3-2/+4
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Three small fixes: * initialize some data to avoid using stack garbage * fix 6 GHz channel selection in mac80211 * correctly restart monitor mode interfaces in mac80211 * tag 'mac80211-for-net-2020-12-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211: mac80211: set SDATA_STATE_RUNNING for monitor interfaces cfg80211: initialize rekey_data mac80211: fix return value of ieee80211_chandef_he_6ghz_oper ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2020-12-04Merge tag 'drm-fixes-2020-12-04' of git://anongit.freedesktop.org/drm/drmLinus Torvalds20-185/+233
Pull drm fixes from Dave Airlie: "This week's regular fixes. i915 has fixes for a few races, use-after-free, and gpu hangs. Tegra just has some minor fixes that I didn't see much point in hanging on to. The nouveau fix is for all pre-nv50 cards and was reported a few times. Otherwise it's just some amdgpu, and a few misc fixes. Summary: amdgpu: - SMU11 manual fan fix - Renoir display clock fix - VCN3 dynamic powergating fix i915: - Program mocs:63 for cache eviction on gen9 (Chris) - Protect context lifetime with RCU (Chris) - Split the breadcrumb spinlock between global and contexts (Chris) - Retain default context state across shrinking (Venkata) - Limit frequency drop to RPe on parking (Chris) - Return earlier from intel_modeset_init() without display (Jani) - Defer initial modeset until after GGTT is initialized (Chris) nouveau: - pre-nv50 regression fix rockchip: - uninitialised LVDS property fix omap: - bridge fix panel: - race fix mxsfb: - fence sync fix - modifiers fix tegra: - idr init fix - sor fixes - output/of cleanup fix" * tag 'drm-fixes-2020-12-04' of git://anongit.freedesktop.org/drm/drm: (22 commits) drm/amdgpu/vcn3.0: remove old DPG workaround drm/amdgpu/vcn3.0: stall DPG when WPTR/RPTR reset drm/amd/display: Init clock value by current vbios CLKs drm/amdgpu/pm/smu11: Fix fan set speed bug drm/i915/display: Defer initial modeset until after GGTT is initialised drm/i915/display: return earlier from intel_modeset_init() without display drm/i915/gt: Limit frequency drop to RPe on parking drm/i915/gt: Retain default context state across shrinking drm/i915/gt: Split the breadcrumb spinlock between global and contexts drm/i915/gt: Protect context lifetime with RCU drm/i915/gt: Program mocs:63 for cache eviction on gen9 drm/omap: sdi: fix bridge enable/disable drm/panel: sony-acx565akm: Fix race condition in probe drm/rockchip: Avoid uninitialized use of endpoint id in LVDS drm/tegra: sor: Disable clocks on error in tegra_sor_init() drm/nouveau: make sure ret is initialized in nouveau_ttm_io_mem_reserve drm: mxsfb: Implement .format_mod_supported drm: mxsfb: fix fence synchronization drm/tegra: output: Do not put OF node twice drm/tegra: replace idr_init() by idr_init_base() ...
2020-12-04tty: Fix ->session lockingJann Horn3-14/+41
Currently, locking of ->session is very inconsistent; most places protect it using the legacy tty mutex, but disassociate_ctty(), __do_SAK(), tiocspgrp() and tiocgsid() don't. Two of the writers hold the ctrl_lock (because they already need it for ->pgrp), but __proc_set_tty() doesn't do that yet. On a PREEMPT=y system, an unprivileged user can theoretically abuse this broken locking to read 4 bytes of freed memory via TIOCGSID if tiocgsid() is preempted long enough at the right point. (Other things might also go wrong, especially if root-only ioctls are involved; I'm not sure about that.) Change the locking on ->session such that: - tty_lock() is held by all writers: By making disassociate_ctty() hold it. This should be fine because the same lock can already be taken through the call to tty_vhangup_session(). The tricky part is that we need to shorten the area covered by siglock to be able to take tty_lock() without ugly retry logic; as far as I can tell, this should be fine, since nothing in the signal_struct is touched in the `if (tty)` branch. - ctrl_lock is held by all writers: By changing __proc_set_tty() to hold the lock a little longer. - All readers that aren't holding tty_lock() hold ctrl_lock: By adding locking to tiocgsid() and __do_SAK(), and expanding the area covered by ctrl_lock in tiocspgrp(). Cc: [email protected] Signed-off-by: Jann Horn <[email protected]> Reviewed-by: Jiri Slaby <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2020-12-04tty: Fix ->pgrp locking in tiocspgrp()Jann Horn1-2/+2
tiocspgrp() takes two tty_struct pointers: One to the tty that userspace passed to ioctl() (`tty`) and one to the TTY being changed (`real_tty`). These pointers are different when ioctl() is called with a master fd. To properly lock real_tty->pgrp, we must take real_tty->ctrl_lock. This bug makes it possible for racing ioctl(TIOCSPGRP, ...) calls on both sides of a PTY pair to corrupt the refcount of `struct pid`, leading to use-after-free errors. Fixes: 47f86834bbd4 ("redo locking of tty->pgrp") CC: [email protected] Signed-off-by: Jann Horn <[email protected]> Reviewed-by: Jiri Slaby <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>