Age | Commit message (Collapse) | Author | Files | Lines |
|
The paired pte_unmap() call is missing before the
dev_pagemap_mapping_shift() returns. So fix it.
David says:
"I guess this code never runs on 32bit / highmem, that's why we didn't
notice so far".
[[email protected]: cleanup]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Qi Zheng <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Naoya Horiguchi <[email protected]>
Cc: Muchun Song <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Currently, the asan-stack parameter is only passed along if
CFLAGS_KASAN_SHADOW is not empty, which requires KASAN_SHADOW_OFFSET to
be defined in Kconfig so that the value can be checked. In RISC-V's
case, KASAN_SHADOW_OFFSET is not defined in Kconfig, which means that
asan-stack does not get disabled with clang even when CONFIG_KASAN_STACK
is disabled, resulting in large stack warnings with allmodconfig:
drivers/video/fbdev/omap2/omapfb/displays/panel-lgphilips-lb035q02.c:117:12: error: stack frame size (14400) exceeds limit (2048) in function 'lb035q02_connect' [-Werror,-Wframe-larger-than]
static int lb035q02_connect(struct omap_dss_device *dssdev)
^
1 error generated.
Ensure that the value of CONFIG_KASAN_STACK is always passed along to
the compiler so that these warnings do not happen when
CONFIG_KASAN_STACK is disabled.
Link: https://github.com/ClangBuiltLinux/linux/issues/1453
References: 6baec880d7a5 ("kasan: turn off asan-stack for clang-8 and earlier")
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Marco Elver <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
If X2TLB=y (CPU_SHX2=y or CPU_SHX3=y, e.g. migor_defconfig), pgd_t.pgd
is "unsigned long long", causing:
In file included from arch/sh/include/asm/pgtable.h:13,
from include/linux/pgtable.h:6,
from include/linux/mm.h:33,
from arch/sh/kernel/asm-offsets.c:14:
arch/sh/include/asm/pgtable-3level.h: In function `pud_pgtable':
arch/sh/include/asm/pgtable-3level.h:37:9: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
37 | return (pmd_t *)pud_val(pud);
| ^
Fix this by adding an intermediate cast to "unsigned long", which is
basically what the old code did before.
Link: https://lkml.kernel.org/r/2c2eef3c9a2f57e5609100a4864715ccf253d30f.1631713483.git.geert+renesas@glider.be
Fixes: 9cf6fa2458443118 ("mm: rename pud_page_vaddr to pud_pgtable and make it return pmd_t *")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Tested-by: Daniel Palmer <[email protected]>
Acked-by: Rob Landley <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: "Aneesh Kumar K . V" <[email protected]>
Cc: Jacopo Mondi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Sync up MR_DEMOTION to migrate_reason_names and add a synch prompt.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 26aa2d199d6f ("mm/migrate: demote pages during reclaim")
Signed-off-by: Weizhao Ouyang <[email protected]>
Reviewed-by: "Huang, Ying" <[email protected]>
Reviewed-by: John Hubbard <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Mina Almasry <[email protected]>
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Wei Xu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Sync up MR_CONTIG_RANGE and MR_LONGTERM_PIN to migrate_reason_names.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 310253514bbf ("mm/migrate: rename migration reason MR_CMA to MR_CONTIG_RANGE")
Fixes: d1e153fea2a8 ("mm/gup: migrate pinned pages out of movable zone")
Signed-off-by: Weizhao Ouyang <[email protected]>
Reviewed-by: "Huang, Ying" <[email protected]>
Reviewed-by: John Hubbard <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Mina Almasry <[email protected]>
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Wei Xu <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The kernel test robot reported the regression of fio.write_iops[1] with
commit 8cc621d2f45d ("mm: fs: invalidate BH LRU during page migration").
Since lru_add_drain is called frequently, invalidate bh_lrus there could
increase bh_lrus cache miss ratio, which needs more IO in the end.
This patch moves the bh_lrus invalidation from the hot path( e.g.,
zap_page_range, pagevec_release) to cold path(i.e., lru_add_drain_all,
lru_cache_disable).
Zhengjun Xing confirmed
"I test the patch, the regression reduced to -2.9%"
[1] https://lore.kernel.org/lkml/20210520083144.GD14190@xsang-OptiPlex-9020/
[2] 8cc621d2f45d, mm: fs: invalidate BH LRU during page migration
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Minchan Kim <[email protected]>
Reported-by: kernel test robot <[email protected]>
Reviewed-by: Chris Goldsworthy <[email protected]>
Tested-by: "Xing, Zhengjun" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Building Linux for ppc64le with Ubuntu clang version
12.0.0-3ubuntu1~21.04.1 shows the warning below.
arch/powerpc/boot/inffast.c:20:1: warning: unused function 'get_unaligned16' [-Wunused-function]
get_unaligned16(const unsigned short *p)
^
1 warning generated.
Fix it by moving the check from the preprocessor to C, so the compiler
sees the use.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Paul Menzel <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>
Tested-by: Nathan Chancellor <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Zhen Lei <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Idle page tracking can also be used for process address space, not only
file mappings.
Without this change, using with '-i' option for process address space
encounters below errors reported.
$ sudo ./page-types -p $(pidof bash) -i
mark page idle: Bad file descriptor
mark page idle: Bad file descriptor
mark page idle: Bad file descriptor
mark page idle: Bad file descriptor
...
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Changbin Du <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Fix the following build failure reported in [1] by adding a conditional
definition of EM_RISCV in order to allow cross-compilation on machines
which do not have EM_RISCV definition in their host.
scripts/sorttable.c:352:7: error: use of undeclared identifier 'EM_RISCV'
EM_RISCV was added to <elf.h> in glibc 2.24 so builds on systems with
glibc headers < 2.24 should show this error.
[[email protected]: changelog addition]
Link: https://lore.kernel.org/lkml/[email protected]/ [1]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 54fed35fd393 ("riscv: Enable BUILDTIME_TABLE_SORT")
Signed-off-by: Miles Chen <[email protected]>
Reported-by: Stefan Wahren <[email protected]>
Tested-by: Stefan Wahren <[email protected]>
Reviewed-by: Jisheng Zhang <[email protected]>
Cc: Michal Kubecek <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Markus Mayer <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
ocfs2_data_convert_worker() is currently dropping any cached acl info
for FILE before down-converting meta lock. It should also drop for
DIRECTORY. Otherwise the second acl lookup returns the cached one (from
VFS layer) which could be already stale.
The problem we are seeing is that the acl changes on one node doesn't
get refreshed on other nodes in the following case:
Node 1 Node 2
-------------- ----------------
getfacl dir1
getfacl dir1 <-- this is OK
setfacl -m u:user1:rwX dir1
getfacl dir1 <-- see the change for user1
getfacl dir1 <-- can't see change for user1
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Wengang Wang <[email protected]>
Reviewed-by: Joseph Qi <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Gang He <[email protected]>
Cc: Jun Piao <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
In the case of SHMEM_HUGE_WITHIN_SIZE, the page index is not rounded up
correctly. When the page index points to the first page in a huge page,
round_up() cannot bring it to the end of the huge page, but to the end
of the previous one.
An example:
HPAGE_PMD_NR on my machine is 512(2 MB huge page size). After
allcoating a 3000 KB buffer, I access it at location 2050 KB. In
shmem_is_huge(), the corresponding index happens to be 512. After
rounded up by HPAGE_PMD_NR, it will still be 512 which is smaller than
i_size, and shmem_is_huge() will return true. As a result, my buffer
takes an additional huge page, and that shouldn't happen when
shmem_enabled is set to within_size.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: f3f0e1d2150b2b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Liu Yuntao <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Acked-by: Hugh Dickins <[email protected]>
Cc: wuxu.wu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
xtensa frame size is larger than the frame size for almost all other
architectures. This results in more than 50 "the frame size of <n> is
larger than 1024 bytes" errors when trying to build xtensa:allmodconfig.
Increase frame size for xtensa to 1536 bytes to avoid compile errors due
to frame size limits.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
Reviewed-by: Max Filippov <[email protected]>
Cc: Chris Zankel <[email protected]>
Cc: David Laight <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
gcc knows the true length too, and rightfully complains.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Adam Borowski <[email protected]>
Cc: SeongJae Park <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
In the main KASAN config option CC_HAS_WORKING_NOSANITIZE_ADDRESS is
checked for instrumentation-based modes. However, if
HAVE_ARCH_KASAN_HW_TAGS is true all modes may still be selected.
To fix, also make the software modes depend on
CC_HAS_WORKING_NOSANITIZE_ADDRESS.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 6a63a63ff1ac ("kasan: introduce CONFIG_KASAN_HW_TAGS")
Signed-off-by: Marco Elver <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Aleksandr Nogikh <[email protected]>
Cc: Taras Madan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Commit fcc00621d88b ("mm/hwpoison: retry with shake_page() for
unhandlable pages") changed the return value of __get_hwpoison_page() to
retry for transiently unhandlable cases. However, __get_hwpoison_page()
currently fails to properly judge buddy pages as handlable, so hard/soft
offline for buddy pages always fail as "unhandlable page". This is
totally regrettable.
So let's add is_free_buddy_page() in HWPoisonHandlable(), so that
__get_hwpoison_page() returns different return values between buddy
pages and unhandlable pages as intended.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: fcc00621d88b ("mm/hwpoison: retry with shake_page() for unhandlable pages")
Signed-off-by: Naoya Horiguchi <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Reviewed-by: Yang Shi <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
From recently open/accept are now able to manipulate fixed file table,
but it's inconsistent that close can't. Close the gap, keep API same as
with open/accept, i.e. via sqe->file_slot.
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
- fix a regression in GPIO ACPI on HP ElitePad 1000 G2 where the
gpio_set_debounce_timeout() now returns a fatal error if the specific
debounce period is not supported by the driver instead of just
emitting a warning
- fix return values of irq_mask/unmask() callbacks in gpio-uniphier
- fix hwirq calculation in gpio-aspeed-sgpio
- fix two issues in gpio-rockchip: only make the extended debounce
support available for v2 and remove a redundant BIT() usage
* tag 'gpio-fixes-for-v5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
gpio/rockchip: fix get_direction value handling
gpio/rockchip: extended debounce support is only available on v2
gpio: gpio-aspeed-sgpio: Fix wrong hwirq in irq handler.
gpio: uniphier: Fix void functions to remove return value
gpiolib: acpi: Make set-debounce-timeout failures non fatal
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull device properties framework fix from Rafael Wysocki:
"Fix software node refcount imbalance on device removal (Laurentiu
Tudor)"
* tag 'devprop-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
software node: balance refcount for managed software nodes
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Revert a recent commit related to memory management that turned out to
be problematic (Jia He)"
* tag 'acpi-5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "ACPI: Add memory semantics to acpi_os_map_memory()"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- It turns out that the optimised string routines merged in 5.14 are
not safe with in-kernel MTE (KASAN_HW_TAGS) because of reading beyond
the end of a string (strcmp, strncmp). Such reading may go across a
16 byte tag granule and cause a tag check fault. When KASAN_HW_TAGS
is enabled, use the generic strcmp/strncmp C implementation.
- An errata workaround for ThunderX relied on the CPU capabilities
being enabled in a specific order. This disappeared with the
automatic generation of the cpucaps.h file (sorted alphabetically).
Fix it by checking the current CPU only rather than the system-wide
capability.
- Add system_supports_mte() checks on the kernel entry/exit path and
thread switching to avoid unnecessary barriers and function calls on
systems where MTE is not supported.
- kselftests: skip arm64 tests if the required features are missing.
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: Restore forced disabling of KPTI on ThunderX
kselftest/arm64: signal: Skip tests if required features are missing
arm64: Mitigate MTE issues with str{n}cmp()
arm64: add MTE supported check to thread switching and syscall entry/exit
|
|
Pull ceph fix from Ilya Dryomov:
"A fix for a potential array out of bounds access from Dan"
* tag 'ceph-for-5.15-rc3' of git://github.com/ceph/ceph-client:
ceph: fix off by one bugs in unsafe_request_wait()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull misc filesystem fixes from Jan Kara:
"A for ext2 sleep in atomic context in case of some fs problems and a
cleanup of an invalidate_lock initialization"
* tag 'fixes_for_v5.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
ext2: fix sleeping in atomic bugs on error
mm: Fully initialize invalidate_lock, amend lock class later
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
"Followups to nodev root stuff from this merge window"
* 'work.init' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
init: don't panic if mount_nodev_root failed
init/do_mounts.c: Harden split_fs_names() against buffer overflow
|
|
Pull drm fixes from Dave Airlie:
"Quiet week this week, just some i915 and amd fixes, just getting ready
for my all nighter maintainer summit!
Summary:
i915:
- Fix ADL-P memory bandwidth parameters
- Fix memory corruption due to a double free
- Fix memory leak in DMC firmware handling
amdgpu:
- Update MAINTAINERS entry for powerplay
- Fix empty macros
- SI DPM fix
amdkfd:
- SVM fixes
- DMA mapping fix"
* tag 'drm-fixes-2021-09-24' of git://anongit.freedesktop.org/drm/drm:
drm/amdkfd: fix svm_migrate_fini warning
drm/amdkfd: handle svm migrate init error
drm/amd/pm: Update intermediate power state for SI
drm/amdkfd: fix dma mapping leaking warning
drm/amdkfd: SVM map to gpus check vma boundary
MAINTAINERS: fix up entry for AMD Powerplay
drm/amd/display: fix empty debug macros
drm/i915: Free all DMC payloads
drm/i915: Move __i915_gem_free_object to ttm_bo_destroy
drm/i915: Update memory bandwidth parameters
|
|
When running ->fallocate(), blkdev_fallocate() should hold
mapping->invalidate_lock to prevent page cache from being accessed,
otherwise stale data may be read in page cache.
Without this patch, blktests block/009 fails sometimes. With this patch,
block/009 can pass always.
Also as Jan pointed out, no pages can be created in the discarded area
while you are holding the invalidate_lock, so remove the 2nd
truncate_bdev_range().
Cc: Jan Kara <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
There is an use-after-free problem triggered by following process:
P1(sda) P2(sdb)
echo 0 > /sys/block/sdb/trace/enable
blk_trace_remove_queue
synchronize_rcu
blk_trace_free
relay_close
rcu_read_lock
__blk_add_trace
trace_note_tsk
(Iterate running_trace_list)
relay_close_buf
relay_destroy_buf
kfree(buf)
trace_note(sdb's bt)
relay_reserve
buf->offset <- nullptr deference (use-after-free) !!!
rcu_read_unlock
[ 502.714379] BUG: kernel NULL pointer dereference, address:
0000000000000010
[ 502.715260] #PF: supervisor read access in kernel mode
[ 502.715903] #PF: error_code(0x0000) - not-present page
[ 502.716546] PGD 103984067 P4D 103984067 PUD 17592b067 PMD 0
[ 502.717252] Oops: 0000 [#1] SMP
[ 502.720308] RIP: 0010:trace_note.isra.0+0x86/0x360
[ 502.732872] Call Trace:
[ 502.733193] __blk_add_trace.cold+0x137/0x1a3
[ 502.733734] blk_add_trace_rq+0x7b/0xd0
[ 502.734207] blk_add_trace_rq_issue+0x54/0xa0
[ 502.734755] blk_mq_start_request+0xde/0x1b0
[ 502.735287] scsi_queue_rq+0x528/0x1140
...
[ 502.742704] sg_new_write.isra.0+0x16e/0x3e0
[ 502.747501] sg_ioctl+0x466/0x1100
Reproduce method:
ioctl(/dev/sda, BLKTRACESETUP, blk_user_trace_setup[buf_size=127])
ioctl(/dev/sda, BLKTRACESTART)
ioctl(/dev/sdb, BLKTRACESETUP, blk_user_trace_setup[buf_size=127])
ioctl(/dev/sdb, BLKTRACESTART)
echo 0 > /sys/block/sdb/trace/enable &
// Add delay(mdelay/msleep) before kernel enters blk_trace_free()
ioctl$SG_IO(/dev/sda, SG_IO, ...)
// Enters trace_note_tsk() after blk_trace_free() returned
// Use mdelay in rcu region rather than msleep(which may schedule out)
Remove blk_trace from running_list before calling blk_trace_free() by
sysfs if blk_trace is at Blktrace_running state.
Fixes: c71a896154119f ("blktrace: add ftrace plugin")
Signed-off-by: Zhihao Cheng <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
rq_qos framework is only applied on request based driver, so:
1) rq_qos_done_bio() needn't to be called for bio based driver
2) rq_qos_done_bio() needn't to be called for bio which isn't tracked,
such as bios ended from error handling code.
Especially in bio_endio():
1) request queue is referred via bio->bi_bdev->bd_disk->queue, which
may be gone since request queue refcount may not be held in above two
cases
2) q->rq_qos may be freed in blk_cleanup_queue() when calling into
__rq_qos_done_bio()
Fix the potential kernel panic by not calling rq_qos_ops->done_bio if
the bio isn't tracked. This way is safe because both ioc_rqos_done_bio()
and blkcg_iolatency_done_bio() are nop if the bio isn't tracked.
Reported-by: Yu Kuai <[email protected]>
Cc: [email protected]
Signed-off-by: Ming Lei <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Acked-by: Tejun Heo <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
We don't retry short writes and so we would never get to async setup in
io_write() in that case. Thus ret2 > 0 is always false and
iov_iter_advance() is never used. Apparently, the same is found by
Coverity, which complains on the code.
Fixes: cd65869512ab ("io_uring: use iov_iter state save/restore helpers")
Reported-by: Dave Jones <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/5b33e61034748ef1022766efc0fb8854cfcf749c.1632500058.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
There's no reason to punt it unconditionally, we just need to ensure that
the submit lock grabbing is conditional.
Fixes: 05f3fb3c5397 ("io_uring: avoid ring quiesce for fixed file set unregister and update")
Signed-off-by: Jens Axboe <[email protected]>
|
|
For each provided buffer, we allocate a struct io_buffer to hold the
data associated with it. As a large number of buffers can be provided,
account that data with memcg.
Fixes: ddf0322db79c ("io_uring: add IORING_OP_PROVIDE_BUFFERS")
Signed-off-by: Jens Axboe <[email protected]>
|
|
If we have a lot of threads and rings, the tctx list can get quite big.
This is especially true if we keep creating new threads and rings.
Likewise for the provided buffers list. Be nice and insert a conditional
reschedule point while iterating the nodes for deletion.
Link: https://lore.kernel.org/io-uring/[email protected]/
Reported-by: [email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
For multishot mode, there may be cases like:
iowq original context
io_poll_add
_arm_poll()
mask = vfs_poll() is not 0
if mask
(2) io_poll_complete()
compl_unlock
(interruption happens
tw queued to original
context)
io_poll_task_func()
compl_lock
(3) done = io_poll_complete() is true
compl_unlock
put req ref
(1) if (poll->flags & EPOLLONESHOT)
put req ref
EPOLLONESHOT flag in (1) may be from (2) or (3), so there are multiple
combinations that can cause ref underfow.
Let's address it by:
- check the return value in (2) as done
- change (1) to if (done)
in this way, we only do ref put in (1) if 'oneshot flag' is from
(2)
- do poll.done check in io_poll_task_func(), so that we won't put ref
for the second time.
Signed-off-by: Hao Xu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
We should set EPOLLONESHOT if cqring_fill_event() returns false since
io_poll_add() decides to put req or not by it.
Fixes: 5082620fb2ca ("io_uring: terminate multishot poll for CQ ring overflow")
Signed-off-by: Hao Xu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
If poll arming and poll completion runs in parallel, there maybe races.
For instance, run io_poll_add in iowq and io_poll_task_func in original
context, then:
iowq original context
io_poll_add
vfs_poll
(interruption happens
tw queued to original
context) io_poll_task_func
generate cqe
del from cancel_hash[]
if !poll.done
insert to cancel_hash[]
The entry left in cancel_hash[], similar case for fast poll.
Fix it by set poll.done = true when del from cancel_hash[].
Fixes: 5082620fb2ca ("io_uring: terminate multishot poll for CQ ring overflow")
Signed-off-by: Hao Xu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
Dave reports that a coredumping workload gets stuck in 5.15-rc2, and
identified the culprit in the Fixes line below. The problem is that
relying solely on fatal_signal_pending() to gate whether to exit or not
fails miserably if a process gets eg SIGILL sent. Don't exclusively
rely on fatal signals, also check if the thread group is exiting.
Fixes: 15e20db2e0ce ("io-wq: only exit on fatal signals")
Reported-by: Dave Chinner <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Without this call an xarray entry is leaked when the vfio_ap device is
unprobed. It was missed when the below patch was rebased across the
dev_set patch. Keep the remove function in the same order as the error
unwind in probe.
Fixes: eb0feefd4c02 ("vfio/ap_ops: Convert to use vfio_register_group_dev()")
Reviewed-by: Christoph Hellwig <[email protected]>
Tested-by: Tony Krowiak <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Tony Krowiak <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alex Williamson <[email protected]>
|
|
Usnic VF doesn't need lock in atomic context to create QPs, so it is safe
to use mutex instead of spinlock. Such change fixes the following smatch
error.
Smatch static checker warning:
lib/kobject.c:289 kobject_set_name_vargs()
warn: sleeping in atomic context
Fixes: 514aee660df4 ("RDMA: Globally allocate and release QP memory")
Link: https://lore.kernel.org/r/2a0e295786c127e518ebee8bb7cafcb819a625f6.1631520231.git.leonro@nvidia.com
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Reviewed-by: Håkon Bugge <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
SERIAL_CORE_CONSOLE depends on TTY so EARLY_PRINTK should also
depend on TTY so that it does not select SERIAL_CORE_CONSOLE
inadvertently.
WARNING: unmet direct dependencies detected for SERIAL_CORE_CONSOLE
Depends on [n]: TTY [=n] && HAS_IOMEM [=y]
Selected by [y]:
- EARLY_PRINTK [=y]
Fixes: e8bf5bc776ed ("nios2: add early printk support")
Signed-off-by: Randy Dunlap <[email protected]>
Cc: Dinh Nguyen <[email protected]>
Signed-off-by: Dinh Nguyen <[email protected]>
|
|
Fix double free_netdev when mhi_prepare_for_transfer fails.
Fixes: 3ffec6a14f24 ("net: Add mhi-net driver")
Signed-off-by: Daniele Palmas <[email protected]>
Reviewed-by: Manivannan Sadhasivam <[email protected]>
Reviewed-by: Loic Poulain <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
After commit 05b35e7eb9a1 ("smsc95xx: add phylib support"), link changes
are no longer propagated to usbnet. As a result, rx URB allocation won't
happen until there is a packet sent out first (this might never happen,
e.g. running just ssh server with a static IP). Fix by triggering usbnet
EVENT_LINK_CHANGE.
Fixes: 05b35e7eb9a1 ("smsc95xx: add phylib support")
Signed-off-by: Aaro Koskinen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 5.15:
- keep ctrl->namespaces ordered (me)
- fix incorrect h2cdata pdu offset accounting in nvme-tcp
(Sagi Grimberg)
- handled updated hw_queues in nvme-fc more carefully (Daniel Wagner,
James Smart)"
* tag 'nvme-5.15-2021-09-24' of git://git.infradead.org/nvme:
nvme: keep ctrl->namespaces ordered
nvme-tcp: fix incorrect h2cdata pdu offset accounting
nvme-fc: remove freeze/unfreeze around update_nr_hw_queues
nvme-fc: avoid race between time out and tear down
nvme-fc: update hardware queues before using them
|
|
Multipath RTA_FLOW is embedded in nexthop. Dump it in fib_add_nexthop()
to get the length of rtnexthop correct.
Fixes: b0f60193632e ("ipv4: Refactor nexthop attributes in fib_dump_info")
Signed-off-by: Xiao Liang <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The enetc phylink .mac_config handler intends to clear the IFMODE field
(bits 1:0) of the PM0_IF_MODE register, but incorrectly clears all the
other fields instead.
For normal operation, the bug was inconsequential, due to the fact that
we write the PM0_IF_MODE register in two stages, first in
phylink .mac_config (which incorrectly cleared out a bunch of stuff),
then we update the speed and duplex to the correct values in
phylink .mac_link_up.
Judging by the code (not tested), it looks like maybe loopback mode was
broken, since this is one of the settings in PM0_IF_MODE which is
incorrectly cleared.
Fixes: c76a97218dcb ("net: enetc: force the RGMII speed and duplex instead of operating in inband mode")
Reported-by: Pavel Machek (CIP) <[email protected]>
Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/urgent
Pull irqchip fixes from Marc Zyngier:
- Work around a bad GIC integration on a Renesas platform, where the
interconnect cannot deal with byte-sized MMIO accesses
- Cleanup another Renesas driver abusing the comma operator
- Fix a potential GICv4 memory leak on an error path
- Make the type of 'size' consistent with the rest of the code in
__irq_domain_add()
- Fix a regression in the Armada 370-XP IPI path
- Fix the build for the obviously unloved goldfish-pic
- Some documentation fixes
Link: https://lore.kernel.org/r/[email protected]
|
|
The return value of devm_clk_get should in general be propagated to
upper layer. In this case the clk is optional, use the appropriate
wrapper instead of interpreting all errors as "The optional clk is not
available".
Signed-off-by: Uwe Kleine-König <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
|
|
Old code produces -24999 for 0b1110011100000000 input in standard format due to
always rounding up rather than "away from zero".
Use the common macro for division, unify and simplify the conversion code along
the way.
Fixes: 9410700b881f ("hwmon: Add driver for Texas Instruments TMP421/422/423 sensor chips")
Signed-off-by: Paul Fertser <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
|
|
For both local and remote sensors all the supported ICs can report an
"undervoltage lockout" condition which means the conversion wasn't
properly performed due to insufficient power supply voltage and so the
measurement results can't be trusted.
Fixes: 9410700b881f ("hwmon: Add driver for Texas Instruments TMP421/422/423 sensor chips")
Signed-off-by: Paul Fertser <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
|
|
Function i2c_smbus_read_byte_data() can return a negative error number
instead of the data read if I2C transaction failed for whatever reason.
Lack of error checking can lead to serious issues on production
hardware, e.g. errors treated as temperatures produce spurious critical
temperature-crossed-threshold errors in BMC logs for OCP server
hardware. The patch was tested with Mellanox OCP Mezzanine card
emulating TMP421 protocol for temperature sensing which sometimes leads
to I2C protocol error during early boot up stage.
Fixes: 9410700b881f ("hwmon: Add driver for Texas Instruments TMP421/422/423 sensor chips")
Cc: [email protected]
Signed-off-by: Paul Fertser <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[groeck: dropped unnecessary line breaks]
Signed-off-by: Guenter Roeck <[email protected]>
|
|
gcc 8.3 and 5.4 throw this:
In function 'modify_qp_init_to_rtr',
././include/linux/compiler_types.h:322:38: error: call to '__compiletime_assert_1859' declared with attribute error: FIELD_PREP: value too large for the field
_compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
[..]
drivers/infiniband/hw/hns/hns_roce_common.h:91:52: note: in expansion of macro 'FIELD_PREP'
*((__le32 *)ptr + (field_h) / 32) |= cpu_to_le32(FIELD_PREP( \
^~~~~~~~~~
drivers/infiniband/hw/hns/hns_roce_common.h:95:39: note: in expansion of macro '_hr_reg_write'
#define hr_reg_write(ptr, field, val) _hr_reg_write(ptr, field, val)
^~~~~~~~~~~~~
drivers/infiniband/hw/hns/hns_roce_hw_v2.c:4412:2: note: in expansion of macro 'hr_reg_write'
hr_reg_write(context, QPC_LP_PKTN_INI, lp_pktn_ini);
Because gcc has miscalculated the constantness of lp_pktn_ini:
mtu = ib_mtu_enum_to_int(ib_mtu);
if (WARN_ON(mtu < 0)) [..]
lp_pktn_ini = ilog2(MAX_LP_MSG_LEN / mtu);
Since mtu is limited to {256,512,1024,2048,4096} lp_pktn_ini is between 4
and 8 which is compatible with the 4 bit field in the FIELD_PREP.
Work around this broken compiler by adding a 'can never be true'
constraint on lp_pktn_ini's value which clears out the problem.
Fixes: f0cb411aad23 ("RDMA/hns: Use new interface to modify QP context")
Link: https://lore.kernel.org/r/[email protected]
Reported-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
Add a m68k-only set_fc helper to set the SFC and DFC registers for the
few places that need to override it for special MM operations, but
disconnect that from the deprecated kernel-wide set_fs() API.
Note that the SFC/DFC registers are context switched, so there is no need
to disable preemption.
Partially based on an earlier patch from
Linus Torvalds <[email protected]>.
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: Michael Schmitz <[email protected]>
Tested-by: Michael Schmitz <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Geert Uytterhoeven <[email protected]>
|