Age | Commit message (Collapse) | Author | Files | Lines |
|
ocfs2_dentry_attach_lock() can be executed in parallel threads against the
same dentry. Make that race safe. The race is like this:
thread A thread B
(A1) enter ocfs2_dentry_attach_lock,
seeing dentry->d_fsdata is NULL,
and no alias found by
ocfs2_find_local_alias, so kmalloc
a new ocfs2_dentry_lock structure
to local variable "dl", dl1
.....
(B1) enter ocfs2_dentry_attach_lock,
seeing dentry->d_fsdata is NULL,
and no alias found by
ocfs2_find_local_alias so kmalloc
a new ocfs2_dentry_lock structure
to local variable "dl", dl2.
......
(A2) set dentry->d_fsdata with dl1,
call ocfs2_dentry_lock() and increase
dl1->dl_lockres.l_ro_holders to 1 on
success.
......
(B2) set dentry->d_fsdata with dl2
call ocfs2_dentry_lock() and increase
dl2->dl_lockres.l_ro_holders to 1 on
success.
......
(A3) call ocfs2_dentry_unlock()
and decrease
dl2->dl_lockres.l_ro_holders to 0
on success.
....
(B3) call ocfs2_dentry_unlock(),
decreasing
dl2->dl_lockres.l_ro_holders, but
see it's zero now, panic
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Wengang Wang <[email protected]>
Reported-by: Daniel Sobe <[email protected]>
Tested-by: Daniel Sobe <[email protected]>
Reviewed-by: Changwei Ge <[email protected]>
Reviewed-by: Joseph Qi <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Gang He <[email protected]>
Cc: Jun Piao <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Johannes pointed out that after commit 886cf1901db9 ("mm: move
recent_rotated pages calculation to shrink_inactive_list()") we lost all
zone_reclaim_stat::recent_rotated history.
This fixes it.
Link: http://lkml.kernel.org/r/155905972210.26456.11178359431724024112.stgit@localhost.localdomain
Fixes: 886cf1901db9 ("mm: move recent_rotated pages calculation to shrink_inactive_list()")
Signed-off-by: Kirill Tkhai <[email protected]>
Reported-by: Johannes Weiner <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Cc: Daniel Jordan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
If mlockall() is called with only MCL_ONFAULT as flag, it removes any
previously applied lockings and does nothing else.
This behavior is counter-intuitive and doesn't match the Linux man page.
For mlockall():
EINVAL Unknown flags were specified or MCL_ONFAULT was specified
without either MCL_FUTURE or MCL_CURRENT.
Consequently, return the error EINVAL, if only MCL_ONFAULT is passed.
That way, applications will at least detect that they are calling
mlockall() incorrectly.
Link: http://lkml.kernel.org/r/[email protected]
Fixes: b0f205c2a308 ("mm: mlock: add mlock flags to enable VM_LOCKONFAULT usage")
Signed-off-by: Stefan Potyra <[email protected]>
Reviewed-by: Daniel Jordan <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
At least for ARM64 kernels compiled with the crosstoolchain from
Debian/stretch or with the toolchain from kernel.org the line number is
not decoded correctly by 'decode_stacktrace.sh':
$ echo "[ 136.513051] f1+0x0/0xc [kcrash]" | \
CROSS_COMPILE=/opt/gcc-8.1.0-nolibc/aarch64-linux/bin/aarch64-linux- \
./scripts/decode_stacktrace.sh /scratch/linux-arm64/vmlinux \
/scratch/linux-arm64 \
/nfs/debian/lib/modules/4.20.0-devel
[ 136.513051] f1 (/linux/drivers/staging/kcrash/kcrash.c:68) kcrash
If addr2line from the toolchain is used the decoded line number is correct:
[ 136.513051] f1 (/linux/drivers/staging/kcrash/kcrash.c:57) kcrash
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Manuel Traut <[email protected]>
Acked-by: Konstantin Khlebnikov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Syzbot reported following memory leak:
ffffffffda RBX: 0000000000000003 RCX: 0000000000441f79
BUG: memory leak
unreferenced object 0xffff888114f26040 (size 32):
comm "syz-executor626", pid 7056, jiffies 4294948701 (age 39.410s)
hex dump (first 32 bytes):
40 60 f2 14 81 88 ff ff 40 60 f2 14 81 88 ff ff @`......@`......
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
slab_post_alloc_hook mm/slab.h:439 [inline]
slab_alloc mm/slab.c:3326 [inline]
kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
kmalloc include/linux/slab.h:547 [inline]
__memcg_init_list_lru_node+0x58/0xf0 mm/list_lru.c:352
memcg_init_list_lru_node mm/list_lru.c:375 [inline]
memcg_init_list_lru mm/list_lru.c:459 [inline]
__list_lru_init+0x193/0x2a0 mm/list_lru.c:626
alloc_super+0x2e0/0x310 fs/super.c:269
sget_userns+0x94/0x2a0 fs/super.c:609
sget+0x8d/0xb0 fs/super.c:660
mount_nodev+0x31/0xb0 fs/super.c:1387
fuse_mount+0x2d/0x40 fs/fuse/inode.c:1236
legacy_get_tree+0x27/0x80 fs/fs_context.c:661
vfs_get_tree+0x2e/0x120 fs/super.c:1476
do_new_mount fs/namespace.c:2790 [inline]
do_mount+0x932/0xc50 fs/namespace.c:3110
ksys_mount+0xab/0x120 fs/namespace.c:3319
__do_sys_mount fs/namespace.c:3333 [inline]
__se_sys_mount fs/namespace.c:3330 [inline]
__x64_sys_mount+0x26/0x30 fs/namespace.c:3330
do_syscall_64+0x76/0x1a0 arch/x86/entry/common.c:301
entry_SYSCALL_64_after_hwframe+0x44/0xa9
This is a simple off by one bug on the error path.
Link: http://lkml.kernel.org/r/[email protected]
Fixes: 60d3fd32a7a9 ("list_lru: introduce per-memcg lists")
Reported-by: [email protected]
Signed-off-by: Shakeel Butt <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Reviewed-by: Kirill Tkhai <[email protected]>
Cc: <[email protected]> [4.0+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The kernel test robot noticed a 26% will-it-scale pagefault regression
from commit 42a300353577 ("mm: memcontrol: fix recursive statistics
correctness & scalabilty"). This appears to be caused by bouncing the
additional cachelines from the new hierarchical statistics counters.
We can fix this by getting rid of the batched local counters instead.
Originally, there were *only* group-local counters, and they were fully
maintained per cpu. A reader of a stats file high up in the cgroup tree
would have to walk the entire subtree and collect each level's per-cpu
counters to get the recursive view. This was prohibitively expensive,
and so we switched to per-cpu batched updates of the local counters
during a983b5ebee57 ("mm: memcontrol: fix excessive complexity in
memory.stat reporting"), reducing the complexity from nr_subgroups *
nr_cpus to nr_subgroups.
With growing machines and cgroup trees, the tree walk itself became too
expensive for monitoring top-level groups, and this is when the culprit
patch added hierarchy counters on each cgroup level. When the per-cpu
batch size would be reached, both the local and the hierarchy counters
would get batch-updated from the per-cpu delta simultaneously.
This makes local and hierarchical counter reads blazingly fast, but it
unfortunately makes the write-side too cache line intense.
Since local counter reads were never a problem - we only centralized
them to accelerate the hierarchy walk - and use of the local counters
are becoming rarer due to replacement with hierarchical views (ongoing
rework in the page reclaim and workingset code), we can make those local
counters unbatched per-cpu counters again.
The scheme will then be as such:
when a memcg statistic changes, the writer will:
- update the local counter (per-cpu)
- update the batch counter (per-cpu). If the batch is full:
- spill the batch into the group's atomic_t
- spill the batch into all ancestors' atomic_ts
- empty out the batch counter (per-cpu)
when a local memcg counter is read, the reader will:
- collect the local counter from all cpus
when a hiearchy memcg counter is read, the reader will:
- read the atomic_t
We might be able to simplify this further and make the recursive
counters unbatched per-cpu counters as well (batch upward propagation,
but leave per-cpu collection to the readers), but that will require a
more in-depth analysis and testing of all the callsites. Deal with the
immediate regression for now.
Link: http://lkml.kernel.org/r/[email protected]
Fixes: 42a300353577 ("mm: memcontrol: fix recursive statistics correctness & scalabilty")
Signed-off-by: Johannes Weiner <[email protected]>
Reported-by: kernel test robot <[email protected]>
Tested-by: kernel test robot <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: Roman Gushchin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Commit d491f2b75237 ("PCI: PM: Avoid possible suspend-to-idle issue")
attempted to avoid a problem with devices whose drivers want them to
stay in D0 over suspend-to-idle and resume, but it did not go as far
as it should with that.
Namely, first of all, the power state of a PCI bridge with a
downstream device in D0 must be D0 (based on the PCI PM spec r1.2,
sec 6, table 6-1, if the bridge is not in D0, there can be no PCI
transactions on its secondary bus), but that is not actively enforced
during system-wide PM transitions, so use the skip_bus_pm flag
introduced by commit d491f2b75237 for that.
Second, the configuration of devices left in D0 (whatever the reason)
during suspend-to-idle need not be changed and attempting to put them
into D0 again by force is pointless, so explicitly avoid doing that.
Fixes: d491f2b75237 ("PCI: PM: Avoid possible suspend-to-idle issue")
Reported-by: Kai-Heng Feng <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Reviewed-by: Mika Westerberg <[email protected]>
Tested-by: Kai-Heng Feng <[email protected]>
|
|
Naveen N. Rao says:
====================
The first patch updates DIV64 overflow tests to properly detect error
conditions. The second patch fixes powerpc64 JIT to generate the proper
unsigned division instruction for BPF_ALU64.
====================
Acked-by: Sandipan Das <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
BPF_ALU64 div/mod operations are currently using signed division, unlike
BPF_ALU32 operations. Fix the same. DIV64 and MOD64 overflow tests pass
with this fix.
Fixes: 156d0e290e969c ("powerpc/ebpf/jit: Implement JIT compiler for extended BPF")
Cc: [email protected] # v4.8+
Signed-off-by: Naveen N. Rao <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
If the result of the division is LLONG_MIN, current tests do not detect
the error since the return value is truncated to a 32-bit value and ends
up being 0.
Signed-off-by: Naveen N. Rao <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
Sync the changes to the flags made in "bpf: simplify definition of
BPF_FIB_LOOKUP related flags" with the BPF UAPI headers.
Doing in a separate commit to ease syncing of github/libbpf.
Signed-off-by: Martynas Pumputis <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
Previously, the BPF_FIB_LOOKUP_{DIRECT,OUTPUT} flags in the BPF UAPI
were defined with the help of BIT macro. This had the following issues:
- In order to use any of the flags, a user was required to depend
on <linux/bits.h>.
- No other flag in bpf.h uses the macro, so it seems that an unwritten
convention is to use (1 << (nr)) to define BPF-related flags.
Fixes: 87f5fc7e48dd ("bpf: Provide helper to do forwarding lookups in kernel FIB table")
Signed-off-by: Martynas Pumputis <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
We can not depend on the tcon->open_file_lock here since in multiuser mode
we may have the same file/inode open via multiple different tcons.
The current code is race prone and will crash if one user deletes a file
at the same time a different user opens/create the file.
To avoid this we need to have a spinlock attached to the inode and not the tcon.
RHBZ: 1580165
CC: Stable <[email protected]>
Signed-off-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Steve French <[email protected]>
Reviewed-by: Pavel Shilovsky <[email protected]>
|
|
RH Bugzilla: 1702264
We need to protect so that the call to smb2_reconnect() in
smb2_reconnect_server() does not end up freeing the session
because it can lead to a use after free and crash.
Reviewed-by: Aurelien Aptel <[email protected]>
Cc: <[email protected]>
Signed-off-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Steve French <[email protected]>
Reviewed-by: Pavel Shilovsky <[email protected]>
|
|
current->mm can be non-NULL if a kthread calls use_mm(). Check for
PF_KTHREAD instead to decide when to store user mode FP state.
Fixes: 2722146eb784 ("x86/fpu: Remove fpu->initialized")
Reported-by: Peter Zijlstra <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Acked-by: Sebastian Andrzej Siewior <[email protected]>
Cc: Aubrey Li <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Nicolai Stange <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: x86-ml <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
https://git.linaro.org/people/daniel.lezcano/linux into timers/urgent
Pull timer fixes from Daniel Lezcano:
- Fix missing notrace leading to deadlock on arch_arm_timer (Julien Thierry)
- Fix compilation warning on timer-ti-dm (Philippe Mazenauer)
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- regression fixes (reverts) for module loading changes that turned out
to be incompatible with some userspace, from Benjamin Tissoires
- regression fix for special Logitech unifiying receiver 0xc52f, from
Hans de Goede
- a few device ID additions to logitech driver, from Hans de Goede
- fix for Bluetooth support on 2nd-gen Wacom Intuos Pro, from Jason
Gerecke
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: logitech-dj: Fix 064d:c52f receiver support
Revert "HID: core: Call request_module before doing device_add"
Revert "HID: core: Do not call request_module() in async context"
Revert "HID: Increase maximum report size allowed by hid_field_extract()"
HID: a4tech: fix horizontal scrolling
HID: hyperv: Add a module description line
HID: logitech-hidpp: Add support for the S510 remote control
HID: multitouch: handle faulty Elo touch device
HID: wacom: Sync INTUOSP2_BT touch state after each frame if necessary
HID: wacom: Correct button numbering 2nd-gen Intuos Pro over Bluetooth
HID: wacom: Send BTN_TOUCH in response to INTUOSP2_BT eraser contact
HID: wacom: Don't report anything prior to the tool entering range
HID: wacom: Don't set tool type until we're in range
HID: rmi: Use SET_REPORT request on control endpoint for Acer Switch 3 and 5
HID: logitech-hidpp: add support for the MX5500 keyboard
HID: logitech-dj: add support for the Logitech MX5500's Bluetooth Mini-Receiver
HID: i2c-hid: add iBall Aer3 to descriptor override
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus
ASoC: Fixes for v5.2
There's an awful lot of fixes here, almost all for the newly introduced
SoF DSP drivers (including a few things it turned up in shared code).
This is a large and complex piece of code so it's not surprising that
there have been quite a few issues here, fortunately things seem to have
mostly calmed down now. Otherwise there's just a smattering of small fixes.
|
|
Unfortunately, a couple of mistakes were made while implementing
Enlightened VMCS support, in particular, wrong clean fields were
used in copy_enlightened_to_vmcs12():
- exception_bitmap is covered by CONTROL_EXCPN;
- vm_exit_controls/pin_based_vm_exec_control/secondary_vm_exec_control
are covered by CONTROL_GRP1.
Fixes: 945679e301ea0 ("KVM: nVMX: add enlightened VMCS state")
Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Free the vfio_ccw_cmd_region on module exit.
Fixes: d5afd5d135c8 ("vfio-ccw: add handling for async channel instructions")
Signed-off-by: Farhan Ali <[email protected]>
Message-Id: <c0f39039d28af39ea2939391bf005e3495d890fd.1559576250.git.alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <[email protected]>
Signed-off-by: Heiko Carstens <[email protected]>
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.2-rc5:
- Fix DMC firmware input validation to avoid buffer overflow
- Fix perf register access whitelist for userspace
- Fix DSI panel on GPD MicroPC
- Fix per-pixel alpha with CCS
- Fix HDMI audio for SDVO
Signed-off-by: Daniel Vetter <[email protected]>
From: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The removal of CONFIG_LBDAF changed the type of sector_t from "unsigned
long" to "u64" aka "unsigned long long" on 64-bit platforms, leading to
a compiler warning regression:
drivers/block/ps3vram.c: In function ‘ps3vram_probe’:
drivers/block/ps3vram.c:770:23: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘sector_t {aka long long unsigned int}’ [-Wformat=]
Fix this by using "%llu" instead.
Fixes: 72deb455b5ec619f ("block: remove CONFIG_LBDAF")
Signed-off-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
We've received a bugreport that using LPM with ST1000LM024 drives leads
to system lockups. So it seems that these models are buggy in more then
1 way. Add NOLPM quirk to the existing quirks entry for BROKEN_FPDMA_AA.
BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1571330
Cc: [email protected]
Reviewed-by: Martin K. Petersen <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
When people set a writeback percent via sysfs file,
/sys/block/bcache<N>/bcache/writeback_percent
current code directly sets BCACHE_DEV_WB_RUNNING to dc->disk.flags
and schedules kworker dc->writeback_rate_update.
If there is no cache set attached to, the writeback kernel thread is
not running indeed, running dc->writeback_rate_update does not make
sense and may cause NULL pointer deference when reference cache set
pointer inside update_writeback_rate().
This patch checks whether the cache set point (dc->disk.c) is NULL in
sysfs interface handler, and only set BCACHE_DEV_WB_RUNNING and
schedule dc->writeback_rate_update when dc->disk.c is not NULL (it
means the cache device is attached to a cache set).
This problem might be introduced from initial bcache commit, but
commit 3fd47bfe55b0 ("bcache: stop dc->writeback_rate_update properly")
changes part of the original code piece, so I add 'Fixes: 3fd47bfe55b0'
to indicate from which commit this patch can be applied.
Fixes: 3fd47bfe55b0 ("bcache: stop dc->writeback_rate_update properly")
Reported-by: Bjørn Forsman <[email protected]>
Signed-off-by: Coly Li <[email protected]>
Reviewed-by: Bjørn Forsman <[email protected]>
Cc: [email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
Recently people report bcache code compiled with gcc9 is broken, one of
the buggy behavior I observe is that two adjacent 4KB I/Os should merge
into one but they don't. Finally it turns out to be a stack corruption
caused by macro PRECEDING_KEY().
See how PRECEDING_KEY() is defined in bset.h,
437 #define PRECEDING_KEY(_k) \
438 ({ \
439 struct bkey *_ret = NULL; \
440 \
441 if (KEY_INODE(_k) || KEY_OFFSET(_k)) { \
442 _ret = &KEY(KEY_INODE(_k), KEY_OFFSET(_k), 0); \
443 \
444 if (!_ret->low) \
445 _ret->high--; \
446 _ret->low--; \
447 } \
448 \
449 _ret; \
450 })
At line 442, _ret points to address of a on-stack variable combined by
KEY(), the life range of this on-stack variable is in line 442-446,
once _ret is returned to bch_btree_insert_key(), the returned address
points to an invalid stack address and this address is overwritten in
the following called bch_btree_iter_init(). Then argument 'search' of
bch_btree_iter_init() points to some address inside stackframe of
bch_btree_iter_init(), exact address depends on how the compiler
allocates stack space. Now the stack is corrupted.
Fixes: 0eacac22034c ("bcache: PRECEDING_KEY()")
Signed-off-by: Coly Li <[email protected]>
Reviewed-by: Rolf Fokkens <[email protected]>
Reviewed-by: Pierre JUHEN <[email protected]>
Tested-by: Shenghui Wang <[email protected]>
Tested-by: Pierre JUHEN <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Nix <[email protected]>
Cc: [email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
The in-memory representation of SVE and FPSIMD registers is
different: the FPSIMD V-registers are stored as single 128-bit
host-endian values, whereas SVE registers are stored in an
endianness-invariant byte order.
This means that the two representations differ when running on a
big-endian host. But we blindly copy data from one representation
to another when converting between the two, resulting in the
register contents being unintentionally byteswapped in certain
situations. Currently this can be triggered by the first SVE
instruction after a syscall, for example (though the potential
trigger points may vary in future).
So, fix the conversion functions fpsimd_to_sve(), sve_to_fpsimd()
and sve_sync_from_fpsimd_zeropad() to swab where appropriate.
There is no common swahl128() or swab128() that we could use here.
Maybe it would be worth making this generic, but for now add a
simple local hack.
Since the byte order differences are exposed in ABI, also clarify
the documentation.
Cc: Alex Bennée <[email protected]>
Cc: Peter Maydell <[email protected]>
Cc: Alan Hayward <[email protected]>
Cc: Julien Grall <[email protected]>
Fixes: bc0ee4760364 ("arm64/sve: Core task context handling")
Fixes: 8cd969d28fd2 ("arm64/sve: Signal handling support")
Fixes: 43d4da2c45b2 ("arm64/sve: ptrace and ELF coredump support")
Signed-off-by: Dave Martin <[email protected]>
[will: Fix typos in comments and docs spotted by Julien]
Signed-off-by: Will Deacon <[email protected]>
|
|
blk_mq_sched_free_requests() may be called in failure path in which
q->elevator may not be setup yet, so remove WARN_ON(!q->elevator) from
blk_mq_sched_free_requests for avoiding the false positive.
This function is actually safe to call in case of !q->elevator because
hctx->sched_tags is checked.
Cc: Bart Van Assche <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Yi Zhang <[email protected]>
Fixes: c3e2219216c9 ("block: free sched's request pool in blk_cleanup_queue")
Reported-by: [email protected]
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
CFQ is gone. No need anymore to document its "proportional weight time
based division of disk policy".
Signed-off-by: Andreas Herrmann <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Remove references to CFQ and legacy block layer which are gone.
Update example with what's available under blk-mq.
Signed-off-by: Andreas Herrmann <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
This patch removes the check in the null_blk_zoned for report zone
command, where it checks for the dev-,>zoned before executing the report
zone.
The null_zone_report() function is a block_device operation callback
which is initialized in the null_blk_main.c and gets called as a part
of blkdev for report zone IOCTL (BLKREPORTZONE).
blkdev_ioctl()
blkdev_report_zones_ioctl()
blkdev_report_zones()
blk_report_zones()
disk->fops->report_zones()
nullb_zone_report();
The null_zone_report() will never get executed on the non-zoned block
device, in the non zoned block device blk_queue_is_zoned() will always
be false which is first check the blkdev_report_zones_ioctl()
before actual low level driver report zone callback is executed.
Here is the detailed scenario:-
1. modprobe null_blk
null_init
null_alloc_dev
dev->zoned = 0
null_add_dev
dev->zoned == 0
so we don't set the q->limits.zoned = BLK_ZONED_HR
2. blkzone report /dev/nullb0
blkdev_ioctl()
blkdev_report_zones_ioctl()
blk_queue_is_zoned()
blk_queue_is_zoned
q->limits.zoned == 0
return false
if (!blk_queue_is_zoned(q)) <--- true
return -ENOTTY;
Reviewed-by: Damien Le Moal <[email protected]>
Reviewed-by: Bob Liu <[email protected]>
Signed-off-by: Chaitanya Kulkarni <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
When all of these checks are cleaned up, lots of the functions used in
the blk-mq-debugfs code can now return void, as no need to check the
return value of them either.
Overall, this ends up cleaning up the code and making it smaller, always
a nice win.
Cc: Jens Axboe <[email protected]>
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Opening and closing an io_uring instance leaks a UNIX domain socket
inode. This is because the ->file of the io_uring instance's internal
UNIX domain socket is set to point to the io_uring file, but then
sock_release() sees the non-NULL ->file and assumes the inode reference
is held by the file so doesn't call iput(). That's not the case here,
since the reference is still meant to be held by the socket; the actual
inode of the io_uring file is different.
Fix this leak by NULL-ing out ->file before releasing the socket.
Reported-by: [email protected]
Fixes: 2b188cc1bb85 ("Add io_uring IO interface")
Cc: <[email protected]> # v5.1+
Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
In most use cases of zoned block devices (aka SMR disks), the
mq-deadline scheduler is mandatory as it implements sequential write
command processing guarantees with zone write locking. So make sure that
this scheduler is always enabled if CONFIG_BLK_DEV_ZONED is selected.
Tested-by: Chaitanya Kulkarni <[email protected]>
Reviewed-by: Chaitanya Kulkarni <[email protected]>
Signed-off-by: Damien Le Moal <[email protected]>
Reviewed-by: Ming Lei <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fixes from Paul Moore:
"Three patches for v5.2.
One fixes a problem where we weren't correctly logging raw SELinux
labels, the other two fix problems where we weren't properly checking
calls to kmemdup()"
* tag 'selinux-pr-20190612' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: fix a missing-check bug in selinux_sb_eat_lsm_opts()
selinux: fix a missing-check bug in selinux_add_mnt_opt( )
selinux: log raw contexts as untrusted strings
|
|
Fixes SI cards running on amdgpu.
Fixes: 1929059893022 ("drm/amd/amdgpu: add RLC firmware to support raven1 refresh")
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=110883
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The "block" variable can be set by the user through debugfs, so it can
be quite large which leads to shift wrapping here. This means we report
a "block" as supported when it's not, and that leads to array overflows
later on.
This bug is not really a security issue in real life, because debugfs is
generally root only.
Fixes: 36ea1bd2d084 ("drm/amdgpu: add debugfs ctrl node")
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
They are capable of using intertouch and it works well with
psmouse.synaptics_intertouch=1, so add them to the list.
Without it, scrolling and gestures are jumpy, three-finger pinch gesture
doesn't work and three- or four-finger swipes sometimes get stuck.
Signed-off-by: Alexander Mikhaylenko <[email protected]>
Reviewed-by: Benjamin Tissoires <[email protected]>
Cc: [email protected]
Signed-off-by: Dmitry Torokhov <[email protected]>
|
|
Maxime Chevallier says:
====================
net: mvpp2: prs: Fixes for VID filtering
This series fixes some issues with VID filtering offload, mainly due to
the wrong ranges being used in the TCAM header parser.
The first patch fixes a bug where removing a VLAN from a port's
whitelist would also remove it from other port's, if they are on the
same PPv2 instance.
The second patch makes so that we don't invalidate the wrong TCAM
entries when clearing the whole whitelist.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
When removing all VID filters, the mvpp2_prs_vid_entry_remove would be
called with the TCAM id incorrectly used as a VID, causing the wrong
TCAM entries to be invalidated.
Fix this by directly invalidating entries in the VID range.
Fixes: 56beda3db602 ("net: mvpp2: Add hardware offloading for VLAN filtering")
Suggested-by: Yuri Chipchev <[email protected]>
Signed-off-by: Maxime Chevallier <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
VID filtering is implemented in the Header Parser, with one range of 11
vids being assigned for each no-loopback port.
Make sure we use the per-port range when looking for existing entries in
the Parser.
Since we used a global range instead of a per-port one, this causes VIDs
to be removed from the whitelist from all ports of the same PPv2
instance.
Fixes: 56beda3db602 ("net: mvpp2: Add hardware offloading for VLAN filtering")
Suggested-by: Yuri Chipchev <[email protected]>
Signed-off-by: Maxime Chevallier <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Ido Schimmel says:
====================
mlxsw: Various fixes
This patchset contains various fixes for mlxsw.
Patch #1 fixes an hash polarization problem when a nexthop device is a
LAG device. This is caused by the fact that the same seed is used for
the LAG and ECMP hash functions.
Patch #2 fixes an issue in which the driver fails to refresh a nexthop
neighbour after it becomes dead. This prevents the nexthop from ever
being written to the adjacency table and used to forward traffic. Patch
Patch #4 fixes a wrong extraction of TOS value in flower offload code.
Patch #5 is a test case.
Patch #6 works around a buffer issue in Spectrum-2 by reducing the
default sizes of the shared buffer pools.
Patch #7 prevents prio-tagged packets from entering the switch when PVID
is removed from the bridge port.
Please consider patches #2, #4 and #6 for 5.1.y
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
When PVID is removed from a bridge port, the Linux bridge drops both
untagged and prio-tagged packets. Align mlxsw with this behavior.
Fixes: 148f472da5db ("mlxsw: reg: Add the Switch Port Acceptable Frame Types register")
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Due to an issue on Spectrum-2, in front-panel ports split four ways, 2 out
of 32 port buffers cannot be used. To work around this, the next FW release
will mark them as unused, and will report correspondingly lower total
shared buffer size. mlxsw will pick up the new value through a query to
cap_total_buffer_size resource. However the initial size for shared buffer
pool 0 is hard-coded and therefore needs to be updated.
Thus reduce the pool size by 2.7 MiB (which corresponds to 2/32 of the
total size of 42 MiB), and round down to the whole number of cells.
Fixes: fe099bf682ab ("mlxsw: spectrum_buffers: Add Spectrum-2 shared buffer configuration")
Signed-off-by: Petr Machata <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The TOS value was not extracted correctly. Fix it.
Fixes: 87996f91f739 ("mlxsw: spectrum_flower: Add support for ip tos")
Reported-by: Alexander Petrovskiy <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Test that IPv4 and IPv6 nexthops are correctly marked with offload
indication in response to neighbour events.
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The driver tries to periodically refresh neighbours that are used to
reach nexthops. This is done by periodically calling neigh_event_send().
However, if the neighbour becomes dead, there is nothing we can do to
return it to a connected state and the above function call is basically
a NOP.
This results in the nexthop never being written to the device's
adjacency table and therefore never used to forward packets.
Fix this by dropping our reference from the dead neighbour and
associating the nexthop with a new neigbhour which we will try to
refresh.
Fixes: a7ff87acd995 ("mlxsw: spectrum_router: Implement next-hop routing")
Signed-off-by: Ido Schimmel <[email protected]>
Reported-by: Alex Veber <[email protected]>
Tested-by: Alex Veber <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The same hash function and seed are used for both ECMP and LAG hash.
Therefore, when a LAG device is used as a nexthop device as part of an
ECMP group, hash polarization can occur and all the traffic will be
hashed to a single LAG slave.
Fix this by using a different seed for the LAG hash.
Fixes: fa73989f2697 ("mlxsw: spectrum: Use a stable ECMP/LAG seed")
Signed-off-by: Ido Schimmel <[email protected]>
Reported-by: Alex Veber <[email protected]>
Tested-by: Alex Veber <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
tls_sw_do_sendpage needs to return the total number of bytes sent
regardless of how many sk_msgs are allocated. Unfortunately, copied
(the value we return up the stack) is zero'd before each new sk_msg
is allocated so we only return the copied size of the last sk_msg used.
The caller (splice, etc.) of sendpage will then believe only part
of its data was sent and send the missing chunks again. However,
because the data actually was sent the receiver will get multiple
copies of the same data.
To reproduce this do multiple sendfile calls with a length close to
the max record size. This will in turn call splice/sendpage, sendpage
may use multiple sk_msg in this case and then returns the incorrect
number of bytes. This will cause splice to resend creating duplicate
data on the receiver. Andre created a C program that can easily
generate this case so we will push a similar selftest for this to
bpf-next shortly.
The fix is to _not_ zero the copied field so that the total sent
bytes is returned.
Reported-by: Steinar H. Gunderson <[email protected]>
Reported-by: Andre Tomt <[email protected]>
Tested-by: Andre Tomt <[email protected]>
Fixes: d829e9c4112b ("tls: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Get the ingress interface and increment ICMP counters based on that
instead of skb->dev when the the dev is a VRF device.
This is a follow up on the following message:
https://www.spinics.net/lists/netdev/msg560268.html
v2: Avoid changing skb->dev since it has unintended effect for local
delivery (David Ahern).
Signed-off-by: Stephen Suryaputra <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|