Age | Commit message (Collapse) | Author | Files | Lines |
|
Mark noticed that syzkaller is able to reliably trigger the following warning:
dl_rq->running_bw > dl_rq->this_bw
WARNING: CPU: 1 PID: 153 at kernel/sched/deadline.c:124 switched_from_dl+0x454/0x608
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 153 Comm: syz-executor253 Not tainted 4.18.0-rc3+ #29
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x458
show_stack+0x20/0x30
dump_stack+0x180/0x250
panic+0x2dc/0x4ec
__warn_printk+0x0/0x150
report_bug+0x228/0x2d8
bug_handler+0xa0/0x1a0
brk_handler+0x2f0/0x568
do_debug_exception+0x1bc/0x5d0
el1_dbg+0x18/0x78
switched_from_dl+0x454/0x608
__sched_setscheduler+0x8cc/0x2018
sys_sched_setattr+0x340/0x758
el0_svc_naked+0x30/0x34
syzkaller reproducer runs a bunch of threads that constantly switch
between DEADLINE and NORMAL classes while interacting through futexes.
The splat above is caused by the fact that if a DEADLINE task is setattr
back to NORMAL while in non_contending state (blocked on a futex -
inactive timer armed), its contribution to running_bw is not removed
before sub_rq_bw() gets called (!task_on_rq_queued() branch) and the
latter sees running_bw > this_bw.
Fix it by removing a task contribution from running_bw if the task is
not queued and in non_contending state while switched to a different
class.
Reported-by: Mark Rutland <[email protected]>
Signed-off-by: Juri Lelli <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Daniel Bristot de Oliveira <[email protected]>
Reviewed-by: Luca Abeni <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
- A fix for OMAP5 and DRA7 to make the branch predictor hardening
settings take proper effect on secondary cores
- Disable USB OTG on am3517 since current driver isn't working
- Fix thermal sensor register settings on Armada 38x
- Fix suspend/resume IRQs on pxa3xx
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: dts: am3517.dtsi: Disable reference to OMAP3 OTG controller
ARM: DRA7/OMAP5: Enable ACTLR[0] (Enable invalidates of BTB) for secondary cores
ARM: pxa: irq: fix handling of ICMR registers in suspend/resume
ARM: dts: armada-38x: use the new thermal binding
|
|
pvti_cpu0_va is the address of shared kvmclock data structure.
pvti_cpu0_va is currently kept unset (1) on 32 bit systems, (2) when
kvmclock vsyscall is disabled, and (3) if kvmclock is not stable.
This poses a problem, because kvm_ptp needs pvti_cpu0_va, but (1) can
work on 32 bit, (2) has little relation to the vsyscall, and (3) does
not need stable kvmclock (although kvmclock won't be used for system
clock if it's not stable, so kvm_ptp is pointless in that case).
Expose pvti_cpu0_va whenever kvmclock is enabled to allow all users to
work with it.
This fixes a regression found on Gentoo: https://bugs.gentoo.org/658544.
Fixes: 9f08890ab906 ("x86/pvclock: add setter for pvclock_pvti_cpu0_va")
Cc: [email protected]
Reported-by: Andreas Steinmetz <[email protected]>
Signed-off-by: Radim Krčmář <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Prevent a config where KVM_AMD=y and CRYPTO_DEV_CCP_DD=m thereby ensuring
that AMD Secure Processor device driver will be built-in when KVM_AMD is
also built-in.
v1->v2:
* Removed usage of 'imply' Kconfig option.
* Change patch commit message.
Fixes: 505c9e94d832 ("KVM: x86: prefer "depends on" to "select" for SEV")
Cc: <[email protected]> # 4.16.x
Signed-off-by: Janakarajan Natarajan <[email protected]>
Reviewed-by: Brijesh Singh <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
This exit qualification was inadvertently dropped when the two
VM-entry failure blocks were coalesced.
Fixes: e79f245ddec1 ("X86/KVM: Properly update 'tsc_offset' to represent the running guest")
Signed-off-by: Jim Mattson <[email protected]>
Reviewed-by: Krish Sadhukhan <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
When we switched from doing rdmsr() to reading FS/GS base values from
current->thread we completely forgot about legacy 32-bit userspaces which
we still support in KVM (why?). task->thread.{fsbase,gsbase} are only
synced for 64-bit processes, calling save_fsgs_for_kvm() and using
its result from current is illegal for legacy processes.
There's no ARCH_SET_FS/GS prctls for legacy applications. Base MSRs are,
however, not always equal to zero. Intel's manual says (3.4.4 Segment
Loading Instructions in IA-32e Mode):
"In order to set up compatibility mode for an application, segment-load
instructions (MOV to Sreg, POP Sreg) work normally in 64-bit mode. An
entry is read from the system descriptor table (GDT or LDT) and is loaded
in the hidden portion of the segment register.
...
The hidden descriptor register fields for FS.base and GS.base are
physically mapped to MSRs in order to load all address bits supported by
a 64-bit implementation.
"
The issue was found by strace test suite where 32-bit ioctl_kvm_run test
started segfaulting.
Reported-by: Dmitry V. Levin <[email protected]>
Bisected-by: Masatake YAMATO <[email protected]>
Fixes: 42b933b59721 ("x86/kvm/vmx: read MSR_{FS,KERNEL_GS}_BASE from current->thread")
Cc: [email protected]
Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
This lets userspace read the MSR_IA32_ARCH_CAPABILITIES and check that all
requested features are available on the host.
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
When cpu_stop_queue_two_works() begins to wake the stopper threads, it does
so without preemption disabled, which leads to the following race
condition:
The source CPU calls cpu_stop_queue_two_works(), with cpu1 as the source
CPU, and cpu2 as the destination CPU. When adding the stopper threads to
the wake queue used in this function, the source CPU stopper thread is
added first, and the destination CPU stopper thread is added last.
When wake_up_q() is invoked to wake the stopper threads, the threads are
woken up in the order that they are queued in, so the source CPU's stopper
thread is woken up first, and it preempts the thread running on the source
CPU.
The stopper thread will then execute on the source CPU, disable preemption,
and begin executing multi_cpu_stop(), and wait for an ack from the
destination CPU's stopper thread, with preemption still disabled. Since the
worker thread that woke up the stopper thread on the source CPU is affine
to the source CPU, and preemption is disabled on the source CPU, that
thread will never run to dequeue the destination CPU's stopper thread from
the wake queue, and thus, the destination CPU's stopper thread will never
run, causing the source CPU's stopper thread to wait forever, and stall.
Disable preemption when waking the stopper threads in
cpu_stop_queue_two_works().
Fixes: 0b26351b910f ("stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock")
Co-Developed-by: Prasad Sodagudi <[email protected]>
Signed-off-by: Prasad Sodagudi <[email protected]>
Co-Developed-by: Pavankumar Kondeti <[email protected]>
Signed-off-by: Pavankumar Kondeti <[email protected]>
Signed-off-by: Isaac J. Manjarres <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
|
|
Markus reported that BTS is sporadically missing the tail of the trace
in the perf_event data buffer: [decode error (1): instruction overflow]
shown in GDB; and bisected it to the conversion of debug_store to PTI.
A little "optimization" crept into alloc_bts_buffer(), which mistakenly
placed bts_interrupt_threshold away from the 24-byte record boundary.
Intel SDM Vol 3B 17.4.9 says "This address must point to an offset from
the BTS buffer base that is a multiple of the BTS record size."
Revert "max" from a byte count to a record count, to calculate the
bts_interrupt_threshold correctly: which turns out to fix problem seen.
Fixes: c1961a4631da ("x86/events/intel/ds: Map debug buffers in cpu_entry_area")
Reported-and-tested-by: Markus T Metzger <[email protected]>
Signed-off-by: Hugh Dickins <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected] # v4.14+
Link: https://lkml.kernel.org/r/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Pull RTC fixes from Alexandre Belloni:
"Two fixes for 4.18:
- an important core fix for RTCs using the core offsetting only one
driver is affected
- a fix for the error path of mrst"
* tag 'rtc-4.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux:
rtc: fix alarm read and set offset
rtc: mrst: fix error code in probe()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes
Two omap fixes for v4.18-rc cycle
Turns out the recent patches for ARM branch predictor hardening are
not working on omap5 and dra7 as planned because the secondary CPU
is parked to the bootrom code. We can't configure it in the bootloader.
So we must enable invalidates of BTB for omap5 and dra7 secondary
core in the kernel.
And there's a fix for reserved register access for am3517. The
usb otg module on am3517 is not the same as for other omap3.
* tag 'omap-for-v4.18/fixes-rc4-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
ARM: dts: am3517.dtsi: Disable reference to OMAP3 OTG controller
ARM: DRA7/OMAP5: Enable ACTLR[0] (Enable invalidates of BTB) for secondary cores
Signed-off-by: Olof Johansson <[email protected]>
|
|
mvebu fixes for 4.18 (part 1)
Use the new thermal binding on Armada 38x allowing to use a driver fix
which is already part of the kernel.
* tag 'mvebu-fixes-4.18-1' of git://git.infradead.org/linux-mvebu:
ARM: dts: armada-38x: use the new thermal binding
Signed-off-by: Olof Johansson <[email protected]>
|
|
This is the fixes set for v4.18 cycle.
This is a fix for suspending all pxa3xx platforms, where high
number interrupts are not reenabled.
* tag 'pxa-fixes-4.18' of https://github.com/rjarzmik/linux:
ARM: pxa: irq: fix handling of ICMR registers in suspend/resume
Signed-off-by: Olof Johansson <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"Two related fixes for a boot failure of Xen PV guests"
* tag 'for-linus-4.18-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: setup pv irq ops vector earlier
xen: remove global bit from __default_kernel_pte_mask for pv guests
|
|
Pull block fix from Jens Axboe:
"Just a single regression fix (from 4.17) for bsg, fixing an EINVAL
return on non-data commands"
* tag 'for-linus-20180713' of git://git.kernel.dk/linux-block:
bsg: fix bogus EINVAL on non-data commands
|
|
Merge misc fixes from Andrew Morton:
"11 fixes"
* emailed patches form Andrew Morton <[email protected]>:
reiserfs: fix buffer overflow with long warning messages
checkpatch: fix duplicate invalid vsprintf pointer extension '%p<foo>' messages
mm: do not bug_on on incorrect length in __mm_populate()
mm/memblock.c: do not complain about top-down allocations for !MEMORY_HOTREMOVE
fs, elf: make sure to page align bss in load_elf_library
x86/purgatory: add missing FORCE to Makefile target
net/9p/client.c: put refcount of trans_mod in error case in parse_opts()
mm: allow arch to supply p??_free_tlb functions
autofs: fix slab out of bounds read in getname_kernel()
fs/proc/task_mmu.c: fix Locked field in /proc/pid/smaps*
mm: do not drop unused pages when userfaultd is running
|
|
ReiserFS prepares log messages into a 1024-byte buffer with no bounds
checks. Long messages, such as the "unknown mount option" warning when
userspace passes a crafted mount options string, overflow this buffer.
This causes KASAN to report a global-out-of-bounds write.
Fix it by truncating messages to the buffer size.
Link: http://lkml.kernel.org/r/[email protected]
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: [email protected]
Signed-off-by: Eric Biggers <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Multiline statements with invalid %p<foo> uses produce multiple
warnings. Fix that.
e.g.:
$ cat t_block.c
void foo(void)
{
MY_DEBUG(drv->foo,
"%pk",
foo->boo);
}
$ ./scripts/checkpatch.pl -f t_block.c
WARNING: Missing or malformed SPDX-License-Identifier tag in line 1
#1: FILE: t_block.c:1:
+void foo(void)
WARNING: Invalid vsprintf pointer extension '%pk'
#3: FILE: t_block.c:3:
+ MY_DEBUG(drv->foo,
+ "%pk",
+ foo->boo);
WARNING: Invalid vsprintf pointer extension '%pk'
#3: FILE: t_block.c:3:
+ MY_DEBUG(drv->foo,
+ "%pk",
+ foo->boo);
total: 0 errors, 3 warnings, 6 lines checked
NOTE: For some of the reported defects, checkpatch may be able to
mechanically convert to the typical style using --fix or --fix-inplace.
t_block.c has style problems, please review.
NOTE: If any of the errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Joe Perches <[email protected]>
Cc: "Tobin C. Harding" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
syzbot has noticed that a specially crafted library can easily hit
VM_BUG_ON in __mm_populate
kernel BUG at mm/gup.c:1242!
invalid opcode: 0000 [#1] SMP
CPU: 2 PID: 9667 Comm: a.out Not tainted 4.18.0-rc3 #644
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
RIP: 0010:__mm_populate+0x1e2/0x1f0
Code: 55 d0 65 48 33 14 25 28 00 00 00 89 d8 75 21 48 83 c4 20 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 75 18 f1 ff 0f 0b e8 6e 18 f1 ff <0f> 0b 31 db eb c9 e8 93 06 e0 ff 0f 1f 00 55 48 89 e5 53 48 89 fb
Call Trace:
vm_brk_flags+0xc3/0x100
vm_brk+0x1f/0x30
load_elf_library+0x281/0x2e0
__ia32_sys_uselib+0x170/0x1e0
do_fast_syscall_32+0xca/0x420
entry_SYSENTER_compat+0x70/0x7f
The reason is that the length of the new brk is not page aligned when we
try to populate the it. There is no reason to bug on that though.
do_brk_flags already aligns the length properly so the mapping is
expanded as it should. All we need is to tell mm_populate about it.
Besides that there is absolutely no reason to to bug_on in the first
place. The worst thing that could happen is that the last page wouldn't
get populated and that is far from putting system into an inconsistent
state.
Fix the issue by moving the length sanitization code from do_brk_flags
up to vm_brk_flags. The only other caller of do_brk_flags is brk
syscall entry and it makes sure to provide the proper length so t here
is no need for sanitation and so we can use do_brk_flags without it.
Also remove the bogus BUG_ONs.
[[email protected]: fix up vm_brk_flags s@request@len@]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Reported-by: syzbot <[email protected]>
Tested-by: Tetsuo Handa <[email protected]>
Reviewed-by: Oscar Salvador <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Al Viro <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Mike Rapoport is converting architectures from bootmem to nobootmem
allocator. While doing so for m68k Geert has noticed that he gets a
scary looking warning:
WARNING: CPU: 0 PID: 0 at mm/memblock.c:230
memblock_find_in_range_node+0x11c/0x1be
memblock: bottom-up allocation failed, memory hotunplug may be affected
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted
4.18.0-rc3-atari-01343-gf2fb5f2e09a97a3c-dirty #7
Call Trace: __warn+0xa8/0xc2
kernel_pg_dir+0x0/0x1000
netdev_lower_get_next+0x2/0x22
warn_slowpath_fmt+0x2e/0x36
memblock_find_in_range_node+0x11c/0x1be
memblock_find_in_range_node+0x11c/0x1be
memblock_find_in_range_node+0x0/0x1be
vprintk_func+0x66/0x6e
memblock_virt_alloc_internal+0xd0/0x156
netdev_lower_get_next+0x2/0x22
netdev_lower_get_next+0x2/0x22
kernel_pg_dir+0x0/0x1000
memblock_virt_alloc_try_nid_nopanic+0x58/0x7a
netdev_lower_get_next+0x2/0x22
kernel_pg_dir+0x0/0x1000
kernel_pg_dir+0x0/0x1000
EXPTBL+0x234/0x400
EXPTBL+0x234/0x400
alloc_node_mem_map+0x4a/0x66
netdev_lower_get_next+0x2/0x22
free_area_init_node+0xe2/0x29e
EXPTBL+0x234/0x400
paging_init+0x430/0x462
kernel_pg_dir+0x0/0x1000
printk+0x0/0x1a
EXPTBL+0x234/0x400
setup_arch+0x1b8/0x22c
start_kernel+0x4a/0x40a
_sinittext+0x344/0x9e8
The warning is basically saying that a top-down allocation can break
memory hotremove because memblock allocation is not movable. But m68k
doesn't even support MEMORY_HOTREMOVE so there is no point to warn about
it.
Make the warning conditional only to configurations that care.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Reported-by: Geert Uytterhoeven <[email protected]>
Tested-by: Geert Uytterhoeven <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Greg Ungerer <[email protected]>
Cc: Sam Creasey <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The current code does not make sure to page align bss before calling
vm_brk(), and this can lead to a VM_BUG_ON() in __mm_populate() due to
the requested lenght not being correctly aligned.
Let us make sure to align it properly.
Kees: only applicable to CONFIG_USELIB kernels: 32-bit and configured
for libc5.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Oscar Salvador <[email protected]>
Reported-by: [email protected]
Tested-by: Tetsuo Handa <[email protected]>
Acked-by: Kees Cook <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nicolas Pitre <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
- Build the kernel without the fix
- Add some flag to the purgatories KBUILD_CFLAGS,I used
-fno-asynchronous-unwind-tables
- Re-build the kernel
When you look at makes output you see that sha256.o is not re-build in the
last step. Also readelf -S still shows the .eh_frame section for
sha256.o.
With the fix sha256.o is rebuilt in the last step.
Without FORCE make does not detect changes only made to the command line
options. So object files might not be re-built even when they should be.
Fix this by adding FORCE where it is missing.
Link: http://lkml.kernel.org/r/[email protected]
Fixes: df6f2801f511 ("kernel/kexec_file.c: move purgatories sha256 to common code")
Signed-off-by: Philipp Rudo <[email protected]>
Acked-by: Dave Young <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: <[email protected]> [4.17+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
In my testing, the second mount will fail after umounting successfully.
The reason is that we put refcount of trans_mod in the correct case
rather than the error case in parse_opts() at last. That will cause the
refcount decrease to -1, and when we try to get trans_mod again in
try_module_get(), we could only increase refcount to 0 which will cause
failure as follows:
parse_opts
v9fs_get_trans_by_name
try_module_get : return NULL to caller which cause error
So we should put refcount of trans_mod in error case.
Link: http://lkml.kernel.org/r/[email protected]
Fixes: 9421c3e64137ec ("net/9p/client.c: fix potential refcnt problem of trans module")
Signed-off-by: Jun Piao <[email protected]>
Reviewed-by: Yiwen Jiang <[email protected]>
Reviewed-by: Greg Kurz <[email protected]>
Reviewed-by: Dominique Martinet <[email protected]>
Tested-by: Dominique Martinet <[email protected]>
Cc: Eric Van Hensbergen <[email protected]>
Cc: Ron Minnich <[email protected]>
Cc: Latchesar Ionkov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The mmu_gather APIs keep track of the invalidated address range
including the span covered by invalidated page table pages. Ranges
covered by page tables but not ptes (and therefore no TLBs) still need
to be invalidated because some architectures (x86) can cache
intermediate page table entries, and invalidate those with normal TLB
invalidation instructions to be almost-backward-compatible.
Architectures which don't cache intermediate page table entries, or
which invalidate these caches separately from TLB invalidation, do not
require TLB invalidation range expanded over page tables.
Allow architectures to supply their own p??_free_tlb functions, which
can avoid the __tlb_adjust_range.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Nicholas Piggin <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: "Aneesh Kumar K. V" <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Nadav Amit <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The autofs subsystem does not check that the "path" parameter is present
for all cases where it is required when it is passed in via the "param"
struct.
In particular it isn't checked for the AUTOFS_DEV_IOCTL_OPENMOUNT_CMD
ioctl command.
To solve it, modify validate_dev_ioctl(function to check that a path has
been provided for ioctl commands that require it.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Tomas Bortoli <[email protected]>
Signed-off-by: Ian Kent <[email protected]>
Reported-by: [email protected]
Cc: Dmitry Vyukov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Thomas reports:
"While looking around in /proc on my v4.14.52 system I noticed that all
processes got a lot of "Locked" memory in /proc/*/smaps. A lot more
memory than a regular user can usually lock with mlock().
Commit 493b0e9d945f (in v4.14-rc1) seems to have changed the behavior
of "Locked".
Before that commit the code was like this. Notice the VM_LOCKED check.
(vma->vm_flags & VM_LOCKED) ?
(unsigned long)(mss.pss >> (10 + PSS_SHIFT)) : 0);
After that commit Locked is now the same as Pss:
(unsigned long)(mss->pss >> (10 + PSS_SHIFT)));
This looks like a mistake."
Indeed, the commit has added mss->pss_locked with the correct value that
depends on VM_LOCKED, but forgot to actually use it. Fix it.
Link: http://lkml.kernel.org/r/[email protected]
Fixes: 493b0e9d945f ("mm: add /proc/pid/smaps_rollup")
Signed-off-by: Vlastimil Babka <[email protected]>
Reported-by: Thomas Lindroth <[email protected]>
Cc: Alexey Dobriyan <[email protected]>
Cc: Daniel Colascione <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
KVM guests on s390 can notify the host of unused pages. This can result
in pte_unused callbacks to be true for KVM guest memory.
If a page is unused (checked with pte_unused) we might drop this page
instead of paging it. This can have side-effects on userfaultd, when
the page in question was already migrated:
The next access of that page will trigger a fault and a user fault
instead of faulting in a new and empty zero page. As QEMU does not
expect a userfault on an already migrated page this migration will fail.
The most straightforward solution is to ignore the pte_unused hint if a
userfault context is active for this VMA.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Christian Borntraeger <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Janosch Frank <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Cornelia Huck <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
We must zero struct pages for memory that is not backed by physical
memory, or kernel does not have access to.
Recently, there was a change which zeroed all memmap for all holes in
e820. Unfortunately, it introduced a bug that is discussed here:
https://www.spinics.net/lists/linux-mm/msg156764.html
Linus, also saw this bug on his machine, and confirmed that reverting
commit 124049decbb1 ("x86/e820: put !E820_TYPE_RAM regions into
memblock.reserved") fixes the issue.
The problem is that we incorrectly zero some struct pages after they
were setup.
The fix is to zero unavailable struct pages prior to initializing of
struct pages.
A more detailed fix should come later that would avoid double zeroing
cases: one in __init_single_page(), the other one in
zero_resv_unavail().
Fixes: 124049decbb1 ("x86/e820: put !E820_TYPE_RAM regions into memblock.reserved")
Signed-off-by: Pavel Tatashin <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
on Clang
Clang puts its section header names in the '.strtab' section instead of
'.shstrtab', which causes objtool to fail with a "can't find
.shstrtab section" warning when attempting to write ORC metadata to an
object file.
If '.shstrtab' doesn't exist, use '.strtab' instead.
Signed-off-by: Simon Ser <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/d1c1c3fe55872be433da7bc5e1860538506229ba.1531153015.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
|
|
platform_get_resource() may fail and return NULL, so we should
better check it's return value to avoid a NULL pointer dereference
a bit later in the code.
This is detected by Coccinelle semantic patch.
@@
expression pdev, res, n, t, e, e1, e2;
@@
res = platform_get_resource(pdev, t, n);
+ if (!res)
+ return -EINVAL;
... when != res == NULL
e = devm_ioremap_nocache(e1, res->start, e2);
Fixes: cc4fa83f66e9 ("pinctrl: nsp: add pinmux driver support for Broadcom NSP SoC")
Signed-off-by: Wei Yongjun <[email protected]>
Reviewed-by: Ray Jui <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
|
|
The > comparisons should be >= or else we read beyond the end of the
pinctrl->functions[] array.
Fixes: cc4fa83f66e9 ("pinctrl: nsp: add pinmux driver support for Broadcom NSP SoC")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Ray Jui <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
|
|
The datasheet does not document any registers to control drive strength,
and no drive strength registers are for this reason described for this
SoC. The flags indicating that drive strength can be controlled are
however set for some pins in the driver.
This leads to a NULL pointer dereference when the sh-pfc core tries to
access the struct describing the drive strength registers, for example
when reading the sysfs file pinconf-pins.
Fix this by removing the SH_PFC_PIN_CFG_DRIVE_STRENGTH from all pins.
Fixes: b92ac66a1819602b ("pinctrl: sh-pfc: Add R8A77970 PFC support")
Signed-off-by: Niklas Söderlund <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Sergei Shtylyov <[email protected]>
Reviewed-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
|
|
The .gpio_set_direction() callback was setting inverted direction
for SoCs older than the JZ4770, this restores the correct behaviour.
Signed-off-by: Paul Cercueil <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
|
|
When we are explicitly using GPIO hogging mechanism in the pinctrl node,
such as:
&pio {
line_input {
gpio-hog;
gpios = <95 0>, <96 0>, <97 0>;
input;
};
};
A kernel panic happens at dereferencing a NULL pointer: In this case, the
drvdata is still not setup properly yet when it is being accessed.
A better solution for fixing up this issue should be we should obtain the
private data from struct gpio_chip using a specific gpiochip_get_data
instead of a generic dev_get_drvdata.
[ 0.249424] Unable to handle kernel NULL pointer dereference at virtual
address 000000c8
[ 0.257818] Mem abort info:
[ 0.260704] ESR = 0x96000005
[ 0.263869] Exception class = DABT (current EL), IL = 32 bits
[ 0.270011] SET = 0, FnV = 0
[ 0.273167] EA = 0, S1PTW = 0
[ 0.276421] Data abort info:
[ 0.279398] ISV = 0, ISS = 0x00000005
[ 0.283372] CM = 0, WnR = 0
[ 0.286440] [00000000000000c8] user address but active_mm is swapper
[ 0.293027] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[ 0.298795] Modules linked in:
[ 0.301958] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.16.0-rc1+ #389
[ 0.308716] Hardware name: MediaTek MT7622 RFB1 board (DT)
[ 0.314396] pstate: 80000005 (Nzcv daif -PAN -UAO)
[ 0.319362] pc : mtk_hw_pin_field_get+0x28/0x118
[ 0.324140] lr : mtk_hw_set_value+0x30/0x104
[ 0.328557] sp : ffffff800801b6d0
[ 0.331983] x29: ffffff800801b6d0 x28: ffffff80086b7970
[ 0.337484] x27: 0000000000000000 x26: ffffff80087b8000
[ 0.342986] x25: 0000000000000000 x24: ffffffc00324c230
[ 0.348487] x23: 0000000000000003 x22: 0000000000000000
[ 0.353988] x21: ffffff80087b8000 x20: 0000000000000000
[ 0.359489] x19: 0000000000000054 x18: 00000000fffff7c0
[ 0.364990] x17: 0000000000006300 x16: 000000000000003f
[ 0.370492] x15: 000000000000000e x14: ffffffffffffffff
[ 0.375993] x13: 0000000000000000 x12: 0000000000000020
[ 0.381494] x11: 0000000000000006 x10: 0101010101010101
[ 0.386995] x9 : fffffffffffffffa x8 : 0000000000000007
[ 0.392496] x7 : ffffff80085d63f8 x6 : 0000000000000003
[ 0.397997] x5 : 0000000000000054 x4 : ffffffc0031eb800
[ 0.403499] x3 : ffffff800801b728 x2 : 0000000000000003
[ 0.409000] x1 : 0000000000000054 x0 : 0000000000000000
[ 0.414502] Process swapper/0 (pid: 1, stack limit = 0x000000002a913c1c)
[ 0.421441] Call trace:
[ 0.423968] mtk_hw_pin_field_get+0x28/0x118
[ 0.428387] mtk_hw_set_value+0x30/0x104
[ 0.432445] mtk_gpio_set+0x20/0x28
[ 0.436052] mtk_gpio_direction_output+0x18/0x30
[ 0.440833] gpiod_direction_output_raw_commit+0x7c/0xa0
[ 0.446333] gpiod_direction_output+0x104/0x114
[ 0.451022] gpiod_configure_flags+0xbc/0xfc
[ 0.455441] gpiod_hog+0x8c/0x140
[ 0.458869] of_gpiochip_add+0x27c/0x2d4
[ 0.462928] gpiochip_add_data_with_key+0x338/0x5f0
[ 0.467976] mtk_pinctrl_probe+0x388/0x400
[ 0.472217] platform_drv_probe+0x58/0xa4
[ 0.476365] driver_probe_device+0x204/0x44c
[ 0.480783] __device_attach_driver+0xac/0x108
[ 0.485384] bus_for_each_drv+0x7c/0xac
[ 0.489352] __device_attach+0xa0/0x144
[ 0.493320] device_initial_probe+0x10/0x18
[ 0.497647] bus_probe_device+0x2c/0x8c
[ 0.501616] device_add+0x2f8/0x540
[ 0.505226] of_device_add+0x3c/0x44
[ 0.508925] of_platform_device_create_pdata+0x80/0xb8
[ 0.514245] of_platform_bus_create+0x290/0x3e8
[ 0.518933] of_platform_populate+0x78/0x100
[ 0.523352] of_platform_default_populate+0x24/0x2c
[ 0.528403] of_platform_default_populate_init+0x94/0xa4
[ 0.533903] do_one_initcall+0x98/0x130
[ 0.537874] kernel_init_freeable+0x13c/0x1d4
[ 0.542385] kernel_init+0x10/0xf8
[ 0.545903] ret_from_fork+0x10/0x18
[ 0.549603] Code: 900020a1 f9400800 911dcc21 1400001f (f9406401)
[ 0.555916] ---[ end trace de8c34787fdad3b3 ]---
[ 0.560722] Kernel panic - not syncing: Attempted to kill init!
exitcode=0x0000000b
[ 0.560722]
[ 0.570188] SMP: stopping secondary CPUs
[ 0.574253] ---[ end Kernel panic - not syncing: Attempted to kill
init! exitcode=0x0000000b
[ 0.574253]
Cc: [email protected]
Fixes: d6ed93551320 ("pinctrl: mediatek: add pinctrl driver for MT7622 SoC")
Signed-off-by: Sean Wang <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
|
|
If the pinctrl node has the gpio-ranges property, the range will be added
by the gpio core and doesn't need to be added by the pinctrl driver.
But for keeping backward compatibility, an explicit pinctrl_add_gpio_range
is still needed to be called when there is a missing gpio-ranges in pinctrl
node in old dts files.
Cc: [email protected]
Fixes: d6ed93551320 ("pinctrl: mediatek: add pinctrl driver for MT7622 SoC")
Signed-off-by: Sean Wang <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
|
|
To allow claiming hogs by pinctrl, we cannot enable pinctrl until all
groups and functions are being added done. Also, it's necessary that
the corresponding gpiochip is being added when the pinctrl device is
enabled.
Cc: [email protected]
Fixes: d6ed93551320 ("pinctrl: mediatek: add pinctrl driver for MT7622 SoC")
Signed-off-by: Sean Wang <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
|
|
Because gpichip applied in the driver must depend on mtk eint to implement
the input data debouncing and the translation between gpio and irq, it's
better to keep logic consistent with mtk eint being built prior to gpiochip
being added.
Cc: [email protected]
Fixes: e6dabd38d8e7 ("pinctrl: mediatek: add EINT support to MT7622 SoC")
Signed-off-by: Sean Wang <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
|
|
It should be to return an error code when failing at groups building.
Cc: [email protected]
Fixes: d6ed93551320 ("pinctrl: mediatek: add pinctrl driver for MT7622 SoC")
Signed-off-by: Sean Wang <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
|
|
Yuchung Cheng says:
====================
fix DCTCP delayed ACK
This patch series addresses the issue that sometimes DCTCP
fail to acknowledge the latest sequence and result in sender timeout
if inflight is small.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
After fixing the way DCTCP tracking delayed ACKs, the delayed-ACK
related callbacks are no longer needed
Signed-off-by: Yuchung Cheng <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Acked-by: Neal Cardwell <[email protected]>
Acked-by: Lawrence Brakmo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Previously, when a data segment was sent an ACK was piggybacked
on the data segment without generating a CA_EVENT_NON_DELAYED_ACK
event to notify congestion control modules. So the DCTCP
ca->delayed_ack_reserved flag could incorrectly stay set when
in fact there were no delayed ACKs being reserved. This could result
in sending a special ECN notification ACK that carries an older
ACK sequence, when in fact there was no need for such an ACK.
DCTCP keeps track of the delayed ACK status with its own separate
state ca->delayed_ack_reserved. Previously it may accidentally cancel
the delayed ACK without updating this field upon sending a special
ACK that carries a older ACK sequence. This inconsistency would
lead to DCTCP receiver never acknowledging the latest data until the
sender times out and retry in some cases.
Packetdrill script (provided by Larry Brakmo)
0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
0.000 setsockopt(3, SOL_TCP, TCP_CONGESTION, "dctcp", 5) = 0
0.000 bind(3, ..., ...) = 0
0.000 listen(3, 1) = 0
0.100 < [ect0] SEW 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7>
0.100 > SE. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8>
0.110 < [ect0] . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
0.200 < [ect0] . 1:1001(1000) ack 1 win 257
0.200 > [ect01] . 1:1(0) ack 1001
0.200 write(4, ..., 1) = 1
0.200 > [ect01] P. 1:2(1) ack 1001
0.200 < [ect0] . 1001:2001(1000) ack 2 win 257
0.200 write(4, ..., 1) = 1
0.200 > [ect01] P. 2:3(1) ack 2001
0.200 < [ect0] . 2001:3001(1000) ack 3 win 257
0.200 < [ect0] . 3001:4001(1000) ack 3 win 257
0.200 > [ect01] . 3:3(0) ack 4001
0.210 < [ce] P. 4001:4501(500) ack 3 win 257
+0.001 read(4, ..., 4500) = 4500
+0 write(4, ..., 1) = 1
+0 > [ect01] PE. 3:4(1) ack 4501
+0.010 < [ect0] W. 4501:5501(1000) ack 4 win 257
// Previously the ACK sequence below would be 4501, causing a long RTO
+0.040~+0.045 > [ect01] . 4:4(0) ack 5501 // delayed ack
+0.311 < [ect0] . 5501:6501(1000) ack 4 win 257 // More data
+0 > [ect01] . 4:4(0) ack 6501 // now acks everything
+0.500 < F. 9501:9501(0) ack 4 win 257
Reported-by: Larry Brakmo <[email protected]>
Signed-off-by: Yuchung Cheng <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Acked-by: Neal Cardwell <[email protected]>
Acked-by: Lawrence Brakmo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
We accidentally left out the error handling for kstrtoul().
Fixes: a520030e326a ("qlcnic: Implement flash sysfs callback for 83xx adapter")
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Signed-off-by: Michael Heimpold <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
- I2C core bugfix regarding bus recovery
- driver bugfix for the tegra driver
- typo correction
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: recovery: if possible send STOP with recovery pulses
i2c: tegra: Fix NACK error handling
i2c: stu300: use non-archaic spelling of failes
|
|
Daniel Borkmann says:
====================
pull-request: bpf 2018-07-13
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix AF_XDP TX error reporting before final kernel release such that it
becomes consistent between copy mode and zero-copy, from Magnus.
2) Fix three different syzkaller reported issues: oob due to ld_abs
rewrite with too large offset, another oob in l3 based skb test run
and a bug leaving mangled prog in subprog JITing error path, from Daniel.
3) Fix BTF handling for bitfield extraction on big endian, from Okash.
4) Fix a missing linux/errno.h include in cgroup/BPF found by kbuild bot,
from Roman.
5) Fix xdp2skb_meta.sh sample by using just command names instead of
absolute paths for tc and ip and allow them to be redefined, from Taeung.
6) Fix availability probing for BPF seg6 helpers before final kernel ships
so they can be detected at prog load time, from Mathieu.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Commit 8b7008620b84 ("net: Don't copy pfmemalloc flag in
__copy_skb_header()") introduced a different handling for the
pfmemalloc flag in copy and clone paths.
In __skb_clone(), now, the flag is set only if it was set in the
original skb, but not cleared if it wasn't. This is wrong and
might lead to socket buffers being flagged with pfmemalloc even
if the skb data wasn't allocated from pfmemalloc reserves. Copy
the flag instead of ORing it.
Reported-by: Sabrina Dubroca <[email protected]>
Fixes: 8b7008620b84 ("net: Don't copy pfmemalloc flag in __copy_skb_header()")
Signed-off-by: Stefano Brivio <[email protected]>
Tested-by: Sabrina Dubroca <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fixes from Ingo Molnar:
"A clocksource driver fix and a revert"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource: arm_arch_timer: Set arch_mem_timer cpumask to cpu_possible_mask
Revert "tick: Prefer a lower rating device only if it's CPU local device"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf tool fixes from Ingo Molnar:
"Misc tooling fixes: python3 related fixes, gcc8 fix, bashism fixes and
some other smaller fixes"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tools: Use python-config --includes rather than --cflags
perf script python: Fix dict reference counting
perf stat: Fix --interval_clear option
perf tools: Fix compilation errors on gcc8
perf test shell: Prevent temporary editor files from being considered test scripts
perf llvm-utils: Remove bashism from kernel include fetch script
perf test shell: Make perf's inet_pton test more portable
perf test shell: Replace '|&' with '2>&1 |' to work with more shells
perf scripts python: Add Python 3 support to EventClass.py
perf scripts python: Add Python 3 support to sched-migration.py
perf scripts python: Add Python 3 support to Util.py
perf scripts python: Add Python 3 support to SchedGui.py
perf scripts python: Add Python 3 support to Core.py
perf tools: Generate a Python script compatible with Python 2 and 3
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fix from Ingo Molnar:
"Fix a UEFI mixed mode (64-bit kernel on 32-bit UEFI) reboot loop
regression"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/x86: Fix mixed mode reboot loop by removing pointless call to PciIo->Attributes()
|