Age | Commit message (Collapse) | Author | Files | Lines |
|
Since commit e79f245ddec1 ("X86/KVM: Properly update 'tsc_offset' to
represent the running guest"), vcpu->arch.tsc_offset meaning was
changed to always reflect the tsc_offset value set on active VMCS.
Regardless if vCPU is currently running L1 or L2.
However, above mentioned commit failed to also change
kvm_vcpu_write_tsc_offset() to set vcpu->arch.tsc_offset correctly.
This is because vmx_write_tsc_offset() could set the tsc_offset value
in active VMCS to given offset parameter *plus vmcs12->tsc_offset*.
However, kvm_vcpu_write_tsc_offset() just sets vcpu->arch.tsc_offset
to given offset parameter. Without taking into account the possible
addition of vmcs12->tsc_offset. (Same is true for SVM case).
Fix this issue by changing kvm_x86_ops->write_tsc_offset() to return
actually set tsc_offset in active VMCS and modify
kvm_vcpu_write_tsc_offset() to set returned value in
vcpu->arch.tsc_offset.
In addition, rename write_tsc_offset() callback to write_l1_tsc_offset()
to make it clear that it is meant to set L1 TSC offset.
Fixes: e79f245ddec1 ("X86/KVM: Properly update 'tsc_offset' to represent the running guest")
Reviewed-by: Liran Alon <[email protected]>
Reviewed-by: Mihai Carabas <[email protected]>
Reviewed-by: Krish Sadhukhan <[email protected]>
Signed-off-by: Leonid Shatz <[email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
The inline keyword which is not at the beginning of the function
declaration may trigger the following build warnings, so let's fix it:
arch/x86/kvm/vmx.c:1309:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
arch/x86/kvm/vmx.c:5947:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
arch/x86/kvm/vmx.c:5985:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
arch/x86/kvm/vmx.c:6023:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
Signed-off-by: Yi Wang <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
We get the following warnings about empty statements when building
with 'W=1':
arch/x86/kvm/lapic.c:632:53: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
arch/x86/kvm/lapic.c:1907:42: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
arch/x86/kvm/lapic.c:1936:65: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
arch/x86/kvm/lapic.c:1975:44: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
Rework the debug helper macro to get rid of these warnings.
Signed-off-by: Yi Wang <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
When guest transitions from/to long-mode by modifying MSR_EFER.LMA,
the list of shared MSRs to be saved/restored on guest<->host
transitions is updated (See vmx_set_efer() call to setup_msrs()).
On every entry to guest, vcpu_enter_guest() calls
vmx_prepare_switch_to_guest(). This function should also take care
of setting the shared MSRs to be saved/restored. However, the
function does nothing in case we are already running with loaded
guest state (vmx->loaded_cpu_state != NULL).
This means that even when guest modifies MSR_EFER.LMA which results
in updating the list of shared MSRs, it isn't being taken into account
by vmx_prepare_switch_to_guest() because it happens while we are
running with loaded guest state.
To fix above mentioned issue, add a flag to mark that the list of
shared MSRs has been updated and modify vmx_prepare_switch_to_guest()
to set shared MSRs when running with host state *OR* list of shared
MSRs has been updated.
Note that this issue was mistakenly introduced by commit
678e315e78a7 ("KVM: vmx: add dedicated utility to access guest's
kernel_gs_base") because previously vmx_set_efer() always called
vmx_load_host_state() which resulted in vmx_prepare_switch_to_guest() to
set shared MSRs.
Fixes: 678e315e78a7 ("KVM: vmx: add dedicated utility to access guest's kernel_gs_base")
Reported-by: Eyal Moscovici <[email protected]>
Reviewed-by: Mihai Carabas <[email protected]>
Reviewed-by: Liam Merwick <[email protected]>
Reviewed-by: Jim Mattson <[email protected]>
Signed-off-by: Liran Alon <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
kvm_pv_clock_pairing() allocates local var
"struct kvm_clock_pairing clock_pairing" on stack and initializes
all it's fields besides padding (clock_pairing.pad[]).
Because clock_pairing var is written completely (including padding)
to guest memory, failure to init struct padding results in kernel
info-leak.
Fix the issue by making sure to also init the padding with zeroes.
Fixes: 55dd00a73a51 ("KVM: x86: add KVM_HC_CLOCK_PAIRING hypercall")
Reported-by: [email protected]
Reviewed-by: Mark Kanda <[email protected]>
Signed-off-by: Liran Alon <[email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
KVM_CAP_HYPERV_ENLIGHTENED_VMCS more than once
Consider the case that userspace enables KVM_CAP_HYPERV_ENLIGHTENED_VMCS twice:
1) kvm_vcpu_ioctl_enable_cap() is called to enable
KVM_CAP_HYPERV_ENLIGHTENED_VMCS which calls nested_enable_evmcs().
2) nested_enable_evmcs() sets enlightened_vmcs_enabled to true and fills
vmcs_version which is then copied to userspace.
3) kvm_vcpu_ioctl_enable_cap() is called again to enable
KVM_CAP_HYPERV_ENLIGHTENED_VMCS which calls nested_enable_evmcs().
4) This time nested_enable_evmcs() just returns 0 as
enlightened_vmcs_enabled is already true. *Without filling
vmcs_version*.
5) kvm_vcpu_ioctl_enable_cap() continues as usual and copies
*uninitialized* vmcs_version to userspace which leads to kernel info-leak.
Fix this issue by simply changing nested_enable_evmcs() to always fill
vmcs_version output argument. Even when enlightened_vmcs_enabled is
already set to true.
Note that SVM's nested_enable_evmcs() should not be modified because it
always returns a non-zero value (-ENODEV) which results in
kvm_vcpu_ioctl_enable_cap() skipping the copy of vmcs_version to
userspace (as it should).
Fixes: 57b119da3594 ("KVM: nVMX: add KVM_CAP_HYPERV_ENLIGHTENED_VMCS capability")
Reported-by: [email protected]
Reviewed-by: Krish Sadhukhan <[email protected]>
Reviewed-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: Liran Alon <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
There is a race condition when accessing kvm->arch.apic_access_page_done.
Due to it, x86_set_memory_region will fail when creating the second vcpu
for a svm guest.
Add a mutex_lock to serialize the accesses to apic_access_page_done.
This lock is also used by vmx for the same purpose.
Signed-off-by: Wei Wang <[email protected]>
Signed-off-by: Amadeusz Juskowiak <[email protected]>
Signed-off-by: Julian Stecklina <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
Reviewed-by: Joerg Roedel <[email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Reported by syzkaller:
BUG: unable to handle kernel NULL pointer dereference at 00000000000001c8
PGD 80000003ec4da067 P4D 80000003ec4da067 PUD 3f7bfa067 PMD 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 7 PID: 5059 Comm: debug Tainted: G OE 4.19.0-rc5 #16
RIP: 0010:__lock_acquire+0x1a6/0x1990
Call Trace:
lock_acquire+0xdb/0x210
_raw_spin_lock+0x38/0x70
kvm_ioapic_scan_entry+0x3e/0x110 [kvm]
vcpu_enter_guest+0x167e/0x1910 [kvm]
kvm_arch_vcpu_ioctl_run+0x35c/0x610 [kvm]
kvm_vcpu_ioctl+0x3e9/0x6d0 [kvm]
do_vfs_ioctl+0xa5/0x690
ksys_ioctl+0x6d/0x80
__x64_sys_ioctl+0x1a/0x20
do_syscall_64+0x83/0x6e0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
The reason is that the testcase writes hyperv synic HV_X64_MSR_SINT6 msr
and triggers scan ioapic logic to load synic vectors into EOI exit bitmap.
However, irqchip is not initialized by this simple testcase, ioapic/apic
objects should not be accessed.
This can be triggered by the following program:
#define _GNU_SOURCE
#include <endian.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
uint64_t r[3] = {0xffffffffffffffff, 0xffffffffffffffff, 0xffffffffffffffff};
int main(void)
{
syscall(__NR_mmap, 0x20000000, 0x1000000, 3, 0x32, -1, 0);
long res = 0;
memcpy((void*)0x20000040, "/dev/kvm", 9);
res = syscall(__NR_openat, 0xffffffffffffff9c, 0x20000040, 0, 0);
if (res != -1)
r[0] = res;
res = syscall(__NR_ioctl, r[0], 0xae01, 0);
if (res != -1)
r[1] = res;
res = syscall(__NR_ioctl, r[1], 0xae41, 0);
if (res != -1)
r[2] = res;
memcpy(
(void*)0x20000080,
"\x01\x00\x00\x00\x00\x5b\x61\xbb\x96\x00\x00\x40\x00\x00\x00\x00\x01\x00"
"\x08\x00\x00\x00\x00\x00\x0b\x77\xd1\x78\x4d\xd8\x3a\xed\xb1\x5c\x2e\x43"
"\xaa\x43\x39\xd6\xff\xf5\xf0\xa8\x98\xf2\x3e\x37\x29\x89\xde\x88\xc6\x33"
"\xfc\x2a\xdb\xb7\xe1\x4c\xac\x28\x61\x7b\x9c\xa9\xbc\x0d\xa0\x63\xfe\xfe"
"\xe8\x75\xde\xdd\x19\x38\xdc\x34\xf5\xec\x05\xfd\xeb\x5d\xed\x2e\xaf\x22"
"\xfa\xab\xb7\xe4\x42\x67\xd0\xaf\x06\x1c\x6a\x35\x67\x10\x55\xcb",
106);
syscall(__NR_ioctl, r[2], 0x4008ae89, 0x20000080);
syscall(__NR_ioctl, r[2], 0xae80, 0);
return 0;
}
This patch fixes it by bailing out scan ioapic if ioapic is not initialized in
kernel.
Reported-by: Wei Wu <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Wei Wu <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Reported by syzkaller:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000014
PGD 800000040410c067 P4D 800000040410c067 PUD 40410d067 PMD 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 3 PID: 2567 Comm: poc Tainted: G OE 4.19.0-rc5 #16
RIP: 0010:kvm_pv_send_ipi+0x94/0x350 [kvm]
Call Trace:
kvm_emulate_hypercall+0x3cc/0x700 [kvm]
handle_vmcall+0xe/0x10 [kvm_intel]
vmx_handle_exit+0xc1/0x11b0 [kvm_intel]
vcpu_enter_guest+0x9fb/0x1910 [kvm]
kvm_arch_vcpu_ioctl_run+0x35c/0x610 [kvm]
kvm_vcpu_ioctl+0x3e9/0x6d0 [kvm]
do_vfs_ioctl+0xa5/0x690
ksys_ioctl+0x6d/0x80
__x64_sys_ioctl+0x1a/0x20
do_syscall_64+0x83/0x6e0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
The reason is that the apic map has not yet been initialized, the testcase
triggers pv_send_ipi interface by vmcall which results in kvm->arch.apic_map
is dereferenced. This patch fixes it by checking whether or not apic map is
NULL and bailing out immediately if that is the case.
Fixes: 4180bf1b65 (KVM: X86: Implement "send IPI" hypercall)
Reported-by: Wei Wu <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Radim Krčmář <[email protected]>
Cc: Wei Wu <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
Cc: [email protected]
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Apparently, the ple_gap parameter was accidentally removed
by commit c8e88717cfc6b36bedea22368d97667446318291. Add it
back.
Signed-off-by: Luiz Capitulino <[email protected]>
Cc: [email protected]
Fixes: c8e88717cfc6b36bedea22368d97667446318291
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Fix an error caused by 3-bit right rotation on offset address
calculation of MSI-X table in dw_pcie_ep_raise_msix_irq().
The initial testing code was setting by default the offset address of
MSI-X table to zero, so that even with a 3-bit right rotation the
computed result would still be zero and valid, therefore this bug went
unnoticed.
Fixes: beb4641a787d ("PCI: dwc: Add MSI-X callbacks handler")
Signed-off-by: Gustavo Pimentel <[email protected]>
[[email protected]: updated commit log]
Signed-off-by: Lorenzo Pieralisi <[email protected]>
Cc: [email protected]
|
|
This patch will enable ALC300.
[ It's almost equivalent with other ALC269-compatible ones, and
apparently has no loopback mixer -- tiwai ]
Signed-off-by: Kailang Yang <[email protected]>
Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
|
|
This device makes a loud buzzing sound when a headphone is inserted while
playing audio at full volume through the speaker.
Fixes: bbf8ff6b1d2a ("ALSA: hda/realtek - Fixup for HP x360 laptops with B&O speakers")
Signed-off-by: Girija Kumar Kasinadhuni <[email protected]>
Cc: <[email protected]>
Signed-off-by: Takashi Iwai <[email protected]>
|
|
Until the vfio-ap driver came into live there was a well known
agreement about the way how ap devices are initialized and their
states when the driver's probe function is called.
However, the vfio device driver when receiving an ap queue device does
additional resets thereby removing the registration for interrupts for
the ap device done by the ap bus core code. So when later the vfio
driver releases the device and one of the default zcrypt drivers takes
care of the device the interrupt registration needs to get
renewed. The current code does no renew and result is that requests
send into such a queue will never see a reply processed - the
application hangs.
This patch adds a function which resets the aq queue state machine for
the ap queue device and triggers the walk through the initial states
(which are reset and registration for interrupts). This function is
now called before the driver's probe function is invoked.
When the association between driver and device is released, the
driver's remove function is called. The current implementation calls a
ap queue function ap_queue_remove(). This invokation has been moved to
the ap bus function to make the probe / remove pair for ap bus and
drivers more symmetric.
Fixes: 7e0bdbe5c21c ("s390/zcrypt: AP bus support for alternate driver(s)")
Cc: [email protected] # 4.19+
Signed-off-by: Harald Freudenberger <[email protected]>
Reviewd-by: Tony Krowiak <[email protected]>
Reviewd-by: Martin Schwidefsky <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
|
|
The function ext2_xattr_set calls brelse(bh) to drop the reference count
of bh. After that, bh may be freed. However, following brelse(bh),
it reads bh->b_data via macro HDR(bh). This may result in a
use-after-free bug. This patch moves brelse(bh) after reading field.
CC: [email protected]
Signed-off-by: Pan Bian <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
|
|
We need to initialize opts.s_mount_opt as zero before using it, else we
may get some unexpected mount options.
Fixes: 088519572ca8 ("ext2: Parse mount options into a dedicated structure")
CC: [email protected]
Signed-off-by: xingaopeng <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
|
|
Move all arguments into output registers from input registers.
This path is exercised by test_verifier.c's "calls: two calls with
args" test. Adjust BPF_TAILCALL_PROLOGUE_SKIP as needed.
Let's also make the prologue length a constant size regardless of
the combination of ->saw_frame_pointer and ->saw_tail_call
settings.
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
Commit 587f7a694f01 ("gpio: davinci: Use dev name for label and
automatic base selection") broke the GPIO support on DaVinci boards
in legacy mode by allowing gpiolib to set the GPIO base automatically.
DaVinci board files use the legacy GPIO API with hard-coded GPIO line
numbers. Use the new fields in struct davinci_gpio_platform_data to
manually set the GPIO base to 0.
Fixes: 587f7a694f01 ("gpio: davinci: Use dev name for label and automatic base selection")
Cc: [email protected]
Signed-off-by: Bartosz Golaszewski <[email protected]>
Signed-off-by: Sekhar Nori <[email protected]>
|
|
Commit 587f7a694f01 ("gpio: davinci: Use dev name for label and
automatic base selection") broke the GPIO support on DaVinci boards
in legacy mode by allowing gpiolib to set the GPIO base automatically.
DaVinci board files use the legacy GPIO API with hard-coded GPIO line
numbers. Use the new fields in struct davinci_gpio_platform_data to
manually set the GPIO base to 0.
Fixes: 587f7a694f01 ("gpio: davinci: Use dev name for label and automatic base selection")
Cc: [email protected]
Signed-off-by: Bartosz Golaszewski <[email protected]>
Signed-off-by: Sekhar Nori <[email protected]>
|
|
Commit 587f7a694f01 ("gpio: davinci: Use dev name for label and
automatic base selection") broke the GPIO support on DaVinci boards
in legacy mode by allowing gpiolib to set the GPIO base automatically.
DaVinci board files use the legacy GPIO API with hard-coded GPIO line
numbers. Use the new fields in struct davinci_gpio_platform_data to
manually set the GPIO base to 0.
Fixes: 587f7a694f01 ("gpio: davinci: Use dev name for label and automatic base selection")
Cc: [email protected]
Signed-off-by: Bartosz Golaszewski <[email protected]>
Signed-off-by: Sekhar Nori <[email protected]>
|
|
gcc '-Wunused-but-set-variable' warning:
drivers/misc/mic/scif/scif_rma.c: In function 'scif_create_remote_lookup':
drivers/misc/mic/scif/scif_rma.c:373:25: warning:
variable 'vmalloc_num_pages' set but not used [-Wunused-but-set-variable]
'vmalloc_num_pages' should be used to determine if the address is
within the vmalloc range.
Fixes: ba612aa8b487 ("misc: mic: SCIF memory registration and unregistration")
Signed-off-by: YueHaibing <[email protected]>
Cc: stable <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Free the kobject name that was allocated for the controller device on
failure rather than its parent.
Fixes: d22524a4782a9 ("nvme: switch controller refcounting to use struct device")
Signed-off-by: Keith Busch <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
Layout of coprocessor registers in the elf_xtregs_t and
xtregs_coprocessor_t may be different due to alignment. Thus it is not
always possible to copy data between the xtregs_coprocessor_t structure
and the elf_xtregs_t and get correct values for all registers.
Use a table of offsets and sizes of individual coprocessor register
groups to do coprocessor context copying in the ptrace_getxregs and
ptrace_setxregs.
This fixes incorrect coprocessor register values reading from the user
process by the native gdb on an xtensa core with multiple coprocessors
and registers with high alignment requirements.
Cc: [email protected]
Signed-off-by: Max Filippov <[email protected]>
|
|
Coprocessor context offsets are used by the assembly code that moves
coprocessor context between the individual fields of the
thread_info::xtregs_cp structure and coprocessor registers.
This fixes coprocessor context clobbering on flushing and reloading
during normal user code execution and user process debugging in the
presence of more than one coprocessor in the core configuration.
Cc: [email protected]
Signed-off-by: Max Filippov <[email protected]>
|
|
coprocessor_flush_all may be called from a context of a thread that is
different from the thread being flushed. In that case contents of the
cpenable special register may not match ti->cpenable of the target
thread, resulting in unhandled coprocessor exception in the kernel
context.
Set cpenable special register to the ti->cpenable of the target register
for the duration of the flush and restore it afterwards.
This fixes the following crash caused by coprocessor register inspection
in native gdb:
(gdb) p/x $w0
Illegal instruction in kernel: sig: 9 [#1] PREEMPT
Call Trace:
___might_sleep+0x184/0x1a4
__might_sleep+0x41/0xac
exit_signals+0x14/0x218
do_exit+0xc9/0x8b8
die+0x99/0xa0
do_illegal_instruction+0x18/0x6c
common_exception+0x77/0x77
coprocessor_flush+0x16/0x3c
arch_ptrace+0x46c/0x674
sys_ptrace+0x2ce/0x3b4
system_call+0x54/0x80
common_exception+0x77/0x77
note: gdb[100] exited with preempt_count 1
Killed
Cc: [email protected]
Signed-off-by: Max Filippov <[email protected]>
|
|
The numa_slit variable used by node_distance is available to a
module as long as it is linked at compile-time. However, it is
not available to loadable modules. Leading to errors such as:
ERROR: "numa_slit" [drivers/nvme/host/nvme-core.ko] undefined!
The error above is caused by the nvme multipath code that makes
use of node_distance for its path calculation. When the patch was
added, the lightnvm subsystem would select nvme and always compile
it in, leading to the node_distance call to always succeed.
However, when this requirement was removed, nvme could be compiled
in as a module, which exposed this bug.
This patch extracts node_distance to a function and exports it.
Since ACPI is depending on node_distance being a simple lookup to
numa_slit, the previous behavior is exposed as slit_distance and its
users updated.
Fixes: f333444708f82 "nvme: take node locality into account when selecting a path"
Fixes: 73569e11032f "lightnvm: remove dependencies on BLK_DEV_NVME and PCI"
Signed-off-by: Matias Bjøring <[email protected]>
Signed-off-by: Tony Luck <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Make the high-level BPF JIT entry a general 'catch-all' and add
architecture specific entries to make it more clear who actively
maintains which BPF JIT compiler. The list (L) address implies
that this eventually lands in the bpf patchwork bucket. Goal is
that this set of responsible developers listed here is always up
to date and a point of contact for helping out in e.g. feature
development, fixes, review or testing patches in order to help
long-term in ensuring quality of the BPF JITs and therefore BPF
core under a given architecture. Every new JIT in future /must/
have an entry here as well.
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Acked-by: Naveen N. Rao <[email protected]>
Acked-by: Sandipan Das <[email protected]>
Acked-by: Martin Schwidefsky <[email protected]>
Acked-by: Heiko Carstens <[email protected]>
Acked-by: David S. Miller <[email protected]>
Acked-by: Zi Shen Lim <[email protected]>
Acked-by: Paul Burton <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Acked-by: Wang YanQing <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
We need to initialize the frame pointer register not just if it is
seen as a source operand, but also if it is seen as the destination
operand of a store or an atomic instruction (which effectively is a
source operand).
This is exercised by test_verifier's "non-invalid fp arithmetic"
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
On T4 and later sparc64 cpus we can use the fused compare and branch
instruction.
However, it can only be used if the branch destination is in the range
of a signed 10-bit immediate offset. This amounts to 1024
instructions forwards or backwards.
After the commit referenced in the Fixes: tag, the largest possible
size program seen by the JIT explodes by a significant factor.
As a result of this convergance takes many more passes since the
expanded "BPF_LDX | BPF_MSH | BPF_B" code sequence, for example,
contains several embedded branch on condition instructions.
On each pass, as suddenly new fused compare and branch instances
become valid, this makes thousands more in range for the next pass.
And so on and so forth.
This is most greatly exemplified by "BPF_MAXINSNS: exec all MSH" which
takes 35 passes to converge, and shrinks the image by about 64K.
To decrease the cost of this number of convergance passes, do the
convergance pass before we have the program image allocated, just like
other JITs (such as x86) do.
Fixes: e0cea7ce988c ("bpf: implement ld_abs/ld_ind in native bpf")
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Daniel Borkmann says:
====================
This set contains a fix for arm64 BPF JIT. First patch generalizes
ppc64 way of retrieving subprog into bpf_jit_get_func_addr() as core
code and uses the same on arm64 in second patch. Tested on both arm64
and ppc64.
====================
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
The arm64 JIT has the same issue as ppc64 JIT in that the relative BPF
to BPF call offset can be too far away from core kernel in that relative
encoding into imm is not sufficient and could potentially be truncated,
see also fd045f6cd98e ("arm64: add support for module PLTs") which adds
spill-over space for module_alloc() and therefore bpf_jit_binary_alloc().
Therefore, use the recently added bpf_jit_get_func_addr() helper for
properly fetching the address through prog->aux->func[off]->bpf_func
instead. This also has the benefit to optimize normal helper calls since
their address can use the optimized emission. Tested on Cavium ThunderX
CN8890.
Fixes: db496944fdaa ("bpf: arm64: add JIT support for multi-function programs")
Signed-off-by: Daniel Borkmann <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Make fetching of the BPF call address from ppc64 JIT generic. ppc64
was using a slightly different variant rather than through the insns'
imm field encoding as the target address would not fit into that space.
Therefore, the target subprog number was encoded into the insns' offset
and fetched through fp->aux->func[off]->bpf_func instead. Given there
are other JITs with this issue and the mechanism of fetching the address
is JIT-generic, move it into the core as a helper instead. On the JIT
side, we get information on whether the retrieved address is a fixed
one, that is, not changing through JIT passes, or a dynamic one. For
the former, JITs can optimize their imm emission because this doesn't
change jump offsets throughout JIT process.
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Sandipan Das <[email protected]>
Tested-by: Sandipan Das <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
All lists that reach the tree_nodes_free() function have both zero
counter and true dead flag. The reason for this is that lists to be
release are selected by nf_conncount_gc_list() which already decrements
the list counter and sets on the dead flag. Therefore, this if statement
in tree_nodes_free() is unnecessary and wrong.
Fixes: 31568ec09ea0 ("netfilter: nf_conncount: fix list_del corruption in conn_free")
Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
There is a reference counter to ensure that masquerade modules register
notifiers only once. However, the existing reference counter approach is
not safe, test commands are:
while :
do
modprobe ip6t_MASQUERADE &
modprobe nft_masq_ipv6 &
modprobe -rv ip6t_MASQUERADE &
modprobe -rv nft_masq_ipv6 &
done
numbers below represent the reference counter.
--------------------------------------------------------
CPU0 CPU1 CPU2 CPU3 CPU4
[insmod] [insmod] [rmmod] [rmmod] [insmod]
--------------------------------------------------------
0->1
register 1->2
returns 2->1
returns 1->0
0->1
register <--
unregister
--------------------------------------------------------
The unregistation of CPU3 should be processed before the
registration of CPU4.
In order to fix this, use a mutex instead of reference counter.
splat looks like:
[ 323.869557] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [modprobe:1381]
[ 323.869574] Modules linked in: nf_tables(+) nf_nat_ipv6(-) nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 n]
[ 323.869574] irq event stamp: 194074
[ 323.898930] hardirqs last enabled at (194073): [<ffffffff90004a0d>] trace_hardirqs_on_thunk+0x1a/0x1c
[ 323.898930] hardirqs last disabled at (194074): [<ffffffff90004a29>] trace_hardirqs_off_thunk+0x1a/0x1c
[ 323.898930] softirqs last enabled at (182132): [<ffffffff922006ec>] __do_softirq+0x6ec/0xa3b
[ 323.898930] softirqs last disabled at (182109): [<ffffffff90193426>] irq_exit+0x1a6/0x1e0
[ 323.898930] CPU: 0 PID: 1381 Comm: modprobe Not tainted 4.20.0-rc2+ #27
[ 323.898930] RIP: 0010:raw_notifier_chain_register+0xea/0x240
[ 323.898930] Code: 3c 03 0f 8e f2 00 00 00 44 3b 6b 10 7f 4d 49 bc 00 00 00 00 00 fc ff df eb 22 48 8d 7b 10 488
[ 323.898930] RSP: 0018:ffff888101597218 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
[ 323.898930] RAX: 0000000000000000 RBX: ffffffffc04361c0 RCX: 0000000000000000
[ 323.898930] RDX: 1ffffffff26132ae RSI: ffffffffc04aa3c0 RDI: ffffffffc04361d0
[ 323.898930] RBP: ffffffffc04361c8 R08: 0000000000000000 R09: 0000000000000001
[ 323.898930] R10: ffff8881015972b0 R11: fffffbfff26132c4 R12: dffffc0000000000
[ 323.898930] R13: 0000000000000000 R14: 1ffff110202b2e44 R15: ffffffffc04aa3c0
[ 323.898930] FS: 00007f813ed41540(0000) GS:ffff88811ae00000(0000) knlGS:0000000000000000
[ 323.898930] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 323.898930] CR2: 0000559bf2c9f120 CR3: 000000010bc80000 CR4: 00000000001006f0
[ 323.898930] Call Trace:
[ 323.898930] ? atomic_notifier_chain_register+0x2d0/0x2d0
[ 323.898930] ? down_read+0x150/0x150
[ 323.898930] ? sched_clock_cpu+0x126/0x170
[ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[ 323.898930] register_netdevice_notifier+0xbb/0x790
[ 323.898930] ? __dev_close_many+0x2d0/0x2d0
[ 323.898930] ? __mutex_unlock_slowpath+0x17f/0x740
[ 323.898930] ? wait_for_completion+0x710/0x710
[ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[ 323.898930] ? up_write+0x6c/0x210
[ 323.898930] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[ 324.127073] ? nf_tables_core_module_init+0xe4/0xe4 [nf_tables]
[ 324.127073] nft_chain_filter_init+0x1e/0xe8a [nf_tables]
[ 324.127073] nf_tables_module_init+0x37/0x92 [nf_tables]
[ ... ]
Fixes: 8dd33cc93ec9 ("netfilter: nf_nat: generalize IPv4 masquerading support for nf_tables")
Fixes: be6b635cd674 ("netfilter: nf_nat: generalize IPv6 masquerading support for nf_tables")
Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
register_{netdevice/inetaddr/inet6addr}_notifier may return an error
value, this patch adds the code to handle these error paths.
Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Memory windows are implemented with an indirect MKey, when a page fault
event comes for a MW Mkey we need to find the MR at the end of the list of
the indirect MKeys by iterating on all items from the first to the last.
The offset calculated during this process has to be zeroed after the first
iteration or the next iteration will start from a wrong address, resulting
incorrect ODP faulting behavior.
Fixes: db570d7deafb ("IB/mlx5: Add ODP support to MW")
Signed-off-by: Artemy Kovalyov <[email protected]>
Signed-off-by: Moni Shoua <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
The invalidate range was using PAGE_SIZE instead of the computed 'end',
and had the wrong transformation of page_index due the weird
construction. This can trigger during error unwind and would cause
malfunction.
Inline the code and correct the math.
Fixes: 403cd12e2cf7 ("IB/umem: Add contiguous ODP support")
Signed-off-by: Artemy Kovalyov <[email protected]>
Signed-off-by: Moni Shoua <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
It is possible that we call pagefault_single_data_segment() with a MKey
that belongs to a memory region which is not on demand (i.e. pinned
pages). This can happen if, for instance, a WQE that points to multiple
MRs where some of them are ODP MRs and some are not. In this case we
don't need to handle this MR in the ODP context besides reporting success.
Otherwise the code will call pagefault_mr() which will do to_ib_umem_odp()
on a non-ODP MR and thus access out of bounds.
Fixes: 7bdf65d411c1 ("IB/mlx5: Handle page faults")
Signed-off-by: Artemy Kovalyov <[email protected]>
Signed-off-by: Moni Shoua <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
When ip6_route_me_harder is invoked, it resets outgoing interface of:
- link-local scoped packets sent by neighbor discovery
- multicast packets sent by MLD host
- multicast packets send by MLD proxy daemon that sets outgoing
interface through IPV6_PKTINFO ipi6_ifindex
Link-local and multicast packets must keep their original oif after
ip6_route_me_harder is called.
Signed-off-by: Alin Nastac <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Currently all the architectures do basically the same thing in preparing the
function graph tracer on entry to a function. This code can be pulled into a
generic location and then this will allow the function graph tracer to be
fixed, as well as extended.
Create a new function graph helper function_graph_enter() that will call the
hook function (ftrace_graph_entry) and the shadow stack operation
(ftrace_push_return_trace), and remove the need of the architecture code to
manage the shadow stack.
This is needed to prepare for a fix of a design bug on how the curr_ret_stack
is used.
Cc: [email protected]
Fixes: 03274a3ffb449 ("tracing/fgraph: Adjust fgraph depth before calling trace return callback")
Reviewed-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
|
|
This essential mode for PAL users is missing, so add it.
Fixes: 335e3713afb87 ("drm/meson: Add support for HDMI venc modes and settings")
Signed-off-by: Christian Hewitt <[email protected]>
Acked-by: Neil Armstrong <[email protected]>
Signed-off-by: Neil Armstrong <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Sean Paul <[email protected]>
|
|
Currently on driver bringup with KASAN enabled, meson triggers an OOB
memory access as shown below:
[ 117.904528] ==================================================================
[ 117.904560] BUG: KASAN: global-out-of-bounds in meson_viu_set_osd_lut+0x7a0/0x890
[ 117.904588] Read of size 4 at addr ffff20000a63ce24 by task systemd-udevd/498
[ 117.904601]
[ 118.083372] CPU: 4 PID: 498 Comm: systemd-udevd Not tainted 4.20.0-rc3Lyude-Test+ #20
[ 118.091143] Hardware name: amlogic khadas-vim2/khadas-vim2, BIOS 2018.07-rc2-armbian 09/11/2018
[ 118.099768] Call trace:
[ 118.102181] dump_backtrace+0x0/0x3e8
[ 118.105796] show_stack+0x14/0x20
[ 118.109083] dump_stack+0x130/0x1c4
[ 118.112539] print_address_description+0x60/0x25c
[ 118.117214] kasan_report+0x1b4/0x368
[ 118.120851] __asan_report_load4_noabort+0x18/0x20
[ 118.125566] meson_viu_set_osd_lut+0x7a0/0x890
[ 118.129953] meson_viu_init+0x10c/0x290
[ 118.133741] meson_drv_bind_master+0x474/0x748
[ 118.138141] meson_drv_bind+0x10/0x18
[ 118.141760] try_to_bring_up_master+0x3d8/0x768
[ 118.146249] component_add+0x214/0x570
[ 118.149978] meson_dw_hdmi_probe+0x18/0x20 [meson_dw_hdmi]
[ 118.155404] platform_drv_probe+0x98/0x138
[ 118.159455] really_probe+0x2a0/0xa70
[ 118.163070] driver_probe_device+0x1b4/0x2d8
[ 118.167299] __driver_attach+0x200/0x280
[ 118.171189] bus_for_each_dev+0x10c/0x1a8
[ 118.175144] driver_attach+0x38/0x50
[ 118.178681] bus_add_driver+0x330/0x608
[ 118.182471] driver_register+0x140/0x388
[ 118.186361] __platform_driver_register+0xc8/0x108
[ 118.191117] meson_dw_hdmi_platform_driver_init+0x1c/0x1000 [meson_dw_hdmi]
[ 118.198022] do_one_initcall+0x12c/0x3bc
[ 118.201883] do_init_module+0x1fc/0x638
[ 118.205673] load_module+0x4b4c/0x6808
[ 118.209387] __se_sys_init_module+0x2e8/0x3c0
[ 118.213699] __arm64_sys_init_module+0x68/0x98
[ 118.218100] el0_svc_common+0x104/0x210
[ 118.221893] el0_svc_handler+0x48/0xb8
[ 118.225594] el0_svc+0x8/0xc
[ 118.228429]
[ 118.229887] The buggy address belongs to the variable:
[ 118.235007] eotf_33_linear_mapping+0x84/0xc0
[ 118.239301]
[ 118.240752] Memory state around the buggy address:
[ 118.245522] ffff20000a63cd00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 118.252695] ffff20000a63cd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 118.259850] >ffff20000a63ce00: 00 00 00 00 04 fa fa fa fa fa fa fa 00 00 00 00
[ 118.267000] ^
[ 118.271222] ffff20000a63ce80: 00 fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
[ 118.278393] ffff20000a63cf00: 00 00 00 00 00 00 00 00 00 00 00 00 04 fa fa fa
[ 118.285542] ==================================================================
[ 118.292699] Disabling lock debugging due to kernel taint
It seems that when looping through the OSD EOTF LUT maps, we use the
same max iterator for OETF: 20. This is wrong though, since 20*2 is 40,
which means that we'll stop out of bounds on the EOTF maps.
But, this whole thing is already confusing enough to read through as-is,
so let's just replace all of the hardcoded sizes with
OSD_(OETF/EOTF)_LUT_SIZE / 2.
Signed-off-by: Lyude Paul <[email protected]>
Fixes: bbbe775ec5b5 ("drm: Add support for Amlogic Meson Graphic Controller")
Cc: Neil Armstrong <[email protected]>
Cc: Maxime Ripard <[email protected]>
Cc: Carlo Caione <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: <[email protected]> # v4.10+
Acked-by: Neil Armstrong <[email protected]>
Signed-off-by: Neil Armstrong <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Sean Paul <[email protected]>
|
|
Seeing as we use this registermap in the context of our IRQ handlers, we
need to be using spinlocks for reading/writing registers so that we can
still read them from IRQ handlers without having to grab any mutexes and
accidentally sleep. We don't currently do this, as pointed out by
lockdep:
[ 18.403770] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908
[ 18.406744] in_atomic(): 1, irqs_disabled(): 128, pid: 68, name: kworker/u17:0
[ 18.413864] INFO: lockdep is turned off.
[ 18.417675] irq event stamp: 12
[ 18.420778] hardirqs last enabled at (11): [<ffff000008a4f57c>] _raw_spin_unlock_irq+0x2c/0x60
[ 18.429510] hardirqs last disabled at (12): [<ffff000008a48914>] __schedule+0xc4/0xa60
[ 18.437345] softirqs last enabled at (0): [<ffff0000080b55e0>] copy_process.isra.4.part.5+0x4d8/0x1c50
[ 18.446684] softirqs last disabled at (0): [<0000000000000000>] (null)
[ 18.453979] CPU: 0 PID: 68 Comm: kworker/u17:0 Tainted: G W O 4.20.0-rc3Lyude-Test+ #9
[ 18.469839] Hardware name: amlogic khadas-vim2/khadas-vim2, BIOS 2018.07-rc2-armbian 09/11/2018
[ 18.480037] Workqueue: hci0 hci_power_on [bluetooth]
[ 18.487138] Call trace:
[ 18.494192] dump_backtrace+0x0/0x1b8
[ 18.501280] show_stack+0x14/0x20
[ 18.508361] dump_stack+0xbc/0xf4
[ 18.515427] ___might_sleep+0x140/0x1d8
[ 18.522515] __might_sleep+0x50/0x88
[ 18.529582] __mutex_lock+0x60/0x870
[ 18.536621] mutex_lock_nested+0x1c/0x28
[ 18.543660] regmap_lock_mutex+0x10/0x18
[ 18.550696] regmap_read+0x38/0x70
[ 18.557727] dw_hdmi_hardirq+0x58/0x138 [dw_hdmi]
[ 18.564804] __handle_irq_event_percpu+0xac/0x410
[ 18.571891] handle_irq_event_percpu+0x34/0x88
[ 18.578982] handle_irq_event+0x48/0x78
[ 18.586051] handle_fasteoi_irq+0xac/0x160
[ 18.593061] generic_handle_irq+0x24/0x38
[ 18.599989] __handle_domain_irq+0x60/0xb8
[ 18.606857] gic_handle_irq+0x50/0xa0
[ 18.613659] el1_irq+0xb4/0x130
[ 18.620394] debug_lockdep_rcu_enabled+0x2c/0x30
[ 18.627111] schedule+0x38/0xa0
[ 18.633781] schedule_timeout+0x3a8/0x510
[ 18.640389] wait_for_common+0x15c/0x180
[ 18.646905] wait_for_completion+0x14/0x20
[ 18.653319] mmc_wait_for_req_done+0x28/0x168
[ 18.659693] mmc_wait_for_req+0xa8/0xe8
[ 18.665978] mmc_wait_for_cmd+0x64/0x98
[ 18.672180] mmc_io_rw_direct_host+0x94/0x130
[ 18.678385] mmc_io_rw_direct+0x10/0x18
[ 18.684516] sdio_enable_func+0xe8/0x1d0
[ 18.690627] btsdio_open+0x24/0xc0 [btsdio]
[ 18.696821] hci_dev_do_open+0x64/0x598 [bluetooth]
[ 18.703025] hci_power_on+0x50/0x270 [bluetooth]
[ 18.709163] process_one_work+0x2a0/0x6e0
[ 18.715252] worker_thread+0x40/0x448
[ 18.721310] kthread+0x12c/0x130
[ 18.727326] ret_from_fork+0x10/0x1c
[ 18.735555] ------------[ cut here ]------------
[ 18.741430] do not call blocking ops when !TASK_RUNNING; state=2 set at [<000000006265ec59>] wait_for_common+0x140/0x180
[ 18.752417] WARNING: CPU: 0 PID: 68 at kernel/sched/core.c:6096 __might_sleep+0x7c/0x88
[ 18.760553] Modules linked in: dm_mirror dm_region_hash dm_log dm_mod
btsdio bluetooth snd_soc_hdmi_codec dw_hdmi_i2s_audio ecdh_generic
brcmfmac brcmutil cfg80211 rfkill ir_nec_decoder meson_dw_hdmi(O)
dw_hdmi rc_geekbox meson_rng meson_ir ao_cec rng_core rc_core cec
leds_pwm efivars nfsd ip_tables x_tables crc32_generic f2fs uas
meson_gxbb_wdt pwm_meson efivarfs ipv6
[ 18.799469] CPU: 0 PID: 68 Comm: kworker/u17:0 Tainted: G W O 4.20.0-rc3Lyude-Test+ #9
[ 18.808858] Hardware name: amlogic khadas-vim2/khadas-vim2, BIOS 2018.07-rc2-armbian 09/11/2018
[ 18.818045] Workqueue: hci0 hci_power_on [bluetooth]
[ 18.824088] pstate: 80000085 (Nzcv daIf -PAN -UAO)
[ 18.829891] pc : __might_sleep+0x7c/0x88
[ 18.835722] lr : __might_sleep+0x7c/0x88
[ 18.841256] sp : ffff000008003cb0
[ 18.846751] x29: ffff000008003cb0 x28: 0000000000000000
[ 18.852269] x27: ffff00000938e000 x26: ffff800010283000
[ 18.857726] x25: ffff800010353280 x24: ffff00000868ef50
[ 18.863166] x23: 0000000000000000 x22: 0000000000000000
[ 18.868551] x21: 0000000000000000 x20: 000000000000038c
[ 18.873850] x19: ffff000008cd08c0 x18: 0000000000000010
[ 18.879081] x17: ffff000008a68cb0 x16: 0000000000000000
[ 18.884197] x15: 0000000000aaaaaa x14: 0e200e200e200e20
[ 18.889239] x13: 0000000000000001 x12: 00000000ffffffff
[ 18.894261] x11: ffff000008adfa48 x10: 0000000000000001
[ 18.899517] x9 : ffff0000092a0158 x8 : 0000000000000000
[ 18.904674] x7 : ffff00000812136c x6 : 0000000000000000
[ 18.909895] x5 : 0000000000000000 x4 : 0000000000000001
[ 18.915080] x3 : 0000000000000007 x2 : 0000000000000007
[ 18.920269] x1 : 99ab8e9ebb6c8500 x0 : 0000000000000000
[ 18.925443] Call trace:
[ 18.929904] __might_sleep+0x7c/0x88
[ 18.934311] __mutex_lock+0x60/0x870
[ 18.938687] mutex_lock_nested+0x1c/0x28
[ 18.943076] regmap_lock_mutex+0x10/0x18
[ 18.947453] regmap_read+0x38/0x70
[ 18.951842] dw_hdmi_hardirq+0x58/0x138 [dw_hdmi]
[ 18.956269] __handle_irq_event_percpu+0xac/0x410
[ 18.960712] handle_irq_event_percpu+0x34/0x88
[ 18.965176] handle_irq_event+0x48/0x78
[ 18.969612] handle_fasteoi_irq+0xac/0x160
[ 18.974058] generic_handle_irq+0x24/0x38
[ 18.978501] __handle_domain_irq+0x60/0xb8
[ 18.982938] gic_handle_irq+0x50/0xa0
[ 18.987351] el1_irq+0xb4/0x130
[ 18.991734] debug_lockdep_rcu_enabled+0x2c/0x30
[ 18.996180] schedule+0x38/0xa0
[ 19.000609] schedule_timeout+0x3a8/0x510
[ 19.005064] wait_for_common+0x15c/0x180
[ 19.009513] wait_for_completion+0x14/0x20
[ 19.013951] mmc_wait_for_req_done+0x28/0x168
[ 19.018402] mmc_wait_for_req+0xa8/0xe8
[ 19.022809] mmc_wait_for_cmd+0x64/0x98
[ 19.027177] mmc_io_rw_direct_host+0x94/0x130
[ 19.031563] mmc_io_rw_direct+0x10/0x18
[ 19.035922] sdio_enable_func+0xe8/0x1d0
[ 19.040294] btsdio_open+0x24/0xc0 [btsdio]
[ 19.044742] hci_dev_do_open+0x64/0x598 [bluetooth]
[ 19.049228] hci_power_on+0x50/0x270 [bluetooth]
[ 19.053687] process_one_work+0x2a0/0x6e0
[ 19.058143] worker_thread+0x40/0x448
[ 19.062608] kthread+0x12c/0x130
[ 19.067064] ret_from_fork+0x10/0x1c
[ 19.071513] irq event stamp: 12
[ 19.075937] hardirqs last enabled at (11): [<ffff000008a4f57c>] _raw_spin_unlock_irq+0x2c/0x60
[ 19.083560] hardirqs last disabled at (12): [<ffff000008a48914>] __schedule+0xc4/0xa60
[ 19.091401] softirqs last enabled at (0): [<ffff0000080b55e0>] copy_process.isra.4.part.5+0x4d8/0x1c50
[ 19.100801] softirqs last disabled at (0): [<0000000000000000>] (null)
[ 19.108135] ---[ end trace 38c4920787b88c75 ]---
So, fix this by enabling the fast_io option in our regmap config so that
regmap uses spinlocks for locking instead of mutexes.
Signed-off-by: Lyude Paul <[email protected]>
Fixes: 3f68be7d8e96 ("drm/meson: Add support for HDMI encoder and DW-HDMI bridge + PHY")
Cc: Daniel Vetter <[email protected]>
Cc: Neil Armstrong <[email protected]>
Cc: Carlo Caione <[email protected]>
Cc: Kevin Hilman <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: <[email protected]> # v4.12+
Acked-by: Neil Armstrong <[email protected]>
Signed-off-by: Neil Armstrong <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Sean Paul <[email protected]>
|
|
Since Linux 4.17, calls to drm_crtc_vblank_on/off are mandatory, and we get
a warning when ctrc is disabled :
" driver forgot to call drm_crtc_vblank_off()"
But, the vsync IRQ was not totally disabled due the transient hardware
state and specific interrupt line, thus adding proper IRQ masking from
the HHI system control registers.
The last change fixes a race condition introduced by calling the added
drm_crtc_vblank_on/off when an HPD event occurs from the HDMI connector,
triggering a WARN_ON() in the _atomic_begin() callback when the CRTC
is disabled, thus also triggering a WARN_ON() in drm_vblank_put() :
WARNING: CPU: 0 PID: 1185 at drivers/gpu/drm/meson/meson_crtc.c:157 meson_crtc_atomic_begin+0x78/0x80
[...]
Call trace:
meson_crtc_atomic_begin+0x78/0x80
drm_atomic_helper_commit_planes+0x140/0x218
drm_atomic_helper_commit_tail+0x38/0x80
commit_tail+0x7c/0x80
drm_atomic_helper_commit+0xdc/0x150
drm_atomic_commit+0x54/0x60
restore_fbdev_mode_atomic+0x198/0x238
restore_fbdev_mode+0x6c/0x1c0
drm_fb_helper_restore_fbdev_mode_unlocked+0x7c/0xf0
drm_fb_helper_set_par+0x34/0x60
drm_fb_helper_hotplug_event.part.28+0xb8/0xc8
drm_fbdev_client_hotplug+0xa4/0xe0
drm_client_dev_hotplug+0x90/0xe0
drm_kms_helper_hotplug_event+0x3c/0x48
drm_helper_hpd_irq_event+0x134/0x168
dw_hdmi_top_thread_irq+0x3c/0x50
[...]
WARNING: CPU: 0 PID: 1185 at drivers/gpu/drm/drm_vblank.c:1026 drm_vblank_put+0xb4/0xc8
[...]
Call trace:
drm_vblank_put+0xb4/0xc8
drm_crtc_vblank_put+0x24/0x30
drm_atomic_helper_wait_for_vblanks.part.9+0x130/0x2b8
drm_atomic_helper_commit_tail+0x68/0x80
[...]
The issue is that vblank need to be enabled in any occurrence of :
- atomic_enable()
- atomic_begin() and state->enable == true, which was not the case
Moving the CRTC enable code to a common function and calling in one of
these occurrence solves this race condition and makes sure vblank is
enabled in each call to _atomic_begin() from the HPD event leading to
drm_atomic_helper_commit_planes().
To Summarize :
- Make sure that the CRTC code will call the drm_crtc_vblank_on()/off()
- *Really* mask the Vsync IRQ
- Initialize and enable vblank at the first
atomic_begin()/_atomic_enable()
Cc: [email protected] # 4.17+
Signed-off-by: Neil Armstrong <[email protected]>
Reviewed-by: Lyude Paul <[email protected]>
[fixed typos+added cc for stable]
Signed-off-by: Lyude Paul <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Sean Paul <[email protected]>
|
|
When drm_new_set_master() fails, set is_master to 0, to prevent a
possible NULL pointer deref.
Here is a problematic flow: we check is_master in drm_is_current_master(),
then proceed to call drm_lease_owner() passing master. If we do not restore
is_master status when drm_new_set_master() fails, we may have a situation
in which is_master will be 1 and master itself, NULL, leading to the deref
of a NULL pointer in drm_lease_owner().
This fixes the following OOPS, observed on an ArchLinux running a 4.19.2
kernel:
[ 97.804282] BUG: unable to handle kernel NULL pointer dereference at 0000000000000080
[ 97.807224] PGD 0 P4D 0
[ 97.807224] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 97.807224] CPU: 0 PID: 1348 Comm: xfwm4 Tainted: P OE 4.19.2-arch1-1-ARCH #1
[ 97.807224] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./AB350 Pro4, BIOS P5.10 10/16/2018
[ 97.807224] RIP: 0010:drm_lease_owner+0xd/0x20 [drm]
[ 97.807224] Code: 83 c4 18 5b 5d c3 b8 ea ff ff ff eb e2 b8 ed ff ff ff eb db e8 b4 ca 68 fb 0f 1f 40 00 0f 1f 44 00 00 48 89 f8 eb 03 48 89 d0 <48> 8b 90 80 00 00 00 48 85 d2 75 f1 c3 66 0f 1f 44 00 00 0f 1f 44
[ 97.807224] RSP: 0018:ffffb8cf08e07bb0 EFLAGS: 00010202
[ 97.807224] RAX: 0000000000000000 RBX: ffff9cf0f2586c00 RCX: ffff9cf0f2586c88
[ 97.807224] RDX: ffff9cf0ddbd8000 RSI: 0000000000000000 RDI: 0000000000000000
[ 97.807224] RBP: ffff9cf1040e9800 R08: 0000000000000000 R09: 0000000000000000
[ 97.807224] R10: ffffdeb30fd5d680 R11: ffffdeb30f5d6808 R12: ffff9cf1040e9888
[ 97.807224] R13: 0000000000000000 R14: dead000000000200 R15: ffff9cf0f2586cc8
[ 97.807224] FS: 00007f4145513180(0000) GS:ffff9cf10ea00000(0000) knlGS:0000000000000000
[ 97.807224] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 97.807224] CR2: 0000000000000080 CR3: 00000003d7548000 CR4: 00000000003406f0
[ 97.807224] Call Trace:
[ 97.807224] drm_is_current_master+0x1a/0x30 [drm]
[ 97.807224] drm_master_release+0x3e/0x130 [drm]
[ 97.807224] drm_file_free.part.0+0x2be/0x2d0 [drm]
[ 97.807224] drm_open+0x1ba/0x1e0 [drm]
[ 97.807224] drm_stub_open+0xaf/0xe0 [drm]
[ 97.807224] chrdev_open+0xa3/0x1b0
[ 97.807224] ? cdev_put.part.0+0x20/0x20
[ 97.807224] do_dentry_open+0x132/0x340
[ 97.807224] path_openat+0x2d1/0x14e0
[ 97.807224] ? mem_cgroup_commit_charge+0x7a/0x520
[ 97.807224] do_filp_open+0x93/0x100
[ 97.807224] ? __check_object_size+0x102/0x189
[ 97.807224] ? _raw_spin_unlock+0x16/0x30
[ 97.807224] do_sys_open+0x186/0x210
[ 97.807224] do_syscall_64+0x5b/0x170
[ 97.807224] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 97.807224] RIP: 0033:0x7f4147b07976
[ 97.807224] Code: 89 54 24 08 e8 7b f4 ff ff 8b 74 24 0c 48 8b 3c 24 41 89 c0 44 8b 54 24 08 b8 01 01 00 00 89 f2 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 77 30 44 89 c7 89 44 24 08 e8 a6 f4 ff ff 8b 44
[ 97.807224] RSP: 002b:00007ffcced96ca0 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
[ 97.807224] RAX: ffffffffffffffda RBX: 00005619d5037f80 RCX: 00007f4147b07976
[ 97.807224] RDX: 0000000000000002 RSI: 00005619d46b969c RDI: 00000000ffffff9c
[ 98.040039] RBP: 0000000000000024 R08: 0000000000000000 R09: 0000000000000000
[ 98.040039] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000024
[ 98.040039] R13: 0000000000000012 R14: 00005619d5035950 R15: 0000000000000012
[ 98.040039] Modules linked in: nct6775 hwmon_vid algif_skcipher af_alg nls_iso8859_1 nls_cp437 vfat fat uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common arc4 videodev media snd_usb_audio snd_hda_codec_hdmi snd_usbmidi_lib snd_rawmidi snd_seq_device mousedev input_leds iwlmvm mac80211 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec edac_mce_amd kvm_amd snd_hda_core kvm iwlwifi snd_hwdep r8169 wmi_bmof cfg80211 snd_pcm irqbypass snd_timer snd libphy soundcore pinctrl_amd rfkill pcspkr sp5100_tco evdev gpio_amdpt k10temp mac_hid i2c_piix4 wmi pcc_cpufreq acpi_cpufreq vboxnetflt(OE) vboxnetadp(OE) vboxpci(OE) vboxdrv(OE) msr sg crypto_user ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 fscrypto uas usb_storage dm_crypt hid_generic usbhid hid
[ 98.040039] dm_mod raid1 md_mod sd_mod crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc ahci libahci aesni_intel aes_x86_64 libata crypto_simd cryptd glue_helper ccp xhci_pci rng_core scsi_mod xhci_hcd nvidia_drm(POE) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm agpgart nvidia_uvm(POE) nvidia_modeset(POE) nvidia(POE) ipmi_devintf ipmi_msghandler
[ 98.040039] CR2: 0000000000000080
[ 98.040039] ---[ end trace 3b65093b6fe62b2f ]---
[ 98.040039] RIP: 0010:drm_lease_owner+0xd/0x20 [drm]
[ 98.040039] Code: 83 c4 18 5b 5d c3 b8 ea ff ff ff eb e2 b8 ed ff ff ff eb db e8 b4 ca 68 fb 0f 1f 40 00 0f 1f 44 00 00 48 89 f8 eb 03 48 89 d0 <48> 8b 90 80 00 00 00 48 85 d2 75 f1 c3 66 0f 1f 44 00 00 0f 1f 44
[ 98.040039] RSP: 0018:ffffb8cf08e07bb0 EFLAGS: 00010202
[ 98.040039] RAX: 0000000000000000 RBX: ffff9cf0f2586c00 RCX: ffff9cf0f2586c88
[ 98.040039] RDX: ffff9cf0ddbd8000 RSI: 0000000000000000 RDI: 0000000000000000
[ 98.040039] RBP: ffff9cf1040e9800 R08: 0000000000000000 R09: 0000000000000000
[ 98.040039] R10: ffffdeb30fd5d680 R11: ffffdeb30f5d6808 R12: ffff9cf1040e9888
[ 98.040039] R13: 0000000000000000 R14: dead000000000200 R15: ffff9cf0f2586cc8
[ 98.040039] FS: 00007f4145513180(0000) GS:ffff9cf10ea00000(0000) knlGS:0000000000000000
[ 98.040039] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 98.040039] CR2: 0000000000000080 CR3: 00000003d7548000 CR4: 00000000003406f0
Signed-off-by: Sergio Correia <[email protected]>
Cc: [email protected]
Signed-off-by: Daniel Vetter <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Sean Paul <[email protected]>
|
|
Jerry Zuo pointed out a rather obscure hotplugging issue that it seems I
accidentally introduced into DRM two years ago.
Pretend we have a topology like this:
|- DP-1: mst_primary
|- DP-4: active display
|- DP-5: disconnected
|- DP-6: active hub
|- DP-7: active display
|- DP-8: disconnected
|- DP-9: disconnected
If we unplug DP-6, the topology starting at DP-7 will be destroyed but
it's payloads will live on in DP-1's VCPI allocations and thus require
removal. However, this removal currently fails because
drm_dp_update_payload_part1() will (rightly so) try to validate the port
before accessing it, fail then abort. If we keep going, eventually we
run the MST hub out of bandwidth and all new allocations will start to
fail (or in my case; all new displays just start flickering a ton).
We could just teach drm_dp_update_payload_part1() not to drop the port
ref in this case, but then we also need to teach
drm_dp_destroy_payload_step1() to do the same thing, then hope no one
ever adds anything to the that requires a validated port reference in
drm_dp_destroy_connector_work(). Kind of sketchy.
So let's go with a more clever solution: any port that
drm_dp_destroy_connector_work() interacts with is guaranteed to still
exist in memory until we say so. While said port might not be valid we
don't really care: that's the whole reason we're destroying it in the
first place! So, teach drm_dp_get_validated_port_ref() to use the all
mighty current_work() function to avoid attempting to validate ports
from the context of mgr->destroy_connector_work. I can't see any
situation where this wouldn't be safe, and this avoids having to play
whack-a-mole in the future of trying to work around port validation.
Signed-off-by: Lyude Paul <[email protected]>
Fixes: 263efde31f97 ("drm/dp/mst: Get validated port ref in drm_dp_update_payload_part1()")
Reported-by: Jerry Zuo <[email protected]>
Cc: Jerry Zuo <[email protected]>
Cc: Harry Wentland <[email protected]>
Cc: <[email protected]> # v4.6+
Reviewed-by: Dave Airlie <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Sean Paul <[email protected]>
|
|
During NVM upgrade process the host router is hot-removed for a short
while. During this time it is possible that the root port is moved into
D3cold which would be fine if the root port could trigger PME on itself.
However, many systems actually do not implement it so what happens is
that the root port goes into D3cold and never wakes up unless userspace
does PCI config space access, such as running 'lscpi'.
For this reason we explicitly prevent the root port from runtime
suspending during NVM upgrade.
Signed-off-by: Mika Westerberg <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This is a longstanding issue: if the vmbus upper-layer drivers try to
consume too many GPADLs, the host may return with an error
0xC0000044 (STATUS_QUOTA_EXCEEDED), but currently we forget to check
the creation_status, and hence we can pass an invalid GPADL handle
into the OPEN_CHANNEL message, and get an error code 0xc0000225 in
open_info->response.open_result.status, and finally we hang in
vmbus_open() -> "goto error_free_info" -> vmbus_teardown_gpadl().
With this patch, we can exit gracefully on STATUS_QUOTA_EXCEEDED.
Cc: Stephen Hemminger <[email protected]>
Cc: K. Y. Srinivasan <[email protected]>
Cc: Haiyang Zhang <[email protected]>
Cc: [email protected]
Signed-off-by: Dexuan Cui <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Malicious code can attempt to free buffers using the BC_FREE_BUFFER
ioctl to binder. There are protections against a user freeing a buffer
while in use by the kernel, however there was a window where
BC_FREE_BUFFER could be used to free a recently allocated buffer that
was not completely initialized. This resulted in a use-after-free
detected by KASAN with a malicious test program.
This window is closed by setting the buffer's allow_user_free attribute
to 0 when the buffer is allocated or when the user has previously freed
it instead of waiting for the caller to set it. The problem was that
when the struct buffer was recycled, allow_user_free was stale and set
to 1 allowing a free to go through.
Signed-off-by: Todd Kjos <[email protected]>
Acked-by: Arve Hjønnevåg <[email protected]>
Cc: stable <[email protected]> # 4.14
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
In case the nd_sd group is set to the sd-card function, Pins 45 + 46 are
configured as GPIOs. If they are blocked by the sd function, they can't
be used as GPIOs.
Reported-by: Kristian Evensen <[email protected]>
Signed-off-by: Mathias Kresin <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Fixes: f576fb6a0700 ("MIPS: ralink: cleanup the soc specific pinmux data")
Patchwork: https://patchwork.linux-mips.org/patch/21220/
Cc: John Crispin <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: James Hogan <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected] # v3.18+
|