Age | Commit message (Collapse) | Author | Files | Lines |
|
Add a simple helper that filesystems can use in their parameter parser
to parse the "source" parameter. A few places open-coded this function
and that already caused a bug in the cgroup v1 parser that we fixed.
Let's make it harder to get this wrong by introducing a helper which
performs all necessary checks.
Link: https://syzkaller.appspot.com/bug?id=6312526aba5beae046fdae8f00399f87aab48b12
Cc: Christoph Hellwig <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The following sequence can be used to trigger a UAF:
int fscontext_fd = fsopen("cgroup");
int fd_null = open("/dev/null, O_RDONLY);
int fsconfig(fscontext_fd, FSCONFIG_SET_FD, "source", fd_null);
close_range(3, ~0U, 0);
The cgroup v1 specific fs parser expects a string for the "source"
parameter. However, it is perfectly legitimate to e.g. specify a file
descriptor for the "source" parameter. The fs parser doesn't know what
a filesystem allows there. So it's a bug to assume that "source" is
always of type fs_value_is_string when it can reasonably also be
fs_value_is_file.
This assumption in the cgroup code causes a UAF because struct
fs_parameter uses a union for the actual value. Access to that union is
guarded by the param->type member. Since the cgroup paramter parser
didn't check param->type but unconditionally moved param->string into
fc->source a close on the fscontext_fd would trigger a UAF during
put_fs_context() which frees fc->source thereby freeing the file stashed
in param->file causing a UAF during a close of the fd_null.
Fix this by verifying that param->type is actually a string and report
an error if not.
In follow up patches I'll add a new generic helper that can be used here
and by other filesystems instead of this error-prone copy-pasta fix.
But fixing it in here first makes backporting a it to stable a lot
easier.
Fixes: 8d2451f4994f ("cgroup1: switch to option-by-option parsing")
Reported-by: [email protected]
Cc: Christoph Hellwig <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: <[email protected]>
Cc: syzkaller-bugs <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
the SVM
The AMD platform does not support the functions Ah CPUID leaf. The returned
results for this entry should all remain zero just like the native does:
AMD host:
0x0000000a 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000
(uncanny) AMD guest:
0x0000000a 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00008000
Fixes: cadbaa039b99 ("perf/x86/intel: Make anythread filter support conditional")
Signed-off-by: Like Xu <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
BUG: KASAN: use-after-free in kvm_vm_ioctl_unregister_coalesced_mmio+0x7c/0x1ec arch/arm64/kvm/../../../virt/kvm/coalesced_mmio.c:183
Read of size 8 at addr ffff0000c03a2500 by task syz-executor083/4269
CPU: 5 PID: 4269 Comm: syz-executor083 Not tainted 5.10.0 #7
Hardware name: linux,dummy-virt (DT)
Call trace:
dump_backtrace+0x0/0x2d0 arch/arm64/kernel/stacktrace.c:132
show_stack+0x28/0x34 arch/arm64/kernel/stacktrace.c:196
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x110/0x164 lib/dump_stack.c:118
print_address_description+0x78/0x5c8 mm/kasan/report.c:385
__kasan_report mm/kasan/report.c:545 [inline]
kasan_report+0x148/0x1e4 mm/kasan/report.c:562
check_memory_region_inline mm/kasan/generic.c:183 [inline]
__asan_load8+0xb4/0xbc mm/kasan/generic.c:252
kvm_vm_ioctl_unregister_coalesced_mmio+0x7c/0x1ec arch/arm64/kvm/../../../virt/kvm/coalesced_mmio.c:183
kvm_vm_ioctl+0xe30/0x14c4 arch/arm64/kvm/../../../virt/kvm/kvm_main.c:3755
vfs_ioctl fs/ioctl.c:48 [inline]
__do_sys_ioctl fs/ioctl.c:753 [inline]
__se_sys_ioctl fs/ioctl.c:739 [inline]
__arm64_sys_ioctl+0xf88/0x131c fs/ioctl.c:739
__invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
el0_svc_common arch/arm64/kernel/syscall.c:158 [inline]
do_el0_svc+0x120/0x290 arch/arm64/kernel/syscall.c:220
el0_svc+0x1c/0x28 arch/arm64/kernel/entry-common.c:367
el0_sync_handler+0x98/0x170 arch/arm64/kernel/entry-common.c:383
el0_sync+0x140/0x180 arch/arm64/kernel/entry.S:670
Allocated by task 4269:
stack_trace_save+0x80/0xb8 kernel/stacktrace.c:121
kasan_save_stack mm/kasan/common.c:48 [inline]
kasan_set_track mm/kasan/common.c:56 [inline]
__kasan_kmalloc+0xdc/0x120 mm/kasan/common.c:461
kasan_kmalloc+0xc/0x14 mm/kasan/common.c:475
kmem_cache_alloc_trace include/linux/slab.h:450 [inline]
kmalloc include/linux/slab.h:552 [inline]
kzalloc include/linux/slab.h:664 [inline]
kvm_vm_ioctl_register_coalesced_mmio+0x78/0x1cc arch/arm64/kvm/../../../virt/kvm/coalesced_mmio.c:146
kvm_vm_ioctl+0x7e8/0x14c4 arch/arm64/kvm/../../../virt/kvm/kvm_main.c:3746
vfs_ioctl fs/ioctl.c:48 [inline]
__do_sys_ioctl fs/ioctl.c:753 [inline]
__se_sys_ioctl fs/ioctl.c:739 [inline]
__arm64_sys_ioctl+0xf88/0x131c fs/ioctl.c:739
__invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
el0_svc_common arch/arm64/kernel/syscall.c:158 [inline]
do_el0_svc+0x120/0x290 arch/arm64/kernel/syscall.c:220
el0_svc+0x1c/0x28 arch/arm64/kernel/entry-common.c:367
el0_sync_handler+0x98/0x170 arch/arm64/kernel/entry-common.c:383
el0_sync+0x140/0x180 arch/arm64/kernel/entry.S:670
Freed by task 4269:
stack_trace_save+0x80/0xb8 kernel/stacktrace.c:121
kasan_save_stack mm/kasan/common.c:48 [inline]
kasan_set_track+0x38/0x6c mm/kasan/common.c:56
kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:355
__kasan_slab_free+0x124/0x150 mm/kasan/common.c:422
kasan_slab_free+0x10/0x1c mm/kasan/common.c:431
slab_free_hook mm/slub.c:1544 [inline]
slab_free_freelist_hook mm/slub.c:1577 [inline]
slab_free mm/slub.c:3142 [inline]
kfree+0x104/0x38c mm/slub.c:4124
coalesced_mmio_destructor+0x94/0xa4 arch/arm64/kvm/../../../virt/kvm/coalesced_mmio.c:102
kvm_iodevice_destructor include/kvm/iodev.h:61 [inline]
kvm_io_bus_unregister_dev+0x248/0x280 arch/arm64/kvm/../../../virt/kvm/kvm_main.c:4374
kvm_vm_ioctl_unregister_coalesced_mmio+0x158/0x1ec arch/arm64/kvm/../../../virt/kvm/coalesced_mmio.c:186
kvm_vm_ioctl+0xe30/0x14c4 arch/arm64/kvm/../../../virt/kvm/kvm_main.c:3755
vfs_ioctl fs/ioctl.c:48 [inline]
__do_sys_ioctl fs/ioctl.c:753 [inline]
__se_sys_ioctl fs/ioctl.c:739 [inline]
__arm64_sys_ioctl+0xf88/0x131c fs/ioctl.c:739
__invoke_syscall arch/arm64/kernel/syscall.c:36 [inline]
invoke_syscall arch/arm64/kernel/syscall.c:48 [inline]
el0_svc_common arch/arm64/kernel/syscall.c:158 [inline]
do_el0_svc+0x120/0x290 arch/arm64/kernel/syscall.c:220
el0_svc+0x1c/0x28 arch/arm64/kernel/entry-common.c:367
el0_sync_handler+0x98/0x170 arch/arm64/kernel/entry-common.c:383
el0_sync+0x140/0x180 arch/arm64/kernel/entry.S:670
If kvm_io_bus_unregister_dev() return -ENOMEM, we already call kvm_iodevice_destructor()
inside this function to delete 'struct kvm_coalesced_mmio_dev *dev' from list
and free the dev, but kvm_iodevice_destructor() is called again, it will lead
the above issue.
Let's check the the return value of kvm_io_bus_unregister_dev(), only call
kvm_iodevice_destructor() if the return value is 0.
Cc: Paolo Bonzini <[email protected]>
Cc: [email protected]
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Kefeng Wang <[email protected]>
Message-Id: <[email protected]>
Cc: [email protected]
Fixes: 5d3c4c79384a ("KVM: Stop looking for coalesced MMIO zones if the bus is destroyed", 2021-04-20)
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Don't clear the C-bit in the #NPF handler, as it is a legal GPA bit for
non-SEV guests, and for SEV guests the C-bit is dropped before the GPA
hits the NPT in hardware. Clearing the bit for non-SEV guests causes KVM
to mishandle #NPFs with that collide with the host's C-bit.
Although the APM doesn't explicitly state that the C-bit is not reserved
for non-SEV, Tom Lendacky confirmed that the following snippet about the
effective reduction due to the C-bit does indeed apply only to SEV guests.
Note that because guest physical addresses are always translated
through the nested page tables, the size of the guest physical address
space is not impacted by any physical address space reduction indicated
in CPUID 8000_001F[EBX]. If the C-bit is a physical address bit however,
the guest physical address space is effectively reduced by 1 bit.
And for SEV guests, the APM clearly states that the bit is dropped before
walking the nested page tables.
If the C-bit is an address bit, this bit is masked from the guest
physical address when it is translated through the nested page tables.
Consequently, the hypervisor does not need to be aware of which pages
the guest has chosen to mark private.
Note, the bogus C-bit clearing was removed from legacy #PF handler in
commit 6d1b867d0456 ("KVM: SVM: Don't strip the C-bit from CR2 on #PF
interception").
Fixes: 0ede79e13224 ("KVM: SVM: Clear C-bit from the page fault address")
Cc: Peter Gonda <[email protected]>
Cc: Brijesh Singh <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: [email protected]
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Ignore "dynamic" host adjustments to the physical address mask when
generating the masks for guest PTEs, i.e. the guest PA masks. The host
physical address space and guest physical address space are two different
beasts, e.g. even though SEV's C-bit is the same bit location for both
host and guest, disabling SME in the host (which clears shadow_me_mask)
does not affect the guest PTE->GPA "translation".
For non-SEV guests, not dropping bits is the correct behavior. Assuming
KVM and userspace correctly enumerate/configure guest MAXPHYADDR, bits
that are lost as collateral damage from memory encryption are treated as
reserved bits, i.e. KVM will never get to the point where it attempts to
generate a gfn using the affected bits. And if userspace wants to create
a bogus vCPU, then userspace gets to deal with the fallout of hardware
doing odd things with bad GPAs.
For SEV guests, not dropping the C-bit is technically wrong, but it's a
moot point because KVM can't read SEV guest's page tables in any case
since they're always encrypted. Not to mention that the current KVM code
is also broken since sme_me_mask does not have to be non-zero for SEV to
be supported by KVM. The proper fix would be to teach all of KVM to
correctly handle guest private memory, but that's a task for the future.
Fixes: d0ec49d4de90 ("kvm/x86/svm: Support Secure Memory Encryption within KVM")
Cc: [email protected]
Cc: Brijesh Singh <[email protected]>
Cc: Tom Lendacky <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
[Use a new header instead of adding header guards to paging_tmpl.h. - Paolo]
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Use boot_cpu_data.x86_phys_bits instead of the raw CPUID information to
enumerate the MAXPHYADDR for KVM guests when TDP is disabled (the guest
version is only relevant to NPT/TDP).
When using shadow paging, any reductions to the host's MAXPHYADDR apply
to KVM and its guests as well, i.e. using the raw CPUID info will cause
KVM to misreport the number of PA bits available to the guest.
Unconditionally zero out the "Physical Address bit reduction" entry.
For !TDP, the adjustment is already done, and for TDP enumerating the
host's reduction is wrong as the reduction does not apply to GPAs.
Fixes: 9af9b94068fb ("x86/cpu/AMD: Handle SME reduction in physical address size")
Cc: [email protected]
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Ignore the guest MAXPHYADDR reported by CPUID.0x8000_0008 if TDP, i.e.
NPT, is disabled, and instead use the host's MAXPHYADDR. Per AMD'S APM:
Maximum guest physical address size in bits. This number applies only
to guests using nested paging. When this field is zero, refer to the
PhysAddrSize field for the maximum guest physical address size.
Fixes: 24c82e576b78 ("KVM: Sanitize cpuid")
Cc: [email protected]
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
enabled"
Let KVM load if EFER.NX=0 even if NX is supported, the analysis and
testing (or lack thereof) for the non-PAE host case was garbage.
If the kernel won't be using PAE paging, .Ldefault_entry in head_32.S
skips over the entire EFER sequence. Hopefully that can be changed in
the future to allow KVM to require EFER.NX, but the motivation behind
KVM's requirement isn't yet merged. Reverting and revisiting the mess
at a later date is by far the safest approach.
This reverts commit 8bbed95d2cb6e5de8a342d761a89b0a04faed7be.
Fixes: 8bbed95d2cb6 ("KVM: x86: WARN and reject loading KVM if NX is supported but not enabled")
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Commit b78f4a59669 ("KVM: selftests: Rename vm_handle_exception")
raced with a couple of new x86 tests, missing two vm_handle_exception
to vm_install_exception_handler conversions.
Help the two broken tests to catch up with the new world.
Cc: Andrew Jones <[email protected]>
CC: Ricardo Koller <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Andrew Jones <[email protected]>
Reviewed-by: Ricardo Koller <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
KVM: selftests: Fixes
- provide memory model for IBM z196 and zEC12
- do not require 64GB of memory
|
|
With the recent fixes for fallthrough warnings, it is now possible to
enable -Wimplicit-fallthrough for Clang.
It's important to mention that since we have adopted the use of the
pseudo-keyword macro fallthrough; we also want to avoid having more
/* fall through */ comments being introduced. Notice that contrary
to GCC, Clang doesn't recognize any comments as implicit fall-through
markings when the -Wimplicit-fallthrough option is enabled. So, in
order to avoid having more comments being introduced, we have to use
the option -Wimplicit-fallthrough=5 for GCC, which similar to Clang,
will cause a warning in case a code comment is intended to be used
as a fall-through marking.
Co-developed-by: Kees Cook <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: Gustavo A. R. Silva <[email protected]>
|
|
Fix the following fallthrough warning:
arch/powerpc/platforms/powermac/smp.c:149:3: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/lkml/60ef0750.I8J+C6KAtb0xVOAa%[email protected]/
Signed-off-by: Gustavo A. R. Silva <[email protected]>
|
|
sysconf(__SC_THREAD_STACK_MIN_VALUE)
In fedora rawhide the PTHREAD_STACK_MIN define may end up expanded to a
sysconf() call, and that will return 'long int', breaking the build:
45 fedora:rawhide : FAIL gcc version 11.1.1 20210623 (Red Hat 11.1.1-6) (GCC)
builtin-sched.c: In function 'create_tasks':
/git/perf-5.14.0-rc1/tools/include/linux/kernel.h:43:24: error: comparison of distinct pointer types lacks a cast [-Werror]
43 | (void) (&_max1 == &_max2); \
| ^~
builtin-sched.c:673:34: note: in expansion of macro 'max'
673 | (size_t) max(16 * 1024, PTHREAD_STACK_MIN));
| ^~~
cc1: all warnings being treated as errors
$ grep __sysconf /usr/include/*/*.h
/usr/include/bits/pthread_stack_min-dynamic.h:extern long int __sysconf (int __name) __THROW;
/usr/include/bits/pthread_stack_min-dynamic.h:# define PTHREAD_STACK_MIN __sysconf (__SC_THREAD_STACK_MIN_VALUE)
/usr/include/bits/time.h:extern long int __sysconf (int);
/usr/include/bits/time.h:# define CLK_TCK ((__clock_t) __sysconf (2)) /* 2 is _SC_CLK_TCK */
$
So cast it to int to cope with that.
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Fix the following fallthrough warning (powerpc-randconfig):
drivers/dma/mpc512x_dma.c:816:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/lkml/60ef0750.I8J+C6KAtb0xVOAa%[email protected]/
Signed-off-by: Gustavo A. R. Silva <[email protected]>
|
|
Fix the following fallthrough warning (powerpc-randconfig):
drivers/usb/gadget/udc/fsl_qe_udc.c:589:4: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/lkml/60ef0750.I8J+C6KAtb0xVOAa%[email protected]/
Signed-off-by: Gustavo A. R. Silva <[email protected]>
|
|
When calling ttm_range_man_fini(), 'man' may be uninitialized, which may
cause a null pointer dereference bug.
Fix this by checking if it is a null pointer.
This log reveals it:
[ 7.902580 ] BUG: kernel NULL pointer dereference, address: 0000000000000058
[ 7.905721 ] RIP: 0010:ttm_range_man_fini+0x40/0x160
[ 7.911826 ] Call Trace:
[ 7.911826 ] radeon_ttm_fini+0x167/0x210
[ 7.911826 ] radeon_bo_fini+0x15/0x40
[ 7.913767 ] rs400_fini+0x55/0x80
[ 7.914358 ] radeon_device_fini+0x3c/0x140
[ 7.914358 ] radeon_driver_unload_kms+0x5c/0xe0
[ 7.914358 ] radeon_driver_load_kms+0x13a/0x200
[ 7.914358 ] ? radeon_driver_unload_kms+0xe0/0xe0
[ 7.914358 ] drm_dev_register+0x1db/0x290
[ 7.914358 ] radeon_pci_probe+0x16a/0x230
[ 7.914358 ] local_pci_probe+0x4a/0xb0
Signed-off-by: Zheyu Ma <[email protected]>
Reviewed-by: Christian König <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Christian König <[email protected]>
|
|
Because the out of range assignment to bit fields
are compiler-dependant, the fields could have wrong
value.
Signed-off-by: Hyunchul Lee <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
mounts
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=213565
cruid should only be used for the initial mount and after this we should use the current
users credentials.
Ignore the original cruid mount argument when creating a new context for a multiuser mount
following a DFS link.
Fixes: 24e0a1eff9e2 ("cifs: switch to new mount api")
Cc: [email protected] # 5.11+
Reported-by: Xiaoli Feng <[email protected]>
Signed-off-by: Ronnie Sahlberg <[email protected]>
Reviewed-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
We recently fixed DNS resolution of the server hostname during reconnect.
However, server IP address may change, even when the old one continues
to server (although sub-optimally).
We should schedule the next DNS resolution based on the TTL of
the DNS record used for the last resolution. This way, we resolve the
server hostname again when a DNS record expires.
Signed-off-by: Shyam Prasad N <[email protected]>
Reviewed-by: Paulo Alcantara (SUSE) <[email protected]>
Cc: <[email protected]> # v5.11+
Signed-off-by: Steve French <[email protected]>
|
|
Fix build error with LIBPFM4=1:
CC util/pfm.o
util/pfm.c: In function ‘parse_libpfm_events_option’:
util/pfm.c:102:30: error: ‘struct evsel’ has no member named ‘leader’
102 | evsel->leader = grp_leader;
| ^~
Committer notes:
There is this entry in 'make -C tools/perf build-test' to test the build
with libpfm:
$ grep libpfm tools/perf/tests/make
make_with_libpfm4 := LIBPFM4=1
run += make_with_libpfm4
$
But the test machine lacked libpfm-devel, now its installed and further
cases like this shouldn't happen.
Committer testing:
Before this patch this fails, after applying it:
$ make -C tools/perf build-test
make: Entering directory '/var/home/acme/git/perf/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_static: make LDFLAGS=-static NO_PERF_READ_VDSO32=1 NO_PERF_READ_VDSOX32=1 NO_JVMTI=1 -j24 DESTDIR=/tmp/tmp.KzFSfvGRQa
<SNIP>
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_with_libpfm4_O: make LIBPFM4=1
make_install_prefix_O: make install prefix=/tmp/krava
make_no_auxtrace_O: make NO_AUXTRACE=1
<SNIP>
$ rpm -q libpfm-devel
libpfm-devel-4.11.0-4.fc34.x86_64
$
FIXME:
This shows a need for 'build-test' to bail out when a build option is
specified that has no required library devel files installed.
Fixes: fba7c86601e2e42d ("libperf: Move 'leader' from tools/perf to perf_evsel::leader")
Signed-off-by: Heiko Carstens <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To pick the changes in this cset:
7bb7f2ac24a028b2 ("arch, mm: wire up memfd_secret system call where relevant")
That silences these perf build warnings and add support for those new
syscalls in tools such as 'perf trace'.
For instance, this is now possible:
# perf trace -v -e memfd_secret
event qualifier tracepoint filter: (common_pid != 13375 && common_pid != 3713) && (id == 447)
^C#
That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.
$ grep memfd_secret tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
447 common memfd_secret sys_memfd_secret
$
This addresses these perf build warnings:
Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/unistd.h' differs from latest version at 'arch/arm64/include/uapi/asm/unistd.h'
diff -u tools/arch/arm64/include/uapi/asm/unistd.h arch/arm64/include/uapi/asm/unistd.h
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Cc: Linus Torvalds <[email protected]>
Cc: Mike Rapoport <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
On a hybrid platform, by default 'perf stat' aggregates and reports the
event counts per PMU. For example,
# perf stat -e cycles -a true
Performance counter stats for 'system wide':
1,400,445 cpu_core/cycles/
680,881 cpu_atom/cycles/
0.001770773 seconds time elapsed
But for uncore events that's not a suitable method. Uncore has nothing
to do with hybrid. So for uncore events, we aggregate event counts from
all PMUs and report the counts without PMUs.
Before:
# perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ -a true
Performance counter stats for 'system wide':
2,058 uncore_arb_0/event=0x81,umask=0x1/
2,028 uncore_arb_1/event=0x81,umask=0x1/
0 uncore_arb_0/event=0x84,umask=0x1/
0 uncore_arb_1/event=0x84,umask=0x1/
0.000614498 seconds time elapsed
After:
# perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ -a true
Performance counter stats for 'system wide':
3,996 arb/event=0x81,umask=0x1/
0 arb/event=0x84,umask=0x1/
0.000630046 seconds time elapsed
Of course, we also keep the '--no-merge' working for uncore events.
# perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ --no-merge true
Performance counter stats for 'system wide':
1,952 uncore_arb_0/event=0x81,umask=0x1/
1,921 uncore_arb_1/event=0x81,umask=0x1/
0 uncore_arb_0/event=0x84,umask=0x1/
0 uncore_arb_1/event=0x84,umask=0x1/
0.000575536 seconds time elapsed
Signed-off-by: Jin Yao <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
If the atom CPUs are offlined, the 'cpu_atom' is not valid.
We don't need the test case for 'cpu_atom'.
Signed-off-by: Jin Yao <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
If the atom CPUs are offlined, the 'cpu_atom' is not valid.
Perf will not create two events for one hw event, so the
evsel->idx doesn't need to be divided by 2 before comparing.
Signed-off-by: Jin Yao <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
If the atom CPUs are offlined, the 'cpu_atom' is not valid.
We don't need the test case for 'cpu_atom'.
Signed-off-by: Jin Yao <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
On hybrid platform, such as Alderlake, if atom CPUs are offlined,
the kernel still exports the sysfs path '/sys/devices/cpu_atom/' for
'cpu_atom' pmu but the file '/sys/devices/cpu_atom/cpus' is empty,
which indicates this is an invalid pmu.
Need to check and skip the invalid hybrid pmu.
Before:
# perf list
...
branch-instructions OR cpu_atom/branch-instructions/ [Kernel PMU event]
branch-instructions OR cpu_core/branch-instructions/ [Kernel PMU event]
branch-misses OR cpu_atom/branch-misses/ [Kernel PMU event]
branch-misses OR cpu_core/branch-misses/ [Kernel PMU event]
bus-cycles OR cpu_atom/bus-cycles/ [Kernel PMU event]
bus-cycles OR cpu_core/bus-cycles/ [Kernel PMU event]
...
The cpu_atom events are still displayed even if atom CPUs are offlined.
After:
# perf list
...
branch-instructions OR cpu_core/branch-instructions/ [Kernel PMU event]
branch-misses OR cpu_core/branch-misses/ [Kernel PMU event]
bus-cycles OR cpu_core/bus-cycles/ [Kernel PMU event]
...
Now only cpu_core events are displayed.
Signed-off-by: Jin Yao <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We skip filling out the pt with scratch entries if the va range covers
the entire pt, since we later have to fill it with the PTEs for the
object pages anyway. However this might leave open a small window where
the PTEs don't point to anything valid for the HW to consume.
When for example using 2M GTT pages this fill_px() showed up as being
quite significant in perf measurements, and ends up being completely
wasted since we ignore the pt and just use the pde directly.
Anyway, currently we have our PTE construction split between alloc and
insert, which is probably slightly iffy nowadays, since the alloc
doesn't actually allocate anything anymore, instead it just sets up the
page directories and points the PTEs at the scratch page. Later when we
do the insert step we re-program the PTEs again. Better might be to
squash the alloc and insert into a single step, then bringing back this
optimisation(along with some others) should be possible.
Fixes: 14826673247e ("drm/i915: Only initialize partially filled pagetables")
Signed-off-by: Matthew Auld <[email protected]>
Cc: Jon Bloomfield <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: <[email protected]> # v4.15+
Reviewed-by: Daniel Vetter <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 8f88ca76b3942d82e2c1cea8735ec368d89ecc15)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Restore bits 39 to 32 at correct position.
It reverses the operation done in rk_dma_addr_dte_v2().
Fixes: c55356c534aa ("iommu: rockchip: Add support for iommu v2")
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Benjamin Gaignard <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Joerg Roedel <[email protected]>
|
|
The commit 2b0140c69637e ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
fixes an issue of "sub-device is removed where the context entry is cleared
for all aliases". But this commit didn't consider the PASID entry and PASID
table in VT-d scalable mode. This fix increases the coverage of scalable
mode.
Suggested-by: Sanjay Kumar <[email protected]>
Fixes: 8038bdb855331 ("iommu/vt-d: Only clear real DMA device's context entries")
Fixes: 2b0140c69637e ("iommu/vt-d: Use pci_real_dma_dev() for mapping")
Cc: [email protected] # v5.6+
Cc: Jon Derrick <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Joerg Roedel <[email protected]>
|
|
This fixes a bug in context cache clear operation. The code was not
following the correct invalidation flow. A global device TLB invalidation
should be added after the IOTLB invalidation. At the same time, it
uses the domain ID from the context entry. But in scalable mode, the
domain ID is in PASID table entry, not context entry.
Fixes: 7373a8cc38197 ("iommu/vt-d: Setup context and enable RID2PASID support")
Cc: [email protected] # v5.0+
Signed-off-by: Sanjay Kumar <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Joerg Roedel <[email protected]>
|
|
QCOM IOMMU driver calls bus_set_iommu() for every IOMMU device controller,
what fails for the second and latter IOMMU devices. This is intended and
must be not fatal to the driver registration process. Also the cleanup
path should take care of the runtime PM state, what is missing in the
current patch. Revert relevant changes to the QCOM IOMMU driver until
a proper fix is prepared.
This partially reverts commit 249c9dc6aa0db74a0f7908efd04acf774e19b155.
Fixes: 249c9dc6aa0d ("iommu/arm: Cleanup resources in case of probe error path")
Suggested-by: Will Deacon <[email protected]>
Signed-off-by: Marek Szyprowski <[email protected]>
Acked-by: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Joerg Roedel <[email protected]>
|
|
Fix the following fallthrough warnings (powernv_defconfig and powerpc64):
drivers/char/powernv-op-panel.c:78:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
Reported-by: Nathan Chancellor <[email protected]>
Signed-off-by: Gustavo A. R. Silva <[email protected]>
|
|
dsa_switch_bridge_leave()
This was not caught because there is no switch driver which implements
the .port_bridge_join but not .port_bridge_leave method, but it should
nonetheless be fixed, as in certain conditions (driver development) it
might lead to NULL pointer dereference.
Fixes: f66a6a69f97a ("net: dsa: permit cross-chip bridging between all trees in the system")
Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Fix the following warning (mips-randconfig):
arch/mips/include/asm/fpu.h:79:3: warning: fallthrough annotation in unreachable code [-Wimplicit-fallthrough]
Originally, the /* fallthrough */ comment was introduced by commit:
597ce1723e0f ("MIPS: Support for 64-bit FP with O32 binaries")
and it was wrongly replaced with fallthrough; by commit:
c9b029903466 ("MIPS: Use fallthrough for arch/mips")
As the original comment is actually useful, fix this issue by
removing unreachable fallthrough; statement and place the original
/* fallthrough */ comment back.
Fixes: c9b029903466 ("MIPS: Use fallthrough for arch/mips")
Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/lkml/60edca25.k00ut905IFBjPyt5%[email protected]/
Signed-off-by: Gustavo A. R. Silva <[email protected]>
|
|
Fix the following fallthrough warnings:
arch/mips/mm/tlbex.c:1386:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
arch/mips/mm/tlbex.c:2173:3: error: unannotated fall-through between switch labels [-Werror,-Wimplicit-fallthrough]
Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/lkml/60edca25.k00ut905IFBjPyt5%[email protected]/
Signed-off-by: Gustavo A. R. Silva <[email protected]>
|
|
Fix the following fallthrough warning:
sound/soc/mediatek/mt8183/mt8183-dai-adda.c:342:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
Reported-by: Nathan Chancellor <[email protected]>
Signed-off-by: Gustavo A. R. Silva <[email protected]>
|
|
The conversion to ww mutexes failed to address the fence code which
already returns -EDEADLK when we run out of fences. Ww mutexes on
the other hand treat -EDEADLK as an internal errno value indicating
a need to restart the operation due to a deadlock. So now when the
fence code returns -EDEADLK the higher level code erroneously
restarts everything instead of returning the error to userspace
as is expected.
To remedy this let's switch the fence code to use a different errno
value for this. -ENOBUFS seems like a semi-reasonable unique choice.
Apart from igt the only user of this I could find is sna, and even
there all we do is dump the current fence registers from debugfs
into the X server log. So no user visible functionality is affected.
If we really cared about preserving this we could of course convert
back to -EDEADLK higher up, but doesn't seem like that's worth
the hassle here.
Not quite sure which commit specifically broke this, but I'll
just attribute it to the general gem ww mutex work.
Cc: [email protected]
Cc: Maarten Lankhorst <[email protected]>
Cc: Thomas Hellström <[email protected]>
Testcase: igt/gem_pread/exhaustion
Testcase: igt/gem_pwrite/basic-exhaustion
Testcase: igt/gem_fenced_exec_thrash/too-many-fences
Fixes: 80f0b679d6f0 ("drm/i915: Add an implementation for i915_gem_ww_ctx locking, v2.")
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Maarten Lankhorst <[email protected]>
(cherry picked from commit 78d2ad7eb4e1f0e9cd5d79788446b6092c21d3e0)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Fix the following fallthrough warnings:
drivers/power/supply/ab8500_fg.c:1730:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
drivers/power/supply/abx500_chargalg.c:1155:3: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
Reported-by: Nathan Chancellor <[email protected]>
Signed-off-by: Gustavo A. R. Silva <[email protected]>
|
|
Fix the following fallthrough warning:
drivers/dma/ti/k3-udma.c:4951:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
Reported-by: Nathan Chancellor <[email protected]>
Signed-off-by: Gustavo A. R. Silva <[email protected]>
|
|
Fix the following fallthrough warnings:
drivers/s390/net/ctcm_fsms.c:1457:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
drivers/s390/net/qeth_l3_main.c:437:3: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
drivers/s390/char/tape_char.c:374:4: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
arch/s390/kernel/uprobes.c:129:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
Reported-by: Nathan Chancellor <[email protected]>
Signed-off-by: Gustavo A. R. Silva <[email protected]>
|
|
Fix the following fallthrough warnings (arm64-randconfig):
drivers/dma/ipu/ipu_idmac.c:621:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
rivers/dma/ipu/ipu_idmac.c:981:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/lkml/60edca25.k00ut905IFBjPyt5%[email protected]/
Signed-off-by: Gustavo A. R. Silva <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hansg/linux
Pull vboxsf fixes from Hans de Goede:
"This adds support for the atomic_open directory-inode op to vboxsf.
Note this is not just an enhancement this also fixes an actual issue
which users are hitting, see the commit message of the "boxsf: Add
support for the atomic_open directory-inode" patch"
* tag 'vboxsf-v5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hansg/linux:
vboxsf: Add support for the atomic_open directory-inode op
vboxsf: Add vboxsf_[create|release]_sf_handle() helpers
vboxsf: Make vboxsf_dir_create() return the handle for the created file
vboxsf: Honor excl flag to the dir-inode create op
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs zoned mode fixes from David Sterba:
- fix deadlock when allocating system chunk
- fix wrong mutex unlock on an error path
- fix extent map splitting for append operation
- update and fix message reporting unusable chunk space
- don't block when background zone reclaim runs with balance in
parallel
* tag 'for-5.14-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: zoned: fix wrong mutex unlock on failure to allocate log root tree
btrfs: don't block if we can't acquire the reclaim lock
btrfs: properly split extent_map for REQ_OP_ZONE_APPEND
btrfs: rework chunk allocation to avoid exhaustion of the system chunk array
btrfs: fix deadlock with concurrent chunk allocations involving system chunks
btrfs: zoned: print unusable percentage when reclaiming block groups
btrfs: zoned: fix types for u64 division in btrfs_reclaim_bgs_work
|
|
Fix the following fallthrough warning (arm64-randconfig with Clang):
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:382:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/lkml/60edca25.k00ut905IFBjPyt5%[email protected]/
Signed-off-by: Gustavo A. R. Silva <[email protected]>
|
|
Fix the following fallthrough warning (mips-randconfig with Clang):
drivers/mmc/host/jz4740_mmc.c:792:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/lkml/60edca25.k00ut905IFBjPyt5%[email protected]/
Signed-off-by: Gustavo A. R. Silva <[email protected]>
|
|
Fix the following fallthrough warning (arm64-randconfig with Clang):
drivers/pci/proc.c:234:3: warning: fallthrough annotation in unreachable code [-Wimplicit-fallthrough]
Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/lkml/60edca25.k00ut905IFBjPyt5%[email protected]/
Signed-off-by: Gustavo A. R. Silva <[email protected]>
|
|
Fix the following fallthrough warning (arm64-randconfig with Clang):
drivers/scsi/libsas/sas_discover.c:467:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/lkml/60edca25.k00ut905IFBjPyt5%[email protected]/
Signed-off-by: Gustavo A. R. Silva <[email protected]>
|
|
Fix the following fallthrough warning (arm64-randconfig with Clang):
drivers/video/fbdev/xilinxfb.c:244:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/lkml/60edca25.k00ut905IFBjPyt5%[email protected]/
Signed-off-by: Gustavo A. R. Silva <[email protected]>
|
|
Fix the following fallthrough warning (nds32-randconfig with GCC):
include/math-emu/op-common.h:332:8: warning: this statement may fall through [-Wimplicit-fallthrough=]
Reported-by: kernel test robot <[email protected]>
Link: https://lore.kernel.org/lkml/60edca25.k00ut905IFBjPyt5%[email protected]/
Signed-off-by: Gustavo A. R. Silva <[email protected]>
|