Age | Commit message (Collapse) | Author | Files | Lines |
|
To get the changes in:
f345a0143b4dd1cf ("vhost-vdpa: uAPI to suspend the device")
Silencing this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/vhost.h' differs from latest version at 'include/uapi/linux/vhost.h'
diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h
To pick up these changes and support them:
$ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > before
$ cp include/uapi/linux/vhost.h tools/include/uapi/linux/vhost.h
$ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > after
$ diff -u before after
--- before 2022-08-18 09:46:12.355958316 -0300
+++ after 2022-08-18 09:46:19.701182822 -0300
@@ -29,6 +29,7 @@
[0x75] = "VDPA_SET_VRING_ENABLE",
[0x77] = "VDPA_SET_CONFIG_CALL",
[0x7C] = "VDPA_SET_GROUP_ASID",
+ [0x7D] = "VDPA_SUSPEND",
};
= {
[0x00] = "GET_FEATURES",
$
For instance, see how those 'cmd' ioctl arguments get translated, now
VDPA_SUSPEND will be as well:
# perf trace -a -e ioctl --max-events=10
0.000 ( 0.011 ms): pipewire/2261 ioctl(fd: 60, cmd: SNDRV_PCM_HWSYNC, arg: 0x1) = 0
21.353 ( 0.014 ms): pipewire/2261 ioctl(fd: 60, cmd: SNDRV_PCM_HWSYNC, arg: 0x1) = 0
25.766 ( 0.014 ms): gnome-shell/2196 ioctl(fd: 14, cmd: DRM_I915_IRQ_WAIT, arg: 0x7ffe4a22c740) = 0
25.845 ( 0.034 ms): gnome-shel:cs0/2212 ioctl(fd: 14, cmd: DRM_I915_IRQ_EMIT, arg: 0x7fd43915dc70) = 0
25.916 ( 0.011 ms): gnome-shell/2196 ioctl(fd: 9, cmd: DRM_MODE_ADDFB2, arg: 0x7ffe4a22c8a0) = 0
25.941 ( 0.025 ms): gnome-shell/2196 ioctl(fd: 9, cmd: DRM_MODE_ATOMIC, arg: 0x7ffe4a22c840) = 0
32.915 ( 0.009 ms): gnome-shell/2196 ioctl(fd: 9, cmd: DRM_MODE_RMFB, arg: 0x7ffe4a22cf9c) = 0
42.522 ( 0.013 ms): gnome-shell/2196 ioctl(fd: 14, cmd: DRM_I915_IRQ_WAIT, arg: 0x7ffe4a22c740) = 0
42.579 ( 0.031 ms): gnome-shel:cs0/2212 ioctl(fd: 14, cmd: DRM_I915_IRQ_EMIT, arg: 0x7fd43915dc70) = 0
42.644 ( 0.010 ms): gnome-shell/2196 ioctl(fd: 9, cmd: DRM_MODE_ADDFB2, arg: 0x7ffe4a22c8a0) = 0
#
Cc: Adrian Hunter <[email protected]>
Cc: Eugenio Pérez <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To pick the changes in:
f5ecfee944934757 ("KVM: s390: resetting the Topology-Change-Report")
None of them trigger any changes in tooling, this time this is just to silence
these perf build warnings:
Warning: Kernel ABI header at 'tools/arch/s390/include/uapi/asm/kvm.h' differs from latest version at 'arch/s390/include/uapi/asm/kvm.h'
diff -u tools/arch/s390/include/uapi/asm/kvm.h arch/s390/include/uapi/asm/kvm.h
Cc: Janosch Frank <[email protected]>
Cc: Pierre Morel <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To pick the changes in:
8a061562e2f2b32b ("RISC-V: KVM: Add extensible CSR emulation framework")
f5ecfee944934757 ("KVM: s390: resetting the Topology-Change-Report")
450a563924ae9437 ("KVM: stats: Fix value for KVM_STATS_UNIT_MAX for boolean stats")
1b870fa5573e260b ("kvm: stats: tell userspace which values are boolean")
db1c875e0539518e ("KVM: s390: add KVM_S390_ZPCI_OP to manage guest zPCI devices")
94dfc73e7cf4a31d ("treewide: uapi: Replace zero-length arrays with flexible-array members")
084cc29f8bbb034c ("KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis")
2f4073e08f4cc5a4 ("KVM: VMX: Enable Notify VM exit")
ed2351174e38ad4f ("KVM: x86: Extend KVM_{G,S}ET_VCPU_EVENTS to support pending triple fault")
e9bf3acb23f0a6e1 ("KVM: s390: Add KVM_CAP_S390_PROTECTED_DUMP")
8aba09588d2af37c ("KVM: s390: Add CPU dump functionality")
0460eb35b443f73f ("KVM: s390: Add configuration dump functionality")
fe9a93e07ba4f29d ("KVM: s390: pv: Add query dump information")
35d02493dba1ae63 ("KVM: s390: pv: Add query interface")
c24a950ec7d60c4d ("KVM, SEV: Add KVM_EXIT_SHUTDOWN metadata for SEV-ES")
ffbb61d09fc56c85 ("KVM: x86: Accept KVM_[GS]ET_TSC_KHZ as a VM ioctl.")
661a20fab7d156cf ("KVM: x86/xen: Advertise and document KVM_XEN_HVM_CONFIG_EVTCHN_SEND")
fde0451be8fb3208 ("KVM: x86/xen: Support per-vCPU event channel upcall via local APIC")
28d1629f751c4a5f ("KVM: x86/xen: Kernel acceleration for XENVER_version")
536395260582be74 ("KVM: x86/xen: handle PV timers oneshot mode")
942c2490c23f2800 ("KVM: x86/xen: Add KVM_XEN_VCPU_ATTR_TYPE_VCPU_ID")
2fd6df2f2b47d430 ("KVM: x86/xen: intercept EVTCHNOP_send from guests")
35025735a79eaa89 ("KVM: x86/xen: Support direct injection of event channel events")
That just rebuilds perf, as these patches add just an ioctl that is S390
specific and may clash with other arches, so are so far being excluded
in the harvester script:
$ tools/perf/trace/beauty/kvm_ioctl.sh > before
$ cp include/uapi/linux/kvm.h tools/include/uapi/linux/kvm.h
$ tools/perf/trace/beauty/kvm_ioctl.sh > after
$ diff -u before after
$ grep 390 tools/perf/trace/beauty/kvm_ioctl.sh
egrep -v " ((ARM|PPC|S390)_|[GS]ET_(DEBUGREGS|PIT2|XSAVE|TSC_KHZ)|CREATE_SPAPR_TCE_64)" | \
$
This is also by now used by tools/testing/selftests/kvm/, a simple test
build succeeded.
This silences this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
Cc: Anup Patel <[email protected]>
Cc: Ben Gardon <[email protected]>
Cc: Chenyi Qiang <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: David Woodhouse <[email protected]>
Cc: Gustavo A. R. Silva <[email protected]>
Cc: Janosch Frank <[email protected]>
Cc: João Martins <[email protected]>
Cc: Matthew Rosato <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Peter Gonda <[email protected]>
Cc: Pierre Morel <[email protected]>
Cc: Tao Xu <[email protected]>
Link: https://lore.kernel.org/lkml/YvzuryClcn%[email protected]/
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To pick up the changes in:
a913bde810fc464d ("drm/i915: Update i915 uapi documentation")
525e93f6317a08a0 ("drm/i915/uapi: add NEEDS_CPU_ACCESS hint")
141f733bb3abb000 ("drm/i915/uapi: expose the avail tracking")
3f4309cbdc849637 ("drm/i915/uapi: add probed_cpu_visible_size")
a50794f26f52c66c ("uapi/drm/i915: Document memory residency and Flat-CCS capability of obj")
That don't add any new ioctl, so no changes in tooling.
This silences this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: Matt Roper <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Niranjana Vishwanathapura <[email protected]>
Cc: Ramalingam C <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To pick the changes from:
2b1299322016731d ("x86/speculation: Add RSB VM Exit protections")
28a99e95f55c6185 ("x86/amd: Use IBPB for firmware calls")
4ad3278df6fe2b08 ("x86/speculation: Disable RRSBA behavior")
26aae8ccbc197223 ("x86/cpu/amd: Enumerate BTC_NO")
9756bba28470722d ("x86/speculation: Fill RSB on vmexit for IBRS")
3ebc170068885b6f ("x86/bugs: Add retbleed=ibpb")
2dbb887e875b1de3 ("x86/entry: Add kernel IBRS implementation")
6b80b59b35557065 ("x86/bugs: Report AMD retbleed vulnerability")
a149180fbcf336e9 ("x86: Add magic AMD return-thunk")
15e67227c49a5783 ("x86: Undo return-thunk damage")
a883d624aed463c8 ("x86/cpufeatures: Move RETPOLINE flags to word 11")
aae99a7c9ab371b2 ("x86/cpufeatures: Introduce x2AVIC CPUID bit")
6f33a9daff9f0790 ("x86: Fix comment for X86_FEATURE_ZEN")
51802186158c74a0 ("x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug")
This only causes these perf files to be rebuilt:
CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o
And addresses this perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
Cc: Adrian Hunter <[email protected]>
Cc: Alexandre Chartre <[email protected]>
Cc: Andrew Cooper <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Daniel Sneddon <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Pawan Gupta <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suravee Suthikulpanit <[email protected]>
Cc: Wyes Karny <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To pick the changes from:
6b2a51ff03bf0c54 ("fscrypt: Add HCTR2 support for filename encryption")
That don't result in any changes in tooling, just causes this to be
rebuilt:
CC /tmp/build/perf-urgent/trace/beauty/sync_file_range.o
LD /tmp/build/perf-urgent/trace/beauty/perf-in.o
addressing this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/fscrypt.h' differs from latest version at 'include/uapi/linux/fscrypt.h'
diff -u tools/include/uapi/linux/fscrypt.h include/uapi/linux/fscrypt.h
Cc: Adrian Hunter <[email protected]>
Cc: Herbert Xu <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nathan Huckleberry <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To pick up the changes in:
2b1299322016731d ("x86/speculation: Add RSB VM Exit protections")
4af184ee8b2c0a69 ("tools/power turbostat: dump secondary Turbo-Ratio-Limit")
4ad3278df6fe2b08 ("x86/speculation: Disable RRSBA behavior")
d7caac991feeef1b ("x86/cpu/amd: Add Spectral Chicken")
6ad0ad2bf8a67e27 ("x86/bugs: Report Intel retbleed vulnerability")
c59a1f106f5cd484 ("KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS")
465932db25f36648 ("x86/cpu: Add new VMX feature, Tertiary VM-Execution control")
027bbb884be006b0 ("KVM: x86/speculation: Disable Fill buffer clear within guests")
51802186158c74a0 ("x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug")
Addressing these tools/perf build warnings:
diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
That makes the beautification scripts to pick some new entries:
$ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
$ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
$ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
$ diff -u before after
--- before 2022-08-17 09:05:13.938246475 -0300
+++ after 2022-08-17 09:05:22.221455851 -0300
@@ -161,6 +161,7 @@
[0x0000048f] = "IA32_VMX_TRUE_EXIT_CTLS",
[0x00000490] = "IA32_VMX_TRUE_ENTRY_CTLS",
[0x00000491] = "IA32_VMX_VMFUNC",
+ [0x00000492] = "IA32_VMX_PROCBASED_CTLS3",
[0x000004c1] = "IA32_PMC0",
[0x000004d0] = "IA32_MCG_EXT_CTL",
[0x00000560] = "IA32_RTIT_OUTPUT_BASE",
@@ -212,6 +213,7 @@
[0x0000064D] = "PLATFORM_ENERGY_STATUS",
[0x0000064e] = "PPERF",
[0x0000064f] = "PERF_LIMIT_REASONS",
+ [0x00000650] = "SECONDARY_TURBO_RATIO_LIMIT",
[0x00000658] = "PKG_WEIGHTED_CORE_C0_RES",
[0x00000659] = "PKG_ANY_CORE_C0_RES",
[0x0000065A] = "PKG_ANY_GFXE_C0_RES",
$
Now one can trace systemwide asking to see backtraces to where those
MSRs are being read/written, see this example with a previous update:
# perf trace -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
^C#
If we use -v (verbose mode) we can see what it does behind the scenes:
# perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
Using CPUID AuthenticAMD-25-21-0
0x6a0
0x6a8
New filter for msr:read_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
0x6a0
0x6a8
New filter for msr:write_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
mmap size 528384B
^C#
Example with a frequent msr:
# perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2
Using CPUID AuthenticAMD-25-21-0
0x48
New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
0x48
New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
mmap size 528384B
Looking at the vmlinux_path (8 entries long)
symsrc__init: build id mismatch for vmlinux.
Using /proc/kcore for kernel data
Using /proc/kallsyms for symbols
0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
do_trace_write_msr ([kernel.kallsyms])
do_trace_write_msr ([kernel.kallsyms])
__switch_to_xtra ([kernel.kallsyms])
__switch_to ([kernel.kallsyms])
__schedule ([kernel.kallsyms])
schedule ([kernel.kallsyms])
futex_wait_queue_me ([kernel.kallsyms])
futex_wait ([kernel.kallsyms])
do_futex ([kernel.kallsyms])
__x64_sys_futex ([kernel.kallsyms])
do_syscall_64 ([kernel.kallsyms])
entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
__futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so)
0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2)
do_trace_write_msr ([kernel.kallsyms])
do_trace_write_msr ([kernel.kallsyms])
__switch_to_xtra ([kernel.kallsyms])
__switch_to ([kernel.kallsyms])
__schedule ([kernel.kallsyms])
schedule_idle ([kernel.kallsyms])
do_idle ([kernel.kallsyms])
cpu_startup_entry ([kernel.kallsyms])
secondary_startup_64_no_verify ([kernel.kallsyms])
#
Cc: Adrian Hunter <[email protected]>
Cc: Daniel Sneddon <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Len Brown <[email protected]>
Cc: Like Xu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Pawan Gupta <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Robert Hoo <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Link: https://lore.kernel.org/lkml/YvzbT24m2o5U%[email protected]/
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To pick the changes in:
7fa875b8e53c288d ("net: copy from user before calling __copy_msghdr")
ebe73a284f4de8c5 ("net: Allow custom iter handler in msghdr")
7c701d92b2b5e517 ("skbuff: carry external ubuf_info in msghdr")
c04245328dd7e915 ("net: make __sys_accept4_file() static")
That don't result in any changes in the tables generated from that
header.
This silences this perf build warning:
Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h'
diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h
Cc: David Ahern <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Dylan Yudaken <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: Pavel Begunkov <[email protected]>
Cc: Yajun Deng <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
A mask encoding of a cpu map is laid out as:
u16 nr
u16 long_size
unsigned long mask[];
However, the mask may be 8-byte aligned meaning there is a 4-byte pad
after long_size. This means 32-bit and 64-bit builds see the mask as
being at different offsets. On top of this the structure is in the byte
data[] encoded as:
u16 type
char data[]
This means the mask's struct isn't the required 4 or 8 byte aligned, but
is offset by 2. Consequently the long reads and writes are causing
undefined behavior as the alignment is broken.
Fix the mask struct by creating explicit 32 and 64-bit variants, use a
union to avoid data[] and casts; the struct must be packed so the
layout matches the existing perf.data layout. Taking an address of a
member of a packed struct breaks alignment so pass the packed
perf_record_cpu_map_data to functions, so they can access variables with
the right alignment.
As the 64-bit version has 4 bytes of padding, optimizing writing to only
write the 32-bit version.
Committer notes:
Disable warnings about 'packed' that break the build in some arches like
riscv64, but just around that specific struct.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Bayduraev <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Dave Marchevsky <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Gustavo A. R. Silva <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Kees Kook <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
perf_cpu_map__max() computes the cpumap's maximum value, no need to
iterate over all values.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Bayduraev <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Dave Marchevsky <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Gustavo A. R. Silva <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Kees Kook <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Make the cpumap arguments const to make it clearer they are in rather
than out arguments. Make two functions static and remove external
declarations.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Bayduraev <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Dave Marchevsky <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Gustavo A. R. Silva <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Kees Kook <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Allows max() to be used with 'const struct perf_cpu_maps *'.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Bayduraev <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Dave Marchevsky <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Gustavo A. R. Silva <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Kees Kook <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Change the mov in KVM_ASM_SAFE() that zeroes @vector to a movb to
make it unambiguous.
This fixes a build failure with Clang since, unlike the GNU assembler,
the LLVM integrated assembler rejects ambiguous X86 instructions that
don't have suffixes:
In file included from x86_64/hyperv_features.c:13:
include/x86_64/processor.h:825:9: error: ambiguous instructions require an explicit suffix (could be 'movb', 'movw', 'movl', or 'movq')
return kvm_asm_safe("wrmsr", "a"(val & -1u), "d"(val >> 32), "c"(msr));
^
include/x86_64/processor.h:802:15: note: expanded from macro 'kvm_asm_safe'
asm volatile(KVM_ASM_SAFE(insn) \
^
include/x86_64/processor.h:788:16: note: expanded from macro 'KVM_ASM_SAFE'
"1: " insn "\n\t" \
^
<inline asm>:5:2: note: instantiated into assembly here
mov $0, 15(%rsp)
^
It seems like this change could introduce undesirable behavior in the
future, e.g. if someone used a type larger than a u8 for @vector, since
KVM_ASM_SAFE() will only zero the bottom byte. I tried changing the type
of @vector to an int to see what would happen. GCC failed to compile due
to a size mismatch between `movb` and `%eax`. Clang succeeded in
compiling, but the generated code looked correct, so perhaps it will not
be an issue. That being said it seems like there could be a better
solution to this issue that does not assume @vector is a u8.
Fixes: 3b23054cd3f5 ("KVM: selftests: Add x86-64 support for exception fixup")
Signed-off-by: David Matlack <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Change KVM_EXCEPTION_MAGIC to use the all-caps "ULL", rather than lower
case. This fixes a build failure with Clang:
In file included from x86_64/hyperv_features.c:13:
include/x86_64/processor.h:825:9: error: unexpected token in argument list
return kvm_asm_safe("wrmsr", "a"(val & -1u), "d"(val >> 32), "c"(msr));
^
include/x86_64/processor.h:802:15: note: expanded from macro 'kvm_asm_safe'
asm volatile(KVM_ASM_SAFE(insn) \
^
include/x86_64/processor.h:785:2: note: expanded from macro 'KVM_ASM_SAFE'
"mov $" __stringify(KVM_EXCEPTION_MAGIC) ", %%r9\n\t" \
^
<inline asm>:1:18: note: instantiated into assembly here
mov $0xabacadabaull, %r9
^
Fixes: 3b23054cd3f5 ("KVM: selftests: Add x86-64 support for exception fixup")
Signed-off-by: David Matlack <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Add a macro which prevents a function from getting sealed if there are
no compile-time references to it.
Signed-off-by: Josh Poimboeuf <[email protected]>
Message-Id: <20220818213927.e44fmxkoq4yj6ybn@treble>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
No conflicts.
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
After routing, the device always consults a table that determines the
packet's egress VID based on {egress RIF, egress local port}. In the
unified bridge model, it is up to software to maintain this table via
REIV register.
The table needs to be updated in the following flows:
1. When a RIF is set on a FID, for each FID's {Port, VID} mapping, a new
{RIF, Port}->VID mapping should be created.
2. When a {Port, VID} is mapped to a FID and the FID already has a RIF,
a new {RIF, Port}->VID mapping should be created.
Add a test to verify that packets get the correct VID after routing,
regardless of the order of the configuration.
# ./egress_vid_classification.sh
TEST: Add RIF for existing {port, VID}->FID mapping [ OK ]
TEST: Add {port, VID}->FID mapping for FID with a RIF [ OK ]
Signed-off-by: Amit Cohen <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Signed-off-by: Petr Machata <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Before layer 2 forwarding, the device classifies an incoming packet to a
FID. After classification, the FID is known, but also all the attributes of
the FID, such as the router interface (RIF) via which a packet that needs
to be routed will ingress the router block.
For VXLAN decapsulation, the FID classification is done according to the
VNI. When a RIF is added on top of a FID, the existing VNI->FID mapping
should be updated by the software with the new RIF. In addition, when a new
mapping is added for FID which already has a RIF, the correct RIF should
be used for it.
Add a test to verify that packets can be routed after decapsulation which
is done after VNI->FID classification, regardless of the order of the
configuration.
# ./ingress_rif_conf_vxlan.sh
TEST: Add RIF for existing VNI->FID mapping [ OK ]
TEST: Add VNI->FID mapping for FID with a RIF [ OK ]
Signed-off-by: Amit Cohen <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Signed-off-by: Petr Machata <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Before layer 2 forwarding, the device classifies an incoming packet to a
FID. After classification, the FID is known, but also all the attributes of
the FID, such as the router interface (RIF) via which a packet that needs
to be routed will ingress the router block.
For VLAN-aware bridges (802.1Q), the FID classification is done according
to VID. When a RIF is added on top of a FID, the existing VID->FID mapping
should be updated by the software with the new RIF.
We never map multiple VLANs to the same FID using VID->FID, so we cannot
create VID->FID for FID which already has a RIF using 802.1Q. Anyway,
verify that packets can be routed via port which is added after the FID
already has a RIF.
Add a test to verify that packets can be routed after VID->FID
classification, regardless of the order of the configuration.
# ./ingress_rif_conf_1q.sh
TEST: Add RIF for existing VID->FID mapping [ OK ]
TEST: Add port to VID->FID mapping for FID with a RIF [ OK ]
Signed-off-by: Amit Cohen <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Signed-off-by: Petr Machata <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Before layer 2 forwarding, the device classifies an incoming packet to a
FID. After classification, the FID is known, but also all the attributes of
the FID, such as the router interface (RIF) via which a packet that needs
to be routed will ingress the router block.
For VLAN-unaware bridges (802.1D), the FID classification is done according
to {Port, VID}. When a RIF is added on top of a FID, all the existing
{Port, VID}->FID mappings should be updated by the software with the new
RIF. In addition, when a new mapping is added for FID which already has a
RIF, the correct RIF should be used for it.
Add a test to verify that packets can be routed after {Port, VID}->FID
classification, regardless of the order of the configuration.
# ./ingress_rif_conf_1d.sh
TEST: Add RIF for existing {port, VID}->FID mapping [ OK ]
TEST: Add {port, VID}->FID mapping for FID with a RIF [ OK ]
Signed-off-by: Amit Cohen <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Signed-off-by: Petr Machata <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter.
Current release - regressions:
- tcp: fix cleanup and leaks in tcp_read_skb() (the new way BPF
socket maps get data out of the TCP stack)
- tls: rx: react to strparser initialization errors
- netfilter: nf_tables: fix scheduling-while-atomic splat
- net: fix suspicious RCU usage in bpf_sk_reuseport_detach()
Current release - new code bugs:
- mlxsw: ptp: fix a couple of races, static checker warnings and
error handling
Previous releases - regressions:
- netfilter:
- nf_tables: fix possible module reference underflow in error path
- make conntrack helpers deal with BIG TCP (skbs > 64kB)
- nfnetlink: re-enable conntrack expectation events
- net: fix potential refcount leak in ndisc_router_discovery()
Previous releases - always broken:
- sched: cls_route: disallow handle of 0
- neigh: fix possible local DoS due to net iface start/stop loop
- rtnetlink: fix module refcount leak in rtnetlink_rcv_msg
- sched: fix adding qlen to qcpu->backlog in gnet_stats_add_queue_cpu
- virtio_net: fix endian-ness for RSS
- dsa: mv88e6060: prevent crash on an unused port
- fec: fix timer capture timing in `fec_ptp_enable_pps()`
- ocelot: stats: fix races, integer wrapping and reading incorrect
registers (the change of register definitions here accounts for
bulk of the changed LoC in this PR)"
* tag 'net-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (77 commits)
net: moxa: MAC address reading, generating, validity checking
tcp: handle pure FIN case correctly
tcp: refactor tcp_read_skb() a bit
tcp: fix tcp_cleanup_rbuf() for tcp_read_skb()
tcp: fix sock skb accounting in tcp_read_skb()
igb: Add lock to avoid data race
dt-bindings: Fix incorrect "the the" corrections
net: genl: fix error path memory leak in policy dumping
stmmac: intel: Add a missing clk_disable_unprepare() call in intel_eth_pci_remove()
net: ethernet: mtk_eth_soc: fix possible NULL pointer dereference in mtk_xdp_run
net/mlx5e: Allocate flow steering storage during uplink initialization
net: mscc: ocelot: report ndo_get_stats64 from the wraparound-resistant ocelot->stats
net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset
net: mscc: ocelot: make struct ocelot_stat_layout array indexable
net: mscc: ocelot: fix race between ndo_get_stats64 and ocelot_check_stats_work
net: mscc: ocelot: turn stats_lock into a spinlock
net: mscc: ocelot: fix address of SYS_COUNT_TX_AGING counter
net: mscc: ocelot: fix incorrect ndo_get_stats64 packet counters
net: dsa: felix: fix ethtool 256-511 and 512-1023 TX packet counters
net: dsa: don't warn in dsa_port_set_state_now() when driver doesn't support it
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest fix from Shuah Khan:
- fix landlock test build regression
* tag 'linux-kselftest-next-6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
selftests/landlock: fix broken include of linux/landlock.h
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull rtla tool fixes from Steven Rostedt:
"Fixes for the Real-Time Linux Analysis tooling:
- Fix tracer name in comments and prints
- Fix setting up symlinks
- Allow extra flags to be set in build
- Consolidate and show all necessary libraries not found in build
error"
* tag 'trace-rtla-v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
rtla: Consolidate and show all necessary libraries that failed for building
tools/rtla: Build with EXTRA_{C,LD}FLAGS
tools/rtla: Fix command symlinks
rtla: Fix tracer name
|
|
This patch adds tests to exercise optnames that are allowed
in bpf_setsockopt().
Reviewed-by: Stanislav Fomichev <[email protected]>
Signed-off-by: Martin KaFai Lau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Based on a patch by Mark Hemment <[email protected]> and
incorporating very sane suggestions from Linus.
The point here is to have the default case with FSRM - which is supposed
to be the majority of x86 hw out there - if not now then soon - be
directly inlined into the instruction stream so that no function call
overhead is taking place.
Drop the early clobbers from the @size and @addr operands as those are
not needed anymore since we have single instruction alternatives.
The benchmarks I ran would show very small improvements and a PF
benchmark would even show weird things like slowdowns with higher core
counts.
So for a ~6m running the git test suite, the function gets called under
700K times, all from padzero():
<...>-2536 [006] ..... 261.208801: padzero: to: 0x55b0663ed214, size: 3564, cycles: 21900
<...>-2536 [006] ..... 261.208819: padzero: to: 0x7f061adca078, size: 3976, cycles: 17160
<...>-2537 [008] ..... 261.211027: padzero: to: 0x5572d019e240, size: 3520, cycles: 23850
<...>-2537 [008] ..... 261.211049: padzero: to: 0x7f1288dc9078, size: 3976, cycles: 15900
...
which is around 1%-ish of the total time and which is consistent with
the benchmark numbers.
So Mel gave me the idea to simply measure how fast the function becomes.
I.e.:
start = rdtsc_ordered();
ret = __clear_user(to, n);
end = rdtsc_ordered();
Computing the mean average of all the samples collected during the test
suite run then shows some improvement:
clear_user_original:
Amean: 9219.71 (Sum: 6340154910, samples: 687674)
fsrm:
Amean: 8030.63 (Sum: 5522277720, samples: 687652)
That's on Zen3.
The situation looks a lot more confusing on Intel:
Icelake:
clear_user_original:
Amean: 19679.4 (Sum: 13652560764, samples: 693750)
Amean: 19743.7 (Sum: 13693470604, samples: 693562)
(I ran it twice just to be sure.)
ERMS:
Amean: 20374.3 (Sum: 13910601024, samples: 682752)
Amean: 20453.7 (Sum: 14186223606, samples: 693576)
FSRM:
Amean: 20458.2 (Sum: 13918381386, sample s: 680331)
The original microbenchmark which people were complaining about:
for i in $(seq 1 10); do dd if=/dev/zero of=/dev/null bs=1M status=progress count=65536; done 2>&1 | grep copied
32207011840 bytes (32 GB, 30 GiB) copied, 1 s, 32.2 GB/s
68719476736 bytes (69 GB, 64 GiB) copied, 1.93069 s, 35.6 GB/s
37597741056 bytes (38 GB, 35 GiB) copied, 1 s, 37.6 GB/s
68719476736 bytes (69 GB, 64 GiB) copied, 1.78017 s, 38.6 GB/s
62020124672 bytes (62 GB, 58 GiB) copied, 2 s, 31.0 GB/s
68719476736 bytes (69 GB, 64 GiB) copied, 2.13716 s, 32.2 GB/s
60010004480 bytes (60 GB, 56 GiB) copied, 1 s, 60.0 GB/s
68719476736 bytes (69 GB, 64 GiB) copied, 1.14129 s, 60.2 GB/s
53212086272 bytes (53 GB, 50 GiB) copied, 1 s, 53.2 GB/s
68719476736 bytes (69 GB, 64 GiB) copied, 1.28398 s, 53.5 GB/s
55698259968 bytes (56 GB, 52 GiB) copied, 1 s, 55.7 GB/s
68719476736 bytes (69 GB, 64 GiB) copied, 1.22507 s, 56.1 GB/s
55306092544 bytes (55 GB, 52 GiB) copied, 1 s, 55.3 GB/s
68719476736 bytes (69 GB, 64 GiB) copied, 1.23647 s, 55.6 GB/s
54387539968 bytes (54 GB, 51 GiB) copied, 1 s, 54.4 GB/s
68719476736 bytes (69 GB, 64 GiB) copied, 1.25693 s, 54.7 GB/s
50566529024 bytes (51 GB, 47 GiB) copied, 1 s, 50.6 GB/s
68719476736 bytes (69 GB, 64 GiB) copied, 1.35096 s, 50.9 GB/s
58308165632 bytes (58 GB, 54 GiB) copied, 1 s, 58.3 GB/s
68719476736 bytes (69 GB, 64 GiB) copied, 1.17394 s, 58.5 GB/s
Now the same thing with smaller buffers:
for i in $(seq 1 10); do dd if=/dev/zero of=/dev/null bs=1M status=progress count=8192; done 2>&1 | grep copied
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 0.28485 s, 30.2 GB/s
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 0.276112 s, 31.1 GB/s
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 0.29136 s, 29.5 GB/s
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 0.283803 s, 30.3 GB/s
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 0.306503 s, 28.0 GB/s
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 0.349169 s, 24.6 GB/s
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 0.276912 s, 31.0 GB/s
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 0.265356 s, 32.4 GB/s
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 0.28464 s, 30.2 GB/s
8589934592 bytes (8.6 GB, 8.0 GiB) copied, 0.242998 s, 35.3 GB/s
is also not conclusive because it all depends on the buffer sizes,
their alignments and when the microcode detects that cachelines can be
aggregated properly and copied in bigger sizes.
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lore.kernel.org/r/CAHk-=wh=Mu_EYhtOmPn6AxoQZyEh-4fo2Zx3G7rBv1g7vwoKiw@mail.gmail.com
|
|
Andrii Nakryiko says:
====================
bpf-next 2022-08-17
We've added 45 non-merge commits during the last 14 day(s) which contain
a total of 61 files changed, 986 insertions(+), 372 deletions(-).
The main changes are:
1) New bpf_ktime_get_tai_ns() BPF helper to access CLOCK_TAI, from Kurt
Kanzenbach and Jesper Dangaard Brouer.
2) Few clean ups and improvements for libbpf 1.0, from Andrii Nakryiko.
3) Expose crash_kexec() as kfunc for BPF programs, from Artem Savkov.
4) Add ability to define sleepable-only kfuncs, from Benjamin Tissoires.
5) Teach libbpf's bpf_prog_load() and bpf_map_create() to gracefully handle
unsupported names on old kernels, from Hangbin Liu.
6) Allow opting out from auto-attaching BPF programs by libbpf's BPF skeleton,
from Hao Luo.
7) Relax libbpf's requirement for shared libs to be marked executable, from
Henqgi Chen.
8) Improve bpf_iter internals handling of error returns, from Hao Luo.
9) Few accommodations in libbpf to support GCC-BPF quirks, from James Hilliard.
10) Fix BPF verifier logic around tracking dynptr ref_obj_id, from Joanne Koong.
11) bpftool improvements to handle full BPF program names better, from Manu
Bretelle.
12) bpftool fixes around libcap use, from Quentin Monnet.
13) BPF map internals clean ups and improvements around memory allocations,
from Yafang Shao.
14) Allow to use cgroup_get_from_file() on cgroupv1, allowing BPF cgroup
iterator to work on cgroupv1, from Yosry Ahmed.
15) BPF verifier internal clean ups, from Dave Marchevsky and Joanne Koong.
16) Various fixes and clean ups for selftests/bpf and vmtest.sh, from Daniel
Xu, Artem Savkov, Joanne Koong, Andrii Nakryiko, Shibin Koikkara Reeny.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (45 commits)
selftests/bpf: Few fixes for selftests/bpf built in release mode
libbpf: Clean up deprecated and legacy aliases
libbpf: Streamline bpf_attr and perf_event_attr initialization
libbpf: Fix potential NULL dereference when parsing ELF
selftests/bpf: Tests libbpf autoattach APIs
libbpf: Allows disabling auto attach
selftests/bpf: Fix attach point for non-x86 arches in test_progs/lsm
libbpf: Making bpf_prog_load() ignore name if kernel doesn't support
selftests/bpf: Update CI kconfig
selftests/bpf: Add connmark read test
selftests/bpf: Add existing connection bpf_*_ct_lookup() test
bpftool: Clear errno after libcap's checks
bpf: Clear up confusion in bpf_skb_adjust_room()'s documentation
bpftool: Fix a typo in a comment
libbpf: Add names for auxiliary maps
bpf: Use bpf_map_area_alloc consistently on bpf map creation
bpf: Make __GFP_NOWARN consistent in bpf map creation
bpf: Use bpf_map_area_free instread of kvfree
bpf: Remove unneeded memset in queue_stack_map creation
libbpf: preserve errno across pr_warn/pr_info/pr_debug
...
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Florian Westphal says:
====================
netfilter: conntrack and nf_tables bug fixes
The following patchset contains netfilter fixes for net.
Broken since 5.19:
A few ancient connection tracking helpers assume TCP packets cannot
exceed 64kb in size, but this isn't the case anymore with 5.19 when
BIG TCP got merged, from myself.
Regressions since 5.19:
1. 'conntrack -E expect' won't display anything because nfnetlink failed
to enable events for expectations, only for normal conntrack events.
2. partially revert change that added resched calls to a function that can
be in atomic context. Both broken and fixed up by myself.
Broken for several releases (up to original merge of nf_tables):
Several fixes for nf_tables control plane, from Pablo.
This fixes up resource leaks in error paths and adds more sanity
checks for mutually exclusive attributes/flags.
Kconfig:
NF_CONNTRACK_PROCFS is very old and doesn't provide all info provided
via ctnetlink, so it should not default to y. From Geert Uytterhoeven.
Selftests:
rework nft_flowtable.sh: it frequently indicated failure; the way it
tried to detect an offload failure did not work reliably.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
testing: selftests: nft_flowtable.sh: rework test to detect offload failure
testing: selftests: nft_flowtable.sh: use random netns names
netfilter: conntrack: NF_CONNTRACK_PROCFS should no longer default to y
netfilter: nf_tables: check NFT_SET_CONCAT flag if field_count is specified
netfilter: nf_tables: disallow NFT_SET_ELEM_CATCHALL and NFT_SET_ELEM_INTERVAL_END
netfilter: nf_tables: NFTA_SET_ELEM_KEY_END requires concat and interval flags
netfilter: nf_tables: validate NFTA_SET_ELEM_OBJREF based on NFT_SET_OBJECT flag
netfilter: nf_tables: really skip inactive sets when allocating name
netfilter: nfnetlink: re-enable conntrack expectation events
netfilter: nf_tables: fix scheduling-while-atomic splat
netfilter: nf_ct_irc: cap packet search space to 4k
netfilter: nf_ct_ftp: prefer skb_linearize
netfilter: nf_ct_h323: cap packet size at 64k
netfilter: nf_ct_sane: remove pseudo skb linearization
netfilter: nf_tables: possible module reference underflow in error path
netfilter: nf_tables: disallow NFTA_SET_ELEM_KEY_END with NFT_SET_ELEM_INTERVAL_END flag
netfilter: nf_tables: use READ_ONCE and WRITE_ONCE for shared generation id access
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Fix few issues found when building and running test_progs in
release mode.
First, potentially uninitialized idx variable in xskxceiver,
force-initialize to zero to satisfy compiler.
Few instances of defining uprobe trigger functions break in release mode
unless marked as noinline, due to being static. Add noinline to make
sure everything works.
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Hao Luo <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Remove three missed deprecated APIs that were aliased to new APIs:
bpf_object__unload, bpf_prog_attach_xattr and btf__load.
Also move legacy API libbpf_find_kernel_btf (aliased to
btf__load_vmlinux_btf) into libbpf_legacy.h.
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Hao Luo <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Make sure that entire libbpf code base is initializing bpf_attr and
perf_event_attr with memset(0). Also for bpf_attr make sure we
clear and pass to kernel only relevant parts of bpf_attr. bpf_attr is
a huge union of independent sub-command attributes, so there is no need
to clear and pass entire union bpf_attr, which over time grows quite
a lot and for most commands this growth is completely irrelevant.
Few cases where we were relying on compiler initialization of BPF UAPI
structs (like bpf_prog_info, bpf_map_info, etc) with `= {};` were
switched to memset(0) pattern for future-proofing.
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Hao Luo <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Fix if condition filtering empty ELF sections to prevent NULL
dereference.
Fixes: 47ea7417b074 ("libbpf: Skip empty sections in bpf_object__init_global_data_maps")
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Hao Luo <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Adds test for libbpf APIs that toggle bpf program auto-attaching.
Signed-off-by: Hao Luo <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Adds libbpf APIs for disabling auto-attach for individual functions.
This is motivated by the use case of cgroup iter [1]. Some iter
types require their parameters to be non-zero, therefore applying
auto-attach on them will fail. With these two new APIs, users who
want to use auto-attach and these types of iters can disable
auto-attach on the program and perform manual attach.
[1] https://lore.kernel.org/bpf/CAEf4BzZ+a2uDo_t6kGBziqdz--m2gh2_EUwkGLDtMd65uwxUjA@mail.gmail.com/
Signed-off-by: Hao Luo <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
This test fails on current kernel releases because the flotwable path
now calls dst_check from packet path and will then remove the offload.
Test script has two purposes:
1. check that file (random content) can be sent to other netns (and vv)
2. check that the flow is offloaded (rather than handled by classic
forwarding path).
Since dst_check is in place, 2) fails because the nftables ruleset in
router namespace 1 intentionally blocks traffic under the assumption
that packets are not passed via classic path at all.
Rework this: Instead of blocking traffic, create two named counters, one
for original and one for reverse direction.
The first three test cases are handled by classic forwarding path
(path mtu discovery is disabled and packets exceed MTU).
But all other tests enable PMTUD, so the originator and responder are
expected to lower packet size and flowtable is expected to do the packet
forwarding.
For those tests, check that the packet counters (which are only
incremented for packets that are passed up to classic forward path)
are significantly lower than the file size transferred.
I've tested that the counter-checks fail as expected when the 'flow add'
statement is removed from the ruleset.
Signed-off-by: Florian Westphal <[email protected]>
|
|
"ns1" is a too generic name, use a random suffix to avoid
errors when such a netns exists. Also allows to run multiple
instances of the script in parallel.
Signed-off-by: Florian Westphal <[email protected]>
|
|
The LSM hook userns_create was introduced to provide LSM's an
opportunity to block or allow unprivileged user namespace creation. This
test serves two purposes: it provides a test eBPF implementation, and
tests the hook successfully blocks or allows user namespace creation.
This tests 3 cases:
1. Unattached bpf program does not block unpriv user namespace
creation.
2. Attached bpf program allows user namespace creation given
CAP_SYS_ADMIN privileges.
3. Attached bpf program denies user namespace creation for a
user without CAP_SYS_ADMIN.
Acked-by: KP Singh <[email protected]>
Signed-off-by: Frederick Lawler <[email protected]>
Signed-off-by: Paul Moore <[email protected]>
|
|
Use SYS_PREFIX macro from bpf_misc.h instead of hard-coded '__x64_'
prefix for sys_setdomainname attach point in lsm test.
Signed-off-by: Artem Savkov <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
A TODO in net/ipsec.c asks to refactor the code in xfrm_fill_key() to
use set/map to avoid manually comparing each algorithm with the "name"
parameter passed to the function as an argument. This patch refactors
the code to create an array of structs where each struct contains the
algorithm name and its corresponding key length.
Signed-off-by: Gautam Menghani <[email protected]>
Signed-off-by: Steffen Klassert <[email protected]>
|
|
OpenSSL 3.0 deprecates some of the functions used in the SGX
selftests, causing build errors on new distros. For now ignore
the warnings until support for the functions is no longer
available and mark FIXME so that it can be clear this should
be removed at some point.
Signed-off-by: Kristen Carlson Accardi <[email protected]>
Reviewed-by: Jarkko Sakkinen <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
Similar with commit 10b62d6a38f7 ("libbpf: Add names for auxiliary maps"),
let's make bpf_prog_load() also ignore name if kernel doesn't support
program name.
To achieve this, we need to call sys_bpf_prog_load() directly in
probe_kern_prog_name() to avoid circular dependency. sys_bpf_prog_load()
also need to be exported in the libbpf_internal.h file.
Signed-off-by: Hangbin Liu <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Acked-by: Quentin Monnet <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
The previous selftest changes require two kconfig changes in bpf-ci.
Signed-off-by: Daniel Xu <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Kumar Kartikeya Dwivedi <[email protected]>
Link: https://lore.kernel.org/bpf/2c27c6ebf7a03954915f83560653752450389564.1660254747.git.dxu@dxuuu.xyz
|
|
Test that the prog can read from the connection mark. This test is nice
because it ensures progs can interact with netfilter subsystem
correctly.
Signed-off-by: Daniel Xu <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Kumar Kartikeya Dwivedi <[email protected]>
Link: https://lore.kernel.org/bpf/d3bc620a491e4c626c20d80631063922cbe13e2b.1660254747.git.dxu@dxuuu.xyz
|
|
Add a test where we do a conntrack lookup on an existing connection.
This is nice because it's a more realistic test than artifically
creating a ct entry and looking it up afterwards.
Signed-off-by: Daniel Xu <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Kumar Kartikeya Dwivedi <[email protected]>
Link: https://lore.kernel.org/bpf/de5a617832f38f8b5631cc87e2a836da7c94d497.1660254747.git.dxu@dxuuu.xyz
|
|
When bpftool is linked against libcap, the library runs a "constructor"
function to compute the number of capabilities of the running kernel
[0], at the beginning of the execution of the program. As part of this,
it performs multiple calls to prctl(). Some of these may fail, and set
errno to a non-zero value:
# strace -e prctl ./bpftool version
prctl(PR_CAPBSET_READ, CAP_MAC_OVERRIDE) = 1
prctl(PR_CAPBSET_READ, 0x30 /* CAP_??? */) = -1 EINVAL (Invalid argument)
prctl(PR_CAPBSET_READ, CAP_CHECKPOINT_RESTORE) = 1
prctl(PR_CAPBSET_READ, 0x2c /* CAP_??? */) = -1 EINVAL (Invalid argument)
prctl(PR_CAPBSET_READ, 0x2a /* CAP_??? */) = -1 EINVAL (Invalid argument)
prctl(PR_CAPBSET_READ, 0x29 /* CAP_??? */) = -1 EINVAL (Invalid argument)
** fprintf added at the top of main(): we have errno == 1
./bpftool v7.0.0
using libbpf v1.0
features: libbfd, libbpf_strict, skeletons
+++ exited with 0 +++
This has been addressed in libcap 2.63 [1], but until this version is
available everywhere, we can fix it on bpftool side.
Let's clean errno at the beginning of the main() function, to make sure
that these checks do not interfere with the batch mode, where we error
out if errno is set after a bpftool command.
[0] https://git.kernel.org/pub/scm/libs/libcap/libcap.git/tree/libcap/cap_alloc.c?h=libcap-2.65#n20
[1] https://git.kernel.org/pub/scm/libs/libcap/libcap.git/commit/?id=f25a1b7e69f7b33e6afb58b3e38f3450b7d2d9a0
Signed-off-by: Quentin Monnet <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Revert part of the earlier changes to fix the kselftest build when
using a sub-directory from the top of the tree as this broke the
landlock test build as a side-effect when building with "make -C
tools/testing/selftests/landlock".
Reported-by: Mickaël Salaün <[email protected]>
Fixes: a917dd94b832 ("selftests/landlock: drop deprecated headers dependency")
Fixes: f2745dc0ba3d ("selftests: stop using KSFT_KHDR_INSTALL")
Signed-off-by: Guillaume Tucker <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
|
|
There are two "the" in the text. Remove one.
Signed-off-by: Jason Wang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Wei Liu <[email protected]>
|
|
Adding or removing room space _below_ layers 2 or 3, as the description
mentions, is ambiguous. This was written with a mental image of the
packet with layer 2 at the top, layer 3 under it, and so on. But it has
led users to believe that it was on lower layers (before the beginning
of the L2 and L3 headers respectively).
Let's make it more explicit, and specify between which layers the room
space is adjusted.
Reported-by: Rumen Telbizov <[email protected]>
Signed-off-by: Quentin Monnet <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
This is the wrong library name: libcap, not libpcap.
Signed-off-by: Quentin Monnet <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Some recently added selftests don't have their binaries in .gitignores,
so add them.
I also alphabetically sorted sampling_tests/.gitignore while I was in
there.
Signed-off-by: Russell Currey <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull more perf tool updates from Arnaldo Carvalho de Melo:
- 'perf c2c' now supports ARM64, adjust its output to cope with
differences with what is in x86_64. Now go find false sharing on
ARM64 (at least Neoverse) as well!
- Refactor the JSON processing, making the output more compact and thus
reducing the size of the resulting perf binary
- Improvements for 'perf offcpu' profiling, including tracking child
processes
- Update Intel JSON metrics and events files for broadwellde,
broadwellx, cascadelakex, haswellx, icelakex, ivytown, jaketown,
knightslanding, sapphirerapids, skylakex and snowridgex
- Add 'perf stat' JSON output and a 'perf test' entry for it
- Ignore memfd and anonymous mmap events if jitdump present
- Refactor 'perf test' shell tests allowing subdirs
- Fix an error handling path in 'parse_perf_probe_command()'
- Fixes for the guest Intel PT tracing patchkit in the 1st batch of
this merge window
- Print debuginfod queries if -v option is used, to explain delays in
processing when debuginfo servers are enabled to fetch DSOs with
richer symbol tables
- Improve error message for 'perf record -p not_existing_pid'
- Fix openssl and libbpf feature detection
- Add PMU pai_crypto event description for IBM z16 on 'perf list'
- Fix typos and duplicated words on comments in various places
* tag 'perf-tools-fixes-for-v6.0-2022-08-13' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (81 commits)
perf test: Refactor shell tests allowing subdirs
perf vendor events: Update events for snowridgex
perf vendor events: Update events and metrics for skylakex
perf vendor events: Update metrics for sapphirerapids
perf vendor events: Update events for knightslanding
perf vendor events: Update metrics for jaketown
perf vendor events: Update metrics for ivytown
perf vendor events: Update events and metrics for icelakex
perf vendor events: Update events and metrics for haswellx
perf vendor events: Update events and metrics for cascadelakex
perf vendor events: Update events and metrics for broadwellx
perf vendor events: Update metrics for broadwellde
perf jevents: Fold strings optimization
perf jevents: Compress the pmu_events_table
perf metrics: Copy entire pmu_event in find metric
perf pmu-events: Hide the pmu_events
perf pmu-events: Don't assume pmu_event is an array
perf pmu-events: Move test events/metrics to JSON
perf test: Use full metric resolution
perf pmu-events: Hide pmu_events_map
...
|