aboutsummaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2020-09-28libbpf: Allow modification of BTF and add btf__add_str APIAndrii Nakryiko3-8/+258
Allow internal BTF representation to switch from default read-only mode, in which raw BTF data is a single non-modifiable block of memory with BTF header, types, and strings layed out sequentially and contiguously in memory, into a writable representation with types and strings data split out into separate memory regions, that can be dynamically expanded. Such writable internal representation is transparent to users of libbpf APIs, but allows to append new types and strings at the end of BTF, which is a typical use case when generating BTF programmatically. All the basic guarantees of BTF types and strings layout is preserved, i.e., user can get `struct btf_type *` pointer and read it directly. Such btf_type pointers might be invalidated if BTF is modified, so some care is required in such mixed read/write scenarios. Switch from read-only to writable configuration happens automatically the first time when user attempts to modify BTF by either adding a new type or new string. It is still possible to get raw BTF data, which is a single piece of memory that can be persisted in ELF section or into a file as raw BTF. Such raw data memory is also still owned by BTF and will be freed either when BTF object is freed or if another modification to BTF happens, as any modification invalidates BTF raw representation. This patch adds the first two BTF manipulation APIs: btf__add_str(), which allows to add arbitrary strings to BTF string section, and btf__find_str() which allows to find existing string offset, but not add it if it's missing. All the added strings are automatically deduplicated. This is achieved by maintaining an additional string lookup index for all unique strings. Such index is built when BTF is switched to modifiable mode. If at that time BTF strings section contained duplicate strings, they are not de-duplicated. This is done specifically to not modify the existing content of BTF (types, their string offsets, etc), which can cause confusion and is especially important property if there is struct btf_ext associated with struct btf. By following this "imperfect deduplication" process, btf_ext is kept consitent and correct. If deduplication of strings is necessary, it can be forced by doing BTF deduplication, at which point all the strings will be eagerly deduplicated and all string offsets both in struct btf and struct btf_ext will be updated. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-28libbpf: Extract generic string hashing function for reuseAndrii Nakryiko2-8/+13
Calculating a hash of zero-terminated string is a common need when using hashmap, so extract it for reuse. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-28libbpf: Generalize common logic for managing dynamically-sized arraysAndrii Nakryiko2-21/+59
Managing dynamically-sized array is a common, but not trivial functionality, which significant amount of logic and code to implement properly. So instead of re-implementing it all the time, extract it into a helper function ans reuse. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-28libbpf: Remove assumption of single contiguous memory for BTF dataAndrii Nakryiko3-43/+60
Refactor internals of struct btf to remove assumptions that BTF header, type data, and string data are layed out contiguously in a memory in a single memory allocation. Now we have three separate pointers pointing to the start of each respective are: header, types, strings. In the next patches, these pointers will be re-assigned to point to independently allocated memory areas, if BTF needs to be modified. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-28libbpf: Refactor internals of BTF type indexAndrii Nakryiko1-64/+75
Refactor implementation of internal BTF type index to not use direct pointers. Instead it uses offset relative to the start of types data section. This allows for types data to be reallocatable, enabling implementation of modifiable BTF. As now getting type by ID has an extra indirection step, convert all internal type lookups to a new helper btf_type_id(), that returns non-const pointer to a type by its ID. Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-28selftests: Remove fmod_ret from test_overheadToke Høiland-Jørgensen4-39/+1
The test_overhead prog_test included an fmod_ret program that attached to __set_task_comm() in the kernel. However, this function was never listed as allowed for return modification, so this only worked because of the verifier skipping tests when a trampoline already existed for the attach point. Now that the verifier checks have been fixed, remove fmod_ret from the test so it works again. Fixes: 4eaf0b5c5e04 ("selftest/bpf: Fmod_ret prog and implement test_overhead as part of bench") Acked-by: Andrii Nakryiko <[email protected]> Signed-off-by: Toke Høiland-Jørgensen <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2020-09-28selftest: bpf: Test copying a sockmap and sockhashLorenz Bauer2-11/+30
Since we can now call map_update_elem(sockmap) from bpf_iter context it's possible to copy a sockmap or sockhash in the kernel. Add a selftest which exercises this. Signed-off-by: Lorenz Bauer <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-28selftests: bpf: Remove shared header from sockmap iter testLorenz Bauer3-24/+20
The shared header to define SOCKMAP_MAX_ENTRIES is a bit overkill. Dynamically allocate the sock_fd array based on bpf_map__max_entries instead. Suggested-by: Yonghong Song <[email protected]> Signed-off-by: Lorenz Bauer <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Yonghong Song <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-28selftests: bpf: Add helper to compare socket cookiesLorenz Bauer1-14/+36
We compare socket cookies to ensure that insertion into a sockmap worked. Pull this out into a helper function for use in other tests. Signed-off-by: Lorenz Bauer <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-28selftests/bpf: Add raw_tp_test_runSong Liu2-0/+120
This test runs test_run for raw_tracepoint program. The test covers ctx input, retval output, and running on correct cpu. Signed-off-by: Song Liu <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-28libbpf: Support test run of raw tracepoint programsSong Liu4-0/+63
Add bpf_prog_test_run_opts() with support of new fields in bpf_attr.test, namely, flags and cpu. Also extend _opts operations to support outputs via opts. Signed-off-by: Song Liu <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-28bpf: Enable BPF_PROG_TEST_RUN for raw_tracepointSong Liu1-0/+7
Add .test_run for raw_tracepoint. Also, introduce a new feature that runs the target program on a specific CPU. This is achieved by a new flag in bpf_attr.test, BPF_F_TEST_RUN_ON_CPU. When this flag is set, the program is triggered on cpu with id bpf_attr.test.cpu. This feature is needed for BPF programs that handle perf_event and other percpu resources, as the program can access these resource locally. Signed-off-by: Song Liu <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: John Fastabend <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-28selftests: net: add a test for static UDP tunnel portsJakub Kicinski1-0/+58
Check UDP_TUNNEL_NIC_INFO_STATIC_IANA_VXLAN works as expected. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-28selftests: net: add a test for shared UDP tunnel info tablesJakub Kicinski1-0/+109
Add a test run of checks validating the shared UDP tunnel port tables function as we expect. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-28Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo12-11/+106
To pick up fixes and get v5.10 development in sync with the main kernel sources. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-09-28perf test: Fix msan uninitialized use.Ian Rogers1-1/+1
Ensure 'st' is initialized before an error branch is taken. Fixes test "67: Parse and process metrics" with LLVM msan: ==6757==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x5570edae947d in rblist__exit tools/perf/util/rblist.c:114:2 #1 0x5570edb1c6e8 in runtime_stat__exit tools/perf/util/stat-shadow.c:141:2 #2 0x5570ed92cfae in __compute_metric tools/perf/tests/parse-metric.c:187:2 #3 0x5570ed92cb74 in compute_metric tools/perf/tests/parse-metric.c:196:9 #4 0x5570ed92c6d8 in test_recursion_fail tools/perf/tests/parse-metric.c:318:2 #5 0x5570ed92b8c8 in test__parse_metric tools/perf/tests/parse-metric.c:356:2 #6 0x5570ed8de8c1 in run_test tools/perf/tests/builtin-test.c:410:9 #7 0x5570ed8ddadf in test_and_print tools/perf/tests/builtin-test.c:440:9 #8 0x5570ed8dca04 in __cmd_test tools/perf/tests/builtin-test.c:661:4 #9 0x5570ed8dbc07 in cmd_test tools/perf/tests/builtin-test.c:807:9 #10 0x5570ed7326cc in run_builtin tools/perf/perf.c:313:11 #11 0x5570ed731639 in handle_internal_command tools/perf/perf.c:365:8 #12 0x5570ed7323cd in run_argv tools/perf/perf.c:409:2 #13 0x5570ed731076 in main tools/perf/perf.c:539:3 Fixes: commit f5a56570a3f2 ("perf test: Fix memory leaks in parse-metric test") Signed-off-by: Ian Rogers <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-09-28perf parse-events: Reduce casts around bp_addrIan Rogers3-7/+7
perf_event_attr bp_addr is a u64. parse-events.y parses it as a u64, but casts it to a void* and then parse-events.c casts it back to a u64. Rather than all the casts, change the type of the address to be a u64. This removes an issue noted in: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Ian Rogers <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jin Yao <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-09-28perf test: Add expand cgroup event testNamhyung Kim4-0/+247
It'll expand given events for cgroups A, B and C. $ perf test -v expansion 69: Event expansion for cgroups : --- start --- test child forked, pid 983140 metric expr 1 / IPC for CPI metric expr instructions / cycles for IPC found event instructions found event cycles adding {instructions,cycles}:W copying metric event for cgroup 'A': instructions (idx=0) copying metric event for cgroup 'B': instructions (idx=0) copying metric event for cgroup 'C': instructions (idx=0) test child finished with 0 ---- end ---- Event expansion for cgroups: Ok Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-09-28perf tools: Allow creation of cgroup without openNamhyung Kim3-9/+14
This is a preparation for a test case of expanding events for multiple cgroups. Instead of using real system cgroup, the test will use fake cgroups so it needs a way to have them without a open file descriptor. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-09-28perf tools: Copy metric events properly when expand cgroupsNamhyung Kim8-3/+149
The metricgroup__copy_metric_events() is to handle metrics events when expanding event for cgroups. As the metric events keep pointers to evsel, it should be refreshed when events are cloned during the operation. The perf_stat__collect_metric_expr() is also called in case an event has a metric directly. During the copy, it references evsel by index as the evlist now has cloned evsels for the given cgroup. Also kernel test robot found an issue in the python module import so add empty implementations of those two functions to fix it. Reported-by: kernel test robot <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-09-28perf stat: Add --for-each-cgroup optionNamhyung Kim5-1/+112
The --for-each-cgroup option is a syntax sugar to monitor large number of cgroups easily. Current command line requires to list all the events and cgroups even if users want to monitor same events for each cgroup. This patch addresses that usage by copying given events for each cgroup on user's behalf. For instance, if they want to monitor 6 events for 200 cgroups each they should write 1200 event names (with -e) AND 1200 cgroup names (with -G) on the command line. But with this change, they can just specify 6 events and 200 cgroups with a new option. A simpler example below: It wants to measure 3 events for 2 cgroups ('A' and 'B'). The result is that total 6 events are counted like below. $ perf stat -a -e cpu-clock,cycles,instructions --for-each-cgroup A,B sleep 1 Performance counter stats for 'system wide': 988.18 msec cpu-clock A # 0.987 CPUs utilized 3,153,761,702 cycles A # 3.200 GHz (100.00%) 8,067,769,847 instructions A # 2.57 insn per cycle (100.00%) 982.71 msec cpu-clock B # 0.982 CPUs utilized 3,136,093,298 cycles B # 3.182 GHz (99.99%) 8,109,619,327 instructions B # 2.58 insn per cycle (99.99%) 1.001228054 seconds time elapsed Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-09-28KVM: x86: do not attempt TSC synchronization on guest writesPaolo Bonzini2-0/+169
KVM special-cases writes to MSR_IA32_TSC so that all CPUs have the same base for the TSC. This logic is complicated, and we do not want it to have any effect once the VM is started. In particular, if any guest started to synchronize its TSCs with writes to MSR_IA32_TSC rather than MSR_IA32_TSC_ADJUST, the additional effect of kvm_write_tsc code would be uncharted territory. Therefore, this patch makes writes to MSR_IA32_TSC behave essentially the same as writes to MSR_IA32_TSC_ADJUST when they come from the guest. A new selftest (which passes both before and after the patch) checks the current semantics of writes to MSR_IA32_TSC and MSR_IA32_TSC_ADJUST originating from both the host and the guest. Upcoming work to remove the special side effects of host-initiated writes to MSR_IA32_TSC and MSR_IA32_TSC_ADJUST will be able to build onto this test, adjusting the host side to use the new APIs and achieve the same effect. Signed-off-by: Paolo Bonzini <[email protected]>
2020-09-28KVM: selftests: Add test for user space MSR handlingAlexander Graf3-0/+250
Now that we have the ability to handle MSRs from user space and also to select which ones we do want to prevent in-kernel KVM code from handling, let's add a selftest to show case and verify the API. Signed-off-by: Alexander Graf <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-09-28KVM: VMX: Rename RDTSCP secondary exec control name to insert "ENABLE"Sean Christopherson1-1/+1
Rename SECONDARY_EXEC_RDTSCP to SECONDARY_EXEC_ENABLE_RDTSCP in preparation for consolidating the logic for adjusting secondary exec controls based on the guest CPUID model. No functional change intended. Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-09-28perf evsel: Add evsel__clone() functionNamhyung Kim2-39/+158
The evsel__clone() is to create an exactly same evsel from same attributes. The function assumes the given evsel is not configured yet so it cares fields set during event parsing. Those fields are now moved together as Jiri suggested. Note that metric events will be handled by later patch. It will be used by perf stat to generate separate events for each cgroup. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-09-28perf vendor events: Update SkylakeX events to v1.21Jin Yao10-3565/+4129
- Update SkylakeX events to v1.21. - Update SkylakeX JSON metrics from TMAM 4.0. Other fixes: - Add NO_NMI_WATCHDOG metric constraint to Backend_Bound - Fix misspelled error Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-09-28perf vendor events intel: Update CascadelakeX events to v1.08Jin Yao8-995/+1067
- Update CascadelakeX events to v1.08. - Update CascadelakeX JSON metrics from TMAM 4.0. Other fixes: - Add NO_NMI_WATCHDOG metric constraint to Backend_Bound - Change 'MB/sec' to 'MB' in UNC_M_PMM_BANDWIDTH. Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Kan Liang <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-09-26netdevsim: fix duplicated debugfs directoryJakub Kicinski1-1/+1
The "ethtool" debugfs directory holds per-netdev knobs, so move it from the device instance directory to the port directory. This fixes the following warning when creating multiple ports: debugfs: Directory 'ethtool' with parent 'netdevsim1' already present! Fixes: ff1f7c17fb20 ("netdevsim: add pause frame stats") Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-25netdevsim: add support for flash_update overwrite maskJacob Keller1-0/+18
The devlink interface recently gained support for a new "overwrite mask" parameter that allows specifying how various sub-sections of a flash component are modified when updating. Add support for this to netdevsim, to enable easily testing the interface. Make the allowed overwrite mask values controllable via a debugfs parameter. This enables testing a flow where the driver rejects an unsupportable overwrite mask. Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-25devlink: convert flash_update to use params structureJacob Keller1-0/+3
The devlink core recently gained support for checking whether the driver supports a flash_update parameter, via `supported_flash_update_params`. However, parameters are specified as function arguments. Adding a new parameter still requires modifying the signature of the .flash_update callback in all drivers. Convert the .flash_update function to take a new `struct devlink_flash_update_params` instead. By using this structure, and the `supported_flash_update_params` bit field, a new parameter to flash_update can be added without requiring modification to existing drivers. As before, all parameters except file_name will require driver opt-in. Because file_name is a necessary field to for the flash_update to make sense, no "SUPPORTED" bitflag is provided and it is always considered valid. All future additional parameters will require a new bit in the supported_flash_update_params bitfield. Signed-off-by: Jacob Keller <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Cc: Jiri Pirko <[email protected]> Cc: Jakub Kicinski <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Michael Chan <[email protected]> Cc: Bin Luo <[email protected]> Cc: Saeed Mahameed <[email protected]> Cc: Leon Romanovsky <[email protected]> Cc: Ido Schimmel <[email protected]> Cc: Danielle Ratson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-25Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-1/+1
Pull more kvm fixes from Paolo Bonzini: "Five small fixes. The nested migration bug will be fixed with a better API in 5.10 or 5.11, for now this is a fix that works with existing userspace but keeps the current ugly API" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: SVM: Add a dedicated INVD intercept routine KVM: x86: Reset MMU context if guest toggles CR4.SMAP or CR4.PKE KVM: x86: fix MSR_IA32_TSC read for nested migration selftests: kvm: Fix assert failure in single-step test KVM: x86: VMX: Make smaller physical guest address space support user-configurable
2020-09-25bpf: Add AND verifier test case where 32bit and 64bit bounds differJohn Fastabend1-0/+16
If we AND two values together that are known in the 32bit subregs, but not known in the 64bit registers we rely on the tnum value to report the 32bit subreg is known. And do not use mark_reg_known() directly from scalar32_min_max_and() Add an AND test to cover the case with known 32bit subreg, but unknown 64bit reg. Signed-off-by: John Fastabend <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2020-09-25nfsd: remove fault injection codeJ. Bruce Fields1-50/+0
It was an interesting idea but nobody seems to be using it, it's buggy at this point, and nfs4state.c is already complicated enough without it. The new nfsd/clients/ code provides some of the same functionality, and could probably do more if desired. This feature has been deprecated since 9d60d93198c6 ("Deprecate nfsd fault injection"). Signed-off-by: J. Bruce Fields <[email protected]>
2020-09-25bpf: selftest: Add test_btf_skc_cls_ingressMartin KaFai Lau3-0/+413
This patch attaches a classifier prog to the ingress filter. It exercises the following helpers with different socket pointer types in different logical branches: 1. bpf_sk_release() 2. bpf_sk_assign() 3. bpf_skc_to_tcp_request_sock(), bpf_skc_to_tcp_sock() 4. bpf_tcp_gen_syncookie, bpf_tcp_check_syncookie Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-25bpf: selftest: Remove enum tcp_ca_state from bpf_tcp_helpers.hMartin KaFai Lau3-8/+4
The enum tcp_ca_state is available in <linux/tcp.h>. Remove it from the bpf_tcp_helpers.h to avoid conflict when the bpf prog needs to include both both <linux/tcp.h> and bpf_tcp_helpers.h. Modify the bpf_cubic.c and bpf_dctcp.c to use <linux/tcp.h> instead. The <linux/stddef.h> is needed by <linux/tcp.h>. Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-25bpf: selftest: Use bpf_skc_to_tcp_sock() in the sock_fields testMartin KaFai Lau2-9/+55
This test uses bpf_skc_to_tcp_sock() to get a kernel tcp_sock ptr "ktp". Access the ktp->lsndtime and also pass ktp to bpf_sk_storage_get(). It also exercises the bpf_sk_cgroup_id() and bpf_sk_ancestor_cgroup_id() with the "ktp". To do that, a parent cgroup and a child cgroup are created. The bpf prog is attached to the child cgroup. Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-25bpf: selftest: Use network_helpers in the sock_fields testMartin KaFai Lau1-79/+9
This patch uses start_server() and connect_to_fd() from network_helpers.h to remove the network testing boiler plate codes. epoll is no longer needed also since the timeout has already been taken care of also. Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-25bpf: selftest: Adapt sock_fields test to use skel and global variablesMartin KaFai Lau2-305/+229
skel is used. Global variables are used to store the result from bpf prog. addr_map, sock_result_map, and tcp_sock_result_map are gone. Instead, global variables listen_tp, srv_sa6, cli_tp,, srv_tp, listen_sk, srv_sk, and cli_sk are added. Because of that, bpf_addr_array_idx and bpf_result_array_idx are also no longer needed. CHECK() macro from test_progs.h is reused and bail as soon as a CHECK failure. shutdown() is used to ensure the previous data-ack is received. The bytes_acked, bytes_received, and the pkt_out_cnt checks are using "<" to accommodate the final ack may not have been received/sent. It is enough since it is not the focus of this test. The sk local storage is all initialized to 0xeB9F now, so the check_sk_pkt_out_cnt() always checks with the 0xeB9F base. It is to keep things simple. The next patch will reuse helpers from network_helpers.h to simplify things further. Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-25bpf: selftest: Move sock_fields test into test_progsMartin KaFai Lau4-6/+3
This is a mechanical change to 1. move test_sock_fields.c to prog_tests/sock_fields.c 2. rename progs/test_sock_fields_kern.c to progs/test_sock_fields.c Minimal change is made to the code itself. Next patch will make changes to use new ways of writing test, e.g. use skel and global variables. Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-25bpf: selftest: Add ref_tracking verifier test for bpf_skc castingMartin KaFai Lau1-0/+47
The patch tests for: 1. bpf_sk_release() can be called on a tcp_sock btf_id ptr. 2. Ensure the tcp_sock btf_id pointer cannot be used after bpf_sk_release(). Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Lorenz Bauer <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-25bpf: Change bpf_sk_assign to accept ARG_PTR_TO_BTF_ID_SOCK_COMMONMartin KaFai Lau1-1/+1
This patch changes the bpf_sk_assign() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_*() helpers also. The bpf_sk_lookup_assign() is taking ARG_PTR_TO_SOCKET_"OR_NULL". Meaning it specifically takes a literal NULL. ARG_PTR_TO_BTF_ID_SOCK_COMMON does not allow a literal NULL, so another ARG type is required for this purpose and another follow-up patch can be used if there is such need. Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-25bpf: Change bpf_tcp_*_syncookie to accept ARG_PTR_TO_BTF_ID_SOCK_COMMONMartin KaFai Lau1-2/+2
This patch changes the bpf_tcp_*_syncookie() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_*() helpers also. Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Lorenz Bauer <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-25bpf: Change bpf_sk_storage_*() to accept ARG_PTR_TO_BTF_ID_SOCK_COMMONMartin KaFai Lau1-0/+1
This patch changes the bpf_sk_storage_*() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_*() helpers also. A micro benchmark has been done on a "cgroup_skb/egress" bpf program which does a bpf_sk_storage_get(). It was driven by netperf doing a 4096 connected UDP_STREAM test with 64bytes packet. The stats from "kernel.bpf_stats_enabled" shows no meaningful difference. The sk_storage_get_btf_proto, sk_storage_delete_btf_proto, btf_sk_storage_get_proto, and btf_sk_storage_delete_proto are no longer needed, so they are removed. Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Lorenz Bauer <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-25bpf: Change bpf_sk_release and bpf_sk_*cgroup_id to accept ↵Martin KaFai Lau1-4/+4
ARG_PTR_TO_BTF_ID_SOCK_COMMON The previous patch allows the networking bpf prog to use the bpf_skc_to_*() helpers to get a PTR_TO_BTF_ID socket pointer, e.g. "struct tcp_sock *". It allows the bpf prog to read all the fields of the tcp_sock. This patch changes the bpf_sk_release() and bpf_sk_*cgroup_id() to take ARG_PTR_TO_BTF_ID_SOCK_COMMON such that they will work with the pointer returned by the bpf_skc_to_*() helpers also. For example, the following will work: sk = bpf_skc_lookup_tcp(skb, tuple, tuplen, BPF_F_CURRENT_NETNS, 0); if (!sk) return; tp = bpf_skc_to_tcp_sock(sk); if (!tp) { bpf_sk_release(sk); return; } lsndtime = tp->lsndtime; /* Pass tp to bpf_sk_release() will also work */ bpf_sk_release(tp); Since PTR_TO_BTF_ID could be NULL, the helper taking ARG_PTR_TO_BTF_ID_SOCK_COMMON has to check for NULL at runtime. A btf_id of "struct sock" may not always mean a fullsock. Regardless the helper's running context may get a non-fullsock or not, considering fullsock check/handling is pretty cheap, it is better to keep the same verifier expectation on helper that takes ARG_PTR_TO_BTF_ID* will be able to handle the minisock situation. In the bpf_sk_*cgroup_id() case, it will try to get a fullsock by using sk_to_full_sk() as its skb variant bpf_sk"b"_*cgroup_id() has already been doing. bpf_sk_release can already handle minisock, so nothing special has to be done. Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-09-25rseq/selftests: Test MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQPeter Oskolkov2-1/+224
Based on Google-internal RSEQ work done by Paul Turner and Andrew Hunter. This patch adds a selftest for MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ. The test quite often fails without the previous patch in this patchset, but consistently passes with it. Signed-off-by: Peter Oskolkov <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Mathieu Desnoyers <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-09-25rseq/selftests,x86_64: Add rseq_offset_deref_addv()Peter Oskolkov1-0/+57
This patch adds rseq_offset_deref_addv() function to tools/testing/selftests/rseq/rseq-x86.h, to be used in a selftest in the next patch in the patchset. Once an architecture adds support for this function they should define "RSEQ_ARCH_HAS_OFFSET_DEREF_ADDV". Signed-off-by: Peter Oskolkov <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Mathieu Desnoyers <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-09-24selftests: mptcp: add remove addr and subflow test casesGeliang Tang1-3/+142
This patch added the remove addr and subflow test cases and two new functions. The first function run_remove_tests calls do_transfer with two new arguments, rm_nr_ns1 and rm_nr_ns2, for the numbers of addresses should be removed during the transfer process in namespace 1 and namespace 2. If both these two arguments are 0, we do the join test cases with "mptcp_connect -j" command. Otherwise, do the remove test cases with "mptcp_connect -r" command. The second function chk_rm_nr checks the RM_ADDR related mibs's counters. The output of the test cases looks like this: 11 remove single subflow syn[ ok ] - synack[ ok ] - ack[ ok ] rm [ ok ] - sf [ ok ] 12 remove multiple subflows syn[ ok ] - synack[ ok ] - ack[ ok ] rm [ ok ] - sf [ ok ] 13 remove single address syn[ ok ] - synack[ ok ] - ack[ ok ] add[ ok ] - echo [ ok ] rm [ ok ] - sf [ ok ] 14 remove subflow and signal syn[ ok ] - synack[ ok ] - ack[ ok ] add[ ok ] - echo [ ok ] rm [ ok ] - sf [ ok ] 15 remove subflows and signal syn[ ok ] - synack[ ok ] - ack[ ok ] add[ ok ] - echo [ ok ] rm [ ok ] - sf [ ok ] Suggested-by: Matthieu Baerts <[email protected]> Suggested-by: Paolo Abeni <[email protected]> Suggested-by: Mat Martineau <[email protected]> Acked-by: Paolo Abeni <[email protected]> Signed-off-by: Geliang Tang <[email protected]> Reviewed-by: Mat Martineau <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-24selftests: mptcp: add remove cfg in mptcp_connectGeliang Tang1-3/+15
This patch added a new cfg, named cfg_remove in mptcp_connect. This new cfg_remove is copied from cfg_join. The only difference between them is in the do_rnd_write function. Here we slow down the transfer process of all data to let the RM_ADDR suboption can be sent and received completely. Otherwise the remove address and subflow test cases don't work. Suggested-by: Matthieu Baerts <[email protected]> Suggested-by: Paolo Abeni <[email protected]> Suggested-by: Mat Martineau <[email protected]> Acked-by: Paolo Abeni <[email protected]> Signed-off-by: Geliang Tang <[email protected]> Reviewed-by: Mat Martineau <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-24selftests: mptcp: add ADD_ADDR mibs check functionGeliang Tang1-0/+44
This patch added the ADD_ADDR related mibs counter check function chk_add_nr(). This function check both ADD_ADDR and ADD_ADDR with echo flag. The output looks like this: 07 unused signal address syn[ ok ] - synack[ ok ] - ack[ ok ] add[ ok ] - echo [ ok ] 08 signal address syn[ ok ] - synack[ ok ] - ack[ ok ] add[ ok ] - echo [ ok ] 09 subflow and signal syn[ ok ] - synack[ ok ] - ack[ ok ] add[ ok ] - echo [ ok ] 10 multiple subflows and signal syn[ ok ] - synack[ ok ] - ack[ ok ] add[ ok ] - echo [ ok ] 11 remove subflow and signal syn[ ok ] - synack[ ok ] - ack[ ok ] add[ ok ] - echo [ ok ] Reviewed-by: Mat Martineau <[email protected]> Signed-off-by: Geliang Tang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-09-24libbpf: Fix XDP program load regression for old kernelsAndrii Nakryiko1-1/+1
Fix regression in libbpf, introduced by XDP link change, which causes XDP programs to fail to be loaded into kernel due to specified BPF_XDP expected_attach_type. While kernel doesn't enforce expected_attach_type for BPF_PROG_TYPE_XDP, some old kernels already support XDP program, but they don't yet recognize expected_attach_type field in bpf_attr, so setting it to non-zero value causes program load to fail. Luckily, libbpf already has a mechanism to deal with such cases, so just make expected_attach_type optional for XDP programs. Fixes: dc8698cac7aa ("libbpf: Add support for BPF XDP link") Reported-by: Nikita Shirokov <[email protected]> Reported-by: Udip Pant <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]