aboutsummaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2021-04-02selftests: mptcp: init nstat historyMatthieu Baerts1-0/+7
Not to be impacted by packets sent between sub-tests. Signed-off-by: Matthieu Baerts <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-04-02selftests: mptcp: launch mptcp_connect with timeoutMatthieu Baerts4-33/+72
'mptcp_connect' already has a timeout for poll() but in some cases, it is not enough. With "timeout" tool, we will force the command to fail if it doesn't finish on time. Thanks to that, the script will continue and display details about the current state before marking the test as failed. Displaying this state is very important to be able to understand the issue. Best to have our CI reporting the issue than just "the test hanged". Note that in mptcp_connect.sh, we were using a long timeout to validate the fact we cannot create a socket if a sysctl is set. We don't need this timeout. In diag.sh, we want to send signals to mptcp_connect instances that have been started in the netns. But we cannot send this signal to 'timeout' otherwise that will stop the timeout and messages telling us SIGUSR1 has been received will be printed. Instead of trying to find the right PID and storing them in an array, we can simply use the output of 'ip netns pids' which is all the PIDs we want to send signal to. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/160 Signed-off-by: Matthieu Baerts <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-04-02kunit: tool: make --kunitconfig accept dirs, add lib/kunit fragmentDaniel Latypov3-1/+11
TL;DR $ ./tools/testing/kunit/kunit.py run --kunitconfig=lib/kunit Per suggestion from Ted [1], we can reduce the amount of typing by assuming a convention that these files are named '.kunitconfig'. In the case of [1], we now have $ ./tools/testing/kunit/kunit.py run --kunitconfig=fs/ext4 Also add in such a fragment for kunit itself so we can give that as an example more close to home (and thus less likely to be accidentally broken). [1] https://lore.kernel.org/linux-ext4/[email protected]/ Signed-off-by: Daniel Latypov <[email protected]> Reviewed-by: David Gow <[email protected]> Reviewed-by: Brendan Higgins <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Create .gitignore to include resctrl_testsFenghua Yu1-0/+2
Create .gitignore to hold the test file resctrl_tests generated after compiling. Suggested-by: Shuah Khan <[email protected]> Tested-by: Babu Moger <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Fix checking for < 0 for unsigned valuesFenghua Yu1-18/+23
Dan reported following static checker warnings tools/testing/selftests/resctrl/resctrl_val.c:545 measure_vals() warn: 'bw_imc' unsigned <= 0 tools/testing/selftests/resctrl/resctrl_val.c:549 measure_vals() warn: 'bw_resc_end' unsigned <= 0 These warnings are reported because 1. measure_vals() declares 'bw_imc' and 'bw_resc_end' as unsigned long variables 2. Return value of get_mem_bw_imc() and get_mem_bw_resctrl() are assigned to 'bw_imc' and 'bw_resc_end' respectively 3. The returned values are checked for <= 0 to see if the calls failed Checking for < 0 for an unsigned value doesn't make any sense. Fix this issue by changing the implementation of get_mem_bw_imc() and get_mem_bw_resctrl() such that they now accept reference to a variable and set the variable appropriately upon success and return 0, else return < 0 on error. Reported-by: Dan Carpenter <[email protected]> Tested-by: Babu Moger <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Fix incorrect parsing of iMC countersFenghua Yu1-2/+20
iMC (Integrated Memory Controller) counters are usually at "/sys/bus/event_source/devices/" and are named as "uncore_imc_<n>". num_of_imcs() function tries to count number of such iMC counters so that it could appropriately initialize required number of perf_attr structures that could be used to read these iMC counters. num_of_imcs() function assumes that all the directories under this path that start with "uncore_imc" are iMC counters. But, on some systems there could be directories named as "uncore_imc_free_running" which aren't iMC counters. Trying to read from such directories will result in "not found file" errors and MBM/MBA tests will fail. Hence, fix the logic in num_of_imcs() such that it looks at the first character after "uncore_imc_" to check if it's a numerical digit or not. If it's a digit then the directory represents an iMC counter, else, skip the directory. Reported-by: Reinette Chatre <[email protected]> Tested-by: Babu Moger <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Fix unmount resctrl FSFenghua Yu2-0/+5
umount_resctrlfs() directly attempts to unmount resctrl file system without checking if resctrl FS is already mounted or not. It returns 0 on success and on failure it prints an error message and returns an error status. Calling umount_resctrlfs() when resctrl FS isn't mounted will return an error status. There could be situations where-in the caller might not know if resctrl FS is already mounted or not and the caller might still want to unmount resctrl FS if it's already mounted (For example during teardown). To support above use cases, change umount_resctrlfs() such that it now first checks if resctrl FS is already mounted or not and unmounts resctrl FS only if it's already mounted. unmount resctrl FS upon exit. For example, running only mba test on a Broadwell (BDW) machine (MBA isn't supported on BDW CPU). This happens because validate_resctrl_feature_request() would mount resctrl FS to check if mba is enabled on the platform or not and finds that the H/W doesn't support mba and hence will return false to run_mba_test(). This in turn makes the main() function return without unmounting resctrl FS. Tested-by: Babu Moger <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Skip the test if requested resctrl feature is not supportedFenghua Yu4-9/+23
There could be two reasons why a resctrl feature might not be enabled on the platform 1. H/W might not support the feature 2. Even if the H/W supports it, the user might have disabled the feature through kernel command line arguments Hence, any resctrl unit test (like cmt, cat, mbm and mba) before starting the test will first check if the feature is enabled on the platform or not. If the feature isn't enabled, then the test returns with an error status. For example, if MBA isn't supported on a platform and if the user tries to run MBA, the output will look like this ok mounting resctrl to "/sys/fs/resctrl" not ok MBA: schemata change But, not supporting a feature isn't a test failure. So, instead of treating it as an error, use the SKIP directive of the TAP protocol. With the change, the output will look as below ok MBA # SKIP Hardware does not support MBA or MBA is disabled Suggested-by: Reinette Chatre <[email protected]> Tested-by: Babu Moger <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Modularize resctrl test suite main() functionFenghua Yu1-31/+57
Resctrl test suite main() function does the following things 1. Parses command line arguments passed by user 2. Some setup checks 3. Logic that calls into each unit test 4. Print result and clean up after running each unit test Introduce wrapper functions for steps 3 and 4 to modularize the main() function. Adding these wrapper functions makes it easier to add any logic to each individual test. Please note that this is a preparatory patch for the next one and no functional changes are intended. Suggested-by: Reinette Chatre <[email protected]> Tested-by: Babu Moger <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Don't hard code value of "no_of_bits" variableFenghua Yu2-3/+10
Cache related tests (like CAT and CMT) depend on a variable called no_of_bits to run. no_of_bits defines the number of contiguous bits that should be set in the CBM mask and a user can pass a value for no_of_bits using -n command line argument. If a user hasn't passed any value, it defaults to 5 (randomly chosen value). Hard coding no_of_bits to 5 will make the cache tests fail to run on systems that support maximum cbm mask that is less than or equal to 5 bits. Hence, don't hard code no_of_bits value. If a user passes a value for "no_of_bits" using -n option, use it. Otherwise, no_of_bits is equal to half of the maximum number of bits in the cbm mask. Please note that CMT test is still hard coded to 5 bits. It will change in subsequent patches that change CMT test. Tested-by: Babu Moger <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Fix MBA/MBM results reporting formatFenghua Yu2-16/+21
MBM unit test starts fill_buf (default built-in benchmark) in a new con_mon group (c1, m1) and records resctrl reported mbm values and iMC (Integrated Memory Controller) values every second. It does this for five seconds (randomly chosen value) in total. It then calculates average of resctrl_mbm values and imc_mbm values and if the difference is greater than 300 MB/sec (randomly chosen value), the test treats it as a failure. MBA unit test is similar to MBM but after every run it changes schemata. Checking for a difference of 300 MB/sec doesn't look very meaningful when the mbm values are changing over a wide range. For example, below are the values running MBA test on SKL with different allocations 1. With 10% as schemata both iMC and resctrl mbm_values are around 2000 MB/sec 2. With 100% as schemata both iMC and resctrl mbm_values are around 10000 MB/sec A 300 MB/sec difference between resctrl_mbm and imc_mbm values is acceptable at 100% schemata but it isn't acceptable at 10% schemata because that's a huge difference. So, fix this by checking for percentage difference instead of absolute difference i.e. check if the difference between resctrl_mbm value and imc_mbm value is within 5% (randomly chosen value) of imc_mbm value. If the difference is greater than 5% of imc_mbm value, treat it is a failure. Tested-by: Babu Moger <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Use resctrl/info for feature detectionFenghua Yu2-12/+46
Resctrl test suite before running any unit test (like cmt, cat, mbm and mba) should first check if the feature is enabled (by kernel and not just supported by H/W) on the platform or not. validate_resctrl_feature_request() is supposed to do that. This function intends to grep for relevant flags in /proc/cpuinfo but there are several issues here 1. validate_resctrl_feature_request() calls fgrep() to get flags from /proc/cpuinfo. But, fgrep() can only return a string with maximum of 255 characters and hence the complete cpu flags are never returned. 2. The substring search logic is also busted. If strstr() finds requested resctrl feature in the cpu flags, it returns pointer to the first occurrence. But, the logic negates the return value of strstr() and hence validate_resctrl_feature_request() returns false if the feature is present in the cpu flags and returns true if the feature is not present. 3. validate_resctrl_feature_request() checks if a resctrl feature is reported in /proc/cpuinfo flags or not. Having a cpu flag means that the H/W supports the feature, but it doesn't mean that the kernel enabled it. A user could selectively enable only a subset of resctrl features using kernel command line arguments. Hence, /proc/cpuinfo isn't a reliable source to check if a feature is enabled or not. The 3rd issue being the major one and fixing it requires changing the way validate_resctrl_feature_request() works. Since, /proc/cpuinfo isn't the right place to check if a resctrl feature is enabled or not, a more appropriate place is /sys/fs/resctrl/info directory. Change validate_resctrl_feature_request() such that, 1. For cat, check if /sys/fs/resctrl/info/L3 directory is present or not 2. For mba, check if /sys/fs/resctrl/info/MB directory is present or not 3. For cmt, check if /sys/fs/resctrl/info/L3_MON directory is present and check if /sys/fs/resctrl/info/L3_MON/mon_features has llc_occupancy 4. For mbm, check if /sys/fs/resctrl/info/L3_MON directory is present and check if /sys/fs/resctrl/info/L3_MON/mon_features has mbm_<total/local>_bytes Please note that only L3_CAT, L3_CMT, MBA and MBM are supported. CDP and L2 variants can be added later. Reported-by: Reinette Chatre <[email protected]> Tested-by: Babu Moger <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Check for resctrl mount point only if resctrl FS is supportedFenghua Yu1-0/+3
check_resctrlfs_support() does the following 1. Checks if the platform supports resctrl file system or not by looking for resctrl in /proc/filesystems 2. Calls opendir() on default resctrl file system path (i.e. /sys/fs/resctrl) 3. Checks if resctrl file system is mounted or not by looking at /proc/mounts Steps 2 and 3 will fail if the platform does not support resctrl file system. So, there is no need to check for them if step 1 fails. Fix this by returning immediately if the platform does not support resctrl file system. Tested-by: Babu Moger <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Add config dependenciesFenghua Yu1-0/+2
Add the config file for test dependencies. Suggested-by: Shuah Khan <[email protected]> Tested-by: Babu Moger <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Fix a printed messageReinette Chatre1-2/+2
Add a missing newline to the printed help text to improve readability. Tested-by: Babu Moger <[email protected]> Signed-off-by: Reinette Chatre <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Share show_cache_info() by CAT and CMT testsFenghua Yu4-55/+52
show_cache_info() functions are defined separately in CAT and CMT tests. But the functions are same for the tests and unnecessary to be defined separately. Share the function by the tests. Suggested-by: Shuah Khan <[email protected]> Tested-by: Babu Moger <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Call kselftest APIs to log test resultsFenghua Yu8-117/+105
Call kselftest APIs instead of using printf() to log test results for cleaner code and better future extension. Suggested-by: Shuah Khan <[email protected]> Tested-by: Babu Moger <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Rename CQM test as CMT testFenghua Yu7-41/+41
CMT (Cache Monitoring Technology) [1] is a H/W feature that reports cache occupancy of a process. resctrl selftest suite has a unit test to test CMT for LLC but the test is named as CQM (Cache Quality Monitoring). Furthermore, the unit test source file is named as cqm_test.c and several functions, variables, comments, preprocessors and statements widely use "cqm" as either suffix or prefix. This rampant misusage of CQM for CMT might confuse someone who is newly looking at resctrl selftests because this feature is named CMT in the Intel Software Developer's Manual. Hence, rename all the occurrences (unit test source file name, functions, variables, comments and preprocessors) of cqm with cmt. [1] Please see Intel SDM, Volume 3, chapter 17 and section 18 for more information on CMT: https://software.intel.com/content/www/us/en/develop/articles/intel-sdm.html Suggested-by: Reinette Chatre <[email protected]> Tested-by: Babu Moger <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Fix missing options "-n" and "-p"Fenghua Yu1-1/+1
resctrl test suite accepts command line arguments (like -b, -t, -n and -p) as documented in the help. But passing -n and -p throws an invalid option error. This happens because -n and -p are missing in the list of characters that getopt() recognizes as valid arguments. Hence, they are treated as invalid options. Fix this by adding them to the list of characters that getopt() recognizes as valid arguments. Please note that the main() function already has the logic to deal with the values passed as part of these arguments and hence no changes are needed there. Tested-by: Babu Moger <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Ensure sibling CPU is not same as original CPUReinette Chatre1-1/+1
The resctrl tests can accept a CPU on which the tests are run and use default of CPU #1 if it is not provided. In the CAT test a "sibling CPU" is determined that is from the same package where another thread will be run. The current algorithm with which a "sibling CPU" is determined does not take the provided/default CPU into account and when that CPU is the first CPU in a package then the "sibling CPU" will be selected to be the same CPU since it starts by picking the first CPU from core_siblings_list. Fix the "sibling CPU" selection by taking the provided/default CPU into account and ensuring a sibling that is a different CPU is selected. Tested-by: Babu Moger <[email protected]> Signed-off-by: Reinette Chatre <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Clean up resctrl features checkFenghua Yu10-35/+41
Checking resctrl features call strcmp() to compare feature strings (e.g. "mba", "cat" etc). The checkings are error prone and don't have good coding style. Define the constant strings in macros and call strncmp() to solve the potential issues. Suggested-by: Shuah Khan <[email protected]> Tested-by: Babu Moger <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Fix compilation issues for other global variablesFenghua Yu1-4/+4
Reinette reported following compilation issue on Fedora 32, gcc version 10.1.1 /usr/bin/ld: resctrl_tests.o:<src_dir>/resctrl.h:65: multiple definition of `bm_pid'; cache.o:<src_dir>/resctrl.h:65: first defined here Other variables are ppid, tests_run, llc_occup_path, is_amd. Compiler isn't happy because these variables are defined globally in two .c files but are not declared as extern. To fix issues for the global variables, declare them as extern. Chang Log: - Split this patch from v4's patch 1 (Shuah). Reported-by: Reinette Chatre <[email protected]> Tested-by: Babu Moger <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Fix compilation issues for global variablesFenghua Yu4-16/+16
Reinette reported following compilation issue on Fedora 32, gcc version 10.1.1 /usr/bin/ld: cqm_test.o:<src_dir>/cqm_test.c:22: multiple definition of `cache_size'; cat_test.o:<src_dir>/cat_test.c:23: first defined here The same issue is reported for long_mask, cbm_mask, count_of_bits etc variables as well. Compiler isn't happy because these variables are defined globally in two .c files namely cqm_test.c and cat_test.c and the compiler during compilation finds that the variable is already defined (multiple definition error). Taking a closer look at the usage of these variables reveals that these variables are used only locally in functions such as cqm_resctrl_val() (defined in cqm_test.c) and cat_perf_miss_val() (defined in cat_test.c). These variables are not shared between those functions. So, there is no need for these variables to be global. Hence, fix this issue by making them static variables. Reported-by: Reinette Chatre <[email protected]> Tested-by: Babu Moger <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02selftests/resctrl: Enable gcc checks to detect buffer overflowsFenghua Yu2-2/+2
David reported a buffer overflow error in the check_results() function of the cmt unit test and he suggested enabling _FORTIFY_SOURCE gcc compiler option to automatically detect any such errors. Feature Test Macros man page describes_FORTIFY_SOURCE as below "Defining this macro causes some lightweight checks to be performed to detect some buffer overflow errors when employing various string and memory manipulation functions (for example, memcpy, memset, stpcpy, strcpy, strncpy, strcat, strncat, sprintf, snprintf, vsprintf, vsnprintf, gets, and wide character variants thereof). For some functions, argument consistency is checked; for example, a check is made that open has been supplied with a mode argument when the specified flags include O_CREAT. Not all problems are detected, just some common cases. If _FORTIFY_SOURCE is set to 1, with compiler optimization level 1 (gcc -O1) and above, checks that shouldn't change the behavior of conforming programs are performed. With _FORTIFY_SOURCE set to 2, some more checking is added, but some conforming programs might fail. Some of the checks can be performed at compile time (via macros logic implemented in header files), and result in compiler warnings; other checks take place at run time, and result in a run-time error if the check fails. Use of this macro requires compiler support, available with gcc since version 4.0." Fix the buffer overflow error in the check_results() function of the cmt unit test and enable _FORTIFY_SOURCE gcc check to catch any future buffer overflow errors. Reported-by: David Binderman <[email protected]> Suggested-by: David Binderman <[email protected]> Tested-by: Babu Moger <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2021-04-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller28-668/+1496
Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-04-01 The following pull-request contains BPF updates for your *net-next* tree. We've added 68 non-merge commits during the last 7 day(s) which contain a total of 70 files changed, 2944 insertions(+), 1139 deletions(-). The main changes are: 1) UDP support for sockmap, from Cong. 2) Verifier merge conflict resolution fix, from Daniel. 3) xsk selftests enhancements, from Maciej. 4) Unstable helpers aka kernel func calling, from Martin. 5) Batches ops for LPM map, from Pedro. 6) Fix race in bpf_get_local_storage, from Yonghong. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-04-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller4-21/+101
Alexei Starovoitov says: ==================== pull-request: bpf 2021-04-01 The following pull-request contains BPF updates for your *net* tree. We've added 11 non-merge commits during the last 8 day(s) which contain a total of 10 files changed, 151 insertions(+), 26 deletions(-). The main changes are: 1) xsk creation fixes, from Ciara. 2) bpf_get_task_stack fix, from Dave. 3) trampoline in modules fix, from Jiri. 4) bpf_obj_get fix for links and progs, from Lorenz. 5) struct_ops progs must be gpl compatible fix, from Toke. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-04-02objtool/x86: Rewrite retpoline thunk callsPeter Zijlstra1-0/+117
When the compiler emits: "CALL __x86_indirect_thunk_\reg" for an indirect call, have objtool rewrite it to: ALTERNATIVE "call __x86_indirect_thunk_\reg", "call *%reg", ALT_NOT(X86_FEATURE_RETPOLINE) Additionally, in order to not emit endless identical .altinst_replacement chunks, use a global symbol for them, see __x86_indirect_alt_*. This also avoids objtool from having to do code generation. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Miroslav Benes <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-04-02objtool: Skip magical retpoline .altinstr_replacementPeter Zijlstra1-1/+11
When the .altinstr_replacement is a retpoline, skip the alternative. We already special case retpolines anyway. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Miroslav Benes <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-04-02objtool: Cache instruction relocsPeter Zijlstra2-6/+23
Track the reloc of instructions in the new instruction->reloc field to avoid having to look them up again later. ( Technically x86 instructions can have two relocations, but not jumps and calls, for which we're using this. ) Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Miroslav Benes <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-04-02objtool: Keep track of retpoline call sitesPeter Zijlstra5-6/+34
Provide infrastructure for architectures to rewrite/augment compiler generated retpoline calls. Similar to what we do for static_call()s, keep track of the instructions that are retpoline calls. Use the same list_head, since a retpoline call cannot also be a static_call. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Miroslav Benes <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-04-02objtool: Add elf_create_undef_symbol()Peter Zijlstra2-0/+61
Allow objtool to create undefined symbols; this allows creating relocations to symbols not currently in the symbol table. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Miroslav Benes <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-04-02objtool: Extract elf_symbol_add()Peter Zijlstra1-25/+31
Create a common helper to add symbols. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Miroslav Benes <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-04-02objtool: Extract elf_strtab_concat()Peter Zijlstra1-22/+38
Create a common helper to append strings to a strtab. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Miroslav Benes <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-04-02objtool: Create reloc sections implicitlyPeter Zijlstra4-10/+8
Have elf_add_reloc() create the relocation section implicitly. Suggested-by: Josh Poimboeuf <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Miroslav Benes <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-04-02objtool: Add elf_create_reloc() helperPeter Zijlstra4-119/+85
We have 4 instances of adding a relocation. Create a common helper to avoid growing even more. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Miroslav Benes <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-04-02objtool: Rework the elf_rebuild_reloc_section() logicPeter Zijlstra4-16/+14
Instead of manually calling elf_rebuild_reloc_section() on sections we've called elf_add_reloc() on, have elf_write() DTRT. This makes it easier to add random relocations in places without carefully tracking when we're done and need to flush what section. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Miroslav Benes <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-04-02objtool: Fix static_call list generationPeter Zijlstra1-5/+12
Currently, objtool generates tail call entries in add_jump_destination() but waits until validate_branch() to generate the regular call entries. Move these to add_call_destination() for consistency. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Miroslav Benes <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-04-02objtool: Handle per arch retpoline namingPeter Zijlstra3-2/+14
The __x86_indirect_ naming is obviously not generic. Shorten to allow matching some additional magic names later. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Miroslav Benes <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-04-02objtool: Correctly handle retpoline thunk callsPeter Zijlstra1-0/+12
Just like JMP handling, convert a direct CALL to a retpoline thunk into a retpoline safe indirect CALL. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Reviewed-by: Miroslav Benes <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-04-02x86/retpoline: Simplify retpolinesPeter Zijlstra1-2/+1
Due to: c9c324dc22aa ("objtool: Support stack layout changes in alternatives") it is now possible to simplify the retpolines. Currently our retpolines consist of 2 symbols: - __x86_indirect_thunk_\reg: the compiler target - __x86_retpoline_\reg: the actual retpoline. Both are consecutive in code and aligned such that for any one register they both live in the same cacheline: 0000000000000000 <__x86_indirect_thunk_rax>: 0: ff e0 jmpq *%rax 2: 90 nop 3: 90 nop 4: 90 nop 0000000000000005 <__x86_retpoline_rax>: 5: e8 07 00 00 00 callq 11 <__x86_retpoline_rax+0xc> a: f3 90 pause c: 0f ae e8 lfence f: eb f9 jmp a <__x86_retpoline_rax+0x5> 11: 48 89 04 24 mov %rax,(%rsp) 15: c3 retq 16: 66 2e 0f 1f 84 00 00 00 00 00 nopw %cs:0x0(%rax,%rax,1) The thunk is an alternative_2, where one option is a JMP to the retpoline. This was done so that objtool didn't need to deal with alternatives with stack ops. But that problem has been solved, so now it is possible to fold the entire retpoline into the alternative to simplify and consolidate unused bytes: 0000000000000000 <__x86_indirect_thunk_rax>: 0: ff e0 jmpq *%rax 2: 90 nop 3: 90 nop 4: 90 nop 5: 90 nop 6: 90 nop 7: 90 nop 8: 90 nop 9: 90 nop a: 90 nop b: 90 nop c: 90 nop d: 90 nop e: 90 nop f: 90 nop 10: 90 nop 11: 66 66 2e 0f 1f 84 00 00 00 00 00 data16 nopw %cs:0x0(%rax,%rax,1) 1c: 0f 1f 40 00 nopl 0x0(%rax) Notice that since the longest alternative sequence is now: 0: e8 07 00 00 00 callq c <.altinstr_replacement+0xc> 5: f3 90 pause 7: 0f ae e8 lfence a: eb f9 jmp 5 <.altinstr_replacement+0x5> c: 48 89 04 24 mov %rax,(%rsp) 10: c3 retq 17 bytes, we have 15 bytes NOP at the end of our 32 byte slot. (IOW, if we can shrink the retpoline by 1 byte we can pack it more densely). [ bp: Massage commit message. ] Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-04-02x86/alternatives: Optimize optimize_nops()Peter Zijlstra1-1/+1
Currently, optimize_nops() scans to see if the alternative starts with NOPs. However, the emit pattern is: 141: \oldinstr 142: .skip (len-(142b-141b)), 0x90 That is, when 'oldinstr' is short, the tail is padded with NOPs. This case never gets optimized. Rewrite optimize_nops() to replace any trailing string of NOPs inside the alternative to larger NOPs. Also run it irrespective of patching, replacing NOPs in both the original and replaced code. A direct consequence is that 'padlen' becomes superfluous, so remove it. [ bp: - Adjust commit message - remove a stale comment about needing to pad - add a comment in optimize_nops() - exit early if the NOP verif. loop catches a mismatch - function should not not add NOPs in that case - fix the "optimized NOPs" offsets output ] Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2021-04-02Merge branch 'x86/cpu' into WIP.x86/core, to merge the NOP changes & resolve ↵Ingo Molnar3-5/+90
a semantic conflict Conflict-merge this main commit in essence: a89dfde3dc3c: ("x86: Remove dynamic NOP selection") With this upstream commit: b90829704780: ("bpf: Use NOP_ATOMIC5 instead of emit_nops(&prog, 5) for BPF_TRAMP_F_CALL_ORIG") Semantic merge conflict: arch/x86/net/bpf_jit_comp.c - memcpy(prog, ideal_nops[NOP_ATOMIC5], X86_PATCH_SIZE); + memcpy(prog, x86_nops[5], X86_PATCH_SIZE); Signed-off-by: Ingo Molnar <[email protected]>
2021-04-02Merge tag 'v5.12-rc5' into WIP.x86/core, to pick up recent NOP related changesIngo Molnar40-101/+1021
In particular we want to have this upstream commit: b90829704780: ("bpf: Use NOP_ATOMIC5 instead of emit_nops(&prog, 5) for BPF_TRAMP_F_CALL_ORIG") ... before merging in x86/cpu changes and the removal of the NOP optimizations, and applying PeterZ's !retpoline objtool series. Signed-off-by: Ingo Molnar <[email protected]>
2021-04-01libbpf: Only create rx and tx XDP rings when necessaryCiara Loftus1-2/+11
Prior to this commit xsk_socket__create(_shared) always attempted to create the rx and tx rings for the socket. However this causes an issue when the socket being setup is that which shares the fd with the UMEM. If a previous call to this function failed with this socket after the rings were set up, a subsequent call would always fail because the rings are not torn down after the first call and when we try to set them up again we encounter an error because they already exist. Solve this by remembering whether the rings were set up by introducing new bools to struct xsk_umem which represent the ring setup status and using them to determine whether or not to set up the rings. Fixes: 1cad07884239 ("libbpf: add support for using AF_XDP sockets") Signed-off-by: Ciara Loftus <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-04-01libbpf: Restore umem state after socket create failureCiara Loftus1-18/+23
If the call to xsk_socket__create fails, the user may want to retry the socket creation using the same umem. Ensure that the umem is in the same state on exit if the call fails by: 1. ensuring the umem _save pointers are unmodified. 2. not unmapping the set of umem rings that were set up with the umem during xsk_umem__create, since those maps existed before the call to xsk_socket__create and should remain in tact even in the event of failure. Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices") Signed-off-by: Ciara Loftus <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-04-01libbpf: Ensure umem pointer is non-NULL before dereferencingCiara Loftus1-0/+3
Calls to xsk_socket__create dereference the umem to access the fill_save and comp_save pointers. Make sure the umem is non-NULL before doing this. Fixes: 2f6324a3937f ("libbpf: Support shared umems between queues and devices") Signed-off-by: Ciara Loftus <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Magnus Karlsson <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-04-01Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds3-7/+17
Pull kvm fixes from Paolo Bonzini: "It's a bit larger than I (and probably you) would like by the time we get to -rc6, but perhaps not entirely unexpected since the changes in the last merge window were larger than usual. x86: - Fixes for missing TLB flushes with TDP MMU - Fixes for race conditions in nested SVM - Fixes for lockdep splat with Xen emulation - Fix for kvmclock underflow - Fix srcdir != builddir builds - Other small cleanups ARM: - Fix GICv3 MMIO compatibility probing - Prevent guests from using the ARMv8.4 self-hosted tracing extension" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: selftests: kvm: Check that TSC page value is small after KVM_SET_CLOCK(0) KVM: x86: Prevent 'hv_clock->system_time' from going negative in kvm_guest_time_update() KVM: x86: disable interrupts while pvclock_gtod_sync_lock is taken KVM: x86: reduce pvclock_gtod_sync_lock critical sections KVM: SVM: ensure that EFER.SVME is set when running nested guest or on nested vmexit KVM: SVM: load control fields from VMCB12 before checking them KVM: x86/mmu: Don't allow TDP MMU to yield when recovering NX pages KVM: x86/mmu: Ensure TLBs are flushed for TDP MMU during NX zapping KVM: x86/mmu: Ensure TLBs are flushed when yielding during GFN range zap KVM: make: Fix out-of-source module builds selftests: kvm: make hardware_disable_test less verbose KVM: x86/vPMU: Forbid writing to MSR_F15H_PERF MSRs when guest doesn't have X86_FEATURE_PERFCTR_CORE KVM: x86: remove unused declaration of kvm_write_tsc() KVM: clean up the unused argument tools/kvm_stat: Add restart delay KVM: arm64: Fix CPU interface MMIO compatibility detection KVM: arm64: Disable guest access to trace filter controls KVM: arm64: Hide system instruction access to Trace registers
2021-04-01selftests/bpf: Add a test case for loading BPF_SK_SKB_VERDICTCong Wang2-0/+58
This adds a test case to ensure BPF_SK_SKB_VERDICT and BPF_SK_STREAM_VERDICT will never be attached at the same time. Signed-off-by: Cong Wang <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-04-01selftests/bpf: Add a test case for udp sockmapCong Wang2-0/+158
Add a test case to ensure redirection between two UDP sockets work. Signed-off-by: Cong Wang <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-04-01sock_map: Introduce BPF_SK_SKB_VERDICTCong Wang3-0/+3
Reusing BPF_SK_SKB_STREAM_VERDICT is possible but its name is confusing and more importantly we still want to distinguish them from user-space. So we can just reuse the stream verdict code but introduce a new type of eBPF program, skb_verdict. Users are not allowed to attach stream_verdict and skb_verdict programs to the same map. Signed-off-by: Cong Wang <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]