aboutsummaryrefslogtreecommitdiff
path: root/tools/lib/find_bit.c
AgeCommit message (Collapse)AuthorFilesLines
2022-01-15tools: sync tools/bitmap with mother linuxYury Norov1-0/+20
Remove tools/include/asm-generic/bitops/find.h and copy include/linux/bitmap.h to tools. find_*_le() functions are not copied because not needed in tools. Signed-off-by: Yury Norov <[email protected]> Tested-by: Wolfram Sang <[email protected]>
2021-05-06tools: sync lib/find_bit implementationYury Norov1-2/+2
Add fast paths to find_*_bit() functions as per kernel implementation. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yury Norov <[email protected]> Acked-by: Rasmus Villemoes <[email protected]> Cc: Alexey Klimov <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: David Sterba <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Jianpeng Ma <[email protected]> Cc: Joe Perches <[email protected]> Cc: John Paul Adrian Glaubitz <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Rich Felker <[email protected]> Cc: Stefano Brivio <[email protected]> Cc: Wei Yang <[email protected]> Cc: Wolfram Sang <[email protected]> Cc: Yoshinori Sato <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-05-06tools: sync find_next_bit implementationYury Norov1-31/+21
Sync the implementation with recent kernel changes. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yury Norov <[email protected]> Acked-by: Rasmus Villemoes <[email protected]> Cc: Alexey Klimov <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: David Sterba <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Jianpeng Ma <[email protected]> Cc: Joe Perches <[email protected]> Cc: John Paul Adrian Glaubitz <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Rich Felker <[email protected]> Cc: Stefano Brivio <[email protected]> Cc: Wei Yang <[email protected]> Cc: Wolfram Sang <[email protected]> Cc: Yoshinori Sato <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-05-30treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152Thomas Gleixner1-5/+1
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Allison Randal <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2018-02-06lib: optimize cpumask_next_and()Clement Courbet1-10/+29
We've measured that we spend ~0.6% of sys cpu time in cpumask_next_and(). It's essentially a joined iteration in search for a non-zero bit, which is currently implemented as a lookup join (find a nonzero bit on the lhs, lookup the rhs to see if it's set there). Implement a direct join (find a nonzero bit on the incrementally built join). Also add generic bitmap benchmarks in the new `test_find_bit` module for new function (see `find_next_and_bit` in [2] and [3] below). For cpumask_next_and, direct benchmarking shows that it's 1.17x to 14x faster with a geometric mean of 2.1 on 32 CPUs [1]. No impact on memory usage. Note that on Arm, the new pure-C implementation still outperforms the old one that uses a mix of C and asm (`find_next_bit`) [3]. [1] Approximate benchmark code: ``` unsigned long src1p[nr_cpumask_longs] = {pattern1}; unsigned long src2p[nr_cpumask_longs] = {pattern2}; for (/*a bunch of repetitions*/) { for (int n = -1; n <= nr_cpu_ids; ++n) { asm volatile("" : "+rm"(src1p)); // prevent any optimization asm volatile("" : "+rm"(src2p)); unsigned long result = cpumask_next_and(n, src1p, src2p); asm volatile("" : "+rm"(result)); } } ``` Results: pattern1 pattern2 time_before/time_after 0x0000ffff 0x0000ffff 1.65 0x0000ffff 0x00005555 2.24 0x0000ffff 0x00001111 2.94 0x0000ffff 0x00000000 14.0 0x00005555 0x0000ffff 1.67 0x00005555 0x00005555 1.71 0x00005555 0x00001111 1.90 0x00005555 0x00000000 6.58 0x00001111 0x0000ffff 1.46 0x00001111 0x00005555 1.49 0x00001111 0x00001111 1.45 0x00001111 0x00000000 3.10 0x00000000 0x0000ffff 1.18 0x00000000 0x00005555 1.18 0x00000000 0x00001111 1.17 0x00000000 0x00000000 1.25 ----------------------------- geo.mean 2.06 [2] test_find_next_bit, X86 (skylake) [ 3913.477422] Start testing find_bit() with random-filled bitmap [ 3913.477847] find_next_bit: 160868 cycles, 16484 iterations [ 3913.477933] find_next_zero_bit: 169542 cycles, 16285 iterations [ 3913.478036] find_last_bit: 201638 cycles, 16483 iterations [ 3913.480214] find_first_bit: 4353244 cycles, 16484 iterations [ 3913.480216] Start testing find_next_and_bit() with random-filled bitmap [ 3913.481074] find_next_and_bit: 89604 cycles, 8216 iterations [ 3913.481075] Start testing find_bit() with sparse bitmap [ 3913.481078] find_next_bit: 2536 cycles, 66 iterations [ 3913.481252] find_next_zero_bit: 344404 cycles, 32703 iterations [ 3913.481255] find_last_bit: 2006 cycles, 66 iterations [ 3913.481265] find_first_bit: 17488 cycles, 66 iterations [ 3913.481266] Start testing find_next_and_bit() with sparse bitmap [ 3913.481272] find_next_and_bit: 764 cycles, 1 iterations [3] test_find_next_bit, arm (v7 odroid XU3). [ 267.206928] Start testing find_bit() with random-filled bitmap [ 267.214752] find_next_bit: 4474 cycles, 16419 iterations [ 267.221850] find_next_zero_bit: 5976 cycles, 16350 iterations [ 267.229294] find_last_bit: 4209 cycles, 16419 iterations [ 267.279131] find_first_bit: 1032991 cycles, 16420 iterations [ 267.286265] Start testing find_next_and_bit() with random-filled bitmap [ 267.302386] find_next_and_bit: 2290 cycles, 8140 iterations [ 267.309422] Start testing find_bit() with sparse bitmap [ 267.316054] find_next_bit: 191 cycles, 66 iterations [ 267.322726] find_next_zero_bit: 8758 cycles, 32703 iterations [ 267.329803] find_last_bit: 84 cycles, 66 iterations [ 267.336169] find_first_bit: 4118 cycles, 66 iterations [ 267.342627] Start testing find_next_and_bit() with sparse bitmap [ 267.356919] find_next_and_bit: 91 cycles, 1 iterations [[email protected]: v6] Link: http://lkml.kernel.org/r/[email protected] [[email protected]: m68k/bitops: always include <asm-generic/bitops/find.h>] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Clement Courbet <[email protected]> Signed-off-by: Geert Uytterhoeven <[email protected]> Cc: Yury Norov <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Rasmus Villemoes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-02-24lib/find_bit.c: micro-optimise find_next_*_bitMatthew Wilcox1-1/+1
This saves 32 bytes on my x86-64 build, mostly due to alignment considerations and sharing more code between find_next_bit and find_next_zero_bit, but it does save a couple of instructions. There's really two parts to this commit: - First, the first half of the test: (!nbits || start >= nbits) is trivially a subset of the second half, since nbits and start are both unsigned - Second, while looking at the disassembly, I noticed that GCC was predicting the branch taken. Since this is a failure case, it's clearly the less likely of the two branches, so add an unlikely() to override GCC's heuristics. [[email protected]: v2] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox <[email protected]> Acked-by: Yury Norov <[email protected]> Acked-by: Rasmus Villemoes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-10-24tools lib: Add for_each_clear_bit macroJiri Olsa1-0/+25
Adding for_each_clear_bit macro plus all its the necessary backbone functions. Taken from related kernel code. It will be used in following patch. Signed-off-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-08tools lib: Sync tools/lib/find_bit.c with the kernelArnaldo Carvalho de Melo1-54/+49
Need to move the bitmap.[ch] things from tools/perf/ to tools/lib, will be done in the next patches. Cc: Adrian Hunter <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: David Ahern <[email protected]> Cc: George Spelvin <[email protected] Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Rasmus Villemoes <[email protected]> Cc: Wang Nan <[email protected]> Cc: Yury Norov <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-08tools lib: Move find_next_bit.c to tools/lib/Arnaldo Carvalho de Melo1-0/+89
The commit that introduced it should've moved it to the same place, plus the 'tools/' prefix, but instead moved it to a bogus tools/lib/util/ directory, being the only file there. Move it to tools/lib/find_bit.c, picking the name for the file where these routines live since: 8f6f19dd5143 ("lib: move find_last_bit to lib/find_next_bit.c") Next step is to make tools/lib/find_bit.c to differ from lib/find_bit.c just in removing what is not used by tools/. Cc: Adrian Hunter <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: David Ahern <[email protected]> Cc: George Spelvin <[email protected] Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Rasmus Villemoes <[email protected]> Cc: Wang Nan <[email protected]> Cc: Yury Norov <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>