diff options
| author | Alexei Starovoitov <[email protected]> | 2023-12-05 13:40:21 -0800 |
|---|---|---|
| committer | Alexei Starovoitov <[email protected]> | 2023-12-05 13:41:34 -0800 |
| commit | 3aee2bf9c49be2144460d7267560232e3d45d367 (patch) | |
| tree | fa3c0247846db9c119d29a1b24960c7674188d61 /include/linux | |
| parent | 5ffb260f754bf838507fe0c23d05254b33e2bf3d (diff) | |
| parent | 064e0bea19b356c5d5f48a4549d80a3c03ce898b (diff) | |
Merge branch 'complete-bpf-verifier-precision-tracking-support-for-register-spills'
Andrii Nakryiko says:
====================
Complete BPF verifier precision tracking support for register spills
Add support to BPF verifier to track and support register spill/fill to/from
stack regardless if it was done through read-only R10 register (which is the
only form supported today), or through a general register after copying R10
into it, while also potentially modifying offset.
Once we add register this generic spill/fill support to precision
backtracking, we can take advantage of it to stop doing eager STACK_ZERO
conversion on register spill. Instead we can rely on (im)precision of spilled
const zero register to improve verifier state pruning efficiency. This
situation of using const zero register to initialize stack slots is very
common with __builtin_memset() usage or just zero-initializing variables on
the stack, and it causes unnecessary state duplication, as that STACK_ZERO
knowledge is often not necessary for correctness, as those zero values are
never used in precise context. Thus, relying on register imprecision helps
tremendously, especially in real-world BPF programs.
To make spilled const zero register behave completely equivalently to
STACK_ZERO, we need to improve few other small pieces, which is done in the
second part of the patch set. See individual patches for details. There are
also two small bug fixes spotted during STACK_ZERO debugging.
The patch set consists of logically three changes:
- patch #1 (and corresponding tests in patch #2) is fixing/impoving precision
propagation for stack spills/fills. This can be landed as a stand-alone
improvement;
- patches #3 through #9 is improving verification scalability by utilizing
register (im)precision instead of eager STACK_ZERO. These changes depend
on patch #1.
- patch #10 is a memory efficiency improvement to how instruction/jump
history is tracked and maintained. It depends on patch #1, but is not
strictly speaking required, even though I believe it's a good long-term
solution to have a path-dependent per-instruction information. Kind
of like a path-dependent counterpart to path-agnostic insn_aux array.
v3->v3:
- fixed up Fixes tag (Alexei);
- fixed few more selftests to not use BPF_ST instruction in inline asm
directly, checked with CI, it was happy (CI);
v2->v3:
- BPF_ST instruction workaround (Eduard);
- force dereference in added tests to catch problems (Eduard);
- some commit message massaging (Alexei);
v1->v2:
- clean ups, WARN_ONCE(), insn_flags helpers added (Eduard);
- added more selftests for STACK_ZERO/STACK_MISC cases (Eduard);
- a bit more detailed explanation of effect of avoiding STACK_ZERO in favor
of register spill in patch #8 commit (Alexei);
- global shared instruction history refactoring moved to be the last patch
in the series to make it easier to revert it, if applied (Alexei).
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
Diffstat (limited to 'include/linux')
| -rw-r--r-- | include/linux/bpf_verifier.h | 31 |
1 files changed, 27 insertions, 4 deletions
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 3378cc753061..bada59812e00 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -325,12 +325,34 @@ struct bpf_func_state { int allocated_stack; }; -struct bpf_idx_pair { - u32 prev_idx; +#define MAX_CALL_FRAMES 8 + +/* instruction history flags, used in bpf_jmp_history_entry.flags field */ +enum { + /* instruction references stack slot through PTR_TO_STACK register; + * we also store stack's frame number in lower 3 bits (MAX_CALL_FRAMES is 8) + * and accessed stack slot's index in next 6 bits (MAX_BPF_STACK is 512, + * 8 bytes per slot, so slot index (spi) is [0, 63]) + */ + INSN_F_FRAMENO_MASK = 0x7, /* 3 bits */ + + INSN_F_SPI_MASK = 0x3f, /* 6 bits */ + INSN_F_SPI_SHIFT = 3, /* shifted 3 bits to the left */ + + INSN_F_STACK_ACCESS = BIT(9), /* we need 10 bits total */ +}; + +static_assert(INSN_F_FRAMENO_MASK + 1 >= MAX_CALL_FRAMES); +static_assert(INSN_F_SPI_MASK + 1 >= MAX_BPF_STACK / 8); + +struct bpf_jmp_history_entry { u32 idx; + /* insn idx can't be bigger than 1 million */ + u32 prev_idx : 22; + /* special flags, e.g., whether insn is doing register stack spill/load */ + u32 flags : 10; }; -#define MAX_CALL_FRAMES 8 /* Maximum number of register states that can exist at once */ #define BPF_ID_MAP_SIZE ((MAX_BPF_REG + MAX_BPF_STACK / BPF_REG_SIZE) * MAX_CALL_FRAMES) struct bpf_verifier_state { @@ -413,7 +435,7 @@ struct bpf_verifier_state { * For most states jmp_history_cnt is [0-3]. * For loops can go up to ~40. */ - struct bpf_idx_pair *jmp_history; + struct bpf_jmp_history_entry *jmp_history; u32 jmp_history_cnt; u32 dfs_depth; u32 callback_unroll_depth; @@ -656,6 +678,7 @@ struct bpf_verifier_env { int cur_stack; } cfg; struct backtrack_state bt; + struct bpf_jmp_history_entry *cur_hist_ent; u32 pass_cnt; /* number of times do_check() was called */ u32 subprog_cnt; /* number of instructions analyzed by the verifier */ |