aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-07-21net: phylink: ensure link is down when changing interfaceRussell King1-1/+11
The only PHYs that are used with phylink which change their interface are the BCM84881 and MV88X3310 family, both of which only change their interface modes on link-up events. However, rather than relying upon this behaviour by the PHY, we should give a stronger guarantee when resolving that the link will be down whenever we change the interface mode. This patch implements that stronger guarantee for resolve. Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Russell King <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-21net: phylink: rearrange resolve mac_config() callRussell King1-13/+8
Use a boolean to indicate whether mac_config() should be called during a resolution. This allows resolution to have a single location where mac_config() will be called, which will allow us to make decisions about how and what we do. Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Russell King <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-21net: phylink: rejig link state trackingRussell King1-7/+7
Rejig the link state tracking, so that we can use the current state in a future patch. Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Russell King <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-21net: phylink: update ethtool reporting for fixed-link modesRussell King1-0/+2
Comparing the ethtool output from phylink and non-phylink fixed-link setups shows that we have some differences: - The "auto-negotiation" fields are different; phylink reports these as "No", non-phylink reports these as "Yes" for the supported and advertising masks. - The link partner advertisement is set to the link speed with non- phylink, but phylink leaves this unset, causing all link partner fields to be omitted. The phylink ethtool output also disagrees with the software emulated PHY dump via the MII registers. Update the phylink fixed-link parsing code so that we better reflect the behaviour of the non-phylink code that this facility replaces, and bring the ethtool interface more into line with the report from via the MII interface. Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: Russell King <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-21Merge branch 'enetc-Add-adaptive-interrupt-coalescing'David S. Miller5-44/+257
Claudiu Manoil says: ==================== enetc: Add adaptive interrupt coalescing Apart from some related cleanup patches, this set introduces in a straightforward way the support needed to enable and configure interrupt coalescing for ENETC. Patch 5 introduces the support needed for configuring the interrupt coalescing parameters and for switching between moderated (int. coalescing) and per-packet interrupt modes. When interrupt coalescing is enabled the Rx/Tx time thresholds are configurable, packet thresholds are fixed. To make this work reliably, patch 5 uses the traffic pause procedure introduced in patch 2. Patch 6 adds DIM (Dynamic Interrupt Moderation) to implement adaptive coalescing based on time thresholds, for the Rx 'channel'. On the Tx side a default optimal value is used instead, optimized for TCP traffic over 1G and 2.5G links. This default 'optimal' value can be overridden anytime via 'ethtool -C tx-usecs'. netperf -t TCP_MAERTS measurements show a significant CPU load reduction correlated w/ reduced interrupt rates. For the measurement results refer to the comments in patch 6. v2: Replaced Tx DIM with predefined optimal value, giving better results. This was also suggested by Jakub (cc). Switched order of patches 4 and 5, for better grouping. v3: minor cleanup/improvements ==================== Signed-off-by: David S. Miller <[email protected]>
2020-07-21enetc: Add adaptive interrupt coalescingClaudiu Manoil4-7/+73
Use the generic dynamic interrupt moderation (dim) framework to implement adaptive interrupt coalescing on Rx. With the per-packet interrupt scheme, a high interrupt rate has been noted for moderate traffic flows leading to high CPU utilization. The 'dim' scheme implemented by the current patch addresses this issue improving CPU utilization while using minimal coalescing time thresholds in order to preserve a good latency. On the Tx side use an optimal time threshold value by default. This value has been optimized for Tx TCP streams at a rate of around 85kpps on a 1G link, at which rate half of the Tx ring size (128) gets filled in 1500 usecs. Scaling this down to 2.5G links yields the current value of 600 usecs, which is conservative and gives good enough results for 1G links too (see next). Below are some measurement results for before and after this patch (and related dependencies) basically, for a 2 ARM Cortex-A72 @1.3Ghz CPUs system (32 KB L1 data cache), using 60secs log netperf TCP stream tests @ 1Gbit link (maximum throughput): 1) 1 Rx TCP flow, both Rx and Tx processed by the same NAPI thread on the same CPU: CPU utilization int rate (ints/sec) Before: 50%-60% (over 50%) 92k After: 13%-22% 3.5k-12k Comment: Major CPU utilization improvement for a single flow Rx TCP flow (i.e. netperf -t TCP_MAERTS) on a single CPU. Usually settles under 16% for longer tests. 2) 4 Rx TCP flows + 4 Tx TCP flows (+ pings to check the latency): Total CPU utilization Total int rate (ints/sec) Before: ~80% (spikes to 90%) ~100k After: 60% (more steady) ~4k Comment: Important improvement for this load test, while the ping test outcome does not show any notable difference compared to before. Signed-off-by: Claudiu Manoil <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-21enetc: Add interrupt coalescing supportClaudiu Manoil4-10/+132
Enable programming of the interrupt coalescing registers and allow manual configuration of the coalescing time thresholds via ethtool. Packet thresholds have been fixed to predetermined values as there's no point in making them run-time configurable, also anticipating the dynamic interrupt moderation (DIM) algorithm which uses fixed packet thresholds as well. If the interface is up when the operation mode of traffic interrupt events is changed by the user (i.e. switching from default per-packet interrupts to coalesced interrupts), the traffic needs to be paused in the process. This patch also prepares the ground for introducing DIM on Rx. Signed-off-by: Claudiu Manoil <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-21enetc: Drop redundant ____cacheline_aligned_in_smpClaudiu Manoil1-1/+1
'struct enetc_bdr' is already '____cacheline_aligned_in_smp'. Signed-off-by: Claudiu Manoil <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-21enetc: Fix interrupt coalescing register namingClaudiu Manoil3-7/+7
Interrupt coalescing registers naming in the current revision of the Ref Man (RM) is ICR, deprecating the ICIR name used in earlier (draft) versions of the RM. Signed-off-by: Claudiu Manoil <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-21enetc: Factor out the traffic start/stop proceduresClaudiu Manoil1-25/+49
A reliable traffic pause (and reconfiguration) procedure is needed to be able to safely make h/w configuration changes during run-time, like changing the mode in which the interrupts are operating (i.e. with or without coalescing), as opposed to making on-the-fly register updates that may be subject to h/w or s/w concurrency issues. To this end, the code responsible of the run-time device configurations that basically starts resp. stops the traffic flow through the device has been extracted from the the enetc_open/_close procedures, to the separate standalone enetc_start/_stop procedures. Traffic stop should be as graceful as possible, it lets the executing napi threads to to finish while the interrupts stay disabled. But since the napi thread will try to re-enable interrupts by clearing the device's unmask register, the enable_irq/ disable_irq API has been used to avoid this potential concurrency issue and make the traffic pause procedure more reliable. Signed-off-by: Claudiu Manoil <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-21enetc: Refine buffer descriptor ring sizesClaudiu Manoil2-4/+5
It's time to differentiate between Rx and Tx ring sizes. Not only Tx rings are processed differently than Rx rings, but their default number also differs - i.e. up to 8 Tx rings per device (8 traffic classes) vs. 2 Rx rings (one per CPU). So let's set Tx rings sizes to half the size of the Rx rings for now, to be conservative. The default ring sizes were decreased as well (to the next lower power of 2), to reduce the memory footprint, buffering etc., since the measurements I've made so far show that the rings are very unlikely to get full. This change also anticipates the introduction of the dynamic interrupt moderation (dim) algorithm which operates on maximum packet thresholds of 256 packets for Rx and 128 packets for Tx. Signed-off-by: Claudiu Manoil <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-21net: mdio-mux-gpio: use devm_gpiod_get_array()Jisheng Zhang1-8/+3
Use devm_gpiod_get_array() to simplify the error handling and exit code path. Signed-off-by: Jisheng Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-07-21bpftool: Use only nftw for file tree parsingTony Ambardar2-59/+82
The bpftool sources include code to walk file trees, but use multiple frameworks to do so: nftw and fts. While nftw conforms to POSIX/SUSv3 and is widely available, fts is not conformant and less common, especially on non-glibc systems. The inconsistent framework usage hampers maintenance and portability of bpftool, in particular for embedded systems. Standardize code usage by rewriting one fts-based function to use nftw and clean up some related function warnings by extending use of "const char *" arguments. This change helps in building bpftool against musl for OpenWrt. Also fix an unsafe call to dirname() by duplicating the string to pass, since some implementations may directly alter it. The same approach is used in libbpf.c. Signed-off-by: Tony Ambardar <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Quentin Monnet <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-07-21Merge branch 'bpf_iter-BTF_ID-at-build-time'Alexei Starovoitov12-73/+153
Yonghong Song says: ==================== Commit 5a2798ab32ba ("bpf: Add BTF_ID_LIST/BTF_ID/BTF_ID_UNUSED macros") implemented a mechanism to compute btf_ids at kernel build time which can simplify kernel implementation and reduce runtime overhead by removing in-kernel btf_id calculation. This patch set tried to use this mechanism to compute btf_ids for bpf_skc_to_*() helpers and for btf_id_or_null ctx arguments specified during bpf iterator registration. Please see individual patch for details. Changelogs: v1 -> v2: - v1 ([1]) is only for bpf_skc_to_*() helpers. This version expanded it to cover ctx btf_id_or_null arguments - abandoned the change of "extern u32 name[]" to "static u32 name[]" for BPF_ID_LIST local "name" definition. gcc 9 incurred a compilation error. [1]: https://lore.kernel.org/bpf/[email protected]/T ==================== Signed-off-by: Alexei Starovoitov <[email protected]>
2020-07-21samples/bpf, selftests/bpf: Use bpf_probe_read_kernelIlya Leoshkevich10-16/+33
A handful of samples and selftests fail to build on s390, because after commit 0ebeea8ca8a4 ("bpf: Restrict bpf_probe_read{, str}() only to archs where they work") bpf_probe_read is not available anymore. Fix by using bpf_probe_read_kernel. Signed-off-by: Ilya Leoshkevich <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-07-21bpf: net: Use precomputed btf_id for bpf iteratorsYonghong Song8-9/+38
One additional field btf_id is added to struct bpf_ctx_arg_aux to store the precomputed btf_ids. The btf_id is computed at build time with BTF_ID_LIST or BTF_ID_LIST_GLOBAL macro definitions. All existing bpf iterators are changed to used pre-compute btf_ids. Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-07-21selftests/bpf: Fix test_lwt_seg6local.sh hangsIlya Leoshkevich1-1/+1
OpenBSD netcat (Debian patchlevel 1.195-2) does not seem to react to SIGINT for whatever reason, causing prefix.pl to hang after test_lwt_seg6local.sh exits due to netcat inheriting test_lwt_seg6local.sh's file descriptors. Fix by using SIGTERM instead. Signed-off-by: Ilya Leoshkevich <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-07-21bpf: Make btf_sock_ids globalYonghong Song3-28/+62
tcp and udp bpf_iter can reuse some socket ids in btf_sock_ids, so make it global. I put the extern definition in btf_ids.h as a central place so it can be easily discovered by developers. Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-07-21Merge branch 'compressed-JITed-insn'Alexei Starovoitov4-153/+643
Luke Nelson says: ==================== his patch series enables using compressed riscv (RVC) instructions in the rv64 BPF JIT. RVC is a standard riscv extension that adds a set of compressed, 2-byte instructions that can replace some regular 4-byte instructions for improved code density. This series first modifies the JIT to support using 2-byte instructions (e.g., in jump offset computations), then adds RVC encoding and helper functions, and finally uses the helper functions to optimize the rv64 JIT. I used our formal verification framework, Serval, to verify the correctness of the RVC encodings and their uses in the rv64 JIT. The JIT continues to pass all tests in lib/test_bpf.c, and introduces no new failures to test_verifier; both with and without RVC being enabled. The following are examples of the JITed code for the verifier selftest "direct packet read test#3 for CGROUP_SKB OK", without and with RVC enabled, respectively. The former uses 178 bytes, and the latter uses 112, for a ~37% reduction in code size for this example. Without RVC: 0: 02000813 addi a6,zero,32 4: fd010113 addi sp,sp,-48 8: 02813423 sd s0,40(sp) c: 02913023 sd s1,32(sp) 10: 01213c23 sd s2,24(sp) 14: 01313823 sd s3,16(sp) 18: 01413423 sd s4,8(sp) 1c: 03010413 addi s0,sp,48 20: 03056683 lwu a3,48(a0) 24: 02069693 slli a3,a3,0x20 28: 0206d693 srli a3,a3,0x20 2c: 03456703 lwu a4,52(a0) 30: 02071713 slli a4,a4,0x20 34: 02075713 srli a4,a4,0x20 38: 03856483 lwu s1,56(a0) 3c: 02049493 slli s1,s1,0x20 40: 0204d493 srli s1,s1,0x20 44: 03c56903 lwu s2,60(a0) 48: 02091913 slli s2,s2,0x20 4c: 02095913 srli s2,s2,0x20 50: 04056983 lwu s3,64(a0) 54: 02099993 slli s3,s3,0x20 58: 0209d993 srli s3,s3,0x20 5c: 09056a03 lwu s4,144(a0) 60: 020a1a13 slli s4,s4,0x20 64: 020a5a13 srli s4,s4,0x20 68: 00900313 addi t1,zero,9 6c: 006a7463 bgeu s4,t1,0x74 70: 00000a13 addi s4,zero,0 74: 02d52823 sw a3,48(a0) 78: 02e52a23 sw a4,52(a0) 7c: 02952c23 sw s1,56(a0) 80: 03252e23 sw s2,60(a0) 84: 05352023 sw s3,64(a0) 88: 00000793 addi a5,zero,0 8c: 02813403 ld s0,40(sp) 90: 02013483 ld s1,32(sp) 94: 01813903 ld s2,24(sp) 98: 01013983 ld s3,16(sp) 9c: 00813a03 ld s4,8(sp) a0: 03010113 addi sp,sp,48 a4: 00078513 addi a0,a5,0 a8: 00008067 jalr zero,0(ra) With RVC: 0: 02000813 addi a6,zero,32 4: 7179 c.addi16sp sp,-48 6: f422 c.sdsp s0,40(sp) 8: f026 c.sdsp s1,32(sp) a: ec4a c.sdsp s2,24(sp) c: e84e c.sdsp s3,16(sp) e: e452 c.sdsp s4,8(sp) 10: 1800 c.addi4spn s0,sp,48 12: 03056683 lwu a3,48(a0) 16: 1682 c.slli a3,0x20 18: 9281 c.srli a3,0x20 1a: 03456703 lwu a4,52(a0) 1e: 1702 c.slli a4,0x20 20: 9301 c.srli a4,0x20 22: 03856483 lwu s1,56(a0) 26: 1482 c.slli s1,0x20 28: 9081 c.srli s1,0x20 2a: 03c56903 lwu s2,60(a0) 2e: 1902 c.slli s2,0x20 30: 02095913 srli s2,s2,0x20 34: 04056983 lwu s3,64(a0) 38: 1982 c.slli s3,0x20 3a: 0209d993 srli s3,s3,0x20 3e: 09056a03 lwu s4,144(a0) 42: 1a02 c.slli s4,0x20 44: 020a5a13 srli s4,s4,0x20 48: 4325 c.li t1,9 4a: 006a7363 bgeu s4,t1,0x50 4e: 4a01 c.li s4,0 50: d914 c.sw a3,48(a0) 52: d958 c.sw a4,52(a0) 54: dd04 c.sw s1,56(a0) 56: 03252e23 sw s2,60(a0) 5a: 05352023 sw s3,64(a0) 5e: 4781 c.li a5,0 60: 7422 c.ldsp s0,40(sp) 62: 7482 c.ldsp s1,32(sp) 64: 6962 c.ldsp s2,24(sp) 66: 69c2 c.ldsp s3,16(sp) 68: 6a22 c.ldsp s4,8(sp) 6a: 6145 c.addi16sp sp,48 6c: 853e c.mv a0,a5 6e: 8082 c.jr ra RFC -> v1: - From Björn Töpel: * Changed RVOFF macro to static inline "ninsns_rvoff". * Changed return type of rvc_ functions from u32 to u16. * Changed sizeof(u16) to sizeof(*ctx->insns). * Factored unsigned immediate checks into helper functions (is_8b_uint, etc.) * Changed to use IS_ENABLED instead of #ifdef to check if RVC is enabled. * Changed type of immediate arguments to rvc_* encoding to u32 to avoid issues from promotion of u16 to signed int. * Cleaned up RVC checks in emit_{addi,slli,srli,srai}. + Wrapped lines at 100 instead of 80 columns for increased clarity. + Move !imm checks into each branch instead of checking separately. + Strengthed checks for c.{slli,srli,srai} to check that imm < XLEN. Otherwise, imm could be non-zero but the lower XLEN bits could all be zero, leading to invalid RVC encoding. * Changed emit_imm to sign-extend the 12-bit value in "lower" + The immediate checks for emit_{addiw,li,addi} use signed comparisons, so this enables the RVC variants to be used more often (e.g., if val == -1, then lower should be -1 as opposed to 4095). ==================== Reviewed-by: Björn Töpel <[email protected]> Acked-by: Björn Töpel <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2020-07-21bpf: Add BTF_ID_LIST_GLOBAL in btf_ids.hYonghong Song3-14/+39
Existing BTF_ID_LIST used a local static variable to store btf_ids. This patch provided a new macro BTF_ID_LIST_GLOBAL to store btf_ids in a global variable which can be shared among multiple files. The existing BTF_ID_LIST is still retained. Two reasons. First, BTF_ID_LIST is also used to build btf_ids for helper arguments which typically is an array of 5. Since typically different helpers have different signature, it makes little sense to share them. Second, some current computed btf_ids are indeed local. If later those btf_ids are shared between different files, they can use BTF_ID_LIST_GLOBAL then. Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-07-21tools/bpf: Sync btf_ids.h to toolsYonghong Song2-1/+11
Sync kernel header btf_ids.h to tools directory. Also define macro CONFIG_DEBUG_INFO_BTF before including btf_ids.h in prog_tests/resolve_btfids.c since non-stub definitions for BTF_ID_LIST etc. macros are defined under CONFIG_DEBUG_INFO_BTF. This prevented test_progs from failing. Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-07-21bpf: Compute bpf_skc_to_*() helper socket btf ids at build timeYonghong Song3-36/+18
Currently, socket types (struct tcp_sock, udp_sock, etc.) used by bpf_skc_to_*() helpers are computed when vmlinux_btf is first built in the kernel. Commit 5a2798ab32ba ("bpf: Add BTF_ID_LIST/BTF_ID/BTF_ID_UNUSED macros") implemented a mechanism to compute btf_ids at kernel build time which can simplify kernel implementation and reduce runtime overhead by removing in-kernel btf_id calculation. This patch did exactly this, removing in-kernel btf_id computation and utilizing build-time btf_id computation. If CONFIG_DEBUG_INFO_BTF is not defined, BTF_ID_LIST will define an array with size of 5, which is not enough for btf_sock_ids. So define its own static array if CONFIG_DEBUG_INFO_BTF is not defined. Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-07-21tools/bpftool: Fix error handing in do_skeleton()YueHaibing1-1/+4
Fix pass 0 to PTR_ERR, also dump more err info using libbpf_strerror. Fixes: 5dc7a8b21144 ("bpftool, selftests/bpf: Embed object file inside skeleton") Signed-off-by: YueHaibing <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Reviewed-by: Quentin Monnet <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-07-21bpf, riscv: Use compressed instructions in the rv64 JITLuke Nelson1-134/+147
This patch uses the RVC support and encodings from bpf_jit.h to optimize the rv64 jit. The optimizations work by replacing emit(rv_X(...)) with a call to a helper function emit_X, which will emit a compressed version of the instruction when possible, and when RVC is enabled. The JIT continues to pass all tests in lib/test_bpf.c, and introduces no new failures to test_verifier; both with and without RVC being enabled. Most changes are straightforward replacements of emit(rv_X(...), ctx) with emit_X(..., ctx), with the following exceptions bearing mention; * Change emit_imm to sign-extend the value in "lower", since the checks for RVC (and the instructions themselves) treat the value as signed. Otherwise, small negative immediates will not be recognized as encodable using an RVC instruction. For example, without this change, emit_imm(rd, -1, ctx) would cause lower to become 4095, which is not a 6b int even though a "c.li rd, -1" instruction suffices. * For {BPF_MOV,BPF_ADD} BPF_X, drop using addiw,addw in the 32-bit cases since the values are zero-extended into the upper 32 bits in the following instructions anyways, and the addition commutes with zero-extension. (BPF_SUB BPF_X must still use subw since subtraction does not commute with zero-extension.) This patch avoids optimizing branches and jumps to use RVC instructions since surrounding code often makes assumptions about the sizes of emitted instructions. Optimizing these will require changing these functions (e.g., emit_branch) to dynamically compute jump offsets. The following are examples of the JITed code for the verifier selftest "direct packet read test#3 for CGROUP_SKB OK", without and with RVC enabled, respectively. The former uses 178 bytes, and the latter uses 112, for a ~37% reduction in code size for this example. Without RVC: 0: 02000813 addi a6,zero,32 4: fd010113 addi sp,sp,-48 8: 02813423 sd s0,40(sp) c: 02913023 sd s1,32(sp) 10: 01213c23 sd s2,24(sp) 14: 01313823 sd s3,16(sp) 18: 01413423 sd s4,8(sp) 1c: 03010413 addi s0,sp,48 20: 03056683 lwu a3,48(a0) 24: 02069693 slli a3,a3,0x20 28: 0206d693 srli a3,a3,0x20 2c: 03456703 lwu a4,52(a0) 30: 02071713 slli a4,a4,0x20 34: 02075713 srli a4,a4,0x20 38: 03856483 lwu s1,56(a0) 3c: 02049493 slli s1,s1,0x20 40: 0204d493 srli s1,s1,0x20 44: 03c56903 lwu s2,60(a0) 48: 02091913 slli s2,s2,0x20 4c: 02095913 srli s2,s2,0x20 50: 04056983 lwu s3,64(a0) 54: 02099993 slli s3,s3,0x20 58: 0209d993 srli s3,s3,0x20 5c: 09056a03 lwu s4,144(a0) 60: 020a1a13 slli s4,s4,0x20 64: 020a5a13 srli s4,s4,0x20 68: 00900313 addi t1,zero,9 6c: 006a7463 bgeu s4,t1,0x74 70: 00000a13 addi s4,zero,0 74: 02d52823 sw a3,48(a0) 78: 02e52a23 sw a4,52(a0) 7c: 02952c23 sw s1,56(a0) 80: 03252e23 sw s2,60(a0) 84: 05352023 sw s3,64(a0) 88: 00000793 addi a5,zero,0 8c: 02813403 ld s0,40(sp) 90: 02013483 ld s1,32(sp) 94: 01813903 ld s2,24(sp) 98: 01013983 ld s3,16(sp) 9c: 00813a03 ld s4,8(sp) a0: 03010113 addi sp,sp,48 a4: 00078513 addi a0,a5,0 a8: 00008067 jalr zero,0(ra) With RVC: 0: 02000813 addi a6,zero,32 4: 7179 c.addi16sp sp,-48 6: f422 c.sdsp s0,40(sp) 8: f026 c.sdsp s1,32(sp) a: ec4a c.sdsp s2,24(sp) c: e84e c.sdsp s3,16(sp) e: e452 c.sdsp s4,8(sp) 10: 1800 c.addi4spn s0,sp,48 12: 03056683 lwu a3,48(a0) 16: 1682 c.slli a3,0x20 18: 9281 c.srli a3,0x20 1a: 03456703 lwu a4,52(a0) 1e: 1702 c.slli a4,0x20 20: 9301 c.srli a4,0x20 22: 03856483 lwu s1,56(a0) 26: 1482 c.slli s1,0x20 28: 9081 c.srli s1,0x20 2a: 03c56903 lwu s2,60(a0) 2e: 1902 c.slli s2,0x20 30: 02095913 srli s2,s2,0x20 34: 04056983 lwu s3,64(a0) 38: 1982 c.slli s3,0x20 3a: 0209d993 srli s3,s3,0x20 3e: 09056a03 lwu s4,144(a0) 42: 1a02 c.slli s4,0x20 44: 020a5a13 srli s4,s4,0x20 48: 4325 c.li t1,9 4a: 006a7363 bgeu s4,t1,0x50 4e: 4a01 c.li s4,0 50: d914 c.sw a3,48(a0) 52: d958 c.sw a4,52(a0) 54: dd04 c.sw s1,56(a0) 56: 03252e23 sw s2,60(a0) 5a: 05352023 sw s3,64(a0) 5e: 4781 c.li a5,0 60: 7422 c.ldsp s0,40(sp) 62: 7482 c.ldsp s1,32(sp) 64: 6962 c.ldsp s2,24(sp) 66: 69c2 c.ldsp s3,16(sp) 68: 6a22 c.ldsp s4,8(sp) 6a: 6145 c.addi16sp sp,48 6c: 853e c.mv a0,a5 6e: 8082 c.jr ra Signed-off-by: Luke Nelson <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Cc: Björn Töpel <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-07-21libbpf bpf_helpers: Use __builtin_offsetof for offsetofIan Rogers1-1/+1
The non-builtin route for offsetof has a dependency on size_t from stdlib.h/stdint.h that is undeclared and may break targets. The offsetof macro in bpf_helpers may disable the same macro in other headers that have a #ifdef offsetof guard. Rather than add additional dependencies improve the offsetof macro declared here to use the builtin that is available since llvm 3.7 (the first with a BPF backend). Signed-off-by: Ian Rogers <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-07-21bpf, riscv: Add encodings for compressed instructionsLuke Nelson1-0/+452
This patch adds functions for encoding and emitting compressed riscv (RVC) instructions to the BPF JIT. Some regular riscv instructions can be compressed into an RVC instruction if the instruction fields meet some requirements. For example, "add rd, rs1, rs2" can be compressed into "c.add rd, rs2" when rd == rs1. To make using RVC encodings simpler, this patch also adds helper functions that selectively emit either a regular instruction or a compressed instruction if possible. For example, emit_add will produce a "c.add" if possible and regular "add" otherwise. Signed-off-by: Luke Nelson <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-07-21s390/bpf: Use bpf_skip() in bpf_jit_prologue()Ilya Leoshkevich1-4/+5
Now that we have bpf_skip() for emitting nops, use it in bpf_jit_prologue() in order to reduce code duplication. Signed-off-by: Ilya Leoshkevich <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-07-21bpf, riscv: Modify JIT ctx to support compressed instructionsLuke Nelson4-19/+44
This patch makes the necessary changes to struct rv_jit_context and to bpf_int_jit_compile to support compressed riscv (RVC) instructions in the BPF JIT. It changes the JIT image to be u16 instead of u32, since RVC instructions are 2 bytes as opposed to 4. It also changes ctx->offset and ctx->ninsns to refer to 2-byte instructions rather than 4-byte ones. The riscv PC is required to be 16-bit aligned with or without RVC, so this is sufficient to refer to any valid riscv offset. The code for computing jump offsets in bytes is updated accordingly, and factored into a new "ninsns_rvoff" function to simplify the code. Signed-off-by: Luke Nelson <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-07-21s390/bpf: Tolerate not converging code shrinkingIlya Leoshkevich1-1/+26
"BPF_MAXINSNS: Maximum possible literals" unnecessarily falls back to the interpreter because of failing sanity check in bpf_set_addr. The problem is that there are a lot of branches that can be shrunk, and doing so opens up the possibility to shrink even more. This process does not converge after 3 passes, causing code offsets to change during the codegen pass, which must never happen. Fix by inserting nops during codegen pass in order to preserve code offets. Fixes: 4e9b4a6883dd ("s390/bpf: Use relative long branches") Signed-off-by: Ilya Leoshkevich <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-07-21s390/bpf: Use brcl for jumping to exit_ip if necessaryIlya Leoshkevich1-2/+6
"BPF_MAXINSNS: Maximum possible literals" test causes panic with bpf_jit_harden = 2. The reason is that BPF_JMP | BPF_EXIT is always emitted as brc, however, after removal of JITed image size limitations, brcl might be required. Fix by using brcl when necessary. Fixes: 4e9b4a6883dd ("s390/bpf: Use relative long branches") Signed-off-by: Ilya Leoshkevich <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-07-21s390/bpf: Fix sign extension in branch_kuIlya Leoshkevich1-15/+4
Both signed and unsigned variants of BPF_JMP | BPF_K require sign-extending the immediate. JIT emits cgfi for the signed case, which is correct, and clgfi for the unsigned case, which is not correct: clgfi zero-extends the immediate. s390 does not provide an instruction that does sign-extension and unsigned comparison at the same time. Therefore, fix by first loading the sign-extended immediate into work register REG_1 and proceeding as if it's BPF_X. Fixes: 4e9b4a6883dd ("s390/bpf: Use relative long branches") Reported-by: Seth Forshee <[email protected]> Signed-off-by: Ilya Leoshkevich <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Tested-by: Seth Forshee <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-07-21selftests: bpf: test_kmod.sh: Fix running out of srctreeIlya Leoshkevich1-3/+9
When running out of srctree, relative path to lib/test_bpf.ko is different than when running in srctree. Check $building_out_of_srctree environment variable and use a different relative path if needed. Signed-off-by: Ilya Leoshkevich <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-07-21mt76: mt76u: add missing release on skb in __mt76x02u_mcu_send_msgNavid Emamdoost1-2/+5
In the implementation of __mt76x02u_mcu_send_msg() the skb is consumed all execution paths except one. Release skb before returning if test_bit() fails. Signed-off-by: Navid Emamdoost <[email protected]> Signed-off-by: Felix Fietkau <[email protected]>
2020-07-21mt76: mt7615: fix possible memory leak in mt7615_mcu_wtbl_sta_addLorenzo Bianconi1-1/+5
Free the second mcu skb if __mt76_mcu_skb_send_msg() fails to transmit the first one in mt7615_mcu_wtbl_sta_add(). Fixes: 99c457d902cf9 ("mt76: mt7615: move mt7615_mcu_set_bmc to mt7615_mcu_ops") Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Felix Fietkau <[email protected]>
2020-07-21mt76: mt7915: fix potential memory leak in mcu message handlerRyder Lee1-2/+5
Fix potential memory leak in mcu message handler on error condition. Fixes: c6b002bcdfa6 ("mt76: add mac80211 driver for MT7915 PCIe-based chipsets") Signed-off-by: Ryder Lee <[email protected]> Signed-off-by: Felix Fietkau <[email protected]>
2020-07-21mt76: mt76s: move queue accounting in mt76s_tx_queue_skbLorenzo Bianconi2-4/+6
In order to avoid possible overflows, move tx queue accounting from mt7663s_tx_run_queue() to mt76s_tx_queue_skb_raw()/mt76s_tx_queue_skb() Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Felix Fietkau <[email protected]>
2020-07-21mt76: mt7615: introduce mt7663s supportSean Wang12-1/+1114
Introduce support for mt7663s 802.11ac 2x2:2 chipset to mt7615 driver. Co-developed-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Sean Wang <[email protected]> Signed-off-by: Felix Fietkau <[email protected]>
2020-07-21mt76: introduce mt76_sdio moduleSean Wang6-2/+408
Introduce mt76_sdio module as common layer to add mt7663s support Co-developed-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Sean Wang <[email protected]> Signed-off-by: Felix Fietkau <[email protected]>
2020-07-21mt76: mt7615: introduce mt7663-usb-sdio-common moduleLorenzo Bianconi6-389/+424
Introduce mt7663-usb-sdio-common module as container for shared code between usb and sdio driver. Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Felix Fietkau <[email protected]>
2020-07-21mt76: mt7615: sdio code must access rate/key regs in preocess contextLorenzo Bianconi2-4/+4
As usb, sdio relies on mt76 workqueue to configure tx rate or upload keys to the hw. This is a preliminary patch to add SDIO support to mt76 driver Co-developed-by: Sean Wang <[email protected]> Signed-off-by: Sean Wang <[email protected]> Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Felix Fietkau <[email protected]>
2020-07-21mt76: mt76u: add mt76_skb_adjust_pad utility routineLorenzo Bianconi6-35/+39
Introduce mt76_skb_adjust_pad to reuse the code adding sdio support to mt7615 driver and remove code duplication. Move 4B header configuration for usb devices out of mt76_skb_adjust_pad Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Felix Fietkau <[email protected]>
2020-07-21mt76: mt7615: take into account sdio bus configuring txwiLorenzo Bianconi1-9/+9
usb and sdio relies on SF architecture. This is a preliminary patch to add SDIO support to mt76 driver Co-developed-by: Sean Wang <[email protected]> Signed-off-by: Sean Wang <[email protected]> Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Felix Fietkau <[email protected]>
2020-07-21mt76: mt7915: add missing CONFIG_MAC80211_DEBUGFSRyder Lee1-0/+2
Add CONFIG_MAC80211_DEBUGFS to fix a reported warning. Fixes: ec9742a8f38e ("mt76: mt7915: add .sta_add_debugfs support") Reported-by: kernel test robot <[email protected]> Signed-off-by: Ryder Lee <[email protected]> Signed-off-by: Felix Fietkau <[email protected]>
2020-07-21mt76: mt7915: potential array overflow in mt7915_mcu_tx_rate_report()Dan Carpenter1-6/+13
Smatch complains that "wcidx" value comes from the network and thus cannot be trusted. In this case, it actually seems to come from the firmware. If your wireless firmware is malicious then probably no amount of carefulness can protect you. On the other hand, these days we still try to check the firmware as much as possible. Verifying that the index is within bounds will silence a static checker warning. And it's harmless and a good exercise in kernel hardening. So I suggest that we do add a bounds check. Fixes: e57b7901469f ("mt76: add mac80211 driver for MT7915 PCIe-based chipsets") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Felix Fietkau <[email protected]>
2020-07-21mt76: mt7615: fix potential memory leak in mcu message handlerSean Wang1-2/+5
Fix potential memory leak in mcu message handler on error condition. Fixes: 0e6a29e477f3 ("mt76: mt7615: add support to read temperature from mcu") Acked-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Sean Wang <[email protected]> Signed-off-by: Felix Fietkau <[email protected]>
2020-07-21mt76: mt7663u: fix potential memory leak in mcu message handlerSean Wang1-1/+1
Fix potential memory leak in mcu message handler on error condition. Fixes: eb99cc95c3b6 ("mt76: mt7615: introduce mt7663u support") Acked-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Sean Wang <[email protected]> Signed-off-by: Felix Fietkau <[email protected]>
2020-07-21mt76: mt7663u: fix memory leak in set keySean Wang1-7/+14
Fix memory leak in set key. Fixes: eb99cc95c3b6 ("mt76: mt7615: introduce mt7663u support") Signed-off-by: Sean Wang <[email protected]> Acked-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Felix Fietkau <[email protected]>
2020-07-21mt76: mt7615: reschedule ps work according to last activityLorenzo Bianconi1-3/+11
Reschedule runtime-pm delayed work if there is a new activity when ps delayed work is already scheduled Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Felix Fietkau <[email protected]>
2020-07-21mt76: mt7615: avoid scheduling runtime-pm during hw scanLorenzo Bianconi1-3/+12
Do not schedule ps_work during hw scanning or hw scheduled frequency scan Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Felix Fietkau <[email protected]>
2020-07-21mt76: mt7663u: sync probe sampling with rate configurationLorenzo Bianconi4-16/+22
On usb device rate configuration for sampling is performed relying on a workqueue since it is not possible to access the device in the interrupt context. Move the configuration of the probe_rate flag in the workqueue in order to keep probe sampling in sync with actual rate configuration Fixes: eb99cc95c3b6 ("mt76: mt7615: introduce mt7663u support") Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Felix Fietkau <[email protected]>