Age | Commit message | Author | Files | Lines
2024-05-13 | net: stmmac: add support for RZ/N1 GMAC | Clément Léger | 4 | -0/+105
Add support for the Renesas RZ/N1 GMAC. This support can make use of a custom RZ/N1 PCS which is fetched by parsing the pcs-handle device tree property. Signed-off-by: Clément Léger <[email protected]> Co-developed-by: Romain Gantois <[email protected]> Signed-off-by: Romain Gantois <[email protected]> Reviewed-by: Russell King (Oracle) <[email protected]> Reviewed-by: Hariprasad Kelam <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13 | net: stmmac: dwmac-socfpga: use pcs_init/pcs_exit | Russell King (Oracle) | 1 | -54/+53
Use the newly introduced pcs_init() and pcs_exit() operations to create and destroy the PCS instance at a more appropriate moment during the driver lifecycle, thereby avoiding publishing a network device to userspace that has not yet finished its PCS initialisation. There are other similar issues with this driver which remain unaddressed, but these are out of scope for this patch. Signed-off-by: Russell King (Oracle) <[email protected]> Reviewed-by: Maxime Chevallier <[email protected]> [rgantois: removed second parameters of new callbacks] Signed-off-by: Romain Gantois <[email protected]> Reviewed-by: Hariprasad Kelam <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13 | net: stmmac: introduce pcs_init/pcs_exit stmmac operations | Russell King (Oracle) | 2 | -1/+9
Introduce a mechanism whereby platforms can create their PCS instances prior to the network device being published to userspace, but after some of the core stmmac initialisation has been completed. This means that the data structures that platforms need will be available. Signed-off-by: Russell King (Oracle) <[email protected]> Reviewed-by: Maxime Chevallier <[email protected]> Reviewed-by: Serge Semin <[email protected]> Co-developed-by: Romain Gantois <[email protected]> Signed-off-by: Romain Gantois <[email protected]> Reviewed-by: Hariprasad Kelam <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13 | net: stmmac: Make stmmac_xpcs_setup() generic to all PCS devices | Serge Semin | 3 | -19/+23
A pcs_init() callback will be introduced to stmmac in a future patch. This new function will be called during the hardware initialization phase. Instead of separately initializing XPCS and PCS components, let's group all PCS-related hardware initialization logic in the current stmmac_xpcs_setup() function. Rename stmmac_xpcs_setup() to stmmac_pcs_setup() and move the conditional call to stmmac_xpcs_setup() inside the function itself. Signed-off-by: Serge Semin <[email protected]> Co-developed-by: Romain Gantois <[email protected]> Signed-off-by: Romain Gantois <[email protected]> Reviewed-by: Russell King (Oracle) <[email protected]> Reviewed-by: Hariprasad Kelam <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13 | net: stmmac: Add dedicated XPCS cleanup method | Serge Semin | 3 | -4/+17
Currently the XPCS handler destruction is performed in the stmmac_mdio_unregister() method. It doesn't look good because the handler isn't originally created in the corresponding protagonist stmmac_mdio_unregister(), but in the stmmac_xpcs_setup() function. In order to have more coherent MDIO and XPCS setup/cleanup procedures, let's move the DW XPCS destruction to the dedicated stmmac_pcs_clean() method. This method will also be used to cleanup PCS hardware using the pcs_exit() callback that will be introduced to stmmac in a subsequent patch. Signed-off-by: Serge Semin <[email protected]> Co-developed-by: Romain Gantois <[email protected]> Signed-off-by: Romain Gantois <[email protected]> Reviewed-by: Russell King (Oracle) <[email protected]> Reviewed-by: Hariprasad Kelam <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13 | dt-bindings: net: renesas,rzn1-gmac: Document RZ/N1 GMAC support | Clément Léger | 1 | -0/+66
The RZ/N1 series of MPUs feature up to two Gigabit Ethernet controllers. These controllers are based on Synopsys IPs. They can be connected to RZ/N1 RGMII/RMII converters. Add a binding that describes these GMAC devices. Signed-off-by: Clément Léger <[email protected]> [rgantois: commit log] Reviewed-by: Rob Herring <[email protected]> Reviewed-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Romain Gantois <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13 | io_uring/net: wire up IORING_CQE_F_SOCK_NONEMPTY for accept | Jens Axboe | 1 | -4/+16
If the given protocol supports passing back whether or not we had more pending accept post this one, pass back this information to userspace. This is done by setting IORING_CQE_F_SOCK_NONEMPTY in the CQE flags, just like we do for recv/recvmsg if there's more data available post a receive operation. We can also use this information to be smarter about multishot retry, as we don't need to do a pointless retry if we know for a fact that there aren't any more connections to accept. Suggested-by: Norman Maurer <[email protected]> Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
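From the userspace side, the flag described above can be tested on the `flags` word of a completion. The following is a minimal sketch, not code from the patch: the helper name `more_connections_pending()` is hypothetical, and the flag value is repeated from the io_uring UAPI header only so the sketch builds without it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Normally provided by <linux/io_uring.h>; repeated here (assumption: the
 * UAPI value is bit 2) so the sketch is self-contained. */
#ifndef IORING_CQE_F_SOCK_NONEMPTY
#define IORING_CQE_F_SOCK_NONEMPTY (1U << 2)
#endif

/* Hypothetical helper: given the flags of an accept completion, report
 * whether the kernel signalled that more connections are already queued. */
static bool more_connections_pending(uint32_t cqe_flags)
{
	return (cqe_flags & IORING_CQE_F_SOCK_NONEMPTY) != 0;
}
```

An application could use such a check to decide whether issuing another accept right away is worthwhile. Note that an unset flag may also mean the protocol does not report this information.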
2024-05-13 | net: pass back whether socket was empty post accept | Jens Axboe | 2 | -0/+2
This adds an 'is_empty' argument to struct proto_accept_arg, which can be used to pass back information on whether or not the given socket has more connections to accept post the one just accepted. To utilize this information, the caller should initialize the 'is_empty' field to, eg, -1 and then check for 0/1 after the accept. If the field has been set, the caller knows whether there are more pending connections or not. If the field remains -1 after the accept call, the protocol doesn't support passing back this information. This patch wires it up for ipv4/6 TCP. Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
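The sentinel protocol described above (initialize to -1, then check for 0/1) can be sketched in userland C. Everything here is an assumption made for illustration: `struct accept_arg_sketch` is a stand-in for the kernel's struct proto_accept_arg, and the two mock accept functions model a protocol that fills the field and one that does not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct proto_accept_arg; only is_empty is
 * taken from the commit message. */
struct accept_arg_sketch {
	int flags;
	int err;
	int is_empty;	/* -1: not reported, 0: more pending, 1: queue empty */
};

enum pending_state { PENDING_UNKNOWN, PENDING_MORE, PENDING_NONE };

typedef void (*accept_fn)(struct accept_arg_sketch *arg, size_t queued_after);

/* Mock protocol that reports the queue state after the accept. */
static void accept_reporting(struct accept_arg_sketch *arg, size_t queued_after)
{
	arg->is_empty = (queued_after == 0);
}

/* Mock protocol that predates the field and never touches it. */
static void accept_legacy(struct accept_arg_sketch *arg, size_t queued_after)
{
	(void)arg;
	(void)queued_after;
}

/* Caller side: seed the sentinel, call accept, then interpret the field. */
static enum pending_state do_accept_sketch(accept_fn accept, size_t queued_after)
{
	struct accept_arg_sketch arg = { .flags = 0, .err = 0, .is_empty = -1 };

	assert(accept != NULL);
	accept(&arg, queued_after);

	if (arg.is_empty < 0)
		return PENDING_UNKNOWN;	/* protocol gave no information */
	return arg.is_empty ? PENDING_NONE : PENDING_MORE;
}
```

The three-state result is the point of the -1 sentinel: a caller can tell "no more connections" apart from "this protocol cannot say".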
2024-05-13 | net: have do_accept() take a struct proto_accept_arg argument | Jens Axboe | 3 | -10/+11
In preparation for passing in more information via this API, change do_accept() to take a proto_accept_arg struct pointer rather than just the file flags separately. No functional changes in this patch. Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2024-05-13 | net: change proto and proto_ops accept type | Jens Axboe | 34 | -111/+132
Rather than pass in flags, error pointer, and whether this is a kernel invocation or not, add a struct proto_accept_arg struct as the argument. This then holds all of these arguments, and prepares accept for being able to pass back more information. No functional changes in this patch. Acked-by: Jakub Kicinski <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2024-05-13 | Merge tag 'sched-core-2024-05-13' of ↵ | Linus Torvalds | 42 | -441/+550
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 - Add cpufreq pressure feedback for the scheduler
 - Rework misfit load-balancing wrt affinity restrictions
 - Clean up and simplify the code around ::overutilized and ::overload access.
 - Simplify sched_balance_newidle()
 - Bump SCHEDSTAT_VERSION to 16 due to a cleanup of CPU_MAX_IDLE_TYPES handling that changed the output.
 - Rework & clean up <asm/vtime.h> interactions wrt arch_vtime_task_switch()
 - Reorganize, clean up and unify most of the higher level scheduler balancing function names around the sched_balance_*() prefix
 - Simplify the balancing flag code (sched_balance_running)
 - Miscellaneous cleanups & fixes
* tag 'sched-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (50 commits)
  sched/pelt: Remove shift of thermal clock
  sched/cpufreq: Rename arch_update_thermal_pressure() => arch_update_hw_pressure()
  thermal/cpufreq: Remove arch_update_thermal_pressure()
  sched/cpufreq: Take cpufreq feedback into account
  cpufreq: Add a cpufreq pressure feedback for the scheduler
  sched/fair: Fix update of rd->sg_overutilized
  sched/vtime: Do not include <asm/vtime.h> header
  s390/irq,nmi: Include <asm/vtime.h> header directly
  s390/vtime: Remove unused __ARCH_HAS_VTIME_TASK_SWITCH leftover
  sched/vtime: Get rid of generic vtime_task_switch() implementation
  sched/vtime: Remove confusing arch_vtime_task_switch() declaration
  sched/balancing: Simplify the sg_status bitmask and use separate ->overloaded and ->overutilized flags
  sched/fair: Rename set_rd_overutilized_status() to set_rd_overutilized()
  sched/fair: Rename SG_OVERLOAD to SG_OVERLOADED
  sched/fair: Rename {set|get}_rd_overload() to {set|get}_rd_overloaded()
  sched/fair: Rename root_domain::overload to ::overloaded
  sched/fair: Use helper functions to access root_domain::overload
  sched/fair: Check root_domain::overload value before update
  sched/fair: Combine EAS check with root_domain::overutilized access
  sched/fair: Simplify the continue_balancing logic in sched_balance_newidle()
  ...
2024-05-13 | net: qede: flower: validate control flags | Asbjørn Sloth Tønnesen | 1 | -0/+3
This driver currently doesn't support any control flags. Use flow_rule_match_has_control_flags() to check for control flags, such as can be set through `tc flower ... ip_flags frag`. In case any control flags are masked, flow_rule_match_has_control_flags() sets a NL extended error message, and we return -EOPNOTSUPP. Only compile-tested. Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
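The validation pattern described above (reject the rule, set an extended error message, return -EOPNOTSUPP) can be sketched in userland C. The types and helpers below are hypothetical stand-ins, not the kernel's flow offload API; in particular `has_control_flags()` only models the described behaviour of flow_rule_match_has_control_flags().

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the control-flags part of a flower match. */
struct ctrl_match_sketch {
	uint32_t key_flags;	/* requested flag values */
	uint32_t mask_flags;	/* which flags the rule actually matches on */
};

/* Models the described helper: true if any control flag is masked, in
 * which case an error message is handed back to the caller. */
static bool has_control_flags(const struct ctrl_match_sketch *m,
			      const char **errmsg)
{
	if (m->mask_flags == 0)
		return false;
	*errmsg = "Unsupported match on control.flags";
	return true;
}

/* Driver-side check: a driver without control-flag support refuses the rule. */
static int parse_rule_sketch(const struct ctrl_match_sketch *m,
			     const char **errmsg)
{
	if (has_control_flags(m, errmsg))
		return -EOPNOTSUPP;
	return 0;	/* continue parsing the rest of the rule */
}
```

The check keys off the mask rather than the key, so a rule that does not match on any control flag passes regardless of the key contents.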
2024-05-13 | Merge tag 'perf-core-2024-05-13' of ↵ | Linus Torvalds | 14 | -172/+525
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf events updates from Ingo Molnar:
 - Combine perf and BPF for fast evaluation of HW breakpoint conditions
 - Add LBR capture support outside of hardware events
 - Trigger IO signals for watermark_wakeup
 - Add RAPL support for Intel Arrow Lake and Lunar Lake
 - Optimize frequency-throttling
 - Miscellaneous cleanups & fixes
* tag 'perf-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
  perf/bpf: Mark perf_event_set_bpf_handler() and perf_event_free_bpf_handler() as inline too
  selftests/perf_events: Test FASYNC with watermark wakeups
  perf/ring_buffer: Trigger IO signals for watermark_wakeup
  perf: Move perf_event_fasync() to perf_event.h
  perf/bpf: Change the !CONFIG_BPF_SYSCALL stubs to static inlines
  selftest/bpf: Test a perf BPF program that suppresses side effects
  perf/bpf: Allow a BPF program to suppress all sample side effects
  perf/bpf: Remove unneeded uses_default_overflow_handler()
  perf/bpf: Call BPF handler directly, not through overflow machinery
  perf/bpf: Remove #ifdef CONFIG_BPF_SYSCALL from struct perf_event members
  perf/bpf: Create bpf_overflow_handler() stub for !CONFIG_BPF_SYSCALL
  perf/bpf: Reorder bpf_overflow_handler() ahead of __perf_event_overflow()
  perf/x86/rapl: Add support for Intel Lunar Lake
  perf/x86/rapl: Add support for Intel Arrow Lake
  perf/core: Reduce PMU access to adjust sample freq
  perf/core: Optimize perf_adjust_freq_unthr_context()
  perf/x86/amd: Don't reject non-sampling events with configured LBR
  perf/x86/amd: Support capturing LBR from software events
  perf/x86/amd: Avoid taking branches before disabling LBR
  perf/x86/amd: Ensure amd_pmu_core_disable_all() is always inlined
  ...
2024-05-13 | Merge branch 'virtio_net-rx-enable-premapped-mode-by-default' | Jakub Kicinski | 2 | -59/+38
Xuan Zhuo says:
====================
virtio_net: rx enable premapped mode by default
For the virtio drivers, we can enable premapped mode whatever the value of use_dma_api, because we provide the virtio DMA APIs. So the driver can enable premapped mode unconditionally. This patch set makes the big mode of virtio-net support premapped mode and enables premapped mode for rx by default.
Based on the following points, we do not use page pool to manage these pages:
 1. virtio-net uses the DMA APIs wrapped by virtio core. Therefore, we can only prevent the page pool from performing DMA operations, and let the driver perform DMA operations on the allocated pages.
 2. But when the page pool releases the page, we have no chance to execute dma unmap.
 3. A solution to #2 is to execute dma unmap every time before putting the page back to the page pool. (This is actually a waste, we don't execute unmap so frequently.)
 4. But there is another problem: we still need to use page.dma_addr to save the dma address. Using page.dma_addr while using page pool is unsafe behavior.
 5. And we need space to chain the pages submitted once to virtio core.
More: https://lore.kernel.org/all/CACGkMEu=Aok9z2imB_c5qVuujSh=vjj1kx12fy9N7hqyi+M5Ow@mail.gmail.com/
Why do we not use the page space to store the dma address? http://lore.kernel.org/all/CACGkMEuyeJ9mMgYnnB42=hw6umNuo=agn7VBqBqYPd7GN=+39Q@mail.gmail.com
====================
Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13 | virtio_net: remove the misleading comment | Xuan Zhuo | 1 | -1/+0
We call the build_skb() actually without copying data. The comment is misleading. So remove it. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13 | virtio_net: rx remove premapped failover code | Xuan Zhuo | 1 | -50/+35
Now, the premapped mode can be enabled unconditionally. So we can remove the failover code for merge and small mode. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Reviewed-by: Larysa Zaremba <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13 | virtio_net: big mode skip the unmap check | Xuan Zhuo | 1 | -2/+2
The virtio-net big mode did not enable premapped mode, so we did not need to check the unmap. A subsequent commit will remove the failover code that handles a failure to enable premapped mode for merge and small mode, so we need to remove the do_dma check in the big mode path. Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13 | virtio_ring: enable premapped mode whatever use_dma_api | Xuan Zhuo | 1 | -6/+1
Now that we have virtio DMA APIs, the driver can use premapped mode whether or not the virtio core uses the DMA API. So remove the use_dma_api check from virtqueue_set_dma_premapped(). Signed-off-by: Xuan Zhuo <[email protected]> Acked-by: Jason Wang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13 | Merge tag 'locking-core-2024-05-13' of ↵ | Linus Torvalds | 17 | -175/+295
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 - Over a dozen code generation micro-optimizations for the atomic and spinlock code
 - Add more __ro_after_init attributes
 - Robustify the lockdevent_*() macros
* tag 'locking-core-2024-05-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/pvqspinlock/x86: Use _Q_LOCKED_VAL in PV_UNLOCK_ASM macro
  locking/qspinlock/x86: Micro-optimize virt_spin_lock()
  locking/atomic/x86: Merge __arch{,_try}_cmpxchg64_emu_local() with __arch{,_try}_cmpxchg64_emu()
  locking/atomic/x86: Introduce arch_try_cmpxchg64_local()
  locking/pvqspinlock/x86: Remove redundant CMP after CMPXCHG in __raw_callee_save___pv_queued_spin_unlock()
  locking/pvqspinlock: Use try_cmpxchg() in qspinlock_paravirt.h
  locking/pvqspinlock: Use try_cmpxchg_acquire() in trylock_clear_pending()
  locking/qspinlock: Use atomic_try_cmpxchg_relaxed() in xchg_tail()
  locking/atomic/x86: Define arch_atomic_sub() family using arch_atomic_add() functions
  locking/atomic/x86: Rewrite x86_32 arch_atomic64_{,fetch}_{and,or,xor}() functions
  locking/atomic/x86: Introduce arch_atomic64_read_nonatomic() to x86_32
  locking/atomic/x86: Introduce arch_atomic64_try_cmpxchg() to x86_32
  locking/atomic/x86: Introduce arch_try_cmpxchg64() for !CONFIG_X86_CMPXCHG64
  locking/atomic/x86: Modernize x86_32 arch_{,try_}_cmpxchg64{,_local}()
  locking/atomic/x86: Correct the definition of __arch_try_cmpxchg128()
  x86/tsc: Make __use_tsc __ro_after_init
  x86/kvm: Make kvm_async_pf_enabled __ro_after_init
  context_tracking: Make context_tracking_key __ro_after_init
  jump_label,module: Don't alloc static_key_mod for __ro_after_init keys
  locking/qspinlock: Always evaluate lockevent* non-event parameter once
2024-05-13 | tracing: Improve benchmark test performance by using do_div() | Thorsten Blum | 1 | -1/+1
Partially revert commit d6cb38e10810 ("tracing: Use div64_u64() instead of do_div()") and use do_div() again to utilize its faster 64-by-32 division compared to the 64-by-64 division done by div64_u64(). Explicitly cast the divisor bm_cnt to u32 to prevent a Coccinelle warning reported by do_div.cocci. The warning was removed with commit d6cb38e10810 ("tracing: Use div64_u64() instead of do_div()"). Using the faster 64-by-32 division and casting bm_cnt to u32 is safe because we return early from trace_do_benchmark() if bm_cnt > UINT_MAX. This approach is already used twice in trace_do_benchmark() when calculating the standard deviation: do_div(stddev, (u32)bm_cnt); do_div(stddev, (u32)bm_cnt - 1); Link: https://lore.kernel.org/linux-trace-kernel/[email protected] Signed-off-by: Thorsten Blum <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
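The safety argument above (bail out early when the count does not fit in 32 bits, then divide with a 32-bit divisor) can be sketched in userland C. `do_div_sketch()` is a hypothetical stand-in that only mimics the calling convention of the kernel's do_div() macro (divide in place, return the remainder); `average_sketch()` is likewise an illustrative name, not code from trace_do_benchmark().

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Stand-in for do_div(): divide *n in place by a 32-bit base and return
 * the remainder. */
static uint32_t do_div_sketch(uint64_t *n, uint32_t base)
{
	uint32_t rem;

	assert(base != 0);
	rem = (uint32_t)(*n % base);
	*n /= base;
	return rem;
}

/* Returns 0 and stores total / cnt in *avg, or -1 when cnt is zero or too
 * large for the u32 cast to be lossless. */
static int average_sketch(uint64_t total, uint64_t cnt, uint64_t *avg)
{
	if (cnt == 0 || cnt > UINT_MAX)
		return -1;	/* the early return makes the cast below safe */

	do_div_sketch(&total, (uint32_t)cnt);
	*avg = total;
	return 0;
}
```

Without the range check, the cast would silently truncate a large count and the division would produce a wrong average.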
2024-05-13 | ring-buffer: Have mmapped ring buffer keep track of missed events | Steven Rostedt (Google) | 1 | -6/+47
While testing libtracefs on the mmapped ring buffer, the test that checks if missed events are accounted for failed when using the mapped buffer. This is because the mapped page does not update the missed events that were dropped because the writer filled up the ring buffer before the reader could catch it. Add the missed events to the reader page/sub-buffer when the IOCTL is done and a new reader page is acquired. Note that all accesses to the reader_page via rb_page_commit() had to be switched to rb_page_size(). rb_page_size() was just a copy of rb_page_commit(), but it now masks out the RB_MISSED bits. This is needed because the mapped reader page is still active in the ring buffer code, and where that code reads the commit field of the bpage for the size, it must now mask it; otherwise the missed bits that are now set will corrupt the size returned. Link: https://lore.kernel.org/linux-trace-kernel/[email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Vincent Donnefort <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
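Why the size accessor has to mask the commit field can be shown with a small userland sketch. The bit positions below are assumptions chosen for illustration (two flag bits at the top of a 32-bit range), and all names carry a `_SKETCH`/`_sketch` suffix to mark them as hypothetical rather than the kernel's definitions.

```c
#include <stdint.h>

/* Assumed layout: the two highest bits of the low 32 bits carry
 * missed-event state, the rest is the number of bytes committed. */
#define MISSED_EVENTS_SKETCH	(UINT64_C(1) << 31)
#define MISSED_STORED_SKETCH	(UINT64_C(1) << 30)
#define MISSED_MASK_SKETCH	(MISSED_EVENTS_SKETCH | MISSED_STORED_SKETCH)

/* Raw read of the commit field, flag bits included. */
static uint64_t page_commit_sketch(uint64_t commit)
{
	return commit;
}

/* Size read: strip the flag bits so they cannot inflate the byte count. */
static uint64_t page_size_sketch(uint64_t commit)
{
	return commit & ~MISSED_MASK_SKETCH;
}
```

With a commit value of 4000 bytes plus the missed-events bit, the raw read yields a number far larger than the page, while the masked read still yields 4000.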
2024-05-13 | dpll: fix return value check for kmemdup | Chen Ni | 1 | -1/+1
The return value of kmemdup() is dst->freq_supported, not src->freq_supported. Update the check accordingly. Fixes: 830ead5fb0c5 ("dpll: fix pin dump crash for rebound module") Signed-off-by: Chen Ni <[email protected]> Reviewed-by: Przemek Kitszel <[email protected]> Reviewed-by: Arkadiusz Kubalewski <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
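The class of bug fixed here (checking the source pointer instead of the freshly duplicated one) can be sketched in userland C. `memdup_sketch()` stands in for kmemdup() using malloc(), and the struct is a hypothetical reduction of the pin properties to the two fields that matter for the illustration.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reduction of the properties structure. */
struct pin_props_sketch {
	unsigned int freq_supported_num;
	unsigned long long *freq_supported;
};

/* Userland stand-in for kmemdup(): allocate and copy, NULL on failure. */
static void *memdup_sketch(const void *src, size_t len)
{
	void *p = malloc(len);

	if (p)
		memcpy(p, src, len);
	return p;
}

static int dup_props_sketch(const struct pin_props_sketch *src,
			    struct pin_props_sketch *dst)
{
	*dst = *src;	/* dst->freq_supported still aliases src's array here */

	if (src->freq_supported && src->freq_supported_num) {
		size_t len = src->freq_supported_num * sizeof(*src->freq_supported);

		dst->freq_supported = memdup_sketch(src->freq_supported, len);
		if (!dst->freq_supported)	/* check the copy, not the source */
			return -ENOMEM;
	}
	return 0;
}
```

Testing `src->freq_supported` after the duplication would always pass, because the enclosing condition has already established that it is non-NULL, so an allocation failure would go unnoticed.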
2024-05-13 | Merge tag 'tag-chrome-platform-firmware-for-v6.10' of ↵ | Linus Torvalds | 3 | -4/+9
git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux
Pull chrome platform firmware updates from Tzung-Bi Shih:
 - Set driver owner in the core registration so that coreboot drivers don't need to set it individually
* tag 'tag-chrome-platform-firmware-for-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
  firmware: google: cbmem: drop driver owner initialization
  firmware: coreboot: store owner from modules with coreboot_driver_register()
2024-05-13 | Merge tag 'tag-chrome-platform-for-v6.10' of ↵ | Linus Torvalds | 20 | -74/+214
git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux
Pull chrome platform updates from Tzung-Bi Shih:
"New:
  - Support Framework Laptop 13 and 16 (AMD Ryzen)
 Improvements:
  - Use sysfs_emit() instead of sprintf() for sysfs' show()
 Fixes:
  - Fix flex-array-member-not-at-end compiler warnings by using DEFINE_RAW_FLEX()
  - Add HAS_IOPORT dependencies
  - Fix long pending events during suspend after resume
 Misc cleanups:
  - Provide ID tables for avoiding fallback match
  - Replace deprecated UNIVERSAL_DEV_PM_OPS()"
* tag 'tag-chrome-platform-for-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux: (22 commits)
  platform/chrome: cros_ec: Handle events during suspend after resume completion
  platform/chrome: cros_ec_lpc: add quirks for the Framework Laptop (AMD)
  platform/chrome: cros_ec_lpc: add a "quirks" system
  platform/chrome: cros_ec_lpc: pass driver_data from DMI to the device
  platform/chrome: cros_ec_lpc: introduce a priv struct for the lpc device
  platform/chrome: add HAS_IOPORT dependencies
  platform/chrome: cros_hps_i2c: Replace deprecated UNIVERSAL_DEV_PM_OPS()
  platform/chrome: cros_kbd_led_backlight: provide ID table for avoiding fallback match
  platform/chrome: wilco_ec: core: provide ID table for avoiding fallback match
  platform/chrome: wilco_ec: event: remove redundant MODULE_ALIAS
  platform/chrome: wilco_ec: debugfs: provide ID table for avoiding fallback match
  platform/chrome: wilco_ec: telemetry: provide ID table for avoiding fallback match
  platform/chrome: cros_ec_vbc: provide ID table for avoiding fallback match
  platform/chrome: cros_ec_lightbar: provide ID table for avoiding fallback match
  platform/chrome: cros_ec_sysfs: provide ID table for avoiding fallback match
  platform/chrome: cros_ec_debugfs: provide ID table for avoiding fallback match
  platform/chrome: cros_ec_chardev: provide ID table for avoiding fallback match
  platform/chrome: cros_usbpd_notify: provide ID table for avoiding fallback match
  platform/chrome: cros_usbpd_logger: provide ID table for avoiding fallback match
  platform/chrome: cros_ec_sensorhub: provide ID table for avoiding fallback match
  ...
2024-05-13 | Merge tag 'for-netdev' of ↵ | Jakub Kicinski | 134 | -4738/+9458
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2024-05-13
We've added 119 non-merge commits during the last 14 day(s) which contain a total of 134 files changed, 9462 insertions(+), 4742 deletions(-).
The main changes are:
 1) Add BPF JIT support for 32-bit ARCv2 processors, from Shahab Vahedi.
 2) Add BPF range computation improvements to the verifier in particular around XOR and OR operators, refactoring of checks for range computation and relaxing MUL range computation so that src_reg can also be an unknown scalar, from Cupertino Miranda.
 3) Add support to attach kprobe BPF programs through kprobe_multi link in a session mode, meaning, a BPF program is attached to both function entry and return, the entry program can decide if the return program gets executed and the entry program can share u64 cookie value with return program. Session mode is a common use-case for tetragon and bpftrace, from Jiri Olsa.
 4) Fix a potential overflow in libbpf's ring__consume_n() and improve libbpf as well as BPF selftest's struct_ops handling, from Andrii Nakryiko.
 5) Improvements to BPF selftests in context of BPF gcc backend, from Jose E. Marchesi & David Faust.
 6) Migrate remaining BPF selftest tests from test_sock_addr.c to prog_test-style in order to retire the old test, run it in BPF CI and additionally expand test coverage, from Jordan Rife.
 7) Big batch for BPF selftest refactoring in order to remove duplicate code around common network helpers, from Geliang Tang.
 8) Another batch of improvements to BPF selftests to retire obsolete bpf_tcp_helpers.h as everything is available in vmlinux.h, from Martin KaFai Lau.
 9) Fix BPF map tear-down to not walk the map twice on free when both timer and wq is used, from Benjamin Tissoires.
 10) Fix BPF verifier assumptions about socket->sk that it can be non-NULL, from Alexei Starovoitov.
 11) Change BTF build scripts to use --btf_features for pahole v1.26+, from Alan Maguire.
 12) Small improvements to BPF reusing struct_size() and krealloc_array(), from Andy Shevchenko.
 13) Fix s390 JIT to emit a barrier for BPF_FETCH instructions, from Ilya Leoshkevich.
 14) Extend TCP ->cong_control() callback in order to feed in ack and flag parameters and allow write-access to tp->snd_cwnd_stamp from BPF program, from Miao Xu.
 15) Add support for internal-only per-CPU instructions to inline bpf_get_smp_processor_id() helper call for arm64 and riscv64 BPF JITs, from Puranjay Mohan.
 16) Follow-up to remove the redundant ethtool.h from tooling infrastructure, from Tushar Vyavahare.
 17) Extend libbpf to support "module:<function>" syntax for tracing programs, from Viktor Malik.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (119 commits)
  bpf: make list_for_each_entry portable
  bpf: ignore expected GCC warning in test_global_func10.c
  bpf: disable strict aliasing in test_global_func9.c
  selftests/bpf: Free strdup memory in xdp_hw_metadata
  selftests/bpf: Fix a few tests for GCC related warnings.
  bpf: avoid gcc overflow warning in test_xdp_vlan.c
  tools: remove redundant ethtool.h from tooling infra
  selftests/bpf: Expand ATTACH_REJECT tests
  selftests/bpf: Expand getsockname and getpeername tests
  sefltests/bpf: Expand sockaddr hook deny tests
  selftests/bpf: Expand sockaddr program return value tests
  selftests/bpf: Retire test_sock_addr.(c|sh)
  selftests/bpf: Remove redundant sendmsg test cases
  selftests/bpf: Migrate ATTACH_REJECT test cases
  selftests/bpf: Migrate expected_attach_type tests
  selftests/bpf: Migrate wildcard destination rewrite test
  selftests/bpf: Migrate sendmsg6 v4 mapped address tests
  selftests/bpf: Migrate sendmsg deny test cases
  selftests/bpf: Migrate WILDCARD_IP test
  selftests/bpf: Handle SYSCALL_EPERM and SYSCALL_ENOTSUPP test cases
  ...
====================
Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13 | net: pcs: lynx: no need to read LPA in lynx_pcs_get_state_2500basex() | Vladimir Oltean | 1 | -3/+2
Nothing useful is done with the LPA variable in lynx_pcs_get_state_2500basex(), we can just remove the read. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13 | ftrace: Use asynchronous grace period for register_ftrace_direct() | Paul E. McKenney | 1 | -4/+9
When running heavy test workloads with KASAN enabled, RCU Tasks grace periods can extend for many tens of seconds, significantly slowing trace registration. Therefore, make the registration-side RCU Tasks grace period be asynchronous via call_rcu_tasks(). Link: https://lore.kernel.org/linux-trace-kernel/ac05be77-2972-475b-9b57-56bef15aa00a@paulmck-laptop Reported-by: Jakub Kicinski <[email protected]> Reported-by: Alexei Starovoitov <[email protected]> Reported-by: Chris Mason <[email protected]> Reviewed-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2024-05-13 | ftrace: Replaces simple_strtoul in ftrace | Yuran Pereira | 1 | -4/+3
The function simple_strtoul performs no error checking in scenarios where the input value overflows the intended output variable. This results in this function successfully returning, even when the output does not match the input string (aka the function returns successfully even when the result is wrong). Or as it was mentioned [1], "...simple_strtol(), simple_strtoll(), simple_strtoul(), and simple_strtoull() functions explicitly ignore overflows, which may lead to unexpected results in callers." Hence, the use of those functions is discouraged. This patch replaces all uses of simple_strtoul with the safer alternatives kstrtoul and kstrtouint.
Callers affected:
 - add_rec_by_index
 - set_graph_max_depth_function
Side effects of this patch:
 - Since `fgraph_max_depth` is an `unsigned int`, this patch uses kstrtouint instead of kstrtoul to avoid any compiler warnings that could originate from calling the latter.
 - This patch ensures that the callers of kstrtou* return accordingly when kstrtoul and kstrtouint fail for some reason. In this case, both callers this patch is addressing return 0 on error.
[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#simple-strtol-simple-strtoll-simple-strtoul-simple-strtoull
Link: https://lore.kernel.org/linux-trace-kernel/GV1PR10MB656333529A8D7B8AFB28D238E8B4A@GV1PR10MB6563.EURPRD10.PROD.OUTLOOK.COM Signed-off-by: Yuran Pereira <[email protected]> Reviewed-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
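The difference between an overflow-ignoring parser and a checked one can be illustrated with a userland analogue. This is only a sketch built on the C library's strtoul(); `parse_uint_sketch()` is a hypothetical name, and its accepted syntax is an approximation that is not claimed to match kstrtouint() exactly.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parse s as an unsigned int. Returns 0 on success, -EINVAL on malformed
 * input and -ERANGE on overflow; *res is written only on success. */
static int parse_uint_sketch(const char *s, unsigned int base, unsigned int *res)
{
	char *end;
	unsigned long val;

	/* strtoul() would silently negate "-1"; reject it up front */
	if (*s == '\0' || *s == '-')
		return -EINVAL;

	errno = 0;
	val = strtoul(s, &end, (int)base);
	if (end == s)
		return -EINVAL;		/* no digits consumed */
	if (*end == '\n')
		end++;			/* tolerate one trailing newline */
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage */
	if (errno == ERANGE || val > UINT_MAX)
		return -ERANGE;		/* does not fit the output type */

	*res = (unsigned int)val;
	return 0;
}
```

A caller can then bail out on a non-zero return instead of continuing with a value that does not correspond to the input string.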
2024-05-13 | Merge branch 'mlx5-misc-patches' | Jakub Kicinski | 6 | -114/+48
Tariq Toukan says: ==================== mlx5 misc patches This series includes patches for the mlx5 driver. Patch 1 by Shay enables LAG with HCAs of 8 ports. Patch 2 by Carolina optimizes the safe switch channels operation for the TX-only changes. Patch 3 by Parav cleans up some unused code. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13 | net/mlx5: Remove unused msix related exported APIs | Parav Pandit | 2 | -59/+0
MSIX irq allocation and free APIs are no longer in use. Hence, remove the dead code. Signed-off-by: Parav Pandit <[email protected]> Reviewed-by: Dragos Tatulea <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: Kalesh AP <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13 | net/mlx5e: Modifying channels number and updating TX queues | Carolina Jubran | 3 | -51/+47
It is not appropriate for the mlx5e_num_channels_changed function to be called solely for updating the TX queues, even if the channels number has not been changed. Move the code responsible for updating the TC and TX queues from mlx5e_num_channels_changed and produce a new function called mlx5e_update_tc_and_tx_queues. This new function should only be called when the channels number remains unchanged. Signed-off-by: Carolina Jubran <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13 | net/mlx5: Enable 8 ports LAG | Shay Drory | 2 | -4/+1
This patch adds support for 8-port HCAs to the mlx5 driver. Starting with ConnectX-8, HCAs with 8 ports are possible. As most driver parts aren't affected by such a configuration, most driver code is unchanged. Specifically, the only affected areas are:
 - Lag
 - Multiport E-Switch
 - Single FDB E-Switch
All of the above are already factored in a generic way, and LAG and VF LAG are tested, so all that is left is to change a #define and remove checks which are no longer needed. However, Multiport E-Switch is not tested yet, so it is left untouched. This patch allows creating a hardware LAG/VF LAG when all 8 ports are added to the same bond device. For example, to activate the hardware LAG a user can execute the following:
  ip link add bond0 type bond
  ip link set bond0 type bond miimon 100 mode 2
  ip link set eth2 master bond0
  ip link set eth3 master bond0
  ip link set eth4 master bond0
  ip link set eth5 master bond0
  ip link set eth6 master bond0
  ip link set eth7 master bond0
  ip link set eth8 master bond0
  ip link set eth9 master bond0
Where eth2, eth3, eth4, eth5, eth6, eth7, eth8 and eth9 are the PFs of the same HCA.
Signed-off-by: Shay Drory <[email protected]> Reviewed-by: Mark Bloch <[email protected]> Signed-off-by: Tariq Toukan <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13Merge branch 'ax25-fix-issues-of-ax25_dev-and-net_device'Jakub Kicinski2-34/+17
Duoming Zhou says: ==================== ax25: Fix issues of ax25_dev and net_device The first patch uses kernel universal linked list to implement ax25_dev_list, which makes the operation of the list easier. The second and third patch fix reference count leak issues of the object "ax25_dev" and "net_device". ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13ax25: Fix reference count leak issue of net_deviceDuoming Zhou1-6/+1
There is a reference count leak of the "net_device" object in ax25_dev_device_down(). When the ax25 device is shutting down, ax25_dev_device_down() drops the reference count of net_device either once or not at all, depending on whether we goto unlock_put, which causes a memory leak. To solve this issue, decrease the reference count of net_device after dev->ax25_ptr is set to NULL. Fixes: d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs") Suggested-by: Dan Carpenter <[email protected]> Signed-off-by: Duoming Zhou <[email protected]> Reviewed-by: Dan Carpenter <[email protected]> Link: https://lore.kernel.org/r/7ce3b23a40d9084657ba1125432f0ecc380cbc80.1715247018.git.duoming@zju.edu.cn Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13ax25: Fix reference count leak issues of ax25_devDuoming Zhou1-2/+1
ax25_addr_ax25dev() and ax25_dev_device_down() have reference count leak issues with the "ax25_dev" object. Memory leak issue in ax25_addr_ax25dev(): The reference count of the object "ax25_dev" can be increased multiple times in ax25_addr_ax25dev(). This will cause a memory leak. Memory leak issues in ax25_dev_device_down(): The reference count of ax25_dev is set to 1 in ax25_dev_device_up() and is then increased when ax25_dev is added to ax25_dev_list. As a result, the reference count of ax25_dev is 2. But when the device is shutting down, ax25_dev_device_down() drops the reference count once or twice depending on whether we goto unlock_put, which will cause a memory leak. As for the issue of ax25_addr_ax25dev(), it is impossible for one pointer to be on a list twice. So add a break in ax25_addr_ax25dev(). As for the issue of ax25_dev_device_down(), increase the reference count of ax25_dev once in ax25_dev_device_up() and decrease the reference count of ax25_dev after it is removed from the ax25_dev_list. Fixes: d01ffb9eee4a ("ax25: add refcount in ax25_dev to avoid UAF bugs") Suggested-by: Dan Carpenter <[email protected]> Signed-off-by: Duoming Zhou <[email protected]> Reviewed-by: Dan Carpenter <[email protected]> Link: https://lore.kernel.org/r/361bbf2a4b091e120006279ec3b382d73c4a0c17.1715247018.git.duoming@zju.edu.cn Signed-off-by: Jakub Kicinski <[email protected]>
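The lookup leak described above can be illustrated with a small user-space model. This is only a sketch, not the kernel code: `struct demo_dev`, `demo_lookup()` and `demo_put()` are hypothetical names, the list is a plain singly linked list, and a bare `int` stands in for the kernel's reference counter. The point it shows is that the lookup takes exactly one reference, on the entry it returns, because it stops at the first match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a reference-counted device entry. */
struct demo_dev {
	char addr[8];          /* simplified address key */
	int refcount;          /* plain int instead of a kernel refcount */
	struct demo_dev *next; /* simple singly linked list for the demo */
};

/*
 * Return the first entry matching addr, holding one reference on it.
 * Without the break, every further match would also gain a reference
 * that no caller ever drops, while only the last match is returned.
 */
static struct demo_dev *demo_lookup(struct demo_dev *head, const char *addr)
{
	struct demo_dev *res = NULL;

	for (struct demo_dev *d = head; d != NULL; d = d->next) {
		if (strcmp(d->addr, addr) == 0) {
			res = d;
			d->refcount++; /* the single "hold" for the caller */
			break;
		}
	}
	return res;
}

/* Drop the reference obtained from demo_lookup(). */
static void demo_put(struct demo_dev *d)
{
	d->refcount--;
}
```

A caller pairs each successful `demo_lookup()` with one `demo_put()`, so the count returns to its starting value once the caller is done with the entry.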
2024-05-13ax25: Use kernel universal linked list to implement ax25_dev_listDuoming Zhou2-27/+16
The original ax25_dev_list implements its own singly linked list, which is complicated and error-prone. For example, when deleting a node of ax25_dev_list in ax25_dev_device_down(), we have to operate on the head node and other nodes separately. This patch uses the kernel universal linked list to replace the original ax25_dev_list, which makes operating on ax25_dev_list easier. We should do "dev->ax25_ptr = ax25_dev;" and "dev->ax25_ptr = NULL;" while holding the spinlock, otherwise ax25_dev_device_up() and ax25_dev_device_down() could race. Suggested-by: Dan Carpenter <[email protected]> Signed-off-by: Duoming Zhou <[email protected]> Reviewed-by: Dan Carpenter <[email protected]> Link: https://lore.kernel.org/r/85bba3af651ca0e1a519da8d0d715b949891171c.1715247018.git.duoming@zju.edu.cn Signed-off-by: Jakub Kicinski <[email protected]>
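To make the head-node argument concrete, here is a minimal user-space sketch of a circular doubly linked list in the style of the kernel one. It is an illustration under assumptions, not the kernel's `list.h`: all `demo_` names are hypothetical, and `demo_list_del()` re-initialises the removed entry (closer to what `list_del_init()` does) rather than poisoning its pointers. Because the head is itself a node in the ring, removing the first, middle or last entry is the same two pointer updates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space imitation of a circular doubly linked list node. */
struct demo_list_head {
	struct demo_list_head *next, *prev;
};

/* An empty list is a head that points at itself. */
static void demo_list_init(struct demo_list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert entry right after head. */
static void demo_list_add(struct demo_list_head *entry,
			  struct demo_list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink entry; identical code for first, middle and last entries. */
static void demo_list_del(struct demo_list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->next = entry; /* leave the entry self-linked */
	entry->prev = entry;
}

/* Count the entries by walking until we are back at the head. */
static int demo_list_count(const struct demo_list_head *head)
{
	int n = 0;

	for (const struct demo_list_head *p = head->next; p != head; p = p->next)
		n++;
	return n;
}
```

A hand-rolled singly linked list, by contrast, needs one branch to update the list head pointer and a separate walk to find the predecessor of any other node, which is the special-casing the commit message refers to.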
2024-05-13test: hsr: Extend the hsr_redbox.sh to have more SAN devices connectedLukasz Majewski1-22/+49
After this change the single SAN device (ns3eth1) is now replaced with two SAN devices - respectively ns4eth1 and ns5eth1. It is possible to extend this script to have more SAN devices connected by adding them to ns3br1 bridge. Signed-off-by: Lukasz Majewski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13Merge branch 'net-dsa-microchip-dcb-fixes'Jakub Kicinski3-69/+85
Oleksij Rempel says: ==================== net: dsa: microchip: DCB fixes This patch series addresses the recommendation to rename IPV to IPM to avoid confusion with the IPV name used in 802.1Qci PSFP. It also restores the default "PCP only" configuration as the source of priorities to avoid possible regressions. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13net: dsa: microchip: dcb: set default apptrust to PCP onlyOleksij Rempel1-18/+3
Before DCB support, the KSZ driver had only PCP as a source of packet priority values. To avoid regressions, make PCP-only the default value. Users will need to enable DSCP support manually. This patch does not affect other KSZ8 related quirks. Users will still be warned when setting unsupported configurations for port 2. Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Arun Ramadoss <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13net: dsa: microchip: dcb: add comments for DSCP related functionsOleksij Rempel1-0/+31
All other functions are commented. Add the missing comments to the following functions: ksz_set_global_dscp_entry() ksz_port_add_dscp_prio() ksz_port_del_dscp_prio() Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Arun Ramadoss <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13net: dsa: microchip: dcb: rename IPV to IPMOleksij Rempel3-51/+51
IPV is a term added and used in 802.1Qci PSFP, merged into 802.1Q (from 802.1Q-2018), for other functions. Although the KSZ mechanism does a similar operation, holding a temporal priority value internally (as the name suggests), the KSZ datasheet doesn't use the term IPV (Internal Priority Value). To avoid confusion later when PSFP is in the Linux world, it is better to rename IPV to IPM (Internal Priority Mapping). In addition, LAN937x documentation already uses IPV for 802.1Qci PSFP related functionality. Suggested-by: Woojung Huh <[email protected]> Signed-off-by: Oleksij Rempel <[email protected]> Reviewed-by: Woojung Huh <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13l2tp: Support different protocol versions with same IP/port quadrupleSamuel Thibault1-8/+10
628bc3e5a1be ("l2tp: Support several sockets with same IP/port quadruple") added support for several L2TPv2 tunnels using the same IP/port quadruple, but if an L2TPv3 socket exists it could eat all the traffic. We thus have to first use the version from the packet to get the proper tunnel, and only then check that the version matches. Signed-off-by: Samuel Thibault <[email protected]> Reviewed-by: James Chapman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13net: usb: ax88179_178a: fix link status when link is set to down/upJose Ignacio Tornos Martinez1-11/+26
The idea was to keep only one reset at initialization stage in order to reduce the total delay, or the reset from usbnet_probe or the reset from usbnet_open. I have seen that restarting from usbnet_probe is necessary to avoid doing too complex things. But when the link is set to down/up (for example to configure a different mac address) the link is not correctly recovered unless a reset is commanded from usbnet_open. So, detect the initialization stage (first call) to not reset from usbnet_open after the reset from usbnet_probe and after this stage, always reset from usbnet_open too (when the link needs to be rechecked). Apply to all the possible devices, the behavior now is going to be the same. cc: [email protected] # 6.6+ Fixes: 56f78615bcb1 ("net: usb: ax88179_178a: avoid writing the mac address before first reading") Reported-by: Isaac Ganoung <[email protected]> Reported-by: Yongqin Liu <[email protected]> Signed-off-by: Jose Ignacio Tornos Martinez <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13net: smc91x: Fix m68k kernel compilation for ColdFire CPUThorsten Blum1-2/+2
Compiling the m68k kernel with support for the ColdFire CPU family fails with the following error: In file included from drivers/net/ethernet/smsc/smc91x.c:80: drivers/net/ethernet/smsc/smc91x.c: In function ‘smc_reset’: drivers/net/ethernet/smsc/smc91x.h:160:40: error: implicit declaration of function ‘_swapw’; did you mean ‘swap’? [-Werror=implicit-function-declaration] 160 | #define SMC_outw(lp, v, a, r) writew(_swapw(v), (a) + (r)) | ^~~~~~ drivers/net/ethernet/smsc/smc91x.h:904:25: note: in expansion of macro ‘SMC_outw’ 904 | SMC_outw(lp, x, ioaddr, BANK_SELECT); \ | ^~~~~~~~ drivers/net/ethernet/smsc/smc91x.c:250:9: note: in expansion of macro ‘SMC_SELECT_BANK’ 250 | SMC_SELECT_BANK(lp, 2); | ^~~~~~~~~~~~~~~ cc1: some warnings being treated as errors The function _swapw() was removed in commit d97cf70af097 ("m68k: use asm-generic/io.h for non-MMU io access functions"), but is still used in drivers/net/ethernet/smsc/smc91x.h. Use ioread16be() and iowrite16be() to resolve the error. Cc: [email protected] Fixes: d97cf70af097 ("m68k: use asm-generic/io.h for non-MMU io access functions") Signed-off-by: Thorsten Blum <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-13Merge tag 'rust-6.10' of https://github.com/Rust-for-Linux/linuxLinus Torvalds53-10043/+878
Pull Rust updates from Miguel Ojeda: "The most notable change is the drop of the 'alloc' in-tree fork. This is nicely reflected in the diffstat as a ~10k lines drop. In turn, this makes the version upgrades way simpler and smaller in the future, e.g. the latest one in commit 56f64b370612 ("rust: upgrade to Rust 1.78.0"). More importantly, this increases the chances that a newer compiler version just works, which in turn means supporting several compiler versions is easier now. Thus we will look into finally setting a minimum version in the near future. Toolchain and infrastructure: - Upgrade to Rust 1.78.0 This time around, due to how the kernel and Rust schedules have aligned, there are two upgrades in fact. These allow us to remove one more unstable feature ('offset_of') from the list, among other improvements - Drop 'alloc' in-tree fork of the standard library crate, which means all the unstable features used by 'alloc' (~30 language ones, ~60 library ones) are not a concern anymore - Support DWARFv5 via the '-Zdwarf-version' flag - Support zlib and zstd debuginfo compression via the '-Zdebuginfo-compression' flag 'kernel' crate: - Support allocation flags ('GFP_*'), particularly in 'Box' (via 'BoxExt'), 'Vec' (via 'VecExt'), 'Arc' and 'UniqueArc', as well as in the 'init' module APIs - Remove usage of the 'allocator_api' unstable feature - Remove 'try_' prefix in allocation APIs' names - Add 'VecExt' (an extension trait) to be able to drop the 'alloc' fork - Add the '{make,to}_{upper,lower}case()' methods to 'CStr'/'CString' - Add the 'as_ptr' method to 'ThisModule' - Add the 'from_raw' method to 'ArcBorrow' - Add the 'into_unique_or_drop' method to 'Arc' - Display column number in the 'dbg!' 
macro output by applying the equivalent change done to the standard library one - Migrate 'Work' to '#[pin_data]' thanks to the changes in the 'macros' crate, which allows to remove an unsafe call in its 'new' associated function - Prevent namespacing issues when using the '[try_][pin_]init!' macros by changing the generated name of guard variables - Make the 'get' method in 'Opaque' const - Implement the 'Default' trait for 'LockClassKey' - Remove unneeded 'kernel::prelude' imports from doctests - Remove redundant imports 'macros' crate: - Add 'decl_generics' to 'parse_generics()' to support default values, and use that to allow them in '#[pin_data]' Helpers: - Trivial English grammar fix Documentation: - Add section on Rust Kselftests to the 'Testing' document - Expand the 'Abstractions vs. bindings' section of the 'General Information' document" * tag 'rust-6.10' of https://github.com/Rust-for-Linux/linux: (31 commits) rust: alloc: fix dangling pointer in VecExt<T>::reserve() rust: upgrade to Rust 1.78.0 rust: kernel: remove redundant imports rust: sync: implement `Default` for `LockClassKey` docs: rust: extend abstraction and binding documentation docs: rust: Add instructions for the Rust kselftest rust: remove unneeded `kernel::prelude` imports from doctests rust: update `dbg!()` to format column number rust: helpers: Fix grammar in comment rust: init: change the generated name of guard variables rust: sync: add `Arc::into_unique_or_drop` rust: sync: add `ArcBorrow::from_raw` rust: types: Make Opaque::get const rust: kernel: remove usage of `allocator_api` unstable feature rust: init: update `init` module to take allocation flags rust: sync: update `Arc` and `UniqueArc` to take allocation flags rust: alloc: update `VecExt` to take allocation flags rust: alloc: introduce the `BoxExt` trait rust: alloc: introduce allocation flags rust: alloc: remove our fork of the `alloc` crate ...
2024-05-13ring-buffer/selftest: Add ring-buffer mapping testVincent Donnefort4-0/+305
Map a ring-buffer, validate the meta-page before and after emitting a few events. Also check ring-buffer mapping boundaries and finally ensure the tracing snapshot is mutually exclusive. Link: https://lore.kernel.org/linux-trace-kernel/[email protected] Cc: Shuah Khan <[email protected]> Cc: Shuah Khan <[email protected]> Cc: [email protected] Acked-by: Muhammad Usama Anjum <[email protected]> Signed-off-by: Vincent Donnefort <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2024-05-13Documentation: tracing: Add ring-buffer mappingVincent Donnefort2-0/+107
It is now possible to mmap() a ring-buffer to stream its content. Add some documentation and a code example. Link: https://lore.kernel.org/linux-trace-kernel/[email protected] Signed-off-by: Vincent Donnefort <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2024-05-13tracing: Allow user-space mapping of the ring-bufferVincent Donnefort3-5/+102
Currently, user-space extracts data from the ring-buffer via splice, which is handy for storage or network sharing. However, due to splice limitations, it is impossible to do real-time analysis without a copy. A solution for that problem is to let the user-space map the ring-buffer directly. The mapping is exposed via the per-CPU file trace_pipe_raw. The first element of the mapping is the meta-page. It is followed by each subbuffer constituting the ring-buffer, ordered by their unique page ID: * Meta-page -- include/uapi/linux/trace_mmap.h for a description * Subbuf ID 0 * Subbuf ID 1 ... It is therefore easy to translate a subbuf ID into an offset in the mapping: reader_id = meta->reader->id; reader_offset = meta->meta_page_size + reader_id * meta->subbuf_size; When new data is available, the mapper must call a newly introduced ioctl: TRACE_MMAP_IOCTL_GET_READER. This will update the Meta-page reader ID to point to the next reader containing unread data. Mapping will prevent snapshot and buffer size modifications. Link: https://lore.kernel.org/linux-trace-kernel/[email protected] CC: <[email protected]> Signed-off-by: Vincent Donnefort <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
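The ID-to-offset translation quoted in the message can be sketched as a small helper. This is a hedged illustration only: `struct demo_meta_page` is a hypothetical, trimmed-down stand-in and not the UAPI structure from include/uapi/linux/trace_mmap.h. It models just the fields the message names (`meta_page_size`, `subbuf_size`, the reader ID), plus an assumed `nr_subbufs` field used purely for a bounds check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the meta-page; not the real UAPI layout. */
struct demo_meta_page {
	uint32_t meta_page_size; /* bytes occupied by the meta-page */
	uint32_t subbuf_size;    /* bytes per sub-buffer */
	uint32_t nr_subbufs;     /* assumed field: sub-buffers in the mapping */
	uint32_t reader_id;      /* ID of the current reader sub-buffer */
};

/* Byte offset of sub-buffer `id`, counted from the start of the mapping. */
static size_t demo_subbuf_offset(const struct demo_meta_page *meta, uint32_t id)
{
	assert(id < meta->nr_subbufs); /* reject IDs outside the mapping */
	return (size_t)meta->meta_page_size + (size_t)id * meta->subbuf_size;
}

/* Offset of the sub-buffer the reader currently points at. */
static size_t demo_reader_offset(const struct demo_meta_page *meta)
{
	return demo_subbuf_offset(meta, meta->reader_id);
}
```

With a 4096-byte meta-page, 4096-byte sub-buffers and a reader ID of 2, the reader data starts 12288 bytes into the mapping. In a real consumer the reader ID would be re-read from the meta-page after each TRACE_MMAP_IOCTL_GET_READER call before recomputing the offset.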
2024-05-13ring-buffer: Introducing ring-buffer mapping functionsVincent Donnefort3-3/+463
In preparation for allowing the user-space to map a ring-buffer, add a set of mapping functions: ring_buffer_{map,unmap}() And controls on the ring-buffer: ring_buffer_map_get_reader() /* swap reader and head */ Mapping the ring-buffer also involves: A unique ID for each subbuf of the ring-buffer, currently they are only identified through their in-kernel VA. A meta-page, where ring-buffer statistics and a description of the current reader are stored. The linear mapping exposes the meta-page, and each subbuf of the ring-buffer, ordered following their unique ID, assigned during the first mapping. Once mapped, no subbuf can get in or out of the ring-buffer: the buffer size will remain unmodified and the splice enabling functions will in reality simply memcpy the data instead of swapping subbufs. Link: https://lore.kernel.org/linux-trace-kernel/[email protected] CC: <[email protected]> Signed-off-by: Vincent Donnefort <[email protected]> Acked-by: David Hildenbrand <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>
2024-05-13ring-buffer: Allocate sub-buffers with __GFP_COMPVincent Donnefort1-3/+3
In preparation for the ring-buffer memory mapping, allocate compound pages for the ring-buffer sub-buffers to enable us to map them to user-space with vm_insert_pages(). Link: https://lore.kernel.org/linux-trace-kernel/[email protected] Acked-by: David Hildenbrand <[email protected]> Signed-off-by: Vincent Donnefort <[email protected]> Signed-off-by: Steven Rostedt (Google) <[email protected]>