blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message	Author	Files	Lines
2024-05-08	rxrpc: Fix congestion control algorithm	David Howells	3	-10/+2
	Make the following fixes to the congestion control algorithm: (1) Don't vary the cwnd starting value by the size of RXRPC_TX_SMSS since that's currently held constant - set to the size of a jumbo subpacket payload so that we can create jumbo packets on the fly. The current code invariably picks 3 as the starting value. Further, the starting cwnd needs to be an even number because we ack every other packet, so set it to 4. (2) Don't cut ssthresh when we see an ACK come from the peer with a receive window (rwind) less than ssthresh. ssthresh keeps track of characteristics of the connection whereas rwind may be reduced by the peer for any reason - and may be reduced to 0. Fixes: 1fc4fa2ac93d ("rxrpc: Fix congestion management") Fixes: 0851115090a3 ("rxrpc: Reduce ssthresh to peer's receive window") Signed-off-by: David Howells <[email protected]> Suggested-by: Simon Wilkinson <[email protected]> cc: Marc Dionne <[email protected]> cc: [email protected] Reviewed-by: Jeffrey Altman <[email protected] <mailto:[email protected]>> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
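The two fixes described above can be modelled in a few lines. This is a minimal Python sketch, not the kernel code: the function names are hypothetical, the old thresholds are assumed to follow the RFC 5681 initial-window rule, and the 1412-byte segment size is an assumed value for the jumbo subpacket payload.

```python
# Illustrative model only; names and constants are assumptions, not kernel symbols.
ASSUMED_TX_SMSS = 1412  # assumed jumbo subpacket payload size, held constant


def old_initial_cwnd(smss: int) -> int:
    """RFC 5681-style initial window that varies with the segment size."""
    if smss > 2190:
        return 2
    if smss > 1095:
        return 3
    return 4


def new_initial_cwnd() -> int:
    """Fix (1): a fixed, even starting window, since every other packet is ACKed."""
    return 4


def ssthresh_after_ack(ssthresh: int, rwind: int, cut_to_rwind: bool) -> int:
    """Fix (2): with cut_to_rwind=False, a small (or zero) rwind leaves ssthresh alone."""
    if cut_to_rwind and rwind < ssthresh:
        return rwind
    return ssthresh
```

With a constant segment size in the assumed range, the old rule always lands on 3, which is odd. The new rule returns 4 regardless, and a peer advertising rwind of 0 no longer collapses ssthresh.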
2024-05-08	bpf, arm64: Add support for lse atomics in bpf_arena	Puranjay Mohan	2	-9/+40
	When LSE atomics are available, BPF atomic instructions are implemented as single ARM64 atomic instructions, therefore it is easy to enable these in bpf_arena using the currently available exception handling setup. LL_SC atomics use loops and therefore would need more work to enable in bpf_arena. Enable LSE atomics based instructions in bpf_arena and use the bpf_jit_supports_insn() callback to reject atomics in bpf_arena if LSE atomics are not available. All atomics and arena_atomics selftests are passing: [root@ip-172-31-2-216 bpf]# ./test_progs -a atomics,arena_atomics #3/1 arena_atomics/add:OK #3/2 arena_atomics/sub:OK #3/3 arena_atomics/and:OK #3/4 arena_atomics/or:OK #3/5 arena_atomics/xor:OK #3/6 arena_atomics/cmpxchg:OK #3/7 arena_atomics/xchg:OK #3 arena_atomics:OK #10/1 atomics/add:OK #10/2 atomics/sub:OK #10/3 atomics/and:OK #10/4 atomics/or:OK #10/5 atomics/xor:OK #10/6 atomics/cmpxchg:OK #10/7 atomics/xchg:OK #10 atomics:OK Summary: 2/14 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Puranjay Mohan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-08	selftests: test_bridge_neigh_suppress.sh: Fix failures due to duplicate MAC	Ido Schimmel	1	-11/+3
	When creating the topology for the test, three veth pairs are created in the initial network namespace before being moved to one of the network namespaces created by the test. On systems where systemd-udev uses MACAddressPolicy=persistent (default since systemd version 242), this will result in some net devices having the same MAC address since they were created with the same name in the initial network namespace. In turn, this leads to arping / ndisc6 failing since packets are dropped by the bridge's loopback filter. Fix by creating each net device in the correct network namespace instead of moving it there from the initial network namespace. Reported-by: Jakub Kicinski <[email protected]> Closes: https://lore.kernel.org/netdev/[email protected]/ Fixes: 7648ac72dcd7 ("selftests: net: Add bridge neighbor suppression test") Signed-off-by: Ido Schimmel <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-08	test: hsr: Call cleanup_all_ns when hsr_redbox.sh script exits	Lukasz Majewski	1	-0/+2
	Without this change the created netns instances are not cleared after this script execution. To fix this problem the cleanup_all_ns function from ../lib.sh is called. Signed-off-by: Lukasz Majewski <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-08	ax25: Remove superfluous "return" from ax25_ds_set_timer	Joel Granados	1	-1/+0
	Remove the explicit call to "return" in the void ax25_ds_set_timer function that was introduced in 78a7b5dbc060 ("ax.25: x.25: Remove the now superfluous sentinel elements from ctl_table array"). Signed-off-by: Joel Granados <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-08	ipvs: allow some sysctls in non-init user namespaces	Alexander Mikhalitsyn	1	-4/+15
	Let's make all IPVS sysctls writable even when the network namespace is owned by a non-initial user namespace. Let's make a few sysctls read-only for non-privileged users: - sync_qlen_max - sync_sock_size - run_estimation - est_cpulist - est_nice I'm trying to be conservative with this to prevent introducing any security issues in there. Maybe we can allow more sysctls to be writable, but let's do this on demand, when we see a real use case. This patch is motivated by a user request in the LXC project [1]. Having this can help with running some Kubernetes [2] or Docker Swarm [3] workloads inside system containers. Link: https://github.com/lxc/lxc/issues/4278 [1] Link: https://github.com/kubernetes/kubernetes/blob/b722d017a34b300a2284b890448e5a605f21d01e/pkg/proxy/ipvs/proxier.go#L103 [2] Link: https://github.com/moby/libnetwork/blob/3797618f9a38372e8107d8c06f6ae199e1133ae8/osl/namespace_linux.go#L682 [3] Cc: Julian Anastasov <[email protected]> Cc: Simon Horman <[email protected]> Cc: Pablo Neira Ayuso <[email protected]> Cc: Jozsef Kadlecsik <[email protected]> Cc: Florian Westphal <[email protected]> Signed-off-by: Alexander Mikhalitsyn <[email protected]> Acked-by: Julian Anastasov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-08	ipvs: add READ_ONCE barrier for ipvs->sysctl_amemthresh	Alexander Mikhalitsyn	1	-7/+7
	Cc: Julian Anastasov <[email protected]> Cc: Simon Horman <[email protected]> Cc: Pablo Neira Ayuso <[email protected]> Cc: Jozsef Kadlecsik <[email protected]> Cc: Florian Westphal <[email protected]> Suggested-by: Julian Anastasov <[email protected]> Signed-off-by: Alexander Mikhalitsyn <[email protected]> Acked-by: Julian Anastasov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-08	ipv6: Fix potential uninit-value access in __ip6_make_skb()	Shigeru Yoshida	1	-1/+1
	As it was done in commit fc1092f51567 ("ipv4: Fix uninit-value access in __ip_make_skb()") for IPv4, check FLOWI_FLAG_KNOWN_NH on fl6->flowi6_flags instead of testing HDRINCL on the socket to avoid a race condition which causes uninit-value access. Fixes: ea30388baebc ("ipv6: Fix an uninit variable access bug in __ip6_make_skb()") Signed-off-by: Shigeru Yoshida <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-08	net: stmmac: dwmac-ipq806x: account for rgmii-txid/rxid/id phy-mode	Christian Marangi	1	-0/+12
	Currently the ipq806x dwmac driver is almost always used attached to the CPU port of a switch, and phy-mode was always set to "rgmii" or "sgmii". Some devices came up with a special configuration where the PHY is directly attached to the GMAC port, and in those cases phy-mode needs to be set to "rgmii-id" for the PHY to work correctly and receive packets. Since the driver supports only the "rgmii" and "sgmii" modes, when "rgmii-id" (or a variant) is set, the mode is rejected and probe fails. Add support for these phy-modes as well, to correctly set up PHYs that require a delay applied to tx/rx. Signed-off-by: Christian Marangi <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-08	net: bridge: switchdev: Improve error message for port_obj_add/del functions	Oleksij Rempel	1	-4/+95
	Enhance the error reporting mechanism in the switchdev framework to provide more informative and user-friendly error messages. Following feedback from users struggling to understand the implications of error messages like "failed (err=-28) to add object (id=2)", this update aims to clarify what operation failed and how this might impact the system or network. With this change, error messages now include a description of the failed operation, the specific object involved, and a brief explanation of the potential impact on the system. This approach helps administrators and developers better understand the context and severity of errors, facilitating quicker and more effective troubleshooting. Example of the improved logging: [ 70.516446] ksz-switch spi0.0 uplink: Failed to add Port Multicast Database entry (object id=2) with error: -ENOSPC (-28). [ 70.516446] Failure in updating the port's Multicast Database could lead to multicast forwarding issues. [ 70.516446] Current HW/SW setup lacks sufficient resources. This comprehensive update includes handling for a range of switchdev object IDs, ensuring that most operations within the switchdev framework benefit from clearer error reporting. Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Oleksij Rempel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
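The shape of the improved message can be illustrated with a short sketch. This is a Python model under stated assumptions, not the switchdev implementation: `OBJECT_NAMES` and `describe_failure` are hypothetical names, and only the one object ID quoted in the commit message is filled in.

```python
import errno

# Hypothetical lookup table; the kernel patch covers more switchdev object IDs.
OBJECT_NAMES = {2: "Port Multicast Database entry"}


def describe_failure(dev: str, action: str, obj_id: int, err: int) -> str:
    """Build a message shaped like the improved log line quoted in the commit."""
    name = OBJECT_NAMES.get(obj_id, "Unknown object")
    # Translate the numeric errno into its symbolic name, e.g. 28 -> ENOSPC.
    symbol = errno.errorcode.get(-err, "UNKNOWN")
    return (f"{dev}: Failed to {action} {name} (object id={obj_id}) "
            f"with error: -{symbol} ({err}).")
```

Showing the symbolic errno next to the raw number is what turns "err=-28" into something an administrator can act on without looking up the value.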
2024-05-08	net: phy: marvell-88q2xxx: add support for Rev B1 and B2	Gregor Herburger	1	-16/+103
	Different revisions of the Marvell 88q2xxx PHY need different init sequences. Add init sequences for Rev B1 and Rev B2. The Rev B2 init sequence skips one register write. Tested-by: Dimitri Fedrau <[email protected]> Signed-off-by: Gregor Herburger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-08	appletalk: Improve handling of broadcast packets	Vincent Duvert	1	-3/+16
	When a broadcast AppleTalk packet is received, prefer queuing it on the socket whose address matches the address of the interface that received the packet (and is listening on the correct port). Userspace applications that handle such packets will usually send a response on the same socket that received the packet; this fix allows the response to be sent on the correct interface. If a socket matching the interface's address is not found, an arbitrary socket listening on the correct port will be used, if any. This matches the implementation's previous behavior. Fixes atalkd's responses to network information requests when multiple network interfaces are configured to use AppleTalk. Link: https://lore.kernel.org/netdev/[email protected]/ Link: https://gist.github.com/VinDuv/4db433b6dce39d51a5b7847ee749b2a4 Signed-off-by: Vincent Duvert <[email protected]> Signed-off-by: Doug Brown <[email protected]> Signed-off-by: David S. Miller <[email protected]>
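The selection rule described above (prefer the socket bound to the receiving interface's address, otherwise fall back to any socket on the port) might be sketched as follows. This is an illustrative Python model, not the kernel's socket lookup: `Sock` and `pick_broadcast_socket` are hypothetical names, and addresses are plain strings for simplicity.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Sock:
    addr: str  # address the socket is bound to (illustrative string form)
    port: int


def pick_broadcast_socket(socks: Sequence[Sock], iface_addr: str,
                          port: int) -> Optional[Sock]:
    """Prefer the socket bound to the receiving interface; else any on the port."""
    fallback = None
    for s in socks:
        if s.port != port:
            continue
        if s.addr == iface_addr:
            return s          # exact match on the receiving interface wins
        if fallback is None:
            fallback = s      # remember the first socket listening on the port
    return fallback
```

The fallback branch keeps the previous behavior: when no socket matches the interface, an arbitrary listener on the port still receives the packet.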
2024-05-08	net/ipv4: add tracepoint for icmp_send	Peilin He	2	-0/+71
	Introduce a tracepoint for icmp_send, which can help users to get more detail information conveniently when icmp abnormal events happen. 1. Giving an usecase example: ============================= When an application experiences packet loss due to an unreachable UDP destination port, the kernel will send an exception message through the icmp_send function. By adding a trace point for icmp_send, developers or system administrators can obtain detailed information about the UDP packet loss, including the type, code, source address, destination address, source port, and destination port. This facilitates the trouble-shooting of UDP packet loss issues especially for those network-service applications. 2. Operation Instructions: ========================== Switch to the tracing directory. cd /sys/kernel/tracing Filter for destination port unreachable. echo "type==3 && code==3" > events/icmp/icmp_send/filter Enable trace event. echo 1 > events/icmp/icmp_send/enable 3. Result View: ================ udp_client_erro-11370 [002] ...s.12 124.728002: icmp_send: icmp_send: type=3, code=3. From 127.0.0.1:41895 to 127.0.0.1:6666 ulen=23 skbaddr=00000000589b167a Signed-off-by: Peilin He <[email protected]> Signed-off-by: xu xin <[email protected]> Reviewed-by: Yunkai Zhang <[email protected]> Cc: Yang Yang <[email protected]> Cc: Liu Chun <[email protected]> Cc: Xuexin Jiang <[email protected]> Reviewed-by: Steven Rostedt (Google) <[email protected]> Signed-off-by: David S. Miller <[email protected]>
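For post-processing the trace output, a small parser can pull the fields out of each event line. This is a sketch written against the single sample line in the commit message. The regular expression and the `parse_icmp_send` helper are assumptions, and the event format may differ on other kernel versions.

```python
import re

# Pattern written against the sample line in the commit message; other kernels may differ.
ICMP_SEND_RE = re.compile(
    r"icmp_send: type=(?P<type>\d+), code=(?P<code>\d+)\. "
    r"From (?P<saddr>[\d.]+):(?P<sport>\d+) to (?P<daddr>[\d.]+):(?P<dport>\d+) "
    r"ulen=(?P<ulen>\d+) skbaddr=(?P<skbaddr>[0-9a-f]+)"
)


def parse_icmp_send(line: str):
    """Return the event fields as a dict, or None if the line does not match."""
    m = ICMP_SEND_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    for key in ("type", "code", "sport", "dport", "ulen"):
        fields[key] = int(fields[key])
    return fields


SAMPLE = ("udp_client_erro-11370 [002] ...s.12 124.728002: icmp_send: icmp_send: "
          "type=3, code=3. From 127.0.0.1:41895 to 127.0.0.1:6666 ulen=23 "
          "skbaddr=00000000589b167a")
```

Calling `parse_icmp_send(SAMPLE)` yields the type, code, addresses and ports as a dictionary, which is convenient for grouping port-unreachable events by destination.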
2024-05-08	net: bridge: fix corrupted ethernet header on multicast-to-unicast	Felix Fietkau	1	-2/+7
	The change from skb_copy to pskb_copy unfortunately changed the data copying to omit the ethernet header, since it was pulled before reaching this point. Fix this by calling __skb_push/pull around pskb_copy. Fixes: 59c878cbcdd8 ("net: bridge: fix multicast-to-unicast with fraglist GSO") Signed-off-by: Felix Fietkau <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-08	Merge branch 'ksz-dcb-dscp'	David S. Miller	18	-86/+2133
	Oleksij Rempel says: ==================== add DCB and DSCP support for KSZ switches This patch series is aimed at improving support for DCB (Data Center Bridging) and DSCP (Differentiated Services Code Point) on KSZ switches. The main goal is to introduce global DSCP and PCP (Priority Code Point) mapping support, addressing the limitation of KSZ switches not having per-port DSCP priority mapping. This involves extending the DSA framework with new callbacks for managing trust settings for global DSCP and PCP maps. Additionally, we introduce IEEE 802.1q helpers for default configurations, benefiting other drivers too. Change logs are in separate patches. Compared to v6 this series includes some new patches for DSCP global mapping support and QoS selftest script for KSZ9477 switches. ==================== Signed-off-by: David S. Miller <[email protected]>
2024-05-08	selftests: microchip: add test for QoS support on KSZ9477 switch family	Oleksij Rempel	1	-0/+668
	Add tests covering following functionality on KSZ9477 switch family: - default port priority - global DSCP to Internal Priority Mapping - apptrust configuration This script was tested on KSZ9893R Signed-off-by: Oleksij Rempel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-08	net: dsa: microchip: add support DSCP priority mapping	Oleksij Rempel	3	-15/+50
	Microchip KSZ and LAN variants do not have per port DSCP priority configuration. Instead there is a global DSCP mapping table. This patch provides write access to this global DSCP map. In case entry is "deleted", we map corresponding DSCP entry to a best effort prio, which is expected to be the default priority for all untagged traffic. Signed-off-by: Oleksij Rempel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-08	net: dsa: add support switches global DSCP priority mapping	Oleksij Rempel	2	-0/+84
	Some switches like Microchip KSZ variants do not support per port DSCP priority configuration. Instead there is a global DSCP mapping table. To handle it, we will accept set/del request to any of user ports to make global configuration and update dcb app entries for all other ports. Signed-off-by: Oleksij Rempel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-08	net: dsa: microchip: let DCB code do PCP and DSCP policy configuration	Oleksij Rempel	2	-12/+0
	802.1P (PCP) and DiffServ (DSCP) are handled now by DCB code. Let it do all needed initial configuration. Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Arun Ramadoss <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-08	net: dsa: microchip: init predictable IPV to queue mapping for all non KSZ8xxx variants	Oleksij Rempel	1	-24/+33
	Init priority to queue mapping the way it is shown in the IEEE 802.1Q mapping example. Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Arun Ramadoss <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-08	net: dsa: microchip: enable ETS support for KSZ989X variants	Oleksij Rempel	2	-12/+1
	I tested ETS support on KSZ9893, so it should also work on other KSZ989X variants, which were not yet listed as supported. With this change, only the ksz8 family of chips remains officially unsupported. Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Arun Ramadoss <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-08	net: dsa: microchip: dcb: add special handling for KSZ88X3 family	Oleksij Rempel	4	-3/+242
	KSZ88X3 switches have different behavior on different ports: - It seems to be not possible to disable VLAN PCP classification on port 2. It means, as soon as multiqueue support is enabled, frames with VLAN tag will get PCP prios. This behavior does not affect Port 1 - it is possible to disable PCP prios. - DSCP classification is not working on Port 2. Since there are still usable configuration combinations, I added some quirks to make sure the user will get an appropriate error message if an impossible configuration is chosen. Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Arun Ramadoss <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-08	net: dsa: microchip: add support for different DCB app configurations	Oleksij Rempel	6	-2/+583
	Add DCB support to configure app trust sources and default port priority. The following commands can be used for testing: "dcb apptrust set dev lan1 order pcp dscp" and "dcb app replace dev lan1 default-prio 3". Since it is not possible to configure DSCP-Prio mapping per port, this patch provides only the ability to read the switch's global dscp-prio mapping and a way to enable/disable app trust for DSCP. Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Arun Ramadoss <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-08	net: dsa: microchip: add multi queue support for KSZ88X3 variants	Oleksij Rempel	2	-35/+61
	KSZ88X3 switches support up to 4 queues. Rework ksz8795_set_prio_queue() to support the KSZ8795 and KSZ88X3 families of switches. By default, configure KSZ88X3 to use one queue, since it needs special handling due to priority-related errata. Errata handling is implemented in a separate patch. Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Arun Ramadoss <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-08	net: add IEEE 802.1q specific helpers	Oleksij Rempel	5	-0/+379
	The IEEE 802.1q specification provides recommendations and examples which can be used as good default values for different drivers. This patch implements the mapping examples documented in IEEE 802.1Q-2022 in Annex I "I.3 Traffic type to traffic class mapping", plus IETF DSCP naming and a DSCP to Traffic Type mapping inspired by RFC8325. These helpers will be used in follow-up patches for the dsa/microchip DCB implementation. Signed-off-by: Oleksij Rempel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-08	net: dsa: microchip: add IPV information support	Oleksij Rempel	2	-3/+20
	Most of Microchip KSZ switches use Internal Priority Value associated with every frame. For example, it is possible to map any VLAN PCP or DSCP value to IPV and at the end, map IPV to a queue. Since amount of IPVs is not equal to amount of queues, add this information and make use of it in some functions. Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Arun Ramadoss <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-08	net: dsa: add support for DCB get/set apptrust configuration	Oleksij Rempel	2	-0/+32
	Add DCB support to get/set trust configuration for different packet priority information sources. Some switches allow choosing different sources of packet priority classification. For example, on KSZ switches it is possible to configure VLAN PCP and/or DSCP sources. Signed-off-by: Oleksij Rempel <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-08	virtiofs: include a newline in sysfs tag	Brian Foster	1	-1/+1
	The internal tag string doesn't contain a newline. Append one when emitting the tag via sysfs. [Stefan] Orthogonal to the newline issue, sysfs_emit(buf, "%s", fs->tag) is needed to prevent format string injection. Signed-off-by: Brian Foster <[email protected]> Fixes: a8f62f50b4e4 ("virtiofs: export filesystem tags through sysfs") Signed-off-by: Miklos Szeredi <[email protected]>
2024-05-07	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue	Jakub Kicinski	10	-134/+82
	Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-05-06 (ice) This series contains updates to ice driver only. Paul adds support for additional E830 devices and adjusts naming for existing E830 devices. Marcin commonizes a couple of TC setup calls to reduce duplicated code. Mateusz adds ice_vsi_cfg_params into ice_vsi to consolidate info. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: refactor struct ice_vsi_cfg_params to be inside of struct ice_vsi ice: Deduplicate tc action setup ice: update E830 device ids and comments ice: add additional E830 device ids ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-07	net: usb: sr9700: stop lying about skb->truesize	Eric Dumazet	1	-7/+3
	Some usb drivers set small skb->truesize and break core networking stacks. In this patch, I removed one of the skb->truesize override. I also replaced one skb_clone() by an allocation of a fresh and small skb, to get minimally sized skbs, like we did in commit 1e2c61172342 ("net: cdc_ncm: reduce skb truesize in rx path") and 4ce62d5b2f7a ("net: usb: ax88179_178a: stop lying about skb->truesize") Fixes: c9b37458e956 ("USB2NET : SR9700 : One chip USB 1.1 USB2NET SR9700Device Driver Support") Signed-off-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-07	net: usb: smsc75xx: stop lying about skb->truesize	Eric Dumazet	1	-8/+4
	Some usb drivers try to set small skb->truesize and break core networking stacks. In this patch, I removed one of the skb->truesize override. I also replaced one skb_clone() by an allocation of a fresh and small skb, to get minimally sized skbs, like we did in commit 1e2c61172342 ("net: cdc_ncm: reduce skb truesize in rx path") and 4ce62d5b2f7a ("net: usb: ax88179_178a: stop lying about skb->truesize") Signed-off-by: Eric Dumazet <[email protected]> Cc: Steve Glendinning <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-07	usb: aqc111: stop lying about skb->truesize	Eric Dumazet	1	-5/+3
	Some usb drivers try to set small skb->truesize and break core networking stacks. I replace one skb_clone() by an allocation of a fresh and small skb, to get minimally sized skbs, like we did in commit 1e2c61172342 ("net: cdc_ncm: reduce skb truesize in rx path") and 4ce62d5b2f7a ("net: usb: ax88179_178a: stop lying about skb->truesize") Fixes: 361459cd9642 ("net: usb: aqc111: Implement RX data path") Signed-off-by: Eric Dumazet <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-07	mptcp: only allow set existing scheduler for net.mptcp.scheduler	Gregory Detal	1	-1/+38
	The current behavior is to accept any strings as inputs, this results in an inconsistent result where an unexisting scheduler can be set: # sysctl -w net.mptcp.scheduler=notdefault net.mptcp.scheduler = notdefault This patch changes this behavior by checking for existing scheduler before accepting the input. Fixes: e3b2870b6d22 ("mptcp: add a new sysctl scheduler") Cc: [email protected] Signed-off-by: Gregory Detal <[email protected]> Reviewed-by: Matthieu Baerts (NGI0) <[email protected]> Tested-by: Geliang Tang <[email protected]> Reviewed-by: Mat Martineau <[email protected]> Signed-off-by: Matthieu Baerts (NGI0) <[email protected]> Link: https://lore.kernel.org/r/20240506-upstream-net-20240506-mptcp-sched-exist-v1-1-2ed1529e521e@kernel.org Signed-off-by: Jakub Kicinski <[email protected]>
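The validation step can be sketched in Python as below. This is a minimal model, not the kernel's sysctl handler: `REGISTERED_SCHEDULERS` and `set_scheduler` are hypothetical names, and the returned error code is chosen for illustration only.

```python
import errno
from typing import Tuple

# Hypothetical registry; the kernel keeps its own list of registered schedulers.
REGISTERED_SCHEDULERS = {"default"}


def set_scheduler(current: str, requested: str) -> Tuple[str, int]:
    """Return (new_value, status); unknown names are rejected and the value is kept."""
    name = requested.strip()
    if name not in REGISTERED_SCHEDULERS:
        return current, -errno.ENOENT  # error code chosen for illustration
    return name, 0
```

With this check in place, a write of "notdefault" leaves the setting at its previous value instead of storing a name that no scheduler answers to.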
2024-05-07	selftests/net: fix uninitialized variables	John Hubbard	3	-2/+5
	When building with clang, via: make LLVM=1 -C tools/testing/selftest ...clang warns about three variables that are not initialized in all cases: 1) The opt_ipproto_off variable is used uninitialized if "testname" is not "ip". Willem de Bruijn pointed out that this is an actual bug, and suggested the fix that I'm using here (thanks!). 2) The addr_len is used uninitialized, but only in the assert case, which bails out, so this is harmless. 3) The family variable in add_listener() is only used uninitialized in the error case (neither IPv4 nor IPv6 is specified), so it's also harmless. Fix by initializing each variable. Signed-off-by: John Hubbard <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Acked-by: Mat Martineau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-07	lib: Allow for the DIM library to be modular	Florian Fainelli	3	-3/+6
	Allow the Dynamic Interrupt Moderation (DIM) library to be built as a module. This is particularly useful in an Android GKI (Google Kernel Image) configuration where everything is built as a module, including Ethernet controller drivers. Having to build DIMLIB into the kernel image with potentially no user is wasteful. Signed-off-by: Florian Fainelli <[email protected]> Reviewed-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-07	nfc: nci: Fix kcov check in nci_rx_work()	Tetsuo Handa	1	-0/+1
	Commit 7e8cdc97148c ("nfc: Add KCOV annotations") added kcov_remote_start_common()/kcov_remote_stop() pair into nci_rx_work(), with an assumption that kcov_remote_stop() is called upon continue of the for loop. But commit d24b03535e5e ("nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet") forgot to call kcov_remote_stop() before break of the for loop. Reported-by: syzbot <[email protected]> Closes: https://syzkaller.appspot.com/bug?extid=0438378d6f157baae1a2 Fixes: d24b03535e5e ("nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet") Suggested-by: Andrey Konovalov <[email protected]> Signed-off-by: Tetsuo Handa <[email protected]> Reviewed-by: Krzysztof Kozlowski <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
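The imbalance is easy to see in a toy model of the loop. This Python sketch is illustrative only: `Coverage` is a stand-in for a start/stop pair that must stay balanced (it is not the kcov API), and `rx_work` is a hypothetical simplification of the receive loop.

```python
class Coverage:
    """Stand-in for a start/stop pair that must stay balanced (not the kcov API)."""

    def __init__(self) -> None:
        self.depth = 0

    def start(self) -> None:
        self.depth += 1

    def stop(self) -> None:
        self.depth -= 1


def rx_work(packets, cov: Coverage, stop_before_break: bool) -> None:
    for pkt in packets:
        cov.start()
        if not pkt:                # invalid (empty) packet: leave the loop early
            if stop_before_break:  # the one-line fix: stop before breaking
                cov.stop()
            break
        # ... handle pkt ...
        cov.stop()
```

Without the extra stop, the early exit leaves one start unmatched, so `depth` ends at 1 instead of 0.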
2024-05-07	mptcp: fix possible NULL dereferences	Eric Dumazet	1	-15/+17
	subflow_add_reset_reason(skb, ...) can fail. We can not assume mptcp_get_ext(skb) always return a non NULL pointer. syzbot reported: general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f] CPU: 0 PID: 5098 Comm: syz-executor132 Not tainted 6.9.0-rc6-syzkaller-01478-gcdc74c9d06e7 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 RIP: 0010:subflow_v6_route_req+0x2c7/0x490 net/mptcp/subflow.c:388 Code: 8d 7b 07 48 89 f8 48 c1 e8 03 42 0f b6 04 20 84 c0 0f 85 c0 01 00 00 0f b6 43 07 48 8d 1c c3 48 83 c3 18 48 89 d8 48 c1 e8 03 <42> 0f b6 04 20 84 c0 0f 85 84 01 00 00 0f b6 5b 01 83 e3 0f 48 89 RSP: 0018:ffffc9000362eb68 EFLAGS: 00010206 RAX: 0000000000000003 RBX: 0000000000000018 RCX: ffff888022039e00 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88807d961140 R08: ffffffff8b6cb76b R09: 1ffff1100fb2c230 R10: dffffc0000000000 R11: ffffed100fb2c231 R12: dffffc0000000000 R13: ffff888022bfe273 R14: ffff88802cf9cc80 R15: ffff88802ad5a700 FS: 0000555587ad2380(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f420c3f9720 CR3: 0000000022bfc000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> tcp_conn_request+0xf07/0x32c0 net/ipv4/tcp_input.c:7180 tcp_rcv_state_process+0x183c/0x4500 net/ipv4/tcp_input.c:6663 tcp_v6_do_rcv+0x8b2/0x1310 net/ipv6/tcp_ipv6.c:1673 tcp_v6_rcv+0x22b4/0x30b0 net/ipv6/tcp_ipv6.c:1910 ip6_protocol_deliver_rcu+0xc76/0x1570 net/ipv6/ip6_input.c:438 ip6_input_finish+0x186/0x2d0 net/ipv6/ip6_input.c:483 NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314 NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314 __netif_receive_skb_one_core net/core/dev.c:5625 [inline] 
__netif_receive_skb+0x1ea/0x650 net/core/dev.c:5739 netif_receive_skb_internal net/core/dev.c:5825 [inline] netif_receive_skb+0x1e8/0x890 net/core/dev.c:5885 tun_rx_batched+0x1b7/0x8f0 drivers/net/tun.c:1549 tun_get_user+0x2f35/0x4560 drivers/net/tun.c:2002 tun_chr_write_iter+0x113/0x1f0 drivers/net/tun.c:2048 call_write_iter include/linux/fs.h:2110 [inline] new_sync_write fs/read_write.c:497 [inline] vfs_write+0xa84/0xcb0 fs/read_write.c:590 ksys_write+0x1a0/0x2c0 fs/read_write.c:643 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: 3e140491dd80 ("mptcp: support rstreason for passive reset") Reported-by: syzbot <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Acked-by: Paolo Abeni <[email protected]> Reviewed-by: Matthieu Baerts (NGI0) <[email protected]> Reviewed-by: Jason Xing <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-07	selftests: netfilter: conntrack_tcp_unreplied.sh: wait for initial connection attempt	Florian Westphal	1	-7/+18
	Netdev CI reports occasional failures with this test ("ERROR: ns2-dX6bUE did not pick up tcp connection from peer"). Add explicit busywait call until the initial connection attempt shows up in conntrack rather than a one-shot 'must exist' check. Signed-off-by: Florian Westphal <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-07	Merge branch 'libbpf: further struct_ops fixes and improvements'	Martin KaFai Lau	5	-17/+156
	Andrii Nakryiko says: ==================== Fix yet another case of mishandling SEC("struct_ops") programs that were nulled out programmatically through BPF skeleton by the user. While at it, add some improvements around detecting and reporting errors, specifically a common case of declaring SEC("struct_ops") program, but forgetting to actually make use of it by setting it as a callback implementation in SEC(".struct_ops") variable (i.e., map) declaration. A bunch of new selftests are added as well. ==================== Signed-off-by: Martin KaFai Lau <[email protected]>
2024-05-07	selftests/bpf: shorten subtest names for struct_ops_module test	Andrii Nakryiko	1	-4/+4
	Drive-by clean up, we shouldn't use meaningless "test_" prefix for subtest names. Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
2024-05-07	selftests/bpf: validate struct_ops early failure detection logic	Andrii Nakryiko	2	-0/+64
	Add a simple test that validates that libbpf will reject isolated struct_ops program early with helpful warning message. Also validate that explicit use of such BPF program through BPF skeleton after BPF object is open won't trigger any warnings. Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
2024-05-07	libbpf: improve early detection of doomed-to-fail BPF program loading	Andrii Nakryiko	1	-1/+14
	Extend libbpf's pre-load checks for BPF programs, detecting more typical conditions that are destined to cause BPF program failure. This is an opportunity to provide more helpful and actionable error messages to users, instead of a potentially very confusing BPF verifier log and/or error. In this case, we detect a struct_ops BPF program that was not referenced anywhere, but still attempted to be loaded (according to libbpf logic). Suggest that the program might need to be used in some struct_ops variable. User will get a message of the following kind: libbpf: prog 'test_1_forgotten': SEC("struct_ops") program isn't referenced anywhere, did you forget to use it? Suggested-by: Tejun Heo <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
2024-05-07	libbpf: fix libbpf_strerror_r() handling unknown errors	Andrii Nakryiko	1	-2/+14
	strerror_r(), used from the libbpf-specific libbpf_strerror_r() wrapper, is documented to return errors in two different ways, depending on glibc version. Take that into account when handling strerror_r()'s own errors, which happen when we pass some non-standard (internal) kernel error to it. Before this patch we'd have "ERROR: strerror_r(524)=22", which is quite confusing. Now for the same situation we'll see a bit less visually scary "unknown error (-524)". At least we won't confuse the user with an irrelevant EINVAL (22). Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
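	The handling described in this commit message can be sketched in plain C. This is a minimal illustration, not the actual libbpf patch: the helper names normalize_strerror_ret(), format_strerror_failure() and describe_error() are hypothetical, and the sketch assumes the int-returning (XSI) variant of strerror_r(). Only the two output strings are taken from the message above.

```c
/* Ask glibc for the int-returning (XSI) strerror_r(); assumption of this sketch. */
#undef _GNU_SOURCE
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: fold both documented failure conventions of
 * strerror_r() into one positive errno-style code (0 means success).
 * Older convention: returns -1 and sets errno.
 * Newer convention: returns the positive error code directly.
 */
int normalize_strerror_ret(int ret)
{
	return ret == -1 ? errno : ret;
}

/* Hypothetical helper: write the fallback text for a failed lookup. */
void format_strerror_failure(int err, int ret, char *dst, size_t len)
{
	if (ret == EINVAL)
		/* strerror_r() does not know this error number at all */
		snprintf(dst, len, "unknown error (%d)", err < 0 ? err : -err);
	else
		/* some other failure, e.g. ERANGE for a too-small buffer */
		snprintf(dst, len, "ERROR: strerror_r(%d)=%d", err, ret);
}

/* Hypothetical wrapper: accepts err as either N or -N. */
char *describe_error(int err, char *dst, size_t len)
{
	int ret = normalize_strerror_ret(strerror_r(err < 0 ? -err : err, dst, len));

	if (ret)
		format_strerror_failure(err, ret, dst, len);
	return dst;
}
```

	Whether describe_error(-524, ...) reaches the fallback branch depends on the C library: some implementations report an unknown error number as EINVAL, others return 0 with a generic message.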
2024-05-07	selftests/bpf: add another struct_ops callback use case test	Andrii Nakryiko	2	-0/+49
	Add a test which tests the case that was just fixed. Kernel has full type information about callback, but user explicitly nulls out the reference to declaratively set BPF program reference. Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
2024-05-07	libbpf: handle yet another corner case of nulling out struct_ops program	Andrii Nakryiko	1	-1/+9
	There is yet another corner case where user can set STRUCT_OPS program reference in STRUCT_OPS map to NULL, but libbpf will fail to disable autoload for such BPF program. This time it's the case of "new" kernel which has type information about callback field, but user explicitly nulled-out program reference from user-space after opening BPF object. Fix, hopefully, the last remaining unhandled case. Fixes: 0737df6de946 ("libbpf: better fix for handling nulled-out struct_ops program") Fixes: f973fccd43d3 ("libbpf: handle nulled-out program in struct_ops correctly") Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
2024-05-07	libbpf: remove unnecessary struct_ops prog validity check	Andrii Nakryiko	1	-10/+3
	libbpf ensures that BPF program references set in map->st_ops->progs[i] during open phase are always valid STRUCT_OPS programs. This is done in bpf_object__collect_st_ops_relos(). So there is no need to double-check that in bpf_map__init_kern_struct_ops(). Simplify the code by removing unnecessary check. Also, we avoid using local prog variable to keep code similar to the upcoming fix, which adds similar logic in another part of bpf_map__init_kern_struct_ops(). Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
2024-05-07	net: annotate writes on dev->mtu from ndo_change_mtu()	Eric Dumazet	153	-173/+174
	Simon reported that ndo_change_mtu() methods were never updated to use WRITE_ONCE(dev->mtu, new_mtu) as hinted in commit 501a90c94510 ("inet: protect against too small mtu values.") We read dev->mtu without holding RTNL in many places, with READ_ONCE() annotations. It is time to take care of ndo_change_mtu() methods to use corresponding WRITE_ONCE() Signed-off-by: Eric Dumazet <[email protected]> Reported-by: Simon Horman <[email protected]> Closes: https://lore.kernel.org/netdev/[email protected]/ Reviewed-by: Jacob Keller <[email protected]> Reviewed-by: Sabrina Dubroca <[email protected]> Reviewed-by: Simon Horman <[email protected]> Acked-by: Shannon Nelson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
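	The writer/reader pairing this commit message refers to can be sketched in userspace C. This is an illustration only: the two macros below are simplified stand-ins for the kernel's WRITE_ONCE() and READ_ONCE() (a bare volatile access, which is an assumption of this sketch), and struct demo_net_device, demo_change_mtu() and demo_payload_room() are hypothetical names, not kernel code.

```c
/* Simplified stand-ins for the kernel macros (assumption: the real
 * definitions in the kernel tree do more than this volatile cast).
 */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))

/* Hypothetical, cut-down device structure; only mtu matters here. */
struct demo_net_device {
	unsigned int mtu;
};

/* Writer side: the shape of an ndo_change_mtu() method after the change.
 * In the kernel this path runs with RTNL held.
 */
int demo_change_mtu(struct demo_net_device *dev, int new_mtu)
{
	WRITE_ONCE(dev->mtu, new_mtu);
	return 0;
}

/* Reader side: may run without RTNL, so it takes a single snapshot of
 * dev->mtu and uses that value for both the test and the subtraction.
 */
unsigned int demo_payload_room(const struct demo_net_device *dev,
			       unsigned int hdr_len)
{
	unsigned int mtu = READ_ONCE(dev->mtu);

	return mtu > hdr_len ? mtu - hdr_len : 0;
}
```

	The point of the pairing is that a lockless reader already annotated with READ_ONCE() gets a matching annotated store on the writer side, so the compiler is not free to split or repeat either access.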
2024-05-07	net: dccp: Fix ccid2_rtt_estimator() kernel-doc	Jeff Johnson	1	-0/+1
	make C=1 reports: warning: Function parameter or struct member 'mrtt' not described in 'ccid2_rtt_estimator' So document the 'mrtt' parameter. Signed-off-by: Jeff Johnson <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-07	net: phy: marvell: add support for MV88E6250 family internal PHYs	Matthias Schiffer	2	-1/+82
	The embedded PHYs of the 88E6250 family switches are very basic - they do not even have an Extended Address / Page register. This adds support for the PHYs to the driver to set up PHY interrupts and retrieve error stats. To deal with PHYs without a page register, "simple" variants of all stat handling functions are introduced. The code should work with all 88E6250 family switches (6250/6220/6071/ 6070/6020). The PHY ID 0x01410db0 was read from a 88E6020, under the assumption that all switches of this family use the same ID. The spec only lists the prefix 0x01410c00 and leaves the last 10 bits as reserved, but that seems too unspecific to be useful, as it would cover several existing PHY IDs already supported by the driver; therefore, the ID read from the actual hardware is used. Signed-off-by: Matthias Schiffer <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Link: https://lore.kernel.org/r/0695f699cd942e6e06da9d30daeedfd47785bc01.1714643285.git.matthias.schiffer@ew.tq-group.com Signed-off-by: Jakub Kicinski <[email protected]>
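	The reasoning about the PHY ID can be checked with a little mask arithmetic. The sketch below is illustrative only: phy_id_matches() and the DEMO_* names are hypothetical, DEMO_OTHER_ID is an arbitrary example value that happens to share the documented prefix, and the 4-bit mask is an assumption about how an exact ID is typically matched while ignoring a revision nibble. Only 0x01410db0, 0x01410c00 and the "last 10 bits reserved" detail come from the message above.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_ID_FROM_HW      0x01410db0u /* read from an 88E6020, per the message */
#define DEMO_SPEC_PREFIX     0x01410c00u /* prefix listed in the spec */
#define DEMO_MASK_10_RSVD    0xfffffc00u /* ignore the 10 reserved low bits */
#define DEMO_MASK_4_LOW      0xfffffff0u /* assumption: ignore only the low nibble */
#define DEMO_OTHER_ID        0x01410dd0u /* illustrative ID sharing the prefix */

/* Hypothetical helper: compare two PHY IDs under a bit mask. */
bool phy_id_matches(uint32_t id, uint32_t want, uint32_t mask)
{
	return (id & mask) == (want & mask);
}

/* With the 10-bit mask both IDs match the spec prefix, so a driver entry
 * keyed on the prefix could not tell them apart. With the narrower mask
 * and the ID read from hardware, only the intended device matches.
 */
bool demo_prefix_is_ambiguous(void)
{
	return phy_id_matches(DEMO_ID_FROM_HW, DEMO_SPEC_PREFIX, DEMO_MASK_10_RSVD) &&
	       phy_id_matches(DEMO_OTHER_ID, DEMO_SPEC_PREFIX, DEMO_MASK_10_RSVD);
}
```

	Under these assumptions the prefix-based match is ambiguous, which is consistent with the message's choice to use the ID read from actual hardware.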
2024-05-07	net: phy: marvell: constify marvell_hw_stats	Matthias Schiffer	1	-1/+1
	The list of stat registers is read-only, so we can declare it as const. Signed-off-by: Matthias Schiffer <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Link: https://lore.kernel.org/r/24d7a2f39e0c4c94466e8ad43228fdd798053f3a.1714643285.git.matthias.schiffer@ew.tq-group.com Signed-off-by: Jakub Kicinski <[email protected]>