Age | Commit message (Collapse) | Author | Files | Lines |
|
This loop is supposed to copy the mac address to cmd->addr but the
i++ increment is missing so it copies everything to cmd->addr[0] and
only the last address is recorded.
Fixes: 22bedad3ce11 ("net: convert multicast list to list_head")
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Link: https://msgid.link/[email protected]
|
|
Daniel Jurgens says:
====================
Remove RTNL lock protection of CVQ
Currently the buffer used for control VQ commands is protected by the
RTNL lock. Previously this wasn't a major concern because the control VQ
was only used during device setup and user interaction. With the recent
addition of dynamic interrupt moderation the control VQ may be used
frequently during normal operation.
This series removes the RNTL lock dependency by introducing a mutex
to protect the control buffer and writing SGs to the control VQ.
v6:
- Rebased over new stats code.
- Added comment to cvq_lock, init the mutex unconditionally,
and replaced some duplicate code with a goto.
- Fixed minor grammer errors, checkpatch warnings, and clarified
a comment.
v5:
- Changed cvq_lock to a mutex.
- Changed dim_lock to mutex, because it's held taking
the cvq_lock.
- Use spin/mutex_lock/unlock vs guard macros.
v4:
- Protect dim_enabled with same lock as well intr_coal.
- Rename intr_coal_lock to dim_lock.
- Remove some scoped_guard where the error path doesn't
have to be in the lock.
v3:
- Changed type of _offloads to __virtio16 to fix static
analysis warning.
- Moved a misplaced hunk to the correct patch.
v2:
- New patch to only process the provided queue in
virtnet_dim_work
- New patch to lock per queue rx coalescing structure.
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
The rtnl lock is no longer needed to protect the control buffer and
command VQ.
Signed-off-by: Daniel Jurgens <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: Heng Qi <[email protected]>
Tested-by: Heng Qi <[email protected]>
Acked-by: Jason Wang <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Once the RTNL locking around the control buffer is removed there can be
contention on the per queue RX interrupt coalescing data. Use a mutex
per queue. A mutex is required because virtnet_send_command can sleep.
Signed-off-by: Daniel Jurgens <[email protected]>
Reviewed-by: Heng Qi <[email protected]>
Tested-by: Heng Qi <[email protected]>
Acked-by: Jason Wang <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Since we no longer have to hold the RTNL lock here just do updates for
the specified queue.
Signed-off-by: Daniel Jurgens <[email protected]>
Reviewed-by: Heng Qi <[email protected]>
Tested-by: Heng Qi <[email protected]>
Acked-by: Jason Wang <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
The command VQ will no longer be protected by the RTNL lock. Use a
mutex to protect the control buffer header and the VQ.
Signed-off-by: Daniel Jurgens <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: Heng Qi <[email protected]>
Tested-by: Heng Qi <[email protected]>
Acked-by: Jason Wang <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Allocate memory for the data when it's used. Ideally the struct could
be on the stack, but we can't DMA stack memory. With this change only
the header and status memory are shared between commands, which will
allow using a tighter lock than RTNL.
Signed-off-by: Daniel Jurgens <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: Heng Qi <[email protected]>
Tested-by: Heng Qi <[email protected]>
Acked-by: Jason Wang <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Stop storing RSS setting in the control buffer. This is prep work for
removing RTNL lock protection of the control buffer.
Signed-off-by: Daniel Jurgens <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: Heng Qi <[email protected]>
Tested-by: Heng Qi <[email protected]>
Acked-by: Jason Wang <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Currently, the MT7530 DSA subdriver configures the MT7530 switch to provide
direct access to switch PHYs, meaning, the switch PHYs listen on the MDIO
bus the switch listens on. The PHY muxing feature makes use of this.
This is problematic as the PHY may be attached before the switch is
initialised, in which case, the PHY will fail to be attached.
Since commit 91374ba537bd ("net: dsa: mt7530: support OF-based registration
of switch MDIO bus"), we can describe the switch PHYs on the MDIO bus of
the switch on the device tree. Extend the check to detect PHY muxing when
the PHY is defined on the MDIO bus of the switch on the device tree.
When the PHY is described this way, the switch will be initialised first,
then the switch MDIO bus will be registered. Only after these steps, the
PHY will be attached.
Signed-off-by: Arınç ÜNAL <[email protected]>
Reviewed-by: Daniel Golle <[email protected]>
Link: https://lore.kernel.org/r/20240430-b4-for-netnext-mt7530-use-switch-mdio-bus-for-phy-muxing-v2-1-9104d886d0db@arinc9.com
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Eric Dumazet says:
====================
rtnetlink: more rcu conversions for rtnl_fill_ifinfo()
We want to no longer rely on RTNL for "ip link show" command.
This is a long road, this series takes care of some parts.
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
We want to be able to run rtnl_fill_ifinfo() under RCU protection
instead of RTNL in the future.
All rtnl_link_ops->get_link_net() methods already using dev_net()
are ready. I added READ_ONCE() annotations on others.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
dev->xdp_prog is protected by RCU, we can lift RTNL requirement
from rtnl_xdp_prog_skb().
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Change dev_change_proto_down() and dev_change_proto_down_reason()
to write once on dev->proto_down and dev->proto_down_reason.
Then rtnl_fill_proto_down() can use READ_ONCE() annotations
and run locklessly.
rtnl_proto_down_size() should assume worst case,
because readng dev->proto_down_reason multiple
times would be racy without RTNL in the future.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Following device fields can be read locklessly
in rtnl_fill_ifinfo() :
type, ifindex, operstate, link_mode, mtu, min_mtu, max_mtu, group,
promiscuity, allmulti, num_tx_queues, gso_max_segs, gso_max_size,
gro_max_size, gso_ipv4_max_size, gro_ipv4_max_size, tso_max_size,
tso_max_segs, num_rx_queues.
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
In the following patch we want to read dev->allmulti
and dev->promiscuity locklessly from rtnl_fill_ifinfo()
In this patch I change __dev_set_promiscuity() and
__dev_set_allmulti() to write these fields (and dev->flags)
only if they succeed, with WRITE_ONCE() annotations.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
rtnl_fill_ifinfo() can read dev->tx_queue_len locklessly,
granted we add corresponding READ_ONCE()/WRITE_ONCE() annotations.
Add missing READ_ONCE(dev->tx_queue_len) in teql_enqueue()
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
We can use netdev_copy_name() to no longer rely on RTNL
to fetch dev->name.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
dev->qdisc can be read using RCU protection.
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
says:
====================
net: qede: don't restrict error codes
This series fixes the qede driver, so that when a helper function fails,
then the callee should return the returned error code, instead just
assuming that the error is eg. -EINVAL.
The patches in this series, reduces the change of future bugs, so new
error codes can be returned from the helpers, without having to update
the call sites.
This is a follow-up to my recent series "net: qede: avoid overruling
error codes", which fixed the cases where the implicit assumption of
failing with specific error codes had been broken.
https://lore.kernel.org/netdev/[email protected]/
Asbjørn Sloth Tønnesen (3):
net: qede: use return from qede_parse_actions() for flow_spec
net: qede: use return from qede_flow_spec_validate_unused()
net: qede: use return from qede_flow_parse_ports()
.../net/ethernet/qlogic/qede/qede_filter.c | 27 ++++++++++++-------
1 file changed, 18 insertions(+), 9 deletions(-)
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
When calling qede_flow_parse_ports(), then the
return code was only used for a non-zero check,
and then -EINVAL was returned.
qede_flow_parse_ports() can currently fail with:
* -EINVAL
This patch changes qede_flow_parse_v{4,6}_common() to
use the actual return code from qede_flow_parse_ports(),
so it's no longer assumed that all errors are -EINVAL.
Only compile tested.
Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
When calling qede_flow_spec_validate_unused() then
the return code was only used for a non-zero check,
and then -EOPNOTSUPP was returned.
qede_flow_spec_validate_unused() can currently fail with:
* -EOPNOTSUPP
This patch changes qede_flow_spec_to_rule() to use the
actual return code from qede_flow_spec_validate_unused(),
so it's no longer assumed that all errors are -EOPNOTSUPP.
Only compile tested.
Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
In qede_flow_spec_to_rule(), when calling
qede_parse_actions() then the return code
was only used for a non-zero check, and then
-EINVAL was returned.
qede_parse_actions() can currently fail with:
* -EINVAL
* -EOPNOTSUPP
Commit 319a1d19471e ("flow_offload: check for
basic action hw stats type") broke the implicit
assumption that it could only fail with -EINVAL,
by changing it to return -EOPNOTSUPP, when hardware
stats are requested.
However AFAICT it's not possible to trigger
qede_parse_actions() to return -EOPNOTSUPP, when
called from qede_flow_spec_to_rule(), as hardware
stats can't be requested by ethtool_rx_flow_rule_create().
This patch changes qede_flow_spec_to_rule() to use
the actual return code from qede_parse_actions(),
so it's no longer assumed that all errors are -EINVAL.
Only compile tested.
Signed-off-by: Asbjørn Sloth Tønnesen <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes
One more Qualcomm Arm64 DeviceTree fix for v6.9
On ths SA8155P automotive platform, the wrong gpio controller is defined
for the SD-card detect pin, which depending on probe ordering of things
cause ethernet to be broken. The card detect pin reference is corrected
to solve this problem.
* tag 'qcom-arm64-fixes-for-6.9-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
arm64: dts: qcom: sa8155p-adp: fix SDHC2 CD pin configuration
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2024-05-03
1) Remove Obsolete UDP_ENCAP_ESPINUDP_NON_IKE Support.
This was defined by an early version of an IETF draft
that did not make it to a standard.
2) Introduce direction attribute for xfrm states.
xfrm states have a direction, a stsate can be used
either for input or output packet processing.
Add a direction to xfrm states to make it clear
for what a xfrm state is used.
* tag 'ipsec-next-2024-05-03' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next:
xfrm: Restrict SA direction attribute to specific netlink message types
xfrm: Add dir validation to "in" data path lookup
xfrm: Add dir validation to "out" data path lookup
xfrm: Add Direction to the SA in or out
udpencap: Remove Obsolete UDP_ENCAP_ESPINUDP_NON_IKE Support
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
This patch fixes the spelling mistakes in comments.
The changes were generated using codespell and reviewed manually.
eariler -> earlier
greceful -> graceful
Signed-off-by: Shi-Sheng Yang <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Matthieu Baerts (NGI0) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
According to GCC, the constriction of irq_name in otx2_open()
may, theoretically, be truncated.
This patch takes the approach of treating such a situation as an error
which it detects by making use of the return value of snprintf, which is
the total number of bytes, excluding the trailing '\0', that would have
been written.
Based on the approach taken to a similar problem in
commit 54b909436ede ("rtc: fix snprintf() checking in is_rtc_hctosys()")
Flagged by gcc-13 W=1 builds as:
.../otx2_pf.c:1933:58: warning: 'snprintf' output may be truncated before the last format character [-Wformat-truncation=]
1933 | snprintf(irq_name, NAME_SIZE, "%s-rxtx-%d", pf->netdev->name,
| ^
.../otx2_pf.c:1933:17: note: 'snprintf' output between 8 and 33 bytes into a destination of size 32
1933 | snprintf(irq_name, NAME_SIZE, "%s-rxtx-%d", pf->netdev->name,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1934 | qidx);
| ~~~~~
Compile tested only.
Tested-by: Geetha sowjanya <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/20240503-octeon2-pf-irq_name-truncation-v2-1-91099177b942@kernel.org
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Attributes for FDB learned entries were added to the if_link netlink api
for bridge linkinfo but are missing from the rt_link.yaml spec. Add the
missing attributes to the spec.
Fixes: ddd1ad68826d ("net: bridge: Add netlink knobs for number / max learned FDB entries")
Signed-off-by: Donald Hunter <[email protected]>
Acked-by: Nikolay Aleksandrov <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
fill_route() stores three components in the skb:
- struct rtmsg
- RTA_DST (u8)
- RTA_OIF (u32)
Therefore, rtm_phonet_notify() should use
NLMSG_ALIGN(sizeof(struct rtmsg)) +
nla_total_size(1) +
nla_total_size(4)
Fixes: f062f41d0657 ("Phonet: routing table Netlink interface")
Signed-off-by: Eric Dumazet <[email protected]>
Acked-by: Rémi Denis-Courmont <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
This list looks like it's been unused since the OF conversion in
2008 in
commit 826b6cfcd5d4 ("fore200e: Convert over to pure OF driver.")
This also means we can remove the 'entry' member for the list.
Build tested only.
Signed-off-by: Dr. David Alan Gilbert <[email protected]>
Reviewed-by: Breno Leitao <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The new netdev queue api is implemented for gve.
Tested-by: Mina Almasry <[email protected]>
Reviewed-by: Mina Almasry <[email protected]>
Reviewed-by: Praveen Kaligineedi <[email protected]>
Reviewed-by: Harshitha Ramamurthy <[email protected]>
Signed-off-by: Shailend Chand <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Cupertino Miranda says:
====================
bpf/verifier: range computation improvements
Hi everyone,
This is what I hope to be the last version. :)
Regards,
Cupertino
Changes from v1:
- Reordered patches in the series.
- Fix refactor to be acurate with original code.
- Fixed other mentioned small problems.
Changes from v2:
- Added a patch to replace mark_reg_unknowon for __mark_reg_unknown in
the context of range computation.
- Reverted implementation of refactor to v1 which used a simpler
boolean return value in check function.
- Further relaxed MUL to allow it to still compute a range when neither
of its registers is a known value.
- Simplified tests based on Eduards example.
- Added messages in selftest commits.
Changes from v3:
- Improved commit message of patch nr 1.
- Coding style fixes.
- Improve XOR and OR tests.
- Made function calls to pass struct bpf_reg_state pointer instead.
- Improved final code as a last patch.
Changes from v4:
- Merged patch nr 7 in 2.
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Added a test for bound computation in MUL when non constant
values are used and both registers have bounded ranges.
Signed-off-by: Cupertino Miranda <[email protected]>
Acked-by: Eduard Zingerman <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Cc: Yonghong Song <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: David Faust <[email protected]>
Cc: Jose Marchesi <[email protected]>
Cc: Elena Zannoni <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
MUL instruction required that src_reg would be a known value (i.e.
src_reg would be a const value). The condition in this case can be
relaxed, since the range computation algorithm used in current code
already supports a proper range computation for any valid range value on
its operands.
Signed-off-by: Cupertino Miranda <[email protected]>
Acked-by: Eduard Zingerman <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Cc: Yonghong Song <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: David Faust <[email protected]>
Cc: Jose Marchesi <[email protected]>
Cc: Elena Zannoni <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Added a test for bound computation in XOR and OR when non constant
values are used and both registers have bounded ranges.
Signed-off-by: Cupertino Miranda <[email protected]>
Acked-by: Eduard Zingerman <[email protected]>
Cc: Yonghong Song <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: David Faust <[email protected]>
Cc: Jose Marchesi <[email protected]>
Cc: Elena Zannoni <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Range for XOR and OR operators would not be attempted unless src_reg
would resolve to a single value, i.e. a known constant value.
This condition is unnecessary, and the following XOR/OR operator
handling could compute a possible better range.
Acked-by: Eduard Zingerman <[email protected]>
Signed-off-by: Cupertino Miranda <[email protected]
Acked-by: Eduard Zingerman <[email protected]>
Cc: Yonghong Song <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: David Faust <[email protected]>
Cc: Jose Marchesi <[email protected]>
Cc: Elena Zannoni <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Split range computation checks in its own function, isolating pessimitic
range set for dst_reg and failing return to a single point.
Signed-off-by: Cupertino Miranda <[email protected]>
Acked-by: Eduard Zingerman <[email protected]>
Cc: Yonghong Song <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: David Faust <[email protected]>
Cc: Jose Marchesi <[email protected]>
Cc: Elena Zannoni <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
bpf/verifier: improve code after range computation recent changes.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
In order to further simplify the code in adjust_scalar_min_max_vals all
the calls to mark_reg_unknown are replaced by __mark_reg_unknown.
static void mark_reg_unknown(struct bpf_verifier_env *env,
struct bpf_reg_state *regs, u32 regno)
{
if (WARN_ON(regno >= MAX_BPF_REG)) {
... mark all regs not init ...
return;
}
__mark_reg_unknown(env, regs + regno);
}
The 'regno >= MAX_BPF_REG' does not apply to
adjust_scalar_min_max_vals(), because it is only called from the
following stack:
- check_alu_op
- adjust_reg_min_max_vals
- adjust_scalar_min_max_vals
The check_alu_op() does check_reg_arg() which verifies that both src and
dst register numbers are within bounds.
Signed-off-by: Cupertino Miranda <[email protected]>
Acked-by: Eduard Zingerman <[email protected]>
Cc: Yonghong Song <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: David Faust <[email protected]>
Cc: Jose Marchesi <[email protected]>
Cc: Elena Zannoni <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
This allows to define a GTP tunnel for dual stack MS/UE with both IPv4
and IPv6 addresses while using the same TEID via two PDP context
objects.
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Add new protocol field to PDP context that determines the transmit path
IP protocol to encapsulate the original packets, either IPv4 or IPv6.
Relax existing netlink attribute checks to allow to specify different
family in MS and peer attributes from the control plane.
Use build helpers to tx path to encapsulate IPv4-in-IPv6-GTP and
IPv6-in-IPv4-GTP according to the user-specified configuration.
From rx path, snoop for the inner protocol header since outer
skb->protocol might differ and use this to validate for valid PDP
context and to restore skb->protocol after decapsulation.
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Add routine to attach an IPv6 route for the encapsulated packet, deal
with Path MTU and push GTP header.
This helper function will be used to deal with IPv4-in-IPv6-GTP.
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Add routine to attach an IPv4 route for the encapsulated packet, deal
with Path MTU and push GTP header.
This helper function will be used to deal with IPv6-in-IPv4-GTP.
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Based on the idea that ip_tunnel_get_dsfield() provides the tos field
regardless the IP version, use either iph->tos or ipv6_get_dsfield().
This comes in preparation to support for IPv4-in-IPv6-GTP and
IPv6-in-IPv4-GTP.
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Move debugging to the routine to build GTP packets in preparation
for supporting IPv4-in-IPv6-GTP and IPv6-in-IPv4-GTP.
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
According to TS 29.061, it is possible to see IPv6 link-local traffic in
the GTP tunnel, see 11.2.1.3.2 IPv6 Stateless Address Autoconfiguration
(IPv6 SLAAC).
Pass up these packets to the userspace daemon to handle them as control
GTP traffic.
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Harald Welte reports that according to 3GPP TS 29.060:
PDN Connection: the association between a MS represented by one IPv4
address and/or one IPv6 prefix and a PDN represented by an APN.
this clearly states that IPv4 is a single address while IPv6 is a single prefix.
Then, 3GPP TS 29.061, Section 11.2.1.3:
For APNs that are configured for IPv6 address allocation, the GGSN/P-GW
shall only use the Prefix part of the IPv6 address for forwarding of mobile
terminated IP packets. The size of the prefix shall be according to the maximum
prefix length for a global IPv6 address as specified in the IPv6 Addressing
Architecture, see RFC 4291 [82].
RFC 4291 section 2.5.4 states
All Global Unicast addresses other than those that start with binary 000
have a 64-bit interface ID field (i.e., n + m = 64) ...
3GPP TS 29.61 Section 11.2.1.3.2a:
In the procedure in the cases of using GTP-based S5/S8, P-GW acts as an
access router, and allocates to a UE a globally unique /64 IPv6 prefix if the
PLMN allocates the prefix.
Therefore, compare IPv6 address /64 prefix only since MS/UE is not a single
address like in the IPv4 case.
Reject IPv6 address with EADDRNOTAVAIL if it lower 64 bits of the IPv6 address
from the control plane are set.
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Add new iflink attributes to configure in-kernel UDP listener socket
address: IFLA_GTP_LOCAL and IFLA_GTP_LOCAL6. If none of these attributes
are specified, default is still to IPv4 INADDR_ANY for backward
compatibility.
Add new attributes to set up family and IPv6 address of GTP tunnels:
GTPA_FAMILY, GTPA_PEER_ADDR6 and GTPA_MS_ADDR6. If no GTPA_FAMILY is
specified, AF_INET is assumed for backward compatibility.
setsockopt IPV6_ADDRFORM allows to downgrade socket from IPv6 to IPv4
after socket is bound. Assumption is that socket listener that is
attached to the gtp device needs to be either IPv4 or IPv6. Therefore,
GTP socket listener does not allow for IPv4-mapped-IPv6 listener.
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Use union artifact to prepare for IPv6 support.
Add and use GTP_{IPV4,TH}_MAXLEN.
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Currently GTP packets are dropped if the next extension field is set to
non-zero value, but this are valid GTP packets.
TS 29.281 provides a longer header format, which is defined as struct
gtp1_header_long. Such long header format is used if any of the S, PN, E
flags is set.
This long header is 4 bytes longer than struct gtp1_header, plus
variable length (optional) extension headers. The next extension header
field is zero is no extension header is provided.
The extension header is composed of a length field which includes total
number of 4 byte words including the extension header itself (1 byte),
payload (variable length) and next type (1 byte). The extension header
size and its payload is aligned to 4 bytes.
A GTP packet might come with a chain extensions headers, which makes it
slightly cumbersome to parse because the extension next header field
comes at the end of the extension header, and there is a need to check
if this field becomes zero to stop the extension header parser.
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Update b20dc3c68458 ("gtp: Allow to create GTP device without FDs") to
remove useless initialization to NULL, sockets are initialized to
non-NULL just a few lines of code after this.
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
When building either tools/bpf/bpftool, or tools/testing/selftests/hid,
(the same Makefile is used for these), clang generates many instances of
the following:
"clang: warning: -lLLVM-17: 'linker' input unused"
Quentin points out that the LLVM version is only required in $(LIBS),
not in $(CFLAGS), so the fix is to remove it from CFLAGS.
Suggested-by: Quentin Monnet <[email protected]>
Signed-off-by: John Hubbard <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Acked-by: Quentin Monnet <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|