Age | Commit message (Collapse) | Author | Files | Lines |
|
When one of the three connection complete events is received multiple
times for the same handle, the device is registered multiple times which
leads to memory corruptions. Therefore, consequent events for a single
connection are ignored.
The conn->state can hold different values, therefore HCI_CONN_HANDLE_UNSET
is introduced to identify new connections. To make sure the events do not
contain this or another invalid handle HCI_CONN_HANDLE_MAX and checks
are introduced.
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=215497
Signed-off-by: Soenke Huster <soenke.huster@eknoes.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
This patch introduces two new MGMT events for notifying the bluetoothd
whenever the controller starts/stops monitoring a device.
Test performed:
- Verified by logs that the MSFT Monitor Device is received from the
controller and the bluetoothd is notified whenever the controller
starts/stops monitoring a device.
Signed-off-by: Manish Mandlik <mmandlik@google.com>
Reviewed-by: Miao-chen Chou <mcchou@google.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Whenever the controller starts/stops monitoring a bt device, it sends
MSFT Monitor Device event. Add handler to read this vendor event.
Test performed:
- Verified by logs that the MSFT Monitor Device event is received from
the controller whenever it starts/stops monitoring a device.
Signed-off-by: Manish Mandlik <mmandlik@google.com>
Reviewed-by: Miao-chen Chou <mcchou@google.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Merge in fixes directly in prep for the 5.17 merge window.
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Replace kfree_skb() with kfree_skb_reason() in __udp4_lib_rcv.
New drop reason 'SKB_DROP_REASON_UDP_CSUM' is added for udp csum
error.
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Replace kfree_skb() with kfree_skb_reason() in tcp_v4_rcv(). Following
drop reasons are added:
SKB_DROP_REASON_NO_SOCKET
SKB_DROP_REASON_PKT_TOO_SMALL
SKB_DROP_REASON_TCP_CSUM
SKB_DROP_REASON_TCP_FILTER
After this patch, 'kfree_skb' event will print message like this:
$ TASK-PID CPU# ||||| TIMESTAMP FUNCTION
$ | | | ||||| | |
<idle>-0 [000] ..s1. 36.113438: kfree_skb: skbaddr=(____ptrval____) protocol=2048 location=(____ptrval____) reason: NO_SOCKET
The reason of skb drop is printed too.
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Introduce the interface kfree_skb_reason(), which is able to pass
the reason why the skb is dropped to 'kfree_skb' tracepoint.
Add the 'reason' field to 'trace_kfree_skb', therefor user can get
more detail information about abnormal skb with 'drop_monitor' or
eBPF.
All drop reasons are defined in the enum 'skb_drop_reason', and
they will be print as string in 'kfree_skb' tracepoint in format
of 'reason: XXX'.
( Maybe the reasons should be defined in a uapi header file, so that
user space can use them? )
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Netfilter conntrack maintains NAT flags per connection indicating
whether NAT was configured for the connection. Openvswitch maintains
NAT flags on the per packet flow key ct_state field, indicating
whether NAT was actually executed on the packet.
When a packet misses from tc to ovs the conntrack NAT flags are set.
However, NAT was not necessarily executed on the packet because the
connection's state might still be in NEW state. As such, openvswitch
wrongly assumes that NAT was executed and sets an incorrect flow key
NAT flags.
Fix this, by flagging to openvswitch which NAT was actually done in
act_ct via tc_skb_ext and tc_skb_cb to the openvswitch module, so
the packet flow key NAT flags will be correctly set.
Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct")
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20220106153804.26451-1-paulb@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next. This
includes one patch to update ovs and act_ct to use nf_ct_put() instead
of nf_conntrack_put().
1) Add netns_tracker to nfnetlink_log and masquerade, from Eric Dumazet.
2) Remove redundant rcu read-size lock in nf_tables packet path.
3) Replace BUG() by WARN_ON_ONCE() in nft_payload.
4) Consolidate rule verdict tracing.
5) Replace WARN_ON() by WARN_ON_ONCE() in nf_tables core.
6) Make counter support built-in in nf_tables.
7) Add new field to conntrack object to identify locally generated
traffic, from Florian Westphal.
8) Prevent NAT from shadowing well-known ports, from Florian Westphal.
9) Merge nf_flow_table_{ipv4,ipv6} into nf_flow_table_inet, also from
Florian.
10) Remove redundant pointer in nft_pipapo AVX2 support, from Colin Ian King.
11) Replace opencoded max() in conntrack, from Jiapeng Chong.
12) Update conntrack to use refcount_t API, from Florian Westphal.
13) Move ip_ct_attach indirection into the nf_ct_hook structure.
14) Constify several pointer object in the netfilter codebase,
from Florian Westphal.
15) Tree-wide replacement of nf_conntrack_put() by nf_ct_put(), also
from Florian.
16) Fix egress splat due to incorrect rcu notation, from Florian.
17) Move stateful fields of connlimit, last, quota, numgen and limit
out of the expression data area.
18) Build a blob to represent the ruleset in nf_tables, this is a
requirement of the new register tracking infrastructure.
19) Add NFT_REG32_NUM to define the maximum number of 32-bit registers.
20) Add register tracking infrastructure to skip redundant
store-to-register operations, this includes support for payload,
meta and bitwise expresssions.
* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next: (32 commits)
netfilter: nft_meta: cancel register tracking after meta update
netfilter: nft_payload: cancel register tracking after payload update
netfilter: nft_bitwise: track register operations
netfilter: nft_meta: track register operations
netfilter: nft_payload: track register operations
netfilter: nf_tables: add register tracking infrastructure
netfilter: nf_tables: add NFT_REG32_NUM
netfilter: nf_tables: add rule blob layout
netfilter: nft_limit: move stateful fields out of expression data
netfilter: nft_limit: rename stateful structure
netfilter: nft_numgen: move stateful fields out of expression data
netfilter: nft_quota: move stateful fields out of expression data
netfilter: nft_last: move stateful fields out of expression data
netfilter: nft_connlimit: move stateful fields out of expression data
netfilter: egress: avoid a lockdep splat
net: prefer nf_ct_put instead of nf_conntrack_put
netfilter: conntrack: avoid useless indirection during conntrack destruction
netfilter: make function op structures const
netfilter: core: move ip_ct_attach indirection to struct nf_ct_hook
netfilter: conntrack: convert to refcount_t api
...
====================
Link: https://lore.kernel.org/r/20220109231640.104123-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Check if the destination register already contains the data that this
bitwise expression performs. This allows to skip this redundant
operation.
If the destination contains a different bitwise operation, cancel the
register tracking information. If the destination contains no bitwise
operation, update the register tracking information.
Update the payload and meta expression to check if this bitwise
operation has been already performed on the register. Hence, both the
payload/meta and the bitwise expressions are reduced.
There is also a special case: If source register != destination register
and source register is not updated by a previous bitwise operation, then
transfer selector from the source register to the destination register.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This patch adds new infrastructure to skip redundant selector store
operations on the same register to achieve a performance boost from
the packet path.
This is particularly noticeable in pure linear rulesets but it also
helps in rulesets which are already heaving relying in maps to avoid
ruleset linear inspection.
The idea is to keep data of the most recurrent store operations on
register to reuse them with cmp and lookup expressions.
This infrastructure allows for dynamic ruleset updates since the ruleset
blob reduction happens from the kernel.
Userspace still needs to be updated to maximize register utilization to
cooperate to improve register data reuse / reduce number of store on
register operations.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Add a definition including the maximum number of 32-bits registers that
are used a scratchpad memory area to store data.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This patch adds a blob layout per chain to represent the ruleset in the
packet datapath.
size (unsigned long)
struct nft_rule_dp
struct nft_expr
...
struct nft_rule_dp
struct nft_expr
...
struct nft_rule_dp (is_last=1)
The new structure nft_rule_dp represents the rule in a more compact way
(smaller memory footprint) compared to the control-plane nft_rule
structure.
The ruleset blob is a read-only data structure. The first field contains
the blob size, then the rules containing expressions. There is a trailing
rule which is used by the tracing infrastructure which is equivalent to
the NULL rule marker in the previous representation. The blob size field
does not include the size of this trailing rule marker.
The ruleset blob is generated from the commit path.
This patch reuses the infrastructure available since 0cbc06b3faba
("netfilter: nf_tables: remove synchronize_rcu in commit phase") to
build the array of rules per chain.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
include/linux/netfilter_netdev.h:97 suspicious rcu_dereference_check() usage!
2 locks held by sd-resolve/1100:
0: ..(rcu_read_lock_bh){1:3}, at: ip_finish_output2
1: ..(rcu_read_lock_bh){1:3}, at: __dev_queue_xmit
__dev_queue_xmit+0 ..
The helper has two callers, one uses rcu_read_lock, the other
rcu_read_lock_bh(). Annotate the dereference to reflect this.
Fixes: 42df6e1d221dd ("netfilter: Introduce egress hook")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
nf_ct_put() results in a usesless indirection:
nf_ct_put -> nf_conntrack_put -> nf_conntrack_destroy -> rcu readlock +
indirect call of ct_hooks->destroy().
There are two _put helpers:
nf_ct_put and nf_conntrack_put. The latter is what should be used in
code that MUST NOT cause a linker dependency on the conntrack module
(e.g. calls from core network stack).
Everyone else should call nf_ct_put() instead.
A followup patch will convert a few nf_conntrack_put() calls to
nf_ct_put(), in particular from modules that already have a conntrack
dependency such as act_ct or even nf_conntrack itself.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
No functional changes, these structures should be const.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
ip_ct_attach predates struct nf_ct_hook, we can place it there and
remove the exported symbol.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Convert nf_conn reference counting from atomic_t to refcount_t based api.
refcount_t api provides more runtime sanity checks and will warn on
certain constructs, e.g. refcount_inc() on a zero reference count, which
usually indicates use-after-free.
For this reason template allocation is changed to init the refcount to
1, the subsequenct add operations are removed.
Likewise, init_conntrack() is changed to set the initial refcount to 1
instead refcount_inc().
This is safe because the new entry is not (yet) visible to other cpus.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Fix following coccicheck warning:
./include/net/netfilter/nf_conntrack.h:282:16-17: WARNING opportunity
for max().
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Luiz Augusto von Dentz says:
====================
bluetooth-next pull request for net-next:
- Add support for Foxconn QCA 0xe0d0
- Fix HCI init sequence on MacBook Air 8,1 and 8,2
- Fix Intel firmware loading on legacy ROM devices
* tag 'for-net-next-2022-01-07' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next:
Bluetooth: hci_sock: fix endian bug in hci_sock_setsockopt()
Bluetooth: L2CAP: uninitialized variables in l2cap_sock_setsockopt()
Bluetooth: btqca: sequential validation
Bluetooth: btusb: Add support for Foxconn QCA 0xe0d0
Bluetooth: btintel: Fix broken LED quirk for legacy ROM devices
Bluetooth: hci_event: Rework hci_inquiry_result_with_rssi_evt
Bluetooth: btbcm: disable read tx power for MacBook Air 8,1 and 8,2
Bluetooth: hci_qca: Fix NULL vs IS_ERR_OR_NULL check in qca_serdev_probe
Bluetooth: hci_bcm: Check for error irq
====================
Link: https://lore.kernel.org/r/20220107210942.3750887-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add support for flushing the MAC table on a given port in the ocelot
switch library, and use this functionality in the felix DSA driver.
This operation is needed when a port leaves a bridge to become
standalone, and when the learning is disabled, and when the STP state
changes to a state where no FDB entry should be present.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220107144229.244584-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2022-01-06
1) Expose FEC per lane block counters via ethtool
2) Trivial fixes/updates/cleanup to mlx5e netdev driver
3) Fix htmldoc build warning
4) Spread mlx5 SFs (sub-functions) to all available CPU cores: Commits 1..5
Shay Drory Says:
================
Before this patchset, mlx5 subfunction shared the same IRQs (MSI-X) with
their peers subfunctions, causing them to use same CPU cores.
In large scale, this is very undesirable, SFs use small number of cpu
cores and all of them will be packed on the same CPU cores, not
utilizing all CPU cores in the system.
In this patchset we want to achieve two things.
a) Spread IRQs used by SFs to all cpu cores
b) Pack less SFs in the same IRQ, will result in multiple IRQs per core.
In this patchset, we spread SFs over all online cpus available to mlx5
irqs in Round-Robin manner. e.g.: Whenever a SF is created, pick the next
CPU core with least number of SF IRQs bound to it, SFs will share IRQs on
the same core until a certain limit, when such limit is reached, we
request a new IRQ and add it to that CPU core IRQ pool, when out of IRQs,
pick any IRQ with least number of SF users.
This enhancement is done in order to achieve a better distribution of
the SFs over all the available CPUs, which reduces application latency,
as shown bellow.
Machine details:
Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz with 56 cores.
PCI Express 3 with BW of 126 Gb/s.
ConnectX-5 Ex; EDR IB (100Gb/s) and 100GbE; dual-port QSFP28; PCIe4.0
x16.
Base line test description:
Single SF on the system. One instance of netperf is running on-top the
SF.
Numbers: latency = 15.136 usec, CPU Util = 35%
Test description:
There are 250 SFs on the system. There are 3 instances of netperf
running, on-top three different SFs, in parallel.
Perf numbers:
# netperf SFs latency(usec) latency CPU utilization
affinity affinity (lower is better) increase %
1 cpu=0 cpu={0} ~23 (app 1-3) 35% 75%
2 cpu=0,2,4 cpu={0} app 1: 21.625 30% 68% (CPU 0)
app 2-3: 16.5 9% 15% (CPU 2,4)
3 cpu=0 cpu={0,2,4} app 1: ~16 7% 84% (CPU 0)
app 2-3: ~17.9 14% 22% (CPU 2,4)
4 cpu=0,2,4 cpu={0,2,4} 15.2 (app 1-3) 0% 33% (CPU 0,2,4)
- The first two entries (#1 and #2) show current state. e.g.: SFs are
using the same CPU. The last two entries (#3 and #4) shows the latency
reduction improvement of this patch. e.g.: SFs are on different CPUs.
- Whenever we use several CPUs, in case there is a different CPU
utilization, write the utilization of each CPU separately.
- Whenever the latency result of the netperf instances were different,
write the latency of each netperf instances separately.
Commands:
- for netperf CPU=0:
$ for i in {1..3}; do taskset -c 0 netperf -H 1${i}.1.1.1 -t TCP_RR -- \
-o RT_LATENCY -r8 & done
- for netperf CPU=0,2,4
$ for i in {1..3}; do taskset -c $(( ($i - 1) * 2 )) netperf -H \
1${i}.1.1.1 -t TCP_RR -- -o RT_LATENCY -r8 & done
================
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Alexei Starovoitov says:
====================
pull-request: bpf-next 2022-01-06
We've added 41 non-merge commits during the last 2 day(s) which contain
a total of 36 files changed, 1214 insertions(+), 368 deletions(-).
The main changes are:
1) Various fixes in the verifier, from Kris and Daniel.
2) Fixes in sockmap, from John.
3) bpf_getsockopt fix, from Kuniyuki.
4) INET_POST_BIND fix, from Menglong.
5) arm64 JIT fix for bpf pseudo funcs, from Hou.
6) BPF ISA doc improvements, from Christoph.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (41 commits)
bpf: selftests: Add bind retry for post_bind{4, 6}
bpf: selftests: Use C99 initializers in test_sock.c
net: bpf: Handle return value of BPF_CGROUP_RUN_PROG_INET{4,6}_POST_BIND()
bpf/selftests: Test bpf_d_path on rdonly_mem.
libbpf: Add documentation for bpf_map batch operations
selftests/bpf: Don't rely on preserving volatile in PT_REGS macros in loop3
xdp: Add xdp_do_redirect_frame() for pre-computed xdp_frames
xdp: Move conversion to xdp_frame out of map functions
page_pool: Store the XDP mem id
page_pool: Add callback to init pages when they are allocated
xdp: Allow registering memory model without rxq reference
samples/bpf: xdpsock: Add timestamp for Tx-only operation
samples/bpf: xdpsock: Add time-out for cleaning Tx
samples/bpf: xdpsock: Add sched policy and priority support
samples/bpf: xdpsock: Add cyclic TX operation capability
samples/bpf: xdpsock: Add clockid selection support
samples/bpf: xdpsock: Add Dest and Src MAC setting for Tx-only operation
samples/bpf: xdpsock: Add VLAN support for Tx-only operation
libbpf 1.0: Deprecate bpf_object__find_map_by_offset() API
libbpf 1.0: Deprecate bpf_map__is_offload_neutral()
...
====================
Link: https://lore.kernel.org/r/20220107013626.53943-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The return value of BPF_CGROUP_RUN_PROG_INET{4,6}_POST_BIND() in
__inet_bind() is not handled properly. While the return value
is non-zero, it will set inet_saddr and inet_rcv_saddr to 0 and
exit:
err = BPF_CGROUP_RUN_PROG_INET4_POST_BIND(sk);
if (err) {
inet->inet_saddr = inet->inet_rcv_saddr = 0;
goto out_release_sock;
}
Let's take UDP for example and see what will happen. For UDP
socket, it will be added to 'udp_prot.h.udp_table->hash' and
'udp_prot.h.udp_table->hash2' after the sk->sk_prot->get_port()
called success. If 'inet->inet_rcv_saddr' is specified here,
then 'sk' will be in the 'hslot2' of 'hash2' that it don't belong
to (because inet_saddr is changed to 0), and UDP packet received
will not be passed to this sock. If 'inet->inet_rcv_saddr' is not
specified here, the sock will work fine, as it can receive packet
properly, which is wired, as the 'bind()' is already failed.
To undo the get_port() operation, introduce the 'put_port' field
for 'struct proto'. For TCP proto, it is inet_put_port(); For UDP
proto, it is udp_lib_unhash(); For icmp proto, it is
ping_unhash().
Therefore, after sys_bind() fail caused by
BPF_CGROUP_RUN_PROG_INET4_POST_BIND(), it will be unbinded, which
means that it can try to be binded to another port.
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220106132022.3470772-2-imagedong@tencent.com
|
|
Currently IRQs are requested one by one. To balance spreading IRQs
among cpus using such scheme requires remembering cpu mask for the
cpus used for a given device. This complicates the IRQ allocation
scheme in subsequent patch.
Hence, prepare the code for bulk IRQs allocation. This enables
spreading IRQs among cpus in subsequent patch.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
This rework the handling of hci_inquiry_result_with_rssi_evt to not use
a union to represent the different inquiry responses.
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Tested-by: Soenke Huster <soenke.huster@eknoes.de>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
Eric Dumazet suggested to allow users to modify max GRO packet size.
We have seen GRO being disabled by users of appliances (such as
wifi access points) because of claimed bufferbloat issues,
or some work arounds in sch_cake, to split GRO/GSO packets.
Instead of disabling GRO completely, one can chose to limit
the maximum packet size of GRO packets, depending on their
latency constraints.
This patch adds a per device gro_max_size attribute
that can be changed with ip link command.
ip link set dev eth0 gro_max_size 16000
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Coco Li <lixiaoyan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When multiple sockets using the SOF_TIMESTAMPING_BIND_PHC flag received
a packet with a hardware timestamp (e.g. multiple PTP instances in
different PTP domains using the UDPv4/v6 multicast or L2 transport),
the timestamps received on some sockets were corrupted due to repeated
conversion of the same timestamp (by the same or different vclocks).
Fix ptp_convert_timestamp() to not modify the shared skb timestamp
and return the converted timestamp as a ktime_t instead. If the
conversion fails, return 0 to not confuse the application with
timestamps corresponding to an unexpected PHC.
Fixes: d7c088265588 ("net: socket: support hardware timestamp conversion to PHC bound")
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Cc: Yangbo Lu <yangbo.lu@nxp.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As discussed during review here:
https://patchwork.kernel.org/project/netdevbpf/patch/20220105132141.2648876-3-vladimir.oltean@nxp.com/
we should inform developers about pitfalls of concurrent access to the
boolean properties of dsa_switch and dsa_port, now that they've been
converted to bit fields. No other measure than a comment needs to be
taken, since the code paths that update these bit fields are not
concurrent with each other.
Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This is a cosmetic incremental fixup to commits
7787ff776398 ("net: dsa: merge all bools of struct dsa_switch into a single u32")
bde82f389af1 ("net: dsa: merge all bools of struct dsa_port into a single u8")
The desire to make this change was enunciated after posting these
patches here:
https://patchwork.kernel.org/project/netdevbpf/cover/20220105132141.2648876-1-vladimir.oltean@nxp.com/
but due to a slight timing overlap (message posted at 2:28 p.m. UTC,
merge commit is at 2:46 p.m. UTC), that comment was missed and the
changes were applied as-is.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2022-01-06
1) Fix xfrm policy lookups for ipv6 gre packets by initializing
fl6_gre_key properly. From Ghalem Boudour.
2) Fix the dflt policy check on forwarding when there is no
policy configured. The check was done for the wrong direction.
From Nicolas Dichtel.
3) Use the correct 'struct xfrm_user_offload' when calculating
netlink message lenghts in xfrm_sa_len(). From Eric Dumazet.
4) Tread inserting xfrm interface id 0 as an error.
From Antony Antony.
5) Fail if xfrm state or policy is inserted with XFRMA_IF_ID 0,
xfrm interfaces with id 0 are not allowed.
From Antony Antony.
6) Fix inner_ipproto setting in the sec_path for tunnel mode.
From Raed Salem.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2022-01-06
1) Fix some clang_analyzer warnings about never read variables.
From luo penghao.
2) Check for pols[0] only once in xfrm_expand_policies().
From Jean Sacren.
3) The SA curlft.use_time was updated only on SA cration time.
Update whenever the SA is used. From Antony Antony
4) Add support for SM3 secure hash.
From Xu Jia.
5) Add support for SM4 symmetric cipher algorithm.
From Xu Jia.
6) Add a rate limit for SA mapping change messages.
From Antony Antony.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add an xdp_do_redirect_frame() variant which supports pre-computed
xdp_frame structures. This will be used in bpf_prog_run() to avoid having
to write to the xdp_frame structure when the XDP program doesn't modify the
frame boundaries.
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220103150812.87914-6-toke@redhat.com
|
|
All map redirect functions except XSK maps convert xdp_buff to xdp_frame
before enqueueing it. So move this conversion of out the map functions
and into xdp_do_redirect(). This removes a bit of duplicated code, but more
importantly it makes it possible to support caller-allocated xdp_frame
structures, which will be added in a subsequent commit.
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220103150812.87914-5-toke@redhat.com
|
|
Store the XDP mem ID inside the page_pool struct so it can be retrieved
later for use in bpf_prog_run().
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/20220103150812.87914-4-toke@redhat.com
|
|
Add a new callback function to page_pool that, if set, will be called every
time a new page is allocated. This will be used from bpf_test_run() to
initialise the page data with the data provided by userspace when running
XDP programs with redirect turned on.
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/20220103150812.87914-3-toke@redhat.com
|
|
The functions that register an XDP memory model take a struct xdp_rxq as
parameter, but the RXQ is not actually used for anything other than pulling
out the struct xdp_mem_info that it embeds. So refactor the register
functions and export variants that just take a pointer to the xdp_mem_info.
This is in preparation for enabling XDP_REDIRECT in bpf_prog_run(), using a
page_pool instance that is not connected to any network device.
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220103150812.87914-2-toke@redhat.com
|
|
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski"
"Networking fixes, including fixes from bpf, and WiFi. One last pull
request, turns out some of the recent fixes did more harm than good.
Current release - regressions:
- Revert "xsk: Do not sleep in poll() when need_wakeup set", made the
problem worse
- Revert "net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in
__fixed_phy_register", broke EPROBE_DEFER handling
- Revert "net: usb: r8152: Add MAC pass-through support for more
Lenovo Docks", broke setups without a Lenovo dock
Current release - new code bugs:
- selftests: set amt.sh executable
Previous releases - regressions:
- batman-adv: mcast: don't send link-local multicast to mcast routers
Previous releases - always broken:
- ipv4/ipv6: check attribute length for RTA_FLOW / RTA_GATEWAY
- sctp: hold endpoint before calling cb in
sctp_transport_lookup_process
- mac80211: mesh: embed mesh_paths and mpp_paths into
ieee80211_if_mesh to avoid complicated handling of sub-object
allocation failures
- seg6: fix traceroute in the presence of SRv6
- tipc: fix a kernel-infoleak in __tipc_sendmsg()"
* tag 'net-5.16-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (36 commits)
selftests: set amt.sh executable
Revert "net: usb: r8152: Add MAC passthrough support for more Lenovo Docks"
sfc: The RX page_ring is optional
iavf: Fix limit of total number of queues to active queues of VF
i40e: Fix incorrect netdev's real number of RX/TX queues
i40e: Fix for displaying message regarding NVM version
i40e: fix use-after-free in i40e_sync_filters_subtask()
i40e: Fix to not show opcode msg on unsuccessful VF MAC change
ieee802154: atusb: fix uninit value in atusb_set_extended_addr
mac80211: mesh: embedd mesh_paths and mpp_paths into ieee80211_if_mesh
mac80211: initialize variable have_higher_than_11mbit
sch_qfq: prevent shift-out-of-bounds in qfq_init_qdisc
netrom: fix copying in user data in nr_setsockopt
udp6: Use Segment Routing Header for dest address if present
icmp: ICMPV6: Examine invoking packet for Segment Route Headers.
seg6: export get_srh() for ICMP handling
Revert "net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in __fixed_phy_register"
ipv6: Do cleanup if attribute validation fails in multipath route
ipv6: Continue processing multipath route even if gateway attribute is invalid
net/fsl: Remove leftover definition in xgmac_mdio
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:
====================
pull-request: can-next 2022-01-05
this is a pull request of 15 patches for net-next/master.
The first patch is by me and removed an unused variable from the
usb_8dev driver.
Andy Shevchenko contributes a patch for the mcp251x driver, which
removes an unneeded assignment.
Jimmy Assarsson's patch for the kvaser_usb makes use of units.h in the
assignment of frequencies.
Lad Prabhakar provides 2 patches, converting the ti_hecc and the
sja1000 driver to make use of platform_get_irq().
The 10 remaining patches are by Vincent Mailhol. First the etas_es58x
driver populates the net_device::dev_port. The next 5 patches cleanup
the handling of CAN error and CAN RTR messages of all drivers. The
remaining 4 patches enhance the CAN controller mode flag handling and
export it via netlink to user space.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is a 7 byte hole after dst->setup and a 4 byte hole after
dst->default_proto. Combining them, we have a single hole of just 3
bytes on 64 bit machines.
Before:
pahole -C dsa_switch_tree net/dsa/slave.o
struct dsa_switch_tree {
struct list_head list; /* 0 16 */
struct list_head ports; /* 16 16 */
struct raw_notifier_head nh; /* 32 8 */
unsigned int index; /* 40 4 */
struct kref refcount; /* 44 4 */
struct net_device * * lags; /* 48 8 */
bool setup; /* 56 1 */
/* XXX 7 bytes hole, try to pack */
/* --- cacheline 1 boundary (64 bytes) --- */
const struct dsa_device_ops * tag_ops; /* 64 8 */
enum dsa_tag_protocol default_proto; /* 72 4 */
/* XXX 4 bytes hole, try to pack */
struct dsa_platform_data * pd; /* 80 8 */
struct list_head rtable; /* 88 16 */
unsigned int lags_len; /* 104 4 */
unsigned int last_switch; /* 108 4 */
/* size: 112, cachelines: 2, members: 13 */
/* sum members: 101, holes: 2, sum holes: 11 */
/* last cacheline: 48 bytes */
};
After:
pahole -C dsa_switch_tree net/dsa/slave.o
struct dsa_switch_tree {
struct list_head list; /* 0 16 */
struct list_head ports; /* 16 16 */
struct raw_notifier_head nh; /* 32 8 */
unsigned int index; /* 40 4 */
struct kref refcount; /* 44 4 */
struct net_device * * lags; /* 48 8 */
const struct dsa_device_ops * tag_ops; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
enum dsa_tag_protocol default_proto; /* 64 4 */
bool setup; /* 68 1 */
/* XXX 3 bytes hole, try to pack */
struct dsa_platform_data * pd; /* 72 8 */
struct list_head rtable; /* 80 16 */
unsigned int lags_len; /* 96 4 */
unsigned int last_switch; /* 100 4 */
/* size: 104, cachelines: 2, members: 13 */
/* sum members: 101, holes: 1, sum holes: 3 */
/* last cacheline: 40 bytes */
};
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
dst->ports is accessed most notably by dsa_master_find_slave(), which is
invoked in the RX path.
dst->lags is accessed by dsa_lag_dev(), which is invoked in the RX path
of tag_dsa.c.
dst->tag_ops, dst->default_proto and dst->pd don't need to be in the
first cache line, so they are moved out by this change.
Before:
pahole -C dsa_switch_tree net/dsa/slave.o
struct dsa_switch_tree {
struct list_head list; /* 0 16 */
struct raw_notifier_head nh; /* 16 8 */
unsigned int index; /* 24 4 */
struct kref refcount; /* 28 4 */
bool setup; /* 32 1 */
/* XXX 7 bytes hole, try to pack */
const struct dsa_device_ops * tag_ops; /* 40 8 */
enum dsa_tag_protocol default_proto; /* 48 4 */
/* XXX 4 bytes hole, try to pack */
struct dsa_platform_data * pd; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
struct list_head ports; /* 64 16 */
struct list_head rtable; /* 80 16 */
struct net_device * * lags; /* 96 8 */
unsigned int lags_len; /* 104 4 */
unsigned int last_switch; /* 108 4 */
/* size: 112, cachelines: 2, members: 13 */
/* sum members: 101, holes: 2, sum holes: 11 */
/* last cacheline: 48 bytes */
};
After:
pahole -C dsa_switch_tree net/dsa/slave.o
struct dsa_switch_tree {
struct list_head list; /* 0 16 */
struct list_head ports; /* 16 16 */
struct raw_notifier_head nh; /* 32 8 */
unsigned int index; /* 40 4 */
struct kref refcount; /* 44 4 */
struct net_device * * lags; /* 48 8 */
bool setup; /* 56 1 */
/* XXX 7 bytes hole, try to pack */
/* --- cacheline 1 boundary (64 bytes) --- */
const struct dsa_device_ops * tag_ops; /* 64 8 */
enum dsa_tag_protocol default_proto; /* 72 4 */
/* XXX 4 bytes hole, try to pack */
struct dsa_platform_data * pd; /* 80 8 */
struct list_head rtable; /* 88 16 */
unsigned int lags_len; /* 104 4 */
unsigned int last_switch; /* 108 4 */
/* size: 112, cachelines: 2, members: 13 */
/* sum members: 101, holes: 2, sum holes: 11 */
/* last cacheline: 48 bytes */
};
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, num_ports is declared as size_t, which is defined as
__kernel_ulong_t, therefore it occupies 8 bytes of memory.
Even switches with port numbers in the range of tens are exotic, so
there is no need for this amount of storage.
Additionally, because the max_num_bridges member right above it is also
4 bytes, it means the compiler needs to add padding between the last 2
fields. By reducing the size, we don't need that padding and can reduce
the struct size.
Before:
pahole -C dsa_switch net/dsa/slave.o
struct dsa_switch {
struct device * dev; /* 0 8 */
struct dsa_switch_tree * dst; /* 8 8 */
unsigned int index; /* 16 4 */
u32 setup:1; /* 20: 0 4 */
u32 vlan_filtering_is_global:1; /* 20: 1 4 */
u32 needs_standalone_vlan_filtering:1; /* 20: 2 4 */
u32 configure_vlan_while_not_filtering:1; /* 20: 3 4 */
u32 untag_bridge_pvid:1; /* 20: 4 4 */
u32 assisted_learning_on_cpu_port:1; /* 20: 5 4 */
u32 vlan_filtering:1; /* 20: 6 4 */
u32 pcs_poll:1; /* 20: 7 4 */
u32 mtu_enforcement_ingress:1; /* 20: 8 4 */
/* XXX 23 bits hole, try to pack */
struct notifier_block nb; /* 24 24 */
/* XXX last struct has 4 bytes of padding */
void * priv; /* 48 8 */
void * tagger_data; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
struct dsa_chip_data * cd; /* 64 8 */
const struct dsa_switch_ops * ops; /* 72 8 */
u32 phys_mii_mask; /* 80 4 */
/* XXX 4 bytes hole, try to pack */
struct mii_bus * slave_mii_bus; /* 88 8 */
unsigned int ageing_time_min; /* 96 4 */
unsigned int ageing_time_max; /* 100 4 */
struct dsa_8021q_context * tag_8021q_ctx; /* 104 8 */
struct devlink * devlink; /* 112 8 */
unsigned int num_tx_queues; /* 120 4 */
unsigned int num_lag_ids; /* 124 4 */
/* --- cacheline 2 boundary (128 bytes) --- */
unsigned int max_num_bridges; /* 128 4 */
/* XXX 4 bytes hole, try to pack */
size_t num_ports; /* 136 8 */
/* size: 144, cachelines: 3, members: 27 */
/* sum members: 132, holes: 2, sum holes: 8 */
/* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 23 bits */
/* paddings: 1, sum paddings: 4 */
/* last cacheline: 16 bytes */
};
After:
pahole -C dsa_switch net/dsa/slave.o
struct dsa_switch {
struct device * dev; /* 0 8 */
struct dsa_switch_tree * dst; /* 8 8 */
unsigned int index; /* 16 4 */
u32 setup:1; /* 20: 0 4 */
u32 vlan_filtering_is_global:1; /* 20: 1 4 */
u32 needs_standalone_vlan_filtering:1; /* 20: 2 4 */
u32 configure_vlan_while_not_filtering:1; /* 20: 3 4 */
u32 untag_bridge_pvid:1; /* 20: 4 4 */
u32 assisted_learning_on_cpu_port:1; /* 20: 5 4 */
u32 vlan_filtering:1; /* 20: 6 4 */
u32 pcs_poll:1; /* 20: 7 4 */
u32 mtu_enforcement_ingress:1; /* 20: 8 4 */
/* XXX 23 bits hole, try to pack */
struct notifier_block nb; /* 24 24 */
/* XXX last struct has 4 bytes of padding */
void * priv; /* 48 8 */
void * tagger_data; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
struct dsa_chip_data * cd; /* 64 8 */
const struct dsa_switch_ops * ops; /* 72 8 */
u32 phys_mii_mask; /* 80 4 */
/* XXX 4 bytes hole, try to pack */
struct mii_bus * slave_mii_bus; /* 88 8 */
unsigned int ageing_time_min; /* 96 4 */
unsigned int ageing_time_max; /* 100 4 */
struct dsa_8021q_context * tag_8021q_ctx; /* 104 8 */
struct devlink * devlink; /* 112 8 */
unsigned int num_tx_queues; /* 120 4 */
unsigned int num_lag_ids; /* 124 4 */
/* --- cacheline 2 boundary (128 bytes) --- */
unsigned int max_num_bridges; /* 128 4 */
unsigned int num_ports; /* 132 4 */
/* size: 136, cachelines: 3, members: 27 */
/* sum members: 128, holes: 1, sum holes: 4 */
/* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 23 bits */
/* paddings: 1, sum paddings: 4 */
/* last cacheline: 8 bytes */
};
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
struct dsa_switch has 9 boolean properties, many of which are in fact
set by drivers for custom behavior (vlan_filtering_is_global,
needs_standalone_vlan_filtering, etc etc). The binary layout of the
structure could be improved. For example, the "bool setup" at the
beginning introduces a gratuitous 7 byte hole in the first cache line.
The change merges all boolean properties into bitfields of an u32, and
places that u32 in the first cache line of the structure, since many
bools are accessed from the data path (untag_bridge_pvid, vlan_filtering,
vlan_filtering_is_global).
We place this u32 after the existing ds->index, which is also 4 bytes in
size. As a positive side effect, ds->tagger_data now fits into the first
cache line too, because 4 bytes are saved.
Before:
pahole -C dsa_switch net/dsa/slave.o
struct dsa_switch {
bool setup; /* 0 1 */
/* XXX 7 bytes hole, try to pack */
struct device * dev; /* 8 8 */
struct dsa_switch_tree * dst; /* 16 8 */
unsigned int index; /* 24 4 */
/* XXX 4 bytes hole, try to pack */
struct notifier_block nb; /* 32 24 */
/* XXX last struct has 4 bytes of padding */
void * priv; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
void * tagger_data; /* 64 8 */
struct dsa_chip_data * cd; /* 72 8 */
const struct dsa_switch_ops * ops; /* 80 8 */
u32 phys_mii_mask; /* 88 4 */
/* XXX 4 bytes hole, try to pack */
struct mii_bus * slave_mii_bus; /* 96 8 */
unsigned int ageing_time_min; /* 104 4 */
unsigned int ageing_time_max; /* 108 4 */
struct dsa_8021q_context * tag_8021q_ctx; /* 112 8 */
struct devlink * devlink; /* 120 8 */
/* --- cacheline 2 boundary (128 bytes) --- */
unsigned int num_tx_queues; /* 128 4 */
bool vlan_filtering_is_global; /* 132 1 */
bool needs_standalone_vlan_filtering; /* 133 1 */
bool configure_vlan_while_not_filtering; /* 134 1 */
bool untag_bridge_pvid; /* 135 1 */
bool assisted_learning_on_cpu_port; /* 136 1 */
bool vlan_filtering; /* 137 1 */
bool pcs_poll; /* 138 1 */
bool mtu_enforcement_ingress; /* 139 1 */
unsigned int num_lag_ids; /* 140 4 */
unsigned int max_num_bridges; /* 144 4 */
/* XXX 4 bytes hole, try to pack */
size_t num_ports; /* 152 8 */
/* size: 160, cachelines: 3, members: 27 */
/* sum members: 141, holes: 4, sum holes: 19 */
/* paddings: 1, sum paddings: 4 */
/* last cacheline: 32 bytes */
};
After:
pahole -C dsa_switch net/dsa/slave.o
struct dsa_switch {
struct device * dev; /* 0 8 */
struct dsa_switch_tree * dst; /* 8 8 */
unsigned int index; /* 16 4 */
u32 setup:1; /* 20: 0 4 */
u32 vlan_filtering_is_global:1; /* 20: 1 4 */
u32 needs_standalone_vlan_filtering:1; /* 20: 2 4 */
u32 configure_vlan_while_not_filtering:1; /* 20: 3 4 */
u32 untag_bridge_pvid:1; /* 20: 4 4 */
u32 assisted_learning_on_cpu_port:1; /* 20: 5 4 */
u32 vlan_filtering:1; /* 20: 6 4 */
u32 pcs_poll:1; /* 20: 7 4 */
u32 mtu_enforcement_ingress:1; /* 20: 8 4 */
/* XXX 23 bits hole, try to pack */
struct notifier_block nb; /* 24 24 */
/* XXX last struct has 4 bytes of padding */
void * priv; /* 48 8 */
void * tagger_data; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
struct dsa_chip_data * cd; /* 64 8 */
const struct dsa_switch_ops * ops; /* 72 8 */
u32 phys_mii_mask; /* 80 4 */
/* XXX 4 bytes hole, try to pack */
struct mii_bus * slave_mii_bus; /* 88 8 */
unsigned int ageing_time_min; /* 96 4 */
unsigned int ageing_time_max; /* 100 4 */
struct dsa_8021q_context * tag_8021q_ctx; /* 104 8 */
struct devlink * devlink; /* 112 8 */
unsigned int num_tx_queues; /* 120 4 */
unsigned int num_lag_ids; /* 124 4 */
/* --- cacheline 2 boundary (128 bytes) --- */
unsigned int max_num_bridges; /* 128 4 */
/* XXX 4 bytes hole, try to pack */
size_t num_ports; /* 136 8 */
/* size: 144, cachelines: 3, members: 27 */
/* sum members: 132, holes: 2, sum holes: 8 */
/* sum bitfield members: 9 bits, bit holes: 1, sum bit holes: 23 bits */
/* paddings: 1, sum paddings: 4 */
/* last cacheline: 16 bytes */
};
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Both dsa_port :: type and dsa_port :: index introduce a 4 octet hole
after them, so we can group them together and the holes would be
eliminated, turning 16 octets of storage into just 8. This makes the
cpu_dp pointer fit in the first cache line, which is good, because
dsa_slave_to_master(), called by dsa_enqueue_skb(), uses it.
Before:
pahole -C dsa_port net/dsa/slave.o
struct dsa_port {
union {
struct net_device * master; /* 0 8 */
struct net_device * slave; /* 0 8 */
}; /* 0 8 */
const struct dsa_device_ops * tag_ops; /* 8 8 */
struct dsa_switch_tree * dst; /* 16 8 */
struct sk_buff * (*rcv)(struct sk_buff *, struct net_device *); /* 24 8 */
enum {
DSA_PORT_TYPE_UNUSED = 0,
DSA_PORT_TYPE_CPU = 1,
DSA_PORT_TYPE_DSA = 2,
DSA_PORT_TYPE_USER = 3,
} type; /* 32 4 */
/* XXX 4 bytes hole, try to pack */
struct dsa_switch * ds; /* 40 8 */
unsigned int index; /* 48 4 */
/* XXX 4 bytes hole, try to pack */
const char * name; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
struct dsa_port * cpu_dp; /* 64 8 */
u8 mac[6]; /* 72 6 */
u8 stp_state; /* 78 1 */
u8 vlan_filtering:1; /* 79: 0 1 */
u8 learning:1; /* 79: 1 1 */
u8 lag_tx_enabled:1; /* 79: 2 1 */
u8 devlink_port_setup:1; /* 79: 3 1 */
u8 setup:1; /* 79: 4 1 */
/* XXX 3 bits hole, try to pack */
struct device_node * dn; /* 80 8 */
unsigned int ageing_time; /* 88 4 */
/* XXX 4 bytes hole, try to pack */
struct dsa_bridge * bridge; /* 96 8 */
struct devlink_port devlink_port; /* 104 288 */
/* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
struct phylink * pl; /* 392 8 */
struct phylink_config pl_config; /* 400 40 */
struct net_device * lag_dev; /* 440 8 */
/* --- cacheline 7 boundary (448 bytes) --- */
struct net_device * hsr_dev; /* 448 8 */
struct list_head list; /* 456 16 */
const struct ethtool_ops * orig_ethtool_ops; /* 472 8 */
const struct dsa_netdevice_ops * netdev_ops; /* 480 8 */
struct mutex addr_lists_lock; /* 488 32 */
/* --- cacheline 8 boundary (512 bytes) was 8 bytes ago --- */
struct list_head fdbs; /* 520 16 */
struct list_head mdbs; /* 536 16 */
/* size: 552, cachelines: 9, members: 30 */
/* sum members: 539, holes: 3, sum holes: 12 */
/* sum bitfield members: 5 bits, bit holes: 1, sum bit holes: 3 bits */
/* last cacheline: 40 bytes */
};
After:
pahole -C dsa_port net/dsa/slave.o
struct dsa_port {
union {
struct net_device * master; /* 0 8 */
struct net_device * slave; /* 0 8 */
}; /* 0 8 */
const struct dsa_device_ops * tag_ops; /* 8 8 */
struct dsa_switch_tree * dst; /* 16 8 */
struct sk_buff * (*rcv)(struct sk_buff *, struct net_device *); /* 24 8 */
struct dsa_switch * ds; /* 32 8 */
unsigned int index; /* 40 4 */
enum {
DSA_PORT_TYPE_UNUSED = 0,
DSA_PORT_TYPE_CPU = 1,
DSA_PORT_TYPE_DSA = 2,
DSA_PORT_TYPE_USER = 3,
} type; /* 44 4 */
const char * name; /* 48 8 */
struct dsa_port * cpu_dp; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
u8 mac[6]; /* 64 6 */
u8 stp_state; /* 70 1 */
u8 vlan_filtering:1; /* 71: 0 1 */
u8 learning:1; /* 71: 1 1 */
u8 lag_tx_enabled:1; /* 71: 2 1 */
u8 devlink_port_setup:1; /* 71: 3 1 */
u8 setup:1; /* 71: 4 1 */
/* XXX 3 bits hole, try to pack */
struct device_node * dn; /* 72 8 */
unsigned int ageing_time; /* 80 4 */
/* XXX 4 bytes hole, try to pack */
struct dsa_bridge * bridge; /* 88 8 */
struct devlink_port devlink_port; /* 96 288 */
/* --- cacheline 6 boundary (384 bytes) --- */
struct phylink * pl; /* 384 8 */
struct phylink_config pl_config; /* 392 40 */
struct net_device * lag_dev; /* 432 8 */
struct net_device * hsr_dev; /* 440 8 */
/* --- cacheline 7 boundary (448 bytes) --- */
struct list_head list; /* 448 16 */
const struct ethtool_ops * orig_ethtool_ops; /* 464 8 */
const struct dsa_netdevice_ops * netdev_ops; /* 472 8 */
struct mutex addr_lists_lock; /* 480 32 */
/* --- cacheline 8 boundary (512 bytes) --- */
struct list_head fdbs; /* 512 16 */
struct list_head mdbs; /* 528 16 */
/* size: 544, cachelines: 9, members: 30 */
/* sum members: 539, holes: 1, sum holes: 4 */
/* sum bitfield members: 5 bits, bit holes: 1, sum bit holes: 3 bits */
/* last cacheline: 32 bytes */
};
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
struct dsa_port has 5 bool members which create quite a number of 7 byte
holes in the structure layout. By merging them all into bitfields of an
u8, and placing that u8 in the 1-byte hole after dp->mac and dp->stp_state,
we can reduce the structure size from 576 bytes to 552 bytes on arm64.
Before:
pahole -C dsa_port net/dsa/slave.o
struct dsa_port {
union {
struct net_device * master; /* 0 8 */
struct net_device * slave; /* 0 8 */
}; /* 0 8 */
const struct dsa_device_ops * tag_ops; /* 8 8 */
struct dsa_switch_tree * dst; /* 16 8 */
struct sk_buff * (*rcv)(struct sk_buff *, struct net_device *); /* 24 8 */
enum {
DSA_PORT_TYPE_UNUSED = 0,
DSA_PORT_TYPE_CPU = 1,
DSA_PORT_TYPE_DSA = 2,
DSA_PORT_TYPE_USER = 3,
} type; /* 32 4 */
/* XXX 4 bytes hole, try to pack */
struct dsa_switch * ds; /* 40 8 */
unsigned int index; /* 48 4 */
/* XXX 4 bytes hole, try to pack */
const char * name; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
struct dsa_port * cpu_dp; /* 64 8 */
u8 mac[6]; /* 72 6 */
u8 stp_state; /* 78 1 */
/* XXX 1 byte hole, try to pack */
struct device_node * dn; /* 80 8 */
unsigned int ageing_time; /* 88 4 */
bool vlan_filtering; /* 92 1 */
bool learning; /* 93 1 */
/* XXX 2 bytes hole, try to pack */
struct dsa_bridge * bridge; /* 96 8 */
struct devlink_port devlink_port; /* 104 288 */
/* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
bool devlink_port_setup; /* 392 1 */
/* XXX 7 bytes hole, try to pack */
struct phylink * pl; /* 400 8 */
struct phylink_config pl_config; /* 408 40 */
/* --- cacheline 7 boundary (448 bytes) --- */
struct net_device * lag_dev; /* 448 8 */
bool lag_tx_enabled; /* 456 1 */
/* XXX 7 bytes hole, try to pack */
struct net_device * hsr_dev; /* 464 8 */
struct list_head list; /* 472 16 */
const struct ethtool_ops * orig_ethtool_ops; /* 488 8 */
const struct dsa_netdevice_ops * netdev_ops; /* 496 8 */
struct mutex addr_lists_lock; /* 504 32 */
/* --- cacheline 8 boundary (512 bytes) was 24 bytes ago --- */
struct list_head fdbs; /* 536 16 */
struct list_head mdbs; /* 552 16 */
bool setup; /* 568 1 */
/* size: 576, cachelines: 9, members: 30 */
/* sum members: 544, holes: 6, sum holes: 25 */
/* padding: 7 */
};
After:
pahole -C dsa_port net/dsa/slave.o
struct dsa_port {
union {
struct net_device * master; /* 0 8 */
struct net_device * slave; /* 0 8 */
}; /* 0 8 */
const struct dsa_device_ops * tag_ops; /* 8 8 */
struct dsa_switch_tree * dst; /* 16 8 */
struct sk_buff * (*rcv)(struct sk_buff *, struct net_device *); /* 24 8 */
enum {
DSA_PORT_TYPE_UNUSED = 0,
DSA_PORT_TYPE_CPU = 1,
DSA_PORT_TYPE_DSA = 2,
DSA_PORT_TYPE_USER = 3,
} type; /* 32 4 */
/* XXX 4 bytes hole, try to pack */
struct dsa_switch * ds; /* 40 8 */
unsigned int index; /* 48 4 */
/* XXX 4 bytes hole, try to pack */
const char * name; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
struct dsa_port * cpu_dp; /* 64 8 */
u8 mac[6]; /* 72 6 */
u8 stp_state; /* 78 1 */
u8 vlan_filtering:1; /* 79: 0 1 */
u8 learning:1; /* 79: 1 1 */
u8 lag_tx_enabled:1; /* 79: 2 1 */
u8 devlink_port_setup:1; /* 79: 3 1 */
u8 setup:1; /* 79: 4 1 */
/* XXX 3 bits hole, try to pack */
struct device_node * dn; /* 80 8 */
unsigned int ageing_time; /* 88 4 */
/* XXX 4 bytes hole, try to pack */
struct dsa_bridge * bridge; /* 96 8 */
struct devlink_port devlink_port; /* 104 288 */
/* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
struct phylink * pl; /* 392 8 */
struct phylink_config pl_config; /* 400 40 */
struct net_device * lag_dev; /* 440 8 */
/* --- cacheline 7 boundary (448 bytes) --- */
struct net_device * hsr_dev; /* 448 8 */
struct list_head list; /* 456 16 */
const struct ethtool_ops * orig_ethtool_ops; /* 472 8 */
const struct dsa_netdevice_ops * netdev_ops; /* 480 8 */
struct mutex addr_lists_lock; /* 488 32 */
/* --- cacheline 8 boundary (512 bytes) was 8 bytes ago --- */
struct list_head fdbs; /* 520 16 */
struct list_head mdbs; /* 536 16 */
/* size: 552, cachelines: 9, members: 30 */
/* sum members: 539, holes: 3, sum holes: 12 */
/* sum bitfield members: 5 bits, bit holes: 1, sum bit holes: 3 bits */
/* last cacheline: 40 bytes */
};
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The MAC address of a port is 6 octets in size, and this creates a 2
octet hole after it. There are some other u8 members of struct dsa_port
that we can put in that hole. One such member is the stp_state.
Before:
pahole -C dsa_port net/dsa/slave.o
struct dsa_port {
union {
struct net_device * master; /* 0 8 */
struct net_device * slave; /* 0 8 */
}; /* 0 8 */
const struct dsa_device_ops * tag_ops; /* 8 8 */
struct dsa_switch_tree * dst; /* 16 8 */
struct sk_buff * (*rcv)(struct sk_buff *, struct net_device *); /* 24 8 */
enum {
DSA_PORT_TYPE_UNUSED = 0,
DSA_PORT_TYPE_CPU = 1,
DSA_PORT_TYPE_DSA = 2,
DSA_PORT_TYPE_USER = 3,
} type; /* 32 4 */
/* XXX 4 bytes hole, try to pack */
struct dsa_switch * ds; /* 40 8 */
unsigned int index; /* 48 4 */
/* XXX 4 bytes hole, try to pack */
const char * name; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
struct dsa_port * cpu_dp; /* 64 8 */
u8 mac[6]; /* 72 6 */
/* XXX 2 bytes hole, try to pack */
struct device_node * dn; /* 80 8 */
unsigned int ageing_time; /* 88 4 */
bool vlan_filtering; /* 92 1 */
bool learning; /* 93 1 */
u8 stp_state; /* 94 1 */
/* XXX 1 byte hole, try to pack */
struct dsa_bridge * bridge; /* 96 8 */
struct devlink_port devlink_port; /* 104 288 */
/* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
bool devlink_port_setup; /* 392 1 */
/* XXX 7 bytes hole, try to pack */
struct phylink * pl; /* 400 8 */
struct phylink_config pl_config; /* 408 40 */
/* --- cacheline 7 boundary (448 bytes) --- */
struct net_device * lag_dev; /* 448 8 */
bool lag_tx_enabled; /* 456 1 */
/* XXX 7 bytes hole, try to pack */
struct net_device * hsr_dev; /* 464 8 */
struct list_head list; /* 472 16 */
const struct ethtool_ops * orig_ethtool_ops; /* 488 8 */
const struct dsa_netdevice_ops * netdev_ops; /* 496 8 */
struct mutex addr_lists_lock; /* 504 32 */
/* --- cacheline 8 boundary (512 bytes) was 24 bytes ago --- */
struct list_head fdbs; /* 536 16 */
struct list_head mdbs; /* 552 16 */
bool setup; /* 568 1 */
/* size: 576, cachelines: 9, members: 30 */
/* sum members: 544, holes: 6, sum holes: 25 */
/* padding: 7 */
};
After:
pahole -C dsa_port net/dsa/slave.o
struct dsa_port {
union {
struct net_device * master; /* 0 8 */
struct net_device * slave; /* 0 8 */
}; /* 0 8 */
const struct dsa_device_ops * tag_ops; /* 8 8 */
struct dsa_switch_tree * dst; /* 16 8 */
struct sk_buff * (*rcv)(struct sk_buff *, struct net_device *); /* 24 8 */
enum {
DSA_PORT_TYPE_UNUSED = 0,
DSA_PORT_TYPE_CPU = 1,
DSA_PORT_TYPE_DSA = 2,
DSA_PORT_TYPE_USER = 3,
} type; /* 32 4 */
/* XXX 4 bytes hole, try to pack */
struct dsa_switch * ds; /* 40 8 */
unsigned int index; /* 48 4 */
/* XXX 4 bytes hole, try to pack */
const char * name; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
struct dsa_port * cpu_dp; /* 64 8 */
u8 mac[6]; /* 72 6 */
u8 stp_state; /* 78 1 */
/* XXX 1 byte hole, try to pack */
struct device_node * dn; /* 80 8 */
unsigned int ageing_time; /* 88 4 */
bool vlan_filtering; /* 92 1 */
bool learning; /* 93 1 */
/* XXX 2 bytes hole, try to pack */
struct dsa_bridge * bridge; /* 96 8 */
struct devlink_port devlink_port; /* 104 288 */
/* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */
bool devlink_port_setup; /* 392 1 */
/* XXX 7 bytes hole, try to pack */
struct phylink * pl; /* 400 8 */
struct phylink_config pl_config; /* 408 40 */
/* --- cacheline 7 boundary (448 bytes) --- */
struct net_device * lag_dev; /* 448 8 */
bool lag_tx_enabled; /* 456 1 */
/* XXX 7 bytes hole, try to pack */
struct net_device * hsr_dev; /* 464 8 */
struct list_head list; /* 472 16 */
const struct ethtool_ops * orig_ethtool_ops; /* 488 8 */
const struct dsa_netdevice_ops * netdev_ops; /* 496 8 */
struct mutex addr_lists_lock; /* 504 32 */
/* --- cacheline 8 boundary (512 bytes) was 24 bytes ago --- */
struct list_head fdbs; /* 536 16 */
struct list_head mdbs; /* 552 16 */
bool setup; /* 568 1 */
/* size: 576, cachelines: 9, members: 30 */
/* sum members: 544, holes: 6, sum holes: 25 */
/* padding: 7 */
};
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a couple of helpers and definitions to extract the clause 45 regad
and devad fields from the regnum passed into MDIO drivers.
Tested-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, the CAN netlink interface provides no easy ways to check
the capabilities of a given controller. The only method from the
command line is to try each CAN_CTRLMODE_* individually to check
whether the netlink interface returns an -EOPNOTSUPP error or not
(alternatively, one may find it easier to directly check the source
code of the driver instead...)
This patch introduces a method for the user to check both the
supported and the static capabilities. The proposed method introduces
a new IFLA nest: IFLA_CAN_CTRLMODE_EXT which extends the current
IFLA_CAN_CTRLMODE. This is done to guaranty a full forward and
backward compatibility between the kernel and the user land
applications.
The IFLA_CAN_CTRLMODE_EXT nest contains one single entry:
IFLA_CAN_CTRLMODE_SUPPORTED. Because this entry is only used in one
direction: kernel to userland, no new struct nla_policy are
introduced.
Below table explains how IFLA_CAN_CTRLMODE_SUPPORTED (hereafter:
"supported") and can_ctrlmode::flags (hereafter: "flags") allow us to
identify both the supported and the static capabilities, when masked
with any of the CAN_CTRLMODE_* bit flags:
supported & flags & Controller capabilities
CAN_CTRLMODE_* CAN_CTRLMODE_*
-----------------------------------------------------------------------
false false Feature not supported (always disabled)
false true Static feature (always enabled)
true false Feature supported but disabled
true true Feature supported and enabled
Link: https://lore.kernel.org/all/20211213160226.56219-5-mailhol.vincent@wanadoo.fr
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|
|
Save eight bytes of holes on x86-64 architectures by reordering the
members of struct can_priv.
Before:
| $ pahole -C can_priv drivers/net/can/dev/dev.o
| struct can_priv {
| struct net_device * dev; /* 0 8 */
| struct can_device_stats can_stats; /* 8 24 */
| const struct can_bittiming_const * bittiming_const; /* 32 8 */
| const struct can_bittiming_const * data_bittiming_const; /* 40 8 */
| struct can_bittiming bittiming; /* 48 32 */
| /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
| struct can_bittiming data_bittiming; /* 80 32 */
| const struct can_tdc_const * tdc_const; /* 112 8 */
| struct can_tdc tdc; /* 120 12 */
| /* --- cacheline 2 boundary (128 bytes) was 4 bytes ago --- */
| unsigned int bitrate_const_cnt; /* 132 4 */
| const u32 * bitrate_const; /* 136 8 */
| const u32 * data_bitrate_const; /* 144 8 */
| unsigned int data_bitrate_const_cnt; /* 152 4 */
| u32 bitrate_max; /* 156 4 */
| struct can_clock clock; /* 160 4 */
| unsigned int termination_const_cnt; /* 164 4 */
| const u16 * termination_const; /* 168 8 */
| u16 termination; /* 176 2 */
|
| /* XXX 6 bytes hole, try to pack */
|
| struct gpio_desc * termination_gpio; /* 184 8 */
| /* --- cacheline 3 boundary (192 bytes) --- */
| u16 termination_gpio_ohms[2]; /* 192 4 */
| enum can_state state; /* 196 4 */
| u32 ctrlmode; /* 200 4 */
| u32 ctrlmode_supported; /* 204 4 */
| int restart_ms; /* 208 4 */
|
| /* XXX 4 bytes hole, try to pack */
|
| struct delayed_work restart_work; /* 216 88 */
|
| /* XXX last struct has 4 bytes of padding */
|
| /* --- cacheline 4 boundary (256 bytes) was 48 bytes ago --- */
| int (*do_set_bittiming)(struct net_device *); /* 304 8 */
| int (*do_set_data_bittiming)(struct net_device *); /* 312 8 */
| /* --- cacheline 5 boundary (320 bytes) --- */
| int (*do_set_mode)(struct net_device *, enum can_mode); /* 320 8 */
| int (*do_set_termination)(struct net_device *, u16); /* 328 8 */
| int (*do_get_state)(const struct net_device *, enum can_state *); /* 336 8 */
| int (*do_get_berr_counter)(const struct net_device *, struct can_berr_counter *); /* 344 8 */
| unsigned int echo_skb_max; /* 352 4 */
|
| /* XXX 4 bytes hole, try to pack */
|
| struct sk_buff * * echo_skb; /* 360 8 */
|
| /* size: 368, cachelines: 6, members: 32 */
| /* sum members: 354, holes: 3, sum holes: 14 */
| /* paddings: 1, sum paddings: 4 */
| /* last cacheline: 48 bytes */
| };
After:
| $ pahole -C can_priv drivers/net/can/dev/dev.o
| struct can_priv {
| struct net_device * dev; /* 0 8 */
| struct can_device_stats can_stats; /* 8 24 */
| const struct can_bittiming_const * bittiming_const; /* 32 8 */
| const struct can_bittiming_const * data_bittiming_const; /* 40 8 */
| struct can_bittiming bittiming; /* 48 32 */
| /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
| struct can_bittiming data_bittiming; /* 80 32 */
| const struct can_tdc_const * tdc_const; /* 112 8 */
| struct can_tdc tdc; /* 120 12 */
| /* --- cacheline 2 boundary (128 bytes) was 4 bytes ago --- */
| unsigned int bitrate_const_cnt; /* 132 4 */
| const u32 * bitrate_const; /* 136 8 */
| const u32 * data_bitrate_const; /* 144 8 */
| unsigned int data_bitrate_const_cnt; /* 152 4 */
| u32 bitrate_max; /* 156 4 */
| struct can_clock clock; /* 160 4 */
| unsigned int termination_const_cnt; /* 164 4 */
| const u16 * termination_const; /* 168 8 */
| u16 termination; /* 176 2 */
|
| /* XXX 6 bytes hole, try to pack */
|
| struct gpio_desc * termination_gpio; /* 184 8 */
| /* --- cacheline 3 boundary (192 bytes) --- */
| u16 termination_gpio_ohms[2]; /* 192 4 */
| unsigned int echo_skb_max; /* 196 4 */
| struct sk_buff * * echo_skb; /* 200 8 */
| enum can_state state; /* 208 4 */
| u32 ctrlmode; /* 212 4 */
| u32 ctrlmode_supported; /* 216 4 */
| int restart_ms; /* 220 4 */
| struct delayed_work restart_work; /* 224 88 */
|
| /* XXX last struct has 4 bytes of padding */
|
| /* --- cacheline 4 boundary (256 bytes) was 56 bytes ago --- */
| int (*do_set_bittiming)(struct net_device *); /* 312 8 */
| /* --- cacheline 5 boundary (320 bytes) --- */
| int (*do_set_data_bittiming)(struct net_device *); /* 320 8 */
| int (*do_set_mode)(struct net_device *, enum can_mode); /* 328 8 */
| int (*do_set_termination)(struct net_device *, u16); /* 336 8 */
| int (*do_get_state)(const struct net_device *, enum can_state *); /* 344 8 */
| int (*do_get_berr_counter)(const struct net_device *, struct can_berr_counter *); /* 352 8 */
|
| /* size: 360, cachelines: 6, members: 32 */
| /* sum members: 354, holes: 1, sum holes: 6 */
| /* paddings: 1, sum paddings: 4 */
| /* last cacheline: 40 bytes */
| };
Link: https://lore.kernel.org/all/20211213160226.56219-4-mailhol.vincent@wanadoo.fr
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
|