Age | Commit message (Collapse) | Author | Files | Lines |
|
This reverts commit 2ef8e39f58f08589ab035223c2687830c0eba30f, reversing
changes made to e7ce9fc9ad38773b660ef663ae98df4f93cb6a37.
There are build warnings here which break the normal
build due to -Werror. Ratheesh was nice enough to quickly
follow up with fixes but didn't hit all the warnings I
see on GCC 12 so to unlock net-next from taking patches
let get this series out for now.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
TX doorbells may be postponed, because sometimes the driver knows that
another packet follows (for example, when xmit_more is true, or when a
MPWQE session is closed before transmitting a packet).
However, the DMA mapping may fail for the next packet, in which case a
new WQE is not posted, the doorbell isn't updated either, and the
transmission of the previous packet will be delayed indefinitely.
This commit fixes the described rare error flow by posting a NOP and
ringing the doorbell on errors to flush all the previous packets. The
MPWQE session is closed before that. DMA mapping in the MPWQE flow is
moved to the beginning of mlx5e_sq_xmit_mpwqe, because empty sessions
are not allowed. Stop room always has enough space for a NOP, because
the actual TX WQE is not posted.
Fixes: e586b3b0baee ("net/mlx5: Ethernet Datapath files")
Signed-off-by: Maxim Mikityanskiy <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
The existing capability check for vnic env counters only checks for
receive steering discards, although we need the counters update for the
exposed internal queue oob counter as well. This could result in the
latter counter not being updated correctly when the receive steering
discards counter is not supported.
Fix that by checking whether any counter is supported instead of only
the steering counter capability.
Fixes: 0cfafd4b4ddf ("net/mlx5e: Add device out of buffer counter")
Signed-off-by: Gal Pressman <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Allocate a ct priv workqueue instead of using mlx5e priv one
so flushing will only be of related CT entries.
Also move flushing of the workqueue before rhashtable destroy
otherwise entries won't be valid.
Fixes: b069e14fff46 ("net/mlx5e: CT: Fix queued up restore put() executing after relevant ft release")
Signed-off-by: Roi Dayan <[email protected]>
Reviewed-by: Oz Shlomo <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
mode & mode_flags is updated at the end of mlx5_activate_lag which
may not reflect the actual mode as shown in below logic:
mlx5_activate_lag(struct mlx5_lag *ldev,
|-- unsigned long flags = 0;
|-- err = mlx5_lag_set_flags(ldev, mode, tracker, shared_fdb, &flags);
|-- err = mlx5_create_lag(ldev, tracker, mode, flags);
|-- mlx5_get_str_port_sel_mode(ldev);
|-- ldev->mode = mode;
|-- ldev->mode_flags = flags;
Use mode & flag as parameters to get port select mode info.
Fixes: 94db33177819 ("net/mlx5: Support multiport eswitch mode")
Signed-off-by: Liu, Changcheng <[email protected]>
Reviewed-by: Eli Cohen <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
There is a total of four 4M entries flow tables. In sriov disabled
mode, ct, ct_nat and post_act take three of them. When adding the
first tc nic rule in this mode, it will take another 4M table
for the tc <chain,prio> table. If user then enables sriov, the legacy
flow table tries to take another 4M and fails, and so enablement fails.
To fix that, have legacy fdb take the next available maximum
size from the fs ft pool.
Fixes: 4a98544d1827 ("net/mlx5: Move chains ft pool to be used by all firmware steering")
Signed-off-by: Paul Blakey <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Use the correct constant (TLS_DRIVER_STATE_SIZE_RX) in the comparison
against the size of the private RX TLS driver context.
Fixes: 1182f3659357 ("net/mlx5e: kTLS, Add kTLS RX HW offload support")
Signed-off-by: Tariq Toukan <[email protected]>
Reviewed-by: Maxim Mikityanskiy <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Use the correct constant (TLS_DRIVER_STATE_SIZE_TX) in the comparison
against the size of the private TX TLS driver context.
Fixes: df8d866770f9 ("net/mlx5e: kTLS, Use kernel API to extract private offload context")
Signed-off-by: Tariq Toukan <[email protected]>
Reviewed-by: Maxim Mikityanskiy <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Multiport eswitch is required to use native FDB selection instead of
affinity, This was achieved by passing the shared_fdb flag down
the HW lag creation path. While it did accomplish the goal of setting
FDB selection mode to native, it had the side effect of also
creating a shared FDB configuration.
This created a few issues:
- TC rules are inserted into a non active FDB, which means traffic isn't
offloaded as all traffic will reach only a single FDB.
- All wire traffic is treated as if a single physical port received it; while
this is true for a bond configuration, this shouldn't be the case for
multiport eswitch.
Create a new flag MLX5_LAG_MODE_FLAG_FDB_SEL_MODE_NATIVE
to indicate what FDB selection mode should be used.
Fixes: 94db33177819 ("net/mlx5: Support multiport eswitch mode")
Signed-off-by: Mark Bloch <[email protected]>
Reviewed-by: Eli Cohen <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Redirecting traffic from uplink to a VF is a legal operation of
mulitport eswitch mode. Remove the limitation.
Fixes: 94db33177819 ("net/mlx5: Support multiport eswitch mode")
Signed-off-by: Eli Cohen <[email protected]>
Reviewed-by: Maor Dickman <[email protected]>
Reviewed-by: Mark Bloch <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
When using hinic device as a bond slave device, and reading device stats
of master bond device, the kernel may hung.
The kernel panic calltrace as follows:
Kernel panic - not syncing: softlockup: hung tasks
Call trace:
native_queued_spin_lock_slowpath+0x1ec/0x31c
dev_get_stats+0x60/0xcc
dev_seq_printf_stats+0x40/0x120
dev_seq_show+0x1c/0x40
seq_read_iter+0x3c8/0x4dc
seq_read+0xe0/0x130
proc_reg_read+0xa8/0xe0
vfs_read+0xb0/0x1d4
ksys_read+0x70/0xfc
__arm64_sys_read+0x20/0x30
el0_svc_common+0x88/0x234
do_el0_svc+0x2c/0x90
el0_svc+0x1c/0x30
el0_sync_handler+0xa8/0xb0
el0_sync+0x148/0x180
And the calltrace of task that actually caused kernel hungs as follows:
__switch_to+124
__schedule+548
schedule+72
schedule_timeout+348
__down_common+188
__down+24
down+104
hinic_get_stats64+44 [hinic]
dev_get_stats+92
bond_get_stats+172 [bonding]
dev_get_stats+92
dev_seq_printf_stats+60
dev_seq_show+24
seq_read_iter+964
seq_read+220
proc_reg_read+164
vfs_read+172
ksys_read+108
__arm64_sys_read+28
el0_svc_common+132
do_el0_svc+40
el0_svc+24
el0_sync_handler+164
el0_sync+324
When getting device stats from bond, kernel will call bond_get_stats().
It first holds the spinlock bond->stats_lock, and then call
hinic_get_stats64() to collect hinic device's stats.
However, hinic_get_stats64() calls `down(&nic_dev->mgmt_lock)` to
protect its critical section, which may schedule current task out.
And if system is under high pressure, the task cannot be woken up
immediately, which eventually triggers kernel hung panic.
Since previous patch has replaced hinic_dev.tx_stats/rx_stats with local
variable in hinic_get_stats64(), there is nothing need to be protected
by lock, so just removing down()/up() is ok.
Fixes: edd384f682cc ("net-next/hinic: Add ethtool and stats")
Signed-off-by: Qiao Ma <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Function hinic_get_stats64() will do two operations:
1. reads stats from every hinic_rxq/txq and accumulates them
2. calls hinic_rxq/txq_clean_stats() to clean every rxq/txq's stats
For hinic_get_stats64(), it could get right data, because it sums all
data to nic_dev->rx_stats/tx_stats.
But it is wrong for get_drv_queue_stats(), this function will read
hinic_rxq's stats, which have been cleared to zero by hinic_get_stats64().
I have observed hinic's cleanup operation by using such command:
> watch -n 1 "cat ethtool -S eth4 | tail -40"
Result before:
...
rxq7_pkts: 1
rxq7_bytes: 90
rxq7_errors: 0
rxq7_csum_errors: 0
rxq7_other_errors: 0
...
rxq9_pkts: 11
rxq9_bytes: 726
rxq9_errors: 0
rxq9_csum_errors: 0
rxq9_other_errors: 0
...
rxq11_pkts: 0
rxq11_bytes: 0
rxq11_errors: 0
rxq11_csum_errors: 0
rxq11_other_errors: 0
Result after a few seconds:
...
rxq7_pkts: 0
rxq7_bytes: 0
rxq7_errors: 0
rxq7_csum_errors: 0
rxq7_other_errors: 0
...
rxq9_pkts: 2
rxq9_bytes: 132
rxq9_errors: 0
rxq9_csum_errors: 0
rxq9_other_errors: 0
...
rxq11_pkts: 1
rxq11_bytes: 170
rxq11_errors: 0
rxq11_csum_errors: 0
rxq11_other_errors: 0
To solve this problem, we just keep every queue's total stats in their own
queue (aka hinic_{rxq|txq}), and simply sum all per-queue stats every time
calling hinic_get_stats64().
With that solution, there is no need to clean per-queue stats now,
and there is no need to maintain global hinic_dev.{tx|rx}_stats, too.
Fixes: edd384f682cc ("net-next/hinic: Add ethtool and stats")
Signed-off-by: Qiao Ma <[email protected]>
Reported-by: kernel test robot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Referenced commit prepared the code for upcoming extension that allows mlx5
to offload police action attached to flower classifier. However, with
regard to existing matchall classifier offload validation should be
reversed as FLOW_ACTION_CONTINUE is the only supported notexceed police
action type. Fix the problem by allowing FLOW_ACTION_CONTINUE for police
action and extend scan_tc_matchall_fdb_actions() to only allow such actions
with matchall classifier.
Fixes: d97b4b105ce7 ("flow_offload: reject offload for all drivers with invalid police parameters")
Signed-off-by: Vlad Buslov <[email protected]>
Acked-by: Saeed Mahameed <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Enabled EXACT match flag in Kex default profile. Since
there is no space in key, NPC_PARSE_NIBBLE_ERRCODE
is removed
Signed-off-by: Ratheesh Kannoth <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
NPC exact match table can support more entries than RPM
dmac filters. This requires field size of DMAC filter count
and index to be increased.
Signed-off-by: Ratheesh Kannoth <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
If exact match table is suppoted, call functions to add/del/update
entries in exact match table instead of RPM dmac filters
Signed-off-by: Ratheesh Kannoth <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
These functions are wrappers for mac add/addr/del/update in
exact match table. These will be invoked from mbox handler routines
if exact matct table is supported and enabled.
Signed-off-by: Ratheesh Kannoth <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Exact match table modification requires wider fields as it has
more number of slots to fill in. Modifying an entry in exact match
table may cause hash collision and may be required to delete entry
from 4-way 2K table and add to fully associative 32 entry CAM table.
Signed-off-by: Ratheesh Kannoth <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
There debugfs files created.
1. General information on exact match table
2. Exact match table entries.
3. NPC mcam drop on hit count stats.
Signed-off-by: Ratheesh Kannoth <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
NPC exact match table installs drop on hit rules in
NPC mcam for each channel. This rule has broadcast and multicast
bits cleared. Exact match bit cleared and channel bits
set. If exact match table hit bit is 0, corresponding NPC mcam
drop rule will be hit for the packet and will be dropped.
Signed-off-by: Ratheesh Kannoth <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
FLR handler should remove/free all exact match table resources
corresponding to each interface.
Signed-off-by: Ratheesh Kannoth <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
CN10KB silicon supports Exact match feature. This feature can be disabled
through devlink configuration. Devlink command fails if DMAC filter rules
are already present. Once disabled, legacy RPM based DMAC filters will be
configured.
Signed-off-by: Ratheesh Kannoth <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
CN10KB silicon supports exact match table. Scanning KEX
profile should check for exact match feature is enabled
and then set profile masks properly.
These kex profile masks are required to configure NPC
MCAM drop rules. If there is a miss in exact match table,
these drop rules will drop those packets.
Signed-off-by: Ratheesh Kannoth <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
CN10KB silicon has support for exact match table. This table
can be used to match maimum 64 bit value of KPU parsed output.
Hit/non hit in exact match table can be used as a KEX key to
NPC mcam.
This patch makes use of Exact match table to increase number of
DMAC filters supported. NPC mcam is no more need for each of these
DMAC entries as will be populated in Exact match table.
This patch implements following
1. Initialization of exact match table only for CN10KB.
2. Add/del/update interface function for exact match table.
Signed-off-by: Ratheesh Kannoth <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
CN10KB variant of CN10K series of silicons supports
a new feature where in a large protocol field
(eg 128bit IPv6 DIP) can be condensed into a small
hashed 32bit data. This saves a lot of space in MCAM key
and allows user to add more protocol fields into the filter.
A max of two such protocol data can be hashed.
This patch adds support for hashing IPv6 SIP and/or DIP.
Signed-off-by: Suman Ghosh <[email protected]>
Signed-off-by: Ratheesh Kannoth <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
We can benefit from TSO when the host CPU is not powerful enough,
so enable it by default now.
Signed-off-by: Yinjun Zhang <[email protected]>
Reviewed-by: Louis Peens <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Packets with metadata prepended can be correctly handled in
firmware when TSO is enabled, now remove the error path and
related comments. Since there's no existing firmware that
uses prepended metadata, no need to add compatibility check
here.
Signed-off-by: Yinjun Zhang <[email protected]>
Reviewed-by: Louis Peens <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Instead of counting the child nodes in the device tree, hardcode the
number of ports in the driver itself. The counting won't work at all
if an ethernet port is marked as disabled, e.g. because it is not
connected on the board at all.
It turns out that the LAN9662 and LAN9668 use the same switching IP
with the same synthesis parameters. The only difference is that the
output ports are not connected. Thus, we can just hardcode the
number of physical ports to 8.
Fixes: db8bcaad5393 ("net: lan966x: add the basic lan966x driver")
Signed-off-by: Michael Walle <[email protected]>
Reviewed-by: Horatiu Vultur <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The last meaningful change to this driver was made by Jon in 2011.
As much as we'd like to believe that this is because the code is
perfect the chances are nobody is using this hardware.
Because of the size of this driver there is a nontrivial maintenance
cost to keeping this code around, in the last 2 years we're averaging
more than 1 change a month. Some of which require nontrivial review
effort, see commit 877fe9d49b74 ("Revert "drivers/net/ethernet/neterion/vxge:
Fix a use-after-free bug in vxge-main.c"") for example.
Let's try to remove this driver. In general, IMHO, we need to
establish a clear path for shedding dead code. It will be hard
to unless we have some experience trying to delete stuff.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Use bitmap_zalloc()/bitmap_free() instead of hand-writing them.
It is less verbose and it improves the semantic.
While at it, remove a useless bitmap_zero(). The bitmap is already zeroed
when allocated.
Signed-off-by: Christophe JAILLET <[email protected]>
Link: https://lore.kernel.org/r/8a2168ef9871bd9c4f1cf19b8d5f7530662a5d15.1656866770.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Paolo Abeni <[email protected]>
|
|
We can't use the division operator on 64 bits integers, that breaks
32 bits build. Instead use the relevant helper.
Fixes: 6ddac26cf763 ("net/mlx5e: Add support to modify hardware flow meter parameters")
Acked-by: Saeed Mahameed <[email protected]>
Link: https://lore.kernel.org/r/ecb00ddd1197b4f8a4882090206bd2eee1eb8b5b.1657005206.git.pabeni@redhat.com
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Fix spelling of 'waitting' in comments.
remove unnecessary space of 'MDIO_COMMAND_REG 's'.
Signed-off-by: Zhang Jiaming <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
During a reset, there may have been transmits in flight that are no
longer valid and cannot be fulfilled. Resetting and clearing the
queues is insufficient; each skb also needs to be explicitly freed
so that upper levels are not left waiting for confirmation of a
transmit that will never happen. If this happens frequently enough,
the apparent backlog will cause TCP to begin "congestion control"
unnecessarily, culminating in permanently decreased throughput.
Fixes: d7c0ef36bde03 ("ibmvnic: Free and re-allocate scrqs when tx/rx scrqs change")
Tested-by: Nick Child <[email protected]>
Reviewed-by: Brian King <[email protected]>
Signed-off-by: Rick Lindsley <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add support for TX VLAN ctag insert
which may be configured via ethtool.
e.g.
# ethtool -K $DEV tx-vlan-offload on
The NIC supplies VLAN insert information as packet metadata.
The fields of this VLAN metadata are gotten from sk_buff, including
vlan_proto and vlan tag.
Configuration control bit NFP_NET_CFG_CTRL_TXVLAN_V2 is to
signal availability of ctag-insert features of the firmware.
NFDK is used to communicate via PCIE to NFP-3800 based NICs
while NFD3 is used for other NICs supported by the NFP driver.
The metadata format on tx side of NFD3 is different from NFDK.
This feature is not currently implemented for NFDK.
Signed-off-by: Diana Wang <[email protected]>
Reviewed-by: Louis Peens <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add support for RX VLAN ctag/stag strip
which may be configured via ethtool.
e.g.
# ethtool -K $DEV rx-vlan-offload on
# ethtool -K $DEV rx-vlan-stag-hw-parse on
Ctag-stripped and stag-stripped cannot be enabled at the same time
because currently the kernel supports only one layer of VLAN stripping.
The NIC supplies VLAN strip information as packet metadata.
The fields of this VLAN metadata are:
* strip flag: 1 for stripped; 0 for unstripped
* tci: VLAN TCI ID
* tpid: 1 for ETH_P_8021AD; 0 for ETH_P_8021Q
Configuration control bits NFP_NET_CFG_CTRL_RXVLAN_V2 and
NFP_NET_CFG_CTRL_RXQINQ are to signal availability of
ctag-strip and stag-strip features of the firmware.
Signed-off-by: Diana Wang <[email protected]>
Reviewed-by: Louis Peens <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Some structures and defines were added with '_ub_' indication, as there
were equivalent objects for the legacy model.
Now when the legacy model is not used anymore, remove the '_ub_'
indication.
Signed-off-by: Amit Cohen <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The flood_index() function is not needed anymore, as in the unified
bridge model the flood index is calculated using 'mid_base' and
'fid_offset'.
Remove this function.
Signed-off-by: Amit Cohen <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
After all the preparations for unified bridge model, finally flip mlxsw
driver to use the new model.
Change config profile, set 'ubridge' to true and remove the configurations
that are relevant only for the legacy model. Set 'flood_mode' to
'controlled' as the current mode is not supported with unified bridge
model.
Remove all the code which is dedicated to the legacy model. Remove
'struct mlxsw_sp.ubridge' variable which was temporarily added to separate
configurations between the models.
Signed-off-by: Amit Cohen <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The unified bridge model is enabled via the CONFIG_PROFILE command
during driver initialization. Add the definition of the relevant fields
to the command's payload in preparation for unified bridge enablement.
Signed-off-by: Amit Cohen <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Using the legacy bridge model, there is no VID classification at egress
for 802.1Q FIDs, which means that the VID is maintained.
This behavior cause the limitation that 802.1Q FIDs cannot work with VXLAN.
This limitation stems from the fact that a decapsulated VXLAN packet should
not contain a VLAN tag. If such a packet was to egress from a local port
using a 802.1Q FID, it would "maintain" its VLAN on egress, which is no
VLAN at all.
Currently 802.1Q FIDs are emulated in mlxsw driver using 802.1D FIDs. Using
unified bridge model, there is a FID->VID mapping, so it is possible to
stop emulating 802.1Q FIDs.
The main changes are:
1. Use 'SFGC.bridge_type' = 0, to separate between 802.1Q FIDs and
802.1D FIDs.
2. Use VLAN RIF instead of the emulated one (VLAN_EMU which is emulated
using FID RIF).
3. Create VID->FID mapping when the FID is created. Then when a new port
is mapped to the FID, if it not in virtual mode, no new mapping is
needed. Save the new port in 'port_vid_list', to be able to update a
RIF in all {Port, VID}->FID mappings in case that the port will be in
virtual mode later.
4. Add a dedicated operation function per FID family to update RIF for
VID->FID mappings. For 802.1d and rFID families, just return. For
802.1q family, handle the global mapping which is created for new 802.1q
FID.
Signed-off-by: Amit Cohen <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In the unified bridge model, mlxsw will no longer emulate 802.1Q FIDs
using 802.1D FIDs. The new FID table will look as follows:
+---------------+
| 802.1q FIDs | 4K entries
| [1..4094] |
+---------------+
| 802.1d FIDs | 1K entries
| [4095..5118] |
+---------------+
| Dummy FIDs | 1 entry
| [5119..5119] |
+---------------+
| rFIDs | 11K entries
| [5120..16383] |
+---------------+
In order to make the change easier to review, four new temporary FID
families will be added (e.g., MLXSW_SP_FID_TYPE_8021D_UB) and will not
be registered with the FID core until mlxsw is flipped to use the unified
bridge model.
Add .1d, rfid and dummy FID families for unified bridge, the next patch
will add .1q family separately as it requires more changes.
The following changes are required:
1. Add 'smpe_index_valid' field to 'struct mlxsw_sp_fid_family' and set
SFMR.smpe accordingly. SMPE index is reserved for rFIDs, as their
flooding is handled by firmware, and always reserved in Spectrum-1,
as it is configured as part of PGT table.
2. Add 'ubridge' field to 'struct mlxsw_sp_fid_family'. This field will
be removed later, use it in mlxsw_sp_fid_family_{register,unregister}()
to skip the registration / unregistration of the new families when the
legacy model is used.
3. Indexes - the start and end indexes of each FID family will need to be
changed according to the above diagram.
4. Add flood tables for unified bridge model, use 'fid_offset' as table
type, as in the new model the access to flood tables will be using
'fid_offset' calculation.
5. FID family operation changes:
a. rFID supposed to be created using SFMR, as it is not created by
firmware using unified bridge model.
b. port_vid_map() should perform SVFA for rFID, as the mapping is not
created by firmware using unified bridge model.
c. flood_index() is not aligned to the new model, as this function will
be removed later.
Signed-off-by: Amit Cohen <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Router interfaces (RIFs) constructed on top of VLAN-aware bridges are of
'VLAN' type, whereas RIFs constructed on top of VLAN-unaware bridges are of
'FID' type.
Currently 802.1Q FIDs are emulated using 802.1D FIDs, therefore VLAN RIFs
are emulated using FID RIFs. As part of converting the driver to use
unified bridge model, 802.1Q FIDs and VLAN RIFs will be used.
The egress FID is required for VLAN RIFs in Spectrum-2 and above, but not
in Spectrum-1, as in Spectrum-1 the mapping for VLAN RIFs is VID->FID,
while in other ASICs it is FID->FID. The reason for the change is that it
is more scalable to reuse the FID->FID entry than creating multiple
{Port, VID}->FID entries for the router port. Use the existing operation
structure to separate the configuration between different ASICs.
Add support for VLAN RIFs, most of the configurations are same to FID
RIFs.
Signed-off-by: Amit Cohen <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
After routing, a packet needs to perform an L2 lookup using the DMAC it got
from the routing and a FID. In unified bridge model, the egress FID
configuration needs to be performed by software.
It is configured by RITR for both sub-port RIFs and FID RIFs. Currently
FID RIFs already configure eFID. Add eFID configuration for sub-port RIFs.
Signed-off-by: Amit Cohen <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The field 'vid' in RITR is reserved when unified bridge model is used
and the RIF's type is sub-port RIF. Instead, ingress VID is configured via
SVFA and egress VID is configured via REIV.
Set 'vid' to zero in RITR register for sub-port RIF when unified bridge
model is used.
Signed-off-by: Amit Cohen <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
After routing, the device always consults a table that determines the
packet's egress VID based on {egress RIF, egress local port}. In the
unified bridge model, it is up to software to maintain this table via REIV
register.
The table needs to be updated in the following flows:
1. When a RIF is set on a FID, need to iterate over the FID's {Port, VID}
list and issue REIV write to map the {RIF, Port} to the given VID.
2. When a {Port, VID} is mapped to a FID and the FID already has a RIF,
need to issue REIV write with a single record to map the {RIF, Port}
to the given VID.
REIV register supports a simultaneous update of 256 ports, so use this
capability for the first flow.
Handle the two above mentioned flows.
Add mlxsw_sp_fid_evid_map() function to handle egress VID classification
for both unicast and multicast. Layer 2 multicast configuration is already
done in the driver, just move it to the new function.
Signed-off-by: Amit Cohen <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Before layer 2 forwarding, the device classifies an incoming packet to
a FID. The classification is done based on one of the following keys:
1. FID
2. VNI (after decapsulation)
3. VID / {Port, VID}
After classification, the FID is known, but also all the attributes of
the FID, such as the router interface (RIF) via which a packet that
needs to be routed will ingress the router block.
In the legacy model, when a RIF was created / destroyed, it was
firmware's responsibility to update it in the previously mentioned FID
classification records. In the unified bridge model, this responsibility
moved to software.
The third classification requires to iterate over the FID's {Port, VID}
list and issue SVFA write with the correct mapping table according to the
port's mode (virtual or not). We never map multiple VLANs to the same FID
using VID->FID mapping, so such a mapping needs to be performed once.
When a new FID classification entry is configured and the FID already has
a RIF, set the RIF as part of SVFA configuration.
The reverse needs to be done when clearing a RIF from a FID. Currently,
clearing is done by issuing mlxsw_sp_fid_rif_set() with a NULL RIF pointer.
Instead, introduce mlxsw_sp_fid_rif_unset().
Note that mlxsw_sp_fid_rif_set() is called after the RIF is fully
operational, so it conforms to the internal requirement regarding
SVFA.irif_v: "Must not be set for a non-enabled RIF".
Do not set the ingress RIF for rFIDs, as the {Port, VID}->rFID entry is
configured by firmware when legacy model is used, a next patch will
handle this configuration for rFIDs and unified bridge model.
Signed-off-by: Amit Cohen <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In the new model, SFMR no longer configures both VNI->FID and FID->VNI
classifications, but only the later. The former needs to be configured via
SVFA.
Add SVFA configuration as part of vni_set() and vni_clear().
Signed-off-by: Amit Cohen <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Using unified bridge model, firmware no longer configures the egress VID
"under the hood" and moves this responsibility to software.
For layer 2, this means that software needs to determine the egress VID
for both unicast (i.e., FDB) and multicast (i.e., MDB and flooding) flows.
Unicast FDB records and unicast LAG FDB records have new fields - "set_vid"
and "vid", set them. For records which point to router port, do not set
these fields.
Signed-off-by: Amit Cohen <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add parsing support by implementing struct mlx5e_tc_act for police
action.
TC rule with police actions is broken down into several rules in
different tables. One rule with the original match in the original
flow table, which set fte_id, do metering, and jump to the post_meter
table. If there are more police actions, more rules are created for
each of them. Besides, a last rule is created in the end.
In post_meter table, there are two pre-defined rules, one is to drop
packet if its packet color is RED, the other is to jump back to
post_act table. As fte_id is updated before jumping, the rule for next
meter is matched to do another round of metering (if there are
multiple meters in the flow rule). Otherwise, last fte_id is matched
and do the original actions.
Signed-off-by: Jianbo Liu <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Reviewed-by: Ariel Levkovich <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
As a preparation for validating police action, adds flow_action to
parse state, which is to passed to parsing callbacks.
Signed-off-by: Jianbo Liu <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|