aboutsummaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel/ice/ice_switch.c
AgeCommit message (Collapse)AuthorFilesLines
2024-09-30ice: fix VLAN replay after resetDave Ertman1-2/+0
There is a bug currently when there are more than one VLAN defined and any reset that affects the PF is initiated, after the reset rebuild no traffic will pass on any VLAN but the last one created. This is caused by the iteration though the VLANs during replay each clearing the vsi_map bitmap of the VSI that is being replayed. The problem is that during rhe replay, the pointer to the vsi_map bitmap is used by each successive vlan to determine if it should be replayed on this VSI. The logic was that the replay of the VLAN would replace the bit in the map before the next VLAN would iterate through. But, since the replay copies the old bitmap pointer to filt_replay_rules and creates a new one for the recreated VLANS, it does not do this, and leaves the old bitmap broken to be used to replay the remaining VLANs. Since the old bitmap will be cleaned up in post replay cleanup, there is no need to alter it and break following VLAN replay, so don't clear the bit. Fixes: 334cb0626de1 ("ice: Implement VSI replay framework") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-09-09ice: fix VSI lists confusion when adding VLANsMichal Schmidt1-1/+1
The description of function ice_find_vsi_list_entry says: Search VSI list map with VSI count 1 However, since the blamed commit (see Fixes below), the function no longer checks vsi_count. This causes a problem in ice_add_vlan_internal, where the decision to share VSI lists between filter rules relies on the vsi_count of the found existing VSI list being 1. The reproducing steps: 1. Have a PF and two VFs. There will be a filter rule for VLAN 0, referring to a VSI list containing VSIs: 0 (PF), 2 (VF#0), 3 (VF#1). 2. Add VLAN 1234 to VF#0. ice will make the wrong decision to share the VSI list with the new rule. The wrong behavior may not be immediately apparent, but it can be observed with debug prints. 3. Add VLAN 1234 to VF#1. ice will unshare the VSI list for the VLAN 1234 rule. Due to the earlier bad decision, the newly created VSI list will contain VSIs 0 (PF) and 3 (VF#1), instead of expected 2 (VF#0) and 3 (VF#1). 4. Try pinging a network peer over the VLAN interface on VF#0. This fails. Reproducer script at: https://gitlab.com/mschmidt2/repro/-/blob/master/RHEL-46814/test-vlan-vsi-list-confusion.sh Commented debug trace: https://gitlab.com/mschmidt2/repro/-/blob/master/RHEL-46814/ice-vlan-vsi-lists-debug.txt Patch adding the debug prints: https://gitlab.com/mschmidt2/linux/-/commit/f8a8814623944a45091a77c6094c40bfe726bfdb (Unsafe, by the way. Lacks rule_lock when dumping in ice_remove_vlan.) Michal Swiatkowski added to the explanation that the bug is caused by reusing a VSI list created for VLAN 0. All created VFs' VSIs are added to VLAN 0 filter. When a non-zero VLAN is created on a VF which is already in VLAN 0 (normal case), the VSI list from VLAN 0 is reused. It leads to a problem because all VFs (VSIs to be specific) that are subscribed to VLAN 0 will now receive a new VLAN tag traffic. This is one bug, another is the bug described above. Removing filters from one VF will remove VLAN filter from the previous VF. It happens a VF is reset. Example: - creation of 3 VFs - we have VSI list (used for VLAN 0) [0 (pf), 2 (vf1), 3 (vf2), 4 (vf3)] - we are adding VLAN 100 on VF1, we are reusing the previous list because 2 is there - VLAN traffic works fine, but VLAN 100 tagged traffic can be received on all VSIs from the list (for example broadcast or unicast) - trust is turning on VF2, VF2 is resetting, all filters from VF2 are removed; the VLAN 100 filter is also removed because 3 is on the list - VLAN traffic to VF1 isn't working anymore, there is a need to recreate VLAN interface to readd VLAN filter One thing I'm not certain about is the implications for the LAG feature, which is another caller of ice_find_vsi_list_entry. I don't have a LAG-capable card at hand to test. Fixes: 23ccae5ce15f ("ice: changes to the interface with the HW and FW for SRIOV_VF+LAG") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Dave Ertman <David.m.ertman@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-09-09ice: fix accounting for filters shared by multiple VSIsJacob Keller1-1/+1
When adding a switch filter (such as a MAC or VLAN filter), it is expected that the driver will detect the case where the filter already exists, and return -EEXIST. This is used by calling code such as ice_vc_add_mac_addr, and ice_vsi_add_vlan to avoid incrementing the accounting fields such as vsi->num_vlan or vf->num_mac. This logic works correctly for the case where only a single VSI has added a given switch filter. When a second VSI adds the same switch filter, the driver converts the existing filter from an ICE_FWD_TO_VSI filter into an ICE_FWD_TO_VSI_LIST filter. This saves switch resources, by ensuring that multiple VSIs can re-use the same filter. The ice_add_update_vsi_list() function is responsible for doing this conversion. When first converting a filter from the FWD_TO_VSI into FWD_TO_VSI_LIST, it checks if the VSI being added is the same as the existing rule's VSI. In such a case it returns -EEXIST. However, when the switch rule has already been converted to a FWD_TO_VSI_LIST, the logic is different. Adding a new VSI in this case just requires extending the VSI list entry. The logic for checking if the rule already exists in this case returns 0 instead of -EEXIST. This breaks the accounting logic mentioned above, so the counters for how many MAC and VLAN filters exist for a given VF or VSI no longer accurately reflect the actual count. This breaks other code which relies on these counts. In typical usage this primarily affects such filters generally shared by multiple VSIs such as VLAN 0, or broadcast and multicast MAC addresses. Fix this by correctly reporting -EEXIST in the case of adding the same VSI to a switch rule already converted to ICE_FWD_TO_VSI_LIST. Fixes: 9daf8208dd4d ("ice: Add support for switch filter programming") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-07-23ice: Fix recipe read procedureWojciech Drewek1-4/+4
When ice driver reads recipes from firmware information about need_pass_l2 and allow_pass_l2 flags is not stored correctly. Those flags are stored as one bit each in ice_sw_recipe structure. Because of that, the result of checking a flag has to be casted to bool. Note that the need_pass_l2 flag currently works correctly, because it's stored in the first bit. Fixes: bccd9bce29e0 ("ice: Add guard rule when creating FDB in switchdev") Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-07-11ice: Add tracepoint for adding and removing switch rulesMarcin Szycik1-2/+20
Track the number of rules and recipes added to switch. Add a tracepoint to ice_aq_sw_rules(), which shows both rule and recipe count. This information can be helpful when designing a set of rules to program to the hardware, as it shows where the practical limit is. Actual limits are known (64 recipes, 32k rules), but it's hard to translate these values to how many rules the *user* can actually create, because of extra metadata being implicitly added, and recipe/rule chaining. Chaining combines several recipes/rules to create a larger recipe/rule, so one large rule added by the user might actually consume multiple rules from hardware perspective. Rule counter is simply incremented/decremented in ice_aq_sw_rules(), since all rules are added or removed via it. Counting recipes is harder, as recipes can't be removed (only overwritten). Recipes added via ice_aq_add_recipe() could end up being unused, when there is an error in later stages of rule creation. Instead, track the allocation and freeing of recipes, which should reflect the actual usage of recipes (if something fails after recipe(s) were created, caller should free them). Also, a number of recipes are loaded from NVM by default - initialize the recipe counter with the number of these recipes on switch initialization. Example configuration: cd /sys/kernel/tracing echo function > current_tracer echo ice_aq_sw_rules > set_ftrace_filter echo ice_aq_sw_rules > set_event echo 1 > tracing_on cat trace Example output: tc-4097 [069] ...1. 787.595536: ice_aq_sw_rules <-ice_rem_adv_rule tc-4097 [069] ..... 787.595705: ice_aq_sw_rules: rules=9 recipes=15 tc-4098 [057] ...1. 787.652033: ice_aq_sw_rules <-ice_add_adv_rule tc-4098 [057] ..... 787.652201: ice_aq_sw_rules: rules=10 recipes=16 Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-07-11ice: Remove unused members from switch APIMarcin Szycik1-41/+10
Remove several members of struct ice_sw_recipe and struct ice_prot_lkup_ext. Remove struct ice_recp_grp_entry and struct ice_pref_recipe_group, since they are now unused as well. All of the deleted members were only written to and never read, so it's pointless to keep them. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-07-11ice: Optimize switch recipe creationMarcin Szycik1-329/+196
Currently when creating switch recipes, switch ID is always added as the first word in every recipe. There are only 5 words in a recipe, so one word is always wasted. This is also true for the last recipe, which stores result indexes (in case of chain recipes). Therefore the maximum usable length of a chain recipe is 4 * 4 = 16 words. 4 words in a recipe, 4 recipes that can be chained (using a 5th one for result indexes). Current max size chained recipe: 0: smmmm 1: smmmm 2: smmmm 3: smmmm 4: srrrr Where: s - switch ID m - regular match (e.g. ipv4 src addr, udp dst port, etc.) r - result index Switch ID does not actually need to be present in every recipe, only in one of them (in case of chained recipe). This frees up to 8 extra words: 3 from recipes in the middle (because first recipe still needs to have switch ID), and 5 from one extra recipe (because now the last recipe also does not have switch ID, so it can chain 1 more recipe). Max size chained recipe after changes: 0: smmmm 1: Mmmmm 2: Mmmmm 3: Mmmmm 4: MMMMM 5: Rrrrr Extra usable words available after this change are highlighted with capital letters. Changing how switch ID is added is not straightforward, because it's not a regular lookup. Its FV index and mask can't be determined based on protocol + offset pair read from package and instead need to be added manually. Additionally, change how result indexes are added. Currently they are always inserted in a new recipe at the end. Example for 13 words, (with above optimization, switch ID being one of the words): 0: smmmm 1: mmmmm 2: mmmxx 3: rrrxx Where: x - unused word In this and some other cases, the result indexes can be moved just after last matches because there are unused words, saving one recipe. Example for 13 words after both optimizations: 0: smmmm 1: mmmmm 2: mmmrr Note how one less result index is needed in this case, because the last recipe does not need to "link" to itself. There are cases when adding an additional recipe for result indexes cannot be avoided. In that cases result indexes are all put in the last recipe. Example for 14 words after both optimizations: 0: smmmm 1: mmmmm 2: mmmmx 3: rrrxx With these two changes, recipes/rules are more space efficient, allowing more to be created in total. Co-developed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-07-11ice: remove unused recipe bookkeeping dataMichal Swiatkowski1-40/+15
Remove root_buf from recipe struct. Its only usage was in ice_find_recp(), where if recipe had an inverse action, it was skipped, but actually the driver never adds inverse actions, so effectively it was pointless. Without root_buf, the recipe data element in ice_add_sw_recipe() does not need to be persistent and can also be automatically deallocated with __free, which nicely simplifies unroll. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-07-11ice: Simplify bitmap setting in adding recipeMichal Swiatkowski1-15/+9
Remove unnecessary size checks when copying bitmaps in ice_add_sw_recipe() and replace them with compile time assert. Check if the bitmaps are equal size, as they are copied both ways. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Co-developed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-07-11ice: Remove reading all recipes before adding a new oneMichal Swiatkowski1-27/+2
The content of the first read recipe is used as a template when adding a recipe. It isn't needed - only prune index is directly set from there. Set it in the code instead. Also, now there's no need to set rid and lookup indexes to 0, as the whole recipe buffer is initialized to 0. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-07-11ice: Remove unused struct ice_prot_lkup_ext membersMarcin Szycik1-25/+19
Remove field_off as it's never used. Remove done bitmap, as its value is only checked and never assigned. Reusing sub-recipes while creating new root recipes is currently not supported in the driver. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-06-19ice: Fix VSI list rule with ICE_SW_LKUP_LAST typeMarcin Szycik1-2/+4
Adding/updating VSI list rule, as well as allocating/freeing VSI list resource are called several times with type ICE_SW_LKUP_LAST, which fails because ice_update_vsi_list_rule() and ice_aq_alloc_free_vsi_list() consider it invalid. Allow calling these functions with ICE_SW_LKUP_LAST. This fixes at least one issue in switchdev mode, where the same rule with different action cannot be added, e.g.: tc filter add dev $PF1 ingress protocol arp prio 0 flower skip_sw \ dst_mac ff:ff:ff:ff:ff:ff action mirred egress redirect dev $VF1_PR tc filter add dev $PF1 ingress protocol arp prio 0 flower skip_sw \ dst_mac ff:ff:ff:ff:ff:ff action mirred egress redirect dev $VF2_PR Fixes: 0f94570d0cae ("ice: allow adding advanced rules") Suggested-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240618210206.981885-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-01ice: Add switch recipe reusing featureSteven Zou1-17/+170
New E810 firmware supports the corresponding functionality, so the driver allows PFs to subscribe the same switch recipes. Then when the PF is done with a switch recipes, the PF can ask firmware to free that switch recipe. When users configure a rule to PFn into E810 switch component, if there is no existing recipe matching this rule's pattern, the driver will request firmware to allocate and return a new recipe resource for the rule by calling ice_add_sw_recipe() and ice_alloc_recipe(). If there is an existing recipe matching this rule's pattern with different key value, or this is a same second rule to PFm into switch component, the driver checks out this recipe by calling ice_find_recp(), the driver will tell firmware to share using this same recipe resource by calling ice_subscribable_recp_shared() and ice_subscribe_recipe(). When firmware detects that all subscribing PFs have freed the switch recipe, firmware will free the switch recipe so that it can be reused. This feature also fixes a problem where all switch recipes would eventually be exhausted because switch recipes could not be freed, as freeing a shared recipe could potentially break other PFs that were using it. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Andrii Staikov <andrii.staikov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Steven Zou <steven.zou@intel.com> Tested-by: Mayank Sharma <mayank.sharma@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-04-01ice: Add support for PFCP hardware offload in switchdevMarcin Szycik1-0/+85
Add support for creating PFCP filters in switchdev mode. Add support for parsing PFCP-specific tc options: S flag and SEID. To create a PFCP filter, a special netdev must be created and passed to tc command: ip link add pfcp0 type pfcp tc filter add dev eth0 ingress prio 1 flower pfcp_opts \ 1:123/ff:fffffffffffffff0 skip_hw action mirred egress redirect \ dev pfcp0 Changes in iproute2 [1] are required to be able to use pfcp_opts in tc. ICE COMMS package is required to create a filter as it contains PFCP profiles. Link: https://lore.kernel.org/netdev/20230614091758.11180-1-marcin.szycik@linux.intel.com [1] Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-28Merge branch '100GbE' of ↵Jakub Kicinski1-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== ice: use less resources in switchdev Michal Swiatkowski says: Switchdev is using one queue per created port representor. This can quickly lead to Rx queue shortage, as with subfunction support user can create high number of PRs. Save one MSI-X and 'number of PRs' * 1 queues. Refactor switchdev slow-path to use less resources (even no additional resources). Do this by removing control plane VSI and move its functionality to PF VSI. Even with current solution PF is acting like uplink and can't be used simultaneously for other use cases (adding filters can break slow-path). In short, do Tx via PF VSI and Rx packets using PF resources. Rx needs additional code in interrupt handler to choose correct PR netdev. Previous solution had to queue filters, it was way more elegant but needed one queue per PRs. Beside that this refactor mostly simplifies switchdev configuration. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: count representor stats ice: do switchdev slow-path Rx using PF VSI ice: change repr::id values ice: remove switchdev control plane VSI ice: control default Tx rule in lag ice: default Tx rule instead of to queue ice: do Tx through PF netdev in slow-path ice: remove eswitch changing queues algorithm ==================== Link: https://lore.kernel.org/r/20240325202623.1012287-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-28Merge tag 'net-6.9-rc2' of ↵Linus Torvalds1-10/+14
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bpf, WiFi and netfilter. Current release - regressions: - ipv6: fix address dump when IPv6 is disabled on an interface Current release - new code bugs: - bpf: temporarily disable atomic operations in BPF arena - nexthop: fix uninitialized variable in nla_put_nh_group_stats() Previous releases - regressions: - bpf: protect against int overflow for stack access size - hsr: fix the promiscuous mode in offload mode - wifi: don't always use FW dump trig - tls: adjust recv return with async crypto and failed copy to userspace - tcp: properly terminate timers for kernel sockets - ice: fix memory corruption bug with suspend and rebuild - at803x: fix kernel panic with at8031_probe - qeth: handle deferred cc1 Previous releases - always broken: - bpf: fix bug in BPF_LDX_MEMSX - netfilter: reject table flag and netdev basechain updates - inet_defrag: prevent sk release while still in use - wifi: pick the version of SESSION_PROTECTION_NOTIF - wwan: t7xx: split 64bit accesses to fix alignment issues - mlxbf_gige: call request_irq() after NAPI initialized - hns3: fix kernel crash when devlink reload during pf initialization" * tag 'net-6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits) inet: inet_defrag: prevent sk release while still in use Octeontx2-af: fix pause frame configuration in GMP mode net: lan743x: Add set RFE read fifo threshold for PCI1x1x chips net: bcmasp: Remove phy_{suspend/resume} net: bcmasp: Bring up unimac after PHY link up net: phy: qcom: at803x: fix kernel panic with at8031_probe netfilter: arptables: Select NETFILTER_FAMILY_ARP when building arp_tables.c netfilter: nf_tables: skip netdev hook unregistration if table is dormant netfilter: nf_tables: reject table flag and netdev basechain updates netfilter: nf_tables: reject destroy command to remove basechain hooks bpf: update BPF LSM designated reviewer list bpf: Protect against int overflow for stack access size bpf: Check bloom filter map value size bpf: fix warning for crash_kexec selftests: netdevsim: set test timeout to 10 minutes net: wan: framer: Add missing static inline qualifiers mlxbf_gige: call request_irq() after NAPI initialized tls: get psock ref after taking rxlock to avoid leak selftests: tls: add test with a partially invalid iov tls: adjust recv return with async crypto and failed copy to userspace ...
2024-03-25ice: default Tx rule instead of to queueMichal Swiatkowski1-0/+4
Steer all packets that miss other rules to PF VSI. Previously in switchdev mode, PF VSI received missed packets, but only ones marked as Rx. Now it is receiving all missed packets. To queue rule per PR isn't needed, because we use PF VSI instead of control VSI now, and it's already correctly configured. Add flag to correctly set LAN_EN bit in default Tx rule. It shouldn't allow packet to go outside when there is a match. Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-25ice: Refactor FW data type and fix bitmap casting issueSteven Zou1-10/+14
According to the datasheet, the recipe association data is an 8-byte little-endian value. It is described as 'Bitmap of the recipe indexes associated with this profile', it is from 24 to 31 byte area in FW. Therefore, it is defined to '__le64 recipe_assoc' in struct ice_aqc_recipe_to_profile. And then fix the bitmap casting issue, as we must never ever use castings for bitmap type. Fixes: 1e0f9881ef79 ("ice: Flesh out implementation of support for SRIOV on bonded interface") Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Andrii Staikov <andrii.staikov@intel.com> Reviewed-by: Jan Sokolowski <jan.sokolowski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Steven Zou <steven.zou@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-22overflow: Change DEFINE_FLEX to take __counted_by memberKees Cook1-5/+5
The norm should be flexible array structures with __counted_by annotations, so DEFINE_FLEX() is updated to expect that. Rename the non-annotated version to DEFINE_RAW_FLEX(), and update the few existing users. Additionally add selftests for the macros. Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://lore.kernel.org/r/20240306235128.it.933-kees@kernel.org Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Kees Cook <keescook@chromium.org>
2024-01-02ice: Add support for packet mirroring using hardware in switchdev modeAndrii Staikov1-7/+18
Switchdev mode allows to add mirroring rules to mirror incoming and outgoing packets to the interface's port representor. Previously, this was available only using software functionality. Add possibility to offload this functionality to the NIC hardware. Introduce ICE_MIRROR_PACKET filter action to the ice_sw_fwd_act_type enum to identify the desired action and pass it to the hardware as well as the VSI to mirror. Example of tc mirror command using hardware: tc filter add dev ens1f0np0 ingress protocol ip prio 1 flower src_mac b4:96:91:a5:c7:a7 skip_sw action mirred egress mirror dev eth1 ens1f0np0 - PF b4:96:91:a5:c7:a7 - source MAC address eth1 - PR of a VF to mirror to Co-developed-by: Marcin Szycik <marcin.szycik@intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@intel.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Andrii Staikov <andrii.staikov@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-18ice: field prep conversionJesse Brandeburg1-41/+34
Refactor ice driver to use FIELD_PREP(), which reduces lines of code and adds clarity of intent. This code was generated by the following coccinelle/spatch script and then manually repaired. Several places I changed to OR into a single variable with |= instead of using a multi-line statement with trailing OR operators, as it (subjectively) makes the code clearer. A local variable vmvf_and_timeout was created and used to avoid multiple logical ORs being __le16 converted, which shortened some lines and makes the code cleaner. Also clean up a couple of places where conversions were made to have the code read more clearly/consistently. @prep2@ constant shift,mask; type T; expression a; @@ -(((T)(a) << shift) & mask) +FIELD_PREP(mask, a) @prep@ constant shift,mask; type T; expression a; @@ -((T)((a) << shift) & mask) +FIELD_PREP(mask, a) Cc: Julia Lawall <Julia.Lawall@inria.fr> CC: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-10-03ice: make use of DEFINE_FLEX() in ice_switch.cPrzemek Kitszel1-49/+14
Use DEFINE_FLEX() macro for 1-elem flex array members of ice_switch.c Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://lore.kernel.org/r/20230912115937.1645707-8-przemyslaw.kitszel@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-17ice: drop two params from ice_aq_alloc_free_res()Przemek Kitszel1-9/+7
Drop @num_entries and @cd params, latter of which was always NULL. Number of entities to alloc is passed in internal buffer, the outer layer (that @num_entries was assigned to) meaning is closer to "the number of requests", which was =1 in all cases. ice_free_hw_res() was always called with 1 as its @num arg. Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-08-17ice: remove unused methodsJan Sokolowski1-48/+0
Following methods were found to no longer be in use: ice_is_pca9575_present ice_mac_fltr_exist ice_napi_del Remove them. Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-08-07ice: Rename enum ice_pkt_flags valuesMarcin Szycik1-3/+3
enum ice_pkt_flags contains values such as ICE_PKT_FLAGS_VLAN and ICE_PKT_FLAGS_TUNNEL, but actually the flags words which they refer to contain a range of unrelated values - e.g. word 0 (ICE_PKT_FLAGS_VLAN) contains fields such as from_network and ucast, which have nothing to do with VLAN. Rename each enum value to ICE_PKT_FLAGS_MDID<number>, so it's clear in which flags word does some value reside. Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-08-07ice: Add direction metadataMarcin Szycik1-2/+9
Currently it is possible to create a filter which breaks TX traffic, e.g.: tc filter add dev $PF1 ingress protocol ip prio 1 flower ip_proto udp dst_port $PORT action mirred egress redirect dev $VF1_PR This adds a rule which might match both TX and RX traffic, and in TX path the PF will actually receive the traffic, which breaks communication. To fix this, add a match on direction metadata flag when adding a tc rule. Because of the way metadata is currently handled, a duplicate lookup word would appear if VLAN metadata is also added. The lookup would still work correctly, but one word would be wasted. To prevent it, lookup 0 now always contains all metadata. When any metadata needs to be added, it is added to lookup 0 and lookup count is not incremented. This way, two flags residing in the same word will take up one word, instead of two. Note: the drop action is also affected, i.e. it will now only work in one direction. Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-27ice: process events created by lag netdev event handlerDave Ertman1-1/+9
Add in the function framework for the processing of LAG events. Also add in helper function to perform common tasks. Add the basis of the process of linking a lower netdev to an upper netdev. Reviewed-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-27ice: changes to the interface with the HW and FW for SRIOV_VF+LAGDave Ertman1-21/+57
Add defines needed for interaction with the FW admin queue interface in relation to supporting LAG and SRIOV VFs interacting. Add code, or make non-static previously static functions, to access the new and changed admin queue calls for LAG. Reviewed-by: Daniel Machon <daniel.machon@microchip.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-24ice: Add guard rule when creating FDB in switchdevMarcin Szycik1-35/+62
Introduce new "guard" rule upon FDB entry creation. It matches on src_mac, has valid bit unset, allow_pass_l2 set and has a nop action. Previously introduced "forward" rule matches on dst_mac, has valid bit set, need_pass_l2 set and has a forward action. With these rules, a packet will be offloaded only if FDB exists in both directions (RX and TX). Let's assume link partner sends a packet to VF1: src_mac = LP_MAC, dst_mac = is VF1_MAC. Bridge adds FDB, two rules are created: 1. Guard rule matching on src_mac == LP_MAC 2. Forward rule matching on dst_mac == LP_MAC Now VF1 responds with src_mac = VF1_MAC, dst_mac = LP_MAC. Before this change, only one rule with dst_mac == LP_MAC would have existed, and the packet would have been offloaded, meaning the bridge wouldn't add FDB in the opposite direction. Now, the forward rule matches (dst_mac == LP_MAC), but it has need_pass_l2 set an there is no guard rule with src_mac == VF1_MAC, so the packet goes through slow-path and the bridge adds FDB. Two rules are created: 1. Guard rule matching on src_mac == VF1_MAC 2. Forward rule matching on dst_mac == VF1_MAC Further packets in both directions will be offloaded. The same example is true in opposite direction (i.e. VF1 is the first to send a packet out). Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Marcin Szycik <marcin.szycik@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-24ice: Skip adv rules removal upon switchdev releaseWojciech Drewek1-53/+0
Advanced rules for ctrl VSI will be removed anyway when the VSI will cleaned up, no need to do it explicitly. Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-06-22ice: remove null checks before devm_kfree() callsPrzemek Kitszel1-13/+6
We all know they are redundant. Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Michal Wilczynski <michal.wilczynski@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Arpana Arland <arpanax.arland@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-19ice: use src VSI instead of src MAC in slow-pathMichal Swiatkowski1-0/+6
The use of a source MAC to direct packets from the VF to the corresponding port representor is only ok if there is only one MAC on a VF. To support this functionality when the number of MACs on a VF is greater, it is necessary to match a source VSI instead of a source MAC. Let's use the new switch API that allows matching on metadata. If MAC isn't used in match criteria there is no need to handle adding rule after virtchnl command. Instead add new rule while port representor is being configured. Remove rule_added field, checking for sp_rule can be used instead. Remove also checking for switchdev running in deleting rule as it can be called from unroll context when running flag isn't set. Checking for sp_rule covers both context (with and without running flag). Rules are added in eswitch configuration flow, so there is no need to have replay function. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-19ice: allow matching on meta dataMichal Swiatkowski1-100/+58
Add meta data matching criteria in the same place as protocol matching criteria. There is no need to add meta data as special words after parsing all lookups. Trade meta data in the same why as other lookups. The one difference between meta data lookups and protocol lookups is that meta data doesn't impact how the packets looks like. Because of that ignore it when filling testing packet. Match on tunnel type meta data always if tunnel type is different than TNL_LAST. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-19ice: specify field names in ice_prot_ext initMichal Swiatkowski1-23/+28
Anonymous initializers are now discouraged. Define ICE_PROTCOL_ENTRY macro to rewrite anonymous initializers to named one. No functional changes here. Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-19ice: remove redundant Rx field from rule infoMichal Swiatkowski1-11/+11
Information about the direction is currently stored in sw_act.flag. There is no need to duplicate it in another field. Setting direction flag doesn't mean that there is a match criteria for direction in rule. It is only a information for HW from where switch id should be collected (VSI or port). In current implementation of advance rule handling, without matching for direction meta data, we can always set one the same flag and everything will work the same. Ability to match on direction meta data will be added in follow up patches. Recipe 0, 3 and 9 loaded from package has direction match criteria, but they are handled in other function. Move ice_adv_rule_info fields to avoid holes. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-19ice: define meta data to match in switchMichal Swiatkowski1-5/+6
Add description for each meta data. Redefine tunnel mask to match only tunneled MAC and tunneled VLAN. It shouldn't try to match other flags (previously it was 0xff, it is redundant). VLAN mask was 0xd000, change it to 0xf000. 4 last bits are flags depending on the same field in packets (VLAN tag). Because of that, It isn't harmful to match also on ITAG. Group all MDID and MDID offsets into enums to keep things organized. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-03-28ice: Fix ice_cfg_rdma_fltr() to only update relevant fieldsBrett Creeley1-4/+22
The current implementation causes ice_vsi_update() to update all VSI fields based on the cached VSI context. This also assumes that the ICE_AQ_VSI_PROP_Q_OPT_VALID bit is set. This can cause problems if the VSI context is not correctly synced by the driver. Fix this by only updating the fields that correspond to ICE_AQ_VSI_PROP_Q_OPT_VALID. Also, make sure to save the updated result in the cached VSI context on success. Fixes: 348048e724a0 ("ice: Implement iidc operations") Co-developed-by: Robert Malz <robertx.malz@intel.com> Signed-off-by: Robert Malz <robertx.malz@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com> Tested-by: Jakub Andrysiak <jakub.andrysiak@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-02-06ice: switch: fix potential memleak in ice_add_adv_recipe()Zhang Changzhong1-1/+1
When ice_add_special_words() fails, the 'rm' is not released, which will lead to a memory leak. Fix this up by going to 'err_unroll' label. Compile tested only. Fixes: 8b032a55c1bd ("ice: low level support for tunnels") Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
2022-09-20ice: Add L2TPv3 hardware offload supportMarcin Szycik1-1/+69
Add support for offloading packets based on L2TPv3 session id in switchdev mode. Example filter: tc filter add dev $PF1 ingress prio 1 protocol ip flower ip_proto l2tp \ l2tpv3_sid 1234 skip_sw action mirred egress redirect dev $VF1_PR Changes in iproute2 are required to be able to specify l2tpv3_sid. ICE COMMS DDP package is required to create a filter as it contains L2TPv3 profiles. Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-06ice: switch: Simplify memory allocationChristophe JAILLET1-4/+2
'rbuf' is locale to the ice_get_initial_sw_cfg() function. There is no point in using devm_kzalloc()/devm_kfree(). use kzalloc()/kfree() instead. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-22Merge branch '100GbE' of ↵Jakub Kicinski1-161/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-08-18 (ice) This series contains updates to ice driver only. Jesse and Anatolii add support for controlling FCS/CRC stripping via ethtool. Anirudh allows for 100M speeds on devices which support it. Sylwester removes ucast_shared field and the associated dead code related to it. Mikael removes non-inclusive language from the driver. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: remove non-inclusive language ice: Remove ucast_shared ice: Allow 100M speeds for some devices ice: Implement FCS/CRC and VLAN stripping co-existence policy ice: Implement control of FCS/CRC stripping ==================== Link: https://lore.kernel.org/r/20220818155207.996297-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-18ice: Remove ucast_sharedSylwester Dziedziuch1-161/+5
Remove ucast_shared as it was always true. Remove the code depending on ucast_shared from ice_add_mac and ice_remove_mac. Remove ice_find_ucast_rule_entry function as it was only used when ucast_shared was set to false. Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-17ice: Ignore EEXIST when setting promisc modeGrzegorz Siwik1-1/+1
Ignore EEXIST error when setting promiscuous mode. This fix is needed because the driver could set promiscuous mode when it still has not cleared properly. Promiscuous mode could be set only once, so setting it second time will be rejected. Fixes: 5eda8afd6bcc ("ice: Add support for PF/VF promiscuous mode") Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com> Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/ Tested-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Tested-by: Igor Raits <igor@gooddata.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-17ice: Fix double VLAN error when entering promisc modeGrzegorz Siwik1-0/+7
Avoid enabling or disabling VLAN 0 when trying to set promiscuous VLAN mode if double VLAN mode is enabled. This fix is needed because the driver tries to add the VLAN 0 filter twice (once for inner and once for outer) when double VLAN mode is enabled. The filter program is rejected by the firmware when double VLAN is enabled, because the promiscuous filter only needs to be set once. This issue was missed in the initial implementation of double VLAN mode. Fixes: 5eda8afd6bcc ("ice: Add support for PF/VF promiscuous mode") Signed-off-by: Grzegorz Siwik <grzegorz.siwik@intel.com> Link: https://lore.kernel.org/all/CAK8fFZ7m-KR57M_rYX6xZN39K89O=LGooYkKsu6HKt0Bs+x6xQ@mail.gmail.com/ Tested-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Tested-by: Igor Raits <igor@gooddata.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-07Merge tag 'bitmap-6.0-rc1' of https://github.com/norov/linuxLinus Torvalds1-1/+1
Pull bitmap updates from Yury Norov: - fix the duplicated comments on bitmap_to_arr64() (Qu Wenruo) - optimize out non-atomic bitops on compile-time constants (Alexander Lobakin) - cleanup bitmap-related headers (Yury Norov) - x86/olpc: fix 'logical not is only applied to the left hand side' (Alexander Lobakin) - lib/nodemask: inline wrappers around bitmap (Yury Norov) * tag 'bitmap-6.0-rc1' of https://github.com/norov/linux: (26 commits) lib/nodemask: inline next_node_in() and node_random() powerpc: drop dependency on <asm/machdep.h> in archrandom.h x86/olpc: fix 'logical not is only applied to the left hand side' lib/cpumask: move some one-line wrappers to header file headers/deps: mm: align MANITAINERS and Docs with new gfp.h structure headers/deps: mm: Split <linux/gfp_types.h> out of <linux/gfp.h> headers/deps: mm: Optimize <linux/gfp.h> header dependencies lib/cpumask: move trivial wrappers around find_bit to the header lib/cpumask: change return types to unsigned where appropriate cpumask: change return types to bool where appropriate lib/bitmap: change type of bitmap_weight to unsigned long lib/bitmap: change return types to bool where appropriate arm: align find_bit declarations with generic kernel iommu/vt-d: avoid invalid memory access via node_online(NUMA_NO_NODE) lib/test_bitmap: test the tail after bitmap_to_arr64() lib/bitmap: fix off-by-one in bitmap_to_arr64() lib: test_bitmap: add compile-time optimization/evaluations assertions bitmap: don't assume compiler evaluates small mem*() builtins calls net/ice: fix initializing the bitmap in the switch code bitops: let optimize out non-atomic bitops on compile-time constants ...
2022-07-28ice: Introduce enabling promiscuous mode on multiple VF'sMichal Wilczynski1-65/+71
In current implementation default VSI switch filter is only able to forward traffic to a single VSI. This limits promiscuous mode with private flag 'vf-true-promisc-support' to a single VF. Enabling it on the second VF won't work. Also allmulticast support doesn't seem to be properly implemented when vf-true-promisc-support is true. Use standard ice_add_rule_internal() function that already implements forwarding to multiple VSI's instead of constructing AQ call manually. Add switch filter for allmulticast mode when vf-true-promisc-support is enabled. The same filter is added regardless of the flag - it doesn't matter for this case. Remove unnecessary fields in switch structure. From now on book keeping will be done by ice_add_rule_internal(). Refactor unnecessarily passed function arguments. To test: 1) Create 2 VM's, and two VF's. Attach VF's to VM's. 2) Enable promiscuous mode on both of them and check if traffic is seen on both of them. Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-26ice: Add support for PPPoE hardware offloadMarcin Szycik1-0/+165
Add support for creating PPPoE filters in switchdev mode. Add support for parsing PPPoE and PPP-specific tc options: pppoe_sid and ppp_proto. Example filter: tc filter add dev $PF1 ingress protocol ppp_ses prio 1 flower pppoe_sid \ 1234 ppp_proto ip skip_sw action mirred egress redirect dev $VF1_PR Changes in iproute2 are required to use the new fields. ICE COMMS DDP package is required to create a filter as it contains PPPoE profiles. Added a warning message when loaded DDP package does not contain required profiles. Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-30net/ice: fix initializing the bitmap in the switch codeAlexander Lobakin1-1/+1
Kbuild spotted the following bug during the testing of one of the optimizations: In file included from include/linux/cpumask.h:12, [...] from drivers/net/ethernet/intel/ice/ice_switch.c:4: drivers/net/ethernet/intel/ice/ice_switch.c: In function 'ice_find_free_recp_res_idx.constprop': include/linux/bitmap.h:447:22: warning: 'possible_idx[0]' is used uninitialized [-Wuninitialized] 447 | *map |= GENMASK(start + nbits - 1, start); | ^~ In file included from drivers/net/ethernet/intel/ice/ice.h:7, from drivers/net/ethernet/intel/ice/ice_lib.h:7, from drivers/net/ethernet/intel/ice/ice_switch.c:4: drivers/net/ethernet/intel/ice/ice_switch.c:4929:24: note: 'possible_idx[0]' was declared here 4929 | DECLARE_BITMAP(possible_idx, ICE_MAX_FV_WORDS); | ^~~~~~~~~~~~ include/linux/types.h:11:23: note: in definition of macro 'DECLARE_BITMAP' 11 | unsigned long name[BITS_TO_LONGS(bits)] | ^~~~ %ICE_MAX_FV_WORDS is 48, so bitmap_set() here was initializing only 48 bits, leaving a junk in the rest 16. It was previously hidden due to that filling 48 bits makes bitmap_set() call external __bitmap_set(), but after making it use plain bit arithmetics on small bitmaps, compilers started seeing the issue. It was still working because those 16 weren't used anywhere anyhow. bitmap_{clear,set}() are not really intended to initialize bitmaps, rather to modify already initialized ones, as they don't do anything past the passed number of bits. The correct function to do this in that particular case is bitmap_fill(), so use it here. It will do `*possible_idx = ~0UL` instead of `*possible_idx |= GENMASK(47, 0)`, not leaving anything in an undefined state. Fixes: fd2a6b71e300 ("ice: create advanced switch recipe") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-06-30ice: switch: dynamically add VLAN headers to dummy packetsMartyna Szapar-Mudlaw1-397/+140
Enable the support of creating all kinds of declared dummy packets with the VLAN tags by inserting VLAN headers (single VLAN and QinQ cases) if needed. Decrease the number of declared dummy packets and increase in the possible packet's combinations for adding switch rules. This change enables support of creating filters that match both on VLAN + tunnels properties in switchdev. Signed-off-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@intel.com> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-30ice: Add support for VLAN TPID filters in switchdevMartyna Szapar-Mudlaw1-3/+56
Enable support for adding TC rules that filter on the VLAN tag type in switchdev mode. Signed-off-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>