aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2019-05-05net: dsa: Add support for deferred xmitVladimir Oltean1-0/+12
Some hardware needs to take work to get convinced to receive frames on the CPU port (such as the sja1105 which takes temporary L2 forwarding rules over SPI that last for a single frame). Such work needs a sleepable context, and because the regular .ndo_start_xmit is atomic, this cannot be done in the tagger. So introduce a generic DSA mechanism that sets up a transmit skb queue and a workqueue for deferred transmission. The new driver callback (.port_deferred_xmit) is in dsa_switch and not in the tagger because the operations that require sleeping typically also involve interacting with the hardware, and not simply skb manipulations. Therefore having it there simplifies the structure a bit and makes it unnecessary to export functions from the driver to the tagger. The driver is responsible of calling dsa_enqueue_skb which transfers it to the master netdevice. This is so that it has a chance of performing some more work afterwards, such as cleanup or TX timestamping. To tell DSA that skb xmit deferral is required, I have thought about changing the return type of the tagger .xmit from struct sk_buff * into a enum dsa_tx_t that could potentially encode a DSA_XMIT_DEFER value. But the trailer tagger is reallocating every skb on xmit and therefore making a valid use of the pointer return value. So instead of reworking the API in complicated ways, right now a boolean property in the newly introduced DSA_SKB_CB is set. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-05net: dsa: Keep private info in the skb->cbVladimir Oltean1-0/+31
Map a DSA structure over the 48-byte control block that will hold skb info on transmit and receive. This is only for use within the DSA processing layer (e.g. communicating between DSA core and tagger) and not for passing info around with other layers such as the master net device. Also add a DSA_SKB_CB_PRIV() macro which retrieves a pointer to the space up to 48 bytes that the DSA structure does not use. This space can be used for drivers to add their own private info. One use is for the PTP timestamping code path. When cloning a skb, annotate the original with a pointer to the clone, which the driver can then find easily and place the timestamp to. This avoids the need of a separate queue to hold clones and a way to match an original to a cloned skb. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-05net: dsa: Allow drivers to filter packets they can decode source port fromVladimir Oltean1-0/+15
Frames get processed by DSA and redirected to switch port net devices based on the ETH_P_XDSA multiplexed packet_type handler found by the network stack when calling eth_type_trans(). The running assumption is that once the DSA .rcv function is called, DSA is always able to decode the switch tag in order to change the skb->dev from its master. However there are tagging protocols (such as the new DSA_TAG_PROTO_SJA1105, user of DSA_TAG_PROTO_8021Q) where this assumption is not completely true, since switch tagging piggybacks on the absence of a vlan_filtering bridge. Moreover, management traffic (BPDU, PTP) for this switch doesn't rely on switch tagging, but on a different mechanism. So it would make sense to at least be able to terminate that. Having DSA receive traffic it can't decode would put it in an impossible situation: the eth_type_trans() function would invoke the DSA .rcv(), which could not change skb->dev, then eth_type_trans() would be invoked again, which again would call the DSA .rcv, and the packet would never be able to exit the DSA filter and would spiral in a loop until the whole system dies. This happens because eth_type_trans() doesn't actually look at the skb (so as to identify a potential tag) when it deems it as being ETH_P_XDSA. It just checks whether skb->dev has a DSA private pointer installed (therefore it's a DSA master) and that there exists a .rcv callback (everybody except DSA_TAG_PROTO_NONE has that). This is understandable as there are many switch tags out there, and exhaustively checking for all of them is far from ideal. The solution lies in introducing a filtering function for each tagging protocol. In the absence of a filtering function, all traffic is passed to the .rcv DSA callback. The tagging protocol should see the filtering function as a pre-validation that it can decode the incoming skb. The traffic that doesn't match the filter will bypass the DSA .rcv callback and be left on the master netdevice, which wasn't previously possible. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-05net: dsa: Optional VLAN-based port separation for switches without taggingVladimir Oltean2-0/+78
This patch provides generic DSA code for using VLAN (802.1Q) tags for the same purpose as a dedicated switch tag for injection/extraction. It is based on the discussions and interest that has been so far expressed in https://www.spinics.net/lists/netdev/msg556125.html. Unlike all other DSA-supported tagging protocols, CONFIG_NET_DSA_TAG_8021Q does not offer a complete solution for drivers (nor can it). Instead, it provides generic code that driver can opt into calling: - dsa_8021q_xmit: Inserts a VLAN header with the specified contents. Can be called from another tagging protocol's xmit function. Currently the LAN9303 driver is inserting headers that are simply 802.1Q with custom fields, so this is an opportunity for code reuse. - dsa_8021q_rcv: Retrieves the TPID and TCI from a VLAN-tagged skb. Removing the VLAN header is left as a decision for the caller to make. - dsa_port_setup_8021q_tagging: For each user port, installs an Rx VID and a Tx VID, for proper untagged traffic identification on ingress and steering on egress. Also sets up the VLAN trunk on the upstream (CPU or DSA) port. Drivers are intentionally left to call this function explicitly, depending on the context and hardware support. The expected switch behavior and VLAN semantics should not be violated under any conditions. That is, after calling dsa_port_setup_8021q_tagging, the hardware should still pass all ingress traffic, be it tagged or untagged. For uniformity with the other tagging protocols, a module for the dsa_8021q_netdev_ops structure is registered, but the typical usage is to set up another tagging protocol which selects CONFIG_NET_DSA_TAG_8021Q, and calls the API from tag_8021q.h. Null function definitions are also provided so that a "depends on" is not forced in the Kconfig. This tagging protocol only works when switch ports are standalone, or when they are added to a VLAN-unaware bridge. It will probably remain this way for the reasons below. When added to a bridge that has vlan_filtering 1, the bridge core will install its own VLANs and reset the pvids through switchdev. For the bridge core, switchdev is a write-only pipe. All VLAN-related state is kept in the bridge core and nothing is read from DSA/switchdev or from the driver. So the bridge core will break this port separation because it will install the vlan_default_pvid into all switchdev ports. Even if we could teach the bridge driver about switchdev preference of a certain vlan_default_pvid (task difficult in itself since the current setting is per-bridge but we would need it per-port), there would still exist many other challenges. Firstly, in the DSA rcv callback, a driver would have to perform an iterative reverse lookup to find the correct switch port. That is because the port is a bridge slave, so its Rx VID (port PVID) is subject to user configuration. How would we ensure that the user doesn't reset the pvid to a different value (which would make an O(1) translation impossible), or to a non-unique value within this DSA switch tree (which would make any translation impossible)? Finally, not all switch ports are equal in DSA, and that makes it difficult for the bridge to be completely aware of this anyway. The CPU port needs to transmit tagged packets (VLAN trunk) in order for the DSA rcv code to be able to decode source information. But the bridge code has absolutely no idea which switch port is the CPU port, if nothing else then just because there is no netdevice registered by DSA for the CPU port. Also DSA does not currently allow the user to specify that they want the CPU port to do VLAN trunking anyway. VLANs are added to the CPU port using the same flags as they were added on the user port. So the VLANs installed by dsa_port_setup_8021q_tagging per driver request should remain private from the bridge's and user's perspective, and should not alter the VLAN semantics observed by the user. In the current implementation a VLAN range ending at 4095 (VLAN_N_VID) is reserved for this purpose. Each port receives a unique Rx VLAN and a unique Tx VLAN. Separate VLANs are needed for Rx and Tx because they serve different purposes: on Rx the switch must process traffic as untagged and process it with a port-based VLAN, but with care not to hinder bridging. On the other hand, the Tx VLAN is where the reachability restrictions are imposed, since by tagging frames in the xmit callback we are telling the switch onto which port to steer the frame. Some general guidance on how this support might be employed for real-life hardware (some comments made by Florian Fainelli): - If the hardware supports VLAN tag stacking, it should somehow back up its private VLAN settings when the bridge tries to override them. Then the driver could re-apply them as outer tags. Dedicating an outer tag per bridge device would allow identical inner tag VID numbers to co-exist, yet preserve broadcast domain isolation. - If the switch cannot handle VLAN tag stacking, it should disable this port separation when added as slave to a vlan_filtering bridge, in that case having reduced functionality. - Drivers for old switches that don't support the entire VLAN_N_VID range will need to rework the current range selection mechanism. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Reviewed-by: Vivien Didelot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-05net/sched: add block pointer to tc_cls_common_offload structurePieter Jansen van Vuuren1-0/+8
Some actions like the police action are stateful and could share state between devices. This is incompatible with offloading to multiple devices and drivers might want to test for shared blocks when offloading. Store a pointer to the tcf_block structure in the tc_cls_common_offload structure to allow drivers to determine when offloads apply to a shared block. Signed-off-by: Pieter Jansen van Vuuren <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-05net/sched: extend matchall offload for hardware statisticsPieter Jansen van Vuuren1-0/+2
Introduce a new command for matchall classifiers that allows hardware to update statistics. Signed-off-by: Pieter Jansen van Vuuren <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-05net/sched: add police action to the hardware intermediate representationPieter Jansen van Vuuren1-0/+5
Add police action to the hardware intermediate representation which would subsequently allow it to be used by drivers for offload. Signed-off-by: Pieter Jansen van Vuuren <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-05net/sched: move police action structures to headerPieter Jansen van Vuuren1-0/+70
Move tcf_police_params, tcf_police and tc_police_compat structures to a header. Making them usable to other code for example drivers that would offload police actions to hardware. Signed-off-by: Pieter Jansen van Vuuren <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-05net/sched: remove unused functions for matchall offloadPieter Jansen van Vuuren1-25/+0
Cleanup unused functions and variables after porting to the newer intermediate representation. Signed-off-by: Pieter Jansen van Vuuren <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-05mlxsw: use intermediate representation for matchall offloadPieter Jansen van Vuuren1-0/+11
Updates the Mellanox spectrum driver to use the newer intermediate representation for flow actions in matchall offloads. Signed-off-by: Pieter Jansen van Vuuren <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-05net/sched: use the hardware intermediate representation for matchallPieter Jansen van Vuuren1-0/+1
Extends matchall offload to make use of the hardware intermediate representation. More specifically, this patch moves the native TC actions in cls_matchall offload to the newer flow_action representation. This ultimately allows us to avoid a direct dependency on native TC actions for matchall. Signed-off-by: Pieter Jansen van Vuuren <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-05net/sched: add sample action to the hardware intermediate representationPieter Jansen van Vuuren1-0/+7
Add sample action to the hardware intermediate representation model which would subsequently allow it to be used by drivers for offload. Signed-off-by: Pieter Jansen van Vuuren <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller5-21/+40
Pablo Neira Ayuso says: =================== Netfilter updates for net-next The following batch contains Netfilter updates for net-next, they are: 1) Move nft_expr_clone() to nft_dynset, from Paul Gortmaker. 2) Do not include module.h from net/netfilter/nf_tables.h, also from Paul. 3) Restrict conntrack sysctl entries to boolean, from Tonghao Zhang. 4) Several patches to add infrastructure to autoload NAT helper modules from their respective conntrack helper, this also includes the first client of this code in OVS, patches from Flavio Leitner. 5) Add support to match for conntrack ID, from Brett Mastbergen. 6) Spelling fix in connlabel, from Colin Ian King. 7) Use struct_size() from hashlimit, from Gustavo A. R. Silva. 8) Add optimized version of nf_inet_addr_mask(), from Li RongQing. =================== Signed-off-by: David S. Miller <[email protected]>
2019-05-06netfilter: slightly optimize nf_inet_addr_maskLi RongQing1-0/+9
using 64bit computation to slightly optimize nf_inet_addr_mask Signed-off-by: Li RongQing <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2019-05-05Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "I'd like to apologize for this very late pull request: I was dithering through the week whether to send the fixes, and then yesterday Jiri's crash fix for a regression introduced in this cycle clearly marked perf/urgent as 'must merge now'. Most of the commits are tooling fixes, plus there's three kernel fixes via four commits: - race fix in the Intel PEBS code - fix an AUX bug and roll back a previous attempt - fix AMD family 17h generic HW cache-event perf counters The largest diffstat contribution comes from the AMD fix - a new event table is introduced, which is a fairly low risk change but has a large linecount" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Fix race in intel_pmu_disable_event() perf/x86/intel/pt: Remove software double buffering PMU capability perf/ring_buffer: Fix AUX software double buffering perf tools: Remove needless asm/unistd.h include fixing build in some places tools arch uapi: Copy missing unistd.h headers for arc, hexagon and riscv tools build: Add -ldl to the disassembler-four-args feature test perf cs-etm: Always allocate memory for cs_etm_queue::prev_packet perf cs-etm: Don't check cs_etm_queue::prev_packet validity perf report: Report OOM in status line in the GTK UI perf bench numa: Add define for RUSAGE_THREAD if not present tools lib traceevent: Change tag string for error perf annotate: Fix build on 32 bit for BPF annotation tools uapi x86: Sync vmx.h with the kernel perf bpf: Return value with unlocking in perf_env__find_btf() MAINTAINERS: Include vendor specific files under arch/*/events/* perf/x86/amd: Update generic hardware cache events for Family 17h
2019-05-05Merge branch 'for-upstream' of ↵David S. Miller1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2019-05-05 Here's one more bluetooth-next pull request for 5.2: - Fixed Command Complete event handling check for matching opcode - Added support for Qualcomm WCN3998 controller, along with DT bindings - Added default address for Broadcom BCM2076B1 controllers Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <[email protected]>
2019-05-05x86/mm: Initialize PGD cache during mm initializationNadav Amit1-0/+2
Poking-mm initialization might require to duplicate the PGD in early stage. Initialize the PGD cache earlier to prevent boot failures. Reported-by: kernel test robot <[email protected]> Signed-off-by: Nadav Amit <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rick Edgecombe <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Stephen Rothwell <[email protected]> Cc: Thomas Gleixner <[email protected]> Fixes: 4fc19708b165 ("x86/alternatives: Initialize temporary mm for patching") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-05-05ipv4: Define __ipv4_neigh_lookup_noref when CONFIG_INET is disabledDavid Ahern1-0/+8
Define __ipv4_neigh_lookup_noref to return NULL when CONFIG_INET is disabled. Fixes: 4b2a2bfeb3f0 ("neighbor: Call __ipv4_neigh_lookup_noref in neigh_xmit") Reported-by: kbuild test robot <[email protected]> Signed-off-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-05ip6: fix skb leak in ip6frag_expire_frag_queue()Eric Dumazet1-1/+0
Since ip6frag_expire_frag_queue() now pulls the head skb from frag queue, we should no longer use skb_get(), since this leads to an skb leak. Stefan Bader initially reported a problem in 4.4.stable [1] caused by the skb_get(), so this patch should also fix this issue. 296583.091021] kernel BUG at /build/linux-6VmqmP/linux-4.4.0/net/core/skbuff.c:1207! [296583.091734] Call Trace: [296583.091749] [<ffffffff81740e50>] __pskb_pull_tail+0x50/0x350 [296583.091764] [<ffffffff8183939a>] _decode_session6+0x26a/0x400 [296583.091779] [<ffffffff817ec719>] __xfrm_decode_session+0x39/0x50 [296583.091795] [<ffffffff818239d0>] icmpv6_route_lookup+0xf0/0x1c0 [296583.091809] [<ffffffff81824421>] icmp6_send+0x5e1/0x940 [296583.091823] [<ffffffff81753238>] ? __netif_receive_skb+0x18/0x60 [296583.091838] [<ffffffff817532b2>] ? netif_receive_skb_internal+0x32/0xa0 [296583.091858] [<ffffffffc0199f74>] ? ixgbe_clean_rx_irq+0x594/0xac0 [ixgbe] [296583.091876] [<ffffffffc04eb260>] ? nf_ct_net_exit+0x50/0x50 [nf_defrag_ipv6] [296583.091893] [<ffffffff8183d431>] icmpv6_send+0x21/0x30 [296583.091906] [<ffffffff8182b500>] ip6_expire_frag_queue+0xe0/0x120 [296583.091921] [<ffffffffc04eb27f>] nf_ct_frag6_expire+0x1f/0x30 [nf_defrag_ipv6] [296583.091938] [<ffffffff810f3b57>] call_timer_fn+0x37/0x140 [296583.091951] [<ffffffffc04eb260>] ? nf_ct_net_exit+0x50/0x50 [nf_defrag_ipv6] [296583.091968] [<ffffffff810f5464>] run_timer_softirq+0x234/0x330 [296583.091982] [<ffffffff8108a339>] __do_softirq+0x109/0x2b0 Fixes: d4289fcc9b16 ("net: IP6 defrag: use rbtrees for IPv6 defrag") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: Stefan Bader <[email protected]> Cc: Peter Oskolkov <[email protected]> Cc: Florian Westphal <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-05Bluetooth: Ignore CC events not matching the last HCI commandJoão Paulo Rechi Vita1-0/+1
This commit makes the kernel not send the next queued HCI command until a command complete arrives for the last HCI command sent to the controller. This change avoids a problem with some buggy controllers (seen on two SKUs of QCA9377) that send an extra command complete event for the previous command after the kernel had already sent a new HCI command to the controller. The problem was reproduced when starting an active scanning procedure, where an extra command complete event arrives for the LE_SET_RANDOM_ADDR command. When this happends the kernel ends up not processing the command complete for the following commmand, LE_SET_SCAN_PARAM, and ultimately behaving as if a passive scanning procedure was being performed, when in fact controller is performing an active scanning procedure. This makes it impossible to discover BLE devices as no device found events are sent to userspace. This problem is reproducible on 100% of the attempts on the affected controllers. The extra command complete event can be seen at timestamp 27.420131 on the btmon logs bellow. Bluetooth monitor ver 5.50 = Note: Linux version 5.0.0+ (x86_64) 0.352340 = Note: Bluetooth subsystem version 2.22 0.352343 = New Index: 80:C5:F2:8F:87:84 (Primary,USB,hci0) [hci0] 0.352344 = Open Index: 80:C5:F2:8F:87:84 [hci0] 0.352345 = Index Info: 80:C5:F2:8F:87:84 (Qualcomm) [hci0] 0.352346 @ MGMT Open: bluetoothd (privileged) version 1.14 {0x0001} 0.352347 @ MGMT Open: btmon (privileged) version 1.14 {0x0002} 0.352366 @ MGMT Open: btmgmt (privileged) version 1.14 {0x0003} 27.302164 @ MGMT Command: Start Discovery (0x0023) plen 1 {0x0003} [hci0] 27.302310 Address type: 0x06 LE Public LE Random < HCI Command: LE Set Random Address (0x08|0x0005) plen 6 #1 [hci0] 27.302496 Address: 15:60:F2:91:B2:24 (Non-Resolvable) > HCI Event: Command Complete (0x0e) plen 4 #2 [hci0] 27.419117 LE Set Random Address (0x08|0x0005) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Scan Parameters (0x08|0x000b) plen 7 #3 [hci0] 27.419244 Type: Active (0x01) Interval: 11.250 msec (0x0012) Window: 11.250 msec (0x0012) Own address type: Random (0x01) Filter policy: Accept all advertisement (0x00) > HCI Event: Command Complete (0x0e) plen 4 #4 [hci0] 27.420131 LE Set Random Address (0x08|0x0005) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Scan Enable (0x08|0x000c) plen 2 #5 [hci0] 27.420259 Scanning: Enabled (0x01) Filter duplicates: Enabled (0x01) > HCI Event: Command Complete (0x0e) plen 4 #6 [hci0] 27.420969 LE Set Scan Parameters (0x08|0x000b) ncmd 1 Status: Success (0x00) > HCI Event: Command Complete (0x0e) plen 4 #7 [hci0] 27.421983 LE Set Scan Enable (0x08|0x000c) ncmd 1 Status: Success (0x00) @ MGMT Event: Command Complete (0x0001) plen 4 {0x0003} [hci0] 27.422059 Start Discovery (0x0023) plen 1 Status: Success (0x00) Address type: 0x06 LE Public LE Random @ MGMT Event: Discovering (0x0013) plen 2 {0x0003} [hci0] 27.422067 Address type: 0x06 LE Public LE Random Discovery: Enabled (0x01) @ MGMT Event: Discovering (0x0013) plen 2 {0x0002} [hci0] 27.422067 Address type: 0x06 LE Public LE Random Discovery: Enabled (0x01) @ MGMT Event: Discovering (0x0013) plen 2 {0x0001} [hci0] 27.422067 Address type: 0x06 LE Public LE Random Discovery: Enabled (0x01) Signed-off-by: João Paulo Rechi Vita <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2019-05-05Merge tag 'nand/for-5.2' of ↵Richard Weinberger8-74/+146
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux into mtd/next NAND core changes: - Support having the bad block markers in either the first, second or last page of a block. The combination of all three location is now possible. - Constification of NAND_OP_PARSER(_PATTERN) elements. - Generic NAND DT bindings changed to yaml format (can be used to check the proposed bindings. First platform to be fully supported: sunxi. - Stopped using several legacy hooks. - Preparation to use the generic NAND layer with the addition of several helpers and the removal of the struct nand_chip from generic functions. - Kconfig cleanup to prepare the introduction of external ECC engines support. - Fallthrough comments. - Introduction of the SPI-mem dirmap API for SPI-NAND devices. Raw NAND controller drivers changes: - nandsim: * Switch to ->exec-op(). - meson: * Misc cleanups and fixes. * New OOB layout. - Sunxi: * A23/A33 NAND DMA support. - Ingenic: * Full reorganization and cleanup. * Clear separation between NAND controller and ECC engine. * Support JZ4740 an JZ4725B. - Denali: * Clear controller/chip separation. * ->exec_op() migration. * Various cleanups. - fsl_elbc: * Enable software ECC support. - Atmel: * Sam9x60 support. - GPMI: * Introduce the GPMI_IS_MXS() macro. - Various trivial/spelling/coding style fixes.
2019-05-05ipv4: Move exception bucket to nh_commonDavid Ahern1-1/+1
Similar to the cached routes, make IPv4 exceptions accessible when using an IPv6 nexthop struct with IPv4 routes. Simplify the exception functions by passing in fib_nh_common since that is all it needs, and then cleanup the call sites that have extraneous fib_nh conversions. As with the cached routes this is a change in location only, from fib_nh up to fib_nh_common; no functional change intended. Signed-off-by: David Ahern <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-05ipv4: Move cached routes to fib_nh_commonDavid Ahern1-2/+4
While the cached routes, nh_pcpu_rth_output and nh_rth_input, are IPv4 specific, a later patch wants to make them accessible for IPv6 nexthops with IPv4 routes using a fib6_nh. Move the cached routes from fib_nh to fib_nh_common and update references. Initialization of the cached entries is moved to fib_nh_common_init, and free is moved to fib_nh_common_release. Change in location only, from fib_nh up to fib_nh_common; no functional change intended. Signed-off-by: David Ahern <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-04blk-mq: always free hctx after request queue is freedMing Lei2-0/+9
In normal queue cleanup path, hctx is released after request queue is freed, see blk_mq_release(). However, in __blk_mq_update_nr_hw_queues(), hctx may be freed because of hw queues shrinking. This way is easy to cause use-after-free, because: one implicit rule is that it is safe to call almost all block layer APIs if the request queue is alive; and one hctx may be retrieved by one API, then the hctx can be freed by blk_mq_update_nr_hw_queues(); finally use-after-free is triggered. Fixes this issue by always freeing hctx after releasing request queue. If some hctxs are removed in blk_mq_update_nr_hw_queues(), introduce a per-queue list to hold them, then try to resuse these hctxs if numa node is matched. Cc: Dongli Zhang <[email protected]> Cc: James Smart <[email protected]> Cc: Bart Van Assche <[email protected]> Cc: [email protected], Cc: Martin K . Petersen <[email protected]>, Cc: Christoph Hellwig <[email protected]>, Cc: James E . J . Bottomley <[email protected]>, Reviewed-by: Hannes Reinecke <[email protected]> Tested-by: James Smart <[email protected]> Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2019-05-04netlink: add validation of NLA_F_NESTED flagMichal Kubecek1-1/+10
Add new validation flag NL_VALIDATE_NESTED which adds three consistency checks of NLA_F_NESTED_FLAG: - the flag is set on attributes with NLA_NESTED{,_ARRAY} policy - the flag is not set on attributes with other policies except NLA_UNSPEC - the flag is set on attribute passed to nla_parse_nested() Signed-off-by: Michal Kubecek <[email protected]> v2: change error messages to mention NLA_F_NESTED explicitly Reviewed-by: Johannes Berg <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-04net: phy: improve resuming from hibernationHeiner Kallweit1-8/+1
I got an interesting report [0] that after resuming from hibernation the link has 100Mbps instead of 1Gbps. Reason is that another OS has been used whilst Linux was hibernated. And this OS speeds down the link due to WoL. Therefore, when resuming, we shouldn't expect that what the PHY advertises is what it did when hibernating. Easiest way to do this is removing state PHY_RESUMING. Instead always go via PHY_UP that configures PHY advertisement. [0] https://bugzilla.kernel.org/show_bug.cgi?id=202851 Signed-off-by: Heiner Kallweit <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-04net: phy: improve pause handlingHeiner Kallweit1-0/+1
When probing the phy device we set sym and asym pause in the "supported" bitmap (unless the PHY tells us otherwise). However we don't know yet whether the MAC supports pause. Simply copying phy->supported to phy->advertising will trigger advertising pause, and that's not what we want. Therefore add phy_advertise_supported() that copies all modes but doesn't touch the pause bits. In phy_support_(a)sym_pause we shouldn't set any bits in the supported bitmap because we may set a bit the PHY intentionally disabled. Effective pause support should be the AND-combined PHY and MAC pause capabilities. If the MAC supports everything, then it's only relevant what the PHY supports. If MAC supports sym pause only, then we have to clear the asym bit in phydev->supported. Copy the pause flags only and don't touch the modes, because a driver may have intentionally removed a mode from phydev->advertising. Signed-off-by: Heiner Kallweit <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-04net: add a generic tracepoint for TX queue timeoutCong Wang1-0/+23
Although devlink health report does a nice job on reporting TX timeout and other NIC errors, unfortunately it requires drivers to support it but currently only mlx5 has implemented it. Before other drivers could catch up, it is useful to have a generic tracepoint to monitor this kind of TX timeout. We have been suffering TX timeout with different drivers, we plan to start to monitor it with rasdaemon which just needs a new tracepoint. Sample output: ksoftirqd/1-16 [001] ..s2 144.043173: net_dev_xmit_timeout: dev=ens3 driver=e1000 queue=0 Cc: Eran Ben Elisha <[email protected]> Cc: Jiri Pirko <[email protected]> Signed-off-by: Cong Wang <[email protected]> Acked-by: Jiri Pirko <[email protected]> Reviewed-by: Eran Ben Elisha <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-04Merge tag 'mlx5-updates-2019-04-30' of ↵David S. Miller9-23/+142
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2019-04-30 mlx5 misc updates: 1) Bodong Wang and Parav Pandit (6): - Remove unused mlx5_query_nic_vport_vlans - vport macros refactoring - Fix vport access in E-Switch - Use atomic rep state to serialize state change 2) Eli Britstein (2): - prio tag mode support, added ACLs and replace TC vlan pop with vlan 0 rewrite when prio tag mode is enabled. 3) Erez Alfasi (2): - ethtool: Add SFF-8436 and SFF-8636 max EEPROM length definitions - mlx5e: ethtool, Add support for EEPROM high pages query 4) Masahiro Yamada (1): - remove meaningless CFLAGS_tracepoint.o 5) Maxim Mikityanskiy (1): - Put the common XDP code into a function 6) Tariq Toukan (2): - Turn on HW tunnel offload in all TIRs 7) Vlad Buslov (1): - Return error when trying to insert existing flower filter ==================== Signed-off-by: David S. Miller <[email protected]>
2019-05-03net: dsa: mv88e6xxx: Pass interrupt number in platform dataAndrew Lunn1-0/+1
Allow an interrupt number to be passed in the platform data. The driver will then use it if not zero, otherwise it will poll for interrupts. Signed-off-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-03power: supply: core: Add POWER_SUPPLY_HEALTH_OVERCURRENT constantAndrey Smirnov1-0/+1
Add POWER_SUPPLY_HEALTH_OVERCURRENT constant in order to allow singalling overcurrent condition via power supply health information. Signed-off-by: Andrey Smirnov <[email protected]> Reviewed-by: Guenter Roeck <[email protected]> Cc: Enric Balletbo Serra <[email protected]> Cc: Chris Healy <[email protected]> Cc: Lucas Stach <[email protected]> Cc: Fabio Estevam <[email protected]> Cc: Guenter Roeck <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Sebastian Reichel <[email protected]>
2019-05-03livepatch: Remove custom kobject state handlingPetr Mladek1-3/+0
kobject_init() always succeeds and sets the reference count to 1. It allows to always free the structures via kobject_put() and the related release callback. Note that the custom kobject state handling was used only because we did not know that kobject_put() can and actually should get called even when kobject_init_and_add() fails. The patch should not change the existing behavior. Suggested-by: "Tobin C. Harding" <[email protected]> Signed-off-by: Petr Mladek <[email protected]> Reviewed-by: Kamalesh Babulal <[email protected]> Acked-by: Joe Lawrence <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2019-05-03kernel/cpu: Allow non-zero CPU to be primary for suspend / kexec freezeNicholas Piggin1-1/+6
This patch provides an arch option, ARCH_SUSPEND_NONZERO_CPU, to opt-in to allowing suspend to occur on one of the housekeeping CPUs rather than hardcoded CPU0. This will allow CPU0 to be a nohz_full CPU with a later change. It may be possible for platforms with hardware/firmware restrictions on suspend/wake effectively support this by handing off the final stage to CPU0 when kernel housekeeping is no longer required. Another option is to make housekeeping / nohz_full mask dynamic at runtime, but the complexity could not be justified at this time. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rafael J . Wysocki <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-05-03power/suspend: Add function to disable secondaries for suspendNicholas Piggin1-0/+12
This adds a function to disable secondary CPUs for suspend that are not necessarily non-zero / non-boot CPUs. Platforms will be able to use this to suspend using non-zero CPUs. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rafael J . Wysocki <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-05-03clk: analogbits: add Wide-Range PLL libraryPaul Walmsley1-0/+79
Add common library code for the Analog Bits Wide-Range PLL (WRPLL) IP block, as implemented in TSMC CLN28HPC. There is no bus interface or register target associated with this PLL. This library is intended to be used by drivers for IP blocks that expose registers connected to the PLL configuration and status signals. Based on code originally written by Wesley Terpstra <[email protected]>: https://github.com/riscv/riscv-linux/commit/999529edf517ed75b56659d456d221b2ee56bb60 This version incorporates several changes requested by Stephen Boyd <[email protected]>. Signed-off-by: Paul Walmsley <[email protected]> Signed-off-by: Paul Walmsley <[email protected]> Cc: Wesley Terpstra <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Michael Turquette <[email protected]> Cc: Stephen Boyd <[email protected]> Cc: Megan Wachs <[email protected]> Cc: [email protected] Cc: [email protected] [[email protected]: Fix some const issues] Signed-off-by: Stephen Boyd <[email protected]>
2019-05-03Merge tag 'usb-for-v5.2' of ↵Greg Kroah-Hartman1-0/+5
git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-next Felipe writes: USB: changes for v5.2 merge window With a total of 50 non-merge commits, this is not a large pull request. Most of the changes are, again, in dwc2 (37%) and dwc3 (32%) with the rest of it scattered among other UDCs, function drivers and device-tree bindings. No really big feature this time around apart from support to Amlogic being added to both dwc3 and dwc2 drivers. * tag 'usb-for-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb: (50 commits) usb: dwc3: Rename DWC3_DCTL_LPM_ERRATA usb: dwc3: Fix default lpm_nyet_threshold value usb: dwc3: debug: Print GET_STATUS(device) tracepoint usb: dwc3: Do core validation early on probe usb: dwc3: gadget: Set lpm_capable usb: gadget: atmel: tie wake lock to running clock usb: gadget: atmel: support USB suspend usb: gadget: atmel_usba_udc: simplify setting of interrupt-enabled mask dwc2: gadget: Fix completed transfer size calculation in DDMA usb: dwc2: Set lpm mode parameters depend on HW configuration usb: dwc2: Fix channel disable flow usb: dwc2: Set actual frame number for completed ISOC transfer usb: gadget: do not use __constant_cpu_to_le16 usb: dwc2: gadget: Increase descriptors count for ISOC's usb: introduce usb_ep_type_string() function usb: dwc3: move synchronize_irq() out of the spinlock protected block usb: dwc3: Free resource immediately after use usb: dwc3: of-simple: Convert to bulk clk API usb: dwc2: Delayed status support usb: gadget: udc: lpc32xx: rework interrupt handling ...
2019-05-03Merge 5.1-rc7 into usb-nextGreg Kroah-Hartman52-147/+302
We need this to make the usb-gadget branch merge cleaner. And for testing to keep from hitting the same issues already fixed. Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-05-03Merge tag 'usb-serial-5.2-rc1' of ↵Greg Kroah-Hartman1-7/+1
https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial into usb-next Johan writes: USB-serial updates for 5.2-rc1 Here are the USB-serial updates for 5.2-rc1, including: - flow-control related fixes for pl2303 - fix for an initial-termios issue - fix for a couple of unthrottle() races - fix for f81232 interrupt-handling issues - improved f81232 overrun handling - support for higher f81232 line speeds - support for f81232 break control Included are also various clean ups. All but the last four commits have been in linux-next and with no reported issues. Signed-off-by: Johan Hovold <[email protected]> * tag 'usb-serial-5.2-rc1' of https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial: (22 commits) USB: serial: f81232: implement break control USB: serial: f81232: add high baud rate support USB: serial: f81232: clear overrun flag USB: serial: f81232: fix interrupt worker not stop USB: serial: io_edgeport: fix up switch fall-through comments USB: serial: drop unused iflag macro USB: serial: drop unnecessary goto USB: serial: clean up throttle handling USB: serial: fix unthrottle races USB: serial: spcp8x5: simplify init_termios USB: serial: oti6858: simplify init_termios USB: serial: iuu_phoenix: simplify init_termios USB: serial: iuu_phoenix: drop bogus initial cflag USB: serial: cypress_m8: clean up initial-termios handling USB: serial: cypress_m8: drop unused termios USB: serial: cypress_m8: drop unused driver data flag USB: serial: ark3116: drop redundant init_termios USB: serial: fix initial-termios handling USB: serial: digi_acceleport: clean up set_termios USB: serial: digi_acceleport: clean up modem-control handling ...
2019-05-03Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds1-0/+16
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "Two fixes for the NKMP clks on Allwinner SoCs, a locking fix for clkdev where we forgot to hold a lock while iterating a list that can change, and finally a build fix that adds some stubs for clk APIs that are used by devfreq drivers on platforms without the clk APIs" * tag 'clk-fixes-for-linus' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: Add missing stubs for a few functions clkdev: Hold clocks_mutex while iterating clocks list clk: sunxi-ng: nkmp: Explain why zero width check is needed clk: sunxi-ng: nkmp: Avoid GENMASK(-1, 0)
2019-05-03net: dsa: sja1105: Add support for VLAN operationsVladimir Oltean1-0/+4
VLAN filtering cannot be properly disabled in SJA1105. So in order to emulate the "no VLAN awareness" behavior (not dropping traffic that is tagged with a VID that isn't configured on the port), we need to hack another switch feature: programmable TPID (which is 0x8100 for 802.1Q). We are reprogramming the TPID to a bogus value which leaves the switch thinking that all traffic is untagged, and therefore accepts it. Under a vlan_filtering bridge, the proper TPID of ETH_P_8021Q is installed again, and the switch starts identifying 802.1Q-tagged traffic. Signed-off-by: Vladimir Oltean <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-03ether: Add dedicated Ethertype for pseudo-802.1Q DSA taggingVladimir Oltean1-0/+1
There are two possible utilizations so far: - Switch devices that don't support a native insertion/extraction header on the CPU port may still enjoy the benefits of port isolation with a custom VLAN tag. For this, they need to have a customizable TPID in hardware and a new Ethertype to distinguish between real 802.1Q traffic and the private tags used for port separation. - Switches that don't support the deactivation of VLAN awareness, but still want to have a mode in which they accept all traffic, including frames that are tagged with a VLAN not configured on their ports, may use this as a fake to trick the hardware into thinking that the TPID for VLAN is something other than 0x8100. What follows after the ETH_P_DSA_8021Q EtherType is a regular VLAN header (TCI), however there is no other EtherType that can be used for this purpose and doesn't already have a well-defined meaning. ETH_P_8021AD, ETH_P_QINQ1, ETH_P_QINQ2 and ETH_P_QINQ3 expect that another follow-up VLAN tag is present, which is not the case here. Signed-off-by: Vladimir Oltean <[email protected]> Suggested-by: Andrew Lunn <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-03net: dsa: Introduce driver for NXP SJA1105 5-port L2 switchVladimir Oltean1-0/+19
At this moment the following is supported: * Link state management through phylib * Autonomous L2 forwarding managed through iproute2 bridge commands. IP termination must be done currently through the master netdevice, since the switch is unmanaged at this point and using DSA_TAG_PROTO_NONE. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: Georg Waibel <[email protected]> Acked-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-03lib: Add support for generic packing operationsVladimir Oltean1-0/+49
This provides an unified API for accessing register bit fields regardless of memory layout. The basic unit of data for these API functions is the u64. The process of transforming an u64 from native CPU encoding into the peripheral's encoding is called 'pack', and transforming it from peripheral to native CPU encoding is 'unpack'. Signed-off-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-03iommu/dma-iommu: Remove iommu_dma_map_msi_msg()Julien Grall1-5/+0
A recent change split iommu_dma_map_msi_msg() in two new functions. The function was still implemented to avoid modifying all the callers at once. Now that all the callers have been reworked, iommu_dma_map_msi_msg() can be removed. Signed-off-by: Julien Grall <[email protected]> Reviewed-by: Robin Murphy <[email protected]> Reviewed-by: Eric Auger <[email protected]> Acked-by: Joerg Roedel <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2019-05-03net: macb: shrink macb_platform_data structureNicolas Ferre1-9/+0
This structure was used intensively for machine specific values when DT was not used. Since the removal of AVR32 from the kernel, this structure is only used for passing clocks from PCI macb wrapper, all other fields being 0. All other known platforms use DT. Remove the leftovers but make sure that PCI macb still works as expected by using default values: - phydev->irq is set to PHY_POLL by mdiobus_alloc() - mii_bus->phy_mask is cleared while allocating it - bp->phy_interface is set to PHY_INTERFACE_MODE_MII if mode not found in DT. This simplifies driver probe path and particularly phy handling. Signed-off-by: Nicolas Ferre <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-05-03iommu/dma-iommu: Split iommu_dma_map_msi_msg() in two partsJulien Grall1-0/+25
On RT, iommu_dma_map_msi_msg() may be called from non-preemptible context. This will lead to a splat with CONFIG_DEBUG_ATOMIC_SLEEP as the function is using spin_lock (they can sleep on RT). iommu_dma_map_msi_msg() is used to map the MSI page in the IOMMU PT and update the MSI message with the IOVA. Only the part to lookup for the MSI page requires to be called in preemptible context. As the MSI page cannot change over the lifecycle of the MSI interrupt, the lookup can be cached and re-used later on. iomma_dma_map_msi_msg() is now split in two functions: - iommu_dma_prepare_msi(): This function will prepare the mapping in the IOMMU and store the cookie in the structure msi_desc. This function should be called in preemptible context. - iommu_dma_compose_msi_msg(): This function will update the MSI message with the IOVA when the device is behind an IOMMU. Signed-off-by: Julien Grall <[email protected]> Reviewed-by: Robin Murphy <[email protected]> Reviewed-by: Eric Auger <[email protected]> Acked-by: Joerg Roedel <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2019-05-03genirq/msi: Add a new field in msi_desc to store an IOMMU cookieJulien Grall1-0/+26
When an MSI doorbell is located downstream of an IOMMU, it is required to swizzle the physical address with an appropriately-mapped IOVA for any device attached to one of our DMA ops domain. At the moment, the allocation of the mapping may be done when composing the message. However, the composing may be done in non-preemtible context while the allocation requires to be called from preemptible context. A follow-up change will split the current logic in two functions requiring to keep an IOMMU cookie per MSI. A new field is introduced in msi_desc to store an IOMMU cookie. As the cookie may not be required in some configuration, the field is protected under a new config CONFIG_IRQ_MSI_IOMMU. A pair of helpers has also been introduced to access the field. Signed-off-by: Julien Grall <[email protected]> Reviewed-by: Robin Murphy <[email protected]> Reviewed-by: Eric Auger <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2019-05-03RDMA/core: Allow detaching gid attribute netdevice for RoCEParav Pandit1-1/+1
When there is active traffic through a GID, a QP/AH holds reference to this GID entry. RoCE GID entry holds reference to its attached netdevice. Due to this when netdevice is deleted by admin user, its refcount is not dropped. Therefore, while deleting RoCE GID, wait for all GID attribute's netdev users to finish accessing netdev in rcu context. Once all users done accessing it, release the netdev refcount. Signed-off-by: Huy Nguyen <[email protected]> Signed-off-by: Parav Pandit <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-05-03RDMA/cma: Use rdma_read_gid_attr_ndev_rcu to access netdevParav Pandit1-0/+1
To access the netdevice of the GID attribute, use an existing API rdma_read_gid_attr_ndev_rcu(). This further reduces dependency on open access to netdevice of GID attribute. Signed-off-by: Parav Pandit <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-05-03RDMA: Introduce and use GID attr helper to read RoCE L2 fieldsParav Pandit1-0/+3
Instead of RoCE drivers figuring out vlan, smac fields while working on QP/AH, provide a helper routine to read the L2 fields such as vlan_id and source mac address. This moves logic from mlx5 driver to core for wider usage for RoCE ports. This is a preparation patch to allow detaching netdev in subsequent patch. Signed-off-by: Parav Pandit <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>