aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-10-02Merge branch 'net-dsa-Improve-dsa_untag_bridge_pvid'David S. Miller5-23/+24
Florian Fainelli says: ==================== net: dsa: Improve dsa_untag_bridge_pvid() This patch series is based on the recent discussions with Vladimir: https://lore.kernel.org/netdev/[email protected]/ the simplest way forward was to call dsa_untag_bridge_pvid() after eth_type_trans() has been set which guarantees that skb->protocol is set to a correct value and this allows us to utilize __vlan_find_dev_deep_rcu() properly without playing or using the bridge master as a net_device reference. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-10-02net: dsa: Utilize __vlan_find_dev_deep_rcu()Florian Fainelli1-8/+3
Now that we are guaranteed that dsa_untag_bridge_pvid() is called after eth_type_trans() we can utilize __vlan_find_dev_deep_rcu() which will take care of finding an 802.1Q upper on top of a bridge master. A common use case, prior to 12a1526d067 ("net: dsa: untag the bridge pvid from rx skbs") was to configure a bridge 802.1Q upper like this: ip link add name br0 type bridge vlan_filtering 0 ip link add link br0 name br0.1 type vlan id 1 in order to pop the default_pvid VLAN tag. With this change we restore that behavior while still allowing the DSA receive path to automatically pop the VLAN tag. Signed-off-by: Florian Fainelli <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-02net: dsa: Obtain VLAN protocol from skb->protocolFlorian Fainelli1-2/+1
Now that dsa_untag_bridge_pvid() is called after eth_type_trans() we are guaranteed that skb->protocol will be set to a correct value, thus allowing us to avoid calling vlan_eth_hdr(). Signed-off-by: Florian Fainelli <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-02net: dsa: b53: Set untag_bridge_pvidFlorian Fainelli2-13/+3
Indicate to the DSA receive path that we need to untage the bridge PVID, this allows us to remove the dsa_untag_bridge_pvid() calls from net/dsa/tag_brcm.c. Signed-off-by: Florian Fainelli <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-02net: dsa: Call dsa_untag_bridge_pvid() from dsa_switch_rcv()Florian Fainelli2-0/+17
When a DSA switch driver needs to call dsa_untag_bridge_pvid(), it can set dsa_switch::untag_brige_pvid to indicate this is necessary. This is a pre-requisite to making sure that we are always calling dsa_untag_bridge_pvid() after eth_type_trans() has been called. Signed-off-by: Florian Fainelli <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-02Merge branch 'master' of ↵David S. Miller11-36/+3066
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2020-10-02 1) Add a full xfrm compatible layer for 32-bit applications on 64-bit kernels. From Dmitry Safonov. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-10-02netlink: fix policy dump leakJohannes Berg3-16/+20
[ Upstream commit a95bc734e60449e7b073ff7ff70c35083b290ae9 ] If userspace doesn't complete the policy dump, we leak the allocated state. Fix this. Fixes: d07dcf9aadd6 ("netlink: add infrastructure to expose policies to userspace") Signed-off-by: Johannes Berg <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-02netlink: fix policy dump leakJohannes Berg3-16/+20
If userspace doesn't complete the policy dump, we leak the allocated state. Fix this. Fixes: d07dcf9aadd6 ("netlink: add infrastructure to expose policies to userspace") Signed-off-by: Johannes Berg <[email protected]> Reviewed-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-02net/mlx5e: Fix race condition on nhe->n pointer in neigh updateVlad Buslov2-37/+50
Current neigh update event handler implementation takes reference to neighbour structure, assigns it to nhe->n, tries to schedule workqueue task and releases the reference if task was already enqueued. This results potentially overwriting existing nhe->n pointer with another neighbour instance, which causes double release of the instance (once in neigh update handler that failed to enqueue to workqueue and another one in neigh update workqueue task that processes updated nhe->n pointer instead of original one): [ 3376.512806] ------------[ cut here ]------------ [ 3376.513534] refcount_t: underflow; use-after-free. [ 3376.521213] Modules linked in: act_skbedit act_mirred act_tunnel_key vxlan ip6_udp_tunnel udp_tunnel nfnetlink act_gact cls_flower sch_ingress openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 mlx5_ib mlx5_core mlxfw pci_hyperv_intf ptp pps_core nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp rpcrdma rdma_ucm ib_umad ib_ipoib ib_iser rdma_cm ib_cm iw_cm rfkill ib_uverbs ib_core sunrpc kvm_intel kvm iTCO_wdt iTCO_vendor_support virtio_net irqbypass net_failover crc32_pclmul lpc_ich i2c_i801 failover pcspkr i2c_smbus mfd_core ghash_clmulni_intel sch_fq_codel drm i2c _core ip_tables crc32c_intel serio_raw [last unloaded: mlxfw] [ 3376.529468] CPU: 8 PID: 22756 Comm: kworker/u20:5 Not tainted 5.9.0-rc5+ #6 [ 3376.530399] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 [ 3376.531975] Workqueue: mlx5e mlx5e_rep_neigh_update [mlx5_core] [ 3376.532820] RIP: 0010:refcount_warn_saturate+0xd8/0xe0 [ 3376.533589] Code: ff 48 c7 c7 e0 b8 27 82 c6 05 0b b6 09 01 01 e8 94 93 c1 ff 0f 0b c3 48 c7 c7 88 b8 27 82 c6 05 f7 b5 09 01 01 e8 7e 93 c1 ff <0f> 0b c3 0f 1f 44 00 00 8b 07 3d 00 00 00 c0 74 12 83 f8 01 74 13 [ 3376.536017] RSP: 0018:ffffc90002a97e30 EFLAGS: 00010286 [ 3376.536793] RAX: 0000000000000000 RBX: ffff8882de30d648 RCX: 0000000000000000 [ 3376.537718] RDX: ffff8882f5c28f20 RSI: ffff8882f5c18e40 RDI: ffff8882f5c18e40 [ 3376.538654] RBP: ffff8882cdf56c00 R08: 000000000000c580 R09: 0000000000001a4d [ 3376.539582] R10: 0000000000000731 R11: ffffc90002a97ccd R12: 0000000000000000 [ 3376.540519] R13: ffff8882de30d600 R14: ffff8882de30d640 R15: ffff88821e000900 [ 3376.541444] FS: 0000000000000000(0000) GS:ffff8882f5c00000(0000) knlGS:0000000000000000 [ 3376.542732] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3376.543545] CR2: 0000556e5504b248 CR3: 00000002c6f10005 CR4: 0000000000770ee0 [ 3376.544483] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 3376.545419] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 3376.546344] PKRU: 55555554 [ 3376.546911] Call Trace: [ 3376.547479] mlx5e_rep_neigh_update.cold+0x33/0xe2 [mlx5_core] [ 3376.548299] process_one_work+0x1d8/0x390 [ 3376.548977] worker_thread+0x4d/0x3e0 [ 3376.549631] ? rescuer_thread+0x3e0/0x3e0 [ 3376.550295] kthread+0x118/0x130 [ 3376.550914] ? kthread_create_worker_on_cpu+0x70/0x70 [ 3376.551675] ret_from_fork+0x1f/0x30 [ 3376.552312] ---[ end trace d84e8f46d2a77eec ]--- Fix the bug by moving work_struct to dedicated dynamically-allocated structure. This enabled every event handler to work on its own private neighbour pointer and removes the need for handling the case when task is already enqueued. Fixes: 232c001398ae ("net/mlx5e: Add support to neighbour update flow") Signed-off-by: Vlad Buslov <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-10-02net/mlx5e: Fix VLAN create flowAya Levin1-2/+4
When interface is attached while in promiscuous mode and with VLAN filtering turned off, both configurations are not respected and VLAN filtering is performed. There are 2 flows which add the any-vid rules during interface attach: VLAN creation table and set rx mode. Each is relaying on the other to add any-vid rules, eventually non of them does. Fix this by adding any-vid rules on VLAN creation regardless of promiscuous mode. Fixes: 9df30601c843 ("net/mlx5e: Restore vlan filter after seamless reset") Signed-off-by: Aya Levin <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-10-02net/mlx5e: Fix VLAN cleanup flowAya Levin1-2/+6
Prior to this patch unloading an interface in promiscuous mode with RX VLAN filtering feature turned off - resulted in a warning. This is due to a wrong condition in the VLAN rules cleanup flow, which left the any-vid rules in the VLAN steering table. These rules prevented destroying the flow group and the flow table. The any-vid rules are removed in 2 flows, but none of them remove it in case both promiscuous is set and VLAN filtering is off. Fix the issue by changing the condition of the VLAN table cleanup flow to clean also in case of promiscuous mode. mlx5_core 0000:00:08.0: mlx5_destroy_flow_group:2123:(pid 28729): Flow group 20 wasn't destroyed, refcount > 1 mlx5_core 0000:00:08.0: mlx5_destroy_flow_group:2123:(pid 28729): Flow group 19 wasn't destroyed, refcount > 1 mlx5_core 0000:00:08.0: mlx5_destroy_flow_table:2112:(pid 28729): Flow table 262149 wasn't destroyed, refcount > 1 ... ... ------------[ cut here ]------------ FW pages counter is 11560 after reclaiming all pages WARNING: CPU: 1 PID: 28729 at drivers/net/ethernet/mellanox/mlx5/core/pagealloc.c:660 mlx5_reclaim_startup_pages+0x178/0x230 [mlx5_core] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: mlx5_function_teardown+0x2f/0x90 [mlx5_core] mlx5_unload_one+0x71/0x110 [mlx5_core] remove_one+0x44/0x80 [mlx5_core] pci_device_remove+0x3e/0xc0 device_release_driver_internal+0xfb/0x1c0 device_release_driver+0x12/0x20 pci_stop_bus_device+0x68/0x90 pci_stop_and_remove_bus_device+0x12/0x20 hv_eject_device_work+0x6f/0x170 [pci_hyperv] ? __schedule+0x349/0x790 process_one_work+0x206/0x400 worker_thread+0x34/0x3f0 ? process_one_work+0x400/0x400 kthread+0x126/0x140 ? kthread_park+0x90/0x90 ret_from_fork+0x22/0x30 ---[ end trace 6283bde8d26170dc ]--- Fixes: 9df30601c843 ("net/mlx5e: Restore vlan filter after seamless reset") Signed-off-by: Aya Levin <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-10-02net/mlx5e: Fix return status when setting unsupported FEC modeAya Levin1-0/+3
Verify the configured FEC mode is supported by at least a single link mode before applying the command. Otherwise fail the command and return "Operation not supported". Prior to this patch, the command was successful, yet it falsely set all link modes to FEC auto mode - like configuring FEC mode to auto. Auto mode is the default configuration if a link mode doesn't support the configured FEC mode. Fixes: b5ede32d3329 ("net/mlx5e: Add support for FEC modes based on 50G per lane links") Signed-off-by: Aya Levin <[email protected]> Reviewed-by: Eran Ben Elisha <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-10-02net/mlx5e: Fix driver's declaration to support GRE offloadAya Levin1-1/+18
Declare GRE offload support with respect to the inner protocol. Add a list of supported inner protocols on which the driver can offload checksum and GSO. For other protocols, inform the stack to do the needed operations. There is no noticeable impact on GRE performance. Fixes: 2729984149e6 ("net/mlx5e: Support TSO and TX checksum offloads for GRE tunnels") Signed-off-by: Aya Levin <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-10-02net/mlx5e: CT, Fix coverity issueMaor Dickman1-1/+3
The cited commit introduced the following coverity issue at function mlx5_tc_ct_rule_to_tuple_nat: - Memory - corruptions (OVERRUN) Overrunning array "tuple->ip.src_v6.in6_u.u6_addr32" of 4 4-byte elements at element index 7 (byte offset 31) using index "ip6_offset" (which evaluates to 7). In case of IPv6 destination address rewrite, ip6_offset values are between 4 to 7, which will cause memory overrun of array "tuple->ip.src_v6.in6_u.u6_addr32" to array "tuple->ip.dst_v6.in6_u.u6_addr32". Fixed by writing the value directly to array "tuple->ip.dst_v6.in6_u.u6_addr32" in case ip6_offset values are between 4 to 7. Fixes: bc562be9674b ("net/mlx5e: CT: Save ct entries tuples in hashtables") Signed-off-by: Maor Dickman <[email protected]> Reviewed-by: Roi Dayan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-10-02net/mlx5e: Add resiliency in Striding RQ mode for packets larger than MTUAya Levin2-5/+58
Prior to this fix, in Striding RQ mode the driver was vulnerable when receiving packets in the range (stride size - headroom, stride size]. Where stride size is calculated by mtu+headroom+tailroom aligned to the closest power of 2. Usually, this filtering is performed by the HW, except for a few cases: - Between 2 VFs over the same PF with different MTUs - On bluefield, when the host physical function sets a larger MTU than the ARM has configured on its representor and uplink representor. When the HW filtering is not present, packets that are larger than MTU might be harmful for the RQ's integrity, in the following impacts: 1) Overflow from one WQE to the next, causing a memory corruption that in most cases is unharmful: as the write happens to the headroom of next packet, which will be overwritten by build_skb(). In very rare cases, high stress/load, this is harmful. When the next WQE is not yet reposted and points to existing SKB head. 2) Each oversize packet overflows to the headroom of the next WQE. On the last WQE of the WQ, where addresses wrap-around, the address of the remainder headroom does not belong to the next WQE, but it is out of the memory region range. This results in a HW CQE error that moves the RQ into an error state. Solution: Add a page buffer at the end of each WQE to absorb the leak. Actually the maximal overflow size is headroom but since all memory units must be of the same size, we use page size to comply with UMR WQEs. The increase in memory consumption is of a single page per RQ. Initialize the mkey with all MTTs pointing to a default page. When the channels are activated, UMR WQEs will redirect the RX WQEs to the actual memory from the RQ's pool, while the overflow MTTs remain mapped to the default page. Fixes: 73281b78a37a ("net/mlx5e: Derive Striding RQ size from MTU") Signed-off-by: Aya Levin <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-10-02net/mlx5e: Fix error path for RQ allocAya Levin1-15/+17
Increase granularity of the error path to avoid unneeded free/release. Fix the cleanup to be symmetric to the order of creation. Fixes: 0ddf543226ac ("xdp/mlx5: setup xdp_rxq_info") Fixes: 422d4c401edd ("net/mlx5e: RX, Split WQ objects for different RQ types") Signed-off-by: Aya Levin <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-10-02net/mlx5: Fix request_irqs error flowMaor Gottlieb1-1/+1
Fix error flow handling in request_irqs which try to free irq that we failed to request. It fixes the below trace. WARNING: CPU: 1 PID: 7587 at kernel/irq/manage.c:1684 free_irq+0x4d/0x60 CPU: 1 PID: 7587 Comm: bash Tainted: G W OE 4.15.15-1.el7MELLANOXsmp-x86_64 #1 Hardware name: Advantech SKY-6200/SKY-6200, BIOS F2.00 08/06/2020 RIP: 0010:free_irq+0x4d/0x60 RSP: 0018:ffffc9000ef47af0 EFLAGS: 00010282 RAX: ffff88001476ae00 RBX: 0000000000000655 RCX: 0000000000000000 RDX: ffff88001476ae00 RSI: ffffc9000ef47ab8 RDI: ffff8800398bb478 RBP: ffff88001476a838 R08: ffff88001476ae00 R09: 000000000000156d R10: 0000000000000000 R11: 0000000000000004 R12: ffff88001476a838 R13: 0000000000000006 R14: ffff88001476a888 R15: 00000000ffffffe4 FS: 00007efeadd32740(0000) GS:ffff88047fc40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fc9cc010008 CR3: 00000001a2380004 CR4: 00000000007606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: mlx5_irq_table_create+0x38d/0x400 [mlx5_core] ? atomic_notifier_chain_register+0x50/0x60 mlx5_load_one+0x7ee/0x1130 [mlx5_core] init_one+0x4c9/0x650 [mlx5_core] pci_device_probe+0xb8/0x120 driver_probe_device+0x2a1/0x470 ? driver_allows_async_probing+0x30/0x30 bus_for_each_drv+0x54/0x80 __device_attach+0xa3/0x100 pci_bus_add_device+0x4a/0x90 pci_iov_add_virtfn+0x2dc/0x2f0 pci_enable_sriov+0x32e/0x420 mlx5_core_sriov_configure+0x61/0x1b0 [mlx5_core] ? kstrtoll+0x22/0x70 num_vf_store+0x4b/0x70 [mlx5_core] kernfs_fop_write+0x102/0x180 __vfs_write+0x26/0x140 ? rcu_all_qs+0x5/0x80 ? _cond_resched+0x15/0x30 ? __sb_start_write+0x41/0x80 vfs_write+0xad/0x1a0 SyS_write+0x42/0x90 do_syscall_64+0x60/0x110 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 Fixes: 24163189da48 ("net/mlx5: Separate IRQ request/free from EQ life cycle") Signed-off-by: Maor Gottlieb <[email protected]> Reviewed-by: Eran Ben Elisha <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-10-02net/mlx5: cmdif, Avoid skipping reclaim pages if FW is not accessibleSaeed Mahameed3-9/+11
In case of pci is offline reclaim_pages_cmd() will still try to call the FW to release FW pages, cmd_exec() in this case will return a silent success without actually calling the FW. This is wrong and will cause page leaks, what we should do is to detect pci offline or command interface un-available before tying to access the FW and manually release the FW pages in the driver. In this patch we share the code to check for FW command interface availability and we call it in sensitive places e.g. reclaim_pages_cmd(). Alternative fix: 1. Remove MLX5_CMD_OP_MANAGE_PAGES form mlx5_internal_err_ret_value, command success simulation list. 2. Always Release FW pages even if cmd_exec fails in reclaim_pages_cmd(). Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-10-02net/mlx5: Add retry mechanism to the command entry index allocationEran Ben Elisha1-1/+20
It is possible that new command entry index allocation will temporarily fail. The new command holds the semaphore, so it means that a free entry should be ready soon. Add one second retry mechanism before returning an error. Patch "net/mlx5: Avoid possible free of command entry while timeout comp handler" increase the possibility to bump into this temporarily failure as it delays the entry index release for non-callback commands. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Eran Ben Elisha <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-10-02net/mlx5: poll cmd EQ in case of command timeoutEran Ben Elisha3-9/+86
Once driver detects a command interface command timeout, it warns the user and returns timeout error to the caller. In such case, the entry of the command is not evacuated (because only real event interrupt is allowed to clear command interface entry). If the HW event interrupt of this entry will never arrive, this entry will be left unused forever. Command interface entries are limited and eventually we can end up without the ability to post a new command. In addition, if driver will not consume the EQE of the lost interrupt and rearm the EQ, no new interrupts will arrive for other commands. Add a resiliency mechanism for manually polling the command EQ in case of a command timeout. In case resiliency mechanism will find non-handled EQE, it will consume it, and the command interface will be fully functional again. Once the resiliency flow finished, wait another 5 seconds for the command interface to complete for this command entry. Define mlx5_cmd_eq_recover() to manage the cmd EQ polling resiliency flow. Add an async EQ spinlock to avoid races between resiliency flows and real interrupts that might run simultaneously. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Eran Ben Elisha <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-10-02net/mlx5: Avoid possible free of command entry while timeout comp handlerEran Ben Elisha2-38/+73
Upon command completion timeout, driver simulates a forced command completion. In a rare case where real interrupt for that command arrives simultaneously, it might release the command entry while the forced handler might still access it. Fix that by adding an entry refcount, to track current amount of allowed handlers. Command entry to be released only when this refcount is decremented to zero. Command refcount is always initialized to one. For callback commands, command completion handler is the symmetric flow to decrement it. For non-callback commands, it is wait_func(). Before ringing the doorbell, increment the refcount for the real completion handler. Once the real completion handler is called, it will decrement it. For callback commands, once the delayed work is scheduled, increment the refcount. Upon callback command completion handler, we will try to cancel the timeout callback. In case of success, we need to decrement the callback refcount as it will never run. In addition, gather the entry index free and the entry free into a one flow for all command types release. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Eran Ben Elisha <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-10-02net/mlx5: Fix a race when moving command interface to polling modeEran Ben Elisha1-0/+2
As part of driver unload, it destroys the commands EQ (via FW command). As the commands EQ is destroyed, FW will not generate EQEs for any command that driver sends afterwards. Driver should poll for later commands status. Driver commands mode metadata is updated before the commands EQ is actually destroyed. This can lead for double completion handle by the driver (polling and interrupt), if a command is executed and completed by FW after the mode was changed, but before the EQ was destroyed. Fix that by using the mlx5_cmd_allowed_opcode mechanism to guarantee that only DESTROY_EQ command can be executed during this time period. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Eran Ben Elisha <[email protected]> Reviewed-by: Moshe Shemesh <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2020-10-02Merge branch 'work.epoll' of ↵Linus Torvalds1-41/+31
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull epoll fixes from Al Viro: "Several race fixes in epoll" * 'work.epoll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: ep_create_wakeup_source(): dentry name can change under you... epoll: EPOLL_CTL_ADD: close the race in decision to take fast path epoll: replace ->visited/visited_list with generation count epoll: do not insert into poll queues until all sanity checks are done
2020-10-02Merge tag 'riscv-for-linus-5.9-rc8' of ↵Linus Torvalds3-4/+14
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: "Two fixes for this week: - The addition of a symbol export for clint_time_val, which has been inlined into some timex functions and can be used by drivers. - A fix to avoid calling get_cycles() before the timers have been probed. These both only effect !MMU systems" * tag 'riscv-for-linus-5.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: RISC-V: Check clint_time_val before use clocksource: clint: Export clint_time_val for modules
2020-10-02Merge tag 'for-5.9-rc7-tag' of ↵Linus Torvalds3-10/+52
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "Two more fixes. One is for a lockdep warning/lockup (also caught by syzbot), that one has been seen in practice. Regarding the other syzbot reports mentioned last time, they don't seem to be urgent and reliably reproducible so they'll be fixed later. The second fix is for a potential corruption when device replace finishes and the in-memory state of trim is not copied to the new device" * tag 'for-5.9-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix filesystem corruption after a device replace btrfs: move btrfs_rm_dev_replace_free_srcdev outside of all locks btrfs: move btrfs_scratch_superblocks into btrfs_dev_replace_finishing
2020-10-02Merge tag 'pm-5.9-rc8' of ↵Linus Torvalds3-2/+5
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix one more issue related to the recent RCU-lockdep changes, a typo in documentation and add a missing return statement to intel_pstate. Specifics: - Fix up RCU usage for cpuidle on the ARM imx6q platform (Ulf Hansson) - Fix typo in the PM documentation (Yoann Congal) - Add return statement that is missing after recent changes in the intel_pstate driver (Zhang Rui)" * tag 'pm-5.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ARM: imx6q: Fixup RCU usage for cpuidle Documentation: PM: Fix a reStructuredText syntax error cpufreq: intel_pstate: Fix missing return statement
2020-10-02Merge tag 'staging-5.9-rc8' of ↵Linus Torvalds2-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull IIO fixes from Greg KH: "Here are two small IIO driver fixes for 5.9-rc8 that resolve some reported issues: - driver name fixed in one driver - device name typo fixed Both have been in linux-next for a while with no reported problems" * tag 'staging-5.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: iio: adc: qcom-spmi-adc5: fix driver name iio: adc: ad7124: Fix typo in device name
2020-10-02Merge tag 'gpio-v5.9-2' of ↵Linus Torvalds11-60/+138
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "Some late GPIO fixes for the v5.9 series: - Fix compiler warnings on the OMAP when PM is disabled - Clear the interrupt when setting edge sensitivity on the Spreadtrum driver. - Fix up spurious interrupts on the TC35894. - Support threaded interrupts on the Siox controller. - Fix resource leaks on the mockup driver. - Fix line event handling in syscall compatible mode for the character device. - Fix an unitialized variable in the PCA953A driver. - Fix access to all GPIO IRQs on the Aspeed AST2600. - Fix line direction on the AMD FCH driver. - Use the bitmap API instead of compiler intrinsics for bit manipulation in the PCA953x driver" * tag 'gpio-v5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: pca953x: Correctly initialize registers 6 and 7 for PCA957x gpio: pca953x: Use bitmap API over implicit GCC extension gpio: amd-fch: correct logic of GPIO_LINE_DIRECTION gpio: aspeed: fix ast2600 bank properties gpio/aspeed-sgpio: don't enable all interrupts by default gpio/aspeed-sgpio: enable access to all 80 input & output sgpios gpio: pca953x: Fix uninitialized pending variable gpiolib: Fix line event handling in syscall compatible mode gpio: mockup: fix resource leak in error path gpio: siox: explicitly support only threaded irqs gpio: tc35894: fix up tc35894 interrupt configuration gpio: sprd: Clear interrupt when setting the type as edge gpio: omap: Fix warnings if PM is disabled
2020-10-02Merge tag 'mmc-v5.9-rc4-3' of ↵Linus Torvalds3-1/+7
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: - Fix deadlock when removing MEMSTICK host - Workaround broken CMDQ on Intel GLK based IRBIS models * tag 'mmc-v5.9-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci: Workaround broken command queuing on Intel GLK based IRBIS models memstick: Skip allocating card when removing host
2020-10-02random32: Restore __latent_entropy attribute on net_rand_stateThibaut Sautereau1-1/+1
Commit f227e3ec3b5c ("random32: update the net random state on interrupt and activity") broke compilation and was temporarily fixed by Linus in 83bdc7275e62 ("random32: remove net_rand_state from the latent entropy gcc plugin") by entirely moving net_rand_state out of the things handled by the latent_entropy GCC plugin. From what I understand when reading the plugin code, using the __latent_entropy attribute on a declaration was the wrong part and simply keeping the __latent_entropy attribute on the variable definition was the correct fix. Fixes: 83bdc7275e62 ("random32: remove net_rand_state from the latent entropy gcc plugin") Acked-by: Willy Tarreau <[email protected]> Cc: Emese Revfy <[email protected]> Signed-off-by: Thibaut Sautereau <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-10-02Merge branch 'pm-cpufreq'Rafael J. Wysocki1-0/+1
* pm-cpufreq: cpufreq: intel_pstate: Fix missing return statement
2020-10-02mm: memcg/slab: fix slab statistics in !SMP configurationRoman Gushchin1-0/+5
Since commit ea426c2a7de8 ("mm: memcg: prepare for byte-sized vmstat items") the write side of slab counters accepts a value in bytes and converts it to pages. It happens in __mod_node_page_state(). However a non-SMP version of __mod_node_page_state() doesn't perform this conversion. It leads to incorrect (unrealistically high) slab counters values. Fix this by adding a similar conversion to the non-SMP version of __mod_node_page_state(). Signed-off-by: Roman Gushchin <[email protected]> Reported-and-tested-by: Bastian Bittorf <[email protected]> Fixes: ea426c2a7de8 ("mm: memcg: prepare for byte-sized vmstat items") Acked-by: Vlastimil Babka <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-10-02MAINTAINERS: Add Mark Gross and Hans de Goede as x86 platform drivers ↵Hans de Goede1-3/+3
maintainers Darren Hart and Andy Shevchenko lately have not had enough time to maintain the x86 platform drivers, dropping their status to: "Odd Fixes". Mark Gross and Hans de Goede will take over maintainership of the x86 platform drivers. Replace Darren and Andy's entries with theirs and change the status to "Maintained". Signed-off-by: Hans de Goede <[email protected]> Acked-by: Mark Gross <[email protected]> Signed-off-by: Andy Shevchenko <[email protected]>
2020-10-02platform/x86: intel-vbtn: Switch to an allow-list for SW_TABLET_MODE reportingHans de Goede1-9/+43
2 recent commits: cfae58ed681c ("platform/x86: intel-vbtn: Only blacklist SW_TABLET_MODE on the 9 / "Laptop" chasis-type") 1fac39fd0316 ("platform/x86: intel-vbtn: Also handle tablet-mode switch on "Detachable" and "Portable" chassis-types") Enabled reporting of SW_TABLET_MODE on more devices since the vbtn ACPI interface is used by the firmware on some of those devices to report this. Testing has shown that unconditionally enabling SW_TABLET_MODE reporting on all devices with a chassis type of 8 ("Portable") or 10 ("Notebook") which support the VGBS method is a very bad idea. Many of these devices are normal laptops (non 2-in-1) models with a VGBS which always returns 0, which we translate to SW_TABLET_MODE=1. This in turn causes userspace (libinput) to suppress events from the builtin keyboard and touchpad, making the laptop essentially unusable. Since the problem of wrongly reporting SW_TABLET_MODE=1 in combination with libinput, leads to a non-usable system. Where as OTOH many people will not even notice when SW_TABLET_MODE is not being reported, this commit changes intel_vbtn_has_switches() to use a DMI based allow-list. The new DMI based allow-list matches on the 31 ("Convertible") and 32 ("Detachable") chassis-types, as these clearly are 2-in-1s and so far if they support the intel-vbtn ACPI interface they all have properly working SW_TABLET_MODE reporting. Besides these 2 generic matches, it also contains model specific matches for 2-in-1 models which use a different chassis-type and which are known to have properly working SW_TABLET_MODE reporting. This has been tested on the following 2-in-1 devices: Dell Venue 11 Pro 7130 vPro HP Pavilion X2 10-p002nd HP Stream x360 Convertible PC 11 Medion E1239T Fixes: cfae58ed681c ("platform/x86: intel-vbtn: Only blacklist SW_TABLET_MODE on the 9 / "Laptop" chasis-type") BugLink: https://forum.manjaro.org/t/keyboard-and-touchpad-only-work-on-kernel-5-6/22668 BugLink: https://bugzilla.opensuse.org/show_bug.cgi?id=1175599 Cc: Barnabás Pőcze <[email protected]> Cc: Takashi Iwai <[email protected]> Signed-off-by: Hans de Goede <[email protected]> Signed-off-by: Andy Shevchenko <[email protected]>
2020-10-02platform/x86: intel-vbtn: Revert "Fix SW_TABLET_MODE always reporting 1 on ↵Andy Shevchenko1-8/+4
the HP Pavilion 11 x360" After discussion, see the Link tag, it appears that this is not good enough. So, revert it now and apply a better fix. This reverts commit d823346876a970522ff9e4d2b323c9b734dcc4de. Link: https://lore.kernel.org/platform-driver-x86/[email protected]/ Signed-off-by: Andy Shevchenko <[email protected]>
2020-10-02mac80211: avoid processing non-S1G elements on S1G bandThomas Pedersen1-6/+9
In ieee80211_determine_chantype(), the sband->ht_cap was being processed before S1G Operation element. Since the HT capability element should not be present on the S1G band, avoid processing potential garbage by moving the call to ieee80211_apply_htcap_overrides() to after the S1G block. Also, in case of a missing S1G Operation element, we would continue trying to process non-S1G elements (and return with a channel width of 20MHz). Instead, just assume primary channel is equal to operating and infer the operating width from the BSS channel, then return. Signed-off-by: Thomas Pedersen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Johannes Berg <[email protected]>
2020-10-02nl80211: fix non-split wiphy informationJohannes Berg1-1/+4
When dumping wiphy information, we try to split the data into many submessages, but for old userspace we still support the old mode where this doesn't happen. However, in this case we were not resetting our state correctly and dumping multiple messages for each wiphy, which would have broken such older userspace. This was broken pretty much immediately afterwards because it only worked in the original commit where non-split dumps didn't have any more data than split dumps... Fixes: fe1abafd942f ("nl80211: re-add channel width and extended capa advertising") Signed-off-by: Johannes Berg <[email protected]> Link: https://lore.kernel.org/r/20200928130717.3e6d9c6bada2.Ie0f151a8d0d00a8e1e18f6a8c9244dd02496af67@changeid Signed-off-by: Johannes Berg <[email protected]>
2020-10-02nl80211: reduce non-split wiphy dump sizeJohannes Berg1-13/+24
When wiphy dumps cannot be split, such as in events or with older userspace that doesn't support it, the size can today be too big. Reduce it, by doing two things: 1) remove data that couldn't have been present before the split capability was introduced since it's new, such as HE capabilities 2) as suggested by Martin Willi, remove management frame subtypes from the split dumps, as just (1) isn't even enough due to other new code capabilities. This is fine as old consumers (really just wpa_supplicant) didn't check this data before they got support for split dumps. Reported-by: Martin Willi <[email protected]> Suggested-by: Martin Willi <[email protected]> Signed-off-by: Johannes Berg <[email protected]> Tested-by: Martin Willi <[email protected]> Link: https://lore.kernel.org/r/20200928130655.53bce7873164.I71f06c9a221cd0630429a1a56eeae68a13beca61@changeid Signed-off-by: Johannes Berg <[email protected]>
2020-10-01pipe: remove pipe_wait() and fix wakeup race with spliceLinus Torvalds3-27/+48
The pipe splice code still used the old model of waiting for pipe IO by using a non-specific "pipe_wait()" that waited for any pipe event to happen, which depended on all pipe IO being entirely serialized by the pipe lock. So by checking the state you were waiting for, and then adding yourself to the wait queue before dropping the lock, you were guaranteed to see all the wakeups. Strictly speaking, the actual wakeups were not done under the lock, but the pipe_wait() model still worked, because since the waiter held the lock when checking whether it should sleep, it would always see the current state, and the wakeup was always done after updating the state. However, commit 0ddad21d3e99 ("pipe: use exclusive waits when reading or writing") split the single wait-queue into two, and in the process also made the "wait for event" code wait for _two_ wait queues, and that then showed a race with the wakers that were not serialized by the pipe lock. It's only splice that used that "pipe_wait()" model, so the problem wasn't obvious, but Josef Bacik reports: "I hit a hang with fstest btrfs/187, which does a btrfs send into /dev/null. This works by creating a pipe, the write side is given to the kernel to write into, and the read side is handed to a thread that splices into a file, in this case /dev/null. The box that was hung had the write side stuck here [pipe_write] and the read side stuck here [splice_from_pipe_next -> pipe_wait]. [ more details about pipe_wait() scenario ] The problem is we're doing the prepare_to_wait, which sets our state each time, however we can be woken up either with reads or writes. In the case above we race with the WRITER waking us up, and re-set our state to INTERRUPTIBLE, and thus never break out of schedule" Josef had a patch that avoided the issue in pipe_wait() by just making it set the state only once, but the deeper problem is that pipe_wait() depends on a level of synchonization by the pipe mutex that it really shouldn't. And the whole "wait for any pipe state change" model really isn't very good to begin with. So rather than trying to work around things in pipe_wait(), remove that legacy model of "wait for arbitrary pipe event" entirely, and actually create functions that wait for the pipe actually being readable or writable, and can do so without depending on the pipe lock serializing everything. Fixes: 0ddad21d3e99 ("pipe: use exclusive waits when reading or writing") Link: https://lore.kernel.org/linux-fsdevel/bfa88b5ad6f069b2b679316b9e495a970130416c.1601567868.git.josef@toxicpanda.com/ Reported-by: Josef Bacik <[email protected]> Reviewed-and-tested-by: Josef Bacik <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-10-01lib8390: Use netif_msg_init to initialize msg_enable bitsArmin Wolf1-6/+8
Use netif_msg_init() to process param settings and use only the proper initialized value of ei_local->msg_level for later processing; Signed-off-by: Armin Wolf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-01net: phy: realtek: Modify 2.5G PHY name to RTL8226Willy Liu1-19/+19
Realtek single-chip Ethernet PHY solutions can be separated as below: 10M/100Mbps: RTL8201X 1Gbps: RTL8211X 2.5Gbps: RTL8226/RTL8221X RTL8226 is the first version for realtek that compatible 2.5Gbps single PHY. Since RTL8226 is single port only, realtek changes its name to RTL8221B from the second version. PHY ID for RTL8226 is 0x001cc800 and RTL8226B/RTL8221B is 0x001cc840. RTL8125 is not a single PHY solution, it integrates PHY/MAC/PCIE bus controller and embedded memory. Signed-off-by: Willy Liu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-01caif_virtio: Remove redundant initialization of variable errJing Xiangfeng1-1/+1
After commit a8c7687bf216 ("caif_virtio: Check that vringh_config is not null"), the variable err is being initialized with '-EINVAL' that is meaningless. So remove it. Signed-off-by: Jing Xiangfeng <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-01net-sysfs: Fix inconsistent of format with argument type in net-sysfs.cYe Bin1-2/+2
Fix follow warnings: [net/core/net-sysfs.c:1161]: (warning) %u in format string (no. 1) requires 'unsigned int' but the argument type is 'int'. [net/core/net-sysfs.c:1162]: (warning) %u in format string (no. 1) requires 'unsigned int' but the argument type is 'int'. Reported-by: Hulk Robot <[email protected]> Signed-off-by: Ye Bin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-01pktgen: Fix inconsistent of format with argument type in pktgen.cYe Bin1-5/+5
Fix follow warnings: [net/core/pktgen.c:925]: (warning) %u in format string (no. 1) requires 'unsigned int' but the argument type is 'signed int'. [net/core/pktgen.c:942]: (warning) %u in format string (no. 1) requires 'unsigned int' but the argument type is 'signed int'. [net/core/pktgen.c:962]: (warning) %u in format string (no. 1) requires 'unsigned int' but the argument type is 'signed int'. [net/core/pktgen.c:984]: (warning) %u in format string (no. 1) requires 'unsigned int' but the argument type is 'signed int'. [net/core/pktgen.c:1149]: (warning) %d in format string (no. 1) requires 'int' but the argument type is 'unsigned int'. Reported-by: Hulk Robot <[email protected]> Signed-off-by: Ye Bin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-01drivers/net/wan/hdlc_fr: Correctly handle special skb->protocol valuesXie He1-47/+51
The fr_hard_header function is used to prepend the header to skbs before transmission. It is used in 3 situations: 1) When a control packet is generated internally in this driver; 2) When a user sends an skb on an Ethernet-emulating PVC device; 3) When a user sends an skb on a normal PVC device. These 3 situations need to be handled differently by fr_hard_header. Different headers should be prepended to the skb in different situations. Currently fr_hard_header distinguishes these 3 situations using skb->protocol. For situation 1 and 2, a special skb->protocol value will be assigned before calling fr_hard_header, so that it can recognize these 2 situations. All skb->protocol values other than these special ones are treated by fr_hard_header as situation 3. However, it is possible that in situation 3, the user sends an skb with one of the special skb->protocol values. In this case, fr_hard_header would incorrectly treat it as situation 1 or 2. This patch tries to solve this issue by using skb->dev instead of skb->protocol to distinguish between these 3 situations. For situation 1, skb->dev would be NULL; for situation 2, skb->dev->type would be ARPHRD_ETHER; and for situation 3, skb->dev->type would be ARPHRD_DLCI. This way fr_hard_header would be able to distinguish these 3 situations correctly regardless what skb->protocol value the user tries to use in situation 3. Cc: Krzysztof Halasa <[email protected]> Signed-off-by: Xie He <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller103-1760/+7528
Daniel Borkmann says: ==================== pull-request: bpf-next 2020-10-01 The following pull-request contains BPF updates for your *net-next* tree. We've added 90 non-merge commits during the last 8 day(s) which contain a total of 103 files changed, 7662 insertions(+), 1894 deletions(-). Note that once bpf(/net) tree gets merged into net-next, there will be a small merge conflict in tools/lib/bpf/btf.c between commit 1245008122d7 ("libbpf: Fix native endian assumption when parsing BTF") from the bpf tree and the commit 3289959b97ca ("libbpf: Support BTF loading and raw data output in both endianness") from the bpf-next tree. Correct resolution would be to stick with bpf-next, it should look like: [...] /* check BTF magic */ if (fread(&magic, 1, sizeof(magic), f) < sizeof(magic)) { err = -EIO; goto err_out; } if (magic != BTF_MAGIC && magic != bswap_16(BTF_MAGIC)) { /* definitely not a raw BTF */ err = -EPROTO; goto err_out; } /* get file size */ [...] The main changes are: 1) Add bpf_snprintf_btf() and bpf_seq_printf_btf() helpers to support displaying BTF-based kernel data structures out of BPF programs, from Alan Maguire. 2) Speed up RCU tasks trace grace periods by a factor of 50 & fix a few race conditions exposed by it. It was discussed to take these via BPF and networking tree to get better testing exposure, from Paul E. McKenney. 3) Support multi-attach for freplace programs, needed for incremental attachment of multiple XDP progs using libxdp dispatcher model, from Toke Høiland-Jørgensen. 4) libbpf support for appending new BTF types at the end of BTF object, allowing intrusive changes of prog's BTF (useful for future linking), from Andrii Nakryiko. 5) Several BPF helper improvements e.g. avoid atomic op in cookie generator and add a redirect helper into neighboring subsys, from Daniel Borkmann. 6) Allow map updates on sockmaps from bpf_iter context in order to migrate sockmaps from one to another, from Lorenz Bauer. 7) Fix 32 bit to 64 bit assignment from latest alu32 bounds tracking which caused a verifier issue due to type downgrade to scalar, from John Fastabend. 8) Follow-up on tail-call support in BPF subprogs which optimizes x64 JIT prologue and epilogue sections, from Maciej Fijalkowski. 9) Add an option to perf RB map to improve sharing of event entries by avoiding remove- on-close behavior. Also, add BPF_PROG_TEST_RUN for raw_tracepoint, from Song Liu. 10) Fix a crash in AF_XDP's socket_release when memory allocation for UMEMs fails, from Magnus Karlsson. ==================== Signed-off-by: David S. Miller <[email protected]>
2020-10-01Merge tag 'iommu-fixes-v5.9-rc7' of ↵Linus Torvalds3-50/+18
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - Fix a device reference counting bug in the Exynos IOMMU driver. - Lockdep fix for the Intel VT-d driver. - Fix a bug in the AMD IOMMU driver which caused corruption of the IVRS ACPI table and caused IOMMU driver initialization failures in kdump kernels. * tag 'iommu-fixes-v5.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/vt-d: Fix lockdep splat in iommu_flush_dev_iotlb() iommu/amd: Fix the overwritten field in IVMD header iommu/exynos: add missing put_device() call in exynos_iommu_of_xlate()
2020-10-01Merge branch 'net-ravb-Add-support-for-explicit-internal-clock-delay-cDavid S. Miller5-147/+320
onfiguration' Geert Uytterhoeven says: ==================== net/ravb: Add support for explicit internal clock delay configuration Some Renesas EtherAVB variants support internal clock delay configuration, which can add larger delays than the delays that are typically supported by the PHY (using an "rgmii-*id" PHY mode, and/or "[rt]xc-skew-ps" properties). Historically, the EtherAVB driver configured these delays based on the "rgmii-*id" PHY mode. This caused issues with PHY drivers that implement PHY internal delays properly[1]. Hence a backwards-compatible workaround was added by masking the PHY mode[2]. This patch series implements the next step of the plan outlined in [3], and adds proper support for explicit configuration of the MAC internal clock delays using new "[rt]x-internal-delay-ps" properties. If none of these properties is present, the driver falls back to the old handling. This can be considered the MAC counterpart of commit 9150069bf5fc0e86 ("dt-bindings: net: Add tx and rx internal delays"), which applies to the PHY. Note that unlike commit 92252eec913b2dd5 ("net: phy: Add a helper to return the index for of the internal delay"), no helpers are provided to parse the DT properties, as so far there is a single user only, which supports only zero or a single fixed value. Of course such helpers can be added later, when the need arises, or when deemed useful otherwise. This series consists of 3 parts: 1. DT binding updates documenting the new properties, for both the generic ethernet-controller and the EtherAVB-specific bindings, 2. Conversion to json-schema of the Renesas EtherAVB DT bindings. Technically, the conversion is independent of all of the above. I included it in this series, as it shows how all sanity checks on "[rt]x-internal-delay-ps" values are implemented as DT binding checks, 3. EtherAVB driver update implementing support for the new properties. Given Rob has provided his acks for the DT binding updates, all of this can be merged through net-next. Changes compared to v3[4]: - Add Reviewed-by, - Drop the DT updates, as they will be merged through renesas-devel and arm-soc, and have a hard dependency on this series. Changes compared to v2[5]: - Update recently added board DTS files, - Add Reviewed-by. Changes compared to v1[6]: - Added "[PATCH 1/7] dt-bindings: net: ethernet-controller: Add internal delay properties", - Replace "renesas,[rt]xc-delay-ps" by "[rt]x-internal-delay-ps", - Incorporated EtherAVB DT binding conversion to json-schema, - Add Reviewed-by. Impacted, tested: - Salvator-X(S) with R-Car H3 ES1.0 and ES2.0, M3-W, and M3-N. Not impacted, tested: - Ebisu with R-Car E3. Impacted, not tested: - Salvator-X(S) with other SoC variants, - ULCB with R-Car H3/M3-W/M3-N variants, - V3MSK and Eagle with R-Car V3M, - Draak with R-Car V3H, - HiHope RZ/G2[MN] with RZ/G2M or RZ/G2N, - Beacon EmbeddedWorks RZ/G2M Development Kit. To ease testing, I have pushed this series and the DT updates to the topic/ravb-internal-clock-delays-v4 branch of my renesas-drivers repository at git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers.git. Thanks for applying! References: [1] Commit bcf3440c6dd78bfe ("net: phy: micrel: add phy-mode support for the KSZ9031 PHY") [2] Commit 9b23203c32ee02cd ("ravb: Mask PHY mode to avoid inserting delays twice"). https://lore.kernel.org/r/[email protected]/ [3] https://lore.kernel.org/r/CAMuHMdU+MR-2tr3-pH55G0GqPG9HwH3XUd=8HZxprFDMGQeWUw@mail.gmail.com/ [4] https://lore.kernel.org/linux-devicetree/[email protected]/ [5] https://lore.kernel.org/linux-devicetree/[email protected]/ [6] https://lore.kernel.org/linux-devicetree/[email protected]/ ==================== Signed-off-by: David S. Miller <[email protected]>
2020-10-01ravb: Add support for explicit internal clock delay configurationGeert Uytterhoeven2-9/+28
Some EtherAVB variants support internal clock delay configuration, which can add larger delays than the delays that are typically supported by the PHY (using an "rgmii-*id" PHY mode, and/or "[rt]xc-skew-ps" properties). Historically, the EtherAVB driver configured these delays based on the "rgmii-*id" PHY mode. This caused issues with PHY drivers that implement PHY internal delays properly[1]. Hence a backwards-compatible workaround was added by masking the PHY mode[2]. Add proper support for explicit configuration of the MAC internal clock delays using the new "[rt]x-internal-delay-ps" properties. Fall back to the old handling if none of these properties is present. [1] Commit bcf3440c6dd78bfe ("net: phy: micrel: add phy-mode support for the KSZ9031 PHY") [2] Commit 9b23203c32ee02cd ("ravb: Mask PHY mode to avoid inserting delays twice"). Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Sergei Shtylyov <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2020-10-01ravb: Split delay handling in parsing and applyingGeert Uytterhoeven2-6/+19
Currently, full delay handling is done in both the probe and resume paths. Split it in two parts, so the resume path doesn't have to redo the parsing part over and over again. Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Sergei Shtylyov <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>