aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-03-18netfilter: nftables: skip hook overlap logic if flowtable is stalePablo Neira Ayuso1-0/+3
If the flowtable has been previously removed in this batch, skip the hook overlap checks. This fixes spurious EEXIST errors when removing and adding the flowtable in the same batch. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2021-03-18netfilter: flowtable: Make sure GC works periodically in idle systemYinjun Zhang1-1/+1
Currently flowtable's GC work is initialized as deferrable, which means GC cannot work on time when system is idle. So the hardware offloaded flow may be deleted for timeout, since its used time is not timely updated. Resolve it by initializing the GC work as delayed work instead of deferrable. Fixes: c29f74e0df7a ("netfilter: nf_flow_table: hardware offload support") Signed-off-by: Yinjun Zhang <[email protected]> Signed-off-by: Louis Peens <[email protected]> Signed-off-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2021-03-18netfilter: nftables: allow to update flowtable flagsPablo Neira Ayuso2-0/+18
Honor flowtable flags from the control update path. Disallow disabling to toggle hardware offload support though. Fixes: 8bb69f3b2918 ("netfilter: nf_tables: add flowtable offload control plane") Signed-off-by: Pablo Neira Ayuso <[email protected]>
2021-03-18netfilter: nftables: report EOPNOTSUPP on unsupported flowtable flagsPablo Neira Ayuso1-1/+3
Error was not set accordingly. Fixes: 8bb69f3b2918 ("netfilter: nf_tables: add flowtable offload control plane") Signed-off-by: Pablo Neira Ayuso <[email protected]>
2021-03-18netfilter: conntrack: Fix gre tunneling over ipv6Ludovic Senecaux1-3/+0
This fix permits gre connections to be tracked within ip6tables rules Signed-off-by: Ludovic Senecaux <[email protected]> Acked-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2021-03-15netfilter: ctnetlink: fix dump of the expect mask attributeFlorian Westphal1-0/+1
Before this change, the mask is never included in the netlink message, so "conntrack -E expect" always prints 0.0.0.0. In older kernels the l3num callback struct was passed as argument, based on tuple->src.l3num. After the l3num indirection got removed, the call chain is based on m.src.l3num, but this value is 0xffff. Init l3num to the correct value. Fixes: f957be9d349a3 ("netfilter: conntrack: remove ctnetlink callbacks from l3 protocol trackers") Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2021-03-15netfilter: x_tables: Use correct memory barriers.Mark Tomlinson2-2/+2
When a new table value was assigned, it was followed by a write memory barrier. This ensured that all writes before this point would complete before any writes after this point. However, to determine whether the rules are unused, the sequence counter is read. To ensure that all writes have been done before these reads, a full memory barrier is needed, not just a write memory barrier. The same argument applies when incrementing the counter, before the rules are read. Changing to using smp_mb() instead of smp_wmb() fixes the kernel panic reported in cc00bcaa5899 (which is still present), while still maintaining the same speed of replacing tables. The smb_mb() barriers potentially slow the packet path, however testing has shown no measurable change in performance on a 4-core MIPS64 platform. Fixes: 7f5c6d4f665b ("netfilter: get rid of atomic ops in fast path") Signed-off-by: Mark Tomlinson <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2021-03-15Revert "netfilter: x_tables: Switch synchronization to RCU"Mark Tomlinson5-40/+56
This reverts commit cc00bcaa589914096edef7fb87ca5cee4a166b5c. This (and the preceding) patch basically re-implemented the RCU mechanisms of patch 784544739a25. That patch was replaced because of the performance problems that it created when replacing tables. Now, we have the same issue: the call to synchronize_rcu() makes replacing tables slower by as much as an order of magnitude. Prior to using RCU a script calling "iptables" approx. 200 times was taking 1.16s. With RCU this increased to 11.59s. Revert these patches and fix the issue in a different way. Signed-off-by: Mark Tomlinson <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2021-03-15Revert "netfilter: x_tables: Update remaining dereference to RCU"Mark Tomlinson3-3/+3
This reverts commit 443d6e86f821a165fae3fc3fc13086d27ac140b1. This (and the following) patch basically re-implemented the RCU mechanisms of patch 784544739a25. That patch was replaced because of the performance problems that it created when replacing tables. Now, we have the same issue: the call to synchronize_rcu() makes replacing tables slower by as much as an order of magnitude. Revert these patches and fix the issue in a different way. Signed-off-by: Mark Tomlinson <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2021-03-14flow_dissector: fix byteorder of dissected ICMP IDAlexander Lobakin1-1/+1
flow_dissector_key_icmp::id is of type u16 (CPU byteorder), ICMP header has its ID field in network byteorder obviously. Sparse says: net/core/flow_dissector.c:178:43: warning: restricted __be16 degrades to integer Convert ID value to CPU byteorder when storing it into flow_dissector_key_icmp. Fixes: 5dec597e5cd0 ("flow_dissector: extract more ICMP information") Signed-off-by: Alexander Lobakin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-14net: qrtr: fix a kernel-infoleak in qrtr_recvmsg()Eric Dumazet1-0/+5
struct sockaddr_qrtr has a 2-byte hole, and qrtr_recvmsg() currently does not clear it before copying kernel data to user space. It might be too late to name the hole since sockaddr_qrtr structure is uapi. BUG: KMSAN: kernel-infoleak in kmsan_copy_to_user+0x9c/0xb0 mm/kmsan/kmsan_hooks.c:249 CPU: 0 PID: 29705 Comm: syz-executor.3 Not tainted 5.11.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x21c/0x280 lib/dump_stack.c:120 kmsan_report+0xfb/0x1e0 mm/kmsan/kmsan_report.c:118 kmsan_internal_check_memory+0x202/0x520 mm/kmsan/kmsan.c:402 kmsan_copy_to_user+0x9c/0xb0 mm/kmsan/kmsan_hooks.c:249 instrument_copy_to_user include/linux/instrumented.h:121 [inline] _copy_to_user+0x1ac/0x270 lib/usercopy.c:33 copy_to_user include/linux/uaccess.h:209 [inline] move_addr_to_user+0x3a2/0x640 net/socket.c:237 ____sys_recvmsg+0x696/0xd50 net/socket.c:2575 ___sys_recvmsg net/socket.c:2610 [inline] do_recvmmsg+0xa97/0x22d0 net/socket.c:2710 __sys_recvmmsg net/socket.c:2789 [inline] __do_sys_recvmmsg net/socket.c:2812 [inline] __se_sys_recvmmsg+0x24a/0x410 net/socket.c:2805 __x64_sys_recvmmsg+0x62/0x80 net/socket.c:2805 do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x465f69 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f43659d6188 EFLAGS: 00000246 ORIG_RAX: 000000000000012b RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 0000000000465f69 RDX: 0000000000000008 RSI: 0000000020003e40 RDI: 0000000000000003 RBP: 00000000004bfa8f R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000010060 R11: 0000000000000246 R12: 000000000056bf60 R13: 0000000000a9fb1f R14: 00007f43659d6300 R15: 0000000000022000 Local variable ----addr@____sys_recvmsg created at: ____sys_recvmsg+0x168/0xd50 net/socket.c:2550 ____sys_recvmsg+0x168/0xd50 net/socket.c:2550 Bytes 2-3 of 12 are uninitialized Memory access of size 12 starts at ffff88817c627b40 Data copied to user address 0000000020000140 Fixes: bdabad3e363d ("net: Add Qualcomm IPC router") Signed-off-by: Eric Dumazet <[email protected]> Cc: Courtney Cavin <[email protected]> Reported-by: syzbot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-14net: arcnet: com20020 fix error handlingTong Zhang1-15/+19
There are two issues when handling error case in com20020pci_probe() 1. priv might be not initialized yet when calling com20020pci_remove() from com20020pci_probe(), since the priv is set at the very last but it can jump to error handling in the middle and priv remains NULL. 2. memory leak - the net device is allocated in alloc_arcdev but not properly released if error happens in the middle of the big for loop [ 1.529110] BUG: kernel NULL pointer dereference, address: 0000000000000008 [ 1.531447] RIP: 0010:com20020pci_remove+0x15/0x60 [com20020_pci] [ 1.536805] Call Trace: [ 1.536939] com20020pci_probe+0x3f2/0x48c [com20020_pci] [ 1.537226] local_pci_probe+0x48/0x80 [ 1.539918] com20020pci_init+0x3f/0x1000 [com20020_pci] Signed-off-by: Tong Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-13devlink: fix typo in documentationEva Dengler2-3/+3
This commit fixes three spelling typos in devlink-dpipe.rst and devlink-port.rst. Signed-off-by: Eva Dengler <[email protected]> Acked-by: Randy Dunlap <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-13net: ipa: terminate message handler arraysAlex Elder1-0/+2
When a QMI handle is initialized, an array of message handler structures is provided, defining how any received message should be handled based on its type and message ID. The QMI core code traverses this array when a message arrives and calls the function associated with the (type, msg_id) found in the array. The array is supposed to be terminated with an empty (all zero) entry though. Without it, an unsupported message will cause the QMI core code to go past the end of the array. Fix this bug, by properly terminating the message handler arrays provided when QMI handles are set up by the IPA driver. Fixes: 530f9216a9537 ("soc: qcom: ipa: AP/modem communications") Reported-by: Sujit Kautkar <[email protected]> Signed-off-by: Alex Elder <[email protected]> Reviewed-by: Bjorn Andersson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-12Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/gitDavid S. Miller7-44/+57
/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-03-12 This series contains updates to ice, i40e, ixgbe and igb drivers. Magnus adjusts the return value for xsk allocation for ice. This fixes reporting of napi work done and matches the behavior of other Intel NIC drivers for xsk allocations. Maciej moves storing of the rx_offset value to after the build_skb flag is set as this flag affects the offset value for ice, i40e, and ixgbe. Li RongQing resolves an issue where an Rx buffer can be reused prematurely with XDP redirect for igb. ====================
2021-03-12ibmvnic: update MAINTAINERSLijun Pan1-0/+1
Tom wrote most of the driver code and his experience is valuable to us. Add him as a Reviewer so that patches will be Cc'ed and reviewed by him. Signed-off-by: Lijun Pan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-12selftests: mptcp: Restore packet capture option in join testsMat Martineau1-10/+20
The join self tests previously used the '-c' command line option to enable creation of pcap files for the tests that run, but the change to allow running a subset of the join tests made overlapping use of that option. Restore the capture functionality with '-c' and move the syncookie test option to '-k'. Fixes: 1002b89f23ea ("selftests: mptcp: add command line arguments for mptcp_join.sh") Acked-and-tested-by: Geliang Tang <[email protected]> Co-developed-by: Matthieu Baerts <[email protected]> Signed-off-by: Matthieu Baerts <[email protected]> Signed-off-by: Mat Martineau <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-12net: dsa: mt7530: setup core clock even in TRGMII modeIlya Lipnitskiy1-27/+25
A recent change to MIPS ralink reset logic made it so mt7530 actually resets the switch on platforms such as mt7621 (where bit 2 is the reset line for the switch). That exposed an issue where the switch would not function properly in TRGMII mode after a reset. Reconfigure core clock in TRGMII mode to fix the issue. Tested on Ubiquiti ER-X (MT7621) with TRGMII mode enabled. Fixes: 3f9ef7785a9c ("MIPS: ralink: manage low reset lines") Signed-off-by: Ilya Lipnitskiy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-12mptcp: fix bit MPTCP_PUSH_PENDING testsDan Carpenter1-2/+2
The MPTCP_PUSH_PENDING define is 6 and these tests should be testing if BIT(6) is set. Fixes: c2e6048fa1cf ("mptcp: fix race in release_cb") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Matthieu Baerts <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-12net: phy: broadcom: Fix RGMII delays for BCM50160 and BCM50610MFlorian Fainelli1-0/+4
The PHY driver entry for BCM50160 and BCM50610M calls bcm54xx_config_init() but does not call bcm54xx_config_clock_delay() in order to configuration appropriate clock delays on the PHY, fix that. Fixes: 733336262b28 ("net: phy: Allow BCM5481x PHYs to setup internal TX/RX clock delay") Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-12ftgmac100: Restart MAC HW onceDylan Hung1-0/+1
The interrupt handler may set the flag to reset the mac in the future, but that flag is not cleared once the reset has occurred. Fixes: 10cbd6407609 ("ftgmac100: Rework NAPI & interrupts handling") Signed-off-by: Dylan Hung <[email protected]> Acked-by: Benjamin Herrenschmidt <[email protected]> Reviewed-by: Joel Stanley <[email protected]> Signed-off-by: Joel Stanley <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-12net: axienet: Fix probe error cleanupRobert Hancock1-12/+25
The driver did not always clean up all allocated resources when probe failed. Fix the probe cleanup path to clean up everything that was allocated. Fixes: 57baf8cc70ea ("net: axienet: Handle deferred probe on clock properly") Signed-off-by: Robert Hancock <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-12net: correct sk_acceptq_is_full()liuyacan1-1/+1
The "backlog" argument in listen() specifies the maximom length of pending connections, so the accept queue should be considered full if there are exactly "backlog" elements. Signed-off-by: liuyacan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-12Revert "net: bonding: fix error return code of bond_neigh_init()"David S. Miller1-6/+2
This reverts commit 2055a99da8a253a357bdfd359b3338ef3375a26c. This change rejects legitimate configurations. A slave doesn't need to exist nor implement ndo_slave_setup. Signed-off-by: David S. Miller <[email protected]>
2021-03-12igb: avoid premature Rx buffer reuseLi RongQing1-7/+15
Igb needs a similar fix as commit 75aab4e10ae6a ("i40e: avoid premature Rx buffer reuse") The page recycle code, incorrectly, relied on that a page fragment could not be freed inside xdp_do_redirect(). This assumption leads to that page fragments that are used by the stack/XDP redirect can be reused and overwritten. To avoid this, store the page count prior invoking xdp_do_redirect(). Longer explanation: Intel NICs have a recycle mechanism. The main idea is that a page is split into two parts. One part is owned by the driver, one part might be owned by someone else, such as the stack. t0: Page is allocated, and put on the Rx ring +--------------- used by NIC ->| upper buffer (rx_buffer) +--------------- | lower buffer +--------------- page count == USHRT_MAX rx_buffer->pagecnt_bias == USHRT_MAX t1: Buffer is received, and passed to the stack (e.g.) +--------------- | upper buff (skb) +--------------- used by NIC ->| lower buffer (rx_buffer) +--------------- page count == USHRT_MAX rx_buffer->pagecnt_bias == USHRT_MAX - 1 t2: Buffer is received, and redirected +--------------- | upper buff (skb) +--------------- used by NIC ->| lower buffer (rx_buffer) +--------------- Now, prior calling xdp_do_redirect(): page count == USHRT_MAX rx_buffer->pagecnt_bias == USHRT_MAX - 2 This means that buffer *cannot* be flipped/reused, because the skb is still using it. The problem arises when xdp_do_redirect() actually frees the segment. Then we get: page count == USHRT_MAX - 1 rx_buffer->pagecnt_bias == USHRT_MAX - 2 From a recycle perspective, the buffer can be flipped and reused, which means that the skb data area is passed to the Rx HW ring! To work around this, the page count is stored prior calling xdp_do_redirect(). Fixes: 9cbc948b5a20 ("igb: add XDP support") Signed-off-by: Li RongQing <[email protected]> Reviewed-by: Alexander Duyck <[email protected]> Tested-by: Vishakha Jambekar <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-03-12ixgbe: move headroom initialization to ixgbe_configure_rx_ringMaciej Fijalkowski1-1/+2
ixgbe_rx_offset(), that is supposed to initialize the Rx buffer headroom, relies on __IXGBE_RX_BUILD_SKB_ENABLED flag. Currently, the callsite of mentioned function is placed incorrectly within ixgbe_setup_rx_resources() where Rx ring's build skb flag is not set yet. This causes the XDP_REDIRECT to be partially broken due to inability to create xdp_frame in the headroom space, as the headroom is 0. Fix this by moving ixgbe_rx_offset() to ixgbe_configure_rx_ring() after the flag setting, which happens to be set in ixgbe_set_rx_buffer_len. Fixes: c0d4e9d223c5 ("ixgbe: store the result of ixgbe_rx_offset() onto ixgbe_ring") Signed-off-by: Maciej Fijalkowski <[email protected]> Tested-by: Vishakha Jambekar <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-03-12ice: move headroom initialization to ice_setup_rx_ctxMaciej Fijalkowski2-17/+18
ice_rx_offset(), that is supposed to initialize the Rx buffer headroom, relies on ICE_RX_FLAGS_RING_BUILD_SKB flag as well as XDP prog presence. Currently, the callsite of mentioned function is placed incorrectly within ice_setup_rx_ring() where Rx ring's build skb flag is not set yet. This causes the XDP_REDIRECT to be partially broken due to inability to create xdp_frame in the headroom space, as the headroom is 0. Fix this by moving ice_rx_offset() to ice_setup_rx_ctx() after the flag setting. Fixes: f1b1f409bf79 ("ice: store the result of ice_rx_offset() onto ice_ring") Signed-off-by: Maciej Fijalkowski <[email protected]> Tested-by: Kiran Bhandare <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-03-12i40e: move headroom initialization to i40e_configure_rx_ringMaciej Fijalkowski2-12/+13
i40e_rx_offset(), that is supposed to initialize the Rx buffer headroom, relies on I40E_RXR_FLAGS_BUILD_SKB_ENABLED flag. Currently, the callsite of mentioned function is placed incorrectly within i40e_setup_rx_descriptors() where Rx ring's build skb flag is not set yet. This causes the XDP_REDIRECT to be partially broken due to inability to create xdp_frame in the headroom space, as the headroom is 0. For the record, below is the call graph: i40e_vsi_open i40e_vsi_setup_rx_resources i40e_setup_rx_descriptors i40e_rx_offset() <-- sets offset to 0 as build_skb flag is set below i40e_vsi_configure_rx i40e_configure_rx_ring set_ring_build_skb_enabled(ring) <-- set build_skb flag Fix this by moving i40e_rx_offset() to i40e_configure_rx_ring() after the flag setting. Fixes: f7bb0d71d658 ("i40e: store the result of i40e_rx_offset() onto i40e_ring") Reported-by: Jesper Dangaard Brouer <[email protected]> Co-developed-by: Jesper Dangaard Brouer <[email protected]> Signed-off-by: Jesper Dangaard Brouer <[email protected]> Signed-off-by: Maciej Fijalkowski <[email protected]> Acked-by: Jesper Dangaard Brouer <[email protected]> Tested-by: Jesper Dangaard Brouer <[email protected]> Tested-by: Kiran Bhandare <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-03-12ice: fix napi work done reporting in xsk pathMagnus Karlsson2-7/+9
Fix the wrong napi work done reporting in the xsk path of the ice driver. The code in the main Rx processing loop was written to assume that the buffer allocation code returns true if all allocations where successful and false if not. In contrast with all other Intel NIC xsk drivers, the ice_alloc_rx_bufs_zc() has the inverted logic messing up the work done reporting in the napi loop. This can be fixed either by inverting the return value from ice_alloc_rx_bufs_zc() in the function that uses this in an incorrect way, or by changing the return value of ice_alloc_rx_bufs_zc(). We chose the latter as it makes all the xsk allocation functions for Intel NICs behave in the same way. My guess is that it was this unexpected discrepancy that gave rise to this bug in the first place. Fixes: 5bb0c4b5eb61 ("ice, xsk: Move Rx allocation out of while-loop") Reported-by: Maciej Fijalkowski <[email protected]> Signed-off-by: Magnus Karlsson <[email protected]> Tested-by: Kiran Bhandare <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-03-11Merge branch 'htb-fixes'David S. Miller1-6/+13
Maxim Mikityanskiy says: ==================== Bugfixes for HTB The HTB offload feature introduced a few bugs in HTB. One affects the non-offload mode, preventing attaching qdiscs to HTB classes, and the other affects the error flow, when the netdev doesn't support the offload, but it was requested. This short series fixes them. ==================== Acked-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-11sch_htb: Fix offload cleanup in htb_destroy on htb_init failureMaxim Mikityanskiy1-6/+9
htb_init may fail to do the offload if it's not supported or if a runtime error happens when allocating direct qdiscs. In those cases TC_HTB_CREATE command is not sent to the driver, however, htb_destroy gets called anyway and attempts to send TC_HTB_DESTROY. It shouldn't happen, because the driver didn't receive TC_HTB_CREATE, and also because the driver may not support ndo_setup_tc at all, while q->offload is true, and htb_destroy mistakenly thinks the offload is supported. Trying to call ndo_setup_tc in the latter case will lead to a NULL pointer dereference. This commit fixes the issues with htb_destroy by deferring assignment of q->offload until after the TC_HTB_CREATE command. The necessary cleanup of the offload entities is already done in htb_init. Reported-by: [email protected] Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload") Suggested-by: Eric Dumazet <[email protected]> Signed-off-by: Maxim Mikityanskiy <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-11sch_htb: Fix select_queue for non-offload modeMaxim Mikityanskiy1-0/+4
htb_select_queue assumes it's always the offload mode, and it ends up in calling ndo_setup_tc without any checks. It may lead to a NULL pointer dereference if ndo_setup_tc is not implemented, or to an error returned from the driver, which will prevent attaching qdiscs to HTB classes in the non-offload mode. This commit fixes the bug by adding the missing check to htb_select_queue. In the non-offload mode it will return sch->dev_queue, mimicking tc_modify_qdisc's behavior for the case where select_queue is not implemented. Reported-by: [email protected] Fixes: d03b195b5aa0 ("sch_htb: Hierarchical QoS hardware offload") Signed-off-by: Maxim Mikityanskiy <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-11net: phy: broadcom: Add power down exit reset state delayFlorian Fainelli1-0/+5
Per the datasheet, when we clear the power down bit, the PHY remains in an internal reset state for 40us and then resume normal operation. Account for that delay to avoid any issues in the future if genphy_resume() changes. Fixes: fe26821fa614 ("net: phy: broadcom: Wire suspend/resume for BCM54810") Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-11mISDN: fix crash in fritzpciTong Zhang1-1/+1
setup_fritz() in avmfritz.c might fail with -EIO and in this case the isac.type and isac.write_reg is not initialized and remains 0(NULL). A subsequent call to isac_release() will dereference isac->write_reg and crash. [ 1.737444] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 1.737809] #PF: supervisor instruction fetch in kernel mode [ 1.738106] #PF: error_code(0x0010) - not-present page [ 1.738378] PGD 0 P4D 0 [ 1.738515] Oops: 0010 [#1] SMP NOPTI [ 1.738711] CPU: 0 PID: 180 Comm: systemd-udevd Not tainted 5.12.0-rc2+ #78 [ 1.739077] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-48-gd9c812dda519-p rebuilt.qemu.org 04/01/2014 [ 1.739664] RIP: 0010:0x0 [ 1.739807] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [ 1.740200] RSP: 0018:ffffc9000027ba10 EFLAGS: 00010202 [ 1.740478] RAX: 0000000000000000 RBX: ffff888102f41840 RCX: 0000000000000027 [ 1.740853] RDX: 00000000000000ff RSI: 0000000000000020 RDI: ffff888102f41800 [ 1.741226] RBP: ffffc9000027ba20 R08: ffff88817bc18440 R09: ffffc9000027b808 [ 1.741600] R10: 0000000000000001 R11: 0000000000000001 R12: ffff888102f41840 [ 1.741976] R13: 00000000fffffffb R14: ffff888102f41800 R15: ffff8881008b0000 [ 1.742351] FS: 00007fda3a38a8c0(0000) GS:ffff88817bc00000(0000) knlGS:0000000000000000 [ 1.742774] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.743076] CR2: ffffffffffffffd6 CR3: 00000001021ec000 CR4: 00000000000006f0 [ 1.743452] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 1.743828] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 1.744206] Call Trace: [ 1.744339] isac_release+0xcc/0xe0 [mISDNipac] [ 1.744582] fritzpci_probe.cold+0x282/0x739 [avmfritz] [ 1.744861] local_pci_probe+0x48/0x80 [ 1.745063] pci_device_probe+0x10f/0x1c0 [ 1.745278] really_probe+0xfb/0x420 [ 1.745471] driver_probe_device+0xe9/0x160 [ 1.745693] device_driver_attach+0x5d/0x70 [ 1.745917] __driver_attach+0x8f/0x150 [ 1.746123] ? device_driver_attach+0x70/0x70 [ 1.746354] bus_for_each_dev+0x7e/0xc0 [ 1.746560] driver_attach+0x1e/0x20 [ 1.746751] bus_add_driver+0x152/0x1f0 [ 1.746957] driver_register+0x74/0xd0 [ 1.747157] ? 0xffffffffc00d8000 [ 1.747334] __pci_register_driver+0x54/0x60 [ 1.747562] AVM_init+0x36/0x1000 [avmfritz] [ 1.747791] do_one_initcall+0x48/0x1d0 [ 1.747997] ? __cond_resched+0x19/0x30 [ 1.748206] ? kmem_cache_alloc_trace+0x390/0x440 [ 1.748458] ? do_init_module+0x28/0x250 [ 1.748669] do_init_module+0x62/0x250 [ 1.748870] load_module+0x23ee/0x26a0 [ 1.749073] __do_sys_finit_module+0xc2/0x120 [ 1.749307] ? __do_sys_finit_module+0xc2/0x120 [ 1.749549] __x64_sys_finit_module+0x1a/0x20 [ 1.749782] do_syscall_64+0x38/0x90 Signed-off-by: Tong Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-11net/qlcnic: Fix a use after free in qlcnic_83xx_get_minidump_templateLv Yunlong1-0/+3
In qlcnic_83xx_get_minidump_template, fw_dump->tmpl_hdr was freed by vfree(). But unfortunately, it is used when extended is true. Fixes: 7061b2bdd620e ("qlogic: Deletion of unnecessary checks before two function calls") Signed-off-by: Lv Yunlong <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-11Merge branch '1GbE' of ↵David S. Miller6-37/+61
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-03-11 This series contains updates to igc and e1000e drivers. Sasha adds locking to reset task to prevent race condition for igc. Muhammad fixes reporting of supported pause frame as well as advertised pause frame for Tx/Rx off for igc. Andre fixes timestamp retrieval from the wrong timer for igc. Vitaly adds locking to reset task to prevent race condition for e1000e. Dinghao Liu adds a missed check to return on error in e1000_set_d0_lplu_state_82571. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-03-11net: sock: simplify tw proto registrationTonghao Zhang1-16/+28
Introduce the new function tw_prot_init (inspired by req_prot_init) to simplify "proto_register" function. tw_prot_cleanup will take care of a partially initialized timewait_sock_ops. Signed-off-by: Tonghao Zhang <[email protected]> Reviewed-by: Alexander Duyck <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-11e1000e: Fix error handling in e1000_set_d0_lplu_state_82571Dinghao Liu1-0/+2
There is one e1e_wphy() call in e1000_set_d0_lplu_state_82571 that we have caught its return value but lack further handling. Check and terminate the execution flow just like other e1e_wphy() in this function. Fixes: bc7f75fa9788 ("[E1000E]: New pci-express e1000 driver (currently for ICH9 devices only)") Signed-off-by: Dinghao Liu <[email protected]> Acked-by: Sasha Neftin <[email protected]> Tested-by: Dvora Fuxbrumer <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-03-11e1000e: add rtnl_lock() to e1000_reset_taskVitaly Lifshits1-1/+5
A possible race condition was found in e1000_reset_task, after discovering a similar issue in igb driver via commit 024a8168b749 ("igb: reinit_locked() should be called with rtnl_lock"). Added rtnl_lock() and rtnl_unlock() to avoid this. Fixes: bc7f75fa9788 ("[E1000E]: New pci-express e1000 driver (currently for ICH9 devices only)") Suggested-by: Jakub Kicinski <[email protected]> Signed-off-by: Vitaly Lifshits <[email protected]> Tested-by: Dvora Fuxbrumer <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-03-11igc: Fix igc_ptp_rx_pktstamp()Andre Guedes2-33/+41
The comment describing the timestamps layout in the packet buffer is wrong and the code is actually retrieving the timestamp in Timer 1 reference instead of Timer 0. This hasn't been a big issue so far because hardware is configured to report both timestamps using Timer 0 (see IGC_SRRCTL register configuration in igc_ptp_enable_rx_timestamp() helper). This patch fixes the comment and the code so we retrieve the timestamp in Timer 0 reference as expected. This patch also takes the opportunity to get rid of the hw.mac.type check since it is not required. Fixes: 81b055205e8ba ("igc: Add support for RX timestamping") Signed-off-by: Andre Guedes <[email protected]> Signed-off-by: Vedang Patel <[email protected]> Signed-off-by: Jithu Joseph <[email protected]> Reviewed-by: Maciej Fijalkowski <[email protected]> Tested-by: Dvora Fuxbrumer <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-03-11igc: Fix Supported Pause Frame Link SettingMuhammad Husaini Zulkifli1-0/+3
The Supported Pause Frame always display "No" even though the Advertised pause frame showing the correct setting based on the pause parameters via ethtool. Set bit in link_ksettings to "Supported" for Pause Frame. Before output: Supported pause frame use: No Expected output: Supported pause frame use: Symmetric Fixes: 8c5ad0dae93c ("igc: Add ethtool support") Signed-off-by: Muhammad Husaini Zulkifli <[email protected]> Reviewed-by: Malli C <[email protected]> Tested-by: Dvora Fuxbrumer <[email protected]> Acked-by: Sasha Neftin <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-03-11igc: Fix Pause Frame AdvertisingMuhammad Husaini Zulkifli1-3/+1
Fix Pause Frame Advertising when getting the advertisement via ethtool. Remove setting the "advertising" bit in link_ksettings during default case when Tx and Rx are in off state with Auto Negotiate off. Below is the original output of advertisement link during Tx and Rx off: Advertised pause frame use: Symmetric Receive-only Expected output: Advertised pause frame use: No Fixes: 8c5ad0dae93c ("igc: Add ethtool support") Signed-off-by: Muhammad Husaini Zulkifli <[email protected]> Reviewed-by: Malli C <[email protected]> Acked-by: Sasha Neftin <[email protected]> Tested-by: Dvora Fuxbrumer <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-03-11igc: reinit_locked() should be called with rtnl_lockSasha Neftin1-0/+9
This commit applies to the igc_reset_task the same changes that were applied to the igb driver in commit 024a8168b749 ("igb: reinit_locked() should be called with rtnl_lock") and fix possible race in reset subtask. Fixes: 0507ef8a0372 ("igc: Add transmit and receive fastpath and interrupt handlers") Suggested-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Neftin <[email protected]> Tested-by: Dvora Fuxbrumer <[email protected]> Signed-off-by: Tony Nguyen <[email protected]>
2021-03-10net: dsa: bcm_sf2: Qualify phydev->dev_flags based on portFlorian Fainelli1-2/+4
Similar to commit 92696286f3bb37ba50e4bd8d1beb24afb759a799 ("net: bcmgenet: Set phydev->dev_flags only for internal PHYs") we need to qualify the phydev->dev_flags based on whether the port is connected to an internal or external PHY otherwise we risk having a flags collision with a completely different interpretation depending on the driver. Fixes: aa9aef77c761 ("net: dsa: bcm_sf2: communicate integrated PHY revision to PHY driver") Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-10net: dsa: b53: VLAN filtering is global to all usersFlorian Fainelli1-7/+7
The bcm_sf2 driver uses the b53 driver as a library but does not make usre of the b53_setup() function, this made it fail to inherit the vlan_filtering_is_global attribute. Fix this by moving the assignment to b53_switch_alloc() which is used by bcm_sf2. Fixes: 7228b23e68f7 ("net: dsa: b53: Let DSA handle mismatched VLAN filtering settings") Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-10net: sched: validate stab valuesEric Dumazet5-8/+20
iproute2 package is well behaved, but malicious user space can provide illegal shift values and trigger UBSAN reports. Add stab parameter to red_check_params() to validate user input. syzbot reported: UBSAN: shift-out-of-bounds in ./include/net/red.h:312:18 shift exponent 111 is too large for 64-bit type 'long unsigned int' CPU: 1 PID: 14662 Comm: syz-executor.3 Not tainted 5.12.0-rc2-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:79 [inline] dump_stack+0x141/0x1d7 lib/dump_stack.c:120 ubsan_epilogue+0xb/0x5a lib/ubsan.c:148 __ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:327 red_calc_qavg_from_idle_time include/net/red.h:312 [inline] red_calc_qavg include/net/red.h:353 [inline] choke_enqueue.cold+0x18/0x3dd net/sched/sch_choke.c:221 __dev_xmit_skb net/core/dev.c:3837 [inline] __dev_queue_xmit+0x1943/0x2e00 net/core/dev.c:4150 neigh_hh_output include/net/neighbour.h:499 [inline] neigh_output include/net/neighbour.h:508 [inline] ip6_finish_output2+0x911/0x1700 net/ipv6/ip6_output.c:117 __ip6_finish_output net/ipv6/ip6_output.c:182 [inline] __ip6_finish_output+0x4c1/0xe10 net/ipv6/ip6_output.c:161 ip6_finish_output+0x35/0x200 net/ipv6/ip6_output.c:192 NF_HOOK_COND include/linux/netfilter.h:290 [inline] ip6_output+0x1e4/0x530 net/ipv6/ip6_output.c:215 dst_output include/net/dst.h:448 [inline] NF_HOOK include/linux/netfilter.h:301 [inline] NF_HOOK include/linux/netfilter.h:295 [inline] ip6_xmit+0x127e/0x1eb0 net/ipv6/ip6_output.c:320 inet6_csk_xmit+0x358/0x630 net/ipv6/inet6_connection_sock.c:135 dccp_transmit_skb+0x973/0x12c0 net/dccp/output.c:138 dccp_send_reset+0x21b/0x2b0 net/dccp/output.c:535 dccp_finish_passive_close net/dccp/proto.c:123 [inline] dccp_finish_passive_close+0xed/0x140 net/dccp/proto.c:118 dccp_terminate_connection net/dccp/proto.c:958 [inline] dccp_close+0xb3c/0xe60 net/dccp/proto.c:1028 inet_release+0x12e/0x280 net/ipv4/af_inet.c:431 inet6_release+0x4c/0x70 net/ipv6/af_inet6.c:478 __sock_release+0xcd/0x280 net/socket.c:599 sock_close+0x18/0x20 net/socket.c:1258 __fput+0x288/0x920 fs/file_table.c:280 task_work_run+0xdd/0x1a0 kernel/task_work.c:140 tracehook_notify_resume include/linux/tracehook.h:189 [inline] Fixes: 8afa10cbe281 ("net_sched: red: Avoid illegal values") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: syzbot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-10Merge tag 'mlx5-fixes-2021-03-10' of ↵David S. Miller20-49/+106
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux mlx5-fixes-2021-03-10 Signed-off-by: David S. Miller <[email protected]>
2021-03-10net: dsa: bcm_sf2: use 2 Gbps IMP port link on BCM4908Rafał Miłecki1-1/+4
BCM4908 uses 2 Gbps link between switch and the Ethernet interface. Without this BCM4908 devices were able to achieve only 2 x ~895 Mb/s. This allows handling e.g. NAT traffic with 940 Mb/s. Signed-off-by: Rafał Miłecki <[email protected]> Acked-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-10net: pxa168_eth: Fix a potential data race in pxa168_eth_removePavel Andrianov1-1/+1
pxa168_eth_remove() firstly calls unregister_netdev(), then cancels a timeout work. unregister_netdev() shuts down a device interface and removes it from the kernel tables. If the timeout occurs in parallel, the timeout work (pxa168_eth_tx_timeout_task) performs stop and open of the device. It may lead to an inconsistent state and memory leaks. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Pavel Andrianov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-03-10macvlan: macvlan_count_rx() needs to be aware of preemptionEric Dumazet1-1/+2
macvlan_count_rx() can be called from process context, it is thus necessary to disable preemption before calling u64_stats_update_begin() syzbot was able to spot this on 32bit arch: WARNING: CPU: 1 PID: 4632 at include/linux/seqlock.h:271 __seqprop_assert include/linux/seqlock.h:271 [inline] WARNING: CPU: 1 PID: 4632 at include/linux/seqlock.h:271 __seqprop_assert.constprop.0+0xf0/0x11c include/linux/seqlock.h:269 Modules linked in: Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 4632 Comm: kworker/1:3 Not tainted 5.12.0-rc2-syzkaller #0 Hardware name: ARM-Versatile Express Workqueue: events macvlan_process_broadcast Backtrace: [<82740468>] (dump_backtrace) from [<827406dc>] (show_stack+0x18/0x1c arch/arm/kernel/traps.c:252) r7:00000080 r6:60000093 r5:00000000 r4:8422a3c4 [<827406c4>] (show_stack) from [<82751b58>] (__dump_stack lib/dump_stack.c:79 [inline]) [<827406c4>] (show_stack) from [<82751b58>] (dump_stack+0xb8/0xe8 lib/dump_stack.c:120) [<82751aa0>] (dump_stack) from [<82741270>] (panic+0x130/0x378 kernel/panic.c:231) r7:830209b4 r6:84069ea4 r5:00000000 r4:844350d0 [<82741140>] (panic) from [<80244924>] (__warn+0xb0/0x164 kernel/panic.c:605) r3:8404ec8c r2:00000000 r1:00000000 r0:830209b4 r7:0000010f [<80244874>] (__warn) from [<82741520>] (warn_slowpath_fmt+0x68/0xd4 kernel/panic.c:628) r7:81363f70 r6:0000010f r5:83018e50 r4:00000000 [<827414bc>] (warn_slowpath_fmt) from [<81363f70>] (__seqprop_assert include/linux/seqlock.h:271 [inline]) [<827414bc>] (warn_slowpath_fmt) from [<81363f70>] (__seqprop_assert.constprop.0+0xf0/0x11c include/linux/seqlock.h:269) r8:5a109000 r7:0000000f r6:a568dac0 r5:89802300 r4:00000001 [<81363e80>] (__seqprop_assert.constprop.0) from [<81364af0>] (u64_stats_update_begin include/linux/u64_stats_sync.h:128 [inline]) [<81363e80>] (__seqprop_assert.constprop.0) from [<81364af0>] (macvlan_count_rx include/linux/if_macvlan.h:47 [inline]) [<81363e80>] (__seqprop_assert.constprop.0) from [<81364af0>] (macvlan_broadcast+0x154/0x26c drivers/net/macvlan.c:291) r5:89802300 r4:8a927740 [<8136499c>] (macvlan_broadcast) from [<81365020>] (macvlan_process_broadcast+0x258/0x2d0 drivers/net/macvlan.c:317) r10:81364f78 r9:8a86d000 r8:8a9c7e7c r7:8413aa5c r6:00000000 r5:00000000 r4:89802840 [<81364dc8>] (macvlan_process_broadcast) from [<802696a4>] (process_one_work+0x2d4/0x998 kernel/workqueue.c:2275) r10:00000008 r9:8404ec98 r8:84367a02 r7:ddfe6400 r6:ddfe2d40 r5:898dac80 r4:8a86d43c [<802693d0>] (process_one_work) from [<80269dcc>] (worker_thread+0x64/0x54c kernel/workqueue.c:2421) r10:00000008 r9:8a9c6000 r8:84006d00 r7:ddfe2d78 r6:898dac94 r5:ddfe2d40 r4:898dac80 [<80269d68>] (worker_thread) from [<80271f40>] (kthread+0x184/0x1a4 kernel/kthread.c:292) r10:85247e64 r9:898dac80 r8:80269d68 r7:00000000 r6:8a9c6000 r5:89a2ee40 r4:8a97bd00 [<80271dbc>] (kthread) from [<80200114>] (ret_from_fork+0x14/0x20 arch/arm/kernel/entry-common.S:158) Exception stack(0x8a9c7fb0 to 0x8a9c7ff8) Fixes: 412ca1550cbe ("macvlan: Move broadcasts into a work queue") Signed-off-by: Eric Dumazet <[email protected]> Cc: Herbert Xu <[email protected]> Reported-by: syzbot <[email protected]> Acked-by: Herbert Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]>