aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-11-30KVM: x86/mmu: Remove spurious TLB flushes in TDP MMU zap collapsible pathSean Christopherson3-25/+11
Drop the "flush" param and return values to/from the TDP MMU's helper for zapping collapsible SPTEs. Because the helper runs with mmu_lock held for read, not write, it uses tdp_mmu_zap_spte_atomic(), and the atomic zap handles the necessary remote TLB flush. Similarly, because mmu_lock is dropped and re-acquired between zapping legacy MMUs and zapping TDP MMUs, kvm_mmu_zap_collapsible_sptes() must handle remote TLB flushes from the legacy MMU before calling into the TDP MMU. Fixes: e2209710ccc5d ("KVM: x86/mmu: Skip rmap operations if rmaps not allocated") Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-30KVM: x86/mmu: Use yield-safe TDP MMU root iter in MMU notifier unmappingSean Christopherson1-1/+1
Use the yield-safe variant of the TDP MMU iterator when handling an unmapping event from the MMU notifier, as most occurences of the event allow yielding. Fixes: e1eed5847b09 ("KVM: x86/mmu: Allow yielding during MMU notifier unmap/zap, if possible") Cc: [email protected] Signed-off-by: Sean Christopherson <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2021-11-29net: mscc: ocelot: fix missing unlock on error in ocelot_hwstamp_set()Wei Yongjun1-1/+3
Add the missing mutex_unlock before return from function ocelot_hwstamp_set() in the ocelot_setup_ptp_traps() error handling case. Fixes: 96ca08c05838 ("net: mscc: ocelot: set up traps for PTP packets") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wei Yongjun <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-29Merge tag 'rxrpc-fixes-20211129' of ↵Jakub Kicinski2-10/+18
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Leak fixes Here are a couple of fixes for leaks in AF_RXRPC: (1) Fix a leak of rxrpc_peer structs in rxrpc_look_up_bundle(). (2) Fix a leak of rxrpc_local structs in rxrpc_lookup_peer(). * tag 'rxrpc-fixes-20211129' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: rxrpc: Fix rxrpc_local leak in rxrpc_lookup_peer() rxrpc: Fix rxrpc_peer leak in rxrpc_look_up_bundle() ==================== Link: https://lore.kernel.org/r/163820097905.226370.17234085194655347888.stgit@warthog.procyon.org.uk Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-29Merge branch 'wireguard-siphash-patches-for-5-16-rc6'Jakub Kicinski16-71/+129
Jason A. Donenfeld says: ==================== wireguard/siphash patches for 5.16-rc Here's quite a largeish set of stable patches I've had queued up and testing for a number of months now: - Patch (1) squelches a sparse warning by fixing an annotation. - Patches (2), (3), and (5) are minor improvements and fixes to the test suite. - Patch (4) is part of a tree-wide cleanup to have module-specific init and exit functions. - Patch (6) fixes a an issue with dangling dst references, by having a function to release references immediately rather than deferring, and adds an associated test case to prevent this from regressing. - Patches (7) and (8) help mitigate somewhat a potential DoS on the ingress path due to the use of skb_list's locking hitting contention on multiple cores by switching to using a ring buffer and dropping packets on contention rather than locking up another core spinning. - Patch (9) switches kvzalloc to kvcalloc for better form. - Patch (10) fixes alignment traps in siphash with clang-13 (and maybe other compilers) on armv6, by switching to using the unaligned functions by default instead of the aligned functions by default. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-29siphash: use _unaligned version by defaultArnd Bergmann2-16/+10
On ARM v6 and later, we define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS because the ordinary load/store instructions (ldr, ldrh, ldrb) can tolerate any misalignment of the memory address. However, load/store double and load/store multiple instructions (ldrd, ldm) may still only be used on memory addresses that are 32-bit aligned, and so we have to use the CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS macro with care, or we may end up with a severe performance hit due to alignment traps that require fixups by the kernel. Testing shows that this currently happens with clang-13 but not gcc-11. In theory, any compiler version can produce this bug or other problems, as we are dealing with undefined behavior in C99 even on architectures that support this in hardware, see also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363. Fortunately, the get_unaligned() accessors do the right thing: when building for ARMv6 or later, the compiler will emit unaligned accesses using the ordinary load/store instructions (but avoid the ones that require 32-bit alignment). When building for older ARM, those accessors will emit the appropriate sequence of ldrb/mov/orr instructions. And on architectures that can truly tolerate any kind of misalignment, the get_unaligned() accessors resolve to the leXX_to_cpup accessors that operate on aligned addresses. Since the compiler will in fact emit ldrd or ldm instructions when building this code for ARM v6 or later, the solution is to use the unaligned accessors unconditionally on architectures where this is known to be fast. The _aligned version of the hash function is however still needed to get the best performance on architectures that cannot do any unaligned access in hardware. This new version avoids the undefined behavior and should produce the fastest hash on all architectures we support. Link: https://lore.kernel.org/linux-arm-kernel/[email protected]/ Link: https://lore.kernel.org/linux-crypto/CAK8P3a2KfmmGDbVHULWevB0hv71P2oi2ZCHEAqT=8dQfa0=cqQ@mail.gmail.com/ Reported-by: Ard Biesheuvel <[email protected]> Fixes: 2c956a60778c ("siphash: add cryptographically secure PRF") Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: Jason A. Donenfeld <[email protected]> Acked-by: Ard Biesheuvel <[email protected]> Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-29wireguard: ratelimiter: use kvcalloc() instead of kvzalloc()Gustavo A. R. Silva1-2/+2
Use 2-factor argument form kvcalloc() instead of kvzalloc(). Link: https://github.com/KSPP/linux/issues/162 Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Gustavo A. R. Silva <[email protected]> [Jason: Gustavo's link above is for KSPP, but this isn't actually a security fix, as table_size is bounded to 8192 anyway, and gcc realizes this, so the codegen comes out to be about the same.] Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-29wireguard: receive: drop handshakes if queue lock is contendedJason A. Donenfeld1-3/+13
If we're being delivered packets from multiple CPUs so quickly that the ring lock is contended for CPU tries, then it's safe to assume that the queue is near capacity anyway, so just drop the packet rather than spinning. This helps deal with multicore DoS that can interfere with data path performance. It _still_ does not completely fix the issue, but it again chips away at it. Reported-by: Streun Fabio <[email protected]> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-29wireguard: receive: use ring buffer for incoming handshakesJason A. Donenfeld5-43/+37
Apparently the spinlock on incoming_handshake's skb_queue is highly contended, and a torrent of handshake or cookie packets can bring the data plane to its knees, simply by virtue of enqueueing the handshake packets to be processed asynchronously. So, we try switching this to a ring buffer to hopefully have less lock contention. This alleviates the problem somewhat, though it still isn't perfect, so future patches will have to improve this further. However, it at least doesn't completely diminish the data plane. Reported-by: Streun Fabio <[email protected]> Reported-by: Joel Wanner <[email protected]> Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-29wireguard: device: reset peer src endpoint when netns exitsJason A. Donenfeld5-2/+57
Each peer's endpoint contains a dst_cache entry that takes a reference to another netdev. When the containing namespace exits, we take down the socket and prevent future sockets from being created (by setting creating_net to NULL), which removes that potential reference on the netns. However, it doesn't release references to the netns that a netdev cached in dst_cache might be taking, so the netns still might fail to exit. Since the socket is gimped anyway, we can simply clear all the dst_caches (by way of clearing the endpoint src), which will release all references. However, the current dst_cache_reset function only releases those references lazily. But it turns out that all of our usages of wg_socket_clear_peer_endpoint_src are called from contexts that are not exactly high-speed or bottle-necked. For example, when there's connection difficulty, or when userspace is reconfiguring the interface. And in particular for this patch, when the netns is exiting. So for those cases, it makes more sense to call dst_release immediately. For that, we add a small helper function to dst_cache. This patch also adds a test to netns.sh from Hangbin Liu to ensure this doesn't regress. Tested-by: Hangbin Liu <[email protected]> Reported-by: Xiumei Mu <[email protected]> Cc: Toke Høiland-Jørgensen <[email protected]> Cc: Paolo Abeni <[email protected]> Fixes: 900575aa33a3 ("wireguard: device: avoid circular netns references") Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-29wireguard: selftests: rename DEBUG_PI_LIST to DEBUG_PLISTLi Zhijian1-1/+1
DEBUG_PI_LIST was renamed to DEBUG_PLIST since 8e18faeac3 ("lib/plist: rename DEBUG_PI_LIST to DEBUG_PLIST"). Signed-off-by: Li Zhijian <[email protected]> Fixes: 8e18faeac3e4 ("lib/plist: rename DEBUG_PI_LIST to DEBUG_PLIST") Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-29wireguard: main: rename 'mod_init' & 'mod_exit' functions to be module-specificRandy Dunlap1-4/+4
Rename module_init & module_exit functions that are named "mod_init" and "mod_exit" so that they are unique in both the System.map file and in initcall_debug output instead of showing up as almost anonymous "mod_init". This is helpful for debugging and in determining how long certain module_init calls take to execute. Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-29wireguard: selftests: actually test for routing loopsJason A. Donenfeld1-1/+5
We previously removed the restriction on looping to self, and then added a test to make sure the kernel didn't blow up during a routing loop. The kernel didn't blow up, thankfully, but on certain architectures where skb fragmentation is easier, such as ppc64, the skbs weren't actually being discarded after a few rounds through. But the test wasn't catching this. So actually test explicitly for massive increases in tx to see if we have a routing loop. Note that the actual loop problem will need to be addressed in a different commit. Fixes: b673e24aad36 ("wireguard: socket: remove errant restriction on looping to self") Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-29wireguard: selftests: increase default dmesg log sizeJason A. Donenfeld1-0/+1
The selftests currently parse the kernel log at the end to track potential memory leaks. With these tests now reading off the end of the buffer, due to recent optimizations, some creation messages were lost, making the tests think that there was a free without an alloc. Fix this by increasing the kernel log size. Fixes: 24b70eeeb4f4 ("wireguard: use synchronize_net rather than synchronize_rcu") Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-29wireguard: allowedips: add missing __rcu annotation to satisfy sparseJason A. Donenfeld1-1/+1
A __rcu annotation got lost during refactoring, which caused sparse to become enraged. Fixes: bf7b042dc62a ("wireguard: allowedips: free empty intermediate nodes when removing single node") Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2021-11-29netfs: Adjust docs after foliationDavid Howells2-41/+58
Adjust the netfslib docs in light of the foliation changes. Also un-kdoc-mark netfs_skip_folio_read() since it's internal and isn't part of the API. Signed-off-by: David Howells <[email protected]> Reviewed-by: Jeff Layton <[email protected]> cc: Matthew Wilcox <[email protected]> cc: [email protected] cc: [email protected] Link: https://lore.kernel.org/r/163706992597.3179783.18360472879717076435.stgit@warthog.procyon.org.uk/ Signed-off-by: Linus Torvalds <[email protected]>
2021-11-29mt76: fix key pointer overwrite in mt7921s_write_txwi/mt7663_usb_sdio_write_txwiLorenzo Bianconi2-12/+10
Fix pointer overwrite in mt7921s_tx_prepare_skb and mt7663_usb_sdio_tx_prepare_skb routines since in commit '2a9e9857473b ("mt76: fix possible pktid leak") mt76_tx_status_skb_add() has been moved out of mt7921s_write_txwi()/mt7663_usb_sdio_write_txwi() overwriting hw key pointer in ieee80211_tx_info structure. Fix the issue saving key pointer before running mt76_tx_status_skb_add(). Fixes: 2a9e9857473b ("mt76: fix possible pktid leak") Tested-by: Deren Wu <[email protected]> Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/eba40c84b6d114f618e2ae486cc6d0f2e9272cf9.1638193069.git.lorenzo@kernel.org
2021-11-29rxrpc: Fix rxrpc_local leak in rxrpc_lookup_peer()Eiichi Tsukata1-5/+9
Need to call rxrpc_put_local() for peer candidate before kfree() as it holds a ref to rxrpc_local. [DH: v2: Changed to abstract the peer freeing code out into a function] Fixes: 9ebeddef58c4 ("rxrpc: rxrpc_peer needs to hold a ref on the rxrpc_local record") Signed-off-by: Eiichi Tsukata <[email protected]> Signed-off-by: David Howells <[email protected]> Reviewed-by: Marc Dionne <[email protected]> cc: [email protected] Link: https://lore.kernel.org/all/[email protected]/ # v1
2021-11-29rxrpc: Fix rxrpc_peer leak in rxrpc_look_up_bundle()Eiichi Tsukata1-5/+9
Need to call rxrpc_put_peer() for bundle candidate before kfree() as it holds a ref to rxrpc_peer. [DH: v2: Changed to abstract out the bundle freeing code into a function] Fixes: 245500d853e9 ("rxrpc: Rewrite the client connection manager") Signed-off-by: Eiichi Tsukata <[email protected]> Signed-off-by: David Howells <[email protected]> Reviewed-by: Marc Dionne <[email protected]> cc: [email protected] Link: https://lore.kernel.org/r/[email protected]/ # v1
2021-11-29ipv6: fix memory leak in fib6_rule_suppressmsizanoen14-4/+7
The kernel leaks memory when a `fib` rule is present in IPv6 nftables firewall rules and a suppress_prefix rule is present in the IPv6 routing rules (used by certain tools such as wg-quick). In such scenarios, every incoming packet will leak an allocation in `ip6_dst_cache` slab cache. After some hours of `bpftrace`-ing and source code reading, I tracked down the issue to ca7a03c41753 ("ipv6: do not free rt if FIB_LOOKUP_NOREF is set on suppress rule"). The problem with that change is that the generic `args->flags` always have `FIB_LOOKUP_NOREF` set[1][2] but the IPv6-specific flag `RT6_LOOKUP_F_DST_NOREF` might not be, leading to `fib6_rule_suppress` not decreasing the refcount when needed. How to reproduce: - Add the following nftables rule to a prerouting chain: meta nfproto ipv6 fib saddr . mark . iif oif missing drop This can be done with: sudo nft create table inet test sudo nft create chain inet test test_chain '{ type filter hook prerouting priority filter + 10; policy accept; }' sudo nft add rule inet test test_chain meta nfproto ipv6 fib saddr . mark . iif oif missing drop - Run: sudo ip -6 rule add table main suppress_prefixlength 0 - Watch `sudo slabtop -o | grep ip6_dst_cache` to see memory usage increase with every incoming ipv6 packet. This patch exposes the protocol-specific flags to the protocol specific `suppress` function, and check the protocol-specific `flags` argument for RT6_LOOKUP_F_DST_NOREF instead of the generic FIB_LOOKUP_NOREF when decreasing the refcount, like this. [1]: https://github.com/torvalds/linux/blob/ca7a03c4175366a92cee0ccc4fec0038c3266e26/net/ipv6/fib6_rules.c#L71 [2]: https://github.com/torvalds/linux/blob/ca7a03c4175366a92cee0ccc4fec0038c3266e26/net/ipv6/fib6_rules.c#L99 Link: https://bugzilla.kernel.org/show_bug.cgi?id=215105 Fixes: ca7a03c41753 ("ipv6: do not free rt if FIB_LOOKUP_NOREF is set on suppress rule") Cc: [email protected] Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-29Merge branch 'atlantic-fixes'David S. Miller11-55/+184
Sudarsana Reddy Kalluru says: ==================== net: atlantic: 11-2021 fixes The patch series contains fixes for atlantic driver to improve support of latest AQC113 chipset. Please consider applying it to 'net' tree. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-11-29atlantic: Remove warn trace message.Sameer Saurabh1-3/+0
Remove the warn trace message - it's not a correct check here, because the function can still be called on the device in DOWN state Fixes: 508f2e3dce454 ("net: atlantic: split rx and tx per-queue stats") Signed-off-by: Sameer Saurabh <[email protected]> Signed-off-by: Sudarsana Reddy Kalluru <[email protected]> Signed-off-by: Igor Russkikh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-29atlantic: Fix statistics logic for production hardwareDmitry Bogdanov5-27/+139
B0 is the main and widespread device revision of atlantic2 HW. In the current state, driver will incorrectly fetch the statistics for this revision. Fixes: 5cfd54d7dc186 ("net: atlantic: minimal A2 fw_ops") Signed-off-by: Dmitry Bogdanov <[email protected]> Signed-off-by: Sudarsana Reddy Kalluru <[email protected]> Signed-off-by: Igor Russkikh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-29Remove Half duplex mode speed capabilities.Sameer Saurabh1-4/+1
Since Half Duplex mode has been deprecated by the firmware, driver should not advertise Half Duplex speed in ethtool support link speed values. Fixes: 071a02046c262 ("net: atlantic: A2: half duplex support") Signed-off-by: Sameer Saurabh <[email protected]> Signed-off-by: Igor Russkikh <[email protected]> Signed-off-by: Sudarsana Reddy Kalluru <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-29atlantic: Add missing DIDs and fix 115c.Nikita Danilov4-1/+27
At the late production stages new dev ids were introduced. These are now in production, so its important for the driver to recognize these. And also fix the board caps for AQC115C adapter. Fixes: b3f0c79cba206 ("net: atlantic: A2 hw_ops skeleton") Signed-off-by: Nikita Danilov <[email protected]> Signed-off-by: Sudarsana Reddy Kalluru <[email protected]> Signed-off-by: Igor Russkikh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-29atlantic: Fix to display FW bundle version instead of FW mac version.Sameer Saurabh1-3/+3
The correct way to reflect firmware version is to use bundle version. Hence populating the same instead of MAC fw version. Fixes: c1be0bf092bd2 ("net: atlantic: common functions needed for basic A2 init/deinit hw_ops") Signed-off-by: Sameer Saurabh <[email protected]> Signed-off-by: Sudarsana Reddy Kalluru <[email protected]> Signed-off-by: Igor Russkikh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-29atlatnic: enable Nbase-t speeds with base-tNikita Danilov3-19/+13
When 2.5G is advertised, N-Base should be advertised against the T-base caps. N5G is out of use in baseline code and driver should treat both 5G and N5G (and also 2.5G and N2.5G) equally from user perspective. Fixes: 5cfd54d7dc186 ("net: atlantic: minimal A2 fw_ops") Signed-off-by: Nikita Danilov <[email protected]> Signed-off-by: Sudarsana Reddy Kalluru <[email protected]> Signed-off-by: Igor Russkikh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-29atlantic: Increase delay for fw transactionsDmitry Bogdanov1-2/+5
The max waiting period (of 1 ms) while reading the data from FW shared buffer is too small for certain types of data (e.g., stats). There's a chance that FW could be updating buffer at the same time and driver would be unsuccessful in reading data. Firmware manual recommends to have 1 sec timeout to fix this issue. Fixes: 5cfd54d7dc186 ("net: atlantic: minimal A2 fw_ops") Signed-off-by: Dmitry Bogdanov <[email protected]> Signed-off-by: Sudarsana Reddy Kalluru <[email protected]> Signed-off-by: Igor Russkikh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-29net/mlx4_en: Update reported link modes for 1/10GErik Ekman1-3/+3
When link modes were initially added in commit 2c762679435dc ("net/mlx4_en: Use PTYS register to query ethtool settings") and later updated for the new ethtool API in commit 3d8f7cc78d0eb ("net: mlx4: use new ETHTOOL_G/SSETTINGS API") the only 1/10G non-baseT link modes configured were 1000baseKX, 10000baseKX4 and 10000baseKR. It looks like these got picked to represent other modes since nothing better was available. Switch to using more specific link modes added in commit 5711a98221443 ("net: ethtool: add support for 1000BaseX and missing 10G link modes"). Tested with MCX311A-XCAT connected via DAC. Before: % sudo ethtool enp3s0 Settings for enp3s0: Supported ports: [ FIBRE ] Supported link modes: 1000baseKX/Full 10000baseKR/Full Supported pause frame use: Symmetric Receive-only Supports auto-negotiation: No Supported FEC modes: Not reported Advertised link modes: 1000baseKX/Full 10000baseKR/Full Advertised pause frame use: Symmetric Advertised auto-negotiation: No Advertised FEC modes: Not reported Speed: 10000Mb/s Duplex: Full Auto-negotiation: off Port: Direct Attach Copper PHYAD: 0 Transceiver: internal Supports Wake-on: d Wake-on: d Current message level: 0x00000014 (20) link ifdown Link detected: yes With this change: % sudo ethtool enp3s0 Settings for enp3s0: Supported ports: [ FIBRE ] Supported link modes: 1000baseX/Full 10000baseCR/Full 10000baseSR/Full Supported pause frame use: Symmetric Receive-only Supports auto-negotiation: No Supported FEC modes: Not reported Advertised link modes: 1000baseX/Full 10000baseCR/Full 10000baseSR/Full Advertised pause frame use: Symmetric Advertised auto-negotiation: No Advertised FEC modes: Not reported Speed: 10000Mb/s Duplex: Full Auto-negotiation: off Port: Direct Attach Copper PHYAD: 0 Transceiver: internal Supports Wake-on: d Wake-on: d Current message level: 0x00000014 (20) link ifdown Link detected: yes Tested-by: Michael Stapelberg <[email protected]> Signed-off-by: Erik Ekman <[email protected]> Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-29mctp: test: fix skb free in test device txJeremy Kerr1-1/+1
In our test device, we're currently freeing skbs in the transmit path with kfree(), rather than kfree_skb(). This change uses the correct kfree_skb() instead. Fixes: ded21b722995 ("mctp: Add test utils") Reported-by: kernel test robot <[email protected]> Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Jeremy Kerr <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-29net/tls: Fix authentication failure in CCM modeTianjia Zhang1-2/+2
When the TLS cipher suite uses CCM mode, including AES CCM and SM4 CCM, the first byte of the B0 block is flags, and the real IV starts from the second byte. The XOR operation of the IV and rec_seq should be skip this byte, that is, add the iv_offset. Fixes: f295b3ae9f59 ("net/tls: Add support of AES128-CCM based ciphers") Signed-off-by: Tianjia Zhang <[email protected]> Cc: Vakul Garg <[email protected]> Cc: [email protected] # v5.2+ Signed-off-by: David S. Miller <[email protected]>
2021-11-29Merge branch 'mpls-notifications'David S. Miller2-36/+63
Benjamin Poirier says: ==================== net: mpls: Netlink notification fixes fix missing or inaccurate route notifications when devices used in nexthops are deleted. ==================== Signed-off-by: David S. Miller <[email protected]>
2021-11-29net: mpls: Remove rcu protection from nh_devBenjamin Poirier2-25/+16
Following the previous commit, nh_dev can no longer be accessed and modified concurrently. Signed-off-by: Benjamin Poirier <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-29net: mpls: Fix notifications when deleting a deviceBenjamin Poirier1-16/+52
There are various problems related to netlink notifications for mpls route changes in response to interfaces being deleted: * delete interface of only nexthop DELROUTE notification is missing RTA_OIF attribute * delete interface of non-last nexthop NEWROUTE notification is missing entirely * delete interface of last nexthop DELROUTE notification is missing nexthop All of these problems stem from the fact that existing routes are modified in-place before sending a notification. Restructure mpls_ifdown() to avoid changing the route in the DELROUTE cases and to create a copy in the NEWROUTE case. Fixes: f8efb73c97e2 ("mpls: multipath route support") Signed-off-by: Benjamin Poirier <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-29net: usb: lan78xx: lan78xx_phy_init(): use PHY_POLL instead of "0" if no IRQ ↵Sven Schuchmann1-1/+1
is available On most systems request for IRQ 0 will fail, phylib will print an error message and fall back to polling. To fix this set the phydev->irq to PHY_POLL if no IRQ is available. Fixes: cc89c323a30e ("lan78xx: Use irq_domain for phy interrupt from USB Int. EP") Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: Sven Schuchmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-29USB: NO_LPM quirk Lenovo Powered USB-C Travel HubOle Ernst1-0/+3
This is another branded 8153 device that doesn't work well with LPM: r8152 2-2.1:1.0 enp0s13f0u2u1: Stop submitting intr, status -71 Disable LPM to resolve the issue. Signed-off-by: Ole Ernst <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-29net: dsa: realtek-smi: fix indirect reg access for ports>3Luiz Angelo Daros de Luca1-1/+8
This switch family can have up to 8 UTP ports {0..7}. However, INDIRECT_ACCESS_ADDRESS_PHYNUM_MASK was using 2 bits instead of 3, dropping the most significant bit during indirect register reads and writes. Reading or writing ports 4, 5, 6, and 7 registers was actually manipulating, respectively, ports 0, 1, 2, and 3 registers. This is not sufficient but necessary to support any variant with more than 4 UTP ports, like RTL8367S. rtl8365mb_phy_{read,write} will now returns -EINVAL if phy is greater than 7. Fixes: 4af2950c50c8 ("net: dsa: realtek-smi: add rtl8365mb subdriver for RTL8365MB-VC") Signed-off-by: Luiz Angelo Daros de Luca <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-29tcp: fix page frag corruption on page faultPaolo Abeni1-5/+8
Steffen reported a TCP stream corruption for HTTP requests served by the apache web-server using a cifs mount-point and memory mapping the relevant file. The root cause is quite similar to the one addressed by commit 20eb4f29b602 ("net: fix sk_page_frag() recursion from memory reclaim"). Here the nested access to the task page frag is caused by a page fault on the (mmapped) user-space memory buffer coming from the cifs file. The page fault handler performs an smb transaction on a different socket, inside the same process context. Since sk->sk_allaction for such socket does not prevent the usage for the task_frag, the nested allocation modify "under the hood" the page frag in use by the outer sendmsg call, corrupting the stream. The overall relevant stack trace looks like the following: httpd 78268 [001] 3461630.850950: probe:tcp_sendmsg_locked: ffffffff91461d91 tcp_sendmsg_locked+0x1 ffffffff91462b57 tcp_sendmsg+0x27 ffffffff9139814e sock_sendmsg+0x3e ffffffffc06dfe1d smb_send_kvec+0x28 [...] ffffffffc06cfaf8 cifs_readpages+0x213 ffffffff90e83c4b read_pages+0x6b ffffffff90e83f31 __do_page_cache_readahead+0x1c1 ffffffff90e79e98 filemap_fault+0x788 ffffffff90eb0458 __do_fault+0x38 ffffffff90eb5280 do_fault+0x1a0 ffffffff90eb7c84 __handle_mm_fault+0x4d4 ffffffff90eb8093 handle_mm_fault+0xc3 ffffffff90c74f6d __do_page_fault+0x1ed ffffffff90c75277 do_page_fault+0x37 ffffffff9160111e page_fault+0x1e ffffffff9109e7b5 copyin+0x25 ffffffff9109eb40 _copy_from_iter_full+0xe0 ffffffff91462370 tcp_sendmsg_locked+0x5e0 ffffffff91462370 tcp_sendmsg_locked+0x5e0 ffffffff91462b57 tcp_sendmsg+0x27 ffffffff9139815c sock_sendmsg+0x4c ffffffff913981f7 sock_write_iter+0x97 ffffffff90f2cc56 do_iter_readv_writev+0x156 ffffffff90f2dff0 do_iter_write+0x80 ffffffff90f2e1c3 vfs_writev+0xa3 ffffffff90f2e27c do_writev+0x5c ffffffff90c042bb do_syscall_64+0x5b ffffffff916000ad entry_SYSCALL_64_after_hwframe+0x65 The cifs filesystem rightfully sets sk_allocations to GFP_NOFS, we can avoid the nesting using the sk page frag for allocation lacking the __GFP_FS flag. Do not define an additional mm-helper for that, as this is strictly tied to the sk page frag usage. v1 -> v2: - use a stricted sk_page_frag() check instead of reordering the code (Eric) Reported-by: Steffen Froemer <[email protected]> Fixes: 5640f7685831 ("net: use a per task frag allocator") Signed-off-by: Paolo Abeni <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-29net: stmmac: Avoid DMA_CHAN_CONTROL write if no Split Header supportVincent Whitchurch1-5/+6
The driver assumes that split headers can be enabled/disabled without stopping/starting the device, so it writes DMA_CHAN_CONTROL from stmmac_set_features(). However, on my system (IP v5.10a without Split Header support), simply writing DMA_CHAN_CONTROL when DMA is running (for example, with the commands below) leads to a TX watchdog timeout. host$ socat TCP-LISTEN:1024,fork,reuseaddr - & device$ ethtool -K eth0 tso off device$ ethtool -K eth0 tso on device$ dd if=/dev/zero bs=1M count=10 | socat - TCP4:host:1024 <tx watchdog timeout> Note that since my IP is configured without Split Header support, the driver always just reads and writes the same value to the DMA_CHAN_CONTROL register. I don't have access to any platforms with Split Header support so I don't know if these writes to the DMA_CHAN_CONTROL while DMA is running actually work properly on such systems. I could not find anything in the databook that says that DMA_CHAN_CONTROL should not be written when the DMA is running. But on systems without Split Header support, there is in any case no need to call enable_sph() in stmmac_set_features() at all since SPH can never be toggled, so we can avoid the watchdog timeout there by skipping this call. Fixes: 8c6fc097a2f4acf ("net: stmmac: gmac4+: Add Split Header support") Signed-off-by: Vincent Whitchurch <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2021-11-29rt2x00: do not mark device gone on EPROTO errors during startStanislaw Gruszka1-0/+3
As reported by Exuvo is possible that we have lot's of EPROTO errors during device start i.e. firmware load. But after that device works correctly. Hence marking device gone by few EPROTO errors done by commit e383c70474db ("rt2x00: check number of EPROTO errors") caused regression - Exuvo device stop working after kernel update. To fix disable the check during device start. Link: https://lore.kernel.org/linux-wireless/[email protected]/ Reported-and-tested-by: Exuvo <[email protected]> Fixes: e383c70474db ("rt2x00: check number of EPROTO errors") Cc: [email protected] Signed-off-by: Stanislaw Gruszka <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-11-29ALSA: hda/cs8409: Set PMSG_ON earlier inside cs8409 driverStefan Binding2-0/+14
For cs8409, it is required to run Jack Detect on resume. Jack Detect on cs8409+cs42l42 requires an interrupt from cs42l42 to be sent to cs8409 which is propogated to the driver via an unsolicited event. However, the hda_codec drops unsolicited events if the power_state is not set to PMSG_ON. Which is set at the end of the resume call. This means there is a race condition between setting power_state to PMSG_ON and receiving the interrupt. To solve this, we can add an API to set the power_state earlier and call that before we start Jack Detect. This does not cause issues, since we know inside our driver that we are already initialized, and ready to handle the unsolicited events. Signed-off-by: Stefan Binding <[email protected]> Signed-off-by: Vitaly Rodionov <[email protected]> Cc: <[email protected]> # v5.15+ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2021-11-28Linux 5.16-rc3Linus Torvalds1-1/+1
2021-11-28Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds9-76/+9
Pull vhost,virtio,vdpa bugfixes from Michael Tsirkin: "Misc fixes all over the place. Revert of virtio used length validation series: the approach taken does not seem to work, breaking too many guests in the process. We'll need to do length validation using some other approach" [ This merge also ends up reverting commit f7a36b03a732 ("vsock/virtio: suppress used length validation"), which came in through the networking tree in the meantime, and was part of that whole used length validation series - Linus ] * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vdpa_sim: avoid putting an uninitialized iova_domain vhost-vdpa: clean irqs before reseting vdpa device virtio-blk: modify the value type of num in virtio_queue_rq() vhost/vsock: cleanup removing `len` variable vhost/vsock: fix incorrect used length reported to the guest Revert "virtio_ring: validate used buffer length" Revert "virtio-net: don't let virtio core to validate used length" Revert "virtio-blk: don't let virtio core to validate used length" Revert "virtio-scsi: don't let virtio core to validate used buffer length"
2021-11-28Merge tag 'x86-urgent-2021-11-28' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 build fix from Thomas Gleixner: "A single fix for a missing __init annotation of prepare_command_line()" * tag 'x86-urgent-2021-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Mark prepare_command_line() __init
2021-11-28Merge tag 'sched-urgent-2021-11-28' of ↵Linus Torvalds2-4/+7
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Thomas Gleixner: "A single scheduler fix to ensure that there is no stale KASAN shadow state left on the idle task's stack when a CPU is brought up after it was brought down before" * tag 'sched-urgent-2021-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/scs: Reset task stack state in bringup_cpu()
2021-11-28Merge tag 'perf-urgent-2021-11-28' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Thomas Gleixner: "A single fix for perf to prevent it from sending SIGTRAP to another task from a trace point event as it's not possible to deliver a synchronous signal to a different task from there" * tag 'perf-urgent-2021-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Ignore sigtrap for tracepoints destined for other tasks
2021-11-28Merge tag 'locking-urgent-2021-11-28' of ↵Linus Torvalds1-93/+89
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "Two regression fixes for reader writer semaphores: - Plug a race in the lock handoff which is caused by inconsistency of the reader and writer path and can lead to corruption of the underlying counter. - down_read_trylock() is suboptimal when the lock is contended and multiple readers trylock concurrently. That's due to the initial value being read non-atomically which results in at least two compare exchange loops. Making the initial readout atomic reduces this significantly. Whith 40 readers by 11% in a benchmark which enforces contention on mmap_sem" * tag 'locking-urgent-2021-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/rwsem: Optimize down_read_trylock() under highly contended case locking/rwsem: Make handoff bit handling more consistent
2021-11-28Merge tag 'trace-v5.16-rc2-3' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull another tracing fix from Steven Rostedt: "Fix the fix of pid filtering The setting of the pid filtering flag tested the "trace only this pid" case twice, and ignored the "trace everything but this pid" case. The 5.15 kernel does things a little differently due to the new sparse pid mask introduced in 5.16, and as the bug was discovered running the 5.15 kernel, and the first fix was initially done for that kernel, that fix handled both cases (only pid and all but pid), but the forward port to 5.16 created this bug" * tag 'trace-v5.16-rc2-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Test the 'Do not trace this pid' case in create event
2021-11-28Merge tag 'iommu-fixes-v5.16-rc2' of ↵Linus Torvalds5-17/+10
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: - Intel VT-d fixes: - Remove unused PASID_DISABLED - Fix RCU locking - Fix for the unmap_pages call-back - Rockchip RK3568 address mask fix - AMD IOMMUv2 log message clarification * tag 'iommu-fixes-v5.16-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/vt-d: Fix unmap_pages support iommu/vt-d: Fix an unbalanced rcu_read_lock/rcu_read_unlock() iommu/rockchip: Fix PAGE_DESC_HI_MASKs for RK3568 iommu/amd: Clarify AMD IOMMUv2 initialization messages iommu/vt-d: Remove unused PASID_DISABLED
2021-11-27Merge tag '5.16-rc2-ksmbd-fixes' of git://git.samba.org/ksmbdLinus Torvalds2-18/+22
Pull ksmbd fixes from Steve French: "Five ksmbd server fixes, four of them for stable: - memleak fix - fix for default data stream on filesystems that don't support xattr - error logging fix - session setup fix - minor doc cleanup" * tag '5.16-rc2-ksmbd-fixes' of git://git.samba.org/ksmbd: ksmbd: fix memleak in get_file_stream_info() ksmbd: contain default data stream even if xattr is empty ksmbd: downgrade addition info error msg to debug in smb2_get_info_sec() docs: filesystem: cifs: ksmbd: Fix small layout issues ksmbd: Fix an error handling path in 'smb2_sess_setup()'