aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2023-10-27crypto: shash - remove crypto_shash_ctx_aligned()Eric Biggers2-10/+0
crypto_shash_ctx_aligned() is no longer used, and it is useless now that shash algorithms don't support nonzero alignmasks, so remove it. Also remove crypto_tfm_ctx_aligned() which was only called by crypto_shash_ctx_aligned(). It's unlikely to be useful again, since it seems inappropriate to use cra_alignmask to represent alignment for the tfm context when it already means alignment for inputs/outputs. Signed-off-by: Eric Biggers <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2023-10-27units: Add BYTES_PER_*BITDamian Muszynski1-0/+4
There is going to be a new user of the BYTES_PER_[K/M/G]BIT definition besides possibly existing ones. Add them to the header. Signed-off-by: Damian Muszynski <[email protected]> Reviewed-by: Giovanni Cabiddu <[email protected]> Reviewed-by: Tero Kristo <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2023-10-27crypto: shash - remove crypto_shash_alignmaskEric Biggers1-6/+0
crypto_shash_alignmask() no longer has any callers, and it always returns 0 now that the shash algorithm type no longer supports nonzero alignmasks. Therefore, remove it. Signed-off-by: Eric Biggers <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2023-10-27crypto: shash - eliminate indirect call for default import and exportEric Biggers1-13/+2
Most shash algorithms don't have custom ->import and ->export functions, resulting in the memcpy() based default being used. Yet, crypto_shash_import() and crypto_shash_export() still make an indirect call, which is expensive. Therefore, change how the default import and export are called to make it so that crypto_shash_import() and crypto_shash_export() don't do an indirect call in this case. Signed-off-by: Eric Biggers <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2023-10-27net: Add MDB get device operationIdo Schimmel1-0/+4
Add MDB net device operation that will be invoked by rtnetlink code in response to received RTM_GETMDB messages. Subsequent patches will implement the operation in the bridge and VXLAN drivers. Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27bridge: add MDB get uAPI attributesIdo Schimmel1-0/+18
Add MDB get attributes that correspond to the MDB set attributes used in RTM_NEWMDB messages. Specifically, add 'MDBA_GET_ENTRY' which will hold a 'struct br_mdb_entry' and 'MDBA_GET_ENTRY_ATTRS' which will hold 'MDBE_ATTR_*' attributes that are used as indexes (source IP and source VNI). An example request will look as follows: [ struct nlmsghdr ] [ struct br_port_msg ] [ MDBA_GET_ENTRY ] struct br_mdb_entry [ MDBA_GET_ENTRY_ATTRS ] [ MDBE_ATTR_SOURCE ] struct in_addr / struct in6_addr [ MDBE_ATTR_SRC_VNI ] u32 Signed-off-by: Ido Schimmel <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27Merge tag 'thunderbolt-for-v6.7-rc1' of ↵Greg Kroah-Hartman1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt into usb-next Mika writes: thunderbolt: Changes for v6.7 merge window This includes following USB4/Thunderbolt changes for the v6.7 merge window: - Configure asymmetric link if the DisplayPort bandwidth requires so - Enable path power management packet support for USB4 v2 routers - Make the bandwidth reservations to follow the USB4 v2 connection manager guide suggestions - DisplayPort tunneling improvements - Small cleanups and improvements around the driver. All these have been in linux-next with no reported issues. * tag 'thunderbolt-for-v6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/westeri/thunderbolt: (25 commits) thunderbolt: Fix one kernel-doc comment thunderbolt: Configure asymmetric link if needed and bandwidth allows thunderbolt: Add support for asymmetric link thunderbolt: Introduce tb_switch_depth() thunderbolt: Introduce tb_for_each_upstream_port_on_path() thunderbolt: Introduce tb_port_path_direction_downstream() thunderbolt: Set path power management packet support bit for USB4 v2 routers thunderbolt: Change bandwidth reservations to comply USB4 v2 thunderbolt: Make is_gen4_link() available to the rest of the driver thunderbolt: Use weight constants in tb_usb3_consumed_bandwidth() thunderbolt: Use constants for path weight and priority thunderbolt: Add DP IN added last in the head of the list of DP resources thunderbolt: Create multiple DisplayPort tunnels if there are more DP IN/OUT pairs thunderbolt: Log NVM version of routers and retimers thunderbolt: Use tb_tunnel_xxx() log macros in tb.c thunderbolt: Expose tb_tunnel_xxx() log macros to the rest of the driver thunderbolt: Use tb_tunnel_dbg() where possible to make logging more consistent thunderbolt: Fix typo of HPD bit for Hot Plug Detect thunderbolt: Fix typo in enum tb_link_width kernel-doc thunderbolt: Fix debug log when DisplayPort adapter not available for pairing ...
2023-10-27net/tcp: Add TCP_AO_REPAIRDmitry Safonov2-0/+22
Add TCP_AO_REPAIR setsockopt(), getsockopt(). They let a user to repair TCP-AO ISNs/SNEs. Also let the user hack around when (tp->repair) is on and add ao_info on a socket in any supported state. As SNEs now can be read/written at any moment, use WRITE_ONCE()/READ_ONCE() to set/read them. Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Wire up l3index to TCP-AODmitry Safonov2-14/+15
Similarly how TCP_MD5SIG_FLAG_IFINDEX works for TCP-MD5, TCP_AO_KEYF_IFINDEX is an AO-key flag that binds that MKT to a specified by L3 ifinndex. Similarly, without this flag the key will work in the default VRF l3index = 0 for connections. To prevent AO-keys from overlapping, it's restricted to add key B for a socket that has key A, which have the same sndid/rcvid and one of the following is true: - !(A.keyflags & TCP_AO_KEYF_IFINDEX) or !(B.keyflags & TCP_AO_KEYF_IFINDEX) so that any key is non-bound to a VRF - A.l3index == B.l3index both want to work for the same VRF Additionally, it's restricted to match TCP-MD5 keys for the same peer the following way: |--------------|--------------------|----------------|---------------| | | MD5 key without | MD5 key | MD5 key | | | l3index | l3index=0 | l3index=N | |--------------|--------------------|----------------|---------------| | TCP-AO key | | | | | without | reject | reject | reject | | l3index | | | | |--------------|--------------------|----------------|---------------| | TCP-AO key | | | | | l3index=0 | reject | reject | allow | |--------------|--------------------|----------------|---------------| | TCP-AO key | | | | | l3index=N | reject | allow | reject | |--------------|--------------------|----------------|---------------| This is done with the help of tcp_md5_do_lookup_any_l3index() to reject adding AO key without TCP_AO_KEYF_IFINDEX if there's TCP-MD5 in any VRF. This is important for case where sysctl_tcp_l3mdev_accept = 1 Similarly, for TCP-AO lookups tcp_ao_do_lookup() may be used with l3index < 0, so that __tcp_ao_key_cmp() will match TCP-AO key in any VRF. Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Add static_key for TCP-AODmitry Safonov2-8/+18
Similarly to TCP-MD5, add a static key to TCP-AO that is patched out when there are no keys on a machine and dynamically enabled with the first setsockopt(TCP_AO) adds a key on any socket. The static key is as well dynamically disabled later when the socket is destructed. The lifetime of enabled static key here is the same as ao_info: it is enabled on allocation, passed over from full socket to twsk and destructed when ao_info is scheduled for destruction. Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Allow asynchronous delete for TCP-AO keys (MKTs)Dmitry Safonov1-1/+2
Delete becomes very, very fast - almost free, but after setsockopt() syscall returns, the key is still alive until next RCU grace period. Which is fine for listen sockets as userspace needs to be aware of setsockopt(TCP_AO) and accept() race and resolve it with verification by getsockopt() after TCP connection was accepted. The benchmark results (on non-loaded box, worse with more RCU work pending): > ok 33 Worst case delete 16384 keys: min=5ms max=10ms mean=6.93904ms stddev=0.263421 > ok 34 Add a new key 16384 keys: min=1ms max=4ms mean=2.17751ms stddev=0.147564 > ok 35 Remove random-search 16384 keys: min=5ms max=10ms mean=6.50243ms stddev=0.254999 > ok 36 Remove async 16384 keys: min=0ms max=0ms mean=0.0296107ms stddev=0.0172078 Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Add TCP-AO getsockopt()sDmitry Safonov2-14/+61
Introduce getsockopt(TCP_AO_GET_KEYS) that lets a user get TCP-AO keys and their properties from a socket. The user can provide a filter to match the specific key to be dumped or ::get_all = 1 may be used to dump all keys in one syscall. Add another getsockopt(TCP_AO_INFO) for providing per-socket/per-ao_info stats: packet counters, Current_key/RNext_key and flags like ::ao_required and ::accept_icmps. Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Add option for TCP-AO to (not) hash headerDmitry Safonov1-0/+5
Provide setsockopt() key flag that makes TCP-AO exclude hashing TCP header for peers that match the key. This is needed for interraction with middleboxes that may change TCP options, see RFC5925 (9.2). Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Ignore specific ICMPs for TCP-AO connectionsDmitry Safonov3-2/+14
Similarly to IPsec, RFC5925 prescribes: ">> A TCP-AO implementation MUST default to ignore incoming ICMPv4 messages of Type 3 (destination unreachable), Codes 2-4 (protocol unreachable, port unreachable, and fragmentation needed -- ’hard errors’), and ICMPv6 Type 1 (destination unreachable), Code 1 (administratively prohibited) and Code 4 (port unreachable) intended for connections in synchronized states (ESTABLISHED, FIN-WAIT-1, FIN- WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK, TIME-WAIT) that match MKTs." A selftest (later in patch series) verifies that this attack is not possible in this TCP-AO implementation. Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Add tcp_hash_fail() ratelimited logsDmitry Safonov2-2/+41
Add a helper for logging connection-detailed messages for failed TCP hash verification (both MD5 and AO). Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Add TCP-AO SNE supportDmitry Safonov1-1/+21
Add Sequence Number Extension (SNE) for TCP-AO. This is needed to protect long-living TCP-AO connections from replaying attacks after sequence number roll-over, see RFC5925 (6.2). Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Add TCP-AO segments countersDmitry Safonov5-9/+43
Introduce segment counters that are useful for troubleshooting/debugging as well as for writing tests. Now there are global snmp counters as well as per-socket and per-key. Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Verify inbound TCP-AO signed segmentsDmitry Safonov3-2/+82
Now there is a common function to verify signature on TCP segments: tcp_inbound_hash(). It has checks for all possible cross-interactions with MD5 signs as well as with unsigned segments. The rules from RFC5925 are: (1) Any TCP segment can have at max only one signature. (2) TCP connections can't switch between using TCP-MD5 and TCP-AO. (3) TCP-AO connections can't stop using AO, as well as unsigned connections can't suddenly start using AO. Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Sign SYN-ACK segments with TCP-AODmitry Safonov2-0/+9
Similarly to RST segments, wire SYN-ACKs to TCP-AO. tcp_rsk_used_ao() is handy here to check if the request socket used AO and needs a signature on the outgoing segments. Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Wire TCP-AO to request socketsDmitry Safonov3-0/+48
Now when the new request socket is created from the listening socket, it's recorded what MKT was used by the peer. tcp_rsk_used_ao() is a new helper for checking if TCP-AO option was used to create the request socket. tcp_ao_copy_all_matching() will copy all keys that match the peer on the request socket, as well as preparing them for the usage (creating traffic keys). Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Add TCP-AO sign to twskDmitry Safonov2-2/+12
Add support for sockets in time-wait state. ao_info as well as all keys are inherited on transition to time-wait socket. The lifetime of ao_info is now protected by ref counter, so that tcp_ao_destroy_sock() will destruct it only when the last user is gone. Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Add AO sign to RST packetsDmitry Safonov2-1/+18
Wire up sending resets to TCP-AO hashing. Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Add tcp_parse_auth_options()Dmitry Safonov3-2/+45
Introduce a helper that: (1) shares the common code with TCP-MD5 header options parsing (2) looks for hash signature only once for both TCP-MD5 and TCP-AO (3) fails with -EEXIST if any TCP sign option is present twice, see RFC5925 (2.2): ">> A single TCP segment MUST NOT have more than one TCP-AO in its options sequence. When multiple TCP-AOs appear, TCP MUST discard the segment." Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Add TCP-AO sign to outgoing packetsDmitry Safonov2-0/+87
Using precalculated traffic keys, sign TCP segments as prescribed by RFC5925. Per RFC, TCP header options are included in sign calculation: "The TCP header, by default including options, and where the TCP checksum and TCP-AO MAC fields are set to zero, all in network- byte order." (5.1.3) tcp_ao_hash_header() has exclude_options parameter to optionally exclude TCP header from hash calculation, as described in RFC5925 (9.1), this is needed for interaction with middleboxes that may change "some TCP options". This is wired up to AO key flags and setsockopt() later. Similarly to TCP-MD5 hash TCP segment fragments. From this moment a user can start sending TCP-AO signed segments with one of crypto ahash algorithms from supported by Linux kernel. It can have a user-specified MAC length, to either save TCP option header space or provide higher protection using a longer signature. The inbound segments are not yet verified, TCP-AO option is ignored and they are accepted. Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Calculate TCP-AO traffic keysDmitry Safonov2-2/+52
Add traffic key calculation the way it's described in RFC5926. Wire it up to tcp_finish_connect() and cache the new keys straight away on already established TCP connections. Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Prevent TCP-MD5 with TCP-AO being setDmitry Safonov2-2/+54
Be as conservative as possible: if there is TCP-MD5 key for a given peer regardless of L3 interface - don't allow setting TCP-AO key for the same peer. According to RFC5925, TCP-AO is supposed to replace TCP-MD5 and there can't be any switch between both on any connected tuple. Later it can be relaxed, if there's a use, but in the beginning restrict any intersection. Note: it's still should be possible to set both TCP-MD5 and TCP-AO keys on a listening socket for *different* peers. Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Introduce TCP_AO setsockopt()sDmitry Safonov4-1/+88
Add 3 setsockopt()s: 1. TCP_AO_ADD_KEY to add a new Master Key Tuple (MKT) on a socket 2. TCP_AO_DEL_KEY to delete present MKT from a socket 3. TCP_AO_INFO to change flags, Current_key/RNext_key on a TCP-AO sk Userspace has to introduce keys on every socket it wants to use TCP-AO option on, similarly to TCP_MD5SIG/TCP_MD5SIG_EXT. RFC5925 prohibits definition of MKTs that would match the same peer, so do sanity checks on the data provided by userspace. Be as conservative as possible, including refusal of defining MKT on an established connection with no AO, removing the key in-use and etc. (1) and (2) are to be used by userspace key manager to add/remove keys. (3) main purpose is to set RNext_key, which (as prescribed by RFC5925) is the KeyID that will be requested in TCP-AO header from the peer to sign their segments with. At this moment the life of ao_info ends in tcp_v4_destroy_sock(). Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Add TCP-AO config and structuresDmitry Safonov4-8/+101
Introduce new kernel config option and common structures as well as helpers to be used by TCP-AO code. Co-developed-by: Francesco Ruggeri <[email protected]> Signed-off-by: Francesco Ruggeri <[email protected]> Co-developed-by: Salam Noureddine <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: Dmitry Safonov <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27net/tcp: Prepare tcp_md5sig_pool for TCP-AODmitry Safonov1-14/+36
TCP-AO, similarly to TCP-MD5, needs to allocate tfms on a slow-path, which is setsockopt() and use crypto ahash requests on fast paths, which are RX/TX softirqs. Also, it needs a temporary/scratch buffer for preparing the hash. Rework tcp_md5sig_pool in order to support other hashing algorithms than MD5. It will make it possible to share pre-allocated crypto_ahash descriptors and scratch area between all TCP hash users. Internally tcp_sigpool calls crypto_clone_ahash() API over pre-allocated crypto ahash tfm. Kudos to Herbert, who provided this new crypto API. I was a little concerned over GFP_ATOMIC allocations of ahash and crypto_request in RX/TX (see tcp_sigpool_start()), so I benchmarked both "backends" with different algorithms, using patched version of iperf3[2]. On my laptop with i7-7600U @ 2.80GHz: clone-tfm per-CPU-requests TCP-MD5 2.25 Gbits/sec 2.30 Gbits/sec TCP-AO(hmac(sha1)) 2.53 Gbits/sec 2.54 Gbits/sec TCP-AO(hmac(sha512)) 1.67 Gbits/sec 1.64 Gbits/sec TCP-AO(hmac(sha384)) 1.77 Gbits/sec 1.80 Gbits/sec TCP-AO(hmac(sha224)) 1.29 Gbits/sec 1.30 Gbits/sec TCP-AO(hmac(sha3-512)) 481 Mbits/sec 480 Mbits/sec TCP-AO(hmac(md5)) 2.07 Gbits/sec 2.12 Gbits/sec TCP-AO(hmac(rmd160)) 1.01 Gbits/sec 995 Mbits/sec TCP-AO(cmac(aes128)) [not supporetd yet] 2.11 Gbits/sec So, it seems that my concerns don't have strong grounds and per-CPU crypto_request allocation can be dropped/removed from tcp_sigpool once ciphers get crypto_clone_ahash() support. [1]: https://lore.kernel.org/all/[email protected]/T/#u [2]: https://github.com/0x7f454c46/iperf/tree/tcp-md5-ao Signed-off-by: Dmitry Safonov <[email protected]> Reviewed-by: Steen Hegelund <[email protected]> Acked-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-10-27Merge tag 'icc-6.7-rc1' of ↵Greg Kroah-Hartman1-0/+102
git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc into char-misc-next Georgi writes: interconnect changes for 6.7 This pull request contains the interconnect changes for the 6.7-rc1 merge window which contains just driver changes with the following highlights: Driver changes: - New interconnect driver for the SDX75 platform. - Support for coefficients to allow node-specific rate adjustments. - Update DT bindings according to the recent changes of how we represent the SMD and RPM bus clocks on Qualcomm platforms. - Misc fixes and cleanups. Signed-off-by: Georgi Djakov <[email protected]> * tag 'icc-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/djakov/icc: (36 commits) interconnect: qcom: Convert to platform remove callback returning void dt-bindings: interconnect: qcom,rpmh: do not require reg on SDX65 MC virt interconnect: imx: Replace inclusion of kernel.h in the header interconnect: fix error handling in qnoc_probe() interconnect: qcom: osm-l3: Replace custom implementation of COUNT_ARGS() interconnect: msm8974: Replace custom implementation of COUNT_ARGS() interconnect: imx: Replace custom implementation of COUNT_ARGS() interconnect: qcom: Add SDX75 interconnect provider driver dt-bindings: interconnect: Add compatibles for SDX75 interconnect: qcom: sm8350: Set ACV enable_mask interconnect: qcom: sm8250: Set ACV enable_mask interconnect: qcom: sm8150: Set ACV enable_mask interconnect: qcom: sm6350: Set ACV enable_mask interconnect: qcom: sdm845: Set ACV enable_mask interconnect: qcom: sdm670: Set ACV enable_mask interconnect: qcom: sc8280xp: Set ACV enable_mask interconnect: qcom: sc8180x: Set ACV enable_mask interconnect: qcom: sc7280: Set ACV enable_mask interconnect: qcom: sc7180: Set ACV enable_mask interconnect: qcom: qdu1000: Set ACV enable_mask ...
2023-10-27tty: n_gsm: add copyright Siemens Mobility GmbHDaniel Starke1-0/+1
More than 1/3 of the n_gsm code has been contributed by us in the last 1.5 years, completing conformance with the standard and stabilizing the driver: - added UI (unnumbered information) frame support - added PN (parameter negotiation) message handling and function support - added optional keep-alive control link supervision via test messages - added TIOCM_OUT1 and TIOCM_OUT2 to allow responder to operate as modem - added TIOCMIWAIT support on virtual ttys - added additional ioctls and parameters to configure the new functions - added overall locking mechanism to avoid data race conditions - added outgoing data flow to decouple physical from virtual tty handling for better performance and to avoid dead-locks - fixed advanced option mode implementation - fixed convergence layer type 2 implementation - fixed handling of CLD (multiplexer close down) messages - fixed broken muxer close down procedure - and many more bug fixes With this most of our initial RFC has been implemented. It gives the driver a quality boost unseen in the decade before. Add a copyright notice to the n_gsm files to highlight this contribution. Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Daniel Starke <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2023-10-27Merge branches 'iommu/fixes', 'arm/tegra', 'arm/smmu', 'virtio', 'x86/vt-d', ↵Joerg Roedel3-159/+25
'x86/amd', 'core' and 's390' into next
2023-10-26Merge tag 'wireless-next-2023-10-26' of ↵Jakub Kicinski4-54/+67
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next Kalle Valo says: ==================== wireless-next patches for v6.7 The third, and most likely the last, features pull request for v6.7. Fixes all over and only few small new features. Major changes: iwlwifi - more Multi-Link Operation (MLO) work ath12k - QCN9274: mesh support ath11k - firmware-2.bin container file format support * tag 'wireless-next-2023-10-26' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (155 commits) wifi: ray_cs: Remove unnecessary (void*) conversions Revert "wifi: ath11k: call ath11k_mac_fils_discovery() without condition" wifi: ath12k: Introduce and use ath12k_sta_to_arsta() wifi: ath12k: fix htt mlo-offset event locking wifi: ath12k: fix dfs-radar and temperature event locking wifi: ath11k: fix gtk offload status event locking wifi: ath11k: fix htt pktlog locking wifi: ath11k: fix dfs radar event locking wifi: ath11k: fix temperature event locking wifi: ath12k: rename the sc naming convention to ab wifi: ath12k: rename the wmi_sc naming convention to wmi_ab wifi: ath11k: add firmware-2.bin support wifi: ath11k: qmi: refactor ath11k_qmi_m3_load() wifi: rtw89: cleanup firmware elements parsing wifi: rt2x00: rework MT7620 PA/LNA RF calibration wifi: rt2x00: rework MT7620 channel config function wifi: rt2x00: improve MT7620 register initialization MAINTAINERS: wifi: rt2x00: drop Helmut Schaa wifi: wlcore: main: replace deprecated strncpy with strscpy wifi: wlcore: boot: replace deprecated strncpy with strscpy ... ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-10-26Merge tag 'for-netdev' of ↵Jakub Kicinski11-21/+132
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2023-10-26 We've added 51 non-merge commits during the last 10 day(s) which contain a total of 75 files changed, 5037 insertions(+), 200 deletions(-). The main changes are: 1) Add open-coded task, css_task and css iterator support. One of the use cases is customizable OOM victim selection via BPF, from Chuyi Zhou. 2) Fix BPF verifier's iterator convergence logic to use exact states comparison for convergence checks, from Eduard Zingerman, Andrii Nakryiko and Alexei Starovoitov. 3) Add BPF programmable net device where bpf_mprog defines the logic of its xmit routine. It can operate in L3 and L2 mode, from Daniel Borkmann and Nikolay Aleksandrov. 4) Batch of fixes for BPF per-CPU kptr and re-enable unit_size checking for global per-CPU allocator, from Hou Tao. 5) Fix libbpf which eagerly assumed that SHT_GNU_verdef ELF section was going to be present whenever a binary has SHT_GNU_versym section, from Andrii Nakryiko. 6) Fix BPF ringbuf correctness to fold smp_mb__before_atomic() into atomic_set_release(), from Paul E. McKenney. 7) Add a warning if NAPI callback missed xdp_do_flush() under CONFIG_DEBUG_NET which helps checking if drivers were missing the former, from Sebastian Andrzej Siewior. 8) Fix missed RCU read-lock in bpf_task_under_cgroup() which was throwing a warning under sleepable programs, from Yafang Shao. 9) Avoid unnecessary -EBUSY from htab_lock_bucket by disabling IRQ before checking map_locked, from Song Liu. 10) Make BPF CI linked_list failure test more robust, from Kumar Kartikeya Dwivedi. 11) Enable samples/bpf to be built as PIE in Fedora, from Viktor Malik. 12) Fix xsk starving when multiple xsk sockets were associated with a single xsk_buff_pool, from Albert Huang. 13) Clarify the signed modulo implementation for the BPF ISA standardization document that it uses truncated division, from Dave Thaler. 14) Improve BPF verifier's JEQ/JNE branch taken logic to also consider signed bounds knowledge, from Andrii Nakryiko. 15) Add an option to XDP selftests to use multi-buffer AF_XDP xdp_hw_metadata and mark used XDP programs as capable to use frags, from Larysa Zaremba. 16) Fix bpftool's BTF dumper wrt printing a pointer value and another one to fix struct_ops dump in an array, from Manu Bretelle. * tag 'for-netdev' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (51 commits) netkit: Remove explicit active/peer ptr initialization selftests/bpf: Fix selftests broken by mitigations=off samples/bpf: Allow building with custom bpftool samples/bpf: Fix passing LDFLAGS to libbpf samples/bpf: Allow building with custom CFLAGS/LDFLAGS bpf: Add more WARN_ON_ONCE checks for mismatched alloc and free selftests/bpf: Add selftests for netkit selftests/bpf: Add netlink helper library bpftool: Extend net dump with netkit progs bpftool: Implement link show support for netkit libbpf: Add link-based API for netkit tools: Sync if_link uapi header netkit, bpf: Add bpf programmable net device bpf: Improve JEQ/JNE branch taken logic bpf: Fold smp_mb__before_atomic() into atomic_set_release() bpf: Fix unnecessary -EBUSY from htab_lock_bucket xsk: Avoid starving the xsk further down the list bpf: print full verifier states on infinite loop detection selftests/bpf: test if state loops are detected in a tricky case bpf: correct loop detection for iterators convergence ... ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-10-27scsi: sd: Introduce manage_shutdown device flagDamien Le Moal1-2/+18
Commit aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop") change setting the manage_system_start_stop flag to false for libata managed disks to enable libata internal management of disk suspend/resume. However, a side effect of this change is that on system shutdown, disks are no longer being stopped (set to standby mode with the heads unloaded). While this is not a critical issue, this unclean shutdown is not recommended and shows up with increased smart counters (e.g. the unexpected power loss counter "Unexpect_Power_Loss_Ct"). Instead of defining a shutdown driver method for all ATA adapter drivers (not all of them define that operation), this patch resolves this issue by further refining the sd driver start/stop control of disks using the new flag manage_shutdown. If this new flag is set to true by a low level driver, the function sd_shutdown() will issue a START STOP UNIT command with the start argument set to 0 when a disk needs to be powered off (suspended) on system power off, that is, when system_state is equal to SYSTEM_POWER_OFF. Similarly to the other manage_xxx flags, the new manage_shutdown flag is exposed through sysfs as a read-write device attribute. To avoid any confusion between manage_shutdown and manage_system_start_stop, the comments describing these flags in include/scsi/scsi.h are also improved. Fixes: aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop") Cc: [email protected] Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218038 Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Damien Le Moal <[email protected]> Reviewed-by: Niklas Cassel <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: James Bottomley <[email protected]> Acked-by: Martin K. Petersen <[email protected]>
2023-10-26netlink: make range pointers in policies constJakub Kicinski1-2/+2
struct nla_policy is usually constant itself, but unless we make the ranges inside constant we won't be able to make range structs const. The ranges are not modified by the core. Reviewed-by: Johannes Berg <[email protected]> Reviewed-by: David Ahern <[email protected]> Reviewed-by: Nikolay Aleksandrov <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-10-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski12-10/+84
Cross-merge networking fixes after downstream PR. Conflicts: net/mac80211/rx.c 91535613b609 ("wifi: mac80211: don't drop all unprotected public action frames") 6c02fab72429 ("wifi: mac80211: split ieee80211_drop_unencrypted_mgmt() return value") Adjacent changes: drivers/net/ethernet/apm/xgene/xgene_enet_main.c 61471264c018 ("net: ethernet: apm: Convert to platform remove callback returning void") d2ca43f30611 ("net: xgene: Fix unused xgene_enet_of_match warning for !CONFIG_OF") net/vmw_vsock/virtio_transport.c 64c99d2d6ada ("vsock/virtio: support to send non-linear skb") 53b08c498515 ("vsock/virtio: initialize the_virtio_vsock before using VQs") Signed-off-by: Jakub Kicinski <[email protected]>
2023-10-26exportfs: Change bcachefs fid_type enum to avoid conflictsKent Overstreet1-6/+6
Per Amir Goldstein, the fid types that bcachefs picked conflicted with xfs and fuse, which previously were in use but not deviced in the master enum. Since bcachefs is still out of tree, we can move. https://lore.kernel.org/linux-next/[email protected]/T/#ma59f65ba61f605b593e69f4690dbd317526d83ba Signed-off-by: Kent Overstreet <[email protected]>
2023-10-26landlock: Support network rules with TCP bind and connectKonstantin Meskhidze1-0/+55
Add network rules support in the ruleset management helpers and the landlock_create_ruleset() syscall. Extend user space API to support network actions: * Add new network access rights: LANDLOCK_ACCESS_NET_BIND_TCP and LANDLOCK_ACCESS_NET_CONNECT_TCP. * Add a new network rule type: LANDLOCK_RULE_NET_PORT tied to struct landlock_net_port_attr. The allowed_access field contains the network access rights, and the port field contains the port value according to the controlled protocol. This field can take up to a 64-bit value but the maximum value depends on the related protocol (e.g. 16-bit value for TCP). Network port is in host endianness [1]. * Add a new handled_access_net field to struct landlock_ruleset_attr that contains network access rights. * Increment the Landlock ABI version to 4. Implement socket_bind() and socket_connect() LSM hooks, which enable to control TCP socket binding and connection to specific ports. Expand access_masks_t from u16 to u32 to be able to store network access rights alongside filesystem access rights for rulesets' handled access rights. Access rights are not tied to socket file descriptors but checked at bind() or connect() call time against the caller's Landlock domain. For the filesystem, a file descriptor is a direct access to a file/data. However, for network sockets, we cannot identify for which data or peer a newly created socket will give access to. Indeed, we need to wait for a connect or bind request to identify the use case for this socket. Likewise a directory file descriptor may enable to open another file (i.e. a new data item), but this opening is also restricted by the caller's domain, not the file descriptor's access rights [2]. [1] https://lore.kernel.org/r/[email protected] [2] https://lore.kernel.org/r/[email protected] Signed-off-by: Konstantin Meskhidze <[email protected]> Link: https://lore.kernel.org/r/[email protected] [mic: Extend commit message, fix typo in comments, and specify endianness in the documentation] Co-developed-by: Mickaël Salaün <[email protected]> Signed-off-by: Mickaël Salaün <[email protected]>
2023-10-26Merge tag 'net-6.6-rc8' of ↵Linus Torvalds3-1/+31
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from WiFi and netfilter. Most regressions addressed here come from quite old versions, with the exceptions of the iavf one and the WiFi fixes. No known outstanding reports or investigation. Fixes to fixes: - eth: iavf: in iavf_down, disable queues when removing the driver Previous releases - regressions: - sched: act_ct: additional checks for outdated flows - tcp: do not leave an empty skb in write queue - tcp: fix wrong RTO timeout when received SACK reneging - wifi: cfg80211: pass correct pointer to rdev_inform_bss() - eth: i40e: sync next_to_clean and next_to_process for programming status desc - eth: iavf: initialize waitqueues before starting watchdog_task Previous releases - always broken: - eth: r8169: fix data-races - eth: igb: fix potential memory leak in igb_add_ethtool_nfc_entry - eth: r8152: avoid writing garbage to the adapter's registers - eth: gtp: fix fragmentation needed check with gso" * tag 'net-6.6-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (43 commits) iavf: in iavf_down, disable queues when removing the driver vsock/virtio: initialize the_virtio_vsock before using VQs net: ipv6: fix typo in comments net: ipv4: fix typo in comments net/sched: act_ct: additional checks for outdated flows netfilter: flowtable: GC pushes back packets to classic path i40e: Fix wrong check for I40E_TXR_FLAGS_WB_ON_ITR gtp: fix fragmentation needed check with gso gtp: uapi: fix GTPA_MAX Fix NULL pointer dereference in cn_filter() sfc: cleanup and reduce netlink error messages net/handshake: fix file ref count in handshake_nl_accept_doit() wifi: mac80211: don't drop all unprotected public action frames wifi: cfg80211: fix assoc response warning on failed links wifi: cfg80211: pass correct pointer to rdev_inform_bss() isdn: mISDN: hfcsusb: Spelling fix in comment tcp: fix wrong RTO timeout when received SACK reneging r8152: Block future register access if register access fails r8152: Rename RTL8152_UNPLUG to RTL8152_INACCESSIBLE r8152: Check for unplug in r8153b_ups_en() / r8153c_ups_en() ...
2023-10-26Merge branch 'for-next/cpus_have_const_cap' into for-next/coreCatalin Marinas1-0/+2
* for-next/cpus_have_const_cap: (38 commits) : cpus_have_const_cap() removal arm64: Remove cpus_have_const_cap() arm64: Avoid cpus_have_const_cap() for ARM64_WORKAROUND_REPEAT_TLBI arm64: Avoid cpus_have_const_cap() for ARM64_WORKAROUND_NVIDIA_CARMEL_CNP arm64: Avoid cpus_have_const_cap() for ARM64_WORKAROUND_CAVIUM_23154 arm64: Avoid cpus_have_const_cap() for ARM64_WORKAROUND_2645198 arm64: Avoid cpus_have_const_cap() for ARM64_WORKAROUND_1742098 arm64: Avoid cpus_have_const_cap() for ARM64_WORKAROUND_1542419 arm64: Avoid cpus_have_const_cap() for ARM64_WORKAROUND_843419 arm64: Avoid cpus_have_const_cap() for ARM64_UNMAP_KERNEL_AT_EL0 arm64: Avoid cpus_have_const_cap() for ARM64_{SVE,SME,SME2,FA64} arm64: Avoid cpus_have_const_cap() for ARM64_SPECTRE_V2 arm64: Avoid cpus_have_const_cap() for ARM64_SSBS arm64: Avoid cpus_have_const_cap() for ARM64_MTE arm64: Avoid cpus_have_const_cap() for ARM64_HAS_TLB_RANGE arm64: Avoid cpus_have_const_cap() for ARM64_HAS_WFXT arm64: Avoid cpus_have_const_cap() for ARM64_HAS_RNG arm64: Avoid cpus_have_const_cap() for ARM64_HAS_EPAN arm64: Avoid cpus_have_const_cap() for ARM64_HAS_PAN arm64: Avoid cpus_have_const_cap() for ARM64_HAS_GIC_PRIO_MASKING arm64: Avoid cpus_have_const_cap() for ARM64_HAS_DIT ...
2023-10-26drm/sched: Convert the GPU scheduler to variable number of run-queuesLuben Tuikov1-3/+6
The GPU scheduler has now a variable number of run-queues, which are set up at drm_sched_init() time. This way, each driver announces how many run-queues it requires (supports) per each GPU scheduler it creates. Note, that run-queues correspond to scheduler "priorities", thus if the number of run-queues is set to 1 at drm_sched_init(), then that scheduler supports a single run-queue, i.e. single "priority". If a driver further sets a single entity per run-queue, then this creates a 1-to-1 correspondence between a scheduler and a scheduled entity. Cc: Lucas Stach <[email protected]> Cc: Russell King <[email protected]> Cc: Qiang Yu <[email protected]> Cc: Rob Clark <[email protected]> Cc: Abhinav Kumar <[email protected]> Cc: Dmitry Baryshkov <[email protected]> Cc: Danilo Krummrich <[email protected]> Cc: Matthew Brost <[email protected]> Cc: Boris Brezillon <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Christian König <[email protected]> Cc: Emma Anholt <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Luben Tuikov <[email protected]> Acked-by: Christian König <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-10-26ASoC: cs35l41: Detect CSPL errors when sending CSPL commandsStefan Binding1-0/+2
The existing code checks for the correct state transition after sending a command. However, it is possible for the message box to return -1, which indicates an error, if an error has occurred in the firmware. We can detect if the error has occurred, and return a different error. In addition, there is no recovering from a CSPL error, so the retry mechanism is not needed in this case, and we can return immediately. Signed-off-by: Stefan Binding <[email protected]> Acked-by: Mark Brown <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2023-10-26ALSA: hda: cs35l41: Force a software reset after hardware resetStefan Binding1-0/+1
To ensure the chip has correctly reset during probe and system suspend, we need to force a software reset, in case of systems where the hardware reset is not available. The software reset register was labelled as volatile but not readable, however, it is readable, (just returns 0x0). Adding it to readable registers means it will be correctly treated as volatile, and thus will not be cached. Signed-off-by: Stefan Binding <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Takashi Iwai <[email protected]>
2023-10-26Merge tag 'v6.6-rc7' into coreJoerg Roedel51-84/+190
Linux 6.6-rc7
2023-10-26iommu: Move IOMMU_DOMAIN_BLOCKED global statics to ops->blocked_domainJason Gunthorpe1-0/+3
Following the pattern of identity domains, just assign the BLOCKED domain global statics to a value in ops. Update the core code to use the global static directly. Update powerpc to use the new scheme and remove its empty domain_alloc callback. Reviewed-by: Lu Baolu <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Acked-by: Sven Peter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-10-26pmdomain: Merge branch genpd_dt into nextUlf Hansson1-0/+2
Merge the immutable branch genpd_dt into next, to allow the DT bindings to be tested together with new pmdomain changes that are targeted for v6.7. Signed-off-by: Ulf Hansson <[email protected]>
2023-10-26dt-bindings: power: qcom,rpmhpd: Add GMXC PD indexSibi Sankar1-0/+1
Document GMXC (Graphics MXC) power domain index which will be used on SC8380XP SoCs. Signed-off-by: Sibi Sankar <[email protected]> Link: https://lore.kernel.org/r/[email protected] [Ulf: Re-based to step up the index number] Signed-off-by: Ulf Hansson <[email protected]>
2023-10-26dt-bindings: power: qcom,rpmpd: document the SM8650 RPMh Power DomainsNeil Armstrong1-0/+1
Document the RPMh Power Domains on the SM8650 Platform. Signed-off-by: Neil Armstrong <[email protected]> Link: https://lore.kernel.org/r/20231025-topic-sm8650-upstream-rpmpd-v1-1-f25d313104c6@linaro.org Signed-off-by: Ulf Hansson <[email protected]>
2023-10-26iommu/vt-d: Disallow read-only mappings to nest parent domainLu Baolu1-1/+11
When remapping hardware is configured by system software in scalable mode as Nested (PGTT=011b) and with PWSNP field Set in the PASID-table-entry, it may Set Accessed bit and Dirty bit (and Extended Access bit if enabled) in first-stage page-table entries even when second-stage mappings indicate that corresponding first-stage page-table is Read-Only. As the result, contents of pages designated by VMM as Read-Only can be modified by IOMMU via PML5E (PML4E for 4-level tables) access as part of address translation process due to DMAs issued by Guest. This disallows read-only mappings in the domain that is supposed to be used as nested parent. Reference from Sapphire Rapids Specification Update [1], errata details, SPR17. Userspace should know this limitation by checking the IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17 flag reported in the IOMMU_GET_HW_INFO ioctl. [1] https://www.intel.com/content/www/us/en/content-details/772415/content-details.html Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Kevin Tian <[email protected]> Signed-off-by: Lu Baolu <[email protected]> Signed-off-by: Yi Liu <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>