aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2021-04-26Merge branch 'linus' of ↵Linus Torvalds16-91/+106
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - crypto_destroy_tfm now ignores errors as well as NULL pointers Algorithms: - Add explicit curve IDs in ECDH algorithm names - Add NIST P384 curve parameters - Add ECDSA Drivers: - Add support for Green Sardine in ccp - Add ecdh/curve25519 to hisilicon/hpre - Add support for AM64 in sa2ul" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (184 commits) fsverity: relax build time dependency on CRYPTO_SHA256 fscrypt: relax Kconfig dependencies for crypto API algorithms crypto: camellia - drop duplicate "depends on CRYPTO" crypto: s5p-sss - consistently use local 'dev' variable in probe() crypto: s5p-sss - remove unneeded local variable initialization crypto: s5p-sss - simplify getting of_device_id match data ccp: ccp - add support for Green Sardine crypto: ccp - Make ccp_dev_suspend and ccp_dev_resume void functions crypto: octeontx2 - add support for OcteonTX2 98xx CPT block. crypto: chelsio/chcr - Remove useless MODULE_VERSION crypto: ux500/cryp - Remove duplicate argument crypto: chelsio - remove unused function crypto: sa2ul - Add support for AM64 crypto: sa2ul - Support for per channel coherency dt-bindings: crypto: ti,sa2ul: Add new compatible for AM64 crypto: hisilicon - enable new error types for QM crypto: hisilicon - add new error type for SEC crypto: hisilicon - support new error types for ZIP crypto: hisilicon - dynamic configuration 'err_info' crypto: doc - fix kernel-doc notation in chacha.c and af_alg.c ...
2021-04-26Merge tag 'keys-cve-2020-26541-v3' of ↵Linus Torvalds1-0/+15
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull x509 dbx/mokx UEFI support from David Howells: "Here's a set of patches from Eric Snowberg[1] that add support for EFI_CERT_X509_GUID entries in the dbx and mokx UEFI tables (such entries cause matching certificates to be rejected). These are currently ignored and only the hash entries are made use of. Additionally Eric included his patches to allow such certificates to be preloaded. These patches deal with CVE-2020-26541. To quote Eric: 'This is the fifth patch series for adding support for EFI_CERT_X509_GUID entries [2]. It has been expanded to not only include dbx entries but also entries in the mokx. Additionally my series to preload these certificate [3] has also been included'" Link: https://lore.kernel.org/r/20210122181054.32635-1-eric.snowberg@oracle.com [1] Link: https://patchwork.kernel.org/project/linux-security-module/patch/20200916004927.64276-1-eric.snowberg@oracle.com/ [2] Link: https://lore.kernel.org/patchwork/cover/1315485/ [3] * tag 'keys-cve-2020-26541-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: integrity: Load mokx variables into the blacklist keyring certs: Add ability to preload revocation certs certs: Move load_system_certificate_list to a common function certs: Add EFI_CERT_X509_GUID support for dbx entries
2021-04-26Merge tag 'tpmdd-next-v5.13' of ↵Linus Torvalds6-21/+118
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull tpm updates from Jarkko Sakkinen: "New features: - ARM TEE backend for kernel trusted keys to complete the existing TPM backend - ASN.1 format for TPM2 trusted keys to make them interact with the user space stack, such as OpenConnect VPN Other than that, a bunch of bug fixes" * tag 'tpmdd-next-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: KEYS: trusted: Fix missing null return from kzalloc call char: tpm: fix error return code in tpm_cr50_i2c_tis_recv() MAINTAINERS: Add entry for TEE based Trusted Keys doc: trusted-encrypted: updates with TEE as a new trust source KEYS: trusted: Introduce TEE based Trusted Keys KEYS: trusted: Add generic trusted keys framework security: keys: trusted: Make sealed key properly interoperable security: keys: trusted: use ASN.1 TPM2 key format for the blobs security: keys: trusted: fix TPM2 authorizations oid_registry: Add TCG defined OIDS for TPM keys lib: Add ASN.1 encoder tpm: vtpm_proxy: Avoid reading host log when using a virtual device tpm: acpi: Check eventlog signature before using it tpm: efi: Use local variable for calculating final log size
2021-04-23Merge tag 'gpio-fixes-for-v5.12' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio fix from Bartosz Golaszewski: "Save and restore the sysconfig register in gpio-omap to fix a power-management issue" * tag 'gpio-fixes-for-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: omap: Save and restore sysconfig
2021-04-21gpio: omap: Save and restore sysconfigTony Lindgren1-0/+3
As we are using cpu_pm to save and restore context, we must also save and restore the GPIO sysconfig register. This is needed because we are not calling PM runtime functions at all with cpu_pm. We need to save the sysconfig on idle as it's value can get reconfigured by PM runtime and can be different from the init time value. Device specific flags like "ti,no-idle-on-init" can affect the init value. Fixes: b764a5863fd8 ("gpio: omap: Remove custom PM calls and use cpu_pm instead") Cc: Aaro Koskinen <aaro.koskinen@iki.fi> Cc: Adam Ford <aford173@gmail.com> Cc: Andreas Kemnade <andreas@kemnade.info> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Peter Ujfalusi <peter.ujfalusi@gmail.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Acked-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
2021-04-20capabilities: require CAP_SETFCAP to map uid 0Serge E. Hallyn2-1/+5
cap_setfcap is required to create file capabilities. Since commit 8db6c34f1dbc ("Introduce v3 namespaced file capabilities"), a process running as uid 0 but without cap_setfcap is able to work around this as follows: unshare a new user namespace which maps parent uid 0 into the child namespace. While this task will not have new capabilities against the parent namespace, there is a loophole due to the way namespaced file capabilities are represented as xattrs. File capabilities valid in userns 1 are distinguished from file capabilities valid in userns 2 by the kuid which underlies uid 0. Therefore the restricted root process can unshare a new self-mapping namespace, add a namespaced file capability onto a file, then use that file capability in the parent namespace. To prevent that, do not allow mapping parent uid 0 if the process which opened the uid_map file does not have CAP_SETFCAP, which is the capability for setting file capabilities. As a further wrinkle: a task can unshare its user namespace, then open its uid_map file itself, and map (only) its own uid. In this case we do not have the credential from before unshare, which was potentially more restricted. So, when creating a user namespace, we record whether the creator had CAP_SETFCAP. Then we can use that during map_write(). With this patch: 1. Unprivileged user can still unshare -Ur ubuntu@caps:~$ unshare -Ur root@caps:~# logout 2. Root user can still unshare -Ur ubuntu@caps:~$ sudo bash root@caps:/home/ubuntu# unshare -Ur root@caps:/home/ubuntu# logout 3. Root user without CAP_SETFCAP cannot unshare -Ur: root@caps:/home/ubuntu# /sbin/capsh --drop=cap_setfcap -- root@caps:/home/ubuntu# /sbin/setcap cap_setfcap=p /sbin/setcap unable to set CAP_SETFCAP effective capability: Operation not permitted root@caps:/home/ubuntu# unshare -Ur unshare: write failed /proc/self/uid_map: Operation not permitted Note: an alternative solution would be to allow uid 0 mappings by processes without CAP_SETFCAP, but to prevent such a namespace from writing any file capabilities. This approach can be seen at [1]. Background history: commit 95ebabde382 ("capabilities: Don't allow writing ambiguous v3 file capabilities") tried to fix the issue by preventing v3 fscaps to be written to disk when the root uid would map to the same uid in nested user namespaces. This led to regressions for various workloads. For example, see [2]. Ultimately this is a valid use-case we have to support meaning we had to revert this change in 3b0c2d3eaa83 ("Revert 95ebabde382c ("capabilities: Don't allow writing ambiguous v3 file capabilities")"). Link: https://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux.git/log/?h=2021-04-15/setfcap-nsfscaps-v4 [1] Link: https://github.com/containers/buildah/issues/3071 [2] Signed-off-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Andrew G. Morgan <morgan@kernel.org> Tested-by: Christian Brauner <christian.brauner@ubuntu.com> Reviewed-by: Christian Brauner <christian.brauner@ubuntu.com> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-17Merge tag 'net-5.12-rc8' of ↵Linus Torvalds3-6/+9
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.12-rc8, including fixes from netfilter, and bpf. BPF verifier changes stand out, otherwise things have slowed down. Current release - regressions: - gro: ensure frag0 meets IP header alignment - Revert "net: stmmac: re-init rx buffers when mac resume back" - ethernet: macb: fix the restore of cmp registers Previous releases - regressions: - ixgbe: Fix NULL pointer dereference in ethtool loopback test - ixgbe: fix unbalanced device enable/disable in suspend/resume - phy: marvell: fix detection of PHY on Topaz switches - make tcp_allowed_congestion_control readonly in non-init netns - xen-netback: Check for hotplug-status existence before watching Previous releases - always broken: - bpf: mitigate a speculative oob read of up to map value size by tightening the masking window - sctp: fix race condition in sctp_destroy_sock - sit, ip6_tunnel: Unregister catch-all devices - netfilter: nftables: clone set element expression template - netfilter: flowtable: fix NAT IPv6 offload mangling - net: geneve: check skb is large enough for IPv4/IPv6 header - netlink: don't call ->netlink_bind with table lock held" * tag 'net-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits) netlink: don't call ->netlink_bind with table lock held MAINTAINERS: update my email bpf: Update selftests to reflect new error states bpf: Tighten speculative pointer arithmetic mask bpf: Move sanitize_val_alu out of op switch bpf: Refactor and streamline bounds check into helper bpf: Improve verifier error messages for users bpf: Rework ptr_limit into alu_limit and add common error path bpf: Ensure off_reg has no mixed signed bounds for all types bpf: Move off_reg into sanitize_ptr_alu bpf: Use correct permission flag for mixed signed bounds arithmetic ch_ktls: do not send snd_una update to TCB in middle ch_ktls: tcb close causes tls connection failure ch_ktls: fix device connection close ch_ktls: Fix kernel panic i40e: fix the panic when running bpf in xdpdrv mode net/mlx5e: fix ingress_ifindex check in mlx5e_flower_parse_meta net/mlx5e: Fix setting of RS FEC mode net/mlx5: Fix setting of devlink traps in switchdev mode Revert "net: stmmac: re-init rx buffers when mac resume back" ...
2021-04-17Merge tag 'libnvdimm-fixes-for-5.12-rc8' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "The largest change is for a regression that landed during -rc1 for block-device read-only handling. Vaibhav found a new use for the ability (originally introduced by virtio_pmem) to call back to the platform to flush data, but also found an original bug in that implementation. Lastly, Arnd cleans up some compile warnings in dax. This has all appeared in -next with no reported issues. Summary: - Fix a regression of read-only handling in the pmem driver - Fix a compile warning - Fix support for platform cache flush commands on powerpc/papr" * tag 'libnvdimm-fixes-for-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: libnvdimm/region: Fix nvdimm_has_flush() to handle ND_REGION_ASYNC libnvdimm: Notify disk drivers to revalidate region read-only dax: avoid -Wempty-body warnings
2021-04-16kasan: remove redundant config optionWalter Wu1-1/+1
CONFIG_KASAN_STACK and CONFIG_KASAN_STACK_ENABLE both enable KASAN stack instrumentation, but we should only need one config, so that we remove CONFIG_KASAN_STACK_ENABLE and make CONFIG_KASAN_STACK workable. see [1]. When enable KASAN stack instrumentation, then for gcc we could do no prompt and default value y, and for clang prompt and default value n. This patch fixes the following compilation warning: include/linux/kasan.h:333:30: warning: 'CONFIG_KASAN_STACK' is not defined, evaluates to 0 [-Wundef] [akpm@linux-foundation.org: fix merge snafu] Link: https://bugzilla.kernel.org/show_bug.cgi?id=210221 [1] Link: https://lkml.kernel.org/r/20210226012531.29231-1-walter-zh.wu@mediatek.com Fixes: d9b571c885a8 ("kasan: fix KASAN_STACK dependency for HW_TAGS") Signed-off-by: Walter Wu <walter-zh.wu@mediatek.com> Suggested-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andrey Konovalov <andreyknvl@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Alexander Potapenko <glider@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-14Merge tag 'dmaengine-fix-5.12' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine fixes from Vinod Koul: "A couple of dmaengine driver fixes for: - race and descriptor issue for xilinx driver - fix interrupt handling, wq state & cleanup, field sizes for completion, msix permissions for idxd driver - runtime pm fix for tegra driver - double free fix in dma_async_device_register" * tag 'dmaengine-fix-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: dmaengine: idxd: fix wq cleanup of WQCFG registers dmaengine: idxd: clear MSIX permission entry on shutdown dmaengine: plx_dma: add a missing put_device() on error path dmaengine: tegra20: Fix runtime PM imbalance on error dmaengine: Fix a double free in dma_async_device_register dmaengine: dw: Make it dependent to HAS_IOMEM dmaengine: idxd: fix wq size store permission state dmaengine: idxd: fix opcap sysfs attribute output dmaengine: idxd: fix delta_rec and crc size field for completion record dmaengine: idxd: Fix clobbering of SWERR overflow bit on writeback dmaengine: xilinx: dpdma: Fix race condition in done IRQ dmaengine: xilinx: dpdma: Fix descriptor issuing on video group
2021-04-14KEYS: trusted: Introduce TEE based Trusted KeysSumit Garg1-0/+16
Add support for TEE based trusted keys where TEE provides the functionality to seal and unseal trusted keys using hardware unique key. Refer to Documentation/staging/tee.rst for detailed information about TEE. Signed-off-by: Sumit Garg <sumit.garg@linaro.org> Tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-04-14KEYS: trusted: Add generic trusted keys frameworkSumit Garg2-21/+61
Current trusted keys framework is tightly coupled to use TPM device as an underlying implementation which makes it difficult for implementations like Trusted Execution Environment (TEE) etc. to provide trusted keys support in case platform doesn't posses a TPM device. Add a generic trusted keys framework where underlying implementations can be easily plugged in. Create struct trusted_key_ops to achieve this, which contains necessary functions of a backend. Also, define a module parameter in order to select a particular trust source in case a platform support multiple trust sources. In case its not specified then implementation itetrates through trust sources list starting with TPM and assign the first trust source as a backend which has initiazed successfully during iteration. Note that current implementation only supports a single trust source at runtime which is either selectable at compile time or during boot via aforementioned module parameter. Suggested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Sumit Garg <sumit.garg@linaro.org> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-04-14security: keys: trusted: Make sealed key properly interoperableJames Bottomley1-0/+2
The current implementation appends a migratable flag to the end of a key, meaning the format isn't exactly interoperable because the using party needs to know to strip this extra byte. However, all other consumers of TPM sealed blobs expect the unseal to return exactly the key. Since TPM2 keys have a key property flag that corresponds to migratable, use that flag instead and make the actual key the only sealed quantity. This is secure because the key properties are bound to a hash in the private part, so if they're altered the key won't load. Backwards compatibility is implemented by detecting whether we're loading a new format key or not and correctly setting migratable from the last byte of old format keys. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-04-14security: keys: trusted: use ASN.1 TPM2 key format for the blobsJames Bottomley1-0/+1
Modify the TPM2 key format blob output to export and import in the ASN.1 form for TPM2 sealed object keys. For compatibility with prior trusted keys, the importer will also accept two TPM2B quantities representing the public and private parts of the key. However, the export via keyctl pipe will only output the ASN.1 format. The benefit of the ASN.1 format is that it's a standard and thus the exported key can be used by userspace tools (openssl_tpm2_engine, openconnect and tpm2-tss-engine). The format includes policy specifications, thus it gets us out of having to construct policy handles in userspace and the format includes the parent meaning you don't have to keep passing it in each time. This patch only implements basic handling for the ASN.1 format, so keys with passwords but no policy. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-04-14security: keys: trusted: fix TPM2 authorizationsJames Bottomley1-0/+1
In TPM 1.2 an authorization was a 20 byte number. The spec actually recommended you to hash variable length passwords and use the sha1 hash as the authorization. Because the spec doesn't require this hashing, the current authorization for trusted keys is a 40 digit hex number. For TPM 2.0 the spec allows the passing in of variable length passwords and passphrases directly, so we should allow that in trusted keys for ease of use. Update the 'blobauth' parameter to take this into account, so we can now use plain text passwords for the keys. so before keyctl add trusted kmk "new 32 blobauth=f572d396fae9206628714fb2ce00f72e94f2258fkeyhandle=81000001" @u after we will accept both the old hex sha1 form as well as a new directly supplied password: keyctl add trusted kmk "new 32 blobauth=hello keyhandle=81000001" @u Since a sha1 hex code must be exactly 40 bytes long and a direct password must be 20 or less, we use the length as the discriminator for which form is input. Note this is both and enhancement and a potential bug fix. The TPM 2.0 spec requires us to strip leading zeros, meaning empyty authorization is a zero length HMAC whereas we're currently passing in 20 bytes of zeros. A lot of TPMs simply accept this as OK, but the Microsoft TPM emulator rejects it with TPM_RC_BAD_AUTH, so this patch makes the Microsoft TPM emulator work with trusted keys. Fixes: 0fe5480303a1 ("keys, trusted: seal/unseal with TPM 2.0 chips") Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-04-14oid_registry: Add TCG defined OIDS for TPM keysJames Bottomley1-0/+5
The TCG has defined an OID prefix "2.23.133.10.1" for the various TPM key uses. We've defined three of the available numbers: 2.23.133.10.1.3 TPM Loadable key. This is an asymmetric key (Usually RSA2048 or Elliptic Curve) which can be imported by a TPM2_Load() operation. 2.23.133.10.1.4 TPM Importable Key. This is an asymmetric key (Usually RSA2048 or Elliptic Curve) which can be imported by a TPM2_Import() operation. Both loadable and importable keys are specific to a given TPM, the difference is that a loadable key is wrapped with the symmetric secret, so must have been created by the TPM itself. An importable key is wrapped with a DH shared secret, and may be created without access to the TPM provided you know the public part of the parent key. 2.23.133.10.1.5 TPM Sealed Data. This is a set of data (up to 128 bytes) which is sealed by the TPM. It usually represents a symmetric key and must be unsealed before use. The ASN.1 binary key form starts of with this OID as the first element of a sequence, giving the binary form a unique recognizable identity marker regardless of encoding. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-04-14lib: Add ASN.1 encoderJames Bottomley1-0/+32
We have a need in the TPM2 trusted keys to return the ASN.1 form of the TPM key blob so it can be operated on by tools outside of the kernel. The specific tools are the openssl_tpm2_engine, openconnect and the Intel tpm2-tss-engine. To do that, we have to be able to read and write the same binary key format the tools use. The current ASN.1 decoder does fine for reading, but we need pieces of an ASN.1 encoder to write the key blob in binary compatible form. For backwards compatibility, the trusted key reader code will still accept the two TPM2B quantities that it uses today, but the writer will only output the ASN.1 form. The current implementation only encodes the ASN.1 bits we actually need. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Reviewed-by: David Howells <dhowells@redhat.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2021-04-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller2-4/+6
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Fix NAT IPv6 offload in the flowtable. 2) icmpv6 is printed as unknown in /proc/net/nf_conntrack. 3) Use div64_u64() in nft_limit, from Eric Dumazet. 4) Use pre_exit to unregister ebtables and arptables hooks, from Florian Westphal. 5) Fix out-of-bound memset in x_tables compat match/target, also from Florian. 6) Clone set elements expression to ensure proper initialization. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-12net: phy: marvell: fix detection of PHY on Topaz switchesPali Rohár1-2/+3
Since commit fee2d546414d ("net: phy: marvell: mv88e6390 temperature sensor reading"), Linux reports the temperature of Topaz hwmon as constant -75°C. This is because switches from the Topaz family (88E6141 / 88E6341) have the address of the temperature sensor register different from Peridot. This address is instead compatible with 88E1510 PHYs, as was used for Topaz before the above mentioned commit. Create a new mapping table between switch family and PHY ID for families which don't have a model number. And define PHY IDs for Topaz and Peridot families. Create a new PHY ID and a new PHY driver for Topaz's internal PHY. The only difference from Peridot's PHY driver is the HWMON probing method. Prior this change Topaz's internal PHY is detected by kernel as: PHY [...] driver [Marvell 88E6390] (irq=63) And afterwards as: PHY [...] driver [Marvell 88E6341 Family] (irq=63) Signed-off-by: Pali Rohár <pali@kernel.org> BugLink: https://github.com/globalscaletechnologies/linux/issues/1 Fixes: fee2d546414d ("net: phy: marvell: mv88e6390 temperature sensor reading") Reviewed-by: Marek Behún <kabel@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-12dmaengine: idxd: fix delta_rec and crc size field for completion recordDave Jiang1-2/+2
The delta_rec_size and crc_val in the completion record should be 32bits and not 16bits. Fixes: bfe1d56091c1 ("dmaengine: idxd: Init and probe for Intel data accelerators") Reported-by: Nikhil Rao <nikhil.rao@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/161645618572.2003490.14466173451736323035.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Vinod Koul <vkoul@kernel.org>
2021-04-10netfilter: arp_tables: add pre_exit hook for table unregisterFlorian Westphal1-2/+3
Same problem that also existed in iptables/ip(6)tables, when arptable_filter is removed there is no longer a wait period before the table/ruleset is free'd. Unregister the hook in pre_exit, then remove the table in the exit function. This used to work correctly because the old nf_hook_unregister API did unconditional synchronize_net. The per-net hook unregister function uses call_rcu instead. Fixes: b9e69e127397 ("netfilter: xtables: don't hook tables by default") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-04-10netfilter: bridge: add pre_exit hooks for ebtable unregistrationFlorian Westphal1-2/+3
Just like ip/ip6/arptables, the hooks have to be removed, then synchronize_rcu() has to be called to make sure no more packets are being processed before the ruleset data is released. Place the hook unregistration in the pre_exit hook, then call the new ebtables pre_exit function from there. Years ago, when first netns support got added for netfilter+ebtables, this used an older (now removed) netfilter hook unregister API, that did a unconditional synchronize_rcu(). Now that all is done with call_rcu, ebtable_{filter,nat,broute} pernet exit handlers may free the ebtable ruleset while packets are still in flight. This can only happens on module removal, not during netns exit. The new function expects the table name, not the table struct. This is because upcoming patch set (targeting -next) will remove all net->xt.{nat,filter,broute}_table instances, this makes it necessary to avoid external references to those member variables. The existing APIs will be converted, so follow the upcoming scheme of passing name + hook type instead. Fixes: aee12a0a3727e ("ebtables: remove nf_hook_register usage") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-04-09Merge branch 'akpm' (patches from Andrew)Linus Torvalds3-3/+3
Merge misc fixes from Andrew Morton: "14 patches. Subsystems affected by this patch series: mm (kasan, gup, pagecache, and kfence), MAINTAINERS, mailmap, nds32, gcov, ocfs2, ia64, and lib" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: lib: fix kconfig dependency on ARCH_WANT_FRAME_POINTERS kfence, x86: fix preemptible warning on KPTI-enabled systems lib/test_kasan_module.c: suppress unused var warning kasan: fix conflict with page poisoning fs: direct-io: fix missing sdio->boundary ia64: fix user_stack_pointer() for ptrace() ocfs2: fix deadlock between setattr and dio_end_io_write gcov: re-fix clang-11+ support nds32: flush_dcache_page: use page_mapping_file to avoid races with swapoff mm/gup: check page posion status for coredump. .mailmap: fix old email addresses mailmap: update email address for Jordan Crouse treewide: change my e-mail address, fix my name MAINTAINERS: update CZ.NIC's Turris information
2021-04-09Merge tag 'net-5.12-rc7' of ↵Linus Torvalds15-68/+170
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Networking fixes for 5.12-rc7, including fixes from can, ipsec, mac80211, wireless, and bpf trees. No scary regressions here or in the works, but small fixes for 5.12 changes keep coming. Current release - regressions: - virtio: do not pull payload in skb->head - virtio: ensure mac header is set in virtio_net_hdr_to_skb() - Revert "net: correct sk_acceptq_is_full()" - mptcp: revert "mptcp: provide subflow aware release function" - ethernet: lan743x: fix ethernet frame cutoff issue - dsa: fix type was not set for devlink port - ethtool: remove link_mode param and derive link params from driver - sched: htb: fix null pointer dereference on a null new_q - wireless: iwlwifi: Fix softirq/hardirq disabling in iwl_pcie_enqueue_hcmd() - wireless: iwlwifi: fw: fix notification wait locking - wireless: brcmfmac: p2p: Fix deadlock introduced by avoiding the rtnl dependency Current release - new code bugs: - napi: fix hangup on napi_disable for threaded napi - bpf: take module reference for trampoline in module - wireless: mt76: mt7921: fix airtime reporting and related tx hangs - wireless: iwlwifi: mvm: rfi: don't lock mvm->mutex when sending config command Previous releases - regressions: - rfkill: revert back to old userspace API by default - nfc: fix infinite loop, refcount & memory leaks in LLCP sockets - let skb_orphan_partial wake-up waiters - xfrm/compat: Cleanup WARN()s that can be user-triggered - vxlan, geneve: do not modify the shared tunnel info when PMTU triggers an ICMP reply - can: fix msg_namelen values depending on CAN_REQUIRED_SIZE - can: uapi: mark union inside struct can_frame packed - sched: cls: fix action overwrite reference counting - sched: cls: fix err handler in tcf_action_init() - ethernet: mlxsw: fix ECN marking in tunnel decapsulation - ethernet: nfp: Fix a use after free in nfp_bpf_ctrl_msg_rx - ethernet: i40e: fix receiving of single packets in xsk zero-copy mode - ethernet: cxgb4: avoid collecting SGE_QBASE regs during traffic Previous releases - always broken: - bpf: Refuse non-O_RDWR flags in BPF_OBJ_GET - bpf: Refcount task stack in bpf_get_task_stack - bpf, x86: Validate computation of branch displacements - ieee802154: fix many similar syzbot-found bugs - fix NULL dereferences in netlink attribute handling - reject unsupported operations on monitor interfaces - fix error handling in llsec_key_alloc() - xfrm: make ipv4 pmtu check honor ip header df - xfrm: make hash generation lock per network namespace - xfrm: esp: delete NETIF_F_SCTP_CRC bit from features for esp offload - ethtool: fix incorrect datatype in set_eee ops - xdp: fix xdp_return_frame() kernel BUG throw for page_pool memory model - openvswitch: fix send of uninitialized stack memory in ct limit reply Misc: - udp: add get handling for UDP_GRO sockopt" * tag 'net-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (182 commits) net: fix hangup on napi_disable for threaded napi net: hns3: Trivial spell fix in hns3 driver lan743x: fix ethernet frame cutoff issue net: ipv6: check for validity before dereferencing cfg->fc_nlinfo.nlh net: dsa: lantiq_gswip: Configure all remaining GSWIP_MII_CFG bits net: dsa: lantiq_gswip: Don't use PHY auto polling net: sched: sch_teql: fix null-pointer dereference ipv6: report errors for iftoken via netlink extack net: sched: fix err handler in tcf_action_init() net: sched: fix action overwrite reference counting Revert "net: sched: bump refcount for new action in ACT replace mode" ice: fix memory leak of aRFS after resuming from suspend i40e: Fix sparse warning: missing error code 'err' i40e: Fix sparse error: 'vsi->netdev' could be null i40e: Fix sparse error: uninitialized symbol 'ring' i40e: Fix sparse errors in i40e_txrx.c i40e: Fix parameters in aq_get_phy_register() nl80211: fix beacon head validation bpf, x86: Validate computation of branch displacements for x86-32 bpf, x86: Validate computation of branch displacements for x86-64 ...
2021-04-09treewide: change my e-mail address, fix my nameMarek Behún3-3/+3
Change my e-mail address to kabel@kernel.org, and fix my name in non-code parts (add diacritical mark). Link: https://lkml.kernel.org/r/20210325171123.28093-2-kabel@kernel.org Signed-off-by: Marek Behún <kabel@kernel.org> Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jassi Brar <jassisinghbrar@gmail.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-08libnvdimm: Notify disk drivers to revalidate region read-onlyDan Williams1-0/+1
Previous kernels allowed the BLKROSET to override the disk's read-only status. With that situation fixed the pmem driver needs to rely on notification events to reevaluate the disk read-only status after the host region has been marked read-write. Recall that when libnvdimm determines that the persistent memory has lost persistence (for example lack of energy to flush from DRAM to FLASH on an NVDIMM-N device) it marks the region read-only, but that state can be overridden by the user via: echo 0 > /sys/bus/nd/devices/regionX/read_only ...to date there is no notification that the region has restored persistence, so the user override is the only recovery. Fixes: 52f019d43c22 ("block: add a hard-readonly flag to struct gendisk") Reported-by: kernel test robot <lkp@intel.com> Reported-by: Vishal Verma <vishal.l.verma@intel.com> Tested-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Ming Lei <ming.lei@redhat.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/161534060720.528671.2341213328968989192.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2021-04-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller1-1/+6
Daniel Borkmann says: ==================== pull-request: bpf 2021-04-08 The following pull-request contains BPF updates for your *net* tree. We've added 4 non-merge commits during the last 2 day(s) which contain a total of 4 files changed, 31 insertions(+), 10 deletions(-). The main changes are: 1) Validate and reject invalid JIT branch displacements, from Piotr Krysiuk. 2) Fix incorrect unhash restore as well as fwd_alloc memory accounting in sock map, from John Fastabend. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-08Merge tag 'mac80211-for-net-2021-04-08.2' of ↵David S. Miller1-12/+68
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes berg says: ==================== Various small fixes: * S1G beacon validation * potential leak in nl80211 * fast-RX confusion with 4-addr mode * erroneous WARN_ON that userspace can trigger * wrong time units in virt_wifi * rfkill userspace API breakage * TXQ AC confusing that led to traffic stopped forever * connection monitoring time after/before confusion * netlink beacon head validation buffer overrun ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-08ipv6: report errors for iftoken via netlink extackStephen Hemminger1-2/+2
Setting iftoken can fail for several different reasons but there and there was no report to user as to the cause. Add netlink extended errors to the processing of the request. This requires adding additional argument through rtnl_af_ops set_link_af callback. Reported-by: Hongren Zheng <li@zenithal.me> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-08net: sched: fix err handler in tcf_action_init()Vlad Buslov1-6/+1
With recent changes that separated action module load from action initialization tcf_action_init() function error handling code was modified to manually release the loaded modules if loading/initialization of any further action in same batch failed. For the case when all modules successfully loaded and some of the actions were initialized before one of them failed in init handler. In this case for all previous actions the module will be released twice by the error handler: First time by the loop that manually calls module_put() for all ops, and second time by the action destroy code that puts the module after destroying the action. Reproduction: $ sudo tc actions add action simple sdata \"2\" index 2 $ sudo tc actions add action simple sdata \"1\" index 1 \ action simple sdata \"2\" index 2 RTNETLINK answers: File exists We have an error talking to the kernel $ sudo tc actions ls action simple total acts 1 action order 0: Simple <"2"> index 2 ref 1 bind 0 $ sudo tc actions flush action simple $ sudo tc actions ls action simple $ sudo tc actions add action simple sdata \"2\" index 2 Error: Failed to load TC action module. We have an error talking to the kernel $ lsmod | grep simple act_simple 20480 -1 Fix the issue by modifying module reference counting handling in action initialization code: - Get module reference in tcf_idr_create() and put it in tcf_idr_release() instead of taking over the reference held by the caller. - Modify users of tcf_action_init_1() to always release the module reference which they obtain before calling init function instead of assuming that created action takes over the reference. - Finally, modify tcf_action_init_1() to not release the module reference when overwriting existing action as this is no longer necessary since both upper and lower layers obtain and manage their own module references independently. Fixes: d349f9976868 ("net_sched: fix RTNL deadlock again caused by request_module()") Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-08net: sched: fix action overwrite reference countingVlad Buslov1-2/+3
Action init code increments reference counter when it changes an action. This is the desired behavior for cls API which needs to obtain action reference for every classifier that points to action. However, act API just needs to change the action and releases the reference before returning. This sequence breaks when the requested action doesn't exist, which causes act API init code to create new action with specified index, but action is still released before returning and is deleted (unless it was referenced concurrently by cls API). Reproduction: $ sudo tc actions ls action gact $ sudo tc actions change action gact drop index 1 $ sudo tc actions ls action gact Extend tcf_action_init() to accept 'init_res' array and initialize it with action->ops->init() result. In tcf_action_add() remove pointers to created actions from actions array before passing it to tcf_action_put_many(). Fixes: cae422f379f3 ("net: sched: use reference counting action init") Reported-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-08rfkill: revert back to old userspace API by defaultJohannes Berg1-12/+68
Recompiling with the new extended version of struct rfkill_event broke systemd in *two* ways: - It used "sizeof(struct rfkill_event)" to read the event, but then complained if it actually got something != 8, this broke it on new kernels (that include the updated API); - It used sizeof(struct rfkill_event) to write a command, but didn't implement the intended expansion protocol where the kernel returns only how many bytes it accepted, and errored out due to the unexpected smaller size on kernels that didn't include the updated API. Even though systemd has now been fixed, that fix may not be always deployed, and other applications could potentially have similar issues. As such, in the interest of avoiding regressions, revert the default API "struct rfkill_event" back to the original size. Instead, add a new "struct rfkill_event_ext" that extends it by the new field, and even more clearly document that applications should be prepared for extensions in two ways: * write might only accept fewer bytes on older kernels, and will return how many to let userspace know which data may have been ignored; * read might return anything between 8 (the original size) and whatever size the application sized its buffer at, indicating how much event data was supported by the kernel. Perhaps that will help avoid such issues in the future and we won't have to come up with another version of the struct if we ever need to extend it again. Applications that want to take advantage of the new field will have to be modified to use struct rfkill_event_ext instead now, which comes with the danger of them having already been updated to use it from 'struct rfkill_event', but I found no evidence of that, and it's still relatively new. Cc: stable@vger.kernel.org # 5.11 Reported-by: Takashi Iwai <tiwai@suse.de> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v12.0.0-r4 (x86-64) Link: https://lore.kernel.org/r/20210319232510.f1a139cfdd9c.Ic5c7c9d1d28972059e132ea653a21a427c326678@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-04-07ethtool: Remove link_mode param and derive link params from driverDanielle Ratson1-1/+8
Some drivers clear the 'ethtool_link_ksettings' struct in their get_link_ksettings() callback, before populating it with actual values. Such drivers will set the new 'link_mode' field to zero, resulting in user space receiving wrong link mode information given that zero is a valid value for the field. Another problem is that some drivers (notably tun) can report random values in the 'link_mode' field. This can result in a general protection fault when the field is used as an index to the 'link_mode_params' array [1]. This happens because such drivers implement their set_link_ksettings() callback by simply overwriting their private copy of 'ethtool_link_ksettings' struct with the one they get from the stack, which is not always properly initialized. Fix these problems by removing 'link_mode' from 'ethtool_link_ksettings' and instead have drivers call ethtool_params_from_link_mode() with the current link mode. The function will derive the link parameters (e.g., speed) from the link mode and fill them in the 'ethtool_link_ksettings' struct. v3: * Remove link_mode parameter and derive the link parameters in the driver instead of passing link_mode parameter to ethtool and derive it there. v2: * Introduce 'cap_link_mode_supported' instead of adding a validity field to 'ethtool_link_ksettings' struct. [1] general protection fault, probably for non-canonical address 0xdffffc00f14cc32c: 0000 [#1] PREEMPT SMP KASAN KASAN: probably user-memory-access in range [0x000000078a661960-0x000000078a661967] CPU: 0 PID: 8452 Comm: syz-executor360 Not tainted 5.11.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__ethtool_get_link_ksettings+0x1a3/0x3a0 net/ethtool/ioctl.c:446 Code: b7 3e fa 83 fd ff 0f 84 30 01 00 00 e8 16 b0 3e fa 48 8d 3c ed 60 d5 69 8a 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 14 02 48 89 f8 83 e0 07 83 c0 03 +38 d0 7c 08 84 d2 0f 85 b9 RSP: 0018:ffffc900019df7a0 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff888026136008 RCX: 0000000000000000 RDX: 00000000f14cc32c RSI: ffffffff873439ca RDI: 000000078a661960 RBP: 00000000ffff8880 R08: 00000000ffffffff R09: ffff88802613606f R10: ffffffff873439bc R11: 0000000000000000 R12: 0000000000000000 R13: ffff88802613606c R14: ffff888011d0c210 R15: ffff888011d0c210 FS: 0000000000749300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000004b60f0 CR3: 00000000185c2000 CR4: 00000000001506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: linkinfo_prepare_data+0xfd/0x280 net/ethtool/linkinfo.c:37 ethnl_default_notify+0x1dc/0x630 net/ethtool/netlink.c:586 ethtool_notify+0xbd/0x1f0 net/ethtool/netlink.c:656 ethtool_set_link_ksettings+0x277/0x330 net/ethtool/ioctl.c:620 dev_ethtool+0x2b35/0x45d0 net/ethtool/ioctl.c:2842 dev_ioctl+0x463/0xb70 net/core/dev_ioctl.c:440 sock_do_ioctl+0x148/0x2d0 net/socket.c:1060 sock_ioctl+0x477/0x6a0 net/socket.c:1177 vfs_ioctl fs/ioctl.c:48 [inline] __do_sys_ioctl fs/ioctl.c:753 [inline] __se_sys_ioctl fs/ioctl.c:739 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: c8907043c6ac9 ("ethtool: Get link mode in use instead of speed and duplex parameters") Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-07Merge tag 'mlx5-fixes-2021-04-06' of ↵David S. Miller1-4/+6
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2021-04-06 This series provides some fixes to mlx5 driver. Please pull and let me know if there is any problem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-07ethtool: fix kdoc in headersJakub Kicinski2-2/+13
Fix remaining issues with kdoc in the ethtool headers. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-07ethtool: document reserved fields in the uAPIJakub Kicinski1-1/+21
Add a note on expected handling of reserved fields, and references to all kdocs. This fixes a bunch of kdoc warnings. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-07ethtool: un-kdocify extended link stateJakub Kicinski2-23/+7
Extended link state structures and enums use kdoc headers but then do not describe any of the members. Convert to normal comments. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-06net/mlx5: Fix PBMC register mappingAya Levin1-1/+1
Add reserved mapping to cover all the register in order to avoid setting arbitrary values to newer FW which implements the reserved fields. Fixes: 50b4a3c23646 ("net/mlx5: PPTB and PBMC register firmware command support") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06net/mlx5: Fix PPLM register mappingAya Levin1-0/+2
Add reserved mapping to cover all the register in order to avoid setting arbitrary values to newer FW which implements the reserved fields. Fixes: a58837f52d43 ("net/mlx5e: Expose FEC feilds and related capability bit") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-06net/mlx5: Fix placement of log_max_flow_counterRaed Salem1-3/+3
The cited commit wrongly placed log_max_flow_counter field of mlx5_ifc_flow_table_prop_layout_bits, align it to the HW spec intended placement. Fixes: 16f1c5bb3ed7 ("net/mlx5: Check device capability for maximum flow counters") Signed-off-by: Raed Salem <raeds@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-04-07bpf, sockmap: Fix sk->prot unhash op resetJohn Fastabend1-1/+6
In '4da6a196f93b1' we fixed a potential unhash loop caused when a TLS socket in a sockmap was removed from the sockmap. This happened because the unhash operation on the TLS ctx continued to point at the sockmap implementation of unhash even though the psock has already been removed. The sockmap unhash handler when a psock is removed does the following, void sock_map_unhash(struct sock *sk) { void (*saved_unhash)(struct sock *sk); struct sk_psock *psock; rcu_read_lock(); psock = sk_psock(sk); if (unlikely(!psock)) { rcu_read_unlock(); if (sk->sk_prot->unhash) sk->sk_prot->unhash(sk); return; } [...] } The unlikely() case is there to handle the case where psock is detached but the proto ops have not been updated yet. But, in the above case with TLS and removed psock we never fixed sk_prot->unhash() and unhash() points back to sock_map_unhash resulting in a loop. To fix this we added this bit of code, static inline void sk_psock_restore_proto(struct sock *sk, struct sk_psock *psock) { sk->sk_prot->unhash = psock->saved_unhash; This will set the sk_prot->unhash back to its saved value. This is the correct callback for a TLS socket that has been removed from the sock_map. Unfortunately, this also overwrites the unhash pointer for all psocks. We effectively break sockmap unhash handling for any future socks. Omitting the unhash operation will leave stale entries in the map if a socket transition through unhash, but does not do close() op. To fix set unhash correctly before calling into tls_update. This way the TLS enabled socket will point to the saved unhash() handler. Fixes: 4da6a196f93b1 ("bpf: Sockmap/tls, during free we may call tcp_bpf_unhash() in loop") Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Reported-by: Lorenz Bauer <lmb@cloudflare.com> Suggested-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/161731441904.68884.15593917809745631972.stgit@john-XPS-13-9370
2021-04-06virtio_net: Do not pull payload in skb->headEric Dumazet1-5/+9
Xuan Zhuo reported that commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs") brought a ~10% performance drop. The reason for the performance drop was that GRO was forced to chain sk_buff (using skb_shinfo(skb)->frag_list), which uses more memory but also cause packet consumers to go over a lot of overhead handling all the tiny skbs. It turns out that virtio_net page_to_skb() has a wrong strategy : It allocates skbs with GOOD_COPY_LEN (128) bytes in skb->head, then copies 128 bytes from the page, before feeding the packet to GRO stack. This was suboptimal before commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs") because GRO was using 2 frags per MSS, meaning we were not packing MSS with 100% efficiency. Fix is to pull only the ethernet header in page_to_skb() Then, we change virtio_net_hdr_to_skb() to pull the missing headers, instead of assuming they were already pulled by callers. This fixes the performance regression, but could also allow virtio_net to accept packets with more than 128bytes of headers. Many thanks to Xuan Zhuo for his report, and his tests/help. Fixes: 3226b158e67c ("net: avoid 32 x truesize under-estimation for tiny skbs") Reported-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://www.spinics.net/lists/netdev/msg731397.html Co-Developed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: virtualization@lists.linux-foundation.org Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-03Merge tag 'char-misc-5.12-rc6' of ↵Linus Torvalds2-1/+24
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are a few small driver char/misc changes for 5.12-rc6. Nothing major here, a few fixes for reported issues: - interconnect fixes for problems found - fbcon syzbot-found fix - extcon fixes - firmware stratix10 bugfix - MAINTAINERS file update. All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: drivers: video: fbcon: fix NULL dereference in fbcon_cursor() mei: allow map and unmap of client dma buffer only for disconnected client MAINTAINERS: Add linux-phy list and patchwork interconnect: Fix kerneldoc warning firmware: stratix10-svc: reset COMMAND_RECONFIG_FLAG_PARTIAL to 0 extcon: Fix error handling in extcon_dev_register extcon: Add stubs for extcon_register_notifier_all() functions interconnect: core: fix error return code of icc_link_destroy() interconnect: qcom: msm8939: remove rpm-ids from non-RPM nodes
2021-04-03Merge tag 'tty-5.12-rc6' of ↵Linus Torvalds1-2/+0
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull serial driver fix from Greg KH: "Here is a single serial driver fix for 5.12-rc6. Is is a revert of a change that showed up in 5.9 that has been reported to cause problems. It has been in linux-next for a while with no reported issues" * tag 'tty-5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: soc: qcom-geni-se: Cleanup the code to remove proxy votes
2021-04-03Merge tag 'scsi-fixes' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fix from James Bottomley: "A single fix to iscsi for a rare race condition which can cause a kernel panic" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: iscsi: Fix race condition between login and sync thread
2021-04-02Merge tag 'block-5.12-2021-04-02' of git://git.kernel.dk/linux-blockLinus Torvalds2-28/+2
Pull block fixes from Jens Axboe: - Remove comment that never came to fruition in 22 years of development (Christoph) - Remove unused request flag (Christoph) - Fix for null_blk fake timeout handling (Damien) - Fix for IOCB_NOWAIT being ignored for O_DIRECT on raw bdevs (Pavel) - Error propagation fix for multiple split bios (Yufen) * tag 'block-5.12-2021-04-02' of git://git.kernel.dk/linux-block: block: remove the unused RQF_ALLOCED flag block: update a few comments in uapi/linux/blkpg.h block: don't ignore REQ_NOWAIT for direct IO null_blk: fix command timeout completion handling block: only update parent bi_status when bio fail
2021-04-02Merge tag 'acpi-5.12-rc6' of ↵Linus Torvalds1-1/+8
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "These fix an ACPI tables management issue, an issue related to the ACPI enumeration of devices and CPU wakeup in the ACPI processor driver. Specifics: - Ensure that the memory occupied by ACPI tables on x86 will always be reserved to prevent it from being allocated for other purposes which was possible in some cases (Rafael Wysocki). - Fix the ACPI device enumeration code to prevent it from attempting to evaluate the _STA control method for devices with unmet dependencies which is likely to fail (Hans de Goede). - Fix the handling of CPU0 wakeup in the ACPI processor driver to prevent CPU0 online failures from occurring (Vitaly Kuznetsov)" * tag 'acpi-5.12-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: processor: Fix CPU0 wakeup in acpi_idle_play_dead() ACPI: scan: Fix _STA getting called on devices with unmet dependencies ACPI: tables: x86: Reserve memory occupied by ACPI tables
2021-04-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller1-0/+2
Alexei Starovoitov says: ==================== pull-request: bpf 2021-04-01 The following pull-request contains BPF updates for your *net* tree. We've added 11 non-merge commits during the last 8 day(s) which contain a total of 10 files changed, 151 insertions(+), 26 deletions(-). The main changes are: 1) xsk creation fixes, from Ciara. 2) bpf_get_task_stack fix, from Dave. 3) trampoline in modules fix, from Jiri. 4) bpf_obj_get fix for links and progs, from Lorenz. 5) struct_ops progs must be gpl compatible fix, from Toke. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-02block: remove the unused RQF_ALLOCED flagChristoph Hellwig1-2/+0
Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-04-02block: update a few comments in uapi/linux/blkpg.hChristoph Hellwig1-26/+2
The big top of the file comment talk about grand plans that never happened, so remove them to not confuse the readers. Also mark the devname and volname fields as ignored as they were never used by the kernel. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>