aboutsummaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2023-12-15dma-mapping: don't store redundant offsetsRobin Murphy1-7/+12
A bus_dma_region necessarily stores both CPU and DMA base addresses for a range, so there's no need to also store the difference between them. Signed-off-by: Robin Murphy <[email protected]> Acked-by: Rob Herring <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2023-12-15virtio/vsock: send credit update during setting SO_RCVLOWATArseniy Krasnov1-0/+1
Send credit update message when SO_RCVLOWAT is updated and it is bigger than number of bytes in rx queue. It is needed, because 'poll()' will wait until number of bytes in rx queue will be not smaller than O_RCVLOWAT, so kick sender to send more data. Otherwise mutual hungup for tx/rx is possible: sender waits for free space and receiver is waiting data in 'poll()'. Rename 'set_rcvlowat' callback to 'notify_set_rcvlowat' and set 'sk->sk_rcvlowat' only in one place (i.e. 'vsock_set_rcvlowat'), so the transport doesn't need to do it. Fixes: b89d882dc9fc ("vsock/virtio: reduce credit update messages") Signed-off-by: Arseniy Krasnov <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-15Merge tag 'mlx5-updates-2023-12-13' of ↵David S. Miller2-6/+42
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2023-12-13 Preparation for mlx5e socket direct feature. Socket direct will allow multiple PF devices attached to different NUMA nodes but sharing the same physical port. The following series is a small refactoring series in preparation to support socket direct in the following submission. Highlights: - Define required device registers and bits related to socket direct - Flow steering re-arrangements - Generalize TX objects (TISs) and store them in a common object, will be useful in the next series for per function object management. - Decouple raw CQ objects from their parent netdev priv - Prepare devcom for Socket Direct device group discovery. Please see the individual patches for more information. ==================== Signed-off-by: David S. Miller <[email protected]>
2023-12-15bus: mhi: ep: Add support for async DMA write operationManivannan Sadhasivam1-0/+4
In order to optimize the data transfer, let's use the async DMA operation for writing (queuing) data to the host. In the async path, the completion event for the transfer ring will only be sent to the host when the controller driver notifies the MHI stack of the actual transfer completion using the callback (mhi_ep_skb_completion) supplied in "struct mhi_ep_buf_info". Also to accommodate the async operation, the transfer ring read offset (ring->rd_offset) is cached in the "struct mhi_ep_chan" and updated locally to let the stack queue further ring items to the controller driver. But the actual read offset of the transfer ring will only be updated in the completion callback. Signed-off-by: Manivannan Sadhasivam <[email protected]>
2023-12-14Merge tag 'wireless-2023-12-14' of ↵Jakub Kicinski1-1/+2
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== * add (and fix) certificate for regdb handover to Chen-Yu Tsai * fix rfkill GPIO handling * a few driver (iwlwifi, mt76) crash fixes * logic fixes in the stack * tag 'wireless-2023-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: cfg80211: fix certs build to not depend on file order wifi: mt76: fix crash with WED rx support enabled wifi: iwlwifi: pcie: avoid a NULL pointer dereference wifi: mac80211: mesh_plink: fix matches_local logic wifi: mac80211: mesh: check element parsing succeeded wifi: mac80211: check defragmentation succeeded wifi: mac80211: don't re-add debugfs during reconfig net: rfkill: gpio: set GPIO direction wifi: mac80211: check if the existing link config remains unchanged wifi: cfg80211: Add my certificate wifi: iwlwifi: pcie: add another missing bh-disable for rxq->lock wifi: ieee80211: don't require protected vendor action frames ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-12-14net: skbuff: fix spelling errorsRandy Dunlap1-3/+3
Correct spelling as reported by codespell. Signed-off-by: Randy Dunlap <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-12-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski6-6/+16
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/intel/iavf/iavf_ethtool.c 3a0b5a2929fd ("iavf: Introduce new state machines for flow director") 95260816b489 ("iavf: use iavf_schedule_aq_request() helper") https://lore.kernel.org/all/[email protected]/ drivers/net/ethernet/broadcom/bnxt/bnxt.c c13e268c0768 ("bnxt_en: Fix HWTSTAMP_FILTER_ALL packet timestamp logic") c2f8063309da ("bnxt_en: Refactor RX VLAN acceleration logic.") a7445d69809f ("bnxt_en: Add support for new RX and TPA_START completion types for P7") 1c7fd6ee2fe4 ("bnxt_en: Rename some macros for the P5 chips") https://lore.kernel.org/all/[email protected]/ drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c bd6781c18cb5 ("bnxt_en: Fix wrong return value check in bnxt_close_nic()") 84793a499578 ("bnxt_en: Skip nic close/open when configuring tstamp filters") https://lore.kernel.org/all/[email protected]/ drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c 3d7a3f2612d7 ("net/mlx5: Nack sync reset request when HotPlug is enabled") cecf44ea1a1f ("net/mlx5: Allow sync reset flow when BF MGT interface device is present") https://lore.kernel.org/all/[email protected]/ No adjacent changes. Signed-off-by: Jakub Kicinski <[email protected]>
2023-12-14Merge tag 'net-6.7-rc6' of ↵Linus Torvalds1-1/+8
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Current release - regressions: - tcp: fix tcp_disordered_ack() vs usec TS resolution Current release - new code bugs: - dpll: sanitize possible null pointer dereference in dpll_pin_parent_pin_set() - eth: octeon_ep: initialise control mbox tasks before using APIs Previous releases - regressions: - io_uring/af_unix: disable sending io_uring over sockets - eth: mlx5e: - TC, don't offload post action rule if not supported - fix possible deadlock on mlx5e_tx_timeout_work - eth: iavf: fix iavf_shutdown to call iavf_remove instead iavf_close - eth: bnxt_en: fix skb recycling logic in bnxt_deliver_skb() - eth: ena: fix DMA syncing in XDP path when SWIOTLB is on - eth: team: fix use-after-free when an option instance allocation fails Previous releases - always broken: - neighbour: don't let neigh_forced_gc() disable preemption for long - net: prevent mss overflow in skb_segment() - ipv6: support reporting otherwise unknown prefix flags in RTM_NEWPREFIX - tcp: remove acked SYN flag from packet in the transmit queue correctly - eth: octeontx2-af: - fix a use-after-free in rvu_nix_register_reporters - fix promisc mcam entry action - eth: dwmac-loongson: make sure MDIO is initialized before use - eth: atlantic: fix double free in ring reinit logic" * tag 'net-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (62 commits) net: atlantic: fix double free in ring reinit logic appletalk: Fix Use-After-Free in atalk_ioctl net: stmmac: Handle disabled MDIO busses from devicetree net: stmmac: dwmac-qcom-ethqos: Fix drops in 10M SGMII RX dpaa2-switch: do not ask for MDB, VLAN and FDB replay dpaa2-switch: fix size of the dma_unmap net: prevent mss overflow in skb_segment() vsock/virtio: Fix unsigned integer wrap around in virtio_transport_has_space() Revert "tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is set" MIPS: dts: loongson: drop incorrect dwmac fallback compatible stmmac: dwmac-loongson: drop useless check for compatible fallback stmmac: dwmac-loongson: Make sure MDIO is initialized before use tcp: disable tcp_autocorking for socket when TCP_NODELAY flag is set dpll: sanitize possible null pointer dereference in dpll_pin_parent_pin_set() net: ena: Fix XDP redirection error net: ena: Fix DMA syncing in XDP path when SWIOTLB is on net: ena: Fix xdp drops handling due to multibuf packets net: ena: Destroy correct number of xdp queues upon failure net: Remove acked SYN flag from packet in the transmit queue correctly qed: Fix a potential use-after-free in qed_cxt_tables_alloc ...
2023-12-14add listmount(2) syscallMiklos Szeredi1-0/+3
Add way to query the children of a particular mount. This is a more flexible way to iterate the mount tree than having to parse /proc/self/mountinfo. Lookup the mount by the new 64bit mount ID. If a mount needs to be queried based on path, then statx(2) can be used to first query the mount ID belonging to the path. Return an array of new (64bit) mount ID's. Without privileges only mounts are listed which are reachable from the task's root. Folded into this patch are several later improvements. Keeping them separate would make the history pointlessly confusing: * Recursive listing of mounts is the default now (cf. [1]). * Remove explicit LISTMOUNT_UNREACHABLE flag (cf. [1]) and fail if mount is unreachable from current root. This also makes permission checking consistent with statmount() (cf. [3]). * Start listing mounts in unique mount ID order (cf. [2]) to allow continuing listmount() from a midpoint. * Allow to continue listmount(). The @request_mask parameter is renamed and to @param to be usable by both statmount() and listmount(). If @param is set to a mount id then listmount() will continue listing mounts from that id on. This allows listing mounts in multiple listmount invocations without having to resize the buffer. If @param is zero then the listing starts from the beginning (cf. [4]). * Don't return EOVERFLOW, instead return the buffer size which allows to detect a full buffer as well (cf. [4]). Signed-off-by: Miklos Szeredi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Ian Kent <[email protected]> Link: https://lore.kernel.org/r/[email protected] [1] (folded) Link: https://lore.kernel.org/r/[email protected] [2] (folded) Link: https://lore.kernel.org/r/[email protected] [3] (folded) Link: https://lore.kernel.org/r/[email protected] [4] (folded) [Christian Brauner <[email protected]>: various smaller fixes] Signed-off-by: Christian Brauner <[email protected]>
2023-12-14Merge tag 'platform-drivers-x86-amd-wbrf-v6.8-1' into review-hansHans de Goede1-0/+91
Immutable branch between pdx86 amd wbrf branch and wifi / amdgpu due for the v6.8 merge window platform-drivers-x86-amd-wbrf-v6.8-1: v6.7-rc1 + AMD WBRF support for merging into the wifi subsys and amdgpu driver for 6.8.
2023-12-14bus: mhi: ep: Introduce async read/write callbacksManivannan Sadhasivam1-0/+9
These callbacks can be implemented by the controller drivers to perform async read/write operation that increases the throughput. For aiding the async operation, a completion callback is also introduced. Signed-off-by: Manivannan Sadhasivam <[email protected]>
2023-12-14bus: mhi: ep: Rename read_from_host() and write_to_host() APIsManivannan Sadhasivam1-4/+4
In the preparation for adding async API support, let's rename the existing APIs to read_sync() and write_sync() to make it explicit that these APIs are used for synchronous read/write. Signed-off-by: Manivannan Sadhasivam <[email protected]>
2023-12-14bus: mhi: ep: Pass mhi_ep_buf_info struct to read/write APIsManivannan Sadhasivam1-2/+14
In the preparation of DMA async support, let's pass the parameters to read_from_host() and write_to_host() APIs using mhi_ep_buf_info structure. No functional change. Signed-off-by: Manivannan Sadhasivam <[email protected]>
2023-12-14bus: mhi: ep: Use slab allocator where applicableManivannan Sadhasivam1-0/+3
Use slab allocator for allocating the memory for objects used frequently and are of fixed size. This reduces the overheard associated with kmalloc(). Suggested-by: Alex Elder <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]>
2023-12-13iavf: enable symmetric-xor RSS for Toeplitz hash functionAhmed Zaki1-0/+19
Allow the user to set the symmetric Toeplitz hash function via: # ethtool -X eth0 hfunc toeplitz symmetric-xor The driver will reject any new RSS configuration if a field other than (IP src/dst and L4 src/dst ports) is requested for hashing. The symmetric RSS will not be supported on PFs not advertising the ADV RSS Offload flag (ADV_RSS_SUPPORT()), for example the E700 series (i40e). Reviewed-by: Madhu Chittim <[email protected]> Signed-off-by: Ahmed Zaki <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-12-13ice: refactor RSS configurationQi Zhang1-8/+8
Refactor the driver to use a communication data structure for RSS config. To do so we introduce the new ice_rss_hash_cfg struct, and then pass it as an argument to several functions. Also introduce enum ice_rss_cfg_hdr_type to specify a more granular and flexible RSS configuration: ICE_RSS_OUTER_HEADERS - take outer layer as RSS input set ICE_RSS_INNER_HEADERS - take inner layer as RSS input set ICE_RSS_INNER_HEADERS_W_OUTER_IPV4 - take inner layer as RSS input set for packet with outer IPV4 ICE_RSS_INNER_HEADERS_W_OUTER_IPV6 - take inner layer as RSS input set for packet with outer IPV6 ICE_RSS_ANY_HEADERS - try with outer first then inner (same as the behaviour without this change) Finally, move the virtchnl_rss_algorithm enum to be with the other RSS related structures in the virtchnl.h file. There should be no functional change due to this patch. Reviewed-by: Wojciech Drewek <[email protected]> Signed-off-by: Qi Zhang <[email protected]> Co-developed-by: Jesse Brandeburg <[email protected]> Signed-off-by: Jesse Brandeburg <[email protected]> Co-developed-by: Ahmed Zaki <[email protected]> Signed-off-by: Ahmed Zaki <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-12-13net: ethtool: add support for symmetric-xor RSS hashAhmed Zaki1-0/+6
Symmetric RSS hash functions are beneficial in applications that monitor both Tx and Rx packets of the same flow (IDS, software firewalls, ..etc). Getting all traffic of the same flow on the same RX queue results in higher CPU cache efficiency. A NIC that supports "symmetric-xor" can achieve this RSS hash symmetry by XORing the source and destination fields and pass the values to the RSS hash algorithm. The user may request RSS hash symmetry for a specific algorithm, via: # ethtool -X eth0 hfunc <hash_alg> symmetric-xor or turn symmetry off (asymmetric) by: # ethtool -X eth0 hfunc <hash_alg> The specific fields for each flow type should then be specified as usual via: # ethtool -N|-U eth0 rx-flow-hash <flow_type> s|d|f|n Reviewed-by: Wojciech Drewek <[email protected]> Signed-off-by: Ahmed Zaki <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-12-13net: ethtool: get rid of get/set_rxfh_context functionsAhmed Zaki1-15/+11
Add the RSS context parameters to struct ethtool_rxfh_param and use the get/set_rxfh to handle the RSS contexts as well. This is part 2/2 of the fix suggested in [1]: - Add a rss_context member to the argument struct and a capability like cap_link_lanes_supported to indicate whether driver supports rss contexts, then you can remove *et_rxfh_context functions, and instead call *et_rxfh() with a non-zero rss_context. Link: https://lore.kernel.org/netdev/[email protected]/ [1] CC: Jesse Brandeburg <[email protected]> CC: Tony Nguyen <[email protected]> CC: Marcin Wojtas <[email protected]> CC: Russell King <[email protected]> CC: Sunil Goutham <[email protected]> CC: Geetha sowjanya <[email protected]> CC: Subbaraya Sundeep <[email protected]> CC: hariprasad <[email protected]> CC: Saeed Mahameed <[email protected]> CC: Leon Romanovsky <[email protected]> CC: Edward Cree <[email protected]> CC: Martin Habets <[email protected]> Suggested-by: Jakub Kicinski <[email protected]> Signed-off-by: Ahmed Zaki <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-12-13net: ethtool: pass a pointer to parameters to get/set_rxfh ethtool opsAhmed Zaki1-8/+30
The get/set_rxfh ethtool ops currently takes the rxfh (RSS) parameters as direct function arguments. This will force us to change the API (and all drivers' functions) every time some new parameters are added. This is part 1/2 of the fix, as suggested in [1]: - First simplify the code by always providing a pointer to all params (indir, key and func); the fact that some of them may be NULL seems like a weird historic thing or a premature optimization. It will simplify the drivers if all pointers are always present. - Then make the functions take a dev pointer, and a pointer to a single struct wrapping all arguments. The set_* should also take an extack. Link: https://lore.kernel.org/netdev/[email protected]/ [1] Suggested-by: Jakub Kicinski <[email protected]> Suggested-by: Jacob Keller <[email protected]> Signed-off-by: Ahmed Zaki <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-12-14bus: mhi: host: Add a separate timeout parameter for waiting readyQiang Yu1-0/+4
Some devices(eg. SDX75) take longer than expected (default, 8 seconds) to set ready after reboot. Hence add optional ready timeout parameter and pass the appropriate timeout value to mhi_poll_reg_field() to wait enough for device ready as part of power up sequence. Signed-off-by: Qiang Yu <[email protected]> Reviewed-by: Manivannan Sadhasivam <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]>
2023-12-13Input: as5011 - convert to GPIO descriptorLinus Walleij1-1/+0
This driver does not have any in-tree users but is passing a legacy GPIO number through platform data. Convert it to use a GPIO descriptor, new users or outoftree users can easily be implemented using GPIO descriptor tables or software nodes. Signed-off-by: Linus Walleij <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dmitry Torokhov <[email protected]>
2023-12-13Input: omap-keypad - drop optional GPIO supportLinus Walleij1-3/+0
The driver supports passing some GPIO lines for rows and columns through the driver data, but there is no in-kernel user of this. Further the use seems convoluted because the GPIO lines are unused in the driver, then explicitly free:ed when removing it without being requested when probing it, which is assymetric and just a recepie for disaster. Remove the support for these unused GPIOs, if need be support can be reestablished in an organized fashion using GPIO descriptors. Signed-off-by: Linus Walleij <[email protected]> Reviewed-by: Tony Lindgren <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dmitry Torokhov <[email protected]>
2023-12-13Input: navpoint - convert to use GPIO descriptorLinus Walleij1-1/+0
The Navpoint driver uses a GPIO line, convert this to use a GPIO descriptor. There are no in-kernel users but out of tree users can easily be added or converted using a GPIO descriptor table as with numerous other drivers. Signed-off-by: Linus Walleij <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dmitry Torokhov <[email protected]>
2023-12-13page_pool: transition to reference count management after page drainingLiang Chen1-1/+1
To support multiple users referencing the same fragment, 'pp_frag_count' is renamed to 'pp_ref_count', transitioning pp pages from fragment management to reference count management after draining based on the suggestion from [1]. The idea is that the concept of fragmenting exists before the page is drained, and all related functions retain their current names. However, once the page is drained, its management shifts to being governed by 'pp_ref_count'. Therefore, all functions associated with that lifecycle stage of a pp page are renamed. [1] http://lore.kernel.org/netdev/[email protected] Signed-off-by: Liang Chen <[email protected]> Reviewed-by: Yunsheng Lin <[email protected]> Reviewed-by: Ilias Apalodimas <[email protected]> Reviewed-by: Mina Almasry <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-12-13ethtool: add SET for TCP_DATA_SPLIT ringparamAlexander Lobakin1-0/+2
Follow up commit 9690ae604290 ("ethtool: add header/data split indication") and add the set part of Ethtool's header split, i.e. ability to enable/disable header split via the Ethtool Netlink interface. This might be helpful to optimize the setup for particular workloads, for example, to avoid XDP frags, and so on. A driver should advertise ``ETHTOOL_RING_USE_TCP_DATA_SPLIT`` in its ops->supported_ring_params to allow doing that. "Unknown" passed from the userspace when the header split is supported means the driver is free to choose the preferred state. Reviewed-by: Przemek Kitszel <[email protected]> Signed-off-by: Alexander Lobakin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-12-13net/mlx5: Move TISes from priv to mdev HW resourcesTariq Toukan1-0/+2
The transport interface send (TIS) object is responsible for performing all transport related operations of the transmit side. Messages from Send Queues get segmented and transmitted by the TIS including all transport required implications, e.g. in the case of large send offload, the TIS is responsible for the segmentation. These are stateless objects and can be used by multiple netdevs (e.g. representors) who share the same core device. Providing the TISes as a service from the core layer to the netdev layer reduces the number of replecated TIS objects (in case of multiple netdevs), and will ease the transition to netdev with multiple mdevs. Signed-off-by: Tariq Toukan <[email protected]> Reviewed-by: Gal Pressman <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-12-13net/mlx5: Expose Management PCIe Index Register (MPIR)Tariq Toukan2-0/+15
MPIR register allows to query the PCIe indexes and Socket-Direct related parameters. Signed-off-by: Tariq Toukan <[email protected]> Reviewed-by: Gal Pressman <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-12-13net/mlx5: Add mlx5_ifc bits used for supporting single netdev Socket-DirectTariq Toukan1-6/+25
Multiple device caps and features are required to support single netdev Socket-Direct. Add them here in preparation for the feature implementation. Signed-off-by: Tariq Toukan <[email protected]> Reviewed-by: Gal Pressman <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2023-12-13net: phy: c45: add genphy_c45_pma_read_ext_abilities() functionOleksij Rempel1-0/+1
Move part of the genphy_c45_pma_read_abilities() code to a separate function. Some PHYs do not implement PMA/PMD status 2 register (Register 1.8) but do implement PMA/PMD extended ability register (Register 1.11). To make use of it, we need to be able to access this part of code separately. Signed-off-by: Oleksij Rempel <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Reviewed-by: Russell King (Oracle) <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-12-13mlx5: implement VLAN tag XDP hintLarysa Zaremba1-1/+1
Implement the newly added .xmo_rx_vlan_tag() hint function. Reviewed-by: Tariq Toukan <[email protected]> Signed-off-by: Larysa Zaremba <[email protected]> Acked-by: Jesper Dangaard Brouer <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2023-12-13net: make vlan_get_tag() return -ENODATA instead of -EINVALLarysa Zaremba1-2/+2
__vlan_hwaccel_get_tag() is used in veth XDP hints implementation, its return value (-EINVAL if skb is not VLAN tagged) is passed to bpf code, but XDP hints specification requires drivers to return -ENODATA, if a hint cannot be provided for a particular packet. Solve this inconsistency by changing error return value of __vlan_hwaccel_get_tag() from -EINVAL to -ENODATA, do the same thing to __vlan_get_tag(), because this function is supposed to follow the same convention. This, in turn, makes -ENODATA the only non-zero value vlan_get_tag() can return. We can do this with no side effects, because none of the users of the 3 above-mentioned functions rely on the exact value. Suggested-by: Jesper Dangaard Brouer <[email protected]> Acked-by: Stanislav Fomichev <[email protected]> Signed-off-by: Larysa Zaremba <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2023-12-13bpf: Support uid and gid when mounting bpffsJie Jiang1-0/+2
Parse uid and gid in bpf_parse_param() so that they can be passed in as the `data` parameter when mount() bpffs. This will be useful when we want to control which user/group has the control to the mounted bpffs, otherwise a separate chown() call will be needed. Signed-off-by: Jie Jiang <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Mike Frysinger <[email protected]> Acked-by: Christian Brauner <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-12-13Merge tag 'platform-drivers-x86-amd-wbrf-v6.8-1' into amd-drm-nextAlex Deucher1-0/+91
Immutable branch between pdx86 amd wbrf branch and wifi / amdgpu due for the v6.8 merge window platform-drivers-x86-amd-wbrf-v6.8-1: v6.7-rc1 + AMD WBRF support for merging into the wifi subsys and amdgpu driver for 6.8. Signed-off-by: Alex Deucher <[email protected]>
2023-12-14Merge branches 'doc.2023.12.13a', 'torture.2023.11.23a', ↵Neeraj Upadhyay (AMD)3-7/+10
'fixes.2023.12.13a', 'rcu-tasks.2023.12.12b' and 'srcu.2023.12.13a' into rcu-merge.2023.12.13a
2023-12-14rculist.h: docs: Fix wrong function summaryPhilipp Stanner1-1/+1
The brief summary in the docstring for function list_next_or_null_rcu() states that the function is supposed to provide the "first" member of a list, whereas in truth it returns the next member. Change the docstring so it describes what the function actually does. Signed-off-by: Philipp Stanner <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Signed-off-by: Neeraj Upadhyay (AMD) <[email protected]>
2023-12-13srcu: Use try-lock lockdep annotation for NMI-safe access.Sebastian Andrzej Siewior2-1/+7
It is claimed that srcu_read_lock_nmisafe() NMI-safe. However it triggers a lockdep if used from NMI because lockdep expects a deadlock since nothing disables NMIs while the lock is acquired. This is because commit f0f44752f5f61 ("rcu: Annotate SRCU's update-side lockdep dependencies") annotates synchronize_srcu() as a write lock usage. This helps to detect a deadlocks such as srcu_read_lock(); synchronize_srcu(); srcu_read_unlock(); The side effect is that the lock srcu_struct now has a USED usage in normal contexts, so it conflicts with a USED_READ usage in NMI. But this shouldn't cause a real deadlock because the write lock usage from synchronize_srcu() is a fake one and only used for read/write deadlock detection. Use a try-lock annotation for srcu_read_lock_nmisafe() to avoid lockdep complains if used from NMI. Fixes: f0f44752f5f6 ("rcu: Annotate SRCU's update-side lockdep dependencies") Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Boqun Feng <[email protected]> Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Signed-off-by: Neeraj Upadhyay (AMD) <[email protected]>
2023-12-13io_uring/poll: don't enable lazy wake for POLLEXCLUSIVEJens Axboe1-0/+3
There are a few quirks around using lazy wake for poll unconditionally, and one of them is related the EPOLLEXCLUSIVE. Those may trigger exclusive wakeups, which wake a limited number of entries in the wait queue. If that wake number is less than the number of entries someone is waiting for (and that someone is also using DEFER_TASKRUN), then we can get stuck waiting for more entries while we should be processing the ones we already got. If we're doing exclusive poll waits, flag the request as not being compatible with lazy wakeups. Reported-by: Pavel Begunkov <[email protected]> Fixes: 6ce4a93dbb5b ("io_uring/poll: use IOU_F_TWQ_LAZY_WAKE for wakeups") Signed-off-by: Jens Axboe <[email protected]>
2023-12-13leds: trigger: Remove unused function led_trigger_rename_static()Heiner Kallweit1-17/+0
This function was added with a8df7b1ab70b ("leds: add led_trigger_rename function") 11 yrs ago, but it has no users. So remove it. Signed-off-by: Heiner Kallweit <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lee Jones <[email protected]>
2023-12-13PCI: Move pci_clear_and_set_dword() helper to PCI headerShuai Xue1-0/+2
The clear and set pattern is commonly used for accessing PCI config, move the helper pci_clear_and_set_dword() from aspm.c into PCI header. In addition, rename to pci_clear_and_set_config_dword() to retain the "config" information and match the other accessors. No functional change intended. Signed-off-by: Shuai Xue <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Tested-by: Ilkka Koskinen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2023-12-13PCI: Add Alibaba Vendor ID to linux/pci_ids.hShuai Xue1-0/+2
The Alibaba Vendor ID (0x1ded) is now used by Alibaba elasticRDMA ("erdma") and will be shared with the upcoming PCIe PMU ("dwc_pcie_pmu"). Move the Vendor ID to linux/pci_ids.h so that it can shared by several drivers later. Signed-off-by: Shuai Xue <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> # pci_ids.h Tested-by: Ilkka Koskinen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2023-12-13thermal: trip: Send trip change notifications on all trip updatesRafael J. Wysocki1-0/+2
The _store callbacks of the trip point temperature and hysteresis sysfs attributes invoke thermal_notify_tz_trip_change() to send a notification regarding the trip point change, but when trip points are updated by the platform firmware, trip point change notifications are not sent. To make the behavior after a trip point change more consistent, modify all of the 3 places where trip point temperature is updated to use a new function called thermal_zone_set_trip_temp() for this purpose and make that function call thermal_notify_tz_trip_change(). Note that trip point hysteresis can only be updated via sysfs and trip_point_hyst_store() calls thermal_notify_tz_trip_change() already, so this code path need not be changed. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Daniel Lezcano <[email protected]>
2023-12-13tty: add new helper function tty_get_tiocmFlorian Eckert1-0/+1
There is no in-kernel function to get the status register of a tty device like the TIOCMGET ioctl returns to userspace. Create a new function, tty_get_tiocm(), to obtain the status register that other portions of the kernel can call if they need this information, and move the existing internal tty_tiocmget() function to use this interface. Signed-off-by: Florian Eckert <[email protected]> Reviewed-by: Greg Kroah-Hartman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lee Jones <[email protected]>
2023-12-13leds: trigger: netdev: Extend speeds up to 10GDaniel Golle1-0/+3
Add 2.5G, 5G and 10G as available speeds to the netdev LED trigger. Signed-off-by: Daniel Golle <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Link: https://lore.kernel.org/r/99e7d3304c6bba7f4863a4a80764a869855f2085.1701143925.git.daniel@makrotopia.org Signed-off-by: Lee Jones <[email protected]>
2023-12-13dpll: remove leftover mode_supported() op and use mode_get() insteadJiri Pirko1-3/+0
Mode supported is currently reported to the user exactly the same, as the current mode. That's because mode changing is not implemented. Remove the leftover mode_supported() op and use mode_get() to fill up the supported mode exposed to user. One, if even, mode changing is going to be introduced, this could be very easily taken back. In the meantime, prevent drivers form implementing this in wrong way (as for example recent netdevsim implementation attempt intended to do). Signed-off-by: Jiri Pirko <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-12-13PM: domains: Drop the unused pm_genpd_opp_to_performance_state()Ulf Hansson1-12/+0
Since commit 7c41cdcd3bbe ("OPP: Simplify the over-designed pstate <-> level dance"), there is no longer any users of the pm_genpd_opp_to_performance_state() API. Let's therefore drop it and its corresponding ->opp_to_performance_state() callback, which also no longer has any users. Signed-off-by: Ulf Hansson <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-12-12merge mm-hotfixes-stable into mm-nonmm-stable to pick up depended-upon changesAndrew Morton4-28/+39
2023-12-12mm/mglru: reclaim offlined memcgs harderYu Zhao1-4/+4
In the effort to reduce zombie memcgs [1], it was discovered that the memcg LRU doesn't apply enough pressure on offlined memcgs. Specifically, instead of rotating them to the tail of the current generation (MEMCG_LRU_TAIL) for a second attempt, it moves them to the next generation (MEMCG_LRU_YOUNG) after the first attempt. Not applying enough pressure on offlined memcgs can cause them to build up, and this can be particularly harmful to memory-constrained systems. On Pixel 8 Pro, launching apps for 50 cycles: Before After Change Zombie memcgs 45 35 -22% [1] https://lore.kernel.org/CABdmKX2M6koq4Q0Cmp_-=wbP0Qa190HdEGGaHfxNS05gAkUtPA@mail.gmail.com/ Link: https://lkml.kernel.org/r/[email protected] Fixes: e4dde56cd208 ("mm: multi-gen LRU: per-node lru_gen_folio lists") Signed-off-by: Yu Zhao <[email protected]> Reported-by: T.J. Mercier <[email protected]> Tested-by: T.J. Mercier <[email protected]> Cc: Charan Teja Kalla <[email protected]> Cc: Hillf Danton <[email protected]> Cc: Jaroslav Pulchart <[email protected]> Cc: Kairui Song <[email protected]> Cc: Kalesh Singh <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-12mm/mglru: respect min_ttl_ms with memcgsYu Zhao1-13/+17
While investigating kswapd "consuming 100% CPU" [1] (also see "mm/mglru: try to stop at high watermarks"), it was discovered that the memcg LRU can breach the thrashing protection imposed by min_ttl_ms. Before the memcg LRU: kswapd() shrink_node_memcgs() mem_cgroup_iter() inc_max_seq() // always hit a different memcg lru_gen_age_node() mem_cgroup_iter() check the timestamp of the oldest generation After the memcg LRU: kswapd() shrink_many() restart: iterate the memcg LRU: inc_max_seq() // occasionally hit the same memcg if raced with lru_gen_rotate_memcg(): goto restart lru_gen_age_node() mem_cgroup_iter() check the timestamp of the oldest generation Specifically, when the restart happens in shrink_many(), it needs to stick with the (memcg LRU) generation it began with. In other words, it should neither re-read memcg_lru->seq nor age an lruvec of a different generation. Otherwise it can hit the same memcg multiple times without giving lru_gen_age_node() a chance to check the timestamp of that memcg's oldest generation (against min_ttl_ms). [1] https://lore.kernel.org/CAK8fFZ4DY+GtBA40Pm7Nn5xCHy+51w3sfxPqkqpqakSXYyX+Wg@mail.gmail.com/ Link: https://lkml.kernel.org/r/[email protected] Fixes: e4dde56cd208 ("mm: multi-gen LRU: per-node lru_gen_folio lists") Signed-off-by: Yu Zhao <[email protected]> Tested-by: T.J. Mercier <[email protected]> Cc: Charan Teja Kalla <[email protected]> Cc: Hillf Danton <[email protected]> Cc: Jaroslav Pulchart <[email protected]> Cc: Kairui Song <[email protected]> Cc: Kalesh Singh <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-12mm/mglru: fix underprotected page cacheYu Zhao1-9/+14
Unmapped folios accessed through file descriptors can be underprotected. Those folios are added to the oldest generation based on: 1. The fact that they are less costly to reclaim (no need to walk the rmap and flush the TLB) and have less impact on performance (don't cause major PFs and can be non-blocking if needed again). 2. The observation that they are likely to be single-use. E.g., for client use cases like Android, its apps parse configuration files and store the data in heap (anon); for server use cases like MySQL, it reads from InnoDB files and holds the cached data for tables in buffer pools (anon). However, the oldest generation can be very short lived, and if so, it doesn't provide the PID controller with enough time to respond to a surge of refaults. (Note that the PID controller uses weighted refaults and those from evicted generations only take a half of the whole weight.) In other words, for a short lived generation, the moving average smooths out the spike quickly. To fix the problem: 1. For folios that are already on LRU, if they can be beyond the tracking range of tiers, i.e., five accesses through file descriptors, move them to the second oldest generation to give them more time to age. (Note that tiers are used by the PID controller to statistically determine whether folios accessed multiple times through file descriptors are worth protecting.) 2. When adding unmapped folios to LRU, adjust the placement of them so that they are not too close to the tail. The effect of this is similar to the above. On Android, launching 55 apps sequentially: Before After Change workingset_refault_anon 25641024 25598972 0% workingset_refault_file 115016834 106178438 -8% Link: https://lkml.kernel.org/r/[email protected] Fixes: ac35a4902374 ("mm: multi-gen LRU: minimal implementation") Signed-off-by: Yu Zhao <[email protected]> Reported-by: Charan Teja Kalla <[email protected]> Tested-by: Kalesh Singh <[email protected]> Cc: T.J. Mercier <[email protected]> Cc: Kairui Song <[email protected]> Cc: Hillf Danton <[email protected]> Cc: Jaroslav Pulchart <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-12-12mm/damon/core: make damon_start() waits until kdamond_fn() startsSeongJae Park1-0/+2
The cleanup tasks of kdamond threads including reset of corresponding DAMON context's ->kdamond field and decrease of global nr_running_ctxs counter is supposed to be executed by kdamond_fn(). However, commit 0f91d13366a4 ("mm/damon: simplify stop mechanism") made neither damon_start() nor damon_stop() ensure the corresponding kdamond has started the execution of kdamond_fn(). As a result, the cleanup can be skipped if damon_stop() is called fast enough after the previous damon_start(). Especially the skipped reset of ->kdamond could cause a use-after-free. Fix it by waiting for start of kdamond_fn() execution from damon_start(). Link: https://lkml.kernel.org/r/[email protected] Fixes: 0f91d13366a4 ("mm/damon: simplify stop mechanism") Signed-off-by: SeongJae Park <[email protected]> Reported-by: Jakub Acs <[email protected]> Cc: Changbin Du <[email protected]> Cc: Jakub Acs <[email protected]> Cc: <[email protected]> # 5.15.x Signed-off-by: Andrew Morton <[email protected]>