aboutsummaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2023-07-14iommu: Optimise PCI SAC address trickRobin Murphy1-0/+2
Per the reasoning in commit 4bf7fda4dce2 ("iommu/dma: Add config for PCI SAC address trick") and its subsequent revert, this mechanism no longer serves its original purpose, but now only works around broken hardware/drivers in a way that is unfortunately too impactful to remove. This does not, however, prevent us from solving the performance impact which that workaround has on large-scale systems that don't need it. Once the 32-bit IOVA space fills up and a workload starts allocating and freeing on both sides of the boundary, the opportunistic SAC allocation can then end up spending significant time hunting down scattered fragments of free 32-bit space, or just reestablishing max32_alloc_size. This can easily be exacerbated by a change in allocation pattern, such as by changing the network MTU, which can increase pressure on the 32-bit space by leaving a large quantity of cached IOVAs which are now the wrong size to be recycled, but also won't be freed since the non-opportunistic allocations can still be satisfied from the whole 64-bit space without triggering the reclaim path. However, in the context of a workaround where smaller DMA addresses aren't simply a preference but a necessity, if we get to that point at all then in fact it's already the endgame. The nature of the allocator is currently such that the first IOVA we give to a device after the 32-bit space runs out will be the highest possible address for that device, ever. If that works, then great, we know we can optimise for speed by always allocating from the full range. And if it doesn't, then the worst has already happened and any brokenness is now showing, so there's little point in continuing to try to hide it. To that end, implement a flag to refine the SAC business into a per-device policy that can automatically get itself out of the way if and when it stops being useful. CC: Linus Torvalds <[email protected]> CC: Jakub Kicinski <[email protected]> Reviewed-by: John Garry <[email protected]> Signed-off-by: Robin Murphy <[email protected]> Tested-by: Vasant Hegde <[email protected]> Tested-by: Jakub Kicinski <[email protected]> Link: https://lore.kernel.org/r/b8502b115b915d2a3fabde367e099e39106686c8.1681392791.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <[email protected]>
2023-07-14spi: Use struct_size() helperAndy Shevchenko1-6/+9
The Documentation/process/deprecated.rst suggests to use flexible array members to provide a way to declare having a dynamically sized set of trailing elements in a structure.This makes code robust agains bunch of the issues described in the documentation, main of which is about the correctness of the sizeof() calculation for this data structure. Due to above, prefer struct_size() over open-coded versions. Signed-off-by: Andy Shevchenko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]>
2023-07-14platform/x86/intel/tpmi: Read feature control statusSrinivas Pandruvada1-0/+2
Some of the PM features can be locked or disabled. In that case, write interface can be locked. This status is read via a mailbox. There is one TPMI ID which provides base address for interface and data register for mail box operation. The mailbox operations is defined in the TPMI specification. Refer to https://github.com/intel/tpmi_power_management/ for TPMI specifications. An API is exposed to feature drivers to read feature control status. Signed-off-by: Srinivas Pandruvada <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Hans de Goede <[email protected]>
2023-07-14Merge tag 'ib-pdx86-simatic-v6.6' into review-hansHans de Goede2-2/+6
Immutable branch between pdx86 simatic branch and LED due for the v6.6 merge window v6.5-rc1 + recent pdx86 simatic-ipc patches for merging into the LED subsystem for v6.6.
2023-07-14platform/x86: simatic-ipc: add another modelHenning Schild1-0/+1
This is the panel variant of a device we already did have. All the same, just no LEDs. Signed-off-by: Henning Schild <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Hans de Goede <[email protected]> Signed-off-by: Hans de Goede <[email protected]>
2023-07-14platform/x86: simatic-ipc: add CMOS battery monitoringHenning Schild1-0/+1
Siemens Simatic Industrial PCs can monitor the voltage of the CMOS battery with two bits that indicate low or empty state. This can be GPIO or PortIO based. Here we model that as a hwmon voltage. The core driver does the PortIO and provides boilerplate for the GPIO versions. Which are split out to model runtime dependencies while allowing fine-grained kernel configuration. Signed-off-by: Henning Schild <[email protected]> Reviewed-by: Hans de Goede <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Hans de Goede <[email protected]>
2023-07-14media: i2c: add DS90UB960 driverTomi Valkeinen1-0/+22
Add driver for TI DS90UB960 FPD-Link III Deserializer. Signed-off-by: Tomi Valkeinen <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Signed-off-by: Sakari Ailus <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2023-07-14media: i2c: add I2C Address Translator (ATR) supportLuca Ceresoli1-0/+116
An ATR is a device that looks similar to an i2c-mux: it has an I2C slave "upstream" port and N master "downstream" ports, and forwards transactions from upstream to the appropriate downstream port. But it is different in that the forwarded transaction has a different slave address. The address used on the upstream bus is called the "alias" and is (potentially) different from the physical slave address of the downstream chip. Add a helper file (just like i2c-mux.c for a mux or switch) to allow implementing ATR features in a device driver. The helper takes care of adapter creation/destruction and translates addresses at each transaction. Signed-off-by: Luca Ceresoli <[email protected]> Signed-off-by: Tomi Valkeinen <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Acked-by: Wolfram Sang <[email protected]> Signed-off-by: Sakari Ailus <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
2023-07-14platform/x86: simatic-ipc: add another model BX-21AHenning Schild2-2/+4
This adds support for the Siemens Simatic IPC model BX-21A. Actual drivers for that model will be sent in separate patches. Signed-off-by: Henning Schild <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Hans de Goede <[email protected]> Signed-off-by: Hans de Goede <[email protected]>
2023-07-14net: mdio: add unlocked mdiobus and mdiodev bus accessorsRussell King (Oracle)1-0/+26
Add the following unlocked accessors to complete the set: __mdiobus_modify() __mdiodev_read() __mdiodev_write() __mdiodev_modify() __mdiodev_modify_changed() which we will need for Marvell DSA PCS conversion. Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: Russell King (Oracle) <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-07-14net: phylink: add support for PCS link change notificationsRussell King (Oracle)1-0/+7
Add a function, phylink_pcs_change() which can be used by PCs drivers to notify phylink about changes to the PCS link state. Signed-off-by: Russell King (Oracle) <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-07-14net: phylink: add pcs_pre_config()/pcs_post_config() methodsRussell King (Oracle)1-0/+6
Add hooks that are called before and after the mac_config() call, which will be needed to deal with errata workarounds for the Marvell 88e639x DSA switches. Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: Russell King (Oracle) <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-07-14net: phylink: add pcs_enable()/pcs_disable() methodsRussell King (Oracle)1-0/+16
Add phylink PCS enable/disable callbacks that will allow us to place IEEE 802.3 register compliant PCS in power-down mode while not being used. Signed-off-by: Russell King (Oracle) <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-07-14ipv6: Constify the sk parameter of several helper functions.Guillaume Nault1-6/+4
icmpv6_flow_init(), ip6_datagram_flow_key_init() and ip6_mc_hdr() don't need to modify their sk argument. Make that explicit using const. Signed-off-by: Guillaume Nault <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-07-14ipv4: Constify the sk parameter of ip_route_output_*().Guillaume Nault1-3/+3
These functions don't need to modify the socket, so let's allow the callers to pass a const struct sock *. Signed-off-by: Guillaume Nault <[email protected]> Reviewed-by: Simon Horman <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-07-14security: Constify sk in the sk_getsecid hook.Guillaume Nault2-3/+4
The sk_getsecid hook shouldn't need to modify its socket argument. Make it const so that callers of security_sk_classify_flow() can use a const struct sock *. Signed-off-by: Guillaume Nault <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2023-07-13soc: qcom: smem: Add qcom_smem_is_available()Stephan Gerhold1-0/+1
Avoid having to look up a dummy item from SMEM to detect if it is already available or if we need to defer probing. Reviewed-by: Konrad Dybcio <[email protected]> Signed-off-by: Stephan Gerhold <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bjorn Andersson <[email protected]>
2023-07-13net: stmmac: replace the en_tx_lpi_clockgating field with a flagBartosz Golaszewski1-1/+1
Drop the boolean field of the plat_stmmacenet_data structure in favor of a simple bitfield flag. Signed-off-by: Bartosz Golaszewski <[email protected]> Reviewed-by: Andrew Halaney <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-07-13net: stmmac: replace the rx_clk_runs_in_lpi field with a flagBartosz Golaszewski1-1/+1
Drop the boolean field of the plat_stmmacenet_data structure in favor of a simple bitfield flag. Signed-off-by: Bartosz Golaszewski <[email protected]> Reviewed-by: Andrew Halaney <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-07-13net: stmmac: replace the int_snapshot_en field with a flagBartosz Golaszewski1-1/+1
Drop the boolean field of the plat_stmmacenet_data structure in favor of a simple bitfield flag. Signed-off-by: Bartosz Golaszewski <[email protected]> Reviewed-by: Andrew Halaney <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-07-13net: stmmac: replace the ext_snapshot_en field with a flagBartosz Golaszewski1-1/+1
Drop the boolean field of the plat_stmmacenet_data structure in favor of a simple bitfield flag. Signed-off-by: Bartosz Golaszewski <[email protected]> Reviewed-by: Andrew Halaney <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-07-13net: stmmac: replace the multi_msi_en field with a flagBartosz Golaszewski1-1/+1
Drop the boolean field of the plat_stmmacenet_data structure in favor of a simple bitfield flag. Signed-off-by: Bartosz Golaszewski <[email protected]> Reviewed-by: Andrew Halaney <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-07-13net: stmmac: replace the vlan_fail_q_en field with a flagBartosz Golaszewski1-1/+1
Drop the boolean field of the plat_stmmacenet_data structure in favor of a simple bitfield flag. Signed-off-by: Bartosz Golaszewski <[email protected]> Reviewed-by: Andrew Halaney <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-07-13net: stmmac: replace the serdes_up_after_phy_linkup field with a flagBartosz Golaszewski1-1/+1
Drop the boolean field of the plat_stmmacenet_data structure in favor of a simple bitfield flag. Signed-off-by: Bartosz Golaszewski <[email protected]> Reviewed-by: Andrew Halaney <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-07-13net: stmmac: replace the tso_en field with a flagBartosz Golaszewski1-1/+1
Drop the boolean field of the plat_stmmacenet_data structure in favor of a simple bitfield flag. Signed-off-by: Bartosz Golaszewski <[email protected]> Reviewed-by: Andrew Halaney <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-07-13net: stmmac: replace the has_sun8i field with a flagBartosz Golaszewski1-1/+1
Drop the boolean field of the plat_stmmacenet_data structure in favor of a simple bitfield flag. Signed-off-by: Bartosz Golaszewski <[email protected]> Reviewed-by: Andrew Halaney <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-07-13net: stmmac: replace the use_phy_wol field with a flagBartosz Golaszewski1-1/+1
Drop the boolean field of the plat_stmmacenet_data structure in favor of a simple bitfield flag. Signed-off-by: Bartosz Golaszewski <[email protected]> Reviewed-by: Andrew Halaney <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-07-13net: stmmac: replace the sph_disable field with a flagBartosz Golaszewski1-1/+1
Drop the boolean field of the plat_stmmacenet_data structure in favor of a simple bitfield flag. Signed-off-by: Bartosz Golaszewski <[email protected]> Reviewed-by: Andrew Halaney <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-07-13net: stmmac: replace the has_integrated_pcs field with a flagBartosz Golaszewski1-1/+3
struct plat_stmmacenet_data contains several boolean fields that could be easily replaced with a common integer 'flags' bitfield and bit defines. Start the process with the has_integrated_pcs field. Signed-off-by: Bartosz Golaszewski <[email protected]> Reviewed-by: Andrew Halaney <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2023-07-13Merge tag 'for-netdev' of ↵Jakub Kicinski7-3/+79
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2023-07-13 We've added 67 non-merge commits during the last 15 day(s) which contain a total of 106 files changed, 4444 insertions(+), 619 deletions(-). The main changes are: 1) Fix bpftool build in presence of stale vmlinux.h, from Alexander Lobakin. 2) Introduce bpf_me_mcache_free_rcu() and fix OOM under stress, from Alexei Starovoitov. 3) Teach verifier actual bounds of bpf_get_smp_processor_id() and fix perf+libbpf issue related to custom section handling, from Andrii Nakryiko. 4) Introduce bpf map element count, from Anton Protopopov. 5) Check skb ownership against full socket, from Kui-Feng Lee. 6) Support for up to 12 arguments in BPF trampoline, from Menglong Dong. 7) Export rcu_request_urgent_qs_task, from Paul E. McKenney. 8) Fix BTF walking of unions, from Yafang Shao. 9) Extend link_info for kprobe_multi and perf_event links, from Yafang Shao. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (67 commits) selftests/bpf: Add selftest for PTR_UNTRUSTED bpf: Fix an error in verifying a field in a union selftests/bpf: Add selftests for nested_trust bpf: Fix an error around PTR_UNTRUSTED selftests/bpf: add testcase for TRACING with 6+ arguments bpf, x86: allow function arguments up to 12 for TRACING bpf, x86: save/restore regs with BPF_DW size bpftool: Use "fallthrough;" keyword instead of comments bpf: Add object leak check. bpf: Convert bpf_cpumask to bpf_mem_cache_free_rcu. bpf: Introduce bpf_mem_free_rcu() similar to kfree_rcu(). selftests/bpf: Improve test coverage of bpf_mem_alloc. rcu: Export rcu_request_urgent_qs_task() bpf: Allow reuse from waiting_for_gp_ttrace list. bpf: Add a hint to allocated objects. bpf: Change bpf_mem_cache draining process. bpf: Further refactor alloc_bulk(). bpf: Factor out inc/dec of active flag into helpers. bpf: Refactor alloc_bulk(). bpf: Let free_all() return the number of freed elements. ... ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-07-13KVM: arm64: vgic-v4: Make the doorbell request robust w.r.t preemptionMarc Zyngier1-1/+1
Xiang reports that VMs occasionally fail to boot on GICv4.1 systems when running a preemptible kernel, as it is possible that a vCPU is blocked without requesting a doorbell interrupt. The issue is that any preemption that occurs between vgic_v4_put() and schedule() on the block path will mark the vPE as nonresident and *not* request a doorbell irq. This occurs because when the vcpu thread is resumed on its way to block, vcpu_load() will make the vPE resident again. Once the vcpu actually blocks, we don't request a doorbell anymore, and the vcpu won't be woken up on interrupt delivery. Fix it by tracking that we're entering WFI, and key the doorbell request on that flag. This allows us not to make the vPE resident when going through a preempt/schedule cycle, meaning we don't lose any state. Cc: [email protected] Fixes: 8e01d9a396e6 ("KVM: arm64: vgic-v4: Move the GICv4 residency flow to be driven by vcpu_load/put") Reported-by: Xiang Chen <[email protected]> Suggested-by: Zenghui Yu <[email protected]> Tested-by: Xiang Chen <[email protected]> Co-developed-by: Oliver Upton <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Acked-by: Zenghui Yu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2023-07-13Merge tag 'net-6.5-rc2' of ↵Linus Torvalds5-15/+37
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from netfilter, wireless and ebpf. Current release - regressions: - netfilter: conntrack: gre: don't set assured flag for clash entries - wifi: iwlwifi: remove 'use_tfh' config to fix crash Previous releases - regressions: - ipv6: fix a potential refcount underflow for idev - icmp6: ifix null-ptr-deref of ip6_null_entry->rt6i_idev in icmp6_dev() - bpf: fix max stack depth check for async callbacks - eth: mlx5e: - check for NOT_READY flag state after locking - fix page_pool page fragment tracking for XDP - eth: igc: - fix tx hang issue when QBV gate is closed - fix corner cases for TSN offload - eth: octeontx2-af: Move validation of ptp pointer before its usage - eth: ena: fix shift-out-of-bounds in exponential backoff Previous releases - always broken: - core: prevent skb corruption on frag list segmentation - sched: - cls_fw: fix improper refcount update leads to use-after-free - sch_qfq: account for stab overhead in qfq_enqueue - netfilter: - report use refcount overflow - prevent OOB access in nft_byteorder_eval - wifi: mt7921e: fix init command fail with enabled device - eth: ocelot: fix oversize frame dropping for preemptible TCs - eth: fec: recycle pages for transmitted XDP frames" * tag 'net-6.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (79 commits) selftests: tc-testing: add test for qfq with stab overhead net/sched: sch_qfq: account for stab overhead in qfq_enqueue selftests: tc-testing: add tests for qfq mtu sanity check net/sched: sch_qfq: reintroduce lmax bound check for MTU wifi: cfg80211: fix receiving mesh packets without RFC1042 header wifi: rtw89: debug: fix error code in rtw89_debug_priv_send_h2c_set() net: txgbe: fix eeprom calculation error net/sched: make psched_mtu() RTNL-less safe net: ena: fix shift-out-of-bounds in exponential backoff netdevsim: fix uninitialized data in nsim_dev_trap_fa_cookie_write() net/sched: flower: Ensure both minimum and maximum ports are specified MAINTAINERS: Add another mailing list for QUALCOMM ETHQOS ETHERNET DRIVER docs: netdev: update the URL of the status page wifi: iwlwifi: remove 'use_tfh' config to fix crash xdp: use trusted arguments in XDP hints kfuncs bpf: cpumap: Fix memory leak in cpu_map_update_elem wifi: airo: avoid uninitialized warning in airo_get_rate() octeontx2-pf: Add additional check for MCAM rules net: dsa: Removed unneeded of_node_put in felix_parse_ports_node net: fec: use netdev_err_once() instead of netdev_err() ...
2023-07-13Merge tag 'trace-v6.5-rc1-3' of ↵Linus Torvalds1-0/+9
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing fixes from Steven Rostedt: - Fix some missing-prototype warnings - Fix user events struct args (did not include size of struct) When creating a user event, the "struct" keyword is to denote that the size of the field will be passed in. But the parsing failed to handle this case. - Add selftest to struct sizes for user events - Fix sample code for direct trampolines. The sample code for direct trampolines attached to handle_mm_fault(). But the prototype changed and the direct trampoline sample code was not updated. Direct trampolines needs to have the arguments correct otherwise it can fail or crash the system. - Remove unused ftrace_regs_caller_ret() prototype. - Quiet false positive of FORTIFY_SOURCE Due to backward compatibility, the structure used to save stack traces in the kernel had a fixed size of 8. This structure is exported to user space via the tracing format file. A change was made to allow more than 8 functions to be recorded, and user space now uses the size field to know how many functions are actually in the stack. But the structure still has size of 8 (even though it points into the ring buffer that has the required amount allocated to hold a full stack. This was fine until the fortifier noticed that the memcpy(&entry->caller, stack, size) was greater than the 8 functions and would complain at runtime about it. Hide this by using a pointer to the stack location on the ring buffer instead of using the address of the entry structure caller field. - Fix a deadloop in reading trace_pipe that was caused by a mismatch between ring_buffer_empty() returning false which then asked to read the data, but the read code uses rb_num_of_entries() that returned zero, and causing a infinite "retry". - Fix a warning caused by not using all pages allocated to store ftrace functions, where this can happen if the linker inserts a bunch of "NULL" entries, causing the accounting of how many pages needed to be off. - Fix histogram synthetic event crashing when the start event is removed and the end event is still using a variable from it - Fix memory leak in freeing iter->temp in tracing_release_pipe() * tag 'trace-v6.5-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Fix memory leak of iter->temp when reading trace_pipe tracing/histograms: Add histograms to hist_vars if they have referenced variables tracing: Stop FORTIFY_SOURCE complaining about stack trace caller ftrace: Fix possible warning on checking all pages used in ftrace_process_locs() ring-buffer: Fix deadloop issue on reading trace_pipe tracing: arm64: Avoid missing-prototype warnings selftests/user_events: Test struct size match cases tracing/user_events: Fix struct arg size match check x86/ftrace: Remove unsued extern declaration ftrace_regs_caller_ret() arm64: ftrace: Add direct call trampoline samples support samples: ftrace: Save required argument registers in sample trampolines
2023-07-13Merge tag 'nvme-6.5-2023-07-13' of git://git.infradead.org/nvme into block-6.5Jens Axboe1-1/+1
Pull NVMe fixes from Keith: "nvme fixes for Linux 6.5 - Don't require quirk to use duplicate namespace identifiers (Christoph, Sagi) - One more BOGUS_NID quirk (Pankaj) - IO timeout and error hanlding fixes for PCI (Keith) - Enhanced metadata format mask fix (Ankit) - Association race condition fix for fibre channel (Michael) - Correct debugfs error checks (Minjie) - Use PAGE_SECTORS_SHIFT where needed (Damien) - Reduce kernel logs for legacy nguid attribute (Keith) - Use correct dma direction when unmapping metadata (Ming)" * tag 'nvme-6.5-2023-07-13' of git://git.infradead.org/nvme: nvme-pci: fix DMA direction of unmapping integrity data nvme: don't reject probe due to duplicate IDs for single-ported PCIe devices nvme: ensure disabling pairs with unquiesce nvme-fc: fix race between error recovery and creating association nvme-fc: return non-zero status code when fails to create association nvme: fix parameter check in nvme_fault_inject_init() nvme: warn only once for legacy uuid attribute nvme: fix the NVME_ID_NS_NVM_STS_MASK definition nvmet: use PAGE_SECTORS_SHIFT nvme: add BOGUS_NID quirk for Samsung SM953
2023-07-13PCI: Change pdev->rom_attr_enabled to single bitChristophe JAILLET1-1/+1
Make 'rom_attr_enabled' a single bit in a bitfield and move it close to an existing bitfield so that they can be merged. This field is only used in 'drivers/pci/pci-sysfs.c' to store 0 or 1. On x86_64, this shrinks the size of 'struct pci_dev' by 8 bytes from 3560 to 3552. Link: https://lore.kernel.org/r/d7a34ad369336db73145c3efbade895e792a0ad3.1687105455.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2023-07-13PCI: Reorder pci_dev fields to reduce holesChristophe JAILLET1-2/+2
Group some bitfield variables to reduce holes. On x86_64, this shrinks the size of 'struct pci_dev' by 16 bytes (from 3576 to 3560) when compiled with 'allmodconfig'. Link: https://lore.kernel.org/r/407b17c3e56764ef2c558898d4ff4c6c04b2d757.1687105455.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2023-07-13PCI/AER: Unexport pci_enable_pcie_error_reporting()Bjorn Helgaas1-6/+0
pci_enable_pcie_error_reporting() is used only inside aer.c. Stop exposing it outside the file. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
2023-07-13PCI/AER: Drop unused pci_disable_pcie_error_reporting()Bjorn Helgaas1-5/+0
pci_disable_pcie_error_reporting() has no callers. Remove it. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
2023-07-13sched/core: introduce sched_core_idle_cpu()Cruz Zhao1-0/+2
As core scheduling introduced, a new state of idle is defined as force idle, running idle task but nr_running greater than zero. If a cpu is in force idle state, idle_cpu() will return zero. This result makes sense in some scenarios, e.g., load balance, showacpu when dumping, and judge the RCU boost kthread is starving. But this will cause error in other scenarios, e.g., tick_irq_exit(): When force idle, rq->curr == rq->idle but rq->nr_running > 0, results that idle_cpu() returns 0. In function tick_irq_exit(), if idle_cpu() is 0, tick_nohz_irq_exit() will not be called, and ts->idle_active will not become 1, which became 0 in tick_nohz_irq_enter(). ts->idle_sleeptime won't update in function update_ts_time_stats(), if ts->idle_active is 0, which should be 1. And this bug will result that ts->idle_sleeptime is less than the actual value, and finally will result that the idle time in /proc/stat is less than the actual value. To solve this problem, we introduce sched_core_idle_cpu(), which returns 1 when force idle. We audit all users of idle_cpu(), and change idle_cpu() into sched_core_idle_cpu() in function tick_irq_exit(). v2-->v3: Only replace idle_cpu() with sched_core_idle_cpu() in function tick_irq_exit(). And modify the corresponding commit log. Signed-off-by: Cruz Zhao <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Peter Zijlstra <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Reviewed-by: Joel Fernandes <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-07-13sched: add throttled time stat for throttled childrenJosh Don1-0/+2
We currently export the total throttled time for cgroups that are given a bandwidth limit. This patch extends this accounting to also account the total time that each children cgroup has been throttled. This is useful to understand the degree to which children have been affected by the throttling control. Children which are not runnable during the entire throttled period, for example, will not show any self-throttling time during this period. Expose this in a new interface, 'cpu.stat.local', which is similar to how non-hierarchical events are accounted in 'memory.events.local'. Signed-off-by: Josh Don <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Tejun Heo <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-07-13sched: avoid false lockdep splat in put_task_struct()Wander Lairson Costa1-4/+14
In put_task_struct(), a spin_lock is indirectly acquired under the kernel stock. When running the kernel in real-time (RT) configuration, the operation is dispatched to a preemptible context call to ensure guaranteed preemption. However, if PROVE_RAW_LOCK_NESTING is enabled and __put_task_struct() is called while holding a raw_spinlock, lockdep incorrectly reports an "Invalid lock context" in the stock kernel. This false splat occurs because lockdep is unaware of the different route taken under RT. To address this issue, override the inner wait type to prevent the false lockdep splat. Suggested-by: Oleg Nesterov <[email protected]> Suggested-by: Sebastian Andrzej Siewior <[email protected]> Suggested-by: Peter Zijlstra <[email protected]> Signed-off-by: Wander Lairson Costa <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-07-13kernel/fork: beware of __put_task_struct() calling contextWander Lairson Costa1-1/+27
Under PREEMPT_RT, __put_task_struct() indirectly acquires sleeping locks. Therefore, it can't be called from an non-preemptible context. One practical example is splat inside inactive_task_timer(), which is called in a interrupt context: CPU: 1 PID: 2848 Comm: life Kdump: loaded Tainted: G W --------- Hardware name: HP ProLiant DL388p Gen8, BIOS P70 07/15/2012 Call Trace: dump_stack_lvl+0x57/0x7d mark_lock_irq.cold+0x33/0xba mark_lock+0x1e7/0x400 mark_usage+0x11d/0x140 __lock_acquire+0x30d/0x930 lock_acquire.part.0+0x9c/0x210 rt_spin_lock+0x27/0xe0 refill_obj_stock+0x3d/0x3a0 kmem_cache_free+0x357/0x560 inactive_task_timer+0x1ad/0x340 __run_hrtimer+0x8a/0x1a0 __hrtimer_run_queues+0x91/0x130 hrtimer_interrupt+0x10f/0x220 __sysvec_apic_timer_interrupt+0x7b/0xd0 sysvec_apic_timer_interrupt+0x4f/0xd0 asm_sysvec_apic_timer_interrupt+0x12/0x20 RIP: 0033:0x7fff196bf6f5 Instead of calling __put_task_struct() directly, we defer it using call_rcu(). A more natural approach would use a workqueue, but since in PREEMPT_RT, we can't allocate dynamic memory from atomic context, the code would become more complex because we would need to put the work_struct instance in the task_struct and initialize it when we allocate a new task_struct. The issue is reproducible with stress-ng: while true; do stress-ng --sched deadline --sched-period 1000000000 \ --sched-runtime 800000000 --sched-deadline \ 1000000000 --mmapfork 23 -t 20 done Reported-by: Hu Chunyu <[email protected]> Suggested-by: Oleg Nesterov <[email protected]> Suggested-by: Valentin Schneider <[email protected]> Suggested-by: Peter Zijlstra <[email protected]> Signed-off-by: Wander Lairson Costa <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-07-12net/sched: make psched_mtu() RTNL-less safePedro Tammela1-1/+1
Eric Dumazet says[1]: ------- Speaking of psched_mtu(), I see that net/sched/sch_pie.c is using it without holding RTNL, so dev->mtu can be changed underneath. KCSAN could issue a warning. ------- Annotate dev->mtu with READ_ONCE() so KCSAN don't issue a warning. [1] https://lore.kernel.org/all/CANn89iJoJO5VtaJ-2=_d2aOQhb0Xw8iBT_Cxqp2HyuS-zj6azw@mail.gmail.com/ v1 -> v2: Fix commit message Fixes: d4b36210c2e6 ("net: pkt_sched: PIE AQM scheme") Suggested-by: Eric Dumazet <[email protected]> Signed-off-by: Pedro Tammela <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2023-07-12bpf: Introduce bpf_mem_free_rcu() similar to kfree_rcu().Alexei Starovoitov1-0/+2
Introduce bpf_mem_[cache_]free_rcu() similar to kfree_rcu(). Unlike bpf_mem_[cache_]free() that links objects for immediate reuse into per-cpu free list the _rcu() flavor waits for RCU grace period and then moves objects into free_by_rcu_ttrace list where they are waiting for RCU task trace grace period to be freed into slab. The life cycle of objects: alloc: dequeue free_llist free: enqeueu free_llist free_rcu: enqueue free_by_rcu -> waiting_for_gp free_llist above high watermark -> free_by_rcu_ttrace after RCU GP waiting_for_gp -> free_by_rcu_ttrace free_by_rcu_ttrace -> waiting_for_gp_ttrace -> slab Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Hou Tao <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-07-12rcu: Export rcu_request_urgent_qs_task()Paul E. McKenney2-0/+3
If a CPU is executing a long series of non-sleeping system calls, RCU grace periods can be delayed for on the order of a couple hundred milliseconds. This is normally not a problem, but if each system call does a call_rcu(), those callbacks can stack up. RCU will eventually notice this callback storm, but use of rcu_request_urgent_qs_task() allows the code invoking call_rcu() to give RCU a heads up. This function is not for general use, not yet, anyway. Reported-by: Alexei Starovoitov <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2023-07-12Merge tag 'probes-fixes-v6.5-rc1' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull probes fixes from Masami Hiramatsu: - Fix fprobe's rethook release issues: - Release rethook after ftrace_ops is unregistered so that the rethook is not accessed after free. - Stop rethook before ftrace_ops is unregistered so that the rethook is NOT used after exiting unregister_fprobe() - Fix eprobe cleanup logic. If it attaches to multiple events and failes to enable one of them, rollback all enabled events correctly. - Fix fprobe to unlock ftrace recursion lock correctly when it missed by another running kprobe. - Cleanup kprobe to remove unnecessary NULL. - Cleanup kprobe to remove unnecessary 0 initializations. * tag 'probes-fixes-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: fprobe: Ensure running fprobe_exit_handler() finished before calling rethook_free() kernel: kprobes: Remove unnecessary ‘0’ values kprobes: Remove unnecessary ‘NULL’ values from correct_ret_addr fprobe: add unlock to match a succeeded ftrace_test_recursion_trylock kernel/trace: Fix cleanup logic of enable_trace_eprobe fprobe: Release rethook after the ftrace_ops is unregistered
2023-07-12tracing: arm64: Avoid missing-prototype warningsArnd Bergmann1-0/+9
These are all tracing W=1 warnings in arm64 allmodconfig about missing prototypes: kernel/trace/trace_kprobe_selftest.c:7:5: error: no previous prototype for 'kprobe_trace_selftest_target' [-Werror=missing-pro totypes] kernel/trace/ftrace.c:329:5: error: no previous prototype for '__register_ftrace_function' [-Werror=missing-prototypes] kernel/trace/ftrace.c:372:5: error: no previous prototype for '__unregister_ftrace_function' [-Werror=missing-prototypes] kernel/trace/ftrace.c:4130:15: error: no previous prototype for 'arch_ftrace_match_adjust' [-Werror=missing-prototypes] kernel/trace/fgraph.c:243:15: error: no previous prototype for 'ftrace_return_to_handler' [-Werror=missing-prototypes] kernel/trace/fgraph.c:358:6: error: no previous prototype for 'ftrace_graph_sleep_time_control' [-Werror=missing-prototypes] arch/arm64/kernel/ftrace.c:460:6: error: no previous prototype for 'prepare_ftrace_return' [-Werror=missing-prototypes] arch/arm64/kernel/ptrace.c:2172:5: error: no previous prototype for 'syscall_trace_enter' [-Werror=missing-prototypes] arch/arm64/kernel/ptrace.c:2195:6: error: no previous prototype for 'syscall_trace_exit' [-Werror=missing-prototypes] Move the declarations to an appropriate header where they can be seen by the caller and callee, and make sure the headers are included where needed. Link: https://lore.kernel.org/linux-trace-kernel/[email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Will Deacon <[email protected]> Cc: Kees Cook <[email protected]> Cc: Florent Revest <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]> Acked-by: Catalin Marinas <[email protected]> [ Fixed ftrace_return_to_handler() to handle CONFIG_HAVE_FUNCTION_GRAPH_RETVAL case ] Signed-off-by: Steven Rostedt (Google) <[email protected]>
2023-07-12platform/x86: asus-wmi: expose dGPU and CPU tunables for ROGLuke D. Jones1-0/+9
Expose various CPU and dGPU tunables that are available on many ASUS ROG laptops. The tunables shown in sysfs will vary depending on the CPU and dGPU vendor. All of these variables are write only and there is no easy way to find what the defaults are. In general they seem to default to the max value the vendor sets for the CPU and dGPU package - this is not the same as the min/max writable value. Values written to these variables that are beyond the capabilities of the CPU are ignored by the laptop. Signed-off-by: Luke D. Jones <[email protected]> Reviewed-by: Hans de Goede <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Hans de Goede <[email protected]>
2023-07-12platform/x86: asus-wmi: support setting mini-LED modeLuke D. Jones1-0/+1
Support changing the mini-LED mode on some of the newer ASUS laptops. Signed-off-by: Luke D. Jones <[email protected]> Reviewed-by: Hans de Goede <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Hans de Goede <[email protected]>
2023-07-12platform/x86: asus-wmi: add WMI method to show if egpu connectedLuke D. Jones1-1/+3
Exposes the WMI method which tells if the eGPU is properly connected on the devices that support it. Signed-off-by: Luke D. Jones <[email protected]> Reviewed-by: Hans de Goede <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Hans de Goede <[email protected]>