git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes updates from Masami Hiramatsu:
"Uprobes:
- x86/shstk: Make return uprobe work with shadow stack
- Add uretprobe syscall which speeds up the uretprobe by 10-30%.
This syscall is automatically used from user-space trampolines
which are generated by the uretprobe. If this syscall is used by
a normal user program, it will cause SIGILL. Note that this is
currently only implemented on x86_64.
(This also has two fixes for adjusting the syscall number to avoid
conflict with new *attrat syscalls.)
- uprobes/perf: fix user stack traces in the presence of pending
uretprobe. This replaces the uretprobe's trampoline address in the
stacktrace with the correct return address
- selftests/x86: Add a return uprobe with shadow stack test
- selftests/bpf: Add uretprobe syscall related tests.
- test case for register integrity check
- test case with register changing case
- test case for uretprobe syscall without uprobes (expected to fail)
- test case for uretprobe with shadow stack
- selftests/bpf: add test validating uprobe/uretprobe stack traces
- MAINTAINERS: Add uprobes entry. This does not specify the tree, but
clarifies who maintains and reviews the uprobes
Kprobes:
- tracing/kprobes: Test case cleanups.
Replace redundant WARN_ON_ONCE() + pr_warn() with WARN_ONCE() and
remove unnecessary code from selftest
- tracing/kprobes: Add symbol counting check when module loads.
This checks the uniqueness of the probed symbol on modules. The
same check is already done for kernel symbols
(This also has a fix for build error with CONFIG_MODULES=n)
Cleanup:
- Add MODULE_DESCRIPTION() macros for fprobe and kprobe examples"
* tag 'probes-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
MAINTAINERS: Add uprobes entry
selftests/bpf: Change uretprobe syscall number in uprobe_syscall test
uprobe: Change uretprobe syscall scope and number
tracing/kprobes: Fix build error when find_module() is not available
tracing/kprobes: Add symbol counting check when module loads
selftests/bpf: add test validating uprobe/uretprobe stack traces
perf,uprobes: fix user stack traces in the presence of pending uretprobes
tracing/kprobe: Remove cleanup code unrelated to selftest
tracing/kprobe: Integrate test warnings into WARN_ONCE
selftests/bpf: Add uretprobe shadow stack test
selftests/bpf: Add uretprobe syscall call from user space test
selftests/bpf: Add uretprobe syscall test for regs changes
selftests/bpf: Add uretprobe syscall test for regs integrity
selftests/x86: Add return uprobe shadow stack test
uprobe: Add uretprobe syscall to speed up return probe
uprobe: Wire up uretprobe system call
x86/shstk: Make return uprobe work with shadow stack
samples: kprobes: add missing MODULE_DESCRIPTION() macros
fprobe: add missing MODULE_DESCRIPTION() macro
|
|
Pull drm updates from Dave Airlie:
"There's a lot of stuff in here, amd, i915 and xe have new platform
work, lots of core rework around EDID handling, some new COMPILE_TEST
options, maintainer changes and lots of other stuff. Summary:
core:
- deprecate DRM data and return 0 date
- connector: Create a set of helpers to help with HDMI support
- Remove driver owner assignments
- Allow more drivers to compile with COMPILE_TEST
- Conversions to drm_edid
- Sprinkle MODULE_DESCRIPTIONS everywhere they are missing
- Remove drm_mm_replace_node
- print: Add a drm prefix to warn level messages too, remove
___drm_dbg, consolidate prefix handling
- New monochrome TV mode variant
ttm:
- improve number of page faults on some platforms
- fix test builds under PREEMPT_RT
- more test coverage
ci:
- Require a more recent version of mesa
- improve farm setup and test generation
dma-buf:
- warn if reserving 0 fence slots
- internal API heap enhancements
fbdev:
- Create memory manager optimized fbdev emulation
panic:
- Allow selecting fonts
- improve drm_fb_dma_get_scanout_buffer
- Allow dumping kmsg to the screen
bridge:
- Remove redundant checks on bridge->encoder
- Remove drm_bridge_chain_mode_fixup
- bridge-connector: Plumb in the new HDMI helper
- analogix_dp: Various improvements, handle AUX transfers timeout
- samsung-dsim: Fix timings calculation
- tc358767: Plenty of small fixes, fix no connector attach, fix
clocks
- sii902x: state validation improvements
panels:
- Switch panels from register table initialization to proper code
- Now that the panel code tracks the panel state, remove every ad-hoc
implementation in the panel drivers
- More cleanup of prepare / enable state tracking in drivers
- edp: Drop legacy panel compatibles
- simple-bridge: Switch to devm_drm_bridge_add
- New panels: Lincoln Tech Sol LCD185-101CT, Microtips Technology
13-101HIEBCAF0-C, Microtips Technology MF-103HIEB0GA0,
BOE nv110wum-l60, IVO t109nw41, WL-355608-A8, PrimeView
PM070WL4, Lincoln Technologies LCD197, Ortustech
COM35H3P70ULC, AUO G104STN01, K&d kd101ne3-40ti
amdgpu:
- DCN 4.0.x support
- GC 12.0 support
- GMC 12.0 support
- SDMA 7.0 support
- MES12 support
- MMHUB 4.1 support
- GFX12 modifier and DCC support
- lots of IP fixes/updates
amdkfd:
- Contiguous VRAM allocations
- GC 12.0 support
- SDMA 7.0 support
- SR-IOV fixes
- KFD GFX ALU exceptions
i915:
- Battlemage Xe2 HPD display enablement
- Panel Replay enabling
- DP AUX-less ALPM/LOBF
- Enable link training failure fallback for DP MST links
- CMRR (Content Match Refresh Rate) enabling
- Increase ADL-S/ADL-P/DG2+ max TMDS bitrate to 6 Gbps
- Enable eDP AUX based HDR backlight
- Support replaying GPU hangs with captured context image
- Automate CCS Mode setting during engine resets
- lots of refactoring
- Increase FLR timeout from 3s to 9s
- Enable w/a 16021333562 for DG2, MTL and ARL [guc]
xe:
- update MAINTAINERS
- New uapi adding OA functionality to Xe
- expose l3 bank mask
- fix display detect on ADL-N
- runtime PM Fixes
- Fix silent backmerge issues
- More prep for SR-IOV
- HWmon additions
- per client usage info
- Rework GPU page fault handling
- Drop EXEC_QUEUE_FLAG_BANNED
- Add BMG PCI IDs
- Scheduler fixes and improvements
- Rename xe_exec_queue::compute to xe_exec_queue::lr
- Use ttm_uncached for BO with NEEDS_UC flag
- Rename xe perf layer as xe observation layer
- lots of refactoring
radeon:
- Backlight workaround for iMac
- Silence UBSAN flex array warnings
msm:
- Validate registers XML description against schema in CI
- core/dpu: SM7150 support
- mdp5: Add support for MSM8937
- gpu: Add param for userspace to know if raytracing is supported
- gpu: X185 support (aka gpu in X1 laptop chips)
- gpu: a505 support
ivpu:
- hardware scheduler support
- profiling support
- improvements to the platform support layer
- firmware handling improvements
- clocks/power mgmt improvements
- scheduler/logging improvements
habanalabs:
- Gradual sleep in polling memory macro
- Reduce Gaudi2 MSI-X interrupt count to 128
- Add Gaudi2-D revision support
- Add timestamp to CPLD info
- Gaudi2: Assume hard-reset by firmware upon MC SEI severe error
- Align Gaudi2 interrupt names
- Check for errors after preboot is ready
- Change habanalabs maintainer and git repo path
mgag200:
- refactoring and improvements
- Add BMC output
- enable polling
nouveau:
- add registry command line
v3d:
- perf counters improvements
zynqmp:
- irq and debugfs improvements
atmel-hlcdc:
- Support XLCDC in sam9x7
mipi-dbi:
- Remove mipi_dbi_machine_little_endian
- make SPI bits per word configurable
- support RGB888
- allow pixel formats to be specified in the DT
sun4i:
- Rework the blender setup for DE2
panfrost:
- Enable MT8188 support
vc4:
- Monochrome TV support
exynos:
- fix fallback mode regression
- fix memory leak
- Use drm_edid_duplicate() instead of kmemdup()
etnaviv:
- fix i.MX8MP NPU clock gating
- workaround FE register cdc issues on some cores
- fix DMA sync handling for cached buffers
- fix job timeout handling
- keep TS enabled on MMUv2 cores for improved performance
mediatek:
- Convert to platform remove callback returning void
- Drop chain_mode_fixup call in mode_valid()
- Fix errors in the MediaTek display driver found by IGT
- Add display support for the MT8365-EVK board
- Fix bit depth overwritten for mtk_ovl_set_bit_depth()
- Fix possible_crtcs calculation
- Fix spurious kfree()
ast:
- refactor mode setting code
stm:
- Add LVDS support
- DSI PHY updates"
* tag 'drm-next-2024-07-18' of https://gitlab.freedesktop.org/drm/kernel: (2501 commits)
drm/amdgpu/mes12: add missing opcode string
drm/amdgpu/mes11: update opcode strings
Revert "drm/amd/display: Reset freesync config before update new state"
drm/omap: Restrict compile testing to PAGE_SIZE less than 64KB
drm/xe: Drop trace_xe_hw_fence_free
drm/xe/uapi: Rename xe perf layer as xe observation layer
drm/amdgpu: remove exp hw support check for gfx12
drm/amdgpu: timely save bad pages to eeprom after gpu ras reset is completed
drm/amdgpu: flush all cached ras bad pages to eeprom
drm/amdgpu: select compute ME engines dynamically
drm/amd/display: Allow display DCC for DCN401
drm/amdgpu: select compute ME engines dynamically
drm/amdgpu/job: Replace DRM_INFO/ERROR logging
drm/amdgpu: select compute ME engines dynamically
drm/amd/pm: Ignore initial value in smu response register
drm/amdgpu: Initialize VF partition mode
drm/amd/amdgpu: fix SDMA IRQ client ID <-> req mapping
MAINTAINERS: fix Xinhui's name
MAINTAINERS: update powerplay and swsmu
drm/qxl: Pin buffer objects for internal mappings
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver updates from Ilpo Järvinen:
- amd/pmf: Report system state changes using existing input events
- asus-wmi: Zenbook 2023 camera LED disable support and fix TUF laptop
keyboard RGB LED sysfs interface
- dell-pc: Fan modes / platform profile support
- hp-wmi: Fix platform profile switching on Omen/Victus laptops
- intel/ISST: Use only TPMI interface when TPMI and legacy interfaces
are available
- intel/pmc: LTR restore support to pair with LTR ignore
- intel/tpmi: Performance Limit Reasons (PLR) and APIC <-> Punit CPU
numbering mapping support
- WMI: driver override support and docs improvements
- lenovo-yoga-c630: Support for EC (platform/arm64)
- platform/arm64: Fix build with COMPILE_TEST (broke after addition of
C630)
- tools: Intel Speed Select Turbo Ratio Limit fix
- Miscellaneous cleanups / refactoring / improvements
* tag 'platform-drivers-x86-v6.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (65 commits)
platform/x86: asus-wmi: fix TUF laptop RGB variant
platform/x86/intel/tpmi/plr: Fix output in plr_print_bits()
Docs/admin-guide: Remove pmf leftover reference from the index
platform/x86: ideapad-laptop: use cleanup.h
platform/x86: hp-wmi: Fix implementation of the platform_profile_omen_get function
platform: arm64: EC_LENOVO_YOGA_C630 should depend on ARCH_QCOM
platform: arm64: EC_ACER_ASPIRE1 should depend on ARCH_QCOM
platform/x86/amd/pmf: Remove update system state document
platform/x86/amd/pmf: Use existing input event codes to update system states
platform/x86: hp-wmi: Fix platform profile option switch bug on Omen and Victus laptops
platform/x86:intel/pmc: Add support to undo ltr_ignore
platform/x86:intel/pmc: Use the Elvis operator
platform/x86:intel/pmc: Use DEFINE_SHOW_STORE_ATTRIBUTE macro
platform/x86:intel/pmc: Remove unneeded min_t check
platform/x86:intel/pmc: Add support to show ltr_ignore value
platform/x86:intel/pmc: Move pmc assignment closer to first usage
platform/x86:intel/pmc: Convert index variables to be unsigned
platform/x86:intel/pmc: Simplify mutex usage with cleanup helpers
platform/x86:intel/pmc: Use the return value of pmc_core_send_msg
tools/power/x86/intel-speed-select: v1.20 release
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Not much excitement - a handful of large patchsets (devmem among them)
did not make it in time.
Core & protocols:
- Use local_lock in addition to local_bh_disable() to protect per-CPU
resources in networking, a step closer for local_bh_disable() not
to act as a big lock on PREEMPT_RT
- Use flex array for netdevice priv area, ensure its cache alignment
- Add a sysctl knob to allow user to specify a default rto_min at
socket init time. Bit of a big hammer but multiple companies were
independently carrying such patch downstream so clearly it's useful
- Support scheduling transmission of packets based on CLOCK_TAI
- Un-pin TCP TIMEWAIT timer to avoid it firing on CPUs later cordoned
off using cpusets
- Support multiple L2TPv3 UDP tunnels using the same 5-tuple address
- Allow configuration of multipath hash seed, to both allow
synchronizing hashing of two routers, and prevent partial
accidental sync
- Improve TCP compliance with RFC 9293 for simultaneous connect()
- Support sending NAT keepalives in IPsec ESP in UDP states.
Userspace IKE daemon had to do this before, but the kernel can
better keep track of it
- Support sending supervision HSR frames with MAC addresses stored in
ProxyNodeTable when RedBox (i.e. HSR-SAN) is enabled
- Introduce IPPROTO_SMC for selecting SMC when socket is created
- Allow UDP GSO transmit from devices with no checksum offload
- openvswitch: add packet sampling via psample, separating the
sampled traffic from "upcall" packets sent to user space for
forwarding
- nf_tables: shrink memory consumption for transaction objects
Things we sprinkled into general kernel code:
- Power Sequencing subsystem (used by Qualcomm Bluetooth driver for
QCA6390) [ Already merged separately - Linus ]
- Add IRQ information in sysfs for auxiliary bus
- Introduce guard definition for local_lock
- Add aligned flavor of __cacheline_group_{begin, end}() markings for
grouping fields in structures
BPF:
- Notify user space (via epoll) when a struct_ops object is getting
detached/unregistered
- Add new kfuncs for a generic, open-coded bits iterator
- Enable BPF programs to declare arrays of kptr, bpf_rb_root, and
bpf_list_head
- Support resilient split BTF which cuts down on duplication and
makes BTF as compact as possible WRT BTF from modules
- Add support for dumping kfunc prototypes from BTF which enables
both detecting as well as dumping compilable prototypes for kfuncs
- riscv64 BPF JIT improvements in particular to add 12-argument
support for BPF trampolines and to utilize bpf_prog_pack for the
latter
- Add the capability to offload the netfilter flowtable in XDP layer
through kfuncs
Driver API:
- Allow users to configure IRQ thresholds between which automatic IRQ
moderation can choose
- Expand Power Sourcing (PoE) status with power, class and failure
reason. Support setting power limits
- Track additional RSS contexts in the core, make sure configuration
changes don't break them
- Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated
ESP data paths
- Support updating firmware on SFP modules
Tests and tooling:
- mptcp: use net/lib.sh to manage netns
- TCP-AO and TCP-MD5: replace debug prints used by tests with
tracepoints
- openvswitch: make test self-contained (don't depend on OvS CLI
tools)
Drivers:
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- increase the max total outstanding PTP TX packets to 4
- add timestamping statistics support
- implement netdev_queue_mgmt_ops
- support new RSS context API
- Intel (100G, ice, idpf):
- implement FEC statistics and dumping signal quality indicators
- support E825C products (with 56Gbps PHYs)
- nVidia/Mellanox:
- support HW-GRO
- mlx4/mlx5: support per-queue statistics via netlink
- obey the max number of EQs setting in sub-functions
- AMD/Solarflare:
- support new RSS context API
- AMD/Pensando:
- ionic: rework fix for doorbell miss to lower overhead and
skip it on new HW
- Wangxun:
- txgbe: support Flow Director perfect filters
- Ethernet NICs consumer, embedded and virtual:
- Add driver for Tehuti Networks TN40xx chips
- Add driver for Meta's internal NIC chips
- Add driver for Ethernet MAC on Airoha EN7581 SoCs
- Add driver for Renesas Ethernet-TSN devices
- Google cloud vNIC:
- flow steering support
- Microsoft vNIC:
- support page sizes other than 4KB on ARM64
- vmware vNIC:
- support latency measurement (update to version 9)
- VirtIO net:
- support for Byte Queue Limits
- support configuring thresholds for automatic IRQ moderation
- support for AF_XDP Rx zero-copy
- Synopsys (stmmac):
- support for STM32MP13 SoC
- let platforms select the right PCS implementation
- TI:
- icssg-prueth: add multicast filtering support
- icssg-prueth: enable PTP timestamping and PPS
- Renesas:
- ravb: improve Rx performance 30-400% by using page pool,
threaded NAPI and timer-based IRQ coalescing
- ravb: add MII support for R-Car V4M
- Cadence (macb):
- macb: add ARP support to Wake-On-LAN
- Cortina:
- use phylib for RX and TX pause configuration
- Ethernet switches:
- nVidia/Mellanox:
- support configuration of multipath hash seed
- report more accurate max MTU
- use page_pool to improve Rx performance
- MediaTek:
- mt7530: add support for bridge port isolation
- Qualcomm:
- qca8k: add support for bridge port isolation
- Microchip:
- lan9371/2: add 100BaseTX PHY support
- NXP:
- vsc73xx: implement VLAN operations
- Ethernet PHYs:
- aquantia: enable support for aqr115c
- aquantia: add support for PHY LEDs
- realtek: add support for rtl8224 2.5Gbps PHY
- xpcs: add memory-mapped device support
- add BroadR-Reach link mode and support in Broadcom's PHY driver
- CAN:
- add document for ISO 15765-2 protocol support
- mcp251xfd: workaround for erratum DS80000789E, use timestamps to
catch when device returns incorrect FIFO status
- WiFi:
- mac80211/cfg80211:
- parse Transmit Power Envelope (TPE) data in mac80211 instead
of in drivers
- improvements for 6 GHz regulatory flexibility
- multi-link improvements
- support multiple radios per wiphy
- remove DEAUTH_NEED_MGD_TX_PREP flag
- Intel (iwlwifi):
- bump FW API to 91 for BZ/SC devices
- report 64-bit radiotap timestamp
- enable P2P low latency by default
- handle Transmit Power Envelope (TPE) advertised by AP
- remove support for older FW for new devices
- fast resume (keeping the device configured)
- mvm: re-enable Multi-Link Operation (MLO)
- aggregation (A-MSDU) optimizations
- MediaTek (mt76):
- mt7925 Multi-Link Operation (MLO) support
- Qualcomm (ath10k):
- LED support for various chipsets
- Qualcomm (ath12k):
- remove unsupported Tx monitor handling
- support channel 2 in 6 GHz band
- support Spatial Multiplexing Power Save (SMPS) in 6 GHz band
- support multiple BSSID (MBSSID) and Enhanced Multi-BSSID
Advertisements (EMA)
- support dynamic VLAN
- add panic handler for resetting the firmware state
- DebugFS support for datapath statistics
- WCN7850: support for Wake on WLAN
- Microchip (wilc1000):
- read MAC address during probe to make it visible to user space
- suspend/resume improvements
- TI (wl18xx):
- support newer firmware versions
- RealTek (rtw89):
- preparation for RTL8852BE-VT support
- Wake on WLAN support for WiFi 6 chips
- 36-bit PCI DMA support
- RealTek (rtlwifi):
- RTL8192DU support
- Broadcom (brcmfmac):
- Management Frame Protection support (to enable WPA3)
- Bluetooth:
- qualcomm: use the power sequencer for QCA6390
- btusb: mediatek: add ISO data transmission functions
- hci_bcm4377: add BCM4388 support
- btintel: add support for BlazarU core
- btintel: add support for Whale Peak2
- btnxpuart: add support for AW693 A1 chipset
- btnxpuart: add support for IW615 chipset
- btusb: add Realtek RTL8852BE support ID 0x13d3:0x3591"
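The default rto_min knob listed under "Core & protocols" above can be inspected from user space. The following is a minimal sketch, not taken from the pull request: it assumes the knob is exposed as /proc/sys/net/ipv4/tcp_rto_min_us and holds a value in microseconds (check the ip-sysctl documentation for your kernel), and the helper names read_int_sysctl() and describe_rto_min() are made up for illustration.

```python
from pathlib import Path
from typing import Optional

# Assumption: the new knob lives here and is expressed in microseconds.
# The path is not stated in the merge message above.
RTO_MIN_SYSCTL = Path("/proc/sys/net/ipv4/tcp_rto_min_us")


def read_int_sysctl(path: Path) -> Optional[int]:
    """Return the integer stored in a sysctl file, or None if unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        # Missing file (older kernel), no permission, or unexpected content.
        return None


def describe_rto_min(path: Path = RTO_MIN_SYSCTL) -> str:
    """Render the default rto_min in a human-readable form."""
    value = read_int_sysctl(path)
    if value is None:
        return "default rto_min knob not available"
    return f"default rto_min: {value} us ({value / 1000:.1f} ms)"


if __name__ == "__main__":
    print(describe_rto_min())
```

On kernels without the knob the helper reports that it is unavailable rather than failing, so the same script can be run on older systems.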
* tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1589 commits)
eth: fbnic: Fix spelling mistake "tiggerring" -> "triggering"
tcp: Replace strncpy() with strscpy()
wifi: ath12k: fix build vs old compiler
tcp: Don't access uninit tcp_rsk(req)->ao_keyid in tcp_create_openreq_child().
eth: fbnic: Write the TCAM tables used for RSS control and Rx to host
eth: fbnic: Add L2 address programming
eth: fbnic: Add basic Rx handling
eth: fbnic: Add basic Tx handling
eth: fbnic: Add link detection
eth: fbnic: Add initial messaging to notify FW of our presence
eth: fbnic: Implement Rx queue alloc/start/stop/free
eth: fbnic: Implement Tx queue alloc/start/stop/free
eth: fbnic: Allocate a netdevice and napi vectors with queues
eth: fbnic: Add FW communication mechanism
eth: fbnic: Add message parsing for FW messages
eth: fbnic: Add register init to set PCIe/Ethernet device config
eth: fbnic: Allocate core device specific structures and devlink interface
eth: fbnic: Add scaffolding for Meta's NIC driver
PCI: Add Meta Platforms vendor ID
net/sched: cls_flower: propagate tca[TCA_OPTIONS] to NL_REQ_ATTR_CHECK
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull performance events updates from Ingo Molnar:
- Intel PT support enhancements & fixes
- Fix leaked SIGTRAP events
- Improve and fix the Intel uncore driver
- Add support for Intel HBM and CXL uncore counters
- Add Intel Lunar Lake and Arrow Lake support
- AMD uncore driver fixes
- Make SIGTRAP and __perf_pending_irq() work on RT
- Micro-optimizations
- Misc cleanups and fixes
* tag 'perf-core-2024-07-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (44 commits)
perf/x86/intel: Add a distinct name for Granite Rapids
perf/x86/intel/ds: Fix non 0 retire latency on Raptorlake
perf/x86/intel: Hide Topdown metrics events if the feature is not enumerated
perf/x86/intel/uncore: Fix the bits of the CHA extended umask for SPR
perf: Split __perf_pending_irq() out of perf_pending_irq()
perf: Don't disable preemption in perf_pending_task().
perf: Move swevent_htable::recursion into task_struct.
perf: Shrink the size of the recursion counter.
perf: Enqueue SIGTRAP always via task_work.
task_work: Add TWA_NMI_CURRENT as an additional notify mode.
perf: Move irq_work_queue() where the event is prepared.
perf: Fix event leak upon exec and file release
perf: Fix event leak upon exit
task_work: Introduce task_work_cancel() again
task_work: s/task_work_cancel()/task_work_cancel_func()/
perf/x86/amd/uncore: Fix DF and UMC domain identification
perf/x86/amd/uncore: Avoid PMU registration if counters are unavailable
perf/x86/intel: Support Perfmon MSRs aliasing
perf/x86/intel: Support PERFEVTSEL extension
perf/x86: Add config_mask to represent EVENTSEL bitmask
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
- Jump label fixes, including a perf events fix that originally
manifested as jump label failures, but was a serialization bug at the
usage site
- Mark down_write*() helpers as __always_inline, to improve WCHAN
debuggability
- Misc cleanups and fixes
* tag 'locking-core-2024-07-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
locking/rwsem: Add __always_inline annotation to __down_write_common() and inlined callers
jump_label: Simplify and clarify static_key_fast_inc_cpus_locked()
jump_label: Clarify condition in static_key_fast_inc_not_disabled()
jump_label: Fix concurrency issues in static_key_slow_dec()
perf/x86: Serialize set_attr_rdpmc()
cleanup: Standardize the header guard define's name
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These add a new cpufreq driver for Loongson-3, add support for new
features in the intel_pstate (Lunar Lake and Arrow Lake platforms, OOB
mode for Emerald Rapids, highest performance change interrupt),
amd-pstate (fast CPPC) and sun50i (Allwinner H700 speed bin) cpufreq
drivers, simplify the cpufreq driver interface, simplify the teo
cpuidle governor, adjust the pm-graph utility for a new version of
Python, address issues and clean up code.
Specifics:
- Add Loongson-3 CPUFreq driver support (Huacai Chen)
- Add support for the Arrow Lake and Lunar Lake platforms and the
out-of-band (OOB) mode on Emerald Rapids to the intel_pstate
cpufreq driver, make it support the highest performance change
interrupt and clean it up (Srinivas Pandruvada)
- Switch cpufreq to new Intel CPU model defines (Tony Luck)
- Simplify the cpufreq driver interface by switching the .exit()
driver callback to the void return data type (Lizhe, Viresh Kumar)
- Make cpufreq_boost_enabled() return bool (Dhruva Gole)
- Add fast CPPC support to the amd-pstate cpufreq driver, address
multiple assorted issues in it and clean it up (Perry Yuan, Mario
Limonciello, Dhananjay Ugwekar, Meng Li, Xiaojian Du)
- Add Allwinner H700 speed bin to the sun50i cpufreq driver (Ryan
Walklin)
- Fix memory leaks and of_node_put() usage in the sun50i and
qcom-nvmem cpufreq drivers (Javier Carrasco)
- Clean up the sti and dt-platdev cpufreq drivers (Jeff Johnson,
Raphael Gallais-Pou)
- Fix deferred probe handling in the TI cpufreq driver and wrong
return values of ti_opp_supply_probe(), and add OPP tables for the
AM62Ax and AM62Px SoCs to it (Bryan Brattlof, Primoz Fiser)
- Avoid overflow of target_freq in .fast_switch() in the SCMI cpufreq
driver (Jagadeesh Kona)
- Use dev_err_probe() in every error path in probe in the Mediatek
cpufreq driver (Nícolas Prado)
- Fix kernel-doc param for longhaul_setstate in the longhaul cpufreq
driver (Yang Li)
- Fix system resume handling in the CPPC cpufreq driver (Riwen Lu)
- Improve the teo cpuidle governor and clean up leftover comments
from the menu cpuidle governor (Christian Loehle)
- Clean up a comment typo in the teo cpuidle governor (Atul Kumar
Pant)
- Add missing MODULE_DESCRIPTION() macro to cpuidle haltpoll (Jeff
Johnson)
- Switch the intel_idle driver to new Intel CPU model defines (Tony
Luck)
- Switch the Intel RAPL driver to new Intel CPU model defines (Tony
Luck)
- Simplify if condition in the idle_inject driver (Thorsten Blum)
- Fix missing cleanup on error in _opp_attach_genpd() (Viresh Kumar)
- Introduce an OF helper function to inform if required-opps is used
and drop a redundant in-parameter to _set_opp_level() (Ulf Hansson)
- Update pm-graph to v5.12 which includes fixes and major code revamp
for python3.12 (Todd Brandt)
- Address several assorted issues in the cpupower utility (Roman
Storozhenko)"
* tag 'pm-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (77 commits)
cpufreq: sti: fix build warning
cpufreq: mediatek: Use dev_err_probe in every error path in probe
cpufreq: Add Loongson-3 CPUFreq driver support
cpufreq: Make cpufreq_driver->exit() return void
cpufreq/amd-pstate: Fix the scaling_max_freq setting on shared memory CPPC systems
cpufreq/amd-pstate-ut: Convert nominal_freq to khz during comparisons
cpufreq: pcc: Remove empty exit() callback
cpufreq: loongson2: Remove empty exit() callback
cpufreq: nforce2: Remove empty exit() callback
cpupower: fix lib default installation path
cpufreq: docs: Add missing scaling_available_frequencies description
cpuidle: teo: Don't count non-existent intercepts
cpupower: Disable direct build of the 'bench' subproject
cpuidle: teo: Remove recent intercepts metric
Revert: "cpuidle: teo: Introduce util-awareness"
cpufreq: make cpufreq_boost_enabled() return bool
cpufreq: intel_pstate: Support highest performance change interrupt
x86/cpufeatures: Add HWP highest perf change feature flag
Documentation: cpufreq: amd-pstate: update doc for Per CPU boost control method
cpufreq: amd-pstate: Cap the CPPC.max_perf to nominal_perf if CPB is off
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull hardening updates from Kees Cook:
- lkdtm/bugs: add test for hung smp_call_function_single() (Mark
Rutland)
- gcc-plugins: Remove duplicate included header file stringpool.h
(Thorsten Blum)
- ARM: Remove address checking for MMUless devices (Yanjun Yang)
- randomize_kstack: Clean up per-arch entropy and codegen
- KCFI: Make FineIBT mode Kconfig selectable
- fortify: Do not special-case 0-sized destinations
* tag 'hardening-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
randomize_kstack: Improve stack alignment codegen
ARM: Remove address checking for MMUless devices
gcc-plugins: Remove duplicate included header file stringpool.h
randomize_kstack: Remove non-functional per-arch entropy filtering
fortify: Do not special-case 0-sized destinations
x86/alternatives: Make FineIBT mode Kconfig selectable
lkdtm/bugs: add test for hung smp_call_function_single()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen updates from Juergen Gross:
- some trivial cleanups
- a fix for the Xen timer
- add boot time selectable debug capability to the Xen multicall
handling
- two fixes for the recently added Xen irqfd handling
* tag 'for-linus-6.11-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: remove deprecated xen_nopvspin boot parameter
x86/xen: eliminate some private header files
x86/xen: make some functions static
xen: make multicall debug boot time selectable
xen/arm: Convert comma to semicolon
xen: privcmd: Fix possible access to a freed kirqfd instance
xen: privcmd: Switch from mutex to spinlock for irqfds
xen: add missing MODULE_DESCRIPTION() macros
x86/xen: Convert comma to semicolon
x86/xen/time: Reduce Xen timer tick
xen/manage: Constify struct shutdown_handler
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI updates from Ard Biesheuvel:
"Note the removal of the EFI fake memory map support - this is believed
to be unused and no longer worth supporting. However, we could easily
bring it back if needed.
With recent developments regarding confidential VMs and unaccepted
memory, combined with kexec, creating a known inaccurate view of the
firmware's memory map and handing it to the OS is a feature we can
live without, hence the removal. Alternatively, I could imagine making
this feature mutually exclusive with those confidential VM related
features, but let's try simply removing it first.
Summary:
- Drop support for the 'fake' EFI memory map on x86
- Add an SMBIOS based tweak to the EFI stub instructing the firmware
on x86 Macbook Pros to keep both GPUs enabled
- Replace 0-sized array with flexible array in EFI memory attributes
table handling
- Drop redundant BSS clearing when booting via the native PE
entrypoint on x86
- Avoid returning EFI_SUCCESS when aborting on an out-of-memory
condition
- Cosmetic tweak for arm64 KASLR loading logic"
* tag 'efi-next-for-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi: Replace efi_memory_attributes_table_t 0-sized array with flexible array
efi: Rename efi_early_memdesc_ptr() to efi_memdesc_ptr()
arm64/efistub: Clean up KASLR logic
x86/efistub: Drop redundant clearing of BSS
x86/efistub: Avoid returning EFI_SUCCESS on error
x86/efistub: Call Apple set_os protocol on dual GPU Intel Macs
x86/efistub: Enable SMBIOS protocol handling for x86
efistub/smbios: Simplify SMBIOS enumeration API
x86/efi: Drop support for fake EFI memory maps
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
"Most of this is part of my ongoing work to clean up the system call
tables. In this bit, all of the newer architectures are converted to
use the machine readable syscall.tbl format in place of
complex macros in include/uapi/asm-generic/unistd.h.
This follows an earlier series that fixed various API mismatches and
in turn is used as the base for planned simplifications.
The other two patches are dead code removal and a warning fix"
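To illustrate why a machine readable table is easier to work with than macros, here is a minimal sketch of a syscall.tbl-style parser. It assumes the usual whitespace-separated layout of number, ABI, name, entry point and an optional compat entry point, with '#' starting a comment. The sample rows and the names SyscallEntry and parse_syscall_tbl() are illustrative only and are not copied from any particular tree.

```python
from typing import List, NamedTuple, Optional


class SyscallEntry(NamedTuple):
    number: int
    abi: str
    name: str
    entry: Optional[str]   # native entry point, if listed
    compat: Optional[str]  # compat entry point, if listed


def parse_syscall_tbl(text: str) -> List[SyscallEntry]:
    """Parse syscall.tbl-style text into a list of entries."""
    entries = []
    for raw in text.splitlines():
        # Drop comments and surrounding whitespace; skip blank lines.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) < 3:
            raise ValueError(f"malformed row: {raw!r}")
        number, abi, name = fields[:3]
        entry = fields[3] if len(fields) > 3 else None
        compat = fields[4] if len(fields) > 4 else None
        entries.append(SyscallEntry(int(number), abi, name, entry, compat))
    return entries


# Illustrative rows only (hypothetical sample, not an authoritative table).
SAMPLE = """
# number  abi     name      entry point    compat entry point
0         common  io_setup  sys_io_setup   compat_sys_io_setup
63        common  read      sys_read
64        common  write     sys_write
"""

table = parse_syscall_tbl(SAMPLE)
by_name = {e.name: e.number for e in table}
```

A lookup such as by_name["read"] then replaces what would otherwise require preprocessing a header full of macros.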
* tag 'asm-generic-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
vmlinux.lds.h: catch .bss..L* sections into BSS")
fixmap: Remove unused set_fixmap_offset_io()
riscv: convert to generic syscall table
openrisc: convert to generic syscall table
nios2: convert to generic syscall table
loongarch: convert to generic syscall table
hexagon: use new system call table
csky: convert to generic syscall table
arm64: rework compat syscall macros
arm64: generate 64-bit syscall.tbl
arm64: convert unistd_32.h to syscall.tbl format
arc: convert to generic syscall table
clone3: drop __ARCH_WANT_SYS_CLONE3 macro
kbuild: add syscall table generation to scripts/Makefile.asm-headers
kbuild: verify asm-generic header list
loongarch: avoid generating extra header files
um: don't generate asm/bpf_perf_event.h
csky: drop asm/gpio.h wrapper
syscalls: add generic scripts/syscall.tbl
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SEV updates from Borislav Petkov:
- Add support for running the kernel in a SEV-SNP guest, over a Secure
VM Service Module (SVSM).
When running over a SVSM, different services can run at different
protection levels, apart from the guest OS but still within the
secure SNP environment. They can provide services to the guest, like
a vTPM, for example.
This series adds the required facilities to interface with such a
SVSM module.
- The usual fixlets, refactoring and cleanups
[ And as always: "SEV" is AMD's "Secure Encrypted Virtualization".
I can't be the only one who gets all the newer x86 TLA's confused,
can I?
- Linus ]
* tag 'x86_sev_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Documentation/ABI/configfs-tsm: Fix an unexpected indentation silly
x86/sev: Do RMP memory coverage check after max_pfn has been set
x86/sev: Move SEV compilation units
virt: sev-guest: Mark driver struct with __refdata to prevent section mismatch
x86/sev: Allow non-VMPL0 execution when an SVSM is present
x86/sev: Extend the config-fs attestation support for an SVSM
x86/sev: Take advantage of configfs visibility support in TSM
fs/configfs: Add a callback to determine attribute visibility
sev-guest: configfs-tsm: Allow the privlevel_floor attribute to be updated
virt: sev-guest: Choose the VMPCK key based on executing VMPL
x86/sev: Provide guest VMPL level to userspace
x86/sev: Provide SVSM discovery support
x86/sev: Use the SVSM to create a vCPU when not in VMPL0
x86/sev: Perform PVALIDATE using the SVSM when not at VMPL0
x86/sev: Use kernel provided SVSM Calling Areas
x86/sev: Check for the presence of an SVSM in the SNP secrets page
x86/irqflags: Provide native versions of the local_irq_save()/restore()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 resource control updates from Borislav Petkov:
- Enable Sub-NUMA clustering to work with resource control on Intel by
teaching resctrl to handle scopes due to the clustering which
partitions the L3 cache into sets. Modify and extend the subsystem to
handle such scopes properly
* tag 'x86_cache_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/resctrl: Update documentation with Sub-NUMA cluster changes
x86/resctrl: Detect Sub-NUMA Cluster (SNC) mode
x86/resctrl: Enable shared RMID mode on Sub-NUMA Cluster (SNC) systems
x86/resctrl: Make __mon_event_count() handle sum domains
x86/resctrl: Fill out rmid_read structure for smp_call*() to read a counter
x86/resctrl: Handle removing directories in Sub-NUMA Cluster (SNC) mode
x86/resctrl: Create Sub-NUMA Cluster (SNC) monitor files
x86/resctrl: Allocate a new field in union mon_data_bits
x86/resctrl: Refactor mkdir_mondata_subdir() with a helper function
x86/resctrl: Initialize on-stack struct rmid_read instances
x86/resctrl: Add a new field to struct rmid_read for summation of domains
x86/resctrl: Prepare for new Sub-NUMA Cluster (SNC) monitor files
x86/resctrl: Block use of mba_MBps mount option on Sub-NUMA Cluster (SNC) systems
x86/resctrl: Introduce snc_nodes_per_l3_cache
x86/resctrl: Add node-scope to the options for feature scope
x86/resctrl: Split the rdt_domain and rdt_hw_domain structures
x86/resctrl: Prepare for different scope for control/monitor operations
x86/resctrl: Prepare to split rdt_domain structure
x86/resctrl: Prepare for new domain scope
|
|
Similar to kvm_x86_call(), kvm_pmu_call() is added to streamline the usage
of static calls of kvm_pmu_ops, which improves code readability.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Link: https://lore.kernel.org/r/20240507133103.15052-4-wei.w.wang@intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Introduces kvm_x86_call(), to streamline the usage of static calls of
kvm_x86_ops. The current implementation of these calls is verbose and
could lead to alignment challenges. This makes the code susceptible to
exceeding the "80 columns per single line of code" limit as defined in
the coding-style document. Another issue with the existing implementation
is that the addition of kvm_x86_ prefix to hooks at the static_call sites
hinders code readability and navigation. kvm_x86_call() is added to
improve code readability and maintainability, while adhering to the coding
style guidelines.
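The idea of hiding a common prefix behind a wrapper macro can be sketched in plain userspace C. This is only an analogy: the function pointer below stands in for a static call key, and every `demo_` name is hypothetical rather than part of KVM.

```c
/* Hypothetical hook implementation. */
static int demo_get_cpl_impl(int vcpu_id)
{
	return vcpu_id & 3;
}

/* One global pointer per hook, all sharing the "demo_x86_" prefix. */
static int (*demo_x86_get_cpl)(int) = demo_get_cpl_impl;

/* Stand-in for a static call: here it simply dereferences the pointer. */
#define demo_static_call(name)	(*(name))

/* Wrapper that pastes the prefix so call sites only name the hook. */
#define demo_x86_call(func)	demo_static_call(demo_x86_##func)
```

With this in place, `demo_x86_call(get_cpl)(5)` expands to the same call as `demo_static_call(demo_x86_get_cpl)(5)`, but the call site is shorter and the hook name is easier to search for.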
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Link: https://lore.kernel.org/r/20240507133103.15052-3-wei.w.wang@intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The use of static_call_cond() is essentially the same as static_call() on
x86 (e.g. static_call() now handles a NULL pointer as a NOP), so replace
it with static_call() to simplify the code.
Link: https://lore.kernel.org/all/3916caa1dcd114301a49beafa5030eca396745c1.1679456900.git.jpoimboe@kernel.org/
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
Link: https://lore.kernel.org/r/20240507133103.15052-2-wei.w.wang@intel.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
The GHCB 2.0 specification defines 2 GHCB request types to allow SNP guests
to send encrypted messages/requests to firmware: SNP Guest Requests and SNP
Extended Guest Requests. These encrypted messages are used for things like
servicing attestation requests issued by the guest. Implementing support for
these is required to be fully GHCB-compliant.
For the most part, KVM only needs to handle forwarding these requests to
firmware (to be issued via the SNP_GUEST_REQUEST firmware command defined
in the SEV-SNP Firmware ABI), and then forwarding the encrypted response to
the guest.
However, in the case of SNP Extended Guest Requests, the host is also
able to provide the certificate data corresponding to the endorsement key
used by firmware to sign attestation report requests. This certificate data
is provided by userspace because:
1) It allows for different keys/key types to be used for each particular
guest without requiring any sort of KVM API to configure the certificate
table in advance on a per-guest basis.
2) It provides additional flexibility with how attestation requests might
be handled during live migration where the certificate data for
source/dest might be different.
3) It allows all synchronization between certificates and firmware/signing
key updates to be handled purely by userspace rather than requiring
some in-kernel mechanism to facilitate it. [1]
Fetching certificate data from userspace requires a new KVM exit type.
An attempt to
define a new KVM_EXIT_COCO/KVM_EXIT_COCO_REQ_CERTS exit type to handle this
was introduced in v1 of this patchset, but is still being discussed by
the community, so for now this patchset only implements a stub version of SNP
Extended Guest Requests that does not provide certificate data, but is still
enough to provide compliance with the GHCB 2.0 spec.
|
|
Version 2 of GHCB specification added support for the SNP Extended Guest
Request Message NAE event. This event serves a nearly identical purpose
to the previously-added SNP_GUEST_REQUEST event, but for certain message
types it allows the guest to supply an additional buffer for extra
information.
Currently the GHCB spec only defines extended handling of this sort in
the case of attestation requests, where the additional buffer is used to
supply a table of certificate data corresponding to the attestation
report's signing key. Support for this extended handling will require
additional KVM APIs to handle coordinating with userspace.
The hypervisor is not required to provide this certificate data.
However, support for processing SNP_EXTENDED_GUEST_REQUEST
GHCB requests is required by the GHCB 2.0 specification for SNP guests,
so for now implement a stub implementation that provides an empty
certificate table to the guest if it supplies an additional buffer, but
otherwise behaves identically to SNP_GUEST_REQUEST.
Reviewed-by: Carlos Bilbao <carlos.bilbao.osdev@gmail.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-ID: <20240701223148.3798365-4-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
sev_guest.h currently contains various definitions relating to the
format of SNP_GUEST_REQUEST commands to SNP firmware. Currently only the
sev-guest driver makes use of them, but when the KVM side of this is
implemented there's a need to parse the SNP_GUEST_REQUEST header to
determine whether additional information needs to be provided to the
guest. Prepare for this by moving those definitions to a common header
that's shared by host/guest code so that KVM can also make use of them.
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-ID: <20240701223148.3798365-3-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Version 2 of GHCB specification added support for the SNP Guest Request
Message NAE event. The event allows for an SEV-SNP guest to make
requests to the SEV-SNP firmware through the hypervisor using the
SNP_GUEST_REQUEST API defined in the SEV-SNP firmware specification.
This is used by guests primarily to request attestation reports from
firmware. Other request types are available as well, but the
specifics of what guest requests are being made generally do not
affect how they are handled by the hypervisor, which only serves as a
proxy for the guest requests and firmware responses.
Implement handling for these events.
When an SNP Guest Request is issued, the guest will provide its own
request/response pages, which could in theory be passed along directly
to firmware. However, these pages would need special care:
- Both pages are from shared guest memory, so they need to be
protected from migration/etc. occurring while firmware reads/writes
to them. At a minimum, this requires elevating the ref counts and
potentially needing an explicit pinning of the memory. This places
additional restrictions on what type of memory backends userspace
can use for shared guest memory since there would be some reliance
on using refcounted pages.
- The response page needs to be switched to Firmware-owned state
before the firmware can write to it, which can lead to potential
host RMP #PFs if the guest is misbehaved and hands the host a
guest page that KVM is writing to for other reasons (e.g. virtio
buffers).
Both of these issues can be avoided completely by using
separately-allocated bounce pages for both the request/response pages
and passing those to firmware instead. So that's the approach taken
here.
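The bounce-page approach can be sketched as follows. This is a userspace illustration under stated assumptions, not the KVM implementation: `demo_firmware_guest_request()` is a hypothetical stand-in for the firmware command, and the page size and buffer handling are simplified.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096

/* Hypothetical firmware stand-in: reads the request, writes a response. */
static int demo_firmware_guest_request(const uint8_t *req, uint8_t *resp)
{
	size_t i;

	for (i = 0; i < DEMO_PAGE_SIZE; i++)
		resp[i] = (uint8_t)~req[i];
	return 0;
}

/*
 * Firmware only ever touches the separately allocated bounce pages, so the
 * guest-provided pages need no pinning and no ownership-state change.
 */
static int demo_handle_guest_request(const uint8_t *guest_req, uint8_t *guest_resp)
{
	static uint8_t bounce_req[DEMO_PAGE_SIZE];
	static uint8_t bounce_resp[DEMO_PAGE_SIZE];
	int ret;

	/* Snapshot the request so later guest writes cannot affect firmware. */
	memcpy(bounce_req, guest_req, DEMO_PAGE_SIZE);
	memset(bounce_resp, 0, DEMO_PAGE_SIZE);

	ret = demo_firmware_guest_request(bounce_req, bounce_resp);
	if (ret)
		return ret;

	/* Copy the response back into the guest's shared page. */
	memcpy(guest_resp, bounce_resp, DEMO_PAGE_SIZE);
	return 0;
}
```

The cost is two extra page copies per request, which is traded for not having to care how the guest's shared memory is backed.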
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Alexey Kardashevskiy <aik@amd.com>
Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
Co-developed-by: Ashish Kalra <ashish.kalra@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
[mdr: ensure FW command failures are indicated to guest, drop extended
request handling to be re-written as separate patch, massage commit]
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-ID: <20240701223148.3798365-2-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Explicitly suppress userspace emulated MMIO exits that are triggered when
emulating a task switch as KVM doesn't support userspace MMIO during
complex (multi-step) emulation. Silently ignoring the exit request can
result in the WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to
userspace for some other reason prior to purging mmio_needed.
See commit 0dc902267cb3 ("KVM: x86: Suppress pending MMIO write exits if
emulator detects exception") for more details on KVM's limitations with
respect to emulated MMIO during complex emulator flows.
Reported-by: syzbot+2fb9f8ed752c01bc9a3f@syzkaller.appspotmail.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20240712144841.1230591-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Tweak the definition of make_huge_page_split_spte() to eliminate an
unnecessarily long line, and opportunistically initialize child_spte to
make it more obvious that the child is directly derived from the huge
parent.
No functional change intended.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20240712151335.1242633-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Bug the VM instead of simply warning if KVM tries to split a SPTE that is
non-present or not-huge. KVM is guaranteed to end up in a broken state as
the callers fully expect a valid SPTE, e.g. the shadow MMU will add an
rmap entry, and all MMUs will account the expected small page. Returning
'0' is also technically wrong now that SHADOW_NONPRESENT_VALUE exists,
i.e. would cause KVM to create a potential #VE SPTE.
While it would be possible to have the callers gracefully handle failure,
doing so would provide no practical value as the scenario really should be
impossible, while the error handling would add a non-trivial amount of
noise.
Fixes: a3fe5dbda0a4 ("KVM: x86/mmu: Split huge pages mapped by the TDP MMU when dirty logging is enabled")
Cc: David Matlack <dmatlack@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-ID: <20240712151335.1242633-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
KVM VMX changes for 6.11
- Remove an unnecessary EPT TLB flush when enabling hardware.
- Fix a series of bugs that cause KVM to fail to detect nested pending posted
interrupts as valid wake events for a vCPU executing HLT in L2 (with
HLT-exiting disabled by L1).
- Misc cleanups
|
|
KVM SVM changes for 6.11
- Make per-CPU save_area allocations NUMA-aware.
- Force sev_es_host_save_area() to be inlined to avoid calling into an
instrumentable function from noinstr code.
|
|
KVM x86/pmu changes for 6.11
- Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as it reads
'0' and writes from userspace are ignored.
- Update to the newfangled Intel CPU FMS infrastructure.
- Use macros instead of open-coded literals to clean up KVM's manipulation of
FIXED_CTR_CTRL MSRs.
|
|
KVM x86 MTRR virtualization removal
Remove support for virtualizing MTRRs on Intel CPUs, along with a nasty CR0.CD
hack, and instead always honor guest PAT on CPUs that support self-snoop.
|
|
KVM x86 MMU changes for 6.11
- Don't allocate kvm_mmu_page.shadowed_translation for shadow pages that can't
hold leaf SPTEs.
- Unconditionally drop mmu_lock when allocating TDP MMU page tables for eager
page splitting to avoid stalling vCPUs when splitting huge pages.
- Misc cleanups
|
|
KVM x86 misc changes for 6.11
- Add a global struct to consolidate tracking of host values, e.g. EFER, and
move "shadow_phys_bits" into the structure as "maxphyaddr".
- Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC
bus frequency, because TDX.
- Print the name of the APICv/AVIC inhibits in the relevant tracepoint.
- Clean up KVM's handling of vendor specific emulation to consistently act on
"compatible with Intel/AMD", versus checking for a specific vendor.
- Misc cleanups
|
|
KVM generic changes for 6.11
- Enable halt poll shrinking by default, as Intel found it to be a clear win.
- Set up empty IRQ routing when creating a VM to avoid having to synchronize
SRCU when creating a split IRQCHIP on x86.
- Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag
that arch code can use for hooking both sched_in() and sched_out().
- Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
truncating a bogus value from userspace, e.g. to help userspace detect bugs.
- Mark a vCPU as preempted if and only if it's scheduled out while in the
KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest
memory when retrieving guest state during live migration blackout.
- A few minor cleanups
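The vCPU @id item above is easiest to see with a concrete value. The sketch below assumes a 64-bit `unsigned long` (modeled here with `uint64_t`) and a hypothetical ID limit; the function names are illustrative only.

```c
#include <stdint.h>

#define DEMO_MAX_VCPU_IDS 1024	/* hypothetical limit */

/* Old style: the ID has already been truncated to 32 bits on entry. */
static int demo_create_vcpu_u32(uint32_t id)
{
	return id < DEMO_MAX_VCPU_IDS ? 0 : -1;
}

/* New style: the full-width value is range-checked before use. */
static int demo_create_vcpu_u64(uint64_t id)
{
	return id < DEMO_MAX_VCPU_IDS ? 0 : -1;
}
```

A bogus value such as `0x100000001` truncates to `1` and is silently accepted by the 32-bit variant, while the full-width variant rejects it, so userspace gets to see its own bug.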
|
|
KVM Xen:
Fix a bug where KVM fails to check the validity of an incoming userspace
virtual address and tries to activate a gfn_to_pfn_cache with a kernel address.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu model updates from Borislav Petkov:
- Flip the logic for adding feature names to /proc/cpuinfo: a flag now
has to be explicitly specified if there's a valid reason to show it in
/proc/cpuinfo
- Switch a bunch of Intel x86 model checking code to the new CPU model
defines
- Fixes and cleanups
* tag 'x86_cpu_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu/intel: Drop stray FAM6 check with new Intel CPU model defines
x86/cpufeatures: Flip the /proc/cpuinfo appearance logic
x86/CPU/AMD: Always inline amd_clear_divider()
x86/mce/inject: Add missing MODULE_DESCRIPTION() line
perf/x86/rapl: Switch to new Intel CPU model defines
x86/boot: Switch to new Intel CPU model defines
x86/cpu: Switch to new Intel CPU model defines
perf/x86/intel: Switch to new Intel CPU model defines
x86/virt/tdx: Switch to new Intel CPU model defines
x86/PCI: Switch to new Intel CPU model defines
x86/cpu/intel: Switch to new Intel CPU model defines
x86/platform/intel-mid: Switch to new Intel CPU model defines
x86/pconfig: Remove unused MKTME pconfig code
x86/cpu: Remove useless work in detect_tme_early()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu mitigation updates from Borislav Petkov:
- Add a spectre_bhi=vmexit mitigation option aimed at cloud
environments
- Remove duplicated Spectre cmdline option documentation
- Add separate macro definitions for syscall handlers which do not
return in order to address objtool warnings
* tag 'x86_bugs_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/bugs: Add 'spectre_bhi=vmexit' cmdline option
x86/bugs: Remove duplicate Spectre cmdline option descriptions
x86/syscall: Mark exit[_group] syscall handlers __noreturn
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 vmware updates from Borislav Petkov:
- Add a unified VMware hypercall API layer which should be used by all
callers instead of them doing homegrown solutions. This will provide
for adding API support for confidential computing solutions like TDX
* tag 'x86_vmware_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/vmware: Add TDX hypercall support
x86/vmware: Remove legacy VMWARE_HYPERCALL* macros
x86/vmware: Correct macro names
x86/vmware: Use VMware hypercall API
drm/vmwgfx: Use VMware hypercall API
input/vmmouse: Use VMware hypercall API
ptp/vmware: Use VMware hypercall API
x86/vmware: Introduce VMware hypercall API
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 updates from Borislav Petkov:
- Make error checking of AMD SMN accesses more robust in the callers as
they're the only ones who can interpret the results properly
- The usual cleanups and fixes, left and right
* tag 'x86_misc_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/kmsan: Fix hook for unaligned accesses
x86/platform/iosf_mbi: Convert PCIBIOS_* return codes to errnos
x86/pci/xen: Fix PCIBIOS_* return code handling
x86/pci/intel_mid_pci: Fix PCIBIOS_* return code handling
x86/of: Return consistent error type from x86_of_pci_irq_enable()
hwmon: (k10temp) Rename _data variable
hwmon: (k10temp) Remove unused HAVE_TDIE() macro
hwmon: (k10temp) Reduce k10temp_get_ccd_support() parameters
hwmon: (k10temp) Define a helper function to read CCD temperature
x86/amd_nb: Enhance SMN access error checking
hwmon: (k10temp) Check return value of amd_smn_read()
EDAC/amd64: Check return value of amd_smn_read()
EDAC/amd64: Remove unused register accesses
tools/x86/kcpuid: Add missing dir via Makefile
x86, arm: Add missing license tag to syscall tables files
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 build update from Borislav Petkov:
- Make sure insn support detection uses the proper compiler flag in
bi-arch builds
* tag 'x86_build_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/kconfig: Add as-instr64 macro to properly evaluate AS_WRUSS
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 uaccess update from Borislav Petkov:
- Clean up the 8-byte getuser() asm case
* tag 'x86_core_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/uaccess: Improve the 8-byte getuser() case
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 confidential computing updates from Borislav Petkov:
"Unrelated x86/cc changes queued here to avoid ugly cross-merges and
conflicts:
- Carve out CPU hotplug function declarations into a separate header
with the goal to be able to use the lockdep assertions in a more
flexible manner
- As a result, refactor cacheinfo code after carving out a function
to return the cache ID associated with a given cache level
- Cleanups
Add support to be able to kexec TDX guests:
- Expand ACPI MADT CPU offlining support
- Add machinery to prepare CoCo guests memory before kexec-ing into a
new kernel
- Cleanup, readjust and massage related code"
* tag 'x86_cc_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
ACPI: tables: Print MULTIPROC_WAKEUP when MADT is parsed
x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method
x86/mm: Introduce kernel_ident_mapping_free()
x86/smp: Add smp_ops.stop_this_cpu() callback
x86/acpi: Do not attempt to bring up secondary CPUs in the kexec case
x86/acpi: Rename fields in the acpi_madt_multiproc_wakeup structure
x86/mm: Do not zap page table entries mapping unaccepted memory table during kdump
x86/mm: Make e820__end_ram_pfn() cover E820_TYPE_ACPI ranges
x86/tdx: Convert shared memory back to private on kexec
x86/mm: Add callbacks to prepare encrypted memory for kexec
x86/tdx: Account shared memory
x86/mm: Return correct level from lookup_address() if pte is none
x86/mm: Make x86_platform.guest.enc_status_change_*() return an error
x86/kexec: Keep CR4.MCE set during kexec for TDX guest
x86/relocate_kernel: Use named labels for less confusion
cpu/hotplug, x86/acpi: Disable CPU offlining for ACPI MADT wakeup
cpu/hotplug: Add support for declaring CPU offlining not supported
x86/apic: Mark acpi_mp_wake_* variables as __ro_after_init
x86/acpi: Extract ACPI MADT wakeup code into a separate file
x86/kexec: Remove spurious unconditional JMP from identity_mapped()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Borislav Petkov:
- Remove an unused function and the documentation of an already removed
cmdline parameter
* tag 'x86_cleanups_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot: Remove unused function __fortify_panic()
Documentation: Remove "mfgpt_irq=" from the kernel-parameters.txt file
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Borislav Petkov:
- Add a check to warn when cmdline parsing happens before the final
cmdline string has been built and thus arguments can get lost
- Code cleanups and simplifications
* tag 'x86_boot_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/setup: Warn when option parsing is done too early
x86/boot: Clean up the arch/x86/boot/main.c code a bit
x86/boot: Use current_stack_pointer to avoid asm() in init_heap()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 alternatives updates from Borislav Petkov:
"This is basically PeterZ's idea to nest the alternative macros to
avoid the need to "spell out" the number of alternates in an
ALTERNATIVE_n() macro and thus have an ever-increasing complexity in
those definitions.
For ease of bisection, the old macros are converted to the new, nested
variants in a step-by-step manner so that, if an issue is encountered
during testing, one can more easily pinpoint the place where it
fails.
Because debugging alternatives is a serious pain"
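The nesting idea can be mimicked with ordinary expression macros. This is a loose userspace analogy, assuming a run-time feature bitmask instead of boot-time instruction patching; all `DEMO_` names are hypothetical.

```c
/* Hypothetical feature bitmask consulted at run time. */
static unsigned int demo_features;

#define DEMO_FEAT_A 0x1u
#define DEMO_FEAT_B 0x2u

/* Base case: pick newval if the feature is set, else keep oldval. */
#define DEMO_ALT(oldval, newval, feat) \
	((demo_features & (feat)) ? (newval) : (oldval))

/* Two alternatives are just one alternative wrapped around another. */
#define DEMO_ALT_2(oldval, new1, feat1, new2, feat2) \
	DEMO_ALT(DEMO_ALT(oldval, new1, feat1), new2, feat2)

/* Three alternatives reuse the two-alternative form the same way. */
#define DEMO_ALT_3(oldval, new1, feat1, new2, feat2, new3, feat3) \
	DEMO_ALT(DEMO_ALT_2(oldval, new1, feat1, new2, feat2), new3, feat3)

static int demo_pick(void)
{
	return DEMO_ALT_2(0, 1, DEMO_FEAT_A, 2, DEMO_FEAT_B);
}
```

Each additional alternative is one more layer of the same base macro, so no definition has to spell out the total count, and in this sketch the outermost (last listed) alternative wins when several features are set.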
* tag 'x86_alternatives_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/alternatives, kvm: Fix a couple of CALLs without a frame pointer
x86/alternative: Replace the old macros
x86/alternative: Convert the asm ALTERNATIVE_3() macro
x86/alternative: Convert the asm ALTERNATIVE_2() macro
x86/alternative: Convert the asm ALTERNATIVE() macro
x86/alternative: Convert ALTERNATIVE_3()
x86/alternative: Convert ALTERNATIVE_TERNARY()
x86/alternative: Convert alternative_call_2()
x86/alternative: Convert alternative_call()
x86/alternative: Convert alternative_io()
x86/alternative: Convert alternative_input()
x86/alternative: Convert alternative_2()
x86/alternative: Convert alternative()
x86/alternatives: Add nested alternatives macros
x86/alternative: Zap alternative_ternary()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Borislav Petkov:
- A cleanup and a correction to the error injection driver to inject a
MCA_MISC value only when one has actually been supplied by the user
* tag 'ras_core_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Remove unused variable and return value in machine_check_poll()
x86/mce/inject: Only write MCA_MISC when a value has been supplied
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
"Updates for timers, timekeeping and related functionality:
Core:
- Make the takeover of a hrtimer based broadcast timer reliable
during CPU hot-unplug. The current implementation suffers from a
race which can lead to broadcast timer starvation in the worst
case.
- VDSO related cleanups and simplifications
- Small cleanups and enhancements all over the place
PTP:
- Replace the architecture specific base clock to clocksource, e.g.
ART to TSC, conversion function with generic functionality to avoid
exposing such internals to drivers and convert all existing drivers
over. This also makes it possible to provide functionality in the
core code which converts the other way round, based on the same parameter set.
- Provide a function to convert CLOCK_REALTIME to the base clock to
support the upcoming PPS output driver on Intel platforms.
Drivers:
- A set of Device Tree bindings for new hardware
- Cleanups and enhancements all over the place"
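To make the PTP item concrete, a base-clock-to-clocksource conversion of the ART-to-TSC kind can be modeled as a ratio plus an offset. The sketch below is an assumption-laden illustration: the struct layout and function names are hypothetical, both ratio terms are assumed to be nonzero, and the reverse direction assumes the input is not below the offset.

```c
#include <stdint.h>

/* Hypothetical parameter set relating a base clock to a clocksource. */
struct demo_clock_base {
	uint32_t numerator;
	uint32_t denominator;
	uint64_t offset;
};

/* cs = base * numerator / denominator + offset, without 64-bit overflow
 * in the intermediate product for the remainder term. */
static uint64_t demo_base_to_cs(const struct demo_clock_base *b, uint64_t base)
{
	uint64_t q = base / b->denominator;
	uint64_t r = base % b->denominator;

	return q * b->numerator + (r * b->numerator) / b->denominator + b->offset;
}

/* The reverse direction uses the same parameters with the ratio inverted. */
static uint64_t demo_cs_to_base(const struct demo_clock_base *b, uint64_t cs)
{
	uint64_t v = cs - b->offset;
	uint64_t q = v / b->numerator;
	uint64_t r = v % b->numerator;

	return q * b->denominator + (r * b->denominator) / b->numerator;
}
```

Because both directions are derived from one parameter set, a driver only has to describe its base clock once and never needs to know the clocksource internals.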
* tag 'timers-core-2024-07-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
clocksource/drivers/realtek: Add timer driver for rtl-otto platforms
dt-bindings: timer: Add schema for realtek,otto-timer
dt-bindings: timer: Add SOPHGO SG2002 clint
dt-bindings: timer: renesas,tmu: Add R-Car Gen2 support
dt-bindings: timer: renesas,tmu: Add RZ/G1 support
dt-bindings: timer: renesas,tmu: Add R-Mobile APE6 support
clocksource/drivers/mips-gic-timer: Correct sched_clock width
clocksource/drivers/mips-gic-timer: Refine rating computation
clocksource/drivers/sh_cmt: Address race condition for clock events
clocksource/driver/arm_global_timer: Remove unnecessary ‘0’ values from err
clocksource/drivers/arm_arch_timer: Remove unnecessary ‘0’ values from irq
tick/broadcast: Make takeover of broadcast hrtimer reliable
tick/sched: Combine WARN_ON_ONCE and print_once
x86/vdso: Remove unused include
x86/vgtod: Remove unused typedef gtod_long_t
x86/vdso: Fix function reference in comment
vdso: Add comment about reason for vdso struct ordering
vdso/gettimeofday: Clarify comment about open coded function
timekeeping: Add missing kernel-doc function comments
tick: Remove unused tick_nohz_get_idle_calls()
...
|
|
Merge minor word-at-a-time instruction choice improvements for x86 and
arm64.
This is the second of four branches that came out of me looking at the
code generation for path lookup on arm64.
The word-at-a-time infrastructure is used to do string operations in
chunks of one word both when copying the pathname from user space (in
strncpy_from_user()), and when parsing and hashing the individual path
components (in link_path_walk()).
In particular, the "find the first zero byte" uses various bit tricks to
figure out the end of the string or path component, and get the length
without having to do things one byte at a time. Both x86-64 and arm64
had less than optimal code choices for that.
The commit message for the arm64 change in particular tries to explain
the exact code flow for the zero byte finding for people who care. It's
made a bit more complicated by the fact that we support big-endian
hardware too, and so we have some extra abstraction layers to allow
different models for finding the zero byte, quite apart from the issue
of picking specialized instructions.
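For readers unfamiliar with the technique, the classic "find the first zero byte" bit trick looks roughly like this for a 64-bit word with byte 0 in the least significant position (the little-endian view). This is a generic sketch, not the code from these commits; the `demo_` names are hypothetical and `__builtin_ctzll` is a GCC/Clang builtin.

```c
#include <stdint.h>

#define DEMO_ONES  0x0101010101010101ULL
#define DEMO_HIGHS 0x8080808080808080ULL

/*
 * Nonzero iff some byte of x is zero. The lowest set bit of the result is
 * the high bit of the first (least significant) zero byte.
 */
static uint64_t demo_has_zero(uint64_t x)
{
	return (x - DEMO_ONES) & ~x & DEMO_HIGHS;
}

/* Byte index of the first zero byte; mask must be nonzero. */
static unsigned int demo_first_zero_byte(uint64_t mask)
{
	return (unsigned int)__builtin_ctzll(mask) / 8;
}
```

The byte index doubles as the string length within the word, which is how a path component's length can be found eight bytes at a time instead of one.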
* word-at-a-time:
arm64: word-at-a-time: improve byte count calculations for LE
x86-64: word-at-a-time: improve byte count calculations
|
|
Merge runtime constants infrastructure with implementations for x86 and
arm64.
This is one of four branches that came out of me looking at profiles of
my kernel build filesystem load on my 128-core Altra arm64 system, where
pathname walking and the user copies (particularly strncpy_from_user()
for fetching the pathname from user space) are very hot.
This is a very specialized "instruction alternatives" model where the
dentry hash pointer and hash count will be constants for the lifetime of
the kernel, but the allocations are not static but done early during the
kernel boot. In order to avoid the pointer load and dynamic shift, we
just rewrite the constants in the instructions in place.
We can't use the "generic" alternative instructions infrastructure,
because different architectures do it very differently, and it's
actually simpler to just have very specific helpers, with a fallback to
the generic ("old") model of just using variables for architectures that
do not implement the runtime constant patching infrastructure.
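The generic fallback described above amounts to reading ordinary variables. A minimal userspace sketch of that model follows; the macro and variable names are hypothetical stand-ins, and a patching architecture would replace the two accessor macros with instructions whose immediates are rewritten at boot.

```c
#include <stdint.h>

/* Fallback accessors: plain variable reads, no instruction patching. */
#define demo_runtime_const_ptr(sym)			(sym)
#define demo_runtime_const_shift_right_32(val, sym)	((uint32_t)(val) >> (sym))

struct demo_bucket {
	int first;
};

/* Set once early on and never changed afterwards (assumed for the sketch). */
static struct demo_bucket demo_table[16];
static struct demo_bucket *demo_hashtable = demo_table;
static unsigned int demo_hash_shift = 28;	/* 32 - log2(16 buckets) */

/* Hash bucket lookup: one pointer "constant" plus one shift "constant". */
static struct demo_bucket *demo_d_hash(uint32_t hash)
{
	return demo_runtime_const_ptr(demo_hashtable) +
	       demo_runtime_const_shift_right_32(hash, demo_hash_shift);
}
```

Keeping the accessors this narrow is what lets each architecture supply its own patched version without sharing any patching machinery.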
Link: https://lore.kernel.org/all/CAHk-=widPe38fUNjUOmX11ByDckaeEo9tN4Eiyke9u1SAtu9sA@mail.gmail.com/
* runtime-constants:
arm64: add 'runtime constant' support
runtime constants: add x86 architecture support
runtime constants: add default dummy infrastructure
vfs: dcache: move hashlen_hash() from callers into d_hash()
|
|
After discussing with Arnd [1], it's preferable to change the uretprobe
syscall number to 467 to avoid the merge conflict with the xattrat syscalls.
Also change the ABI to 'common', which will ease the global
scripts/syscall.tbl management. One consequence is we generate uretprobe
syscall numbers for ABIs that do not support uretprobe syscall, but the
syscall still returns -ENOSYS when called in that ABI.
[1] https://lore.kernel.org/lkml/784a34e5-4654-44c9-9c07-f9f4ffd952a0@app.fastmail.com/
Link: https://lore.kernel.org/all/20240712135228.1619332-2-jolsa@kernel.org/
Fixes: 190fec72df4a ("uprobe: Wire up uretprobe system call")
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
|
|
https://git.linaro.org/people/daniel.lezcano/linux into timers/core
Pull clocksource/event driver updates from Daniel Lezcano:
- Remove unnecessary local variable initializations in the ARM arch
timer and the ARM global timer, as the variables are assigned right
afterwards in the code path anyway (Li kunyu)
- Fix a race condition in the interrupt leading to a deadlock on the
SH CMT driver. Note that this fix was not tested on the platform
using this timer but the fix seems reasonable enough to be picked
confidently (Niklas Söderlund)
- Increase the rating of the gic-timer and use the configured width
clocksource register on the MIPS architecture (Jiaxun Yang)
- Add the DT bindings for the TMU on the Renesas platforms (Geert
Uytterhoeven)
- Add the DT bindings for the SOPHGO SG2002 clint on RISC-V (Thomas
Bonnefille)
- Add the rtl-otto timer driver along with the DT bindings for the
Realtek platform (Chris Packham)
Link: https://lore.kernel.org/all/91cd05de-4c5d-4242-a381-3b8a4fe6a2a2@linaro.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD
LoongArch KVM changes for v6.11
1. Add ParaVirt steal time support.
2. Add some VM migration enhancement.
3. Add perf kvm-stat support for loongarch.
|
|
Pre-population has been requested several times to mitigate KVM page faults
during guest boot or after live migration. It is also required by TDX
before filling in the initial guest memory with measured contents.
Introduce it as a generic API.
|
|
Wire KVM_PRE_FAULT_MEMORY ioctl to kvm_mmu_do_page_fault() to populate guest
memory. It can be called right after KVM_CREATE_VCPU creates a vCPU,
since at that point kvm_mmu_create() and kvm_init_mmu() are called and
the vCPU is ready to invoke the KVM page fault handler.
The helper function kvm_tdp_map_page() takes care of the logic to
process RET_PF_* return values and convert them to success or errno.
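The kind of conversion described for the helper can be sketched as a small mapping function. Everything here is an assumption for illustration: the enum only loosely mirrors the RET_PF_* idea, and the specific errno choices are not claimed to match what KVM does.

```c
#include <errno.h>

/* Hypothetical fault-handler outcomes, loosely modeled on RET_PF_* values. */
enum demo_pf_result {
	DEMO_PF_RETRY,
	DEMO_PF_FIXED,
	DEMO_PF_SPURIOUS,
	DEMO_PF_EMULATE,
	DEMO_PF_INVALID,
};

/* Collapse handler-specific outcomes into 0 or a negative errno. */
static int demo_pf_result_to_errno(enum demo_pf_result r)
{
	switch (r) {
	case DEMO_PF_FIXED:
	case DEMO_PF_SPURIOUS:
		return 0;		/* mapping is in place */
	case DEMO_PF_RETRY:
		return -EAGAIN;		/* assumed: caller should try again */
	case DEMO_PF_EMULATE:
		return -ENOENT;		/* assumed: nothing mappable here */
	default:
		return -EIO;		/* assumed: unexpected outcome */
	}
}
```

Funneling the outcomes through one helper means the ioctl path only has to deal with a success-or-errno contract rather than the fault handler's internal result codes.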
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-ID: <9b866a0ae7147f96571c439e75429a03dcb659b6.1712785629.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|