aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-11-03Merge tag 'perf-tools-for-v6.7-1-2023-11-01' of ↵Linus Torvalds251-1821/+31705
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools updates from Namhyung Kim: "Build: - Compile BPF programs by default if clang (>= 12.0.1) is available to enable more features like kernel lock contention, off-cpu profiling, kwork, sample filtering and so on. This can be disabled by passing BUILD_BPF_SKEL=0 to make. - Produce better error messages for bison on debug build (make DEBUG=1) by defining YYDEBUG symbol internally. perf record: - Track sideband events (like FORK/MMAP) from all CPUs even if perf record targets a subset of CPUs only (using -C option). Otherwise it may lose some information happened on a CPU out of the target list. - Fix checking raw sched_switch tracepoint argument using system BTF. This affects off-cpu profiling which attaches a BPF program to the raw tracepoint. perf lock contention: - Add --lock-cgroup option to see contention by cgroups. This should be used with BPF only (using -b option). $ sudo perf lock con -ab --lock-cgroup -- sleep 1 contended total wait max wait avg wait cgroup 835 14.06 ms 41.19 us 16.83 us /system.slice/led.service 25 122.38 us 13.77 us 4.89 us / 44 23.73 us 3.87 us 539 ns /user.slice/user-657345.slice/session-c4.scope 1 491 ns 491 ns 491 ns /system.slice/connectd.service - Add -G/--cgroup-filter option to see contention only for given cgroups. This can be useful when you identified a cgroup in the above command and want to investigate more on it. It also works with other output options like -t/--threads and -l/--lock-addr. $ sudo perf lock con -ab -G /user.slice/user-657345.slice/session-c4.scope -- sleep 1 contended total wait max wait avg wait type caller 8 77.11 us 17.98 us 9.64 us spinlock futex_wake+0xc8 2 24.56 us 14.66 us 12.28 us spinlock tick_do_update_jiffies64+0x25 1 4.97 us 4.97 us 4.97 us spinlock futex_q_lock+0x2a - Use per-cpu array for better spinlock tracking. This is to improve performance of the BPF program and to avoid nested contention on a lock in the BPF hash map. - Update callstack check for PowerPC. To find a representative caller of a lock, it needs to look up the call stacks. It ends the lookup when it sees 0 in the call stack buffer. However, PowerPC call stacks can have 0 values in the beginning so skip them when it expects valid call stacks after. perf kwork: - Support 'sched' class (for -k option) so that it can see task scheduling event (using sched_switch tracepoint) as well as irq and workqueue items. - Add perf kwork top subcommand to show more accurate cpu utilization with sched class above. It works both with a recorded data (using perf kwork record command) and BPF (using -b option). Unlike perf top command, it does not support interactive mode (yet). $ sudo perf kwork top -b -k sched Starting trace, Hit <Ctrl+C> to stop and report ^C Total : 160702.425 ms, 8 cpus %Cpu(s): 36.00% id, 0.00% hi, 0.00% si %Cpu0 [|||||||||||||||||| 61.66%] %Cpu1 [|||||||||||||||||| 61.27%] %Cpu2 [||||||||||||||||||| 66.40%] %Cpu3 [|||||||||||||||||| 61.28%] %Cpu4 [|||||||||||||||||| 61.82%] %Cpu5 [||||||||||||||||||||||| 77.41%] %Cpu6 [|||||||||||||||||| 61.73%] %Cpu7 [|||||||||||||||||| 63.25%] PID SPID %CPU RUNTIME COMMMAND ------------------------------------------------------------- 0 0 38.72 8089.463 ms [swapper/1] 0 0 38.71 8084.547 ms [swapper/3] 0 0 38.33 8007.532 ms [swapper/0] 0 0 38.26 7992.985 ms [swapper/6] 0 0 38.17 7971.865 ms [swapper/4] 0 0 36.74 7447.765 ms [swapper/7] 0 0 33.59 6486.942 ms [swapper/2] 0 0 22.58 3771.268 ms [swapper/5] 9545 9351 2.48 447.136 ms sched-messaging 9574 9351 2.09 418.583 ms sched-messaging 9724 9351 2.05 372.407 ms sched-messaging 9531 9351 2.01 368.804 ms sched-messaging 9512 9351 2.00 362.250 ms sched-messaging 9514 9351 1.95 357.767 ms sched-messaging 9538 9351 1.86 384.476 ms sched-messaging 9712 9351 1.84 386.490 ms sched-messaging 9723 9351 1.83 380.021 ms sched-messaging 9722 9351 1.82 382.738 ms sched-messaging 9517 9351 1.81 354.794 ms sched-messaging 9559 9351 1.79 344.305 ms sched-messaging 9725 9351 1.77 365.315 ms sched-messaging <SNIP> - Add hard/soft-irq statistics to perf kwork top. This will show the total CPU utilization with IRQ stats like below: $ sudo perf kwork top -b -k sched,irq,softirq Starting trace, Hit <Ctrl+C> to stop and report ^C Total : 12554.889 ms, 8 cpus %Cpu(s): 96.23% id, 0.10% hi, 0.19% si <---- here %Cpu0 [| 4.60%] %Cpu1 [| 4.59%] %Cpu2 [ 2.73%] %Cpu3 [| 3.81%] <SNIP> perf bench: - Add -G/--cgroups option to perf bench sched pipe. The pipe bench is good to measure context switch overhead. With this option, it puts the reader and writer tasks in separate cgroups to enforce context switch between two different cgroups. Also it needs to set CPU affinity of the tasks in a CPU to accurately measure the impact of cgroup context switches. $ sudo perf stat -e context-switches,cgroup-switches -- \ > taskset -c 0 perf bench sched pipe -l 100000 # Running 'sched/pipe' benchmark: # Executed 100000 pipe operations between two processes Total time: 0.307 [sec] 3.078180 usecs/op 324867 ops/sec Performance counter stats for 'taskset -c 0 perf bench sched pipe -l 100000': 200,026 context-switches 63 cgroup-switches 0.321637922 seconds time elapsed You can see small number of cgroup-switches because both write and read tasks are in the same cgroup. $ sudo mkdir /sys/fs/cgroup/{AAA,BBB} $ sudo perf stat -e context-switches,cgroup-switches -- \ > taskset -c 0 perf bench sched pipe -l 100000 -G AAA,BBB # Running 'sched/pipe' benchmark: # Executed 100000 pipe operations between two processes Total time: 0.351 [sec] 3.512990 usecs/op 284657 ops/sec Performance counter stats for 'taskset -c 0 perf bench sched pipe -l 100000 -G AAA,BBB': 200,020 context-switches 200,019 cgroup-switches 0.365034567 seconds time elapsed Now context-switches and cgroup-switches are almost same. And you can see the pipe operation took little more. - Kill child processes when perf bench sched messaging exited abnormally. Otherwise it'd leave the child doing unnecessary work. perf test: - Fix various shellcheck issues on the tests written in shell script. - Skip tests when condition is not satisfied: - object code reading test for non-text section addresses. - CoreSight test if cs_etm// event is not available. - lock contention test if not enough CPUs. Event parsing: - Make PMU alias name loading lazy to reduce the startup time in the event parsing code for perf record, stat and others in the general case. - Lazily compute PMU default config. In the same sense, delay PMU initialization until it's really needed to reduce the startup cost. - Fix event term values that are raw events. The event specification can have several terms including event name. But sometimes it clashes with raw event encoding which starts with 'r' and has hex-digits. For example, an event named 'read' should be processed as a normal event but it was mis-treated as a raw encoding and caused a failure. $ perf stat -e 'uncore_imc_free_running/event=read/' -a sleep 1 event syntax error: '..nning/event=read/' \___ parser error Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events Event metrics: - Add "Compat" regex to match event with multiple identifiers. - Usual updates for Intel, Power10, Arm telemetry/CMN and AmpereOne. Misc: - Assorted memory leak fixes and footprint reduction. - Add "bpf_skeletons" to perf version --build-options so that users can check whether their perf tools have BPF support easily. - Fix unaligned access in Intel-PT packet decoder found by undefined-behavior sanitizer. - Avoid frequency mode for the dummy event. Surprisingly it'd impact kernel timer tick handler performance by force iterating all PMU events. - Update bash shell completion for events and metrics" * tag 'perf-tools-for-v6.7-1-2023-11-01' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (187 commits) perf vendor events intel: Update tsx_cycles_per_elision metrics perf vendor events intel: Update bonnell version number to v5 perf vendor events intel: Update westmereex events to v4 perf vendor events intel: Update meteorlake events to v1.06 perf vendor events intel: Update knightslanding events to v16 perf vendor events intel: Add typo fix for ivybridge FP perf vendor events intel: Update a spelling in haswell/haswellx perf vendor events intel: Update emeraldrapids to v1.01 perf vendor events intel: Update alderlake/alderlake events to v1.23 perf build: Disable BPF skeletons if clang version is < 12.0.1 perf callchain: Fix spelling mistake "statisitcs" -> "statistics" perf report: Fix spelling mistake "heirachy" -> "hierarchy" perf python: Fix binding linkage due to rename and move of evsel__increase_rlimit() perf tests: test_arm_coresight: Simplify source iteration perf vendor events intel: Add tigerlake two metrics perf vendor events intel: Add broadwellde two metrics perf vendor events intel: Fix broadwellde tma_info_system_dram_bw_use metric perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit perf callchain: Minor layout changes to callchain_list perf callchain: Make brtype_stat in callchain_list optional ...
2023-11-03Merge tag 'trace-v6.7' of ↵Linus Torvalds24-692/+1278
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing updates from Steven Rostedt: - Remove eventfs_file descriptor This is the biggest change, and the second part of making eventfs create its files dynamically. In 6.6 the first part was added, and that maintained a one to one mapping between eventfs meta descriptors and the directories and file inodes and dentries that were dynamically created. The directories were represented by a eventfs_inode and the files were represented by a eventfs_file. In v6.7 the eventfs_file is removed. As all events have the same directory make up (sched_switch has an "enable", "id", "format", etc files), the handing of what files are underneath each leaf eventfs directory is moved back to the tracing subsystem via a callback. When an event is added to the eventfs, it registers an array of evenfs_entry's. These hold the names of the files and the callbacks to call when the file is referenced. The callback gets the name so that the same callback may be used by multiple files. The callback then supplies the filesystem_operations structure needed to create this file. This has brought the memory footprint of creating multiple eventfs instances down by 2 megs each! - User events now has persistent events that are not associated to a single processes. These are privileged events that hang around even if no process is attached to them - Clean up of seq_buf There's talk about using seq_buf more to replace strscpy() and friends. But this also requires some minor modifications of seq_buf to be able to do this - Expand instance ring buffers individually Currently if boot up creates an instance, and a trace event is enabled on that instance, the ring buffer for that instance and the top level ring buffer are expanded (1.4 MB per CPU). This wastes memory as this happens when nothing is using the top level instance - Other minor clean ups and fixes * tag 'trace-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (34 commits) seq_buf: Export seq_buf_puts() seq_buf: Export seq_buf_putc() eventfs: Use simple_recursive_removal() to clean up dentries eventfs: Remove special processing of dput() of events directory eventfs: Delete eventfs_inode when the last dentry is freed eventfs: Hold eventfs_mutex when calling callback functions eventfs: Save ownership and mode eventfs: Test for ei->is_freed when accessing ei->dentry eventfs: Have a free_ei() that just frees the eventfs_inode eventfs: Remove "is_freed" union with rcu head eventfs: Fix kerneldoc of eventfs_remove_rec() tracing: Have the user copy of synthetic event address use correct context eventfs: Remove extra dget() in eventfs_create_events_dir() tracing: Have trace_event_file have ref counters seq_buf: Introduce DECLARE_SEQ_BUF and seq_buf_str() eventfs: Fix typo in eventfs_inode union comment eventfs: Fix WARN_ON() in create_file_dentry() powerpc: Remove initialisation of readpos tracing/histograms: Simplify last_cmd_set() seq_buf: fix a misleading comment ...
2023-11-03Merge tag 'trace-tools-v6.7' of ↵Linus Torvalds2-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull tracing tools updates from Steven Rostedt: "RTLA: - In rtla/utils.c, initialize the 'found' variable to avoid garbage when a mount point is not found. Verification: - Remove duplicated imports on dot2k python script" * tag 'trace-tools-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: rtla: Fix uninitialized variable found verification/dot2k: Delete duplicate imports
2023-11-03Merge tag 'printk-for-6.7' of ↵Linus Torvalds6-78/+1288
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: - Another preparation step for introducing printk kthreads. The main piece is a per-console lock with several features: - Support three priorities: normal, emergency, and panic. They will be defined by a context where the lock is taken. A context with a higher priority is allowed to take over the lock from a context with a lower one. The plan is to use the emergency context for Oops and WARN() messages, and also by watchdogs. The panic() context will be used on panic CPU. - The owner might enter/exit regions where it is not safe to take over the lock. It allows the take over the lock a safe way in the middle of a message. For example, serial drivers emit characters one by one. And the serial port is in a safe state in between. Only the final console_flush_in_panic() will be allowed to take over the lock even in the unsafe state (last chance, pray, and hope). - A higher priority context might busy wait with a timeout. The current owner is informed about the waiter and releases the lock on exit from the unsafe state. - The new lock is safe even in atomic contexts, including NMI. Another change is a safe manipulation of per-console sequence number counter under the new lock. - simple_strntoull() micro-optimization - Reduce pr_flush() pooling time. - Calm down false warning about possible buffer invalid access to console buffers when CONFIG_PRINTK is disabled. [ .. and Thomas Gleixner wants to point out that while several of the commits are attributed to him, he only authored the early versions of said commits, and that John Ogness and Petr Mladek have been the ones who sorted out the details and really should be those who get the credit - Linus ] * tag 'printk-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: vsprintf: uninline simple_strntoull(), reorder arguments printk: printk: Remove unnecessary statements'len = 0;' printk: Reduce pr_flush() pooling time printk: fix illegal pbufs access for !CONFIG_PRINTK printk: nbcon: Allow drivers to mark unsafe regions and check state printk: nbcon: Add emit function and callback function for atomic printing printk: nbcon: Add sequence handling printk: nbcon: Add ownership state functions printk: nbcon: Add buffer management printk: Make static printk buffers available to nbcon printk: nbcon: Add acquire/release logic printk: Add non-BKL (nbcon) console basic infrastructure
2023-11-03Merge tag 'livepatching-for-6.7' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching Pull livepatching update from Petr Mladek: - Add missing newline character to avoid waiting for a continuous message * tag 'livepatching-for-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching: livepatch: Fix missing newline character in klp_resolve_symbols()
2023-11-03Merge tag 'bitmap-for-6.7' of https://github.com/norov/linuxLinus Torvalds10-754/+700
Pull bitmap updates from Yury Norov: "This includes the 'bitmap: cleanup bitmap_*_region() implementation' series, and scattered cleanup patches" * tag 'bitmap-for-6.7' of https://github.com/norov/linux: buildid: reduce header file dependencies for module bitmap: move bitmap_*_region() functions to bitmap.h bitmap: drop _reg_op() function bitmap: replace _reg_op(REG_OP_ISFREE) with find_next_bit() bitmap: replace _reg_op(REG_OP_RELEASE) with bitmap_clear() bitmap: replace _reg_op(REG_OP_ALLOC) with bitmap_set() bitmap: fix opencoded bitmap_allocate_region() bitmap: add test for bitmap_*_region() functions bitmap: align __reg_op() wrappers with modern coding style lib/bitmap: split-out string-related operations to a separate files bitmap: Remove dead code, i.e. bitmap_copy_le() bitmap: Fix a typo ("identify map") cpumask: kernel-doc cleanups and additions
2023-11-03drm/amd/display: Enable fast update on blendTF changeIlya Bakoulin1-1/+0
[Why] Full update is not required on surface blend TF change. [How] Update full_update_required condition. Reviewed-by: Aric Cyr <[email protected]> Acked-by: Hersen Wu <[email protected]> Signed-off-by: Ilya Bakoulin <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amd/display: Fix blend LUT programmingIlya Bakoulin2-0/+6
[Why] LUT write index does not get reset to zero when writing the LUT values for each separate RGB component, which results in wrong data for 2 of the 3 components. [How] Reset LUT write index to zero before writing each component's data. Reviewed-by: Krunoslav Kovac <[email protected]> Acked-by: Hersen Wu <[email protected]> Signed-off-by: Ilya Bakoulin <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amd/display: Program plane color setting correctlySung Joon Kim4-2/+156
[why] There are some registers for plane color that are skipped programming on resume. Need to add those as part of the sequence. [how] Add new function hook for programming plane color control. Reviewed-by: Duncan Ma <[email protected]> Acked-by: Hersen Wu <[email protected]> Signed-off-by: Sung Joon Kim <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amdgpu: Query and report boot statusHawking Zhang1-0/+2
Query boot status and report boot errors. A follow up change is needed to stop GPU initialization if boot fails. v2: only invoke the call for dGPU (Le/Lijo) Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Reviewed-by: Yang Wang <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amdgpu: Add psp v13 function to query boot statusHawking Zhang3-0/+96
Add psp v13 function to query boot status. v2: limit the use case to dGPU only (Lijo) Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Reviewed-by: Yang Wang <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amd/swsmu: remove fw version check in sw_init.Li Ma1-13/+4
dorp fw version check and using max table size to init table. Signed-off-by: Li Ma <[email protected]> Reviewed-by: Yifan Zhang <[email protected]> Reviewed-by: Yang Wang <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amd/swsmu: update smu v14_0_0 driver if and metrics tableLi Ma5-318/+97
Update driver if headers and metrics table in smu v14_0_0 after smu fw promotion. Drop the legacy metrics table and add warning of checking pmfw version. Signed-off-by: Li Ma <[email protected]> Reviewed-by: Yifan Zhang <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amdgpu: Add C2PMSG_109/126 reg field shift/masksHawking Zhang1-0/+28
Add MP0_C2PMSG_109/126 register field shift/masks that are used to identify boot status by driver. Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Reviewed-by: Yang Wang <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amdgpu: Optimize the asic type fix codeMa Jun2-9/+31
Use a new struct array to define the asic information which asic type needs to be fixed. Signed-off-by: Ma Jun <[email protected]> Reviewed-by: Kenneth Feng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amdgpu: fix GRBM read timeout when do mes_self_testTim Huang1-0/+16
Use a proper MEID to make sure the CP_HQD_* and CP_GFX_HQD_* registers can be touched when initialize the compute and gfx mqd in mes_self_test. Otherwise, we expect no response from CP and an GRBM eventual timeout. Signed-off-by: Tim Huang <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Yifan Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2023-11-03drm/amdgpu: check recovery status of xgmi hive in ras_reset_error_countTao Zhou1-1/+10
Handle xgmi hive case. Suggested-by: Hawking Zhang <[email protected]> Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Stanley.Yang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amd/pm: only check sriov vf flag once when creating hwmon sysfsMa Jun1-13/+14
The current code checks sriov vf flag multiple times when creating hwmon sysfs. So fix it. Signed-off-by: Ma Jun <[email protected]> Reviewed-by: Yang Wang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amdgpu: Attach eviction fence on allocFelix Kuehling1-31/+48
Instead of attaching the eviction fence when a KFD BO is first mapped, attach it when it is allocated or imported. This in preparation to allow KFD BOs to be mapped using the render node API. Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amdkfd: Improve amdgpu_vm_handle_movedFelix Kuehling4-8/+23
Let amdgpu_vm_handle_moved update all BO VA mappings of BOs reserved by the caller. This will be useful for handling extra BO VA mappings in KFD VMs that are managed through the render node API. v2: rebase against drm_exec changes (Alex) Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amd/display: Increase frame warning limit with KASAN or KCSAN in dml2Nathan Chancellor1-0/+4
When building ARCH=x86_64 allmodconfig with clang, which will typically have sanitizers enabled, there is a warning about a large stack frame. drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c:6265:13: error: stack frame size (2520) exceeds limit (2048) in 'dml_prefetch_check' [-Werror,-Wframe-larger-than] 6265 | static void dml_prefetch_check(struct display_mode_lib_st *mode_lib) | ^ 1 error generated. Notably, GCC 13.2.0 does not do too much of a better job, as it is right at the current limit of 2048 (and others have reported being over with older GCC versions): drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c: In function 'dml_prefetch_check': drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/display_mode_core.c:6705:1: error: the frame size of 2048 bytes is larger than 1800 bytes [-Werror=frame-larger-than=] 6705 | } | ^ In the past, these warnings have been avoided by reducing the number of parameters to various functions so that not as many arguments need to be passed on the stack. However, these patches take a good amount of effort to write despite being mechanical due to code structure and complexity and they are never carried forward to new generations of the code so that effort has to be expended every new hardware generation, which becomes harder to justify as time goes on. To avoid having a noticeable or lengthy breakage in all{mod,yes}config, which are easy testing targets that have -Werror enabled, increase the limit for configurations that have KASAN or KCSAN enabled by 50% so that cases of extremely poor code generation can still be caught while not breaking the majority of builds. CONFIG_KMSAN also causes high stack usage but the frame limit is already set to zero when it is enabled, which is accounted for by the check for CONFIG_FRAME_WARN=0 in the dml2 Makefile. Signed-off-by: Nathan Chancellor <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amd/display: Avoid NULL dereference of timing generatorWayne Lin1-2/+2
[Why & How] Check whether assigned timing generator is NULL or not before accessing its funcs to prevent NULL dereference. Reviewed-by: Jun Lei <[email protected]> Acked-by: Hersen Wu <[email protected]> Signed-off-by: Wayne Lin <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amdkfd: Update cache info for GFX 9.4.3Mukul Joshi1-2/+16
Update cache info reporting based on compute and memory partitioning modes. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amdkfd: Populate cache info for GFX 9.4.3Mukul Joshi1-1/+65
GFX 9.4.3 uses a new version of the GC info table which contains the cache info. This patch adds a new function to populate the cache info from IP discovery for GFX 9.4.3. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amdgpu: don't put MQDs in VRAM on ARM | ARM64Alex Deucher1-0/+2
Issues were reported with commit 1cfb4d612127 ("drm/amdgpu: put MQDs in VRAM") on an ADLINK Ampere Altra Developer Platform (AVA developer platform). Various ARM systems seem to have problems related to PCIe and MMIO access. In this case, I'm not sure if this is specific to the ADLINK platform or ARM in general. Seems to be some coherency issue with VRAM. For now, just don't put MQDs in VRAM on ARM. Link: https://lists.freedesktop.org/archives/amd-gfx/2023-October/100453.html Fixes: 1cfb4d612127 ("drm/amdgpu: put MQDs in VRAM") Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2023-11-03drm/amdgpu/smu13: drop compute workload workaroundAlex Deucher1-30/+2
This was fixed in PMFW before launch and is no longer required. Reviewed-by: Yang Wang <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 6.1.x
2023-11-03drm/amdgpu: add a retry for IP discovery initAlex Deucher1-2/+21
AMD dGPUs have integrated FW that runs as soon as the device gets power and initializes the board (determines the amount of memory, provides configuration details to the driver, etc.). For direct PCIe attached cards this happens as soon as power is applied and normally completes well before the OS has even started loading. However, with hotpluggable ports like USB4, the driver needs to wait for this to complete before initializing the device. This normally takes 60-100ms, but could take longer on some older boards periodically due to memory training. Retry for up to a second. In the non-hotplug case, there should be no change in behavior and this should complete on the first try. v2: adjust test criteria v3: adjust checks for the masks, only enable on removable devices v4: skip bif_fb_en check Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925 Reviewed-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2023-11-03drm/amdgpu: ungate power gating when system suspendPerry Yuan1-0/+9
[Why] During suspend, if GFX DPM is enabled and GFXOFF feature is enabled the system may get hung. So, it is suggested to disable GFXOFF feature during suspend and enable it after resume. [How] Update the code to disable GFXOFF feature during suspend and enable it after resume. [ 311.396526] amdgpu 0000:03:00.0: amdgpu: SMU: I'm not done with your previous command: SMN_C2PMSG_66:0x0000001E SMN_C2PMSG_82:0x00000000 [ 311.396530] amdgpu 0000:03:00.0: amdgpu: Fail to disable dpm features! [ 311.396531] [drm:amdgpu_device_ip_suspend_phase2 [amdgpu]] *ERROR* suspend of IP block <smu> failed -62 Acked-by: Yang Wang <[email protected]> Reviewed-by: Kenneth Feng <[email protected]> Signed-off-by: Perry Yuan <[email protected]> Signed-off-by: Kun Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2023-11-03drm/radeon: replace 1-element arrays with flexible-array membersJosé Pekkarinen1-21/+21
Reported by coccinelle, the following patch will move the following 1 element arrays to flexible arrays. drivers/gpu/drm/radeon/atombios.h:5523:32-48: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:5545:32-48: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:5461:34-44: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4447:30-40: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4236:30-41: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7095:28-45: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:3896:27-37: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:5443:16-25: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:5454:34-43: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4603:21-32: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4628:32-46: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:6285:29-39: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4296:30-36: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4756:28-36: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:4064:22-35: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7327:9-24: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7332:32-53: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7362:26-41: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7369:29-44: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7349:24-32: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) drivers/gpu/drm/radeon/atombios.h:7355:27-35: WARNING use flexible-array member instead (https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays) Signed-off-by: José Pekkarinen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amd: Fix UBSAN array-index-out-of-bounds for Powerplay headersAlex Deucher2-14/+14
For pptable structs that use flexible array sizes, use flexible arrays. Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2039926 Reviewed-by: Mario Limonciello <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amdgpu: don't use pci_is_thunderbolt_attached()Alex Deucher2-6/+7
It's only valid on Intel systems with the Intel VSEC. Use dev_is_removable() instead. This should do the right thing regardless of the platform. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925 Reviewed-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2023-11-03drm/amdgpu: don't use ATRM for external devicesAlex Deucher1-0/+5
The ATRM ACPI method is for fetching the dGPU vbios rom image on laptops and all-in-one systems. It should not be used for external add in cards. If the dGPU is thunderbolt connected, don't try ATRM. v2: pci_is_thunderbolt_attached only works for Intel. Use pdev->external_facing instead. v3: dev_is_removable() seems to be what we want Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925 Reviewed-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2023-11-03drm/amdgpu/gfx10,11: use memcpy_to/fromio for MQDsAlex Deucher2-12/+12
Since they were moved to VRAM, we need to use the IO variants of memcpy. Fixes: 1cfb4d612127 ("drm/amdgpu: put MQDs in VRAM") Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amdgpu: use mode-2 reset for RAS poison consumptionTao Zhou1-1/+5
Switch from mode-1 reset to mode-2 for poison consumption. Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Stanley.Yang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amdgpu doorbell range should be set when gpu recoveryLin.Cao1-0/+7
GFX doorbell range should be set after flr otherwise the gfx doorbell range will be overlap with MEC. v2: remove "amdgpu_sriov_vf" and "amdgpu_in_reset" check, and add grbm select for the case of 2 gfx rings. Signed-off-by: Lin.Cao <[email protected]> Acked-by: ZhenGuo Yin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03drm/amd/pm: Return 0 as default min power limit for legacy asicsMa Jun1-0/+3
Return 0 as the default min power limit for the asics use powerplay. Signed-off-by: Ma Jun <[email protected]> Reviewed-by: Kenneth Feng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-03Merge tag 'cpufreq-arm-updates-6.7-part2' of ↵Rafael J. Wysocki16-1/+1633
git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm Merge ARM cpufreq updates for 6.7 (part 2) from Viresh kumar: "- Add support for several Qualcomm SoC versions (Robert Marko and Varadarajan Narayanan)." * tag 'cpufreq-arm-updates-6.7-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm: cpufreq: qcom-nvmem: Introduce cpufreq for ipq95xx cpufreq: qcom-nvmem: Enable cpufreq for ipq53xx cpufreq: qcom-nvmem: add support for IPQ8074 soc: qcom: socinfo: Add IDs for IPQ8174 family dt-bindings: arm: qcom,ids: Add IDs for IPQ8174 family dt-bindings: qcom: geni-se: Allow dma-coherent soc: qcom: socinfo: Add SoC ID for QCM6490 dt-bindings: arm: qcom,ids: Add SoC ID for QCM6490 soc: qcom: socinfo: Add SM8550-adjacent PMICs soc: qcom: wcnss_ctrl: Remove redundant initialization owner in wcnss_ctrl_driver soc: qcom: socinfo: Add Soc ID for SM7150P dt-bindings: arm: qcom,ids: Add Soc ID for SM7150P firmware: Add support for Qualcomm UEFI Secure Application firmware: qcom_scm: Add support for Qualcomm Secure Execution Environment SCM interface lib/ucs2_string: Add UCS-2 strscpy function
2023-11-03Merge tag 'linux-cpupower-6.7-rc1' of ↵Rafael J. Wysocki1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux Merge cpupower utility update for 6.7-rc1 from Shuah Khan: "This cpupower update for Linux 6.7-rc1 consists of a single fix to documentation to fix reference to a removed document." * tag 'linux-cpupower-6.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux: cpupower: fix reference to nonexistent document
2023-11-03exfat: fix ctime is not updatedYuezhang Mo1-0/+1
Commit 4c72a36edd54 ("exfat: convert to new timestamp accessors") removed attr_copy() from exfat_set_attr(). It causes xfstests generic/221 to fail. In xfstests generic/221, it tests ctime should be updated even if futimens() update atime only. But in this case, ctime will not be updated if attr_copy() is removed. attr_copy() may also update other attributes, and removing it may cause other bugs, so this commit restores to call attr_copy() in exfat_set_attr(). Fixes: 4c72a36edd54 ("exfat: convert to new timestamp accessors") Signed-off-by: Yuezhang Mo <[email protected]> Reviewed-by: Andy Wu <[email protected]> Reviewed-by: Aoyama Wataru <[email protected]> Reviewed-by: Sungjong Seo <[email protected]> Signed-off-by: Namjae Jeon <[email protected]>
2023-11-03exfat: fix setting uninitialized time to ctime/atimeYuezhang Mo1-2/+2
An uninitialized time is set to ctime/atime in __exfat_write_inode(). It causes xfstests generic/003 and generic/192 to fail. And since there will be a time gap between setting ctime/atime to the inode and writing back the inode, so ctime/atime should not be set again when writing back the inode. Fixes: 4c72a36edd54 ("exfat: convert to new timestamp accessors") Signed-off-by: Yuezhang Mo <[email protected]> Reviewed-by: Andy Wu <[email protected]> Reviewed-by: Aoyama Wataru <[email protected]> Reviewed-by: Sungjong Seo <[email protected]> Signed-off-by: Namjae Jeon <[email protected]>
2023-11-02Merge tag 'mm-nonmm-stable-2023-11-02-14-08' of ↵Linus Torvalds106-1000/+894
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "As usual, lots of singleton and doubleton patches all over the tree and there's little I can say which isn't in the individual changelogs. The lengthier patch series are - 'kdump: use generic functions to simplify crashkernel reservation in arch', from Baoquan He. This is mainly cleanups and consolidation of the 'crashkernel=' kernel parameter handling - After much discussion, David Laight's 'minmax: Relax type checks in min() and max()' is here. Hopefully reduces some typecasting and the use of min_t() and max_t() - A group of patches from Oleg Nesterov which clean up and slightly fix our handling of reads from /proc/PID/task/... and which remove task_struct.thread_group" * tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (64 commits) scripts/gdb/vmalloc: disable on no-MMU scripts/gdb: fix usage of MOD_TEXT not defined when CONFIG_MODULES=n .mailmap: add address mapping for Tomeu Vizoso mailmap: update email address for Claudiu Beznea tools/testing/selftests/mm/run_vmtests.sh: lower the ptrace permissions .mailmap: map Benjamin Poirier's address scripts/gdb: add lx_current support for riscv ocfs2: fix a spelling typo in comment proc: test ProtectionKey in proc-empty-vm test proc: fix proc-empty-vm test with vsyscall fs/proc/base.c: remove unneeded semicolon do_io_accounting: use sig->stats_lock do_io_accounting: use __for_each_thread() ocfs2: replace BUG_ON() at ocfs2_num_free_extents() with ocfs2_error() ocfs2: fix a typo in a comment scripts/show_delta: add __main__ judgement before main code treewide: mark stuff as __ro_after_init fs: ocfs2: check status values proc: test /proc/${pid}/statm compiler.h: move __is_constexpr() to compiler.h ...
2023-11-02Merge tag 'mm-stable-2023-11-01-14-33' of ↵Linus Torvalds281-5277/+11683
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Many singleton patches against the MM code. The patch series which are included in this merge do the following: - Kemeng Shi has contributed some compation maintenance work in the series 'Fixes and cleanups to compaction' - Joel Fernandes has a patchset ('Optimize mremap during mutual alignment within PMD') which fixes an obscure issue with mremap()'s pagetable handling during a subsequent exec(), based upon an implementation which Linus suggested - More DAMON/DAMOS maintenance and feature work from SeongJae Park i the following patch series: mm/damon: misc fixups for documents, comments and its tracepoint mm/damon: add a tracepoint for damos apply target regions mm/damon: provide pseudo-moving sum based access rate mm/damon: implement DAMOS apply intervals mm/damon/core-test: Fix memory leaks in core-test mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval - In the series 'Do not try to access unaccepted memory' Adrian Hunter provides some fixups for the recently-added 'unaccepted memory' feature. To increase the feature's checking coverage. 'Plug a few gaps where RAM is exposed without checking if it is unaccepted memory' - In the series 'cleanups for lockless slab shrink' Qi Zheng has done some maintenance work which is preparation for the lockless slab shrinking code - Qi Zheng has redone the earlier (and reverted) attempt to make slab shrinking lockless in the series 'use refcount+RCU method to implement lockless slab shrink' - David Hildenbrand contributes some maintenance work for the rmap code in the series 'Anon rmap cleanups' - Kefeng Wang does more folio conversions and some maintenance work in the migration code. Series 'mm: migrate: more folio conversion and unification' - Matthew Wilcox has fixed an issue in the buffer_head code which was causing long stalls under some heavy memory/IO loads. Some cleanups were added on the way. Series 'Add and use bdev_getblk()' - In the series 'Use nth_page() in place of direct struct page manipulation' Zi Yan has fixed a potential issue with the direct manipulation of hugetlb page frames - In the series 'mm: hugetlb: Skip initialization of gigantic tail struct pages if freed by HVO' has improved our handling of gigantic pages in the hugetlb vmmemmep optimizaton code. This provides significant boot time improvements when significant amounts of gigantic pages are in use - Matthew Wilcox has sent the series 'Small hugetlb cleanups' - code rationalization and folio conversions in the hugetlb code - Yin Fengwei has improved mlock()'s handling of large folios in the series 'support large folio for mlock' - In the series 'Expose swapcache stat for memcg v1' Liu Shixin has added statistics for memcg v1 users which are available (and useful) under memcg v2 - Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable) prctl so that userspace may direct the kernel to not automatically propagate the denial to child processes. The series is named 'MDWE without inheritance' - Kefeng Wang has provided the series 'mm: convert numa balancing functions to use a folio' which does what it says - In the series 'mm/ksm: add fork-exec support for prctl' Stefan Roesch makes is possible for a process to propagate KSM treatment across exec() - Huang Ying has enhanced memory tiering's calculation of memory distances. This is used to permit the dax/kmem driver to use 'high bandwidth memory' in addition to Optane Data Center Persistent Memory Modules (DCPMM). The series is named 'memory tiering: calculate abstract distance based on ACPI HMAT' - In the series 'Smart scanning mode for KSM' Stefan Roesch has optimized KSM by teaching it to retain and use some historical information from previous scans - Yosry Ahmed has fixed some inconsistencies in memcg statistics in the series 'mm: memcg: fix tracking of pending stats updates values' - In the series 'Implement IOCTL to get and optionally clear info about PTEs' Peter Xu has added an ioctl to /proc/<pid>/pagemap which permits us to atomically read-then-clear page softdirty state. This is mainly used by CRIU - Hugh Dickins contributed the series 'shmem,tmpfs: general maintenance', a bunch of relatively minor maintenance tweaks to this code - Matthew Wilcox has increased the use of the VMA lock over file-backed page faults in the series 'Handle more faults under the VMA lock'. Some rationalizations of the fault path became possible as a result - In the series 'mm/rmap: convert page_move_anon_rmap() to folio_move_anon_rmap()' David Hildenbrand has implemented some cleanups and folio conversions - In the series 'various improvements to the GUP interface' Lorenzo Stoakes has simplified and improved the GUP interface with an eye to providing groundwork for future improvements - Andrey Konovalov has sent along the series 'kasan: assorted fixes and improvements' which does those things - Some page allocator maintenance work from Kemeng Shi in the series 'Two minor cleanups to break_down_buddy_pages' - In thes series 'New selftest for mm' Breno Leitao has developed another MM self test which tickles a race we had between madvise() and page faults - In the series 'Add folio_end_read' Matthew Wilcox provides cleanups and an optimization to the core pagecache code - Nhat Pham has added memcg accounting for hugetlb memory in the series 'hugetlb memcg accounting' - Cleanups and rationalizations to the pagemap code from Lorenzo Stoakes, in the series 'Abstract vma_merge() and split_vma()' - Audra Mitchell has fixed issues in the procfs page_owner code's new timestamping feature which was causing some misbehaviours. In the series 'Fix page_owner's use of free timestamps' - Lorenzo Stoakes has fixed the handling of new mappings of sealed files in the series 'permit write-sealed memfd read-only shared mappings' - Mike Kravetz has optimized the hugetlb vmemmap optimization in the series 'Batch hugetlb vmemmap modification operations' - Some buffer_head folio conversions and cleanups from Matthew Wilcox in the series 'Finish the create_empty_buffers() transition' - As a page allocator performance optimization Huang Ying has added automatic tuning to the allocator's per-cpu-pages feature, in the series 'mm: PCP high auto-tuning' - Roman Gushchin has contributed the patchset 'mm: improve performance of accounted kernel memory allocations' which improves their performance by ~30% as measured by a micro-benchmark - folio conversions from Kefeng Wang in the series 'mm: convert page cpupid functions to folios' - Some kmemleak fixups in Liu Shixin's series 'Some bugfix about kmemleak' - Qi Zheng has improved our handling of memoryless nodes by keeping them off the allocation fallback list. This is done in the series 'handle memoryless nodes more appropriately' - khugepaged conversions from Vishal Moola in the series 'Some khugepaged folio conversions'" [ bcachefs conflicts with the dynamically allocated shrinkers have been resolved as per Stephen Rothwell in https://lore.kernel.org/all/[email protected]/ with help from Qi Zheng. The clone3 test filtering conflict was half-arsed by yours truly ] * tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (406 commits) mm/damon/sysfs: update monitoring target regions for online input commit mm/damon/sysfs: remove requested targets when online-commit inputs selftests: add a sanity check for zswap Documentation: maple_tree: fix word spelling error mm/vmalloc: fix the unchecked dereference warning in vread_iter() zswap: export compression failure stats Documentation: ubsan: drop "the" from article title mempolicy: migration attempt to match interleave nodes mempolicy: mmap_lock is not needed while migrating folios mempolicy: alloc_pages_mpol() for NUMA policy without vma mm: add page_rmappable_folio() wrapper mempolicy: remove confusing MPOL_MF_LAZY dead code mempolicy: mpol_shared_policy_init() without pseudo-vma mempolicy trivia: use pgoff_t in shared mempolicy tree mempolicy trivia: slightly more consistent naming mempolicy trivia: delete those ancient pr_debug()s mempolicy: fix migrate_pages(2) syscall return nr_failed kernfs: drop shared NUMA mempolicy hooks hugetlbfs: drop shared NUMA mempolicy pretence mm/damon/sysfs-test: add a unit test for damon_sysfs_set_targets() ...
2023-11-03nouveau/gsp: add some basic registry entries.Dave Airlie1-10/+35
The nvidia driver sets these two basic registry entries always, so copy it. Reviewed-by: Danilo Krummrich <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
2023-11-03nouveau/gsp: fix message signature.Dave Airlie1-1/+1
This original one was backwards, compared to traces from nvidia driver. Reviewed-by: Danilo Krummrich <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
2023-11-03nouveau/gsp: move to 535.113.01Dave Airlie73-171/+217
This moves the initial effort to the latest 535 firmware. The gsp msg structs have changed, and the message passing also. The wpr also seems to have some struct changes. This version of the firmware will be what we are stuck on for a while, until we can refactor the driver and work out a better path forward. Reviewed-by: Danilo Krummrich <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
2023-11-02Merge tag 'v6.7-p1' of ↵Linus Torvalds275-3331/+10670
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Add virtual-address based lskcipher interface - Optimise ahash/shash performance in light of costly indirect calls - Remove ahash alignmask attribute Algorithms: - Improve AES/XTS performance of 6-way unrolling for ppc - Remove some uses of obsolete algorithms (md4, md5, sha1) - Add FIPS 202 SHA-3 support in pkcs1pad - Add fast path for single-page messages in adiantum - Remove zlib-deflate Drivers: - Add support for S4 in meson RNG driver - Add STM32MP13x support in stm32 - Add hwrng interface support in qcom-rng - Add support for deflate algorithm in hisilicon/zip" * tag 'v6.7-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (283 commits) crypto: adiantum - flush destination page before unmapping crypto: testmgr - move pkcs1pad(rsa,sha3-*) to correct place Documentation/module-signing.txt: bring up to date module: enable automatic module signing with FIPS 202 SHA-3 crypto: asymmetric_keys - allow FIPS 202 SHA-3 signatures crypto: rsa-pkcs1pad - Add FIPS 202 SHA-3 support crypto: FIPS 202 SHA-3 register in hash info for IMA x509: Add OIDs for FIPS 202 SHA-3 hash and signatures crypto: ahash - optimize performance when wrapping shash crypto: ahash - check for shash type instead of not ahash type crypto: hash - move "ahash wrapping shash" functions to ahash.c crypto: talitos - stop using crypto_ahash::init crypto: chelsio - stop using crypto_ahash::init crypto: ahash - improve file comment crypto: ahash - remove struct ahash_request_priv crypto: ahash - remove crypto_ahash_alignmask crypto: gcm - stop using alignmask of ahash crypto: chacha20poly1305 - stop using alignmask of ahash crypto: ccm - stop using alignmask of ahash net: ipv6: stop checking crypto_ahash_alignmask ...
2023-11-02Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds127-1503/+8885
Pull kvm updates from Paolo Bonzini: "ARM: - Generalized infrastructure for 'writable' ID registers, effectively allowing userspace to opt-out of certain vCPU features for its guest - Optimization for vSGI injection, opportunistically compressing MPIDR to vCPU mapping into a table - Improvements to KVM's PMU emulation, allowing userspace to select the number of PMCs available to a VM - Guest support for memory operation instructions (FEAT_MOPS) - Cleanups to handling feature flags in KVM_ARM_VCPU_INIT, squashing bugs and getting rid of useless code - Changes to the way the SMCCC filter is constructed, avoiding wasted memory allocations when not in use - Load the stage-2 MMU context at vcpu_load() for VHE systems, reducing the overhead of errata mitigations - Miscellaneous kernel and selftest fixes LoongArch: - New architecture for kvm. The hardware uses the same model as x86, s390 and RISC-V, where guest/host mode is orthogonal to supervisor/user mode. The virtualization extensions are very similar to MIPS, therefore the code also has some similarities but it's been cleaned up to avoid some of the historical bogosities that are found in arch/mips. The kernel emulates MMU, timer and CSR accesses, while interrupt controllers are only emulated in userspace, at least for now. RISC-V: - Support for the Smstateen and Zicond extensions - Support for virtualizing senvcfg - Support for virtualized SBI debug console (DBCN) S390: - Nested page table management can be monitored through tracepoints and statistics x86: - Fix incorrect handling of VMX posted interrupt descriptor in KVM_SET_LAPIC, which could result in a dropped timer IRQ - Avoid WARN on systems with Intel IPI virtualization - Add CONFIG_KVM_MAX_NR_VCPUS, to allow supporting up to 4096 vCPUs without forcing more common use cases to eat the extra memory overhead. - Add virtualization support for AMD SRSO mitigation (IBPB_BRTYPE and SBPB, aka Selective Branch Predictor Barrier). - Fix a bug where restoring a vCPU snapshot that was taken within 1 second of creating the original vCPU would cause KVM to try to synchronize the vCPU's TSC and thus clobber the correct TSC being set by userspace. - Compute guest wall clock using a single TSC read to avoid generating an inaccurate time, e.g. if the vCPU is preempted between multiple TSC reads. - "Virtualize" HWCR.TscFreqSel to make Linux guests happy, which complain about a "Firmware Bug" if the bit isn't set for select F/M/S combos. Likewise "virtualize" (ignore) MSR_AMD64_TW_CFG to appease Windows Server 2022. - Don't apply side effects to Hyper-V's synthetic timer on writes from userspace to fix an issue where the auto-enable behavior can trigger spurious interrupts, i.e. do auto-enabling only for guest writes. - Remove an unnecessary kick of all vCPUs when synchronizing the dirty log without PML enabled. - Advertise "support" for non-serializing FS/GS base MSR writes as appropriate. - Harden the fast page fault path to guard against encountering an invalid root when walking SPTEs. - Omit "struct kvm_vcpu_xen" entirely when CONFIG_KVM_XEN=n. - Use the fast path directly from the timer callback when delivering Xen timer events, instead of waiting for the next iteration of the run loop. This was not done so far because previously proposed code had races, but now care is taken to stop the hrtimer at critical points such as restarting the timer or saving the timer information for userspace. - Follow the lead of upstream Xen and ignore the VCPU_SSHOTTMR_future flag. - Optimize injection of PMU interrupts that are simultaneous with NMIs. - Usual handful of fixes for typos and other warts. x86 - MTRR/PAT fixes and optimizations: - Clean up code that deals with honoring guest MTRRs when the VM has non-coherent DMA and host MTRRs are ignored, i.e. EPT is enabled. - Zap EPT entries when non-coherent DMA assignment stops/start to prevent using stale entries with the wrong memtype. - Don't ignore guest PAT for CR0.CD=1 && KVM_X86_QUIRK_CD_NW_CLEARED=y This was done as a workaround for virtual machine BIOSes that did not bother to clear CR0.CD (because ancient KVM/QEMU did not bother to set it, in turn), and there's zero reason to extend the quirk to also ignore guest PAT. x86 - SEV fixes: - Report KVM_EXIT_SHUTDOWN instead of EINVAL if KVM intercepts SHUTDOWN while running an SEV-ES guest. - Clean up the recognition of emulation failures on SEV guests, when KVM would like to "skip" the instruction but it had already been partially emulated. This makes it possible to drop a hack that second guessed the (insufficient) information provided by the emulator, and just do the right thing. Documentation: - Various updates and fixes, mostly for x86 - MTRR and PAT fixes and optimizations" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (164 commits) KVM: selftests: Avoid using forced target for generating arm64 headers tools headers arm64: Fix references to top srcdir in Makefile KVM: arm64: Add tracepoint for MMIO accesses where ISV==0 KVM: arm64: selftest: Perform ISB before reading PAR_EL1 KVM: arm64: selftest: Add the missing .guest_prepare() KVM: arm64: Always invalidate TLB for stage-2 permission faults KVM: x86: Service NMI requests after PMI requests in VM-Enter path KVM: arm64: Handle AArch32 SPSR_{irq,abt,und,fiq} as RAZ/WI KVM: arm64: Do not let a L1 hypervisor access the *32_EL2 sysregs KVM: arm64: Refine _EL2 system register list that require trap reinjection arm64: Add missing _EL2 encodings arm64: Add missing _EL12 encodings KVM: selftests: aarch64: vPMU test for validating user accesses KVM: selftests: aarch64: vPMU register test for unimplemented counters KVM: selftests: aarch64: vPMU register test for implemented counters KVM: selftests: aarch64: Introduce vpmu_counter_access test tools: Import arm_pmuv3.h KVM: arm64: PMU: Allow userspace to limit PMCR_EL0.N for the guest KVM: arm64: Sanitize PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} before first run KVM: arm64: Add {get,set}_user for PM{C,I}NTEN{SET,CLR}, PMOVS{SET,CLR} ...
2023-11-02Merge tag 'sh-for-v6.7-tag1' of ↵Linus Torvalds29-1704/+27
git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux Pull sh updates from John Paul Adrian Glaubitz: "While the previously announced patch series for converting arch/sh to device trees is not yet ready for inclusion to mainline and therefore didn't make it for this pull request, there are still a small number changes for v6.7 which include one platform (board plus CPU and driver code) removal plus two fixes. The removal sent in by Arnd Bergmann concerns the microdev board which was an early SuperH prototype board that was never used in production. With the board removed, we were able to drop the now unused code for the SH4-202 CPU and well as the driver code for the superhyway bus and a custom implementation for ioport_map() and ioport_unmap() which will allow us to simplify ioport handling in the future. Another patch set by Geert Uytterhoeven revives SuperH BIOS earlyprintk support which got accidentally disabled in e76fe57447e88916 ("sh: Remove old early serial console code V2"), the second patch in the series updates the documentation. Finally, a patch by Masami Hiramatsu fixes a regression reported by the kernel test robot which uncovered that arch/sh is not implementing arch_cmpxchg_local() and therefore needs use __generic_cmpxchg_local() instead" * tag 'sh-for-v6.7-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux: locking/atomic: sh: Use generic_cmpxchg_local for arch_cmpxchg_local() Documentation: kernel-parameters: Add earlyprintk=bios on SH sh: bios: Revive earlyprintk support sh: machvec: Remove custom ioport_{un,}map() sh: Remove superhyway bus support sh: Remove unused SH4-202 support sh: Remove stale microdev board
2023-11-02Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds9-33/+25
Pull ARM updates from Russell King: - fix some kernel-doc warnings - fix stack depot IRQ stack filter - cast memset() byte to unsigned char - explicitly include correct DI includes - fix ARCH_LOW_ADDRESS_LIMIT with CONFIG_ZONE_DMA - fix get_user() problems when linker uses a veneer - make including linux/uaccess.h self-contained on ARM * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: ARM: 9326/1: make <linux/uaccess.h> self-contained for ARM ARM: 9324/1: fix get_user() broken with veneer ARM: 9323/1: mm: Fix ARCH_LOW_ADDRESS_LIMIT when CONFIG_ZONE_DMA ARM: 9322/1: Explicitly include correct DT includes ARM: 9321/1: memset: cast the constant byte to unsigned char ARM: 9320/1: fix stack depot IRQ stack filter ARM: 9319/1: sa1111: fix sa1111_probe kernel-doc warnings
2023-11-02Merge tag 'm68knommu-for-v6.7' of ↵Linus Torvalds11-19/+37
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu Pull m68knommu updates from Greg Ungerer: "A few changes, most of them related to fixing warnings when compiling with "W=1". These follow up Geert's recent changes for M68K for this too. These ones complete the fixes for the nommu and ColdFire specific code. Also a couple of other fixes to improve ROM default addressing and compiling for the Cleopatra boards. Summary: - improve default Kconfig ROM section settings - fix compilation for some Cleopatra boards - fixes and cleanups for warnings compiling with 'W=1'" * tag 'm68knommu-for-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: m68k: 68000: fix warning in timer code m68k: 68000: fix warnings in 68000 interrupt handling m68k: coldfire: remove unused variable in MMU code m68k: coldfire: fix warnings in uboot argument processing m68k: coldfire: make mcf_maskimr() static m68k: coldfire: ensure gpio prototypes visible m68k: coldfire: add and use "vectors.h" m68knommu: fix compilation for ColdFire/Cleopatra boards m68knommu: improve config ROM setting defaults