aboutsummaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)AuthorFilesLines
2020-11-06rcutorture: Make stutter_wait() caller restore priorityPaul E. McKenney2-11/+22
Currently, stutter_wait() will happily spin waiting for the stutter interval to end even if the caller is running at a real-time priority level. This could starve normal-priority tasks for no good reason. This commit therefore drops the calling task's priority to SCHED_OTHER MAX_NICE if stutter_wait() needs to wait. But when it waits, stutter_wait() returns true, which allows the caller to restore the priority if needed. Callers that were already running at SCHED_OTHER MAX_NICE obviously do not need any changes, but this commit also restores priority for higher-priority callers. Signed-off-by: Paul E. McKenney <[email protected]>
2020-11-06rcutorture: Prevent hangs for invalid argumentsPaul E. McKenney1-1/+4
If an rcutorture torture-test run is given a bad kvm.sh argument, the test will complain to the console, which is good. What is bad is that from the user's perspective, it will just hang for the time specified by the --duration argument. This commit therefore forces an immediate kernel shutdown if a rcu_torture_init()-time error occurs, thus avoiding the appearance of a hang. It also forces a console splat in this case to clearly indicate the presence of an error. Signed-off-by: Paul E. McKenney <[email protected]>
2020-11-06locktorture: Prevent hangs for invalid argumentsPaul E. McKenney1-0/+5
If an locktorture torture-test run is given a bad kvm.sh argument, the test will complain to the console, which is good. What is bad is that from the user's perspective, it will just hang for the time specified by the --duration argument. This commit therefore forces an immediate kernel shutdown if a lock_torture_init()-time error occurs, thus avoiding the appearance of a hang. It also forces a console splat in this case to clearly indicate the presence of an error. Signed-off-by: Paul E. McKenney <[email protected]>
2020-11-06locktorture: Ignore nreaders_stress if no readlock supportHou Tao1-1/+2
Exclusive locks do not have readlock support, which means that a locktorture run with the following module parameters will do nothing: torture_type=mutex_lock nwriters_stress=0 nreaders_stress=1 This commit therefore rejects this combination for exclusive locks by returning -EINVAL during module init. Signed-off-by: Hou Tao <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2020-11-06refscale: Prevent hangs for invalid argumentsPaul E. McKenney1-1/+4
If an refscale torture-test run is given a bad kvm.sh argument, the test will complain to the console, which is good. What is bad is that from the user's perspective, it will just hang for the time specified by the --duration argument. This commit therefore forces an immediate kernel shutdown if a ref_scale_init()-time error occurs, thus avoiding the appearance of a hang. It also forces a console splat in this case to clearly indicate the presence of an error. Signed-off-by: Paul E. McKenney <[email protected]>
2020-11-06rcuscale: Prevent hangs for invalid argumentsPaul E. McKenney1-1/+4
If an rcuscale torture-test run is given a bad kvm.sh argument, the test will complain to the console, which is good. What is bad is that from the user's perspective, it will just hang for the time specified by the --duration argument. This commit therefore forces an immediate kernel shutdown if a rcu_scale_init()-time error occurs, thus avoiding the appearance of a hang. It also forces a console splat in this case to clearly indicate the presence of an error. Signed-off-by: Paul E. McKenney <[email protected]>
2020-11-06rcuscale: Add RCU Tasks TracePaul E. McKenney1-1/+31
This commit adds the ability to test performance and scalability of RCU Tasks Trace updaters. Reported-by: Alexei Starovoitov <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2020-11-06scftorture: Add an alternative IPI vectorPaul E. McKenney1-9/+32
The scftorture tests currently use only smp_call_function() and friends, which means that these tests cannot locate bugs caused by interactions between different IPI vectors. This commit therefore adds the rescheduling IPI to the mix. Note that this commit permits resched_cpus() only when scftorture is built in. This is a workaround. Longer term, this will use real wakeups rather than resched_cpu(). Signed-off-by: Paul E. McKenney <[email protected]>
2020-11-06torture: Make torture_stutter() use hrtimerPaul E. McKenney1-5/+12
The torture_stutter() function uses schedule_timeout_interruptible() to time the stutter duration, but this can miss race conditions due to its being time-synchronized with everything else that is based on the timer wheels. This commit therefore converts torture_stutter() to use the high-resolution timers via schedule_hrtimeout(), and also to fuzz the stutter interval. While in the area, this commit also limits the spin-loop portion of the stutter_wait() function's wait loop to two jiffies, down from about one second. Signed-off-by: Paul E. McKenney <[email protected]>
2020-11-06torture: Periodically pause in stutter_wait()Paul E. McKenney1-2/+14
Running locktorture scenario LOCK05 results in hangs: tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --torture lock --duration 3 --configs LOCK05 The lock_torture_writer() kthreads set themselves to MAX_NICE while running SCHED_OTHER. Other locktorture kthreads run at default niceness, also SCHED_OTHER. This results in these other locktorture kthreads indefinitely preempting the lock_torture_writer() kthreads. Note that the cond_resched() in the stutter_wait() function's loop is ineffective because this scenario is built with CONFIG_PREEMPT=y. It is not clear that such indefinite preemption is supposed to happen, but in the meantime this commit prevents kthreads running in stutter_wait() from being completely CPU-bound, thus allowing the other threads to get some CPU in a timely fashion. This commit also uses hrtimers to provide very short sleeps to avoid degrading the sudden-on testing that stutter is supposed to provide. Reviewed-by: Davidlohr Bueso <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2020-11-06locktorture: Track time of last ->writeunlock()Paul E. McKenney1-0/+2
This commit adds a last_lock_release variable that tracks the time of the last ->writeunlock() call, which allows easier diagnosing of lock hangs when using a kernel debugger. Acked-by: Davidlohr Bueso <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2020-11-06x86/cpu: Avoid cpuinfo-induced IPIing of idle CPUsPaul E. McKenney1-0/+8
Currently, accessing /proc/cpuinfo sends IPIs to idle CPUs in order to learn their clock frequency. Which is a bit strange, given that waking them from idle likely significantly changes their clock frequency. This commit therefore avoids sending /proc/cpuinfo-induced IPIs to idle CPUs. [ paulmck: Also check for idle in arch_freq_prepare_all(). ] Signed-off-by: Paul E. McKenney <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: <[email protected]>
2020-11-06bpf: Update verification logic for LSM programsKP Singh1-3/+7
The current logic checks if the name of the BTF type passed in attach_btf_id starts with "bpf_lsm_", this is not sufficient as it also allows attachment to non-LSM hooks like the very function that performs this check, i.e. bpf_lsm_verify_prog. In order to ensure that this verification logic allows attachment to only LSM hooks, the LSM_HOOK definitions in lsm_hook_defs.h are used to generate a BTF_ID set. Upon verification, the attach_btf_id of the program being attached is checked for presence in this set. Fixes: 9e4e01dfd325 ("bpf: lsm: Implement attach, detach and execution") Signed-off-by: KP Singh <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-11-06bpf: Implement get_current_task_btf and RET_PTR_TO_BTF_IDKP Singh2-2/+21
The currently available bpf_get_current_task returns an unsigned integer which can be used along with BPF_CORE_READ to read data from the task_struct but still cannot be used as an input argument to a helper that accepts an ARG_PTR_TO_BTF_ID of type task_struct. In order to implement this helper a new return type, RET_PTR_TO_BTF_ID, is added. This is similar to RET_PTR_TO_BTF_ID_OR_NULL but does not require checking the nullness of returned pointer. Signed-off-by: KP Singh <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Song Liu <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-11-06bpf: Implement task local storageKP Singh5-1/+332
Similar to bpf_local_storage for sockets and inodes add local storage for task_struct. The life-cycle of storage is managed with the life-cycle of the task_struct. i.e. the storage is destroyed along with the owning task with a callback to the bpf_task_storage_free from the task_free LSM hook. The BPF LSM allocates an __rcu pointer to the bpf_local_storage in the security blob which are now stackable and can co-exist with other LSMs. The userspace map operations can be done by using a pid fd as a key passed to the lookup, update and delete operations. Signed-off-by: KP Singh <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Song Liu <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-11-06bpf: Allow LSM programs to use bpf spin locksKP Singh2-5/+19
Usage of spin locks was not allowed for tracing programs due to insufficient preemption checks. The verifier does not currently prevent LSM programs from using spin locks, but the helpers are not exposed via bpf_lsm_func_proto. Based on the discussion in [1], non-sleepable LSM programs should be able to use bpf_spin_{lock, unlock}. Sleepable LSM programs can be preempted which means that allowng spin locks will need more work (disabling preemption and the verifier ensuring that no sleepable helpers are called when a spin lock is held). [1]: https://lore.kernel.org/bpf/[email protected]/T/#md601a053229287659071600d3483523f752cd2fb Signed-off-by: KP Singh <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Song Liu <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-11-06ftrace: Add recording of functions that caused recursionSteven Rostedt (VMware)9-8/+271
This adds CONFIG_FTRACE_RECORD_RECURSION that will record to a file "recursed_functions" all the functions that caused recursion while a callback to the function tracer was running. Link: https://lkml.kernel.org/r/[email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Guo Ren <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Helge Deller <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: [email protected] Cc: "H. Peter Anvin" <[email protected]> Cc: Kees Cook <[email protected]> Cc: Anton Vorontsov <[email protected]> Cc: Colin Cross <[email protected]> Cc: Tony Luck <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Jiri Kosina <[email protected]> Cc: Miroslav Benes <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Joe Lawrence <[email protected]> Cc: Kamalesh Babulal <[email protected]> Cc: Mauro Carvalho Chehab <[email protected]> Cc: Sebastian Andrzej Siewior <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2020-11-06ftrace: Reverse what the RECURSION flag means in the ftrace_opsSteven Rostedt (VMware)6-21/+13
Now that all callbacks are recursion safe, reverse the meaning of the RECURSION flag and rename it from RECURSION_SAFE to simply RECURSION. Now only callbacks that request to have recursion protecting it will have the added trampoline to do so. Also remove the outdated comment about "PER_CPU" when determining to use the ftrace_ops_assist_func. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Jiri Kosina <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Sebastian Andrzej Siewior <[email protected]> Cc: Miroslav Benes <[email protected]> Cc: Kamalesh Babulal <[email protected]> Cc: Petr Mladek <[email protected]> Cc: [email protected] Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2020-11-06perf/ftrace: Check for rcu_is_watching() in callback functionSteven Rostedt (VMware)1-1/+3
If a ftrace callback requires "rcu_is_watching", then it adds the FTRACE_OPS_FL_RCU flag and it will not be called if RCU is not "watching". But this means that it will use a trampoline when called, and this slows down the function tracing a tad. By checking rcu_is_watching() from within the callback, it no longer needs the RCU flag set in the ftrace_ops and it can be safely called directly. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Jiri Kosina <[email protected]> Cc: Miroslav Benes <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Jiri Olsa <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2020-11-06perf/ftrace: Add recursion protection to the ftrace callbackSteven Rostedt (VMware)1-1/+8
If a ftrace callback does not supply its own recursion protection and does not set the RECURSION_SAFE flag in its ftrace_ops, then ftrace will make a helper trampoline to do so before calling the callback instead of just calling the callback directly. The default for ftrace_ops is going to change. It will expect that handlers provide their own recursion protection, unless its ftrace_ops states otherwise. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Jiri Kosina <[email protected]> Cc: Miroslav Benes <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Jiri Olsa <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2020-11-06livepatch: Trigger WARNING if livepatch function fails due to recursionSteven Rostedt (VMware)1-1/+1
If for some reason a function is called that triggers the recursion detection of live patching, trigger a warning. By not executing the live patch code, it is possible that the old unpatched function will be called placing the system into an unknown state. Link: https://lore.kernel.org/r/20201029145709.GD16774@alley Link: https://lkml.kernel.org/r/[email protected] Cc: Masami Hiramatsu <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Jiri Kosina <[email protected]> Cc: Joe Lawrence <[email protected]> Cc: [email protected] Suggested-by: Miroslav Benes <[email protected]> Reviewed-by: Petr Mladek <[email protected]> Acked-by: Miroslav Benes <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2020-11-06livepatch/ftrace: Add recursion protection to the ftrace callbackSteven Rostedt (VMware)1-0/+5
If a ftrace callback does not supply its own recursion protection and does not set the RECURSION_SAFE flag in its ftrace_ops, then ftrace will make a helper trampoline to do so before calling the callback instead of just calling the callback directly. The default for ftrace_ops is going to change. It will expect that handlers provide their own recursion protection, unless its ftrace_ops states otherwise. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Jiri Kosina <[email protected]> Cc: Joe Lawrence <[email protected]> Cc: [email protected] Reviewed-by: Petr Mladek <[email protected]> Acked-by: Miroslav Benes <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2020-11-06ftrace: Add ftrace_test_recursion_trylock() helper functionSteven Rostedt (VMware)1-7/+5
To make it easier for ftrace callbacks to have recursion protection, provide a ftrace_test_recursion_trylock() and ftrace_test_recursion_unlock() helper that tests for recursion. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2020-11-06ftrace: Move the recursion testing into global headersSteven Rostedt (VMware)1-177/+0
Currently, if a callback is registered to a ftrace function and its ftrace_ops does not have the RECURSION flag set, it is encapsulated in a helper function that does the recursion for it. Really, all the callbacks should have their own recursion protection for performance reasons. But they should not all implement their own. Move the recursion helpers to global headers, so that all callbacks can use them. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2020-11-06printk: remove unneeded dead-store assignmentLukas Bulwahn1-2/+0
make clang-analyzer on x86_64 defconfig caught my attention with: kernel/printk/printk_ringbuffer.c:885:3: warning: Value stored to 'desc' is never read [clang-analyzer-deadcode.DeadStores] desc = to_desc(desc_ring, head_id); ^ Commit b6cf8b3f3312 ("printk: add lockless ringbuffer") introduced desc_reserve() with this unneeded dead-store assignment. As discussed with John Ogness privately, this is probably just some minor left-over from previous iterations of the ringbuffer implementation. So, simply remove this unneeded dead assignment to make clang-analyzer happy. As compilers will detect this unneeded assignment and optimize this anyway, the resulting object code is identical before and after this change. No functional change. No change to object code. Signed-off-by: Lukas Bulwahn <[email protected]> Reviewed-by: Sergey Senozhatsky <[email protected]> Reviewed-by: John Ogness <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Signed-off-by: Petr Mladek <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-11-05bpf: Lift hashtab key_size limitFlorian Lehner1-11/+5
Currently key_size of hashtab is limited to MAX_BPF_STACK. As the key of hashtab can also be a value from a per cpu map it can be larger than MAX_BPF_STACK. The use-case for this patch originates to implement allow/disallow lists for files and file paths. The maximum length of file paths is defined by PATH_MAX with 4096 chars including nul. This limit exceeds MAX_BPF_STACK. Changelog: v5: - Fix cast overflow v4: - Utilize BPF skeleton in tests - Rebase v3: - Rebase v2: - Add a test for bpf side Signed-off-by: Florian Lehner <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: John Fastabend <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-11-05bpf: Zero-fill re-used per-cpu map elementDavid Verbeiren1-2/+28
Zero-fill element values for all other cpus than current, just as when not using prealloc. This is the only way the bpf program can ensure known initial values for all cpus ('onallcpus' cannot be set when coming from the bpf program). The scenario is: bpf program inserts some elements in a per-cpu map, then deletes some (or userspace does). When later adding new elements using bpf_map_update_elem(), the bpf program can only set the value of the new elements for the current cpu. When prealloc is enabled, previously deleted elements are re-used. Without the fix, values for other cpus remain whatever they were when the re-used entry was previously freed. A selftest is added to validate correct operation in above scenario as well as in case of LRU per-cpu map element re-use. Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements") Signed-off-by: David Verbeiren <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Matthieu Baerts <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-11-05bpf: BPF_PRELOAD depends on BPF_SYSCALLRandy Dunlap1-0/+1
Fix build error when BPF_SYSCALL is not set/enabled but BPF_PRELOAD is by making BPF_PRELOAD depend on BPF_SYSCALL. ERROR: modpost: "bpf_preload_ops" [kernel/bpf/preload/bpf_preload.ko] undefined! Reported-by: kernel test robot <[email protected]> Reported-by: Randy Dunlap <[email protected]> Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2020-11-05Merge tag 'trace-v5.10-rc2' of ↵Linus Torvalds6-34/+107
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: - Fix off-by-one error in retrieving the context buffer for trace_printk() - Fix off-by-one error in stack nesting limit - Fix recursion to not make all NMI code false positive as recursing - Stop losing events in function tracing when transitioning between irq context - Stop losing events in ring buffer when transitioning between irq context - Fix return code of error pointer in parse_synth_field() to prevent NULL pointer dereference. - Fix false positive of NMI recursion in kprobe event handling * tag 'trace-v5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: kprobes: Tell lockdep about kprobe nesting tracing: Make -ENOMEM the default error for parse_synth_field() ring-buffer: Fix recursion protection transitions between interrupt context tracing: Fix the checking of stackidx in __ftrace_trace_stack ftrace: Handle tracing when switching between context ftrace: Fix recursion check for NMI test tracing: Fix out of bounds write in get_trace_buf
2020-11-05Merge tag 'pm-5.10-rc3' of ↵Linus Torvalds1-12/+10
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix the device links support in runtime PM, correct mistakes in the cpuidle documentation, fix the handling of policy limits changes in the schedutil cpufreq governor, fix assorted issues in the OPP (operating performance points) framework and make one janitorial change. Specifics: - Unify the handling of managed and stateless device links in the runtime PM framework and prevent runtime PM references to devices from being leaked after device link removal (Rafael Wysocki). - Fix two mistakes in the cpuidle documentation (Julia Lawall). - Prevent the schedutil cpufreq governor from missing policy limits updates in some cases (Viresh Kumar). - Prevent static OPPs from being dropped by mistake (Viresh Kumar). - Prevent helper function in the OPP framework from returning prematurely (Viresh Kumar). - Prevent opp_table_lock from being held too long during removal of OPP tables with no more active references (Viresh Kumar). - Drop redundant semicolon from the Intel RAPL power capping driver (Tom Rix)" * tag 'pm-5.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: runtime: Resume the device earlier in __device_release_driver() PM: runtime: Drop pm_runtime_clean_up_links() PM: runtime: Drop runtime PM references to supplier on link removal powercap/intel_rapl: remove unneeded semicolon Documentation: PM: cpuidle: correct path name Documentation: PM: cpuidle: correct typo cpufreq: schedutil: Don't skip freq update if need_freq_update is set opp: Reduce the size of critical section in _opp_table_kref_release() opp: Fix early exit from dev_pm_opp_register_set_opp_helper() opp: Don't always remove static OPPs in _of_add_opp_table_v1()
2020-11-04x86/entry: Move nmi entry/exit into common codeThomas Gleixner1-0/+36
Lockdep state handling on NMI enter and exit is nothing specific to X86. It's not any different on other architectures. Also the extra state type is not necessary, irqentry_state_t can carry the necessary information as well. Move it to common code and extend irqentry_state_t to carry lockdep state. [ Ira: Make exit_rcu and lockdep a union as they are mutually exclusive between the IRQ and NMI exceptions, and add kernel documentation for struct irqentry_state_t ] Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Ira Weiny <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-11-04Merge branch 'core/urgent' into core/entryThomas Gleixner228-3612/+14204
Pick up the entry fix before further modifications.
2020-11-04entry: Fix the incorrect ordering of lockdep and RCU checkThomas Gleixner1-2/+2
When an exception/interrupt hits kernel space and the kernel is not currently in the idle task then RCU must be watching. irqentry_enter() validates this via rcu_irq_enter_check_tick(), which in turn invokes lockdep when taking a lock. But at that point lockdep does not yet know about the fact that interrupts have been disabled by the CPU, which triggers a lockdep splat complaining about inconsistent state. Invoking trace_hardirqs_off() before rcu_irq_enter_check_tick() defeats the point of rcu_irq_enter_check_tick() because trace_hardirqs_off() uses RCU. So use the same sequence as for the idle case and tell lockdep about the irq state change first, invoke the RCU check and then do the lockdep and tracer update. Fixes: a5497bab5f72 ("entry: Provide generic interrupt entry/exit code") Reported-by: Mark Rutland <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Mark Rutland <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected]
2020-11-04kprobes: Tell lockdep about kprobe nestingSteven Rostedt (VMware)1-4/+21
Since the kprobe handlers have protection that prohibits other handlers from executing in other contexts (like if an NMI comes in while processing a kprobe, and executes the same kprobe, it will get fail with a "busy" return). Lockdep is unaware of this protection. Use lockdep's nesting api to differentiate between locks taken in INT3 context and other context to suppress the false warnings. Link: https://lore.kernel.org/r/[email protected] Cc: Peter Zijlstra <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2020-11-04module: only handle errors with the *switch* statement in module_sig_check()Sergey Shtylyov1-12/+14
Let's handle the successful call of mod_verify_sig() right after that call, making the *switch* statement only handle the real errors, and then move the comment from the first *case* before *switch* itself and the comment before *default* after it. Fix the comment style, add article/comma/dash, spell out "nomem" as "lack of memory" in these comments, while at it... Suggested-by: Joe Perches <[email protected]> Reviewed-by: Miroslav Benes <[email protected]> Signed-off-by: Sergey Shtylyov <[email protected]> Signed-off-by: Jessica Yu <[email protected]>
2020-11-04module: avoid *goto*s in module_sig_check()Sergey Shtylyov1-10/+10
Let's move the common handling of the non-fatal errors after the *switch* statement -- this avoids *goto*s inside that *switch*... Suggested-by: Joe Perches <[email protected]> Reviewed-by: Miroslav Benes <[email protected]> Signed-off-by: Sergey Shtylyov <[email protected]> Signed-off-by: Jessica Yu <[email protected]>
2020-11-04module: merge repetitive strings in module_sig_check()Sergey Shtylyov1-4/+5
The 'reason' variable in module_sig_check() points to 3 strings across the *switch* statement, all needlessly starting with the same text. Let's put the starting text into the pr_notice() call -- it saves 21 bytes of the object code (x86 gcc 10.2.1). Suggested-by: Joe Perches <[email protected]> Reviewed-by: Miroslav Benes <[email protected]> Signed-off-by: Sergey Shtylyov <[email protected]> Signed-off-by: Jessica Yu <[email protected]>
2020-11-02kernel: make kcov_common_handle consider the current contextAleksandr Nogikh1-0/+2
kcov_common_handle is a method that is used to obtain a "default" KCOV remote handle of the current process. The handle can later be passed to kcov_remote_start in order to collect coverage for the processing that is initiated by one process, but done in another. For details see Documentation/dev-tools/kcov.rst and comments in kernel/kcov.c. Presently, if kcov_common_handle is called in an IRQ context, it will return a handle for the interrupted process. This may lead to unreliable and incorrect coverage collection. Adjust the behavior of kcov_common_handle in the following way. If it is called in a task context, return the common handle for the currently running task. Otherwise, return 0. Signed-off-by: Aleksandr Nogikh <[email protected]> Reviewed-by: Andrey Konovalov <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]>
2020-11-02refscale: Bounds-check module parametersPaul E. McKenney1-0/+6
The default value for refscale.nreaders is -1, which results in the code setting the value to three-quarters of the number of CPUs. On single-CPU systems, this results in three-quarters of the value one, which the C language's integer arithmetic rounds to zero. This in turn results in a divide-by-zero error. This commit therefore adds bounds checking to the refscale module parameters, so that if they are less than one, they are set to the value one. Reported-by: kernel test robot <[email protected]> Tested-by "Chen, Rong A" <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2020-11-02rcutorture: Make grace-period kthread report match RCU flavor being testedPaul E. McKenney3-18/+39
At the end of the test and after rcu_torture_writer() stalls, rcutorture invokes show_rcu_gp_kthreads() in order to dump out information on the RCU grace-period kthread. This makes a lot of sense when testing vanilla RCU, but not so much for the other flavors. This commit therefore allows per-flavor kthread-dump functions to be specified. [ paulmck: Apply feedback from kernel test robot <[email protected]>. ] Signed-off-by: Paul E. McKenney <[email protected]>
2020-11-02rcu-tasks: Convert rcu_tasks_wait_gp() for-loop to while-loopPaul E. McKenney1-4/+1
The infinite for-loop in rcu_tasks_wait_gp() has its only exit at the top of the loop, so this commit does the straightforward conversion to a while-loop, thus saving a few lines. Signed-off-by: Paul E. McKenney <[email protected]>
2020-11-02srcu: Use a more appropriate lockdep helperJakub Kicinski1-1/+1
The lockdep_is_held() macro is defined as: #define lockdep_is_held(lock) lock_is_held(&(lock)->dep_map) This hides away the dereference, so that builds with !LOCKDEP don't break. This works in current kernels because the RCU_LOCKDEP_WARN() eliminates its condition at preprocessor time in !LOCKDEP kernels. However, later patches in this series will cause the compiler to see this condition even in !LOCKDEP kernels. This commit prepares for this upcoming change by switching from lock_is_held() to lockdep_is_held(). Signed-off-by: Jakub Kicinski <[email protected]> -- CC: [email protected] CC: [email protected] CC: [email protected] CC: [email protected] CC: [email protected] Signed-off-by: Paul E. McKenney <[email protected]>
2020-11-02kcsan: Never set up watchpoints on NULL pointersMarco Elver1-1/+5
Avoid setting up watchpoints on NULL pointers, as otherwise we would crash inside the KCSAN runtime (when checking for value changes) instead of the instrumented code. Because that may be confusing, skip any address less than PAGE_SIZE. Reviewed-by: Dmitry Vyukov <[email protected]> Signed-off-by: Marco Elver <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2020-11-02kcsan: selftest: Ensure that address is at least PAGE_SIZEMarco Elver1-0/+3
In preparation of supporting only addresses not within the NULL page, change the selftest to never use addresses that are less than PAGE_SIZE. Reviewed-by: Dmitry Vyukov <[email protected]> Signed-off-by: Marco Elver <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]>
2020-11-02tracing: Make -ENOMEM the default error for parse_synth_field()Steven Rostedt (VMware)1-10/+7
parse_synth_field() returns a pointer and requires that errors get surrounded by ERR_PTR(). The ret variable is initialized to zero, but should never be used as zero, and if it is, it could cause a false return code and produce a NULL pointer dereference. It makes no sense to set ret to zero. Set ret to -ENOMEM (the most common error case), and have any other errors set it to something else. This removes the need to initialize ret on *every* error branch. Fixes: 761a8c58db6b ("tracing, synthetic events: Replace buggy strcat() with seq_buf operations") Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2020-11-02ring-buffer: Fix recursion protection transitions between interrupt contextSteven Rostedt (VMware)1-12/+46
The recursion protection of the ring buffer depends on preempt_count() to be correct. But it is possible that the ring buffer gets called after an interrupt comes in but before it updates the preempt_count(). This will trigger a false positive in the recursion code. Use the same trick from the ftrace function callback recursion code which uses a "transition" bit that gets set, to allow for a single recursion for to handle transitions between contexts. Cc: [email protected] Fixes: 567cd4da54ff4 ("ring-buffer: User context bit recursion checking") Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2020-11-02kernel/hung_task.c: make type annotations consistentLukas Bulwahn1-2/+1
Commit 32927393dc1c ("sysctl: pass kernel pointers to ->proc_handler") removed various __user annotations from function signatures as part of its refactoring. It also removed the __user annotation for proc_dohung_task_timeout_secs() at its declaration in sched/sysctl.h, but not at its definition in kernel/hung_task.c. Hence, sparse complains: kernel/hung_task.c:271:5: error: symbol 'proc_dohung_task_timeout_secs' redeclared with different type (incompatible argument 3 (different address spaces)) Adjust the annotation at the definition fitting to that refactoring to make sparse happy again, which also resolves this warning from sparse: kernel/hung_task.c:277:52: warning: incorrect type in argument 3 (different address spaces) kernel/hung_task.c:277:52: expected void * kernel/hung_task.c:277:52: got void [noderef] __user *buffer No functional change. No change in object code. Signed-off-by: Lukas Bulwahn <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Tetsuo Handa <[email protected]> Cc: Al Viro <[email protected]> Cc: Andrey Ignatov <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-11-02kthread_worker: prevent queuing delayed work from timer_fn when it is being ↵Zqiang1-1/+2
canceled There is a small race window when a delayed work is being canceled and the work still might be queued from the timer_fn: CPU0 CPU1 kthread_cancel_delayed_work_sync() __kthread_cancel_work_sync() __kthread_cancel_work() work->canceling++; kthread_delayed_work_timer_fn() kthread_insert_work(); BUG: kthread_insert_work() should not get called when work->canceling is set. Signed-off-by: Zqiang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Petr Mladek <[email protected]> Acked-by: Tejun Heo <[email protected]> Cc: <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-11-02ptrace: fix task_join_group_stop() for the case when current is tracedOleg Nesterov1-9/+10
This testcase #include <stdio.h> #include <unistd.h> #include <signal.h> #include <sys/ptrace.h> #include <sys/wait.h> #include <pthread.h> #include <assert.h> void *tf(void *arg) { return NULL; } int main(void) { int pid = fork(); if (!pid) { kill(getpid(), SIGSTOP); pthread_t th; pthread_create(&th, NULL, tf, NULL); return 0; } waitpid(pid, NULL, WSTOPPED); ptrace(PTRACE_SEIZE, pid, 0, PTRACE_O_TRACECLONE); waitpid(pid, NULL, 0); ptrace(PTRACE_CONT, pid, 0,0); waitpid(pid, NULL, 0); int status; int thread = waitpid(-1, &status, 0); assert(thread > 0 && thread != pid); assert(status == 0x80137f); return 0; } fails and triggers WARN_ON_ONCE(!signr) in do_jobctl_trap(). This is because task_join_group_stop() has 2 problems when current is traced: 1. We can't rely on the "JOBCTL_STOP_PENDING" check, a stopped tracee can be woken up by debugger and it can clone another thread which should join the group-stop. We need to check group_stop_count || SIGNAL_STOP_STOPPED. 2. If SIGNAL_STOP_STOPPED is already set, we should not increment sig->group_stop_count and add JOBCTL_STOP_CONSUME. The new thread should stop without another do_notify_parent_cldstop() report. To clarify, the problem is very old and we should blame ptrace_init_task(). But now that we have task_join_group_stop() it makes more sense to fix this helper to avoid the code duplication. Reported-by: [email protected] Signed-off-by: Oleg Nesterov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Jens Axboe <[email protected]> Cc: Christian Brauner <[email protected]> Cc: "Eric W . Biederman" <[email protected]> Cc: Zhiqiang Liu <[email protected]> Cc: Tejun Heo <[email protected]> Cc: <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-11-02cpufreq: schedutil: Don't skip freq update if need_freq_update is setViresh Kumar1-12/+10
The cpufreq policy's frequency limits (min/max) can get changed at any point of time, while schedutil is trying to update the next frequency. Though the schedutil governor has necessary locking and support in place to make sure we don't miss any of those updates, there is a corner case where the governor will find that the CPU is already running at the desired frequency and so may skip an update. For example, consider that the CPU can run at 1 GHz, 1.2 GHz and 1.4 GHz and is running at 1 GHz currently. Schedutil tries to update the frequency to 1.2 GHz, during this time the policy limits get changed as policy->min = 1.4 GHz. As schedutil (and cpufreq core) does clamp the frequency at various instances, we will eventually set the frequency to 1.4 GHz, while we will save 1.2 GHz in sg_policy->next_freq. Now lets say the policy limits get changed back at this time with policy->min as 1 GHz. The next time schedutil is invoked by the scheduler, we will reevaluate the next frequency (because need_freq_update will get set due to limits change event) and lets say we want to set the frequency to 1.2 GHz again. At this point sugov_update_next_freq() will find the next_freq == current_freq and will abort the update, while the CPU actually runs at 1.4 GHz. Until now need_freq_update was used as a flag to indicate that the policy's frequency limits have changed, and that we should consider the new limits while reevaluating the next frequency. This patch fixes the above mentioned issue by extending the purpose of the need_freq_update flag. If this flag is set now, the schedutil governor will not try to abort a frequency change even if next_freq == current_freq. As similar behavior is required in the case of CPUFREQ_NEED_UPDATE_LIMITS flag as well, need_freq_update will never be set to false if that flag is set for the driver. We also don't need to consider the need_freq_update flag in sugov_update_single() anymore to handle the special case of busy CPU, as we won't abort a frequency update anymore. Reported-by: zhuguangqing <[email protected]> Suggested-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Viresh Kumar <[email protected]> [ rjw: Rearrange code to avoid a branch ] Signed-off-by: Rafael J. Wysocki <[email protected]>