aboutsummaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)AuthorFilesLines
2018-04-10tick-sched: avoid a maybe-uninitialized warningArnd Bergmann1-3/+6
The use of bitfields seems to confuse gcc, leading to a false-positive warning in all compiler versions: kernel/time/tick-sched.c: In function 'tick_nohz_idle_exit': kernel/time/tick-sched.c:538:2: error: 'now' may be used uninitialized in this function [-Werror=maybe-uninitialized] This introduces a temporary variable to track the flags so gcc doesn't have to evaluate twice, eliminating the code path that leads to the warning. Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85301 Fixes: 1cae544d42d2 ("nohz: Gather tick_sched booleans under a common flag field") Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2018-04-09time: hrtimer: Use timerqueue_iterate_next() to get to the next timerRafael J. Wysocki1-3/+1
Use timerqueue_iterate_next() to get to the next timer in __hrtimer_next_event_base() without browsing the timerqueue details diredctly. No intentional changes in functionality. Suggested-by: Frederic Weisbecker <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2018-04-09nohz: Avoid duplication of code related to got_idle_tickRafael J. Wysocki1-10/+6
Move the code setting ts->got_idle_tick into tick_sched_do_timer() to avoid code duplication. No intentional changes in functionality. Suggested-by: Frederic Weisbecker <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]>
2018-04-09nohz: Gather tick_sched booleans under a common flag fieldFrederic Weisbecker2-9/+15
Optimize the space and leave plenty of room for further flags. Signed-off-by: Frederic Weisbecker <[email protected]> [ rjw: Do not use __this_cpu_read() to access tick_stopped and add got_idle_tick to avoid overloading inidle ] Signed-off-by: Rafael J. Wysocki <[email protected]>
2018-04-09cpuidle: menu: Refine idle state selection for running tickRafael J. Wysocki1-6/+6
If the tick isn't stopped, the target residency of the state selected by the menu governor may be greater than the actual time to the next tick and that means lost energy. To avoid that, make tick_nohz_get_sleep_length() return the current time to the next event (before stopping the tick) in addition to the estimated one via an extra pointer argument and make menu_select() use that value to refine the state selection when necessary. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
2018-04-09sched: idle: Select idle state before stopping the tickRafael J. Wysocki3-17/+57
In order to address the issue with short idle duration predictions by the idle governor after the scheduler tick has been stopped, reorder the code in cpuidle_idle_call() so that the governor idle state selection runs before tick_nohz_idle_go_idle() and use the "nohz" hint returned by cpuidle_select() to decide whether or not to stop the tick. This isn't straightforward, because menu_select() invokes tick_nohz_get_sleep_length() to get the time to the next timer event and the number returned by the latter comes from __tick_nohz_idle_stop_tick(). Fortunately, however, it is possible to compute that number without actually stopping the tick and with the help of the existing code. Namely, tick_nohz_get_sleep_length() can be made call tick_nohz_next_event(), introduced earlier, to get the time to the next non-highres timer event. If that happens, tick_nohz_next_event() need not be called by __tick_nohz_idle_stop_tick() again. If it turns out that the scheduler tick cannot be stopped going forward or the next timer event is too close for the tick to be stopped, tick_nohz_get_sleep_length() can simply return the time to the next event currently programmed into the corresponding clock event device. In addition to knowing the return value of tick_nohz_next_event(), however, tick_nohz_get_sleep_length() needs to know the time to the next highres timer event, but with the scheduler tick timer excluded, which can be computed with the help of hrtimer_get_next_event(). That minimum of that number and the tick_nohz_next_event() return value is the total time to the next timer event with the assumption that the tick will be stopped. It can be returned to the idle governor which can use it for predicting idle duration (under the assumption that the tick will be stopped) and deciding whether or not it makes sense to stop the tick before putting the CPU into the selected idle state. With the above, the sleep_length field in struct tick_sched is not necessary any more, so drop it. Link: https://bugzilla.kernel.org/show_bug.cgi?id=199227 Reported-by: Doug Smythies <[email protected]> Reported-by: Thomas Ilsche <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]>
2018-04-07time: hrtimer: Introduce hrtimer_next_event_without()Rafael J. Wysocki1-2/+53
The next set of changes will need to compute the time to the next hrtimer event over all hrtimers except for the scheduler tick one. To that end introduce a new helper function, hrtimer_next_event_without(), for computing the time until the next hrtimer event over all timers except for one and modify the underlying code in __hrtimer_next_event_base() to prepare it for being called by that new function. No intentional changes in functionality. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]>
2018-04-07time: tick-sched: Split tick_nohz_stop_sched_tick()Rafael J. Wysocki2-46/+82
In order to address the issue with short idle duration predictions by the idle governor after the scheduler tick has been stopped, split tick_nohz_stop_sched_tick() into two separate routines, one computing the time to the next timer event and the other simply stopping the tick when the time to the next timer event is known. Prepare these two routines to be called separately, as one of them will be called by the idle governor in the cpuidle_select() code path after subsequent changes. Update the former callers of tick_nohz_stop_sched_tick() to use the new routines, tick_nohz_next_event() and tick_nohz_stop_tick(), instead of it and move the updates of the sleep_length field in struct tick_sched into __tick_nohz_idle_stop_tick() as it doesn't need to be updated anywhere else. There should be no intentional visible changes in functionality resulting from this change. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]>
2018-04-06cpuidle: Return nohz hint from cpuidle_select()Rafael J. Wysocki2-1/+23
Add a new pointer argument to cpuidle_select() and to the ->select cpuidle governor callback to allow a boolean value indicating whether or not the tick should be stopped before entering the selected state to be returned from there. Make the ladder governor ignore that pointer (to preserve its current behavior) and make the menu governor return 'false" through it if: (1) the idle exit latency is constrained at 0, or (2) the selected state is a polling one, or (3) the expected idle period duration is within the tick period range. In addition to that, the correction factor computations in the menu governor need to take the possibility that the tick may not be stopped into account to avoid artificially small correction factor values. To that end, add a mechanism to record tick wakeups, as suggested by Peter Zijlstra, and use it to modify the menu_update() behavior when tick wakeup occurs. Namely, if the CPU is woken up by the tick and the return value of tick_nohz_get_sleep_length() is not within the tick boundary, the predicted idle duration is likely too short, so make menu_update() try to compensate for that by updating the governor statistics as though the CPU was idle for a long time. Since the value returned through the new argument pointer of cpuidle_select() is not used by its caller yet, this change by itself is not expected to alter the functionality of the code. Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
2018-04-06jiffies: Introduce USER_TICK_USEC and redefine TICK_USECRafael J. Wysocki1-1/+1
Since the subsequent changes will need a TICK_USEC definition analogous to TICK_NSEC, rename the existing TICK_USEC as USER_TICK_USEC, update its users and redefine TICK_USEC accordingly. Suggested-by: Peter Zijlstra <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]>
2018-04-05sched: idle: Do not stop the tick before cpuidle_idle_call()Rafael J. Wysocki1-4/+15
Make cpuidle_idle_call() decide whether or not to stop the tick. First, the cpuidle_enter_s2idle() path deals with the tick (and with the entire timekeeping for that matter) by itself and it doesn't need the tick to be stopped beforehand. Second, to address the issue with short idle duration predictions by the idle governor after the tick has been stopped, it will be necessary to change the ordering of cpuidle_select() with respect to tick_nohz_idle_stop_tick(). To prepare for that, put a tick_nohz_idle_stop_tick() call in the same branch in which cpuidle_select() is called. Signed-off-by: Rafael J. Wysocki <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
2018-04-05sched: idle: Do not stop the tick upfront in the idle loopRafael J. Wysocki2-11/+24
Push the decision whether or not to stop the tick somewhat deeper into the idle loop. Stopping the tick upfront leads to unpleasant outcomes in case the idle governor doesn't agree with the nohz code on the duration of the upcoming idle period. Specifically, if the tick has been stopped and the idle governor predicts short idle, the situation is bad regardless of whether or not the prediction is accurate. If it is accurate, the tick has been stopped unnecessarily which means excessive overhead. If it is not accurate, the CPU is likely to spend too much time in the (shallow, because short idle has been predicted) idle state selected by the governor [1]. As the first step towards addressing this problem, change the code to make the tick stopping decision inside of the loop in do_idle(). In particular, do not stop the tick in the cpu_idle_poll() code path. Also don't do that in tick_nohz_irq_exit() which doesn't really have enough information on whether or not to stop the tick. Link: https://marc.info/?l=linux-pm&m=150116085925208&w=2 # [1] Link: https://tu-dresden.de/zih/forschung/ressourcen/dateien/projekte/haec/powernightmares.pdf Suggested-by: Frederic Weisbecker <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
2018-04-05time: tick-sched: Reorganize idle tick management codeRafael J. Wysocki2-21/+26
Prepare the scheduler tick code for reworking the idle loop to avoid stopping the tick in some cases. The idea is to split the nohz idle entry call to decouple the idle time stats accounting and preparatory work from the actual tick stop code, in order to later be able to delay the tick stop once we reach more power-knowledgeable callers. Move away the tick_nohz_start_idle() invocation from __tick_nohz_idle_enter(), rename the latter to __tick_nohz_idle_stop_tick() and define tick_nohz_idle_stop_tick() as a wrapper around it for calling it from the outside. Make tick_nohz_idle_enter() only call tick_nohz_start_idle() instead of calling the entire __tick_nohz_idle_enter(), add another wrapper disabling and enabling interrupts around tick_nohz_idle_stop_tick() and make the current callers of tick_nohz_idle_enter() call it too to retain their current functionality. Signed-off-by: Rafael J. Wysocki <[email protected]> Reviewed-by: Frederic Weisbecker <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]>
2018-04-03Merge tag 'pm-4.17-rc1' of ↵Linus Torvalds1-1/+25
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "These update the cpuidle poll state definition to reduce excessive energy usage related to it, add new CPU ID to the RAPL power capping driver, update the ACPI system suspend code to handle some special cases better, extend the PM core's device links code slightly, add new sysfs attribute for better suspend-to-idle diagnostics and easier hibernation handling, update power management tools and clean up cpufreq quite a bit. Specifics: - Modify the cpuidle poll state implementation to prevent CPUs from staying in the loop in there for excessive times (Rafael Wysocki). - Add Intel Cannon Lake chips support to the RAPL power capping driver (Joe Konno). - Add reference counting to the device links handling code in the PM core (Lukas Wunner). - Avoid reconfiguring GPEs on suspend-to-idle in the ACPI system suspend code (Rafael Wysocki). - Allow devices to be put into deeper low-power states via ACPI if both _SxD and _SxW are missing (Daniel Drake). - Reorganize the core ACPI suspend-to-idle wakeup code to avoid a keyboard wakeup issue on Asus UX331UA (Chris Chiu). - Prevent the PCMCIA library code from aborting suspend-to-idle due to noirq suspend failures resulting from incorrect assumptions (Rafael Wysocki). - Add coupled cpuidle supprt to the Exynos3250 platform (Marek Szyprowski). - Add new sysfs file to make it easier to specify the image storage location during hibernation (Mario Limonciello). - Add sysfs files for collecting suspend-to-idle usage and time statistics for CPU idle states (Rafael Wysocki). - Update the pm-graph utilities (Todd Brandt). - Reduce the kernel log noise related to reporting Low-power Idle constraings by the ACPI system suspend code (Rafael Wysocki). - Make it easier to distinguish dedicated wakeup IRQs in the /proc/interrupts output (Tony Lindgren). - Add the frequency table validation in cpufreq to the core and drop it from a number of cpufreq drivers (Viresh Kumar). - Drop "cooling-{min|max}-level" for CPU nodes from a couple of DT bindings (Viresh Kumar). - Clean up the CPU online error code path in the cpufreq core (Viresh Kumar). - Fix assorted issues in the SCPI, CPPC, mediatek and tegra186 cpufreq drivers (Arnd Bergmann, Chunyu Hu, George Cherian, Viresh Kumar). - Drop memory allocation error messages from a few places in cpufreq and cpuildle drivers (Markus Elfring)" * tag 'pm-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (56 commits) ACPI / PM: Fix keyboard wakeup from suspend-to-idle on ASUS UX331UA cpufreq: CPPC: Use transition_delay_us depending transition_latency PM / hibernate: Change message when writing to /sys/power/resume PM / hibernate: Make passing hibernate offsets more friendly cpuidle: poll_state: Avoid invoking local_clock() too often PM: cpuidle/suspend: Add s2idle usage and time state attributes cpuidle: Enable coupled cpuidle support on Exynos3250 platform cpuidle: poll_state: Add time limit to poll_idle() cpufreq: tegra186: Don't validate the frequency table twice cpufreq: speedstep: Don't validate the frequency table twice cpufreq: sparc: Don't validate the frequency table twice cpufreq: sh: Don't validate the frequency table twice cpufreq: sfi: Don't validate the frequency table twice cpufreq: scpi: Don't validate the frequency table twice cpufreq: sc520: Don't validate the frequency table twice cpufreq: s3c24xx: Don't validate the frequency table twice cpufreq: qoirq: Don't validate the frequency table twice cpufreq: pxa: Don't validate the frequency table twice cpufreq: ppc_cbe: Don't validate the frequency table twice cpufreq: powernow: Don't validate the frequency table twice ...
2018-04-02Merge branch 'syscalls-next' of ↵Linus Torvalds15-340/+563
git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux Pull removal of in-kernel calls to syscalls from Dominik Brodowski: "System calls are interaction points between userspace and the kernel. Therefore, system call functions such as sys_xyzzy() or compat_sys_xyzzy() should only be called from userspace via the syscall table, but not from elsewhere in the kernel. At least on 64-bit x86, it will likely be a hard requirement from v4.17 onwards to not call system call functions in the kernel: It is better to use use a different calling convention for system calls there, where struct pt_regs is decoded on-the-fly in a syscall wrapper which then hands processing over to the actual syscall function. This means that only those parameters which are actually needed for a specific syscall are passed on during syscall entry, instead of filling in six CPU registers with random user space content all the time (which may cause serious trouble down the call chain). Those x86-specific patches will be pushed through the x86 tree in the near future. Moreover, rules on how data may be accessed may differ between kernel data and user data. This is another reason why calling sys_xyzzy() is generally a bad idea, and -- at most -- acceptable in arch-specific code. This patchset removes all in-kernel calls to syscall functions in the kernel with the exception of arch/. On top of this, it cleans up the three places where many syscalls are referenced or prototyped, namely kernel/sys_ni.c, include/linux/syscalls.h and include/linux/compat.h" * 'syscalls-next' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/linux: (109 commits) bpf: whitelist all syscalls for error injection kernel/sys_ni: remove {sys_,sys_compat} from cond_syscall definitions kernel/sys_ni: sort cond_syscall() entries syscalls/x86: auto-create compat_sys_*() prototypes syscalls: sort syscall prototypes in include/linux/compat.h net: remove compat_sys_*() prototypes from net/compat.h syscalls: sort syscall prototypes in include/linux/syscalls.h kexec: move sys_kexec_load() prototype to syscalls.h x86/sigreturn: use SYSCALL_DEFINE0 x86: fix sys_sigreturn() return type to be long, not unsigned long x86/ioport: add ksys_ioperm() helper; remove in-kernel calls to sys_ioperm() mm: add ksys_readahead() helper; remove in-kernel calls to sys_readahead() mm: add ksys_mmap_pgoff() helper; remove in-kernel calls to sys_mmap_pgoff() mm: add ksys_fadvise64_64() helper; remove in-kernel call to sys_fadvise64_64() fs: add ksys_fallocate() wrapper; remove in-kernel calls to sys_fallocate() fs: add ksys_p{read,write}64() helpers; remove in-kernel calls to syscalls fs: add ksys_truncate() wrapper; remove in-kernel calls to sys_truncate() fs: add ksys_sync_file_range helper(); remove in-kernel calls to syscall kernel: add ksys_setsid() helper; remove in-kernel call to sys_setsid() kernel: add ksys_unshare() helper; remove in-kernel calls to sys_unshare() ...
2018-04-02Merge tag 'arch-removal' of ↵Linus Torvalds1-4/+0
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pul removal of obsolete architecture ports from Arnd Bergmann: "This removes the entire architecture code for blackfin, cris, frv, m32r, metag, mn10300, score, and tile, including the associated device drivers. I have been working with the (former) maintainers for each one to ensure that my interpretation was right and the code is definitely unused in mainline kernels. Many had fond memories of working on the respective ports to start with and getting them included in upstream, but also saw no point in keeping the port alive without any users. In the end, it seems that while the eight architectures are extremely different, they all suffered the same fate: There was one company in charge of an SoC line, a CPU microarchitecture and a software ecosystem, which was more costly than licensing newer off-the-shelf CPU cores from a third party (typically ARM, MIPS, or RISC-V). It seems that all the SoC product lines are still around, but have not used the custom CPU architectures for several years at this point. In contrast, CPU instruction sets that remain popular and have actively maintained kernel ports tend to all be used across multiple licensees. [ See the new nds32 port merged in the previous commit for the next generation of "one company in charge of an SoC line, a CPU microarchitecture and a software ecosystem" - Linus ] The removal came out of a discussion that is now documented at https://lwn.net/Articles/748074/. Unlike the original plans, I'm not marking any ports as deprecated but remove them all at once after I made sure that they are all unused. Some architectures (notably tile, mn10300, and blackfin) are still being shipped in products with old kernels, but those products will never be updated to newer kernel releases. After this series, we still have a few architectures without mainline gcc support: - unicore32 and hexagon both have very outdated gcc releases, but the maintainers promised to work on providing something newer. At least in case of hexagon, this will only be llvm, not gcc. - openrisc, risc-v and nds32 are still in the process of finishing their support or getting it added to mainline gcc in the first place. They all have patched gcc-7.3 ports that work to some degree, but complete upstream support won't happen before gcc-8.1. Csky posted their first kernel patch set last week, their situation will be similar [ Palmer Dabbelt points out that RISC-V support is in mainline gcc since gcc-7, although gcc-7.3.0 is the recommended minimum - Linus ]" This really says it all: 2498 files changed, 95 insertions(+), 467668 deletions(-) * tag 'arch-removal' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (74 commits) MAINTAINERS: UNICORE32: Change email account staging: iio: remove iio-trig-bfin-timer driver tty: hvc: remove tile driver tty: remove bfin_jtag_comm and hvc_bfin_jtag drivers serial: remove tile uart driver serial: remove m32r_sio driver serial: remove blackfin drivers serial: remove cris/etrax uart drivers usb: Remove Blackfin references in USB support usb: isp1362: remove blackfin arch glue usb: musb: remove blackfin port usb: host: remove tilegx platform glue pwm: remove pwm-bfin driver i2c: remove bfin-twi driver spi: remove blackfin related host drivers watchdog: remove bfin_wdt driver can: remove bfin_can driver mmc: remove bfin_sdh driver input: misc: remove blackfin rotary driver input: keyboard: remove bf54x driver ...
2018-04-02Merge branch 'sched-wait-for-linus' of ↵Linus Torvalds1-84/+26
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull wait_var_event updates from Ingo Molnar: "This introduces the new wait_var_event() API, which is a more flexible waiting primitive than wait_on_atomic_t(). All wait_on_atomic_t() users are migrated over to the new API and wait_on_atomic_t() is removed. The migration fixes one bug and should result in no functional changes for the other usecases" * 'sched-wait-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/wait: Improve __var_waitqueue() code generation sched/wait: Remove the wait_on_atomic_t() API sched/wait, arch/mips: Fix and convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, fs/ocfs2: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, fs/nfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, fs/fscache: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, fs/btrfs: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, fs/afs: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, drivers/media: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait, drivers/drm: Convert wait_on_atomic_t() usage to the new wait_var_event() API sched/wait: Introduce wait_var_event()
2018-04-02Merge branch 'smp-hotplug-for-linus' of ↵Linus Torvalds1-36/+24
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull SMP hotplug updates from Ingo Molnar: "Simplify the CPU hot-plug state machine" * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpu/hotplug: Fix unused function warning cpu/hotplug: Merge cpuhp_bp_states and cpuhp_ap_states
2018-04-02Merge branch 'sched-core-for-linus' of ↵Linus Torvalds34-1512/+2036
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "The main scheduler changes in this cycle were: - NUMA balancing improvements (Mel Gorman) - Further load tracking improvements (Patrick Bellasi) - Various NOHZ balancing cleanups and optimizations (Peter Zijlstra) - Improve blocked load handling, in particular we can now reduce and eventually stop periodic load updates on 'very idle' CPUs. (Vincent Guittot) - On isolated CPUs offload the final 1Hz scheduler tick as well, plus related cleanups and reorganization. (Frederic Weisbecker) - Core scheduler code cleanups (Ingo Molnar)" * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits) sched/core: Update preempt_notifier_key to modern API sched/cpufreq: Rate limits for SCHED_DEADLINE sched/fair: Update util_est only on util_avg updates sched/cpufreq/schedutil: Use util_est for OPP selection sched/fair: Use util_est in LB and WU paths sched/fair: Add util_est on top of PELT sched/core: Remove TASK_ALL sched/completions: Use bool in try_wait_for_completion() sched/fair: Update blocked load when newly idle sched/fair: Move idle_balance() sched/nohz: Merge CONFIG_NO_HZ_COMMON blocks sched/fair: Move rebalance_domains() sched/nohz: Optimize nohz_idle_balance() sched/fair: Reduce the periodic update duration sched/nohz: Stop NOHZ stats when decayed sched/cpufreq: Provide migration hint sched/nohz: Clean up nohz enter/exit sched/fair: Update blocked load from NEWIDLE sched/fair: Add NOHZ stats balancing sched/fair: Restructure nohz_balance_kick() ...
2018-04-02kernel/sys_ni: remove {sys_,sys_compat} from cond_syscall definitionsDominik Brodowski1-215/+218
This keeps it in line with the SYSCALL_DEFINEx() / COMPAT_SYSCALL_DEFINEx() calling convention. Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02kernel/sys_ni: sort cond_syscall() entriesDominik Brodowski1-174/+332
Shuffle the cond_syscall() entries in kernel/sys_ni.c around so that they are kept in the same order as in include/uapi/asm-generic/unistd.h. For better structuring, add the same comments as in that file, but keep a few additional comments and extend the commentary where it seems useful. Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02kernel: add ksys_setsid() helper; remove in-kernel call to sys_setsid()Dominik Brodowski1-1/+6
Using this helper allows us to avoid the in-kernel call to the sys_setsid() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_setsid(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: Al Viro <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02kernel: add ksys_unshare() helper; remove in-kernel calls to sys_unshare()Dominik Brodowski1-1/+6
Using this helper allows us to avoid the in-kernel calls to the sys_unshare() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_unshare(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: Al Viro <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02fs: add ksys_sync() helper; remove in-kernel calls to sys_sync()Dominik Brodowski3-3/+3
Using this helper allows us to avoid the in-kernel calls to the sys_sync() syscall. The ksys_ prefix denotes that this function is meant as a drop-in replacement for the syscall. In particular, it uses the same calling convention as sys_sync(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: Alexander Viro <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02fs: add do_fchownat(), ksys_fchown() helpers and ksys_{,l}chown() wrappersDominik Brodowski1-3/+3
Using the fs-interal do_fchownat() wrapper allows us to get rid of fs-internal calls to the sys_fchownat() syscall. Introducing the ksys_fchown() helper and the ksys_{,}chown() wrappers allows us to avoid the in-kernel calls to the sys_{,l,f}chown() syscalls. The ksys_ prefix denotes that these functions are meant as a drop-in replacement for the syscalls. In particular, they use the same calling convention as sys_{,l,f}chown(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: Al Viro <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02fs/quota: use COMPAT_SYSCALL_DEFINE for sys32_quotactl()Dominik Brodowski1-1/+1
While sys32_quotactl() is only needed on x86, it can use the recommended COMPAT_SYSCALL_DEFINEx() machinery for its setup. Acked-by: Jan Kara <[email protected]> Cc: Christoph Hellwig <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02mm: add kernel_move_pages() helper, move compat syscall to mm/migrate.cDominik Brodowski1-22/+0
Move compat_sys_move_pages() to mm/migrate.c and make it call a newly introduced helper -- kernel_move_pages() -- instead of the syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: Al Viro <[email protected]> Cc: [email protected] Cc: Andrew Morton <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02mm: add kernel_migrate_pages() helper, move compat syscall to mm/mempolicy.cDominik Brodowski1-33/+0
Move compat_sys_migrate_pages() to mm/mempolicy.c and make it call a newly introduced helper -- kernel_migrate_pages() -- instead of the syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: Al Viro <[email protected]> Cc: [email protected] Cc: Andrew Morton <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02sched: add do_sched_yield() helper; remove in-kernel call to sched_yield()Dominik Brodowski1-2/+6
Using the sched-internal do_sched_yield() helper allows us to get rid of the sched-internal call to the sys_sched_yield() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: Ingo Molnar <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02kernel: provide ksys_*() wrappers for syscalls called by kernel/uid16.cDominik Brodowski3-17/+74
Using these helpers allows us to avoid the in-kernel calls to these syscalls: sys_setregid(), sys_setgid(), sys_setreuid(), sys_setuid(), sys_setresuid(), sys_setresgid(), sys_setfsuid(), and sys_setfsgid(). The ksys_ prefix denotes that these function are meant as a drop-in replacement for the syscall. In particular, they use the same calling convention. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: Al Viro <[email protected]> Cc: Eric W. Biederman <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02kernel: add do_compat_sigaltstack() helper; remove in-kernel call to compat ↵Dominik Brodowski1-4/+10
syscall Using this helper allows us to avoid the in-kernel call to the compat_sys_sigaltstack() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: "Eric W. Biederman" <[email protected]> Cc: Al Viro <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02kernel: add do_getpgid() helper; remove internal call to sys_getpgid()Dominik Brodowski1-2/+7
Using the do_getpgid() helper removes an in-kernel call to the sys_getpgid() syscall. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: Al Viro <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02mm: use do_futex() instead of sys_futex() in mm_release()Dominik Brodowski1-2/+2
sys_futex() is a wrapper to do_futex() which does not modify any values here: - uaddr, val and val3 are kept the same - op is masked with FUTEX_CMD_MASK, but is always set to FUTEX_WAKE. Therefore, val2 is always 0. - as utime is set to NULL, *timeout is NULL This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Darren Hart <[email protected]> Cc: Andrew Morton <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02kexec: call do_kexec_load() in compat syscall directlyDominik Brodowski1-13/+39
do_kexec_load() can be called directly by compat_sys_kexec() as long as the same parameters checks are completed which are currently handled (also) by sys_kexec(). Therefore, move those to kexec_load_check(), call that newly introduced helper function from both sys_kexec() and compat_sys_kexec(), and duplicate the remaining code from sys_kexec() in compat_sys_kexec(). This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: Eric Biederman <[email protected]> Cc: [email protected] Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02kernel: open-code sys_rt_sigpending() in sys_sigpending()Dominik Brodowski1-3/+12
A similar but not fully equivalent code path is already open-coded three times (in sys_rt_sigpending and in the two compat stubs), so do it a fourth time here. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Cc: Al Viro <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02kernel: use kernel_wait4() instead of sys_wait4()Dominik Brodowski3-6/+6
All call sites of sys_wait4() set *rusage to NULL. Therefore, there is no need for the copy_to_user() handling of *rusage, and we can use kernel_wait4() directly. This patch is part of a series which removes in-kernel calls to syscalls. On this basis, the syscall entry path can be streamlined. For details, see http://lkml.kernel.org/r/[email protected] Acked-by: Luis R. Rodriguez <[email protected]> Cc: Al Viro <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Dominik Brodowski <[email protected]>
2018-04-02Merge branch 'perf-core-for-linus' of ↵Linus Torvalds6-194/+975
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "The main kernel side changes were: - Modernize the kprobe and uprobe creation/destruction tooling ABIs: The existing text based APIs (kprobe_events and uprobe_events in tracefs), are naive, limited ABIs in that they require user-space to clean up after themselves, which is both difficult and fragile if the tool is buggy or exits unexpectedly. In other words they are not really suited for modern, robust tooling. So introduce a modern, file descriptor based ABI that does not have these limitations: introduce the 'perf_kprobe' and 'perf_uprobe' PMUs and extend the perf_event_open() syscall to create events with a kprobe/uprobe attached to them. These [k,u]probe are associated with this file descriptor, so they are not available in tracefs. (Song Liu) - Intel Cannon Lake CPU support (Harry Pan) - Intel PT cleanups (Alexander Shishkin) - Improve the performance of pinned/flexible event groups by using RB trees (Alexey Budankov) - Add PERF_EVENT_IOC_MODIFY_ATTRIBUTES which allows the modification of hardware breakpoints, which new ABI variant massively speeds up existing tooling that uses hardware breakpoints to instrument (and debug) memory usage. (Milind Chabbi, Jiri Olsa) - Various Intel PEBS handling fixes and improvements, and other Intel PMU improvements (Kan Liang) - Various perf core improvements and optimizations (Peter Zijlstra) - ... misc cleanups, fixes and updates. There's over 200 tooling commits, here's an (imperfect) list of highlights: - 'perf annotate' improvements: * Recognize and handle jumps to other functions as calls, which improves the navigation along jumps and back. (Arnaldo Carvalho de Melo) * Add the 'P' hotkey in TUI annotation to dump annotation output into a file, to ease e-mail reporting of annotation details. (Arnaldo Carvalho de Melo) * Add an IPC/cycles column to the TUI (Jin Yao) * Improve s390 assembly annotation (Thomas Richter) * Refactor the output formatting logic to better separate it into interactive and non-interactive features and add the --stdio2 output variant to demonstrate this. (Arnaldo Carvalho de Melo) - 'perf script' improvements: * Add Python 3 support (Jaroslav Škarvada) * Add --show-round-event (Jiri Olsa) - 'perf c2c' improvements: * Add NUMA analysis support (Jiri Olsa) - 'perf trace' improvements: * Improve PowerPC support (Ravi Bangoria) - 'perf inject' improvements: * Integrate ARM CoreSight traces (Robert Walker) - 'perf stat' improvements: * Add the --interval-count option (yuzhoujian) * Add the --timeout option (yuzhoujian) - 'perf sched' improvements (Changbin Du) - Vendor events improvements : * Add IBM s390 vendor events (Thomas Richter) * Add and improve arm64 vendor events (John Garry, Ganapatrao Kulkarni) * Update POWER9 vendor events (Sukadev Bhattiprolu) - Intel PT tooling improvements (Adrian Hunter) - PMU handling improvements (Agustin Vega-Frias) - Record machine topology in perf.data (Jiri Olsa) - Various overwrite related cleanups (Kan Liang) - Add arm64 dwarf post unwind support (Kim Phillips, Jean Pihet) - ... and lots of other changes, cleanups and fixes, see the shortlog and Git history for details" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (262 commits) perf/x86/intel: Enable C-state residency events for Cannon Lake perf/x86/intel: Add Cannon Lake support for RAPL profiling perf/x86/pt, coresight: Clean up address filter structure perf vendor events s390: Add JSON files for IBM z14 perf vendor events s390: Add JSON files for IBM z13 perf vendor events s390: Add JSON files for IBM zEC12 zBC12 perf vendor events s390: Add JSON files for IBM z196 perf vendor events s390: Add JSON files for IBM z10EC z10BC perf mmap: Be consistent when checking for an unmaped ring buffer perf mmap: Fix accessing unmapped mmap in perf_mmap__read_done() perf build: Fix check-headers.sh opts assignment perf/x86: Update rdpmc_always_available static key to the modern API perf annotate: Use absolute addresses to calculate jump target offsets perf annotate: Defer searching for comma in raw line till it is needed perf annotate: Support jumping from one function to another perf annotate: Add "_local" to jump/offset validation routines perf python: Reference Py_None before returning it perf annotate: Mark jumps to outher functions with the call arrow perf annotate: Pass function descriptor to its instruction parsing routines perf annotate: No need to calculate notes->start twice ...
2018-04-02Merge branch 'locking-core-for-linus' of ↵Linus Torvalds5-21/+31
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "The main changes in the locking subsystem in this cycle were: - Add the Linux Kernel Memory Consistency Model (LKMM) subsystem, which is an an array of tools in tools/memory-model/ that formally describe the Linux memory coherency model (a.k.a. Documentation/memory-barriers.txt), and also produce 'litmus tests' in form of kernel code which can be directly executed and tested. Here's a high level background article about an earlier version of this work on LWN.net: https://lwn.net/Articles/718628/ The design principles: "There is reason to believe that Documentation/memory-barriers.txt could use some help, and a major purpose of this patch is to provide that help in the form of a design-time tool that can produce all valid executions of a small fragment of concurrent Linux-kernel code, which is called a "litmus test". This tool's functionality is roughly similar to a full state-space search. Please note that this is a design-time tool, not useful for regression testing. However, we hope that the underlying Linux-kernel memory model will be incorporated into other tools capable of analyzing large bodies of code for regression-testing purposes." [...] "A second tool is klitmus7, which converts litmus tests to loadable kernel modules for direct testing. As with herd7, the klitmus7 code is freely available from http://diy.inria.fr/sources/index.html (and via "git" at https://github.com/herd/herdtools7)" [...] Credits go to: "This patch was the result of a most excellent collaboration founded by Jade Alglave and also including Alan Stern, Andrea Parri, and Luc Maranget." ... and to the gents listed in the MAINTAINERS entry: LINUX KERNEL MEMORY CONSISTENCY MODEL (LKMM) M: Alan Stern <[email protected]> M: Andrea Parri <[email protected]> M: Will Deacon <[email protected]> M: Peter Zijlstra <[email protected]> M: Boqun Feng <[email protected]> M: Nicholas Piggin <[email protected]> M: David Howells <[email protected]> M: Jade Alglave <[email protected]> M: Luc Maranget <[email protected]> M: "Paul E. McKenney" <[email protected]> The LKMM project already found several bugs in Linux locking primitives and improved the understanding and the documentation of the Linux memory model all around. - Add KASAN instrumentation to atomic APIs (Dmitry Vyukov) - Add RWSEM API debugging and reorganize the lock debugging Kconfig (Waiman Long) - ... misc cleanups and other smaller changes" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits) locking/Kconfig: Restructure the lock debugging menu locking/Kconfig: Add LOCK_DEBUGGING_SUPPORT to make it more readable locking/rwsem: Add DEBUG_RWSEMS to look for lock/unlock mismatches lockdep: Make the lock debug output more useful locking/rtmutex: Handle non enqueued waiters gracefully in remove_waiter() locking/atomic, asm-generic, x86: Add comments for atomic instrumentation locking/atomic, asm-generic: Add KASAN instrumentation to atomic operations locking/atomic/x86: Switch atomic.h to use atomic-instrumented.h locking/atomic, asm-generic: Add asm-generic/atomic-instrumented.h locking/xchg/alpha: Remove superfluous memory barriers from the _local() variants tools/memory-model: Finish the removal of rb-dep, smp_read_barrier_depends(), and lockless_dereference() tools/memory-model: Add documentation of new litmus test tools/memory-model: Remove mention of docker/gentoo image locking/memory-barriers: De-emphasize smp_read_barrier_depends() some more locking/lockdep: Show unadorned pointers mutex: Drop linkage.h from mutex.h tools/memory-model: Remove rb-dep, smp_read_barrier_depends, and lockless_dereference tools/memory-model: Convert underscores to hyphens tools/memory-model: Add a S lock-based external-view litmus test tools/memory-model: Add required herd7 version to README file ...
2018-04-02Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds10-187/+183
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The main RCU subsystem changes in this cycle were: - Miscellaneous fixes, perhaps most notably removing obsolete code whose only purpose in life was to gather information for the now-removed RCU debugfs facility. Other notable changes include removing NO_HZ_FULL_ALL in favor of the nohz_full kernel boot parameter, minor optimizations for expedited grace periods, some added tracing, creating an RCU-specific workqueue using Tejun's new WQ_MEM_RECLAIM flag, and several cleanups to code and comments. - SRCU cleanups and optimizations. - Torture-test updates, perhaps most notably the adding of ARMv8 support, but also including numerous cleanups and usability fixes" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) rcu: Create RCU-specific workqueues with rescuers torture: Provide more sensible nreader/nwriter defaults for rcuperf torture: Grace periods do not piggyback off of themselves torture: Adjust rcuperf trace processing to allow for workqueues torture: Default jitter off when running rcuperf torture: Specify qemu memory size with --memory argument rcutorture: Add basic ARM64 support to run scripts rcutorture: Update kvm.sh header comment rcutorture: Record which grace-period primitives are tested rcutorture: Re-enable testing of dynamic expediting rcutorture: Avoid fake-writer use of undefined primitives rcutorture: Abstract function and module names rcutorture: Replace multi-instance kzalloc() with kcalloc() rcu: Remove SRCU throttling srcu: Remove dead code in srcu_gp_end() srcu: Reduce scans of srcu_data in counter wrap check srcu: Prevent sdp->srcu_gp_seq_needed_exp counter wrap srcu: Abstract function name rcu: Make expedited RCU CPU selection avoid unnecessary stores rcu: Trace expedited GP delays due to transitioning CPUs ...
2018-04-02Merge branch 'core-core-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc core updates from Ingo Molnar: "Two changes: - add membarriers to Documentation/features/ - fix a minor nit in panic printk formatting" * 'core-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: panic: Add closing panic marker parenthesis Documentation/features, membarriers: Document membarrier-sync-core architecture support Documentation/features: Allow comments in arch features files
2018-04-02Merge branches 'pm-core', 'pm-sleep' and 'acpi-pm'Rafael J. Wysocki1-1/+25
* pm-core: driver core: Introduce device links reference counting PM / wakeirq: Add wakeup name to dedicated wake irqs * pm-sleep: PM / hibernate: Change message when writing to /sys/power/resume PM / hibernate: Make passing hibernate offsets more friendly PCMCIA / PM: Avoid noirq suspend aborts during suspend-to-idle * acpi-pm: ACPI / PM: Fix keyboard wakeup from suspend-to-idle on ASUS UX331UA ACPI / PM: Allow deeper wakeup power states with no _SxD nor _SxW ACPI / PM: Reduce LPI constraints logging noise ACPI / PM: Do not reconfigure GPEs for suspend-to-idle
2018-03-31locking/rwsem: Add DEBUG_RWSEMS to look for lock/unlock mismatchesWaiman Long2-1/+11
For a rwsem, locking can either be exclusive or shared. The corresponding exclusive or shared unlock must be used. Otherwise, the protected data structures may get corrupted or the lock may be in an inconsistent state. In order to detect such anomaly, a new configuration option DEBUG_RWSEMS is added which can be enabled to look for such mismatches and print warnings that that happens. Signed-off-by: Waiman Long <[email protected]> Acked-by: Davidlohr Bueso <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-03-31Merge branch 'linus' into locking/core, to pick up fixesIngo Molnar16-84/+156
Signed-off-by: Ingo Molnar <[email protected]>
2018-03-30PM / hibernate: Change message when writing to /sys/power/resumeMario Limonciello1-1/+1
This file is used both for setting the wakeup device without kernel command line as well as for actually waking the system (when appropriate swap header is in place). To avoid confusion on incorrect logs in system log downgrade the message to debug and make it clearer. Signed-off-by: Mario Limonciello <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2018-03-30PM / hibernate: Make passing hibernate offsets more friendlyMario Limonciello1-0/+24
Currently the only way to specify a hibernate offset for a swap file is on the kernel command line. Add a new /sys/power/resume_offset that lets userspace specify the offset and disk to use when initiating a hibernate cycle. Signed-off-by: Mario Limonciello <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2018-03-29perf/x86/pt, coresight: Clean up address filter structureAlexander Shishkin1-8/+18
This is a cosmetic patch that deals with the address filter structure's ambiguous fields 'filter' and 'range'. The former stands to mean that the filter's *action* should be to filter the traces to its address range if it's set or stop tracing if it's unset. This is confusing and hard on the eyes, so this patch replaces it with 'action' enum. The 'range' field is completely redundant (meaning that the filter is an address range as opposed to a single address trigger), as we can use zero size to mean the same thing. Signed-off-by: Alexander Shishkin <[email protected]> Acked-by: Mathieu Poirier <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Cc: Will Deacon <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-03-29Merge branch 'perf/urgent' into perf/coreIngo Molnar16-91/+144
Conflicts: kernel/events/hw_breakpoint.c Signed-off-by: Ingo Molnar <[email protected]>
2018-03-29lockdep: Make the lock debug output more usefulTetsuo Handa1-2/+2
The lock debug output in print_lock() has a few shortcomings: - It prints the hlock->acquire_ip field in %px and %pS format. That's redundant information. - It lacks information about the lock object itself. The lock class is not helpful to identify a particular instance of a lock. Change the output so it prints: - hlock->instance to allow identification of a particular lock instance. - only the %pS format of hlock->ip_acquire which is sufficient to decode the actual code line with faddr2line. The resulting output is: 3 locks held by a.out/31106: #0: 00000000b0f753ba (&mm->mmap_sem){++++}, at: copy_process.part.41+0x10d5/0x1fe0 #1: 00000000ef64d539 (&mm->mmap_sem/1){+.+.}, at: copy_process.part.41+0x10fe/0x1fe0 #2: 00000000b41a282e (&mapping->i_mmap_rwsem){++++}, at: copy_process.part.41+0x12f2/0x1fe0 [ tglx: Massaged changelog ] Signed-off-by: Tetsuo Handa <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: David Rientjes <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-03-28locking/rtmutex: Handle non enqueued waiters gracefully in remove_waiter()Peter Zijlstra2-7/+7
In -RT task_blocks_on_rt_mutex() may return with -EAGAIN due to (->pi_blocked_on == PI_WAKEUP_INPROGRESS) before it added itself as a waiter. In such a case remove_waiter() must not be called because without a waiter it will trigger the BUG_ON() statement. This was initially reported by Yimin Deng. Thomas Gleixner fixed it then with an explicit check for waiters before calling remove_waiter(). Instead of an explicit NULL check before calling rt_mutex_top_waiter() make the function return NULL if there are no waiters. With that fixed the now pointless NULL check is removed from rt_mutex_slowlock(). Reported-and-debugged-by: Yimin Deng <[email protected]> Suggested-by: Thomas Gleixner <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/CAAh1qt=DCL9aUXNxanP5BKtiPp3m+qj4yB+gDohhXPVFCxWwzg@mail.gmail.com Link: https://lkml.kernel.org/r/[email protected]
2018-03-28perf/hwbp: Simplify the perf-hwbp code, fix documentationLinus Torvalds1-23/+7
Annoyingly, modify_user_hw_breakpoint() unnecessarily complicates the modification of a breakpoint - simplify it and remove the pointless local variables. Also update the stale Docbook while at it. Signed-off-by: Linus Torvalds <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Cc: <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Vince Weaver <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>