aboutsummaryrefslogtreecommitdiff
path: root/arch/powerpc/kernel/time.c
AgeCommit message (Collapse)AuthorFilesLines
2020-12-17Merge tag 'powerpc-5.11-1' of ↵Linus Torvalds1-96/+4
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Switch to the generic C VDSO, as well as some cleanups of our VDSO setup/handling code. - Support for KUAP (Kernel User Access Prevention) on systems using the hashed page table MMU, using memory protection keys. - Better handling of PowerVM SMT8 systems where all threads of a core do not share an L2, allowing the scheduler to make better scheduling decisions. - Further improvements to our machine check handling. - Show registers when unwinding interrupt frames during stack traces. - Improvements to our pseries (PowerVM) partition migration code. - Several series from Christophe refactoring and cleaning up various parts of the 32-bit code. - Other smaller features, fixes & cleanups. Thanks to: Alan Modra, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Ard Biesheuvel, Athira Rajeev, Balamuruhan S, Bill Wendling, Cédric Le Goater, Christophe Leroy, Christophe Lombard, Colin Ian King, Daniel Axtens, David Hildenbrand, Frederic Barrat, Ganesh Goudar, Gautham R. Shenoy, Geert Uytterhoeven, Giuseppe Sacco, Greg Kurz, Harish, Jan Kratochvil, Jordan Niethe, Kaixu Xia, Laurent Dufour, Leonardo Bras, Madhavan Srinivasan, Mahesh Salgaonkar, Mathieu Desnoyers, Nathan Lynch, Nicholas Piggin, Oleg Nesterov, Oliver O'Halloran, Oscar Salvador, Po-Hsu Lin, Qian Cai, Qinglang Miao, Randy Dunlap, Ravi Bangoria, Sachin Sant, Sandipan Das, Sebastian Andrzej Siewior , Segher Boessenkool, Srikar Dronamraju, Tyrel Datwyler, Uwe Kleine-König, Vincent Stehlé, Youling Tang, and Zhang Xiaoxu. * tag 'powerpc-5.11-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (304 commits) powerpc/32s: Fix cleanup_cpu_mmu_context() compile bug powerpc: Add config fragment for disabling -Werror powerpc/configs: Add ppc64le_allnoconfig target powerpc/powernv: Rate limit opal-elog read failure message powerpc/pseries/memhotplug: Quieten some DLPAR operations powerpc/ps3: use dma_mapping_error() powerpc: force inlining of csum_partial() to avoid multiple csum_partial() with GCC10 powerpc/perf: Fix Threshold Event Counter Multiplier width for P10 powerpc/mm: Fix hugetlb_free_pmd_range() and hugetlb_free_pud_range() KVM: PPC: Book3S HV: Fix mask size for emulated msgsndp KVM: PPC: fix comparison to bool warning KVM: PPC: Book3S: Assign boolean values to a bool variable powerpc: Inline setup_kup() powerpc/64s: Mark the kuap/kuep functions non __init KVM: PPC: Book3S HV: XIVE: Add a comment regarding VP numbering powerpc/xive: Improve error reporting of OPAL calls powerpc/xive: Simplify xive_do_source_eoi() powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_EOI_FW powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_MASK_FW powerpc/xive: Remove P9 DD1 flag XIVE_IRQ_FLAG_SHIFT_BUG ...
2020-12-09powerpc/64: irq replay remove decrementer overflow checkNicholas Piggin1-6/+3
This is way to catch some cases of decrementer overflow, when the decrementer has underflowed an odd number of times, while MSR[EE] was disabled. With a typical small decrementer, a timer that fires when MSR[EE] is disabled will be "lost" if MSR[EE] remains disabled for between 4.3 and 8.6 seconds after the timer expires. In any case, the decrementer interrupt would be taken at 8.6 seconds and the timer would be found at that point. So this check is for catching extreme latency events, and it prevents those latencies from being a further few seconds long. It's not obvious this is a good tradeoff. This is already a watchdog magnitude event and that situation is not improved a significantly with this check. For large decrementers, it's useless. Therefore remove this check, which avoids a mftb when enabling hard disabled interrupts (e.g., when enabling after coming from hardware interrupt handlers). Perhaps more importantly, it also removes the clunky MSR[EE] vs PACA_IRQ_HARD_DIS incoherency in soft-interrupt replay which simplifies the code. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04powerpc/vdso: Switch VDSO to generic C implementation.Christophe Leroy1-90/+1
With the C VDSO, the performance is slightly lower, but it is worth it as it will ease maintenance and evolution, and also brings clocks that are not supported with the ASM VDSO. On an 8xx at 132 MHz, vdsotest with the ASM VDSO: gettimeofday: vdso: 828 nsec/call clock-getres-realtime-coarse: vdso: 391 nsec/call clock-gettime-realtime-coarse: vdso: 614 nsec/call clock-getres-realtime: vdso: 460 nsec/call clock-gettime-realtime: vdso: 876 nsec/call clock-getres-monotonic-coarse: vdso: 399 nsec/call clock-gettime-monotonic-coarse: vdso: 691 nsec/call clock-getres-monotonic: vdso: 460 nsec/call clock-gettime-monotonic: vdso: 1026 nsec/call On an 8xx at 132 MHz, vdsotest with the C VDSO: gettimeofday: vdso: 955 nsec/call clock-getres-realtime-coarse: vdso: 545 nsec/call clock-gettime-realtime-coarse: vdso: 592 nsec/call clock-getres-realtime: vdso: 545 nsec/call clock-gettime-realtime: vdso: 941 nsec/call clock-getres-monotonic-coarse: vdso: 545 nsec/call clock-gettime-monotonic-coarse: vdso: 591 nsec/call clock-getres-monotonic: vdso: 545 nsec/call clock-gettime-monotonic: vdso: 940 nsec/call It is even better for gettime with monotonic clocks. Unsupported clocks with ASM VDSO: clock-gettime-boottime: vdso: 3851 nsec/call clock-gettime-tai: vdso: 3852 nsec/call clock-gettime-monotonic-raw: vdso: 3396 nsec/call Same clocks with C VDSO: clock-gettime-tai: vdso: 941 nsec/call clock-gettime-monotonic-raw: vdso: 1001 nsec/call clock-gettime-monotonic-coarse: vdso: 591 nsec/call On an 8321E at 333 MHz, vdsotest with the ASM VDSO: gettimeofday: vdso: 220 nsec/call clock-getres-realtime-coarse: vdso: 102 nsec/call clock-gettime-realtime-coarse: vdso: 178 nsec/call clock-getres-realtime: vdso: 129 nsec/call clock-gettime-realtime: vdso: 235 nsec/call clock-getres-monotonic-coarse: vdso: 105 nsec/call clock-gettime-monotonic-coarse: vdso: 208 nsec/call clock-getres-monotonic: vdso: 129 nsec/call clock-gettime-monotonic: vdso: 274 nsec/call On an 8321E at 333 MHz, vdsotest with the C VDSO: gettimeofday: vdso: 272 nsec/call clock-getres-realtime-coarse: vdso: 160 nsec/call clock-gettime-realtime-coarse: vdso: 184 nsec/call clock-getres-realtime: vdso: 166 nsec/call clock-gettime-realtime: vdso: 281 nsec/call clock-getres-monotonic-coarse: vdso: 160 nsec/call clock-gettime-monotonic-coarse: vdso: 184 nsec/call clock-getres-monotonic: vdso: 169 nsec/call clock-gettime-monotonic: vdso: 275 nsec/call On a Power9 Nimbus DD2.2 at 3.8GHz, with the ASM VDSO: clock-gettime-monotonic: vdso: 35 nsec/call clock-getres-monotonic: vdso: 16 nsec/call clock-gettime-monotonic-coarse: vdso: 18 nsec/call clock-getres-monotonic-coarse: vdso: 522 nsec/call clock-gettime-monotonic-raw: vdso: 598 nsec/call clock-getres-monotonic-raw: vdso: 520 nsec/call clock-gettime-realtime: vdso: 34 nsec/call clock-getres-realtime: vdso: 16 nsec/call clock-gettime-realtime-coarse: vdso: 18 nsec/call clock-getres-realtime-coarse: vdso: 517 nsec/call getcpu: vdso: 8 nsec/call gettimeofday: vdso: 25 nsec/call And with the C VDSO: clock-gettime-monotonic: vdso: 37 nsec/call clock-getres-monotonic: vdso: 20 nsec/call clock-gettime-monotonic-coarse: vdso: 21 nsec/call clock-getres-monotonic-coarse: vdso: 19 nsec/call clock-gettime-monotonic-raw: vdso: 38 nsec/call clock-getres-monotonic-raw: vdso: 20 nsec/call clock-gettime-realtime: vdso: 37 nsec/call clock-getres-realtime: vdso: 20 nsec/call clock-gettime-realtime-coarse: vdso: 20 nsec/call clock-getres-realtime-coarse: vdso: 19 nsec/call getcpu: vdso: 8 nsec/call gettimeofday: vdso: 28 nsec/call Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-02sched/vtime: Consolidate IRQ time accountingFrederic Weisbecker1-16/+40
The 3 architectures implementing CONFIG_VIRT_CPU_ACCOUNTING_NATIVE all have their own version of irq time accounting that dispatch the cputime to the appropriate index: hardirq, softirq, system, idle, guest... from an all-in-one function. Instead of having these ad-hoc versions, move the cputime destination dispatch decision to the core code and leave only the actual per-index cputime accounting to the architecture. Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-10-08powerpc/time: Avoid using get_tbl() and get_tbu() internallyChristophe Leroy1-2/+2
get_tbl() is confusing as it returns the content of TBL register on PPC32 but the concatenation of TBL and TBU on PPC64. Use mftb() instead. Do the same with get_tbu() for consistency allthough it's name is less confusing. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/41573406a4eab98838decaa91649086fef1e6119.1601556145.git.christophe.leroy@csgroup.eu
2020-10-08powerpc: Remove get_tb_or_rtc()Christophe Leroy1-3/+3
601 is gone, get_tb_or_rtc() is equivalent to get_tb(). Replace the former by the later. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/3e8a13ee83418630c753c30cb722ae682d5b2d39.1601362098.git.christophe.leroy@csgroup.eu
2020-10-08powerpc: Remove __USE_RTC()Christophe Leroy1-44/+8
Now that PowerPC 601 is gone, __USE_RTC() is never true. Remove it. That also leads to removing get_rtc() and get_rtcl() Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/4757e1ed21fe1968c761ae081d1f3d790a9673f8.1601362098.git.christophe.leroy@csgroup.eu
2020-07-10powerpc64: Break asm/percpu.h vs spinlock_types.h dependencyPeter Zijlstra1-0/+2
In order to use <asm/percpu.h> in lockdep.h, we need to make sure asm/percpu.h does not itself depend on lockdep. The below seems to make that so and builds powerpc64-defconfig + PROVE_LOCKING. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Ingo Molnar <[email protected]> https://lkml.kernel.org/r/[email protected]
2020-04-03Revert "powerpc/64: irq_work avoid interrupt when called with hardware irqs ↵Nicholas Piggin1-31/+13
enabled" This reverts commit ebb37cf3ffd39fdb6ec5b07111f8bb2f11d92c5f. That commit does not play well with soft-masked irq state manipulations in idle, interrupt replay, and possibly others due to tracing code sometimes using irq_work_queue (e.g., in trace_hardirqs_on()). That can cause PACA_IRQ_DEC to become set when it is not expected, and be ignored or cleared or cause warnings. The net result seems to be missing an irq_work until the next timer interrupt in the worst case which is usually not going to be noticed, however it could be a long time if the tick is disabled, which is against the spirit of irq_work and might cause real problems. The idea is still solid, but it would need more work. It's not really clear if it would be worth added complexity, so revert this for now (not a straight revert, but replace with a comment explaining why we might see interrupts happening, and gives git blame something to find). Fixes: ebb37cf3ffd3 ("powerpc/64: irq_work avoid interrupt when called with hardware irqs enabled") Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-04-03powerpc/time: Replace <linux/clk-provider.h> by <linux/of_clk.h>Geert Uytterhoeven1-3/+1
The PowerPC time code is not a clock provider, and just needs to call of_clk_init(). Hence it can include <linux/of_clk.h> instead of <linux/clk-provider.h>. Remove the #ifdef protecting the of_clk_init() call, as a stub is available for the !CONFIG_COMMON_CLK case. Signed-off-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Stephen Boyd <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-04-01powerpc/64s/exception: Avoid touching the stack in hdecrementerNicholas Piggin1-9/+0
The hdec interrupt handler is reported to sometimes fire in Linux if KVM leaves it pending after a guest exists. This is harmless, so there is a no-op handler for it. The interrupt handler currently uses the regular kernel stack. Change this to avoid touching the stack entirely. This should be the last place where the regular Linux stack can be accessed with asynchronous interrupts (including PMI) soft-masked. It might be possible to take advantage of this invariant, e.g., to context switch the kernel stack SLB entry without clearing MSR[EE]. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2019-12-05powerpc: Fix vDSO clock_getres()Vincenzo Frascino1-0/+1
clock_getres in the vDSO library has to preserve the same behaviour of posix_get_hrtimer_res(). In particular, posix_get_hrtimer_res() does: sec = 0; ns = hrtimer_resolution; and hrtimer_resolution depends on the enablement of the high resolution timers that can happen either at compile or at run time. Fix the powerpc vdso implementation of clock_getres keeping a copy of hrtimer_resolution in vdso data and using that directly. Fixes: a7f290dad32e ("[PATCH] powerpc: Merge vdso's and add vdso support to 32 bits kernel") Cc: [email protected] Signed-off-by: Vincenzo Frascino <[email protected]> Reviewed-by: Christophe Leroy <[email protected]> Acked-by: Shuah Khan <[email protected]> [chleroy: changed CLOCK_REALTIME_RES to CLOCK_HRTIMER_RES] Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/a55eca3a5e85233838c2349783bcb5164dae1d09.1575273217.git.christophe.leroy@c-s.fr
2019-12-01Merge tag 'y2038-cleanups-5.5' of ↵Linus Torvalds1-2/+3
git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground Pull y2038 cleanups from Arnd Bergmann: "y2038 syscall implementation cleanups This is a series of cleanups for the y2038 work, mostly intended for namespace cleaning: the kernel defines the traditional time_t, timeval and timespec types that often lead to y2038-unsafe code. Even though the unsafe usage is mostly gone from the kernel, having the types and associated functions around means that we can still grow new users, and that we may be missing conversions to safe types that actually matter. There are still a number of driver specific patches needed to get the last users of these types removed, those have been submitted to the respective maintainers" Link: https://lore.kernel.org/lkml/[email protected]/ * tag 'y2038-cleanups-5.5' of git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground: (26 commits) y2038: alarm: fix half-second cut-off y2038: ipc: fix x32 ABI breakage y2038: fix typo in powerpc vdso "LOPART" y2038: allow disabling time32 system calls y2038: itimer: change implementation to timespec64 y2038: move itimer reset into itimer.c y2038: use compat_{get,set}_itimer on alpha y2038: itimer: compat handling to itimer.c y2038: time: avoid timespec usage in settimeofday() y2038: timerfd: Use timespec64 internally y2038: elfcore: Use __kernel_old_timeval for process times y2038: make ns_to_compat_timeval use __kernel_old_timeval y2038: socket: use __kernel_old_timespec instead of timespec y2038: socket: remove timespec reference in timestamping y2038: syscalls: change remaining timeval to __kernel_old_timeval y2038: rusage: use __kernel_old_timeval y2038: uapi: change __kernel_time_t to __kernel_old_time_t y2038: stat: avoid 'time_t' in 'struct stat' y2038: ipc: remove __kernel_time_t reference from headers y2038: vdso: powerpc: avoid timespec references ...
2019-11-30Merge tag 'powerpc-5.5-1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Highlights: - Infrastructure for secure boot on some bare metal Power9 machines. The firmware support is still in development, so the code here won't actually activate secure boot on any existing systems. - A change to xmon (our crash handler / pseudo-debugger) to restrict it to read-only mode when the kernel is lockdown'ed, otherwise it's trivial to drop into xmon and modify kernel data, such as the lockdown state. - Support for KASLR on 32-bit BookE machines (Freescale / NXP). - Fixes for our flush_icache_range() and __kernel_sync_dicache() (VDSO) to work with memory ranges >4GB. - Some reworks of the pseries CMM (Cooperative Memory Management) driver to make it behave more like other balloon drivers and enable some cleanups of generic mm code. - A series of fixes to our hardware breakpoint support to properly handle unaligned watchpoint addresses. Plus a bunch of other smaller improvements, fixes and cleanups. Thanks to: Alastair D'Silva, Andrew Donnellan, Aneesh Kumar K.V, Anthony Steinhauser, Cédric Le Goater, Chris Packham, Chris Smart, Christophe Leroy, Christopher M. Riedl, Christoph Hellwig, Claudio Carvalho, Daniel Axtens, David Hildenbrand, Deb McLemore, Diana Craciun, Eric Richter, Geert Uytterhoeven, Greg Kroah-Hartman, Greg Kurz, Gustavo L. F. Walbon, Hari Bathini, Harish, Jason Yan, Krzysztof Kozlowski, Leonardo Bras, Mathieu Malaterre, Mauro S. M. Rodrigues, Michal Suchanek, Mimi Zohar, Nathan Chancellor, Nathan Lynch, Nayna Jain, Nick Desaulniers, Oliver O'Halloran, Qian Cai, Rasmus Villemoes, Ravi Bangoria, Sam Bobroff, Santosh Sivaraj, Scott Wood, Thomas Huth, Tyrel Datwyler, Vaibhav Jain, Valentin Longchamp, YueHaibing" * tag 'powerpc-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (144 commits) powerpc/fixmap: fix crash with HIGHMEM x86/efi: remove unused variables powerpc: Define arch_is_kernel_initmem_freed() for lockdep powerpc/prom_init: Use -ffreestanding to avoid a reference to bcmp powerpc: Avoid clang warnings around setjmp and longjmp powerpc: Don't add -mabi= flags when building with Clang powerpc: Fix Kconfig indentation powerpc/fixmap: don't clear fixmap area in paging_init() selftests/powerpc: spectre_v2 test must be built 64-bit powerpc/powernv: Disable native PCIe port management powerpc/kexec: Move kexec files into a dedicated subdir. powerpc/32: Split kexec low level code out of misc_32.S powerpc/sysdev: drop simple gpio powerpc/83xx: map IMMR with a BAT. powerpc/32s: automatically allocate BAT in setbat() powerpc/ioremap: warn on early use of ioremap() powerpc: Add support for GENERIC_EARLY_IOREMAP powerpc/fixmap: Use __fix_to_virt() instead of fix_to_virt() powerpc/8xx: use the fixmapped IMMR in cpm_reset() powerpc/8xx: add __init to cpm1 init functions ...
2019-11-15y2038: vdso: powerpc: avoid timespec referencesArnd Bergmann1-2/+3
As a preparation to stop using 'struct timespec' in the kernel, change the powerpc vdso implementation: - split up the vdso data definition to have equivalent members for seconds and nanoseconds instead of an xtime structure - use timespec64 as an intermediate for the xtime update - change the asm-offsets definition to be based the appropriate fixed-length types This is only a temporary fix for changing the types, in order to actually support a 64-bit safe vdso32 version of clock_gettime(), the entire powerpc vdso should be replaced with the generic lib/vdso/ implementation. If that happens first, this patch becomes obsolete. Signed-off-by: Arnd Bergmann <[email protected]>
2019-10-28powerpc/pseries: Mark accumulate_stolen_time() as notraceMichael Ellerman1-1/+1
accumulate_stolen_time() is called prior to interrupt state being reconciled, which can trip the warning in arch_local_irq_restore(): WARNING: CPU: 5 PID: 1017 at arch/powerpc/kernel/irq.c:258 .arch_local_irq_restore+0x9c/0x130 ... NIP .arch_local_irq_restore+0x9c/0x130 LR .rb_start_commit+0x38/0x80 Call Trace: .ring_buffer_lock_reserve+0xe4/0x620 .trace_function+0x44/0x210 .function_trace_call+0x148/0x170 .ftrace_ops_no_ops+0x180/0x1d0 ftrace_call+0x4/0x8 .accumulate_stolen_time+0x1c/0xb0 decrementer_common+0x124/0x160 For now just mark it as notrace. We may change the ordering to call it after interrupt state has been reconciled, but that is a larger change. Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2019-10-09sched/cputime: Rename vtime_account_system() to vtime_account_kernel()Frederic Weisbecker1-3/+3
vtime_account_system() decides if we need to account the time to the system (__vtime_account_system()) or to the guest (vtime_account_guest()). So this function is a misnomer as we are on a higher level than "system". All we know when we call that function is that we are accounting kernel cputime. Whether it belongs to guest or system time is a lower level detail. Rename this function to vtime_account_kernel(). This will clarify things and avoid too many underscored vtime_account_system() versions. Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wanpeng Li <[email protected]> Cc: Yauheni Kaliuta <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-05-30treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152Thomas Gleixner1-5/+1
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license as published by the free software foundation either version 2 of the license or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 3029 file(s). Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Allison Randal <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-05-03powerpc/hmi: Fix kernel hang when TB is in error state.Mahesh Salgaonkar1-0/+9
On TOD/TB errors timebase register stops/freezes until HMI error recovery gets TOD/TB back into running state. On successful recovery, TB starts running again and udelay() that relies on TB value continues to function properly. But in case when HMI fails to recover from TOD/TB errors, the TB register stay freezed. With TB not running the __delay() function keeps looping and never return. If __delay() is called while in panic path then system hangs and never reboots after panic. Signed-off-by: Mahesh Salgaonkar <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-04-20powerpc: Remove duplicate headersJagadeesh Pagadala1-1/+0
Remove duplicate headers inclusions. Signed-off-by: Jagadeesh Pagadala <[email protected]> Reviewed-by: Mukesh Ojha <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-01-30powerpc/kernel/time: Remove duplicate headerBrajeswar Ghosh1-1/+0
Remove linux/rtc.h which is included more than once Signed-off-by: Brajeswar Ghosh <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-10-20powerpc/time: no steal_time when CONFIG_PPC_SPLPAR is not selectedChristophe Leroy1-2/+3
If CONFIG_PPC_SPLPAR is not selected, steal_time will always be NUL, so accounting it is pointless Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-10-20powerpc/time: Only set CONFIG_ARCH_HAS_SCALED_CPUTIME on PPC64Christophe Leroy1-2/+10
scaled cputime is only meaningfull when the processor has SPURR and/or PURR, which means only on PPC64. Removing it on PPC32 significantly reduces the size of vtime_account_system() and vtime_account_idle() on an 8xx: Before: 00000000 l F .text 000000a8 vtime_delta 00000280 g F .text 0000010c vtime_account_system 0000038c g F .text 00000048 vtime_account_idle After: (vtime_delta gets inlined inside the two functions) 000001d8 g F .text 000000a0 vtime_account_system 00000278 g F .text 00000038 vtime_account_idle In terms of performance, we also get approximatly 7% improvement on task switch. The following small benchmark app is run with perf stat: void *thread(void *arg) { int i; for (i = 0; i < atoi((char*)arg); i++) pthread_yield(); } int main(int argc, char **argv) { pthread_t th1, th2; pthread_create(&th1, NULL, thread, argv[1]); pthread_create(&th2, NULL, thread, argv[1]); pthread_join(th1, NULL); pthread_join(th2, NULL); return 0; } Before the patch: Performance counter stats for 'chrt -f 98 ./sched 100000' (50 runs): 8228.476465 task-clock (msec) # 0.954 CPUs utilized ( +- 0.23% ) 200004 context-switches # 0.024 M/sec ( +- 0.00% ) After the patch: Performance counter stats for 'chrt -f 98 ./sched 100000' (50 runs): 7649.070444 task-clock (msec) # 0.955 CPUs utilized ( +- 0.27% ) 200004 context-switches # 0.026 M/sec ( +- 0.00% ) Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-10-20powerpc/time: isolate scaled cputime accounting in dedicated functions.Christophe Leroy1-26/+43
scaled cputime is only meaningfull when the processor has SPURR and/or PURR, which means only on PPC64. In preparation of the following patch that will remove CONFIG_ARCH_HAS_SCALED_CPUTIME on PPC32, this patch moves all scaled cputing accounting logic into dedicated functions. This patch doesn't change any functionality. It's only code reorganisation. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-10-19powerpc/time: Fix clockevent_decrementer initalisation for PR KVMMichael Ellerman1-0/+4
In the recent commit 8b78fdb045de ("powerpc/time: Use clockevents_register_device(), fixing an issue with large decrementer") we changed the way we initialise the decrementer clockevent(s). We no longer initialise the mult & shift values of decrementer_clockevent itself. This has the effect of breaking PR KVM, because it uses those values in kvmppc_emulate_dec(). The symptom is guest kernels spin forever mid-way through boot. For now fix it by assigning back to decrementer_clockevent the mult and shift values. Fixes: 8b78fdb045de ("powerpc/time: Use clockevents_register_device(), fixing an issue with large decrementer") Signed-off-by: Michael Ellerman <[email protected]>
2018-10-04powerpc/time: Add set_state_oneshot_stopped decrementer callbackAnton Blanchard1-0/+1
If CONFIG_PPC_WATCHDOG is enabled we always cap the decrementer to 0x7fffffff: if (IS_ENABLED(CONFIG_PPC_WATCHDOG)) set_dec(0x7fffffff); else set_dec(decrementer_max); If there are no future events, we don't reprogram the decrementer after this and we end up with 0x7fffffff even on a large decrementer capable system. As suggested by Nick, add a set_state_oneshot_stopped callback so we program the decrementer with decrementer_max if there are no future events. Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-10-04powerpc/time: Use clockevents_register_device(), fixing an issue with large ↵Anton Blanchard1-14/+3
decrementer We currently cap the decrementer clockevent at 4 seconds, even on systems with large decrementer support. Fix this by converting the code to use clockevents_register_device() which calculates the upper bound based on the max_delta passed in. Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-06-04powerpc/time: inline arch_vtime_task_switch()Christophe Leroy1-21/+0
arch_vtime_task_switch() is a small function which is called only from vtime_common_task_switch(), so it is worth inlining Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-06-03powerpc: remove unused to_tm() helperArnd Bergmann1-50/+0
to_tm() is now completely unused, the only reference being in the _dump_time() helper that is also unused. This removes both, leaving the rest of the powerpc RTC code y2038 safe to as far as the hardware supports. Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-06-03powerpc: use time64_t in update_persistent_clockArnd Bergmann1-4/+2
update_persistent_clock() is deprecated because it suffers from overflow in 2038 on 32-bit architectures. This changes powerpc to use the update_persistent_clock64() replacement, and to pass down 64-bit timestamps consistently. This is now simpler, as we no longer have to worry about the offset numbers in tm_year and tm_mon that are different between the Linux conventions and RTAS. Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-06-03powerpc: use time64_t in read_persistent_clockArnd Bergmann1-4/+3
Looking through the remaining users of the deprecated mktime() function, I found the powerpc rtc handlers, which use it in place of rtc_tm_to_time64(). To clean this up, I'm changing over the read_persistent_clock() function to the read_persistent_clock64() variant, and change all the platform specific handlers along with it. Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-06-03powerpc/time: account broadcast timer event interrupts separatelyNicholas Piggin1-4/+1
These are not local timer interrupts but IPIs. It's good to be able to see how timer offloading is behaving, so split these out into their own category. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-06-03powerpc: move timer broadcast code under GENERIC_CLOCKEVENTS_BROADCAST ifdefNicholas Piggin1-0/+2
Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-06-03powerpc: allow soft-NMI watchdog to cover timer interrupts with large ↵Nicholas Piggin1-6/+13
decrementers Large decrementers (e.g., POWER9) can take a very long time to wrap, so when the timer iterrupt handler sets the decrementer to max so as to avoid taking another decrementer interrupt when hard enabling interrupts before running timers, it effectively disables the soft NMI coverage for timer interrupts. Fix this by using the traditional 31-bit value instead, which wraps after a few seconds. masked interrupt code does the same thing, and in normal operation neither of these paths would ever wrap even the 31 bit value. Note: the SMP watchdog should catch timer interrupt lockups, but it is preferable for the local soft-NMI to catch them, mainly to avoid the IPI. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-06-03powerpc: generic clockevents broadcast receiver call tick_receive_broadcastNicholas Piggin1-45/+39
The broadcast tick recipient can call tick_receive_broadcast rather than re-running the full timer interrupt. It does not have to check for the next event time, because the sender already determined the timer has expired. It does not have to test irq_work_pending, because that's a direct decrementer interrupt and does not go through the clock events subsystem. And it does not have to read PURR because that was removed with the previous patch. This results in no code size change, but both the decrementer and broadcast path lengths are reduced. Cc: Srivatsa S. Bhat <[email protected]> Cc: Preeti U Murthy <[email protected]> Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-06-03powerpc/pseries: lparcfg calculate PURR on demandNicholas Piggin1-8/+0
For SPLPAR, lparcfg provides a sum of PURR registers for all CPUs. Currently this is done by reading PURR in context switch and timer interrupt, and storing that into a per-CPU variable. These are summed to provide the value. This does not work with all timer schemes (e.g., NO_HZ_FULL), and it is sub-optimal for performance because it reads the PURR register on every context switch, although that's been difficult to distinguish from noise in the contxt_switch microbenchmark. This patch implements the sum by calling a function on each CPU, to read and add PURR values of each CPU. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-06-03powerpc/64: irq_work avoid interrupt when called with hardware irqs enabledNicholas Piggin1-2/+31
irq_work_raise should not cause a decrementer exception unless it is called from NMI context. Doing so often just results in an immediate masked decrementer interrupt: <...>-550 90d... 4us : update_curr_rt <-dequeue_task_rt <...>-550 90d... 5us : dbs_update_util_handler <-update_curr_rt <...>-550 90d... 6us : arch_irq_work_raise <-irq_work_queue <...>-550 90d... 7us : soft_nmi_interrupt <-soft_nmi_common <...>-550 90d... 7us : printk_nmi_enter <-soft_nmi_interrupt <...>-550 90d.Z. 8us : rcu_nmi_enter <-soft_nmi_interrupt <...>-550 90d.Z. 9us : rcu_nmi_exit <-soft_nmi_interrupt <...>-550 90d... 9us : printk_nmi_exit <-soft_nmi_interrupt <...>-550 90d... 10us : cpuacct_charge <-update_curr_rt The soft_nmi_interrupt here is the call into the watchdog, due to the decrementer interrupt firing with irqs soft-disabled. This is harmless, but sub-optimal. When it's not called from NMI context or with interrupts enabled, mark the decrementer pending in the irq_happened mask directly, rather than having the masked decrementer interupt handler do it. This will be replayed at the next local_irq_enable. See the comment for details. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-04-03powerpc: Fix oops due to bad access of lppaca on bare metalAneesh Kumar K.V1-0/+3
Commit 8e0b634b1327 ("powerpc/64s: Do not allocate lppaca if we are not virtualized") removed allocation of lppaca on bare metal platforms. But with CONFIG_PPC_SPLPAR enabled, we still access the lppaca on bare metal in some code paths. Fix this but adding runtime checks for SPLPAR (shared processor LPAR). Fixes: 8e0b634b1327 ("powerpc/64s: Do not allocate lppaca if we are not virtualized") Signed-off-by: Aneesh Kumar K.V <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-03-14powerpc/time: stop validating rtc_time in .read_timeAlexandre Belloni1-1/+1
The RTC core is always calling rtc_valid_tm after the read_time callback. It is not necessary to call it just before returning from the callback. Signed-off-by: Alexandre Belloni <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-01-19powerpc/64: Rename soft_enabled to irq_soft_maskMadhavan Srinivasan1-3/+3
Rename the paca->soft_enabled to paca->irq_soft_mask as it is no longer used as a flag for interrupt state, but a mask. Signed-off-by: Madhavan Srinivasan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-01-19powerpc/64: Implement and use soft_enabled_return APIMadhavan Srinivasan1-1/+1
Add a new wrapper function, soft_enabled_return(), added to return paca->soft_enabled value. Signed-off-by: Madhavan Srinivasan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-01-19powerpc/64: Move set_soft_enabled() and renameMadhavan Srinivasan1-2/+2
Move set_soft_enabled() from powerpc/kernel/irq.c to asm/hw_irq.c, to encourage updates to paca->soft_enabled done via these access function. Add "memory" clobber to hint compiler since paca->soft_enabled memory is the target here. Renaming it as soft_enabled_set() will make namespaces works better as prefix than a postfix when new soft_enabled manipulation functions are introduced. Reviewed-by: Nicholas Piggin <[email protected]> Signed-off-by: Madhavan Srinivasan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-01-19powerpc/64: Add #defines for paca->soft_enabled flagsMadhavan Srinivasan1-1/+1
Two #defines IRQS_ENABLED and IRQS_DISABLED are added to be used when updating paca->soft_enabled. Replace the hardcoded values used when updating paca->soft_enabled with IRQ_(EN|DIS)ABLED #define. No logic change. Reviewed-by: Nicholas Piggin <[email protected]> Signed-off-by: Madhavan Srinivasan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-07-02powerpc: use spin loop primitives in some functionsNicholas Piggin1-3/+5
Use the different spin loop primitives in some simple powerpc spin loops, including those which will spin as a common case. This will help to test the spin loop primitives before more conversions are done. Signed-off-by: Nicholas Piggin <[email protected]> [mpe: Add some includes of <linux/processor.h>] Signed-off-by: Michael Ellerman <[email protected]>
2017-06-22powerpc: Convert VDSO update function to use new update_vsyscall interfacePaul Mackerras1-16/+52
This converts the powerpc VDSO time update function to use the new interface introduced in commit 576094b7f0aa ("time: Introduce new GENERIC_TIME_VSYSCALL", 2012-09-11). Where the old interface gave us the time as of the last update in seconds and whole nanoseconds, with the new interface we get the nanoseconds part effectively in a binary fixed-point format with tk->tkr_mono.shift bits to the right of the binary point. With the old interface, the fractional nanoseconds got truncated, meaning that the value returned by the VDSO clock_gettime function would have about 1ns of jitter in it compared to the value computed by the generic timekeeping code in the kernel. The powerpc VDSO time functions (clock_gettime and gettimeofday) already work in units of 2^-32 seconds, or 0.23283 ns, because that makes it simple to split the result into seconds and fractional seconds, and represent the fractional seconds in either microseconds or nanoseconds. This is good enough accuracy for now, so this patch avoids changing how the VDSO works or the interface in the VDSO data page. This patch converts the powerpc update_vsyscall_old to be called update_vsyscall and use the new interface. We convert the fractional second to units of 2^-32 seconds without truncating to whole nanoseconds. (There is still a conversion to whole nanoseconds for any legacy users of the vdso_data/systemcfg stamp_xtime field.) In addition, this improves the accuracy of the computation of tb_to_xs for those systems with high-frequency timebase clocks (>= 268.5 MHz) by doing the right shift in two parts, one before the multiplication and one after, rather than doing the right shift before the multiplication. (We can't do all of the right shift after the multiplication unless we use 128-bit arithmetic.) Signed-off-by: Paul Mackerras <[email protected]> Acked-by: John Stultz <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-06-21powerpc/time: Fix tracing in time.cSantosh Sivaraj1-3/+3
Since trace_clock is in a different file and already marked with notrace, enable tracing in time.c by removing it from the disabled list in Makefile. Also annotate clocksource read functions and sched_clock with notrace. Testing: Timer and ftrace selftests run with different trace clocks. Acked-by: Naveen N. Rao <[email protected]> Signed-off-by: Santosh Sivaraj <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-05-30powerpc/[booke|4xx]: Don't clobber TCR[WP] when setting TCR[DIE]Ivan Mikhaylov1-3/+11
Prevent a kernel panic caused by unintentionally clearing TCR watchdog bits. At this point in the kernel boot, the watchdog may have already been enabled by u-boot. The original code's attempt to write to the TCR register results in an inadvertent clearing of the watchdog configuration bits, causing the 476 to reset. Signed-off-by: Ivan Mikhaylov <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-04-14powerpc/time: Set ->min_delta_ticks and ->max_delta_ticksNicolai Stange1-0/+2
In preparation for making the clockevents core NTP correction aware, all clockevent device drivers must set ->min_delta_ticks and ->max_delta_ticks rather than ->min_delta_ns and ->max_delta_ns: a clockevent device's rate is going to change dynamically and thus, the ratio of ns to ticks ceases to stay invariant. Make the powerpc arch's clockevent driver initialize these fields properly. This patch alone doesn't introduce any change in functionality as the clockevents core still looks exclusively at the (untouched) ->min_delta_ns and ->max_delta_ns. As soon as this has changed, a followup patch will purge the initialization of ->min_delta_ns and ->max_delta_ns from this driver. Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Daniel Lezcano <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: Stephen Boyd <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Oliver O'Halloran <[email protected]> Cc: [email protected] Acked-by: Michael Ellerman <[email protected]> (powerpc) Signed-off-by: Nicolai Stange <[email protected]> Signed-off-by: John Stultz <[email protected]>
2017-03-02sched/headers: Prepare to move cputime functionality from <linux/sched.h> ↵Ingo Molnar1-1/+1
into <linux/sched/cputime.h> Introduce a trivial, mostly empty <linux/sched/cputime.h> header to prepare for the moving of cputime functionality out of sched.h. Update all code that relies on these facilities. Acked-by: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-03-02sched/headers: Prepare for new header dependencies before moving code to ↵Ingo Molnar1-0/+1
<linux/sched/clock.h> We are going to split <linux/sched/clock.h> out of <linux/sched.h>, which will have to be picked up from other headers and .c files. Create a trivial placeholder <linux/sched/clock.h> file that just maps to <linux/sched.h> to make this patch obviously correct and bisectable. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>