|
handle_mm_fault() returning VM_FAULT_RETRY or VM_FAULT_COMPLETED means
mmap_lock has been released. However, with per-VMA locks the behavior is
different: the caller is still expected to release the lock. To make the
rules consistent for the caller, drop the per-VMA lock when returning
VM_FAULT_RETRY or VM_FAULT_COMPLETED. Currently the only path returning
VM_FAULT_RETRY under per-VMA locks is do_swap_page(), and no path returns
VM_FAULT_COMPLETED for now.
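A minimal sketch of the resulting caller-side rule, assuming the generic
per-VMA fault-path helpers (lock_vma_under_rcu()/vma_end_read()); the
flow is illustrative, not the exact arch code:
  vma = lock_vma_under_rcu(mm, address);
  if (vma) {
          fault = handle_mm_fault(vma, address,
                                  flags | FAULT_FLAG_VMA_LOCK, regs);
          /* RETRY/COMPLETED now imply the per-VMA lock was dropped too */
          if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
                  vma_end_read(vma);
  }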
[[email protected]: fix riscv]
Link: https://lkml.kernel.org/r/CAJuCfpE6GWEx1rPBmNpUfoD5o-gNFz9-UFywzCE2PbEGBiVz7g@mail.gmail.com
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Suren Baghdasaryan <[email protected]>
Acked-by: Peter Xu <[email protected]>
Tested-by: Conor Dooley <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: David Howells <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Hillf Danton <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Josef Bacik <[email protected]>
Cc: Laurent Dufour <[email protected]>
Cc: Liam R. Howlett <[email protected]>
Cc: Lorenzo Stoakes <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Michel Lespinasse <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Punit Agrawal <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Yu Zhao <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
In mpc5xxx_fwnode_get_bus_frequency(), we should call
fwnode_handle_put() when breaking out of the
fwnode_for_each_parent_node() iteration, because the iterator
automatically takes and drops references as it walks, and breaking out
early leaves the reference on the current node held.
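The pattern being fixed, roughly (the lookup helper inside the loop is
hypothetical):
  struct fwnode_handle *parent;
  u32 bus_freq = 0;

  fwnode_for_each_parent_node(fwnode, parent) {
          bus_freq = get_parent_bus_frequency(parent); /* hypothetical */
          if (bus_freq) {
                  fwnode_handle_put(parent); /* drop the iterator's ref */
                  break;
          }
  }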
Fixes: de06fba62af6 ("powerpc/mpc5xxx: Switch mpc5xxx_get_bus_frequency() to use fwnode")
Signed-off-by: Liang He <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
|
|
In 5.10, commit 5e84dd547bce ("powerpc/configs/skiroot: Enable some more
hardening options") set SLUB_DEBUG_ON.
When 5.14 came around, commit 792702911f58 ("slub: force on
no_hash_pointers when slub_debug is enabled") made the kernel print all
pointers unhashed when SLUB_DEBUG_ON is set. That was fine, except that
in 5.12 commit 5ead723a20e0 ("lib/vsprintf: no_hash_pointers prints all
addresses as unhashed") had added a warning at boot.
Disable SLUB_DEBUG_ON as we don't want the nasty warning. We have
CONFIG_EXPERT, so SLUB_DEBUG is still enabled. We do lose the settings in
DEBUG_DEFAULT_FLAGS, but it's not clear that these should have been
always-on anyway.
Signed-off-by: Joel Stanley <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
|
|
When JUMP_LABEL=n, the tracepoint refcount test in the pre-call stores
the refcount value to the stack, so the same value can be used for the
post-call (presumably to avoid racing with the value concurrently
changing).
On little-endian (ELFv2) that might have just worked by luck, because
32(r1) is STK_PARAM(R3) there, so the saved value gets clobbered by the
tracing code when it's non-zero; but fortunately r3 is the hcall number
and 0 is an invalid hcall number, so it should only get clobbered by
another non-zero value. In any case, commit cc1adb5f32557
("powerpc/pseries: Use jump labels for hcall tracepoints") removed the
code that actually used the stored value, so now it's just dead code.
Storing to the stack like this is fragile and confusing. Better to
remove it.
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
|
|
With JUMP_LABEL=n, hcall_tracepoint_refcount's address is being tested
instead of its value. This results in the tracing slowpath always being
taken unnecessarily.
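In C terms the bug is equivalent to the following (the real code is in
the hcall assembly; trace_hcall() is a stand-in name):
  extern long hcall_tracepoint_refcount;

  /* bug: tests the address, which is always non-zero */
  if (&hcall_tracepoint_refcount)
          trace_hcall();

  /* intended: test the value */
  if (hcall_tracepoint_refcount)
          trace_hcall();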
Fixes: 9a10ccb29c0a2 ("powerpc/pseries: move hcall_tracepoint_refcount out of .toc")
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
|
|
Add missing whitespace between node name/label and opening {.
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
|
|
The PCI core API pci_dev_id() can be used to get the BDF number for a
PCI device. We don't need to compose it manually. Use pci_dev_id() to
simplify the code a little bit.
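The simplification, sketched as a before/after:
  /* before: composing the BDF by hand */
  bdf = (pdev->bus->number << 8) | pdev->devfn;

  /* after: let the PCI core do it */
  bdf = pci_dev_id(pdev);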
Signed-off-by: Jialin Zhang <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
|
|
Currently the -mtune options are set in the Makefile, depending on what
the compiler supports.
One downside of doing it that way is that the chosen -mtune option is
not recorded in the .config.
Another downside is that if there's ever a need to do more complicated
logic to calculate the correct option, that gets messy in the Makefile.
So move the determination of which -mtune option to use into Kconfig
logic.
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
|
|
Clang reports:
arch/powerpc/platforms/powermac/feature.c:137:19: error: unused function 'simple_feature_tweak'
It's only used inside the #ifndef CONFIG_PPC64 block, so move it in
there to fix the warning. While at it, drop the inline; the compiler
will decide whether it should be inlined or not.
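The general shape of this kind of fix (names illustrative, not the
feature.c code):
  #ifdef CONFIG_FOO
  /* defined where it is used, so no -Wunused-function elsewhere;
   * the compiler decides about inlining without explicit 'inline' */
  static int helper(int x)
  {
          return x * 2;
  }

  int foo_feature(int x)
  {
          return helper(x);
  }
  #endif /* CONFIG_FOO */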
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
include/net/inet_sock.h
f866fbc842de ("ipv4: fix data-races around inet->inet_id")
c274af224269 ("inet: introduce inet->inet_flags")
https://lore.kernel.org/all/[email protected]/
Adjacent changes:
drivers/net/bonding/bond_alb.c
e74216b8def3 ("bonding: fix macvlan over alb bond support")
f11e5bd159b0 ("bonding: support balance-alb with openvswitch")
drivers/net/ethernet/broadcom/bgmac.c
d6499f0b7c7c ("net: bgmac: Return PTR_ERR() for fixed_phy_register()")
23a14488ea58 ("net: bgmac: Fix return value check for fixed_phy_register()")
drivers/net/ethernet/broadcom/genet/bcmmii.c
32bbe64a1386 ("net: bcmgenet: Fix return value check for fixed_phy_register()")
acf50d1adbf4 ("net: bcmgenet: Return PTR_ERR() for fixed_phy_register()")
net/sctp/socket.c
f866fbc842de ("ipv4: fix data-races around inet->inet_id")
b09bde5c3554 ("inet: move inet->mc_loop to inet->inet_frags")
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
lppaca_shared_proc() takes a pointer to the lppaca which is typically
accessed through get_lppaca(). With DEBUG_PREEMPT enabled, this leads
to checking if preemption is enabled, for example:
BUG: using smp_processor_id() in preemptible [00000000] code: grep/10693
caller is lparcfg_data+0x408/0x19a0
CPU: 4 PID: 10693 Comm: grep Not tainted 6.5.0-rc3 #2
Call Trace:
dump_stack_lvl+0x154/0x200 (unreliable)
check_preemption_disabled+0x214/0x220
lparcfg_data+0x408/0x19a0
...
This isn't actually a problem, however: it does not matter which
lppaca is accessed, as the shared proc state will be the same.
vcpudispatch_stats_procfs_init() already works around this by disabling
preemption, but the lparcfg code does not, so it errors any time
/proc/powerpc/lparcfg is accessed with DEBUG_PREEMPT enabled.
Instead of disabling preemption on the caller side, rework
lppaca_shared_proc() to not take a pointer and instead directly access
the lppaca, bypassing any potential preemption checks.
Fixes: f13c13a00512 ("powerpc: Stop using non-architected shared_proc field in lppaca")
Signed-off-by: Russell Currey <[email protected]>
[mpe: Rework to avoid needing a definition in paca.h and lppaca.h]
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
|
|
By adding a forward declaration for struct lppaca we can untangle paca.h
and lppaca.h. Also move get_lppaca() into lppaca.h for consistency.
Add includes of lppaca.h to some files that need it.
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
|
|
Consolidate the two prototypes for hcall_vphn() into vphn.h.
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
|
|
These don't have any particularly good reason to belong in lppaca.h,
move them into their own header.
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
|
|
The only callers of zalloc_maybe_bootmem() are PCI setup routines. These
used to be called early during boot before slab setup, and also during
runtime due to hotplug.
But commit 5537fcb319d0 ("powerpc/pci: Add ppc_md.discover_phbs()")
moved the boot-time calls later, after slab setup, meaning there's no
longer any need for zalloc_maybe_bootmem(); kzalloc() can be used in all
cases.
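The resulting change at call sites, approximately (the PHB allocation
shown is a representative example):
  /* before: slab may not have been ready yet */
  phb = zalloc_maybe_bootmem(sizeof(struct pci_controller), GFP_KERNEL);

  /* after: PHB discovery runs after slab init, so kzalloc() is fine */
  phb = kzalloc(sizeof(struct pci_controller), GFP_KERNEL);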
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
|
|
Use the newly added struct opal_prd_msg in some other functions that
operate on opal_prd messages, rather than using other types.
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
|
|
As reported by Mahesh & Aneesh, opal_prd_msg_notifier() triggers a
FORTIFY_SOURCE warning:
memcpy: detected field-spanning write (size 32) of single field "&item->msg" at arch/powerpc/platforms/powernv/opal-prd.c:355 (size 4)
WARNING: CPU: 9 PID: 660 at arch/powerpc/platforms/powernv/opal-prd.c:355 opal_prd_msg_notifier+0x174/0x188 [opal_prd]
NIP opal_prd_msg_notifier+0x174/0x188 [opal_prd]
LR opal_prd_msg_notifier+0x170/0x188 [opal_prd]
Call Trace:
opal_prd_msg_notifier+0x170/0x188 [opal_prd] (unreliable)
notifier_call_chain+0xc0/0x1b0
atomic_notifier_call_chain+0x2c/0x40
opal_message_notify+0xf4/0x2c0
This happens because the copy is targeting item->msg, which is only 4
bytes in size, even though the enclosing item was allocated with extra
space following the msg.
To fix the warning, define struct opal_prd_msg with a union of the
header and a flex array, and have the memcpy target the flex array.
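A sketch of the fixed structure (treat the exact layout and field names
as assumptions):
  struct opal_prd_msg {
          union {
                  struct opal_prd_msg_header header;
                  DECLARE_FLEX_ARRAY(u8, data);
          };
  };

  /* the copy now targets a flex array sized by the allocation */
  memcpy(&item->msg.data, msg, msg_size);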
Reported-by: "Aneesh Kumar K.V" <[email protected]>
Reported-by: Mahesh Salgaonkar <[email protected]>
Tested-by: Mahesh Salgaonkar <[email protected]>
Reviewed-by: Mahesh Salgaonkar <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/[email protected]
|
|
As of commit b7fb14d3ac63117e ("ide: remove the legacy ide driver") in
v5.14, there are no more generic users of <asm/ide.h>.
Signed-off-by: Geert Uytterhoeven <[email protected]>
Acked-by: Michael Ellerman <[email protected]> (powerpc)
Signed-off-by: Damien Le Moal <[email protected]>
|
|
corenet{32/64}_smp_defconfig leads to:
CC arch/powerpc/sysdev/ehv_pic.o
arch/powerpc/sysdev/ehv_pic.c:45:6: error: no previous prototype for 'ehv_pic_unmask_irq' [-Werror=missing-prototypes]
45 | void ehv_pic_unmask_irq(struct irq_data *d)
| ^~~~~~~~~~~~~~~~~~
arch/powerpc/sysdev/ehv_pic.c:52:6: error: no previous prototype for 'ehv_pic_mask_irq' [-Werror=missing-prototypes]
52 | void ehv_pic_mask_irq(struct irq_data *d)
| ^~~~~~~~~~~~~~~~
arch/powerpc/sysdev/ehv_pic.c:59:6: error: no previous prototype for 'ehv_pic_end_irq' [-Werror=missing-prototypes]
59 | void ehv_pic_end_irq(struct irq_data *d)
| ^~~~~~~~~~~~~~~
arch/powerpc/sysdev/ehv_pic.c:66:6: error: no previous prototype for 'ehv_pic_direct_end_irq' [-Werror=missing-prototypes]
66 | void ehv_pic_direct_end_irq(struct irq_data *d)
| ^~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/sysdev/ehv_pic.c:71:5: error: no previous prototype for 'ehv_pic_set_affinity' [-Werror=missing-prototypes]
71 | int ehv_pic_set_affinity(struct irq_data *d, const struct cpumask *dest,
| ^~~~~~~~~~~~~~~~~~~~
arch/powerpc/sysdev/ehv_pic.c:112:5: error: no previous prototype for 'ehv_pic_set_irq_type' [-Werror=missing-prototypes]
112 | int ehv_pic_set_irq_type(struct irq_data *d, unsigned int flow_type)
| ^~~~~~~~~~~~~~~~~~~~
CC arch/powerpc/sysdev/fsl_rio.o
arch/powerpc/sysdev/fsl_rio.c:102:5: error: no previous prototype for 'fsl_rio_mcheck_exception' [-Werror=missing-prototypes]
102 | int fsl_rio_mcheck_exception(struct pt_regs *regs)
| ^~~~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/sysdev/fsl_rio.c:306:5: error: no previous prototype for 'fsl_map_inb_mem' [-Werror=missing-prototypes]
306 | int fsl_map_inb_mem(struct rio_mport *mport, dma_addr_t lstart,
| ^~~~~~~~~~~~~~~
arch/powerpc/sysdev/fsl_rio.c:357:6: error: no previous prototype for 'fsl_unmap_inb_mem' [-Werror=missing-prototypes]
357 | void fsl_unmap_inb_mem(struct rio_mport *mport, dma_addr_t lstart)
| ^~~~~~~~~~~~~~~~~
arch/powerpc/sysdev/fsl_rio.c:445:5: error: no previous prototype for 'fsl_rio_setup' [-Werror=missing-prototypes]
445 | int fsl_rio_setup(struct platform_device *dev)
| ^~~~~~~~~~~~~
CC arch/powerpc/sysdev/fsl_rmu.o
arch/powerpc/sysdev/fsl_rmu.c:362:6: error: no previous prototype for 'msg_unit_error_handler' [-Werror=missing-prototypes]
362 | void msg_unit_error_handler(void)
| ^~~~~~~~~~~~~~~~~~~~~~
CC arch/powerpc/platforms/85xx/corenet_generic.o
arch/powerpc/platforms/85xx/corenet_generic.c:33:13: error: no previous prototype for 'corenet_gen_pic_init' [-Werror=missing-prototypes]
33 | void __init corenet_gen_pic_init(void)
| ^~~~~~~~~~~~~~~~~~~~
arch/powerpc/platforms/85xx/corenet_generic.c:51:13: error: no previous prototype for 'corenet_gen_setup_arch' [-Werror=missing-prototypes]
51 | void __init corenet_gen_setup_arch(void)
| ^~~~~~~~~~~~~~~~~~~~~~
arch/powerpc/platforms/85xx/corenet_generic.c:104:12: error: no previous prototype for 'corenet_gen_publish_devices' [-Werror=missing-prototypes]
104 | int __init corenet_gen_publish_devices(void)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~
CC arch/powerpc/platforms/85xx/qemu_e500.o
arch/powerpc/platforms/85xx/qemu_e500.c:28:13: error: no previous prototype for 'qemu_e500_pic_init' [-Werror=missing-prototypes]
28 | void __init qemu_e500_pic_init(void)
| ^~~~~~~~~~~~~~~~~~
CC arch/powerpc/kernel/pmc.o
arch/powerpc/kernel/pmc.c:78:6: error: no previous prototype for 'power4_enable_pmcs' [-Werror=missing-prototypes]
78 | void power4_enable_pmcs(void)
| ^~~~~~~~~~~~~~~~~~
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/c90780017b624b91771a3e4240dcbadc68137915.1692684784.git.christophe.leroy@csgroup.eu
|
|
asm/percpu.h includes asm/paca.h, which needs struct tlb_core_data,
which is defined in mmu-e500.h.
asm/percpu.h is included from asm/mmu.h in a #ifdef CONFIG_E500 block,
before the inclusion of mmu-e500.h.
To fix that, move the inclusion of asm/percpu.h into mmu-e500.h, after
the definition of struct tlb_core_data.
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Fixes: 3a24ea0df83e ("powerpc/kuap: Use ASM feature fixups instead of static branches")
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/5e0f97d5cbcd05238b56b4424ab096468296824d.1692684461.git.christophe.leroy@csgroup.eu
|
|
debugfs_create_dir() returns an ERR_PTR in case of an error, and the
correct way of checking it is by using the IS_ERR() inline function,
not a simple NULL comparison. This patch fixes that.
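The corrected pattern, schematically (directory name illustrative):
  struct dentry *dir;

  dir = debugfs_create_dir("example", NULL);
  if (IS_ERR(dir))        /* not: if (!dir) */
          return PTR_ERR(dir);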
Suggested-by: Ivan Orlov <[email protected]>
Signed-off-by: Immad Mir <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/CY5PR12MB64553EE96EBB3927311DB598C6459@CY5PR12MB6455.namprd12.prod.outlook.com
|
|
There is only one Kconfig user of CONFIG_EMBEDDED and it can be switched
to EXPERT or "if !ARCH_MULTIPLATFORM" (suggested by Arnd).
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Randy Dunlap <[email protected]>
Acked-by: Geert Uytterhoeven <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
Acked-by: Palmer Dabbelt <[email protected]> [RISC-V]
Acked-by: Greg Ungerer <[email protected]>
Acked-by: Jason A. Donenfeld <[email protected]>
Acked-by: Michael Ellerman <[email protected]> [powerpc]
Cc: Russell King <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Brian Cain <[email protected]>
Cc: Michal Simek <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Dinh Nguyen <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: Stefan Kristiansson <[email protected]>
Cc: Stafford Horne <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: John Paul Adrian Glaubitz <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
In order to split struct ptdesc from struct page, convert various
functions to use ptdescs.
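The flavor of the conversion, as a sketch (pagetable_alloc() and
ptdesc_address() are helpers from this series; the call site is
illustrative):
  /* before: page-table allocation in terms of struct page */
  struct page *page = alloc_pages(GFP_KERNEL | __GFP_ZERO, 0);
  pte_t *pte = page ? (pte_t *)page_address(page) : NULL;

  /* after: the same allocation in terms of struct ptdesc */
  struct ptdesc *ptdesc = pagetable_alloc(GFP_KERNEL | __GFP_ZERO, 0);
  pte_t *new_pte = ptdesc ? (pte_t *)ptdesc_address(ptdesc) : NULL;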
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Vishal Moola (Oracle) <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Claudio Imbrenda <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Dinh Nguyen <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Guo Ren <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: John Paul Adrian Glaubitz <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Richard Weinberger <[email protected]>
Cc: Thomas Bogendoerfer <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Radix vmemmap mapping can map things correctly at the PMD level or PTE
level based on different device boundary checks. Hence we can skip the
restriction that the vmemmap size be a multiple of PMD_SIZE. This also
makes the feature more widely useful, because using a PMD_SIZE vmemmap
area requires a memory block size of 2GiB.
We can also use MHP_RESERVE_PAGES_MEMMAP_ON_MEMORY so that the feature
can work with a memory block size of 256MB, using the altmap.reserve
feature to align things correctly at pageblock granularity. We can end
up losing some pages in memory with this. For example: with a 256MiB
memory block size, we require 4 pages to map vmemmap pages; in order to
align things correctly we end up adding a reserve of 28 pages. That is,
for every 4096 pages, 28 pages get reserved.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Vishal Verma <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Implicit vma locking inside vm_flags_reset() and vm_flags_reset_once()
is not obvious and makes it hard to understand where vma locking is
happening. Also, in some cases (like in dup_userfaultfd()) the vma
should be locked earlier than the vm_flags modification. To make locking
more visible, change these functions to assert that the vma write lock
is taken and explicitly lock the vma beforehand. Fix userfaultfd
functions which should lock the vma earlier.
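A sketch of the new contract, assuming the per-VMA lock helper names
(vma_start_write()/vma_assert_write_locked()):
  /* callers lock explicitly before modifying flags... */
  vma_start_write(vma);
  vm_flags_reset(vma, new_flags);

  /* ...because vm_flags_reset() now only asserts: */
  static inline void vm_flags_reset(struct vm_area_struct *vma,
                                    vm_flags_t flags)
  {
          vma_assert_write_locked(vma);   /* was: vma_start_write(vma) */
          vm_flags_init(vma, flags);
  }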
Link: https://lkml.kernel.org/r/[email protected]
Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Suren Baghdasaryan <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Liam R. Howlett <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
walk_page_range() and friends often operate under a write-locked
mmap_lock. With the introduction of vma locks, the vmas have to be
locked as well during such walks to prevent concurrent page faults in
these areas. Add an additional member to mm_walk_ops to indicate
locking requirements for the walk.
The change ensures that page walks which prevent concurrent page faults
by write-locking mmap_lock, operate correctly after introduction of
per-vma locks. With per-vma locks page faults can be handled under vma
lock without taking mmap_lock at all, so write locking mmap_lock would
not stop them. The change ensures vmas are properly locked during such
walks.
A sample issue this solves is do_mbind() performing queue_pages_range()
to queue pages for migration. Without this change a page can be
concurrently faulted into the area and be left out of migration.
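The new field in use, sketched (the callback is hypothetical; the
PGWALK_* locking values are the ones this change introduces):
  static int migrate_pmd_entry(pmd_t *pmd, unsigned long addr,
                               unsigned long next, struct mm_walk *walk)
  {
          return 0;       /* hypothetical callback body */
  }

  static const struct mm_walk_ops migrate_walk_ops = {
          .pmd_entry      = migrate_pmd_entry,
          .walk_lock      = PGWALK_WRLOCK,  /* write-lock each vma walked */
  };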
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Suren Baghdasaryan <[email protected]>
Suggested-by: Linus Torvalds <[email protected]>
Suggested-by: Jann Horn <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Laurent Dufour <[email protected]>
Cc: Liam Howlett <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Michel Lespinasse <[email protected]>
Cc: Peter Xu <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
GCC v13.1 updated support for -fpatchable-function-entry on ppc64le to
emit nops after the local entry point, rather than before it. This
allows us to use this in the kernel for ftrace purposes. A new script is
added under arch/powerpc/tools/ to help detect if nops are emitted after
the function local entry point, or before the global entry point.
With -fpatchable-function-entry, we no longer have the profiling
instructions generated at function entry, so we only need to validate
the presence of two nops at the ftrace location in ftrace_init_nop(). We
patch the preceding instruction with 'mflr r0' to match the
-mprofile-kernel ABI for subsequent ftrace use.
This changes the profiling instructions used on ppc32. The default -pg
option emits an additional 'stw' instruction after 'mflr r0' and before
the branch to _mcount 'bl _mcount'. This is very similar to the original
-mprofile-kernel implementation on ppc64le, where an additional 'std'
instruction was used to save LR to its save location in the caller's
stackframe. Subsequently, this additional store was removed in later
compiler versions for performance reasons. The same reasons apply for
ppc32 so we only patch in a 'mflr r0'.
Signed-off-by: Naveen N Rao <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/68586d22981a2c3bb45f27a2b621173d10a7d092.1687166935.git.naveen@kernel.org
|
|
Implement ftrace_replace_code() to consolidate logic from the different
ftrace patching routines: ftrace_make_nop(), ftrace_make_call() and
ftrace_modify_call(). Note that ftrace_make_call() is still required
primarily to handle patching modules during their load time. The other
two routines should no longer be called.
This lays the groundwork to enable better control in patching ftrace
locations, including the ability to nop-out preceding profiling
instructions when ftrace is disabled.
Signed-off-by: Naveen N Rao <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/c28f852225646b0561bbf3c1d22d03f041ace8e0.1687166935.git.naveen@kernel.org
|
|
ftrace_create_branch_inst() is clearer about its intent than
ftrace_call_replace().
Signed-off-by: Naveen N Rao <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/953513b88fa922ba7a66d772dc1310710efe9177.1687166935.git.naveen@kernel.org
|
|
Now that we validate the ftrace location during initialization in
ftrace_init_nop(), we can simplify ftrace_modify_call() to patch-in the
updated branch instruction without worrying about the instructions
surrounding the ftrace location. Note that we continue to ensure we
have the expected branch instruction at the ftrace location before
patching it with the updated branch destination.
Signed-off-by: Naveen N Rao <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/06275720939f8ee4c2f61c9e9a3e89b1fa3c441d.1687166935.git.naveen@kernel.org
|
|
Now that we validate the ftrace location during initialization in
ftrace_init_nop(), we can simplify ftrace_make_call() to replace the nop
without worrying about the instructions surrounding the ftrace location.
Note that we continue to ensure that we have a nop at the ftrace
location before patching it.
Signed-off-by: Naveen N Rao <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/2d28866d2f556488a663981abe5621511efb207b.1687166935.git.naveen@kernel.org
|
|
Now that we validate the ftrace location during initialization in
ftrace_init_nop(), we can simplify ftrace_make_nop() to patch-in the nop
without worrying about the instructions surrounding the ftrace location.
Note that we continue to ensure that we have a bl to
ftrace_[regs_]caller at the ftrace location before nop-ing it out.
Signed-off-by: Naveen N Rao <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/e12ccbf28c50c3a07fb614f4d392e55f7098a729.1687166935.git.naveen@kernel.org
|
|
Currently, we validate instructions around the ftrace location every
time we have to enable/disable ftrace. Introduce ftrace_init_nop() to
instead perform all the validation during ftrace initialization. This
allows us to simply patch the necessary instructions during
enabling/disabling ftrace.
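Schematically, the validation moves to init time (the check helper is
hypothetical; only the ftrace_init_nop() hook signature is real):
  int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec)
  {
          /* validate the instructions around rec->ip once, here... */
          if (!ftrace_check_site(rec->ip))        /* hypothetical */
                  return -EINVAL;
          /* ...so enable/disable can later patch unconditionally */
          return patch_instruction((u32 *)rec->ip,
                                   ppc_inst(PPC_RAW_NOP()));
  }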
Signed-off-by: Naveen N Rao <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/f373684081e8e98be09b7f44d2d93069768324dc.1687166935.git.naveen@kernel.org
|
|
Commit 67361cf8071286 ("powerpc/ftrace: Handle large kernel configs")
added ftrace support for ppc64 kernel images with a text section larger
than 32MB. The patch did two things:
1. Add stubs at the end of .text to branch into ftrace_[regs_]caller for
functions that were out of branch range.
2. Re-purpose linker-generated long branches to _mcount to instead branch
to ftrace_[regs_]caller.
Before that, we only supported kernel .text up to ~32MB. With the above,
we now support up to ~96MB:
- The first 32MB of kernel text can branch directly into
ftrace_[regs_]caller since that symbol is usually at the beginning.
- The modified long_branch from (2) above is used by the next 32MB of
kernel text.
- The next 32MB of kernel text can use the stub at the end of text to
branch back to ftrace_[regs_]caller.
While re-purposing the long branch works in practice, it still restricts
ftrace to kernel text up to ~96MB. The stub at the end of kernel text
from (1) already enables us to extend ftrace support for kernel text
up to 64MB, which fulfils the original requirement. Further, once we
switch to -fpatchable-function-entry, there will not be a long branch
that we can use.
Stop re-purposing the linker-generated long branches for ftrace to
simplify the code. If there are good reasons to support ftrace on
kernels beyond 64MB, we can consider adding support by using
-fpatchable-function-entry.
Signed-off-by: Naveen N Rao <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/33fa3be97f8e1f2171254ef2e1b0d5c8836c11fd.1687166935.git.naveen@kernel.org
|
|
Split up ftrace_modify_code() into a few helpers for future use. Also
update error messages accordingly.
Signed-off-by: Naveen N Rao <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/a8daa49712b44ff539e6c22a2ea649a540386798.1687166935.git.naveen@kernel.org
|
|
ftrace_low.S has just the _mcount stub and return_to_handler(). Merge
this back into ftrace_mprofile.S and ftrace_64_pg.S to keep all ftrace
code together, and to allow those to evolve independently.
ftrace_mprofile.S is also not an entirely accurate name since this also
holds ppc32 code. This will be all the more incorrect once support for
-fpatchable-function-entry is added. Rename files here to more
accurately describe the code:
- ftrace_mprofile.S is renamed to ftrace_entry.S
- ftrace_pg.c is renamed to ftrace_64_pg.c
- ftrace_64_pg.S is renamed to ftrace_64_pg_entry.S
Signed-off-by: Naveen N Rao <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/b900c9a8bba9d6c3c295e0f99886acf3e5bf6f7b.1687166935.git.naveen@kernel.org
|
|
Commit 67361cf8071286 ("powerpc/ftrace: Handle large kernel configs")
added ftrace support for ppc64 kernel images with a text section larger
than 32MB. The approach itself isn't specific to ppc64, so extend the
same to also work on ppc32.
While at it, reduce the space reserved for the stub from 64 bytes to 32
bytes since the different stub variants are all less than 8
instructions.
To reduce use of #ifdef, a stub implementation is provided for
kernel_toc_address() and -SZ_2G is cast to 'long long' to prevent
errors on ppc32.
Signed-off-by: Naveen N Rao <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/9fa3258cbb9105cf8a0a8135214d44ffbc75fe84.1687166935.git.naveen@kernel.org
|
|
Instead of keying off DYNAMIC_FTRACE_WITH_REGS, use FTRACE_REGS_ADDR to
identify the proper ftrace trampoline address to use.
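FTRACE_REGS_ADDR already folds in the config dependency; it is defined
in include/linux/ftrace.h roughly as:
  #ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
  # define FTRACE_REGS_ADDR ((unsigned long)ftrace_regs_caller)
  #else
  # define FTRACE_REGS_ADDR FTRACE_ADDR
  #endif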
Signed-off-by: Naveen N Rao <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/6045a280a57a7ea937a5bb13ccac747026dbfb07.1687166935.git.naveen@kernel.org
|
|
Since we now support DYNAMIC_FTRACE_WITH_ARGS across ppc32 and ppc64
ELFv2, we can simplify the function_graph tracer support code in
ftrace.c.
Signed-off-by: Naveen N Rao <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/4dc92c4b1ed444dc62b748ae7327acdb9e096864.1687166935.git.naveen@kernel.org
|
|
ELFv1 support is deprecated and on the way out. Pre -mprofile-kernel
ftrace support (-pg only) is very limited and is retained primarily for
clang builds. It won't be necessary once clang lands support for
-fpatchable-function-entry.
Copy the existing ftrace code supporting these into ftrace_pg.c.
ftrace.c can then be refactored and enhanced with a focus on ppc32 and
ppc64 ELFv2.
Signed-off-by: Naveen N Rao <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/1eb6cc6c3141ddb77a2a25f8a9e83d83ff312b02.1687166935.git.naveen@kernel.org
|
|
.ftrace.tramp section is not used for any purpose. This code was added
all the way back in the original commit introducing support for dynamic
ftrace on ppc64 modules. Remove it.
Signed-off-by: Naveen N Rao <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/9cf6d7f37ba82f7cb6dafecf660f44925c526d8d.1687166935.git.naveen@kernel.org
|
|
The minimum version of gcc supported for building the kernel is v5.1.
The v5.x releases of gcc emitted a three-instruction sequence for
-mprofile-kernel:
mflr r0
std r0, 16(r1)
bl _mcount
It is only with the v6.x releases that gcc started emitting the
two-instruction sequence for -mprofile-kernel, omitting the second
store instruction.
With the older three-instruction sequence, the actual ftrace location
can be the 5th instruction into a function. Update the allowed offset
for the ftrace location from 12 to 16 to accommodate this.
Cc: [email protected]
Fixes: 7af82ff90a2b06 ("powerpc/ftrace: Ignore weak functions")
Signed-off-by: Naveen N Rao <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/7b265908a9461e38fc756ef9b569703860a80621.1687166935.git.naveen@kernel.org
|
|
Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty into tty-next
We need the serial-core fixes in here as well.
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Merge tag 'powerpc-6.5-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fix from Michael Ellerman:
- Fix hardened usercopy BUG when using /proc based firmware update
interface
Thanks to Nathan Lynch and Kees Cook.
* tag 'powerpc-6.5-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/rtas_flash: allow user copy to flash block cache objects
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
drivers/net/ethernet/sfc/tc.c
fa165e194997 ("sfc: don't unregister flow_indr if it was never registered")
3bf969e88ada ("sfc: add MAE table machinery for conntrack table")
https://lore.kernel.org/all/[email protected]/
No adjacent changes.
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The APIs that allow backtracing across CPUs have always had a way to
exclude the current CPU. This convenience means callers didn't need to
find a place to allocate a CPU mask just to handle the common case.
Let's extend the API to take a CPU ID to exclude instead of just a
boolean. This isn't any more complex for the API to handle and allows the
hardlockup detector to exclude a different CPU (the one it already did a
trace for) without needing to find space for a CPU mask.
Arguably, this new API also encourages safer behavior. Specifically, if
the caller wants to avoid tracing the current CPU (maybe because they
already traced it), this makes it more obvious that they need to ensure
the current CPU ID can't change underneath them.
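The shape of the API change, sketched (the new name appears in the
stub-fix note below; the exact signature is an assumption):
  /* before: only "all but the current CPU" was expressible */
  trigger_allbutself_cpu_backtrace();

  /* after: exclude any CPU by id, e.g. one already traced */
  trigger_allbutcpu_cpu_backtrace(raw_smp_processor_id());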
[[email protected]: fix trigger_allbutcpu_cpu_backtrace() stub]
Link: https://lkml.kernel.org/r/20230804065935.v4.1.Ia35521b91fc781368945161d7b28538f9996c182@changeid
Signed-off-by: Douglas Anderson <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: kernel test robot <[email protected]>
Cc: Lecopzer Chen <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Pingfan Liu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The Kconfig refactor to consolidate the KEXEC and CRASH options used
option names of the form ARCH_SUPPORTS_<option>. Thus rename
ARCH_HAS_KEXEC_PURGATORY to ARCH_SUPPORTS_KEXEC_PURGATORY to follow the
same convention.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Eric DeVolder <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The kexec and crash kernel options are provided in the common
kernel/Kconfig.kexec. Utilize the common options and provide
the ARCH_SUPPORTS_ and ARCH_SELECTS_ entries to recreate the
equivalent set of KEXEC and CRASH options.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Eric DeVolder <[email protected]>
Reviewed-by: Sourabh Jain <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Add some extra vmemmap pr_debug messages that indicate the type of
vmemmap allocation.
For example, with DAX vmemmap optimization we can find the below details:
[ 187.166580] radix-mmu: PAGE_SIZE vmemmap mapping
[ 187.166587] radix-mmu: PAGE_SIZE vmemmap mapping
[ 187.166591] radix-mmu: Tail page reuse vmemmap mapping
[ 187.166594] radix-mmu: Tail page reuse vmemmap mapping
[ 187.166598] radix-mmu: Tail page reuse vmemmap mapping
[ 187.166601] radix-mmu: Tail page reuse vmemmap mapping
[ 187.166604] radix-mmu: Tail page reuse vmemmap mapping
[ 187.166608] radix-mmu: Tail page reuse vmemmap mapping
[ 187.166611] radix-mmu: Tail page reuse vmemmap mapping
[ 187.166614] radix-mmu: Tail page reuse vmemmap mapping
[ 187.166617] radix-mmu: Tail page reuse vmemmap mapping
[ 187.166620] radix-mmu: Tail page reuse vmemmap mapping
[ 187.166623] radix-mmu: Tail page reuse vmemmap mapping
[ 187.166626] radix-mmu: Tail page reuse vmemmap mapping
[ 187.166629] radix-mmu: Tail page reuse vmemmap mapping
[ 187.166632] radix-mmu: Tail page reuse vmemmap mapping
And without vmemmap optimization
[ 293.549931] radix-mmu: PMD_SIZE vmemmap mapping
[ 293.549984] radix-mmu: PMD_SIZE vmemmap mapping
[ 293.550032] radix-mmu: PMD_SIZE vmemmap mapping
[ 293.550076] radix-mmu: PMD_SIZE vmemmap mapping
[ 293.550117] radix-mmu: PMD_SIZE vmemmap mapping
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|