author Nadav Amit <[email protected]> 2018-10-03 14:30:52 -0700
committer Ingo Molnar <[email protected]> 2018-10-04 10:57:09 +0200
commit 77b0bf55bc675233d22cd5df97605d516d64525e
tree dc568685e7ab654809212470f818e966fc48be62 /scripts/mod
parent 35e76b99ddf20405a6196bb7c9eb152675c93106
kbuild/Makefile: Prepare for using macros in inline assembly code to work around asm() related GCC inlining bugs
Using macros in inline assembly allows us to work around bugs in GCC's inlining decisions.

Compile macros.S and use it to assemble all C files. Currently only x86 will use it.

Background:

The inlining pass of GCC doesn't include an assembler, so it's not aware of basic properties of the generated code, such as its size in bytes, or that there are such things as discontiguous blocks of code and data due to the newfangled linker feature called 'sections' ...

Instead GCC uses a lazy and fragile heuristic: it does a linear count of certain syntactic and whitespace elements in inlined assembly block source code, such as a count of new-lines and semicolons (!), as a poor substitute for "code size and complexity".

Unsurprisingly this heuristic falls over and breaks its neck with certain common types of kernel code that use inline assembly, such as the frequent practice of putting useful information into alternative sections.

As a result of this fresh, 20+ years old GCC bug, GCC's inlining decisions are effectively disabled for inlined functions that make use of such asm() blocks, because GCC thinks those sections of code are "large" - when in reality they often result in just a very low number of machine instructions.

This absolute lack of inlining prowess when GCC comes across such asm() blocks both increases generated kernel code size and causes performance overhead, which is particularly noticeable on paravirt kernels, which make frequent use of these inlining facilities in an attempt to stay out of the way when running on baremetal hardware.

Instead of fixing the compiler we use a workaround: we set an assembly macro and call it from the inlined assembly block. As a result GCC considers the inline assembly block as a single instruction. (Which it often isn't but I digress.)
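The counting heuristic and the effect of the workaround can be illustrated with a rough sketch in C. This is a simplified model of the behaviour the changelog describes, not GCC's actual source: the function name `estimated_asm_insns`, the section name `.example_table`, and the macro name `EXAMPLE_MACRO` are all hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified model (an assumption, not GCC code): estimate the
 * "instruction count" of an asm() template by counting newlines
 * and semicolons, starting from one statement.
 */
static size_t estimated_asm_insns(const char *templ)
{
	size_t count = 1;

	assert(templ != NULL);
	for (; *templ; templ++)
		if (*templ == '\n' || *templ == ';')
			count++;
	return count;
}

/*
 * Hypothetical open-coded template: a single real instruction plus
 * bookkeeping emitted into a separate section.
 */
static const char open_coded[] =
	"1:\tlock decl %0\n"
	"\t.pushsection .example_table, \"a\"\n"
	"\t.long 1b - .\n"
	"\t.popsection";

/*
 * The same body hidden behind a hypothetical assembler macro that
 * would be defined once in an assembly file such as macros.S.
 */
static const char via_macro[] = "EXAMPLE_MACRO counter=%0";
```

Under this model the open-coded template is counted as four statements even though it emits one instruction into the text section, while the macro invocation is counted as one. That difference is what the workaround relies on.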
This uglifies and bloats the source code - for example just the refcount related changes have this impact:

 Makefile                 | 9 +++++++--
 arch/x86/Makefile        | 7 +++++++
 arch/x86/kernel/macros.S | 7 +++++++
 scripts/Kbuild.include   | 4 +++-
 scripts/mod/Makefile     | 2 ++
 5 files changed, 26 insertions(+), 3 deletions(-)

Yay readability and maintainability, it's not like assembly code is hard to read and maintain ...

We also hope that GCC will eventually get fixed, but we are not holding our breath for that. Yet we are optimistic, it might still happen, any decade now.

[ mingo: Wrote new changelog describing the background. ]

Tested-by: Kees Cook <[email protected]>
Signed-off-by: Nadav Amit <[email protected]>
Acked-by: Masahiro Yamada <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Michal Marek <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Sam Ravnborg <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Diffstat (limited to 'scripts/mod')
-rw-r--r-- scripts/mod/Makefile | 2
1 file changed, 2 insertions, 0 deletions
diff --git a/scripts/mod/Makefile b/scripts/mod/Makefile
index 42c5d50f2bcc..a5b4af47987a 100644
--- a/scripts/mod/Makefile
+++ b/scripts/mod/Makefile
@@ -4,6 +4,8 @@ OBJECT_FILES_NON_STANDARD := y
 hostprogs-y := modpost mk_elfconfig
 always := $(hostprogs-y) empty.o
 
+CFLAGS_REMOVE_empty.o := $(ASM_MACRO_FLAGS)
+
 modpost-objs := modpost.o file2alias.o sumversion.o
 
 devicetable-offsets-file := devicetable-offsets.h