blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2020-11-22	mm: fix phys_to_target_node() and memory_add_physaddr_to_nid() exports	Dan Williams	1	-0/+2
	The core-mm has a default __weak implementation of phys_to_target_node() to mirror the weak definition of memory_add_physaddr_to_nid(). That symbol is exported for modules. However, while the export in mm/memory_hotplug.c exported the symbol in the configuration cases of: CONFIG_NUMA_KEEP_MEMINFO=y CONFIG_MEMORY_HOTPLUG=y ...and: CONFIG_NUMA_KEEP_MEMINFO=n CONFIG_MEMORY_HOTPLUG=y ...it failed to export the symbol in the case of: CONFIG_NUMA_KEEP_MEMINFO=y CONFIG_MEMORY_HOTPLUG=n Not only is that broken, but Christoph points out that the kernel should not be exporting any __weak symbol, which means that memory_add_physaddr_to_nid() example that phys_to_target_node() copied is broken too. Rework the definition of phys_to_target_node() and memory_add_physaddr_to_nid() to not require weak symbols. Move to the common arch override design-pattern of an asm header defining a symbol to replace the default implementation. The only common header that all memory_add_physaddr_to_nid() producing architectures implement is asm/sparsemem.h. In fact, powerpc already defines its memory_add_physaddr_to_nid() helper in sparsemem.h. Double-down on that observation and define phys_to_target_node() where necessary in asm/sparsemem.h. An alternate consideration that was discarded was to put this override in asm/numa.h, but that entangles with the definition of MAX_NUMNODES relative to the inclusion of linux/nodemask.h, and requires powerpc to grow a new header. The dependency on NUMA_KEEP_MEMINFO for DEV_DAX_HMEM_DEVICES is invalid now that the symbol is properly exported / stubbed in all combinations of CONFIG_NUMA_KEEP_MEMINFO and CONFIG_MEMORY_HOTPLUG. [[email protected]: v4] Link: https://lkml.kernel.org/r/160461461867.1505359.5301571728749534585.stgit@dwillia2-desk3.amr.corp.intel.com [[email protected]: powerpc: fix create_section_mapping compile warning] Link: https://lkml.kernel.org/r/160558386174.2948926.2740149041249041764.stgit@dwillia2-desk3.amr.corp.intel.com Fixes: a035b6bf863e ("mm/memory_hotplug: introduce default phys_to_target_node() implementation") Reported-by: Randy Dunlap <[email protected]> Reported-by: Thomas Gleixner <[email protected]> Reported-by: kernel test robot <[email protected]> Reported-by: Christoph Hellwig <[email protected]> Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Randy Dunlap <[email protected]> Tested-by: Thomas Gleixner <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Cc: Joao Martins <[email protected]> Cc: Tony Luck <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Stephen Rothwell <[email protected]> Link: https://lkml.kernel.org/r/160447639846.1133764.7044090803980177548.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-14	Merge tag 'acpi-5.10-rc1' of ↵	Linus Torvalds	1	-0/+21
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These add support for generic initiator-only proximity domains to the ACPI NUMA code and the architectures using it, clean up some non-ACPICA code referring to debug facilities from ACPICA, reduce the overhead related to accessing GPE registers, add a new DPTF (Dynamic Power and Thermal Framework) participant driver, update the ACPICA code in the kernel to upstream revision 20200925, add a new ACPI backlight whitelist entry, fix a few assorted issues and clean up some code. Specifics: - Add support for generic initiator-only proximity domains to the ACPI NUMA code and the architectures using it (Jonathan Cameron) - Clean up some non-ACPICA code referring to debug facilities from ACPICA that are not actually used in there (Hanjun Guo) - Add new DPTF driver for the PCH FIVR participant (Srinivas Pandruvada) - Reduce overhead related to accessing GPE registers in ACPICA and the OS interface layer and make it possible to access GPE registers using logical addresses if they are memory-mapped (Rafael Wysocki) - Update the ACPICA code in the kernel to upstream revision 20200925 including changes as follows: + Add predefined names from the SMBus sepcification (Bob Moore) + Update acpi_help UUID list (Bob Moore) + Return exceptions for string-to-integer conversions in iASL (Bob Moore) + Add a new "ALL <NameSeg>" debugger command (Bob Moore) + Add support for 64 bit risc-v compilation (Colin Ian King) + Do assorted cleanups (Bob Moore, Colin Ian King, Randy Dunlap) - Add new ACPI backlight whitelist entry for HP 635 Notebook (Alex Hung) - Move TPS68470 OpRegion driver to drivers/acpi/pmic/ and split out Kconfig and Makefile specific for ACPI PMIC (Andy Shevchenko) - Clean up the ACPI SoC driver for AMD SoCs (Hanjun Guo) - Add missing config_item_put() to fix refcount leak (Hanjun Guo) - Drop lefrover field from struct acpi_memory_device (Hanjun Guo) - Make the ACPI extlog driver check for RDMSR failures (Ben Hutchings) - Fix handling of lid state changes in the ACPI button driver when input device is closed (Dmitry Torokhov) - Fix several assorted build issues (Barnabás Pőcze, John Garry, Nathan Chancellor, Tian Tao) - Drop unused inline functions and reduce code duplication by using kobj_to_dev() in the NFIT parsing code (YueHaibing, Wang Qing) - Serialize tools/power/acpi Makefile (Thomas Renninger)" * tag 'acpi-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (64 commits) ACPICA: Update version to 20200925 Version 20200925 ACPICA: Remove unnecessary semicolon ACPICA: Debugger: Add a new command: "ALL <NameSeg>" ACPICA: iASL: Return exceptions for string-to-integer conversions ACPICA: acpi_help: Update UUID list ACPICA: Add predefined names found in the SMBus sepcification ACPICA: Tree-wide: fix various typos and spelling mistakes ACPICA: Drop the repeated word "an" in a comment ACPICA: Add support for 64 bit risc-v compilation ACPI: button: fix handling lid state changes when input device closed tools/power/acpi: Serialize Makefile ACPI: scan: Replace ACPI_DEBUG_PRINT() with pr_debug() ACPI: memhotplug: Remove 'state' from struct acpi_memory_device ACPI / extlog: Check for RDMSR failure ACPI: Make acpi_evaluate_dsm() prototype consistent docs: mm: numaperf.rst Add brief description for access class 1. node: Add access1 class to represent CPU to memory characteristics ACPI: HMAT: Fix handling of changes from ACPI 6.2 to ACPI 6.3 ACPI: Let ACPI know we support Generic Initiator Affinity Structures x86: Support Generic Initiator only proximity domains ...
2020-10-13	memblock: use separate iterators for memory and reserved regions	Mike Rapoport	1	-1/+1
	for_each_memblock() is used to iterate over memblock.memory in a few places that use data from memblock_region rather than the memory ranges. Introduce separate for_each_mem_region() and for_each_reserved_mem_region() to improve encapsulation of memblock internals from its users. Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Baoquan He <[email protected]> Acked-by: Ingo Molnar <[email protected]> [x86] Acked-by: Thomas Bogendoerfer <[email protected]> [MIPS] Acked-by: Miguel Ojeda <[email protected]> [.clang-format] Cc: Andy Lutomirski <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Daniel Axtens <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Emil Renner Berthing <[email protected]> Cc: Hari Bathini <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Marek Szyprowski <[email protected]> Cc: Max Filippov <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michal Simek <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Russell King <[email protected]> Cc: Stafford Horne <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yoshinori Sato <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13	mm/memory_hotplug: introduce default phys_to_target_node() implementation	Dan Williams	1	-1/+0
	In preparation to set a fallback value for dev_dax->target_node, introduce generic fallback helpers for phys_to_target_node() A generic implementation based on node-data or memblock was proposed, but as noted by Mike: "Here again, I would prefer to add a weak default for phys_to_target_node() because the "generic" implementation is not really generic. The fallback to reserved ranges is x86 specfic because on x86 most of the reserved areas is not in memblock.memory. AFAIK, no other architecture does this." The info message in the generic memory_add_physaddr_to_nid() implementation is fixed up to properly reflect that memory_add_physaddr_to_nid() communicates "online" node info and phys_to_target_node() indicates "target / to-be-onlined" node info. [[email protected]: fix CONFIG_MEMORY_HOTPLUG=n build] Link: https://lkml.kernel.org/r/202008252130.7YrHIyMI%[email protected] Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Jia He <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Jason Yan <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Vivek Goyal <[email protected]> Link: https://lkml.kernel.org/r/159643097768.4062302.3135192588966888630.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13	x86/numa: add 'nohmat' option	Dan Williams	1	-0/+2
	Disable parsing of the HMAT for debug, to workaround broken platform instances, or cases where it is otherwise not wanted. [[email protected]: fix build when CONFIG_ACPI is not set] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Jason Yan <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Vivek Goyal <[email protected]> Link: https://lkml.kernel.org/r/159643095540.4062302.732962081968036212.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13	x86/numa: cleanup configuration dependent command-line options	Dan Williams	1	-6/+2
	Patch series "device-dax: Support sub-dividing soft-reserved ranges", v5. The device-dax facility allows an address range to be directly mapped through a chardev, or optionally hotplugged to the core kernel page allocator as System-RAM. It is the mechanism for converting persistent memory (pmem) to be used as another volatile memory pool i.e. the current Memory Tiering hot topic on linux-mm. In the case of pmem the nvdimm-namespace-label mechanism can sub-divide it, but that labeling mechanism is not available / applicable to soft-reserved ("EFI specific purpose") memory [3]. This series provides a sysfs-mechanism for the daxctl utility to enable provisioning of volatile-soft-reserved memory ranges. The motivations for this facility are: 1/ Allow performance differentiated memory ranges to be split between kernel-managed and directly-accessed use cases. 2/ Allow physical memory to be provisioned along performance relevant address boundaries. For example, divide a memory-side cache [4] along cache-color boundaries. 3/ Parcel out soft-reserved memory to VMs using device-dax as a security / permissions boundary [5]. Specifically I have seen people (ab)using memmap=nn!ss (mark System-RAM as Persistent Memory) just to get the device-dax interface on custom address ranges. A follow-on for the VM use case is to teach device-dax to dynamically allocate 'struct page' at runtime to reduce the duplication of 'struct page' space in both the guest and the host kernel for the same physical pages. [2]: http://lore.kernel.org/r/[email protected] [3]: http://lore.kernel.org/r/157309097008.1579826.12818463304589384434.stgit@dwillia2-desk3.amr.corp.intel.com [4]: http://lore.kernel.org/r/154899811738.3165233.12325692939590944259.stgit@dwillia2-desk3.amr.corp.intel.com [5]: http://lore.kernel.org/r/[email protected] This patch (of 23): In preparation for adding a new numa= option clean up the existing ones to avoid ifdefs in numa_setup(), and provide feedback when the option is numa=fake= option is invalid due to kernel config. The same does not need to be done for numa=noacpi, since the capability is already hard disabled at compile-time. Suggested-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Jason Yan <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Vivek Goyal <[email protected]> Link: https://lkml.kernel.org/r/160106109960.30709.7379926726669669398.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lkml.kernel.org/r/159643094279.4062302.17779410714418721328.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lkml.kernel.org/r/159643094925.4062302.14979872973043772305.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-02	x86: Support Generic Initiator only proximity domains	Jonathan Cameron	1	-0/+21
	In common with memoryless domains only register GI domains if the proximity node is not online. If a domain is already a memory containing domain, or a memoryless domain there is nothing to do just because it also contains a Generic Initiator. Signed-off-by: Jonathan Cameron <[email protected]> Acked-by: Borislav Petkov <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2020-08-12	mm/memory_hotplug: introduce default dummy memory_add_physaddr_to_nid()	Jia He	1	-1/+0
	This is to introduce a general dummy helper. memory_add_physaddr_to_nid() is a fallback option to get the nid in case NUMA_NO_NID is detected. After this patch, arm64/sh/s390 can simply use the general dummy version. PowerPC/x86/ia64 will still use their specific version. This is the preparation to set a fallback value for dev_dax->target_node. Signed-off-by: Jia He <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Cc: Dan Williams <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Tony Luck <[email protected]> Cc: Fenghua Yu <[email protected]> Cc: Yoshinori Sato <[email protected]> Cc: Rich Felker <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Dave Jiang <[email protected]> Cc: Baoquan He <[email protected]> Cc: Chuhong Yuan <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Logan Gunthorpe <[email protected]> Cc: Masahiro Yamada <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Kaly Xin <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-07-16	x86/mm/numa: Remove uninitialized_var() usage	Kees Cook	1	-9/+9
	Using uninitialized_var() is dangerous as it papers over real bugs[1] (or can in the future), and suppresses unrelated compiler warnings (e.g. "unused variable"). If the compiler thinks it is uninitialized, either simply initialize the variable or make compiler changes. As a precursor to removing[2] this[3] macro[4], refactor code to avoid its need. The original reason for its use here was to work around the #ifdef being the only place the variable was used. This is better expressed using IS_ENABLED() and a new code block where the variable can be used unconditionally. [1] https://lore.kernel.org/lkml/[email protected]/ [2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/ [3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/ [4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/ Fixes: 1e01979c8f50 ("x86, numa: Implement pfn -> nid mapping granularity check") Signed-off-by: Kees Cook <[email protected]>
2020-06-03	mm: rename free_area_init_node() to free_area_init_memoryless_node()	Mike Rapoport	1	-4/+1
	free_area_init_node() is only used by x86 to initialize a memory-less nodes. Make its name reflect this and drop all the function parameters except node ID as they are anyway zero. Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Hoan Tran <[email protected]> [arm64] Cc: Baoquan He <[email protected]> Cc: Brian Cain <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Greentime Hu <[email protected]> Cc: Greg Ungerer <[email protected]> Cc: Guan Xuetao <[email protected]> Cc: Guo Ren <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Helge Deller <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Ley Foon Tan <[email protected]> Cc: Mark Salter <[email protected]> Cc: Matt Turner <[email protected]> Cc: Max Filippov <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Michal Simek <[email protected]> Cc: Nick Hu <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Rich Felker <[email protected]> Cc: Russell King <[email protected]> Cc: Stafford Horne <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Tony Luck <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Yoshinori Sato <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-06-03	mm: memblock: replace dereferences of memblock_region.nid with API calls	Mike Rapoport	1	-2/+4
	Patch series "mm: rework free_area_init*() funcitons". After the discussion [1] about removal of CONFIG_NODES_SPAN_OTHER_NODES and CONFIG_HAVE_MEMBLOCK_NODE_MAP options, I took it a bit further and updated the node/zone initialization. Since all architectures have memblock, it is possible to use only the newer version of free_area_init_node() that calculates the zone and node boundaries based on memblock node mapping and architectural limits on possible zone PFNs. The architectures that still determined zone and hole sizes can be switched to the generic code and the old code that took those zone and hole sizes can be simply removed. And, since it all started from the removal of CONFIG_NODES_SPAN_OTHER_NODES, the memmap_init() is now updated to iterate over memblocks and so it does not need to perform early_pfn_to_nid() query for every PFN. [1] https://lore.kernel.org/lkml/[email protected] This patch (of 21): There are several places in the code that directly dereference memblock_region.nid despite this field being defined only when CONFIG_HAVE_MEMBLOCK_NODE_MAP=y. Replace these with calls to memblock_get_region_nid() to improve code robustness and to avoid possible breakage when CONFIG_HAVE_MEMBLOCK_NODE_MAP will be removed. Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Tested-by: Hoan Tran <[email protected]> [arm64] Reviewed-by: Baoquan He <[email protected]> Cc: Brian Cain <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Greentime Hu <[email protected]> Cc: Greg Ungerer <[email protected]> Cc: Guan Xuetao <[email protected]> Cc: Guo Ren <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Helge Deller <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Ley Foon Tan <[email protected]> Cc: Mark Salter <[email protected]> Cc: Matt Turner <[email protected]> Cc: Max Filippov <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Michal Simek <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Nick Hu <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Rich Felker <[email protected]> Cc: Russell King <[email protected]> Cc: Stafford Horne <[email protected]> Cc: Thomas Bogendoerfer <[email protected]> Cc: Tony Luck <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Yoshinori Sato <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-02-18	x86/NUMA: Provide a range-to-target_node lookup facility	Dan Williams	1	-10/+51
	The DEV_DAX_KMEM facility is a generic mechanism to allow device-dax instances, fronting performance-differentiated-memory like pmem, to be added to the System RAM pool. The NUMA node for that hot-added memory is derived from the device-dax instance's 'target_node' attribute. Recall that the 'target_node' is the ACPI-PXM-to-node translation for memory when it comes online whereas the 'numa_node' attribute of the device represents the closest online cpu node. Presently useful target_node information from the ACPI SRAT is discarded with the expectation that "Reserved" memory will never be onlined. Now, DEV_DAX_KMEM violates that assumption, there is a need to retain the translation. Move, rather than discard, numa_memblk data to a secondary array that memory_add_physaddr_to_target_node() may consider at a later point in time. Cc: Dave Hansen <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: <[email protected]> Cc: Andrew Morton <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Michal Hocko <[email protected]> Reported-by: kbuild test robot <[email protected]> Reviewed-by: Ingo Molnar <[email protected]> Signed-off-by: Dan Williams <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/158188326978.894464.217282995221175417.stgit@dwillia2-desk3.amr.corp.intel.com
2020-02-17	x86/mm: Introduce CONFIG_NUMA_KEEP_MEMINFO	Dan Williams	1	-5/+1
	Currently x86 numa_meminfo is marked __initdata in the CONFIG_MEMORY_HOTPLUG=n case. In support of a new facility to allow drivers to map reserved memory to a 'target_node' (phys_to_target_node()), add support for removing the __initdata designation for those users. Both memory hotplug and phys_to_target_node() users select CONFIG_NUMA_KEEP_MEMINFO to tell the arch to maintain its physical address to NUMA mapping infrastructure post init. Cc: Dave Hansen <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: <[email protected]> Cc: Andrew Morton <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Michal Hocko <[email protected]> Reviewed-by: Ingo Molnar <[email protected]> Signed-off-by: Dan Williams <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/158188326422.894464.15742054998046628934.stgit@dwillia2-desk3.amr.corp.intel.com
2019-11-18	x86: Fix typos in comments	Cao jin	1	-1/+1
	BIOSen -> BIOSes; paing -> paging. Append to 640 its proper unit "Kb". encomapssing -> encompassing. [ bp: Merge into a single patch, fix one more typo, massage. ] Signed-off-by: Cao jin <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Baoquan He <[email protected]> Cc: Dave Young <[email protected]> Cc: David Howells <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Robert Richter <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Thomas Lendacky <[email protected]> Cc: x86-ml <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-09-05	x86/mm: Fix cpumask_of_node() error condition	Peter Zijlstra	1	-2/+2
	When CONFIG_DEBUG_PER_CPU_MAPS=y we validate that the @node argument of cpumask_of_node() is a valid node_id. It however forgets to check for negative numbers. Fix this by explicitly casting to unsigned int. (unsigned)node >= nr_node_ids verifies: 0 <= node < nr_node_ids Also ammend the error message to match the condition. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Yunsheng Lin <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2019-05-21	treewide: Add SPDX license identifier for missed files	Thomas Gleixner	1	-0/+1
	Add SPDX license identifiers to all files which: - Have no license information of any form - Have EXPORT_.*_SYMBOL_GPL inside which was used in the initial scan/conversion to ignore the file These files fall under the project license, GPL v2 only. The resulting SPDX license identifier is: GPL-2.0-only Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2019-03-12	memblock: drop __memblock_alloc_base()	Mike Rapoport	1	-8/+4
	The __memblock_alloc_base() function tries to allocate a memory up to the limit specified by its max_addr parameter. Depending on the value of this parameter, the __memblock_alloc_base() can is replaced with the appropriate memblock_phys_alloc*() variant. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport <[email protected]> Acked-by: Rob Herring <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Christophe Leroy <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Greentime Hu <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Guan Xuetao <[email protected]> Cc: Guo Ren <[email protected]> Cc: Guo Ren <[email protected]> [c-sky] Cc: Heiko Carstens <[email protected]> Cc: Juergen Gross <[email protected]> [Xen] Cc: Mark Salter <[email protected]> Cc: Matt Turner <[email protected]> Cc: Max Filippov <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michal Simek <[email protected]> Cc: Paul Burton <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Rich Felker <[email protected]> Cc: Rob Herring <[email protected]> Cc: Russell King <[email protected]> Cc: Stafford Horne <[email protected]> Cc: Tony Luck <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Yoshinori Sato <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-03-05	numa: make "nr_node_ids" unsigned int	Alexey Dobriyan	1	-2/+2
	Number of NUMA nodes can't be negative. This saves a few bytes on x86_64: add/remove: 0/0 grow/shrink: 4/21 up/down: 27/-265 (-238) Function old new delta hv_synic_alloc.cold 88 110 +22 prealloc_shrinker 260 262 +2 bootstrap 249 251 +2 sched_init_numa 1566 1567 +1 show_slab_objects 778 777 -1 s_show 1201 1200 -1 kmem_cache_init 346 345 -1 __alloc_workqueue_key 1146 1145 -1 mem_cgroup_css_alloc 1614 1612 -2 __do_sys_swapon 4702 4699 -3 __list_lru_init 655 651 -4 nic_probe 2379 2374 -5 store_user_store 118 111 -7 red_zone_store 106 99 -7 poison_store 106 99 -7 wq_numa_init 348 338 -10 __kmem_cache_empty 75 65 -10 task_numa_free 186 173 -13 merge_across_nodes_store 351 336 -15 irq_create_affinity_masks 1261 1246 -15 do_numa_crng_init 343 321 -22 task_numa_fault 4760 4737 -23 swapfile_init 179 156 -23 hv_synic_alloc 536 492 -44 apply_wqattrs_prepare 746 695 -51 Link: http://lkml.kernel.org/r/20190201223029.GA15820@avx2 Signed-off-by: Alexey Dobriyan <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-10-31	mm: remove include/linux/bootmem.h	Mike Rapoport	1	-1/+0
	Move remaining definitions and declarations from include/linux/bootmem.h into include/linux/memblock.h and remove the redundant header. The includes were replaced with the semantic patch below and then semi-automated removal of duplicated '#include <linux/memblock.h> @@ @@ - #include <linux/bootmem.h> + #include <linux/memblock.h> [[email protected]: dma-direct: fix up for the removal of linux/bootmem.h] Link: http://lkml.kernel.org/r/[email protected] [[email protected]: powerpc: fix up for removal of linux/bootmem.h] Link: http://lkml.kernel.org/r/[email protected] [[email protected]: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport <[email protected]> Signed-off-by: Stephen Rothwell <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Chris Zankel <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Greentime Hu <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Guan Xuetao <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Jonas Bonn <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Ley Foon Tan <[email protected]> Cc: Mark Salter <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Matt Turner <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michal Simek <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Burton <[email protected]> Cc: Richard Kuo <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Rich Felker <[email protected]> Cc: Russell King <[email protected]> Cc: Serge Semin <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tony Luck <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Yoshinori Sato <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-10-31	memblock: rename memblock_alloc{_nid,_try_nid} to memblock_phys_alloc*	Mike Rapoport	1	-1/+1
	Make it explicit that the caller gets a physical address rather than a virtual one. This will also allow using meblock_alloc prefix for memblock allocations returning virtual address, which is done in the following patches. The conversion is done using the following semantic patch: @@ expression e1, e2, e3; @@ ( - memblock_alloc(e1, e2) + memblock_phys_alloc(e1, e2) \| - memblock_alloc_nid(e1, e2, e3) + memblock_phys_alloc_nid(e1, e2, e3) \| - memblock_alloc_try_nid(e1, e2, e3) + memblock_phys_alloc_try_nid(e1, e2, e3) ) Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Mike Rapoport <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Chris Zankel <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Cc: Greentime Hu <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Guan Xuetao <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Jonas Bonn <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Ley Foon Tan <[email protected]> Cc: Mark Salter <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: Matt Turner <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Michal Simek <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Burton <[email protected]> Cc: Richard Kuo <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Rich Felker <[email protected]> Cc: Russell King <[email protected]> Cc: Serge Semin <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tony Luck <[email protected]> Cc: Vineet Gupta <[email protected]> Cc: Yoshinori Sato <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-05-13	x86: Remove pr_fmt duplicate logging prefixes	Joe Perches	1	-11/+11
	Converting pr_fmt from a default simple #define to use KBUILD_MODNAME added some duplicate prefixes. Remove the duplicate prefixes. Signed-off-by: Joe Perches <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Link: https://lkml.kernel.org/r/e7b709a2b040af7faa81b0aa2c3a125aed628a82.1525964383.git.joe@perches.com
2017-04-11	Merge branch 'x86/boot' into x86/mm, to avoid conflict	Ingo Molnar	1	-1/+1
	There's a conflict between ongoing level-5 paging support and the E820 rewrite. Since the E820 rewrite is essentially ready, merge it into x86/mm to reduce tree conflicts. Signed-off-by: Ingo Molnar <[email protected]>
2017-04-08	Revert "x86/mm/numa: Remove numa_nodemask_from_meminfo()"	Thomas Gleixner	1	-1/+20
	This reverts commit 474aeffd88b87746a75583f356183d5c6caa4213 due to testing failures. Reported-by: "Kirill A. Shutemov" <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: Wei Yang <[email protected]> Cc: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2017-04-03	x86/mm/numa: Remove numa_nodemask_from_meminfo()	Wei Yang	1	-20/+1
	numa_nodemask_from_meminfo() generates a nodemask of nodes which have memory according to a meminfo descriptor. The two callsites of that function both set bits in copies of the numa_nodes_parsed nodemask. In both cases, the information in supplied numa_meminfo is a subset of numa_nodes_parsed. So setting those bits again is not really necessary. Here are the three call paths which show that the supplied numa_meminfo argument describes memory regions in nodes which are already in numa_nodes_parsed: x86_numa_init() numa_init() Case 1: acpi_numa_init() acpi_parse_memory_affinity() numa_add_memblk() node_set(numa_nodes_parsed) acpi_parse_slit() acpi_numa_slit_init() numa_set_distance() numa_alloc_distance() numa_nodemask_from_meminfo() Case 2: amd_numa_init() numa_add_memblk() node_set(numa_nodes_parsed) Case 3 dummy_numa_init() node_set(numa_nodes_parsed) numa_add_memblk() numa_register_memblks() numa_nodemask_from_meminfo() Thus, in all three cases, the respective bit in numa_nodes_parsed is set, which means it is not necessary to set it again in a copy of numa_nodes_parsed. So remove that function. Signed-off-by: Wei Yang <[email protected]> Cc: x86-ml <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Heavily massage commit message. ] Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2017-04-03	x86/mm/numa: Improve alloc_node_data() error path message	Wei Yang	1	-2/+2
	alloc_node_data() tries to allocate from the local node first and, if that attempt fails, falls back to any node. Improve the error message to issue the initial node for ease during debugging. Fix a typo in the comments, while at it. Signed-off-by: Wei Yang <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Masssage commit message. ] Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]>
2017-01-28	x86/boot/e820: Move asm/e820.h to asm/e820/api.h	Ingo Molnar	1	-1/+1
	In line with asm/e820/types.h, move the e820 API declarations to asm/e820/api.h and update all usage sites. This is just a mechanical, obviously correct move & replace patch, there will be subsequent changes to clean up the code and to make better use of the new header organization. Cc: Alex Thorlton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Dan Williams <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Huang, Ying <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Paul Jackson <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wei Yang <[email protected]> Cc: Yinghai Lu <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-12-15	ACPI/NUMA: Do not map pxm to node when NUMA is turned off	Boris Ostrovsky	1	-1/+1
	acpi_map_pxm_to_node() unconditially maps nodes even when NUMA is turned off. So acpi_get_node() might return a node > 0, which is fatal when NUMA is disabled as the rest of the kernel assumes that only node 0 exists. Expose numa_off to the acpi code and return NUMA_NO_NODE when it's set. Signed-off-by: Boris Ostrovsky <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2016-09-21	x86/numa: Online memory-less nodes at boot time	Tang Chen	1	-14/+13
	For now, x86 does not support memory-less node. A node without memory will not be onlined, and the cpus on it will be mapped to the other online nodes with memory in init_cpu_to_node(). The reason of doing this is to ensure each cpu has mapped to a node with memory, so that it will be able to allocate local memory for that cpu. But we don't have to do it in this way. In this series of patches, we are going to construct cpu <-> node mapping for all possible cpus at boot time, which is a persistent mapping. It means that the cpu will be mapped to the node which it belongs to, and will never be changed. If a node has only cpus but no memory, the cpus on it will be mapped to a memory-less node. And the memory-less node should be onlined. Allocate pgdats for all memory-less nodes and online them at boot time. Then build zonelists for these nodes. As a result, when cpus on these memory-less nodes try to allocate memory from local node, it will automatically fall back to the proper zones in the zonelists. Signed-off-by: Zhu Guihua <[email protected]> Signed-off-by: Dou Liyang <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Tang Chen <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2016-08-01	Merge branch 'x86-headers-for-linus' of ↵	Linus Torvalds	1	-1/+0
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 header cleanups from Ingo Molnar: "This tree is a cleanup of the x86 tree reducing spurious uses of module.h - which should improve build performance a bit" * 'x86-headers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, crypto: Restore MODULE_LICENSE() to glue_helper.c so it loads x86/apic: Remove duplicated include from probe_64.c x86/ce4100: Remove duplicated include from ce4100.c x86/headers: Include spinlock_types.h in x8664_ksyms_64.c for missing spinlock_t x86/platform: Delete extraneous MODULE_* tags fromm ts5500 x86: Audit and remove any remaining unnecessary uses of module.h x86/kvm: Audit and remove any unnecessary uses of module.h x86/xen: Audit and remove any unnecessary uses of module.h x86/platform: Audit and remove any unnecessary uses of module.h x86/lib: Audit and remove any unnecessary uses of module.h x86/kernel: Audit and remove any unnecessary uses of module.h x86/mm: Audit and remove any unnecessary uses of module.h x86: Don't use module.h just for AUTHOR / LICENSE tags
2016-07-14	x86/mm: Audit and remove any unnecessary uses of module.h	Paul Gortmaker	1	-1/+0
	Historically a lot of these existed because we did not have a distinction between what was modular code and what was providing support to modules via EXPORT_SYMBOL and friends. That changed when we forked out support for the latter into the export.h file. This means we should be able to reduce the usage of module.h in code that is obj-y Makefile or bool Kconfig. The advantage in doing so is that module.h itself sources about 15 other headers; adding significantly to what we feed cpp, and it can obscure what headers we are effectively using. Since module.h was the source for init.h (for __init) and for export.h (for EXPORT_SYMBOL) we consider each obj-y/bool instance for the presence of either and replace accordingly where needed. Note that some bool/obj-y instances remain since module.h is the header for some exception table entry stuff, and for things like __init_or_module (code that is tossed when MODULES=n). Signed-off-by: Paul Gortmaker <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-05-30	ACPI / NUMA: move bad_srat() and srat_disabled() to drivers/acpi/numa.c	David Daney	1	-1/+1
	bad_srat() and srat_disabled() are shared by x86 and follow-on arm64 patches. Move them to drivers/acpi/numa.c in preparation for arm64 support. Signed-off-by: Hanjun Guo <[email protected]> Signed-off-by: Robert Richter <[email protected]> [[email protected] moved definitions to drivers/acpi/numa.c] Signed-off-by: David Daney <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2016-05-19	include/linux/nodemask.h: create next_node_in() helper	Andrew Morton	1	-3/+1
	Lots of code does node = next_node(node, XXX); if (node == MAX_NUMNODES) node = first_node(XXX); so create next_node_in() to do this and use it in various places. [[email protected]: use next_node_in() helper] Acked-by: Vlastimil Babka <[email protected]> Acked-by: Michal Hocko <[email protected]> Signed-off-by: Michal Hocko <[email protected]> Cc: Xishi Qiu <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: David Rientjes <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Laura Abbott <[email protected]> Cc: Hui Zhu <[email protected]> Cc: Wang Xiaoqiang <[email protected]> Cc: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-02-08	x86/mm/numa: Check for failures in numa_clear_kernel_node_hotplug()	Ingo Molnar	1	-1/+3
	numa_clear_kernel_node_hotplug() uses memblock_set_node() without checking for failures. memblock_set_node() is a complex function that might extend the memblock array - which extension might fail - so check for this possibility. It's not supposed to happen (because realistically if we have so little memory that this fails then we likely won't be able to boot anyway), but do the check nevertheless. Cc: Andrew Morton <[email protected]> Cc: Brad Spengler <[email protected]> Cc: Chen Tang <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: PaX Team <[email protected]> Cc: Taku Izumi <[email protected]> Cc: Tang Chen <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wen Congyang <[email protected]> Cc: Yasuaki Ishimatsu <[email protected]> Cc: y14sg1 <[email protected]> Cc: Zhang Yanfei <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-02-08	x86/mm/numa: Clean up numa_clear_kernel_node_hotplug()	Ingo Molnar	1	-23/+42
	So we fixed an overflow bug in numa_clear_kernel_node_hotplug(): 2b54ab3c66d4 ("x86/mm/numa: Fix memory corruption on 32-bit NUMA kernels") ... and the bug was indirectly caused by poor coding style, such as using start/end local variables unnecessarily, which lost the physaddr_t type. So make the code more readable and try to fully comment all the thinking behind the logic. No change in functionality. Cc: Andrew Morton <[email protected]> Cc: Brad Spengler <[email protected]> Cc: Chen Tang <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: PaX Team <[email protected]> Cc: Taku Izumi <[email protected]> Cc: Tang Chen <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wen Congyang <[email protected]> Cc: Yasuaki Ishimatsu <[email protected]> Cc: y14sg1 <[email protected]> Cc: Zhang Yanfei <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-02-08	x86/mm/numa: Fix 32-bit memblock range truncation bug on 32-bit NUMA kernels	Ingo Molnar	1	-1/+1
	The following commit: a0acda917284 ("acpi, numa, mem_hotplug: mark all nodes the kernel resides un-hotpluggable") Introduced numa_clear_kernel_node_hotplug(), which function is executed during early bootup, and which marks all currently reserved memblock regions as hot-memory-unswappable as well. y14sg1 <[email protected]> reported that when running 32-bit NUMA kernels, the grsecurity/PAX kernel patch flagged a size overflow in this function: PAX: size overflow detected in function x86_numa_init arch/x86/mm/numa.c:691 [...] ... the reason for the overflow is that memblock_clear_hotplug() takes physical addresses as arguments, while the start/end variables used by numa_clear_kernel_node_hotplug() are 'unsigned long', which is 32-bit on PAE kernels, but which has 64-bit physical addresses. So on 32-bit PAE kernels that have physical memory above the 4GB boundary, we truncate a 64-bit physical address range to 32 bits and pass it to memblock_clear_hotplug(), which at minimum prevents the original memory-hotplug bugfix from working, but might have other side effects as well. The fix is to use the proper type to handle physical addresses, phys_addr_t. Reported-by: y14sg1 <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Brad Spengler <[email protected]> Cc: Chen Tang <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: PaX Team <[email protected]> Cc: Taku Izumi <[email protected]> Cc: Tang Chen <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wen Congyang <[email protected]> Cc: Yasuaki Ishimatsu <[email protected]> Cc: Zhang Yanfei <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-09-08	mem-hotplug: handle node hole when initializing numa_meminfo.	Tang Chen	1	-2/+4
	When parsing SRAT, all memory ranges are added into numa_meminfo. In numa_init(), before entering numa_cleanup_meminfo(), all possible memory ranges are in numa_meminfo. And numa_cleanup_meminfo() removes all ranges over max_pfn or empty. But, this only works if the nodes are continuous. Let's have a look at the following example: We have an SRAT like this: SRAT: Node 0 PXM 0 [mem 0x00000000-0x5fffffff] SRAT: Node 0 PXM 0 [mem 0x100000000-0x1ffffffffff] SRAT: Node 1 PXM 1 [mem 0x20000000000-0x3ffffffffff] SRAT: Node 4 PXM 2 [mem 0x40000000000-0x5ffffffffff] hotplug SRAT: Node 5 PXM 3 [mem 0x60000000000-0x7ffffffffff] hotplug SRAT: Node 2 PXM 4 [mem 0x80000000000-0x9ffffffffff] hotplug SRAT: Node 3 PXM 5 [mem 0xa0000000000-0xbffffffffff] hotplug SRAT: Node 6 PXM 6 [mem 0xc0000000000-0xdffffffffff] hotplug SRAT: Node 7 PXM 7 [mem 0xe0000000000-0xfffffffffff] hotplug On boot, only node 0,1,2,3 exist. And the numa_meminfo will look like this: numa_meminfo.nr_blks = 9 1. on node 0: [0, 60000000] 2. on node 0: [100000000, 20000000000] 3. on node 1: [20000000000, 40000000000] 4. on node 4: [40000000000, 60000000000] 5. on node 5: [60000000000, 80000000000] 6. on node 2: [80000000000, a0000000000] 7. on node 3: [a0000000000, a0800000000] 8. on node 6: [c0000000000, a0800000000] 9. on node 7: [e0000000000, a0800000000] And numa_cleanup_meminfo() will merge 1 and 2, and remove 8,9 because the end address is over max_pfn, which is a0800000000. But 4 and 5 are not removed because their end addresses are less then max_pfn. But in fact, node 4 and 5 don't exist. In a word, numa_cleanup_meminfo() is not able to handle holes between nodes. Since memory ranges in node 4 and 5 are in numa_meminfo, in numa_register_memblks(), node 4 and 5 will be mistakenly set to online. If you run lscpu, it will show: NUMA node0 CPU(s): 0-14,128-142 NUMA node1 CPU(s): 15-29,143-157 NUMA node2 CPU(s): NUMA node3 CPU(s): NUMA node4 CPU(s): 62-76,190-204 NUMA node5 CPU(s): 78-92,206-220 In this patch, we use memblock_overlaps_region() to check if ranges in numa_meminfo overlap with ranges in memory_block. Since memory_block contains all available memory at boot time, if they overlap, it means the ranges exist. If not, then remove them from numa_meminfo. After this patch, lscpu will show: NUMA node0 CPU(s): 0-14,128-142 NUMA node1 CPU(s): 15-29,143-157 NUMA node4 CPU(s): 62-76,190-204 NUMA node5 CPU(s): 78-92,206-220 Signed-off-by: Tang Chen <[email protected]> Reviewed-by: Yasuaki Ishimatsu <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Luiz Capitulino <[email protected]> Cc: Xishi Qiu <[email protected]> Cc: Will Deacon <[email protected]> Cc: Vladimir Murzin <[email protected]> Cc: Fabian Frederick <[email protected]> Cc: Alexander Kuleshov <[email protected]> Cc: Baoquan He <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-04-07	x86/mm/numa: Fix kernel stack corruption in ↵	Dave Young	1	-2/+9
	numa_init()->numa_clear_kernel_node_hotplug() I got below kernel panic during kdump test on Thinkpad T420 laptop: [ 0.000000] No NUMA configuration found [ 0.000000] Faking a node at [mem 0x0000000000000000-0x0000000037ba4fff] [ 0.000000] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff81d21910 ... [ 0.000000] Call Trace: [ 0.000000] [<ffffffff817c2a26>] dump_stack+0x45/0x57 [ 0.000000] [<ffffffff817bc8d2>] panic+0xd0/0x204 [ 0.000000] [<ffffffff81d21910>] ? numa_clear_kernel_node_hotplug+0xe6/0xf2 [ 0.000000] [<ffffffff8107741b>] __stack_chk_fail+0x1b/0x20 [ 0.000000] [<ffffffff81d21910>] numa_clear_kernel_node_hotplug+0xe6/0xf2 [ 0.000000] [<ffffffff81d21e5d>] numa_init+0x1a5/0x520 [ 0.000000] [<ffffffff81d222b1>] x86_numa_init+0x19/0x3d [ 0.000000] [<ffffffff81d22460>] initmem_init+0x9/0xb [ 0.000000] [<ffffffff81d0d00c>] setup_arch+0x94f/0xc82 [ 0.000000] [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120 [ 0.000000] [<ffffffff817bd0bb>] ? printk+0x55/0x6b [ 0.000000] [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120 [ 0.000000] [<ffffffff81d05d9b>] start_kernel+0xe8/0x4d6 [ 0.000000] [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120 [ 0.000000] [<ffffffff81d05120>] ? early_idt_handlers+0x120/0x120 [ 0.000000] [<ffffffff81d055ee>] x86_64_start_reservations+0x2a/0x2c [ 0.000000] [<ffffffff81d05751>] x86_64_start_kernel+0x161/0x184 [ 0.000000] ---[ end Kernel panic - not syncing: stack-protector: Kernel sta This is caused by writing over the end of numa mask bitmap in numa_clear_kernel_node(). numa_clear_kernel_node() tries to set the node id in a mask bitmap, by iterating all reserved regions and assuming that every region has a valid nid. This assumption is not true because there's an exception for some graphic memory quirks. See trim_snb_memory() in arch/x86/kernel/setup.c It is easily to reproduce the bug in the kdump kernel because kdump kernel use pre-reserved memory instead of the whole memory, but kexec pass other reserved memory ranges to 2nd kernel as well. like below in my test: kdump kernel ram 0x2d000000 - 0x37bfffff One of the reserved regions: 0x40000000 - 0x40100000 which includes 0x40004000, a page excluded in trim_snb_memory(). For this memblock reserved region the nid is not set, it is still default value MAX_NUMNODES. later node_set will set bit MAX_NUMNODES thus stack corruption happen. This also happens when booting with mem= kernel commandline during my test. Fixing it by adding a check, do not call node_set in case nid is MAX_NUMNODES. Signed-off-by: Dave Young <[email protected]> Reviewed-by: Yasuaki Ishimatsu <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-02-13	x86: use %*pb[l] to print bitmaps including cpumasks and nodemasks	Tejun Heo	1	-4/+2
	printk and friends can now format bitmaps using '%pb[l]'. cpumask and nodemask also provide cpumask_pr_args() and nodemask_pr_args() respectively which can be used to generate the two printf arguments necessary to format the specified cpu/nodemask. Unnecessary buffer size calculation and condition on the lenght removed from intel_cacheinfo.c::show_shared_cpu_map_func(). * uv_nmi_nr_cpus_pr() got overly smart and implemented "..." abbreviation if the output stretched over the predefined 1024 byte buffer. Replaced with plain printk. Signed-off-by: Tejun Heo <[email protected]> Cc: Mike Travis <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-10-14	Merge branch 'akpm' (patches from Andrew Morton)	Linus Torvalds	1	-44/+45
	Merge second patch-bomb from Andrew Morton: - a few hotfixes - drivers/dma updates - MAINTAINERS updates - Quite a lot of lib/ updates - checkpatch updates - binfmt updates - autofs4 - drivers/rtc/ - various small tweaks to less used filesystems - ipc/ updates - kernel/watchdog.c changes * emailed patches from Andrew Morton <[email protected]>: (135 commits) mm: softdirty: enable write notifications on VMAs after VM_SOFTDIRTY cleared kernel/param: consolidate __{start,stop}___param[] in <linux/moduleparam.h> ia64: remove duplicate declarations of __per_cpu_start[] and __per_cpu_end[] frv: remove unused declarations of __start___ex_table and __stop___ex_table kvm: ensure hard lockup detection is disabled by default kernel/watchdog.c: control hard lockup detection default staging: rtl8192u: use %pEn to escape buffer staging: rtl8192e: use %pEn to escape buffer staging: wlan-ng: use %pEhp to print SN lib80211: remove unused print_ssid() wireless: hostap: proc: print properly escaped SSID wireless: ipw2x00: print SSID via %pE wireless: libertas: print esaped string via %pE lib/vsprintf: add %pE[achnops] format specifier lib / string_helpers: introduce string_escape_mem() lib / string_helpers: refactoring the test suite lib / string_helpers: move documentation to c-file include/linux: remove strict_strto* definitions arch/x86/mm/numa.c: fix boot failure when all nodes are hotpluggable fs: check bh blocknr earlier when searching lru ...
2014-10-14	arch/x86/mm/numa.c: fix boot failure when all nodes are hotpluggable	Xishi Qiu	1	-44/+45
	If all the nodes are marked hotpluggable, alloc node data will fail. Because __next_mem_range_rev() will skip the hotpluggable memory regions. numa_clear_kernel_node_hotplug() is called after alloc node data. numa_init() ... ret = init_func(); // this will mark hotpluggable flag from SRAT ... memblock_set_bottom_up(false); ... ret = numa_register_memblks(&numa_meminfo); // this will alloc node data(pglist_data) ... numa_clear_kernel_node_hotplug(); // in case all the nodes are hotpluggable ... numa_register_memblks() setup_node_data() memblock_find_in_range_node() __memblock_find_range_top_down() for_each_mem_range_rev() __next_mem_range_rev() This patch moves numa_clear_kernel_node_hotplug() into numa_register_memblks(), clear kernel node hotpluggable flag before alloc node data, then alloc node data won't fail even all the nodes are hotpluggable. [[email protected]: coding-style fixes] Signed-off-by: Xishi Qiu <[email protected]> Cc: Dave Jones <[email protected]> Cc: Tang Chen <[email protected]> Cc: Gu Zheng <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-09-16	x86/mm/numa: Drop dead code and rename setup_node_data() to setup_alloc_data()	Luiz Capitulino	1	-20/+14
	The setup_node_data() function allocates a pg_data_t object, inserts it into the node_data[] array and initializes the following fields: node_id, node_start_pfn and node_spanned_pages. However, a few function calls later during the kernel boot, free_area_init_node() re-initializes those fields, possibly with setup_node_data() is not used. This causes a small glitch when running Linux as a hyperv numa guest: SRAT: PXM 0 -> APIC 0x00 -> Node 0 SRAT: PXM 0 -> APIC 0x01 -> Node 0 SRAT: PXM 1 -> APIC 0x02 -> Node 1 SRAT: PXM 1 -> APIC 0x03 -> Node 1 SRAT: Node 0 PXM 0 [mem 0x00000000-0x7fffffff] SRAT: Node 1 PXM 1 [mem 0x80200000-0xf7ffffff] SRAT: Node 1 PXM 1 [mem 0x100000000-0x1081fffff] NUMA: Node 1 [mem 0x80200000-0xf7ffffff] + [mem 0x100000000-0x1081fffff] -> [mem 0x80200000-0x1081fffff] Initmem setup node 0 [mem 0x00000000-0x7fffffff] NODE_DATA [mem 0x7ffdc000-0x7ffeffff] Initmem setup node 1 [mem 0x80800000-0x1081fffff] NODE_DATA [mem 0x1081ea000-0x1081fdfff] crashkernel: memory value expected [ffffea0000000000-ffffea0001ffffff] PMD -> [ffff88007de00000-ffff88007fdfffff] on node 0 [ffffea0002000000-ffffea00043fffff] PMD -> [ffff880105600000-ffff8801077fffff] on node 1 Zone ranges: DMA [mem 0x00001000-0x00ffffff] DMA32 [mem 0x01000000-0xffffffff] Normal [mem 0x100000000-0x1081fffff] Movable zone start for each node Early memory node ranges node 0: [mem 0x00001000-0x0009efff] node 0: [mem 0x00100000-0x7ffeffff] node 1: [mem 0x80200000-0xf7ffffff] node 1: [mem 0x100000000-0x1081fffff] On node 0 totalpages: 524174 DMA zone: 64 pages used for memmap DMA zone: 21 pages reserved DMA zone: 3998 pages, LIFO batch:0 DMA32 zone: 8128 pages used for memmap DMA32 zone: 520176 pages, LIFO batch:31 On node 1 totalpages: 524288 DMA32 zone: 7672 pages used for memmap DMA32 zone: 491008 pages, LIFO batch:31 Normal zone: 520 pages used for memmap Normal zone: 33280 pages, LIFO batch:7 In this dmesg, the SRAT table reports that the memory range for node 1 starts at 0x80200000. However, the line starting with "Initmem" reports that node 1 memory range starts at 0x80800000. The "Initmem" line is reported by setup_node_data() and is wrong, because the kernel ends up using the range as reported in the SRAT table. This commit drops all that dead code from setup_node_data(), renames it to alloc_node_data() and adds a printk() to free_area_init_node() so that we report a node's memory range accurately. Here's the same dmesg section with this patch applied: SRAT: PXM 0 -> APIC 0x00 -> Node 0 SRAT: PXM 0 -> APIC 0x01 -> Node 0 SRAT: PXM 1 -> APIC 0x02 -> Node 1 SRAT: PXM 1 -> APIC 0x03 -> Node 1 SRAT: Node 0 PXM 0 [mem 0x00000000-0x7fffffff] SRAT: Node 1 PXM 1 [mem 0x80200000-0xf7ffffff] SRAT: Node 1 PXM 1 [mem 0x100000000-0x1081fffff] NUMA: Node 1 [mem 0x80200000-0xf7ffffff] + [mem 0x100000000-0x1081fffff] -> [mem 0x80200000-0x1081fffff] NODE_DATA(0) allocated [mem 0x7ffdc000-0x7ffeffff] NODE_DATA(1) allocated [mem 0x1081ea000-0x1081fdfff] crashkernel: memory value expected [ffffea0000000000-ffffea0001ffffff] PMD -> [ffff88007de00000-ffff88007fdfffff] on node 0 [ffffea0002000000-ffffea00043fffff] PMD -> [ffff880105600000-ffff8801077fffff] on node 1 Zone ranges: DMA [mem 0x00001000-0x00ffffff] DMA32 [mem 0x01000000-0xffffffff] Normal [mem 0x100000000-0x1081fffff] Movable zone start for each node Early memory node ranges node 0: [mem 0x00001000-0x0009efff] node 0: [mem 0x00100000-0x7ffeffff] node 1: [mem 0x80200000-0xf7ffffff] node 1: [mem 0x100000000-0x1081fffff] Initmem setup node 0 [mem 0x00001000-0x7ffeffff] On node 0 totalpages: 524174 DMA zone: 64 pages used for memmap DMA zone: 21 pages reserved DMA zone: 3998 pages, LIFO batch:0 DMA32 zone: 8128 pages used for memmap DMA32 zone: 520176 pages, LIFO batch:31 Initmem setup node 1 [mem 0x80200000-0x1081fffff] On node 1 totalpages: 524288 DMA32 zone: 7672 pages used for memmap DMA32 zone: 491008 pages, LIFO batch:31 Normal zone: 520 pages used for memmap Normal zone: 33280 pages, LIFO batch:7 This commit was tested on a two node bare-metal NUMA machine and Linux as a numa guest on hyperv and qemu/kvm. PS: The wrong memory range reported by setup_node_data() seems to be harmless in the current kernel because it's just not used. However, that bad range is used in kernel 2.6.32 to initialize the old boot memory allocator, which causes a crash during boot. Signed-off-by: Luiz Capitulino <[email protected]> Acked-by: Rik van Riel <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Rientjes <[email protected]> Cc: Yasuaki Ishimatsu <[email protected]> Cc: Yinghai Lu <[email protected]> Cc: Linus Torvalds <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2014-06-04	arch/x86/mm/numa.c: use for_each_memblock()	Emil Medve	1	-3/+3
	Signed-off-by: Emil Medve <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Yinghai Lu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-02-27	x86, platforms: Remove NUMAQ	H. Peter Anvin	1	-4/+0
	The NUMAQ support seems to be unmaintained, remove it. Cc: Paul Gortmaker <[email protected]> Cc: David Rientjes <[email protected]> Acked-by: Paul E. McKenney <[email protected]> Signed-off-by: H. Peter Anvin <[email protected]> Link: http://lkml.kernel.org/r/n/[email protected]
2014-02-06	arch/x86/mm/numa.c: fix array index overflow when synchronizing nid to ↵	Tang Chen	1	-8/+11
	memblock.reserved. The following path will cause array out of bound. memblock_add_region() will always set nid in memblock.reserved to MAX_NUMNODES. In numa_register_memblks(), after we set all nid to correct valus in memblock.reserved, we called setup_node_data(), and used memblock_alloc_nid() to allocate memory, with nid set to MAX_NUMNODES. The nodemask_t type can be seen as a bit array. And the index is 0 ~ MAX_NUMNODES-1. After that, when we call node_set() in numa_clear_kernel_node_hotplug(), the nodemask_t got an index of value MAX_NUMNODES, which is out of [0 ~ MAX_NUMNODES-1]. See below: numa_init() \|---> numa_register_memblks() \| \|---> memblock_set_node(memory) set correct nid in memblock.memory \| \|---> memblock_set_node(reserved) set correct nid in memblock.reserved \| \|...... \| \|---> setup_node_data() \| \|---> memblock_alloc_nid() here, nid is set to MAX_NUMNODES (1024) \|...... \|---> numa_clear_kernel_node_hotplug() \|---> node_set() here, we have an index 1024, and overflowed This patch moves nid setting to numa_clear_kernel_node_hotplug() to fix this problem. Reported-by: Dave Jones <[email protected]> Signed-off-by: Tang Chen <[email protected]> Tested-by: Gu Zheng <[email protected]> Reported-by: Dave Jones <[email protected]> Cc: David Rientjes <[email protected]> Tested-by: Dave Jones <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-02-06	arch/x86/mm/numa.c: initialize numa_kernel_nodes in ↵	Tang Chen	1	-1/+1
	numa_clear_kernel_node_hotplug() On-stack variable numa_kernel_nodes in numa_clear_kernel_node_hotplug() was not initialized. So we need to initialize it. [[email protected]: use NODE_MASK_NONE, per David] Signed-off-by: Tang Chen <[email protected]> Tested-by: Gu Zheng <[email protected]> Reported-by: Dave Jones <[email protected]> Reported-by: David Rientjes <[email protected]> Tested-by: Dave Jones <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-21	acpi, numa, mem_hotplug: mark all nodes the kernel resides un-hotpluggable	Tang Chen	1	-0/+44
	At very early time, the kernel have to use some memory such as loading the kernel image. We cannot prevent this anyway. So any node the kernel resides in should be un-hotpluggable. Signed-off-by: Tang Chen <[email protected]> Reviewed-by: Zhang Yanfei <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: "Rafael J . Wysocki" <[email protected]> Cc: Chen Tang <[email protected]> Cc: Gong Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiang Liu <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Larry Woodman <[email protected]> Cc: Len Brown <[email protected]> Cc: Liu Jiang <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Nazarewicz <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Taku Izumi <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Thomas Renninger <[email protected]> Cc: Toshi Kani <[email protected]> Cc: Vasilis Liaskovitis <[email protected]> Cc: Wanpeng Li <[email protected]> Cc: Wen Congyang <[email protected]> Cc: Yasuaki Ishimatsu <[email protected]> Cc: Yinghai Lu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-21	acpi, numa, mem_hotplug: mark hotpluggable memory in memblock	Tang Chen	1	-0/+2
	When parsing SRAT, we know that which memory area is hotpluggable. So we invoke function memblock_mark_hotplug() introduced by previous patch to mark hotpluggable memory in memblock. [[email protected]: coding-style fixes] Signed-off-by: Tang Chen <[email protected]> Reviewed-by: Zhang Yanfei <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: "Rafael J . Wysocki" <[email protected]> Cc: Chen Tang <[email protected]> Cc: Gong Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiang Liu <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Larry Woodman <[email protected]> Cc: Len Brown <[email protected]> Cc: Liu Jiang <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Nazarewicz <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Taku Izumi <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Thomas Renninger <[email protected]> Cc: Toshi Kani <[email protected]> Cc: Vasilis Liaskovitis <[email protected]> Cc: Wanpeng Li <[email protected]> Cc: Wen Congyang <[email protected]> Cc: Yasuaki Ishimatsu <[email protected]> Cc: Yinghai Lu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2014-01-21	memblock: make memblock_set_node() support different memblock_type	Tang Chen	1	-2/+4
	[[email protected]: fix powerpc build] Signed-off-by: Tang Chen <[email protected]> Reviewed-by: Zhang Yanfei <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: "Rafael J . Wysocki" <[email protected]> Cc: Chen Tang <[email protected]> Cc: Gong Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiang Liu <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Larry Woodman <[email protected]> Cc: Len Brown <[email protected]> Cc: Liu Jiang <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Nazarewicz <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Prarit Bhargava <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Taku Izumi <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Thomas Renninger <[email protected]> Cc: Toshi Kani <[email protected]> Cc: Vasilis Liaskovitis <[email protected]> Cc: Wanpeng Li <[email protected]> Cc: Wen Congyang <[email protected]> Cc: Yasuaki Ishimatsu <[email protected]> Cc: Yinghai Lu <[email protected]> Signed-off-by: Stephen Rothwell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-12-19	x86/mm/numa: Fix 32-bit kernel NUMA boot	Lans Zhang	1	-3/+7
	When booting a 32-bit x86 kernel on a NUMA machine, node data cannot be allocated from local node if the account of memory for node 0 covers the low memory space entirely: [ 0.000000] Initmem setup node 0 [mem 0x00000000-0x83fffffff] [ 0.000000] NODE_DATA [mem 0x367ed000-0x367edfff] [ 0.000000] Initmem setup node 1 [mem 0x840000000-0xfffffffff] [ 0.000000] Cannot find 4096 bytes in node 1 [ 0.000000] 64664MB HIGHMEM available. [ 0.000000] 871MB LOWMEM available. To fix this issue, node data is allowed to be allocated from other nodes if the memory of local node is still not mapped. The expected result looks like this: [ 0.000000] Initmem setup node 0 [mem 0x00000000-0x83fffffff] [ 0.000000] NODE_DATA [mem 0x367ed000-0x367edfff] [ 0.000000] Initmem setup node 1 [mem 0x840000000-0xfffffffff] [ 0.000000] NODE_DATA [mem 0x367ec000-0x367ecfff] [ 0.000000] NODE_DATA(1) on node 0 [ 0.000000] 64664MB HIGHMEM available. [ 0.000000] 871MB LOWMEM available. Signed-off-by: Lans Zhang <[email protected]> Cc: <[email protected]> Cc: Yinghai Lu <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2013-11-13	mem-hotplug: introduce movable_node boot option	Tang Chen	1	-0/+11
	The hot-Pluggable field in SRAT specifies which memory is hotpluggable. As we mentioned before, if hotpluggable memory is used by the kernel, it cannot be hot-removed. So memory hotplug users may want to set all hotpluggable memory in ZONE_MOVABLE so that the kernel won't use it. Memory hotplug users may also set a node as movable node, which has ZONE_MOVABLE only, so that the whole node can be hot-removed. But the kernel cannot use memory in ZONE_MOVABLE. By doing this, the kernel cannot use memory in movable nodes. This will cause NUMA performance down. And other users may be unhappy. So we need a way to allow users to enable and disable this functionality. In this patch, we introduce movable_node boot option to allow users to choose to not to consume hotpluggable memory at early boot time and later we can set it as ZONE_MOVABLE. To achieve this, the movable_node boot option will control the memblock allocation direction. That said, after memblock is ready, before SRAT is parsed, we should allocate memory near the kernel image as we explained in the previous patches. So if movable_node boot option is set, the kernel does the following: 1. After memblock is ready, make memblock allocate memory bottom up. 2. After SRAT is parsed, make memblock behave as default, allocate memory top down. Users can specify "movable_node" in kernel commandline to enable this functionality. For those who don't use memory hotplug or who don't want to lose their NUMA performance, just don't specify anything. The kernel will work as before. Signed-off-by: Tang Chen <[email protected]> Signed-off-by: Zhang Yanfei <[email protected]> Suggested-by: Kamezawa Hiroyuki <[email protected]> Suggested-by: Ingo Molnar <[email protected]> Acked-by: Tejun Heo <[email protected]> Acked-by: Toshi Kani <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Wanpeng Li <[email protected]> Cc: Thomas Renninger <[email protected]> Cc: Yinghai Lu <[email protected]> Cc: Jiang Liu <[email protected]> Cc: Wen Congyang <[email protected]> Cc: Lai Jiangshan <[email protected]> Cc: Yasuaki Ishimatsu <[email protected]> Cc: Taku Izumi <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Nazarewicz <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>