aboutsummaryrefslogtreecommitdiff
path: root/mm/memory_hotplug.c
AgeCommit message (Collapse)AuthorFilesLines
2008-05-14memory_hotplug: always initialize pageblock bitmapHeiko Carstens1-39/+44
Trying to online a new memory section that was added via memory hotplug sometimes results in crashes when the new pages are added via __free_page. Reason for that is that the pageblock bitmap isn't initialized and hence contains random stuff. That means that get_pageblock_migratetype() returns also random stuff and therefore list_add(&page->lru, &zone->free_area[order].free_list[migratetype]); in __free_one_page() tries to do a list_add to something that isn't even necessarily a list. This happens since 86051ca5eaf5e560113ec7673462804c54284456 ("mm: fix usemap initialization") which makes sure that the pageblock bitmap gets only initialized for pages present in a zone. Unfortunately for hot-added memory the zones "grow" after the memmap and the pageblock memmap have been initialized. Which means that the new pages have an unitialized bitmap. To solve this the calls to grow_zone_span() and grow_pgdat_span() are moved to __add_zone() just before the initialization happens. The patch also moves the two functions since __add_zone() is the only caller and I didn't want to add a forward declaration. Signed-off-by: Heiko Carstens <[email protected]> Cc: Andy Whitcroft <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Yasunori Goto <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-05-14memory_hotplug: check for walk_memory_resource() failure in online_pages()Geoff Levand1-1/+8
Add a check to online_pages() to test for failure of walk_memory_resource(). This fixes a condition where a failure of walk_memory_resource() can lead to online_pages() returning success without the requested pages being onlined. Signed-off-by: Geoff Levand <[email protected]> Cc: Yasunori Goto <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Keith Mannthey <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Paul Jackson <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-05-14memory hotplug: memmap_init_zone called twiceHeiko Carstens1-7/+3
__add_zone calls memmap_init_zone twice if memory gets attached to an empty zone. Once via init_currently_empty_zone and once explictly right after that call. Looks like this is currently not a bug, however the call is superfluous and might lead to subtle bugs if memmap_init_zone gets changed. So make sure it is called only once. Cc: Yasunori Goto <[email protected]> Acked-by: KAMEZAWA Hiroyuki <[email protected]> Cc: Dave Hansen <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-28mm/memory_hotplug.c must #include "internal.h"Adrian Bunk1-0/+2
This patch fixes the following compile error caused by commit 04753278769f3b6c3b79a080edb52f21d83bf6e2 ("memory hotplug: register section/node id to free"): CC mm/memory_hotplug.o /home/bunk/linux/kernel-2.6/git/linux-2.6/mm/memory_hotplug.c: In function ‘put_page_bootmem’: /home/bunk/linux/kernel-2.6/git/linux-2.6/mm/memory_hotplug.c:82: error: implicit declaration of function ‘__free_pages_bootmem’ /home/bunk/linux/kernel-2.6/git/linux-2.6/mm/memory_hotplug.c: At top level: /home/bunk/linux/kernel-2.6/git/linux-2.6/mm/memory_hotplug.c:87: warning: no previous prototype for ‘register_page_bootmem_info_section’ make[2]: *** [mm/memory_hotplug.o] Error 1 [ Andrew: "Argh. The -mm-only memory-hotplug-add-removable-to-sysfs- to-show-memblock-removability.patch debugging patch adds that include so nobody hit this before. ] Signed-off-by: Adrian Bunk <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-28memory hotplug: free memmaps allocated by bootmemYasunori Goto1-0/+11
This patch is to free memmaps which is allocated by bootmem. Freeing usemap is not necessary. The pages of usemap may be necessary for other sections. If removing section is last section on the node, its section is the final user of usemap page. (usemaps are allocated on its section by previous patch.) But it shouldn't be freed too, because the section must be logical offline state which all pages are isolated against page allocater. If it is freed, page alloctor may use it which will be removed physically soon. It will be disaster. So, this patch keeps it as it is. Signed-off-by: Yasunori Goto <[email protected]> Cc: Badari Pulavarty <[email protected]> Cc: Yinghai Lu <[email protected]> Cc: Yasunori Goto <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-28memory hotplug: register section/node id to freeYasunori Goto1-1/+98
This patch set is to free pages which is allocated by bootmem for memory-hotremove. Some structures of memory management are allocated by bootmem. ex) memmap, etc. To remove memory physically, some of them must be freed according to circumstance. This patch set makes basis to free those pages, and free memmaps. Basic my idea is using remain members of struct page to remember information of users of bootmem (section number or node id). When the section is removing, kernel can confirm it. By this information, some issues can be solved. 1) When the memmap of removing section is allocated on other section by bootmem, it should/can be free. 2) When the memmap of removing section is allocated on the same section, it shouldn't be freed. Because the section has to be logical memory offlined already and all pages must be isolated against page allocater. If it is freed, page allocator may use it which will be removed physically soon. 3) When removing section has other section's memmap, kernel will be able to show easily which section should be removed before it for user. (Not implemented yet) 4) When the above case 2), the page isolation will be able to check and skip memmap's page when logical memory offline (offline_pages()). Current page isolation code fails in this case because this page is just reserved page and it can't distinguish this pages can be removed or not. But, it will be able to do by this patch. (Not implemented yet.) 5) The node information like pgdat has similar issues. But, this will be able to be solved too by this. (Not implemented yet, but, remembering node id in the pages.) Fortunately, current bootmem allocator just keeps PageReserved flags, and doesn't use any other members of page struct. The users of bootmem doesn't use them too. This patch: This is to register information which is node or section's id. Kernel can distinguish which node/section uses the pages allcated by bootmem. This is basis for hot-remove sections or nodes. Signed-off-by: Yasunori Goto <[email protected]> Cc: Badari Pulavarty <[email protected]> Cc: Yinghai Lu <[email protected]> Cc: Yasunori Goto <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-28hotplug-memory: make online_page() commonJeremy Fitzhardinge1-0/+19
All architectures use an effectively identical definition of online_page(), so just make it common code. x86-64, ia64, powerpc and sh are actually identical; x86-32 is slightly different. x86-32's differences arise because it puts its hotplug pages in the highmem zone. We can handle this in the generic code by inspecting the page to see if its in highmem, and update the totalhigh_pages count appropriately. This leaves init_32.c:free_new_highpage with a single caller, so I folded it into add_one_highpage_init. I also removed an incorrect comment referring to the NUMA case; any NUMA details have already been dealt with by the time online_page() is called. [[email protected]: fix indenting] Signed-off-by: Jeremy Fitzhardinge <[email protected]> Acked-by: Dave Hansen <[email protected]> Reviewed-by: KAMEZAWA Hiroyuki <[email protected]> Tested-by: KAMEZAWA Hiroyuki <[email protected]> Cc: Yasunori Goto <[email protected]> Cc: Christoph Lameter <[email protected]> Acked-by: Ingo Molnar <[email protected]> Acked-by: Yasunori Goto <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-28hotplug memory remove: generic __remove_pages() supportBadari Pulavarty1-0/+55
Generic helper function to remove section mappings and sysfs entries for the section of the memory we are removing. offline_pages() correctly adjusted zone and marked the pages reserved. TODO: Yasunori Goto is working on patches to free up allocations from bootmem. Signed-off-by: Badari Pulavarty <[email protected]> Acked-by: Yasunori Goto <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2008-04-19driver core: memory: semaphore to mutexDaniel Walker1-1/+1
Signed-off-by: Daniel Walker <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2008-02-05Page allocator: clean up pcp draining functionsChristoph Lameter1-4/+2
- Add comments explaing how drain_pages() works. - Eliminate useless functions - Rename drain_all_local_pages to drain_all_pages(). It does drain all pages not only those of the local processor. - Eliminate useless interrupt off / on sequences. drain_pages() disables interrupts on its own. The execution thread is pinned to processor by the caller. So there is no need to disable interrupts. - Put drain_all_pages() declaration in gfp.h and remove the declarations from suspend.h and from mm/memory_hotplug.c - Make software suspend call drain_all_pages(). The draining of processor local pages is may not the right approach if software suspend wants to support SMP. If they call drain_all_pages then we can make drain_pages() static. [[email protected]: fix build] Signed-off-by: Christoph Lameter <[email protected]> Acked-by: Mel Gorman <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Daniel Walker <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-11-14Add IORESOUCE_BUSY flag for System RAMYasunori Goto1-1/+1
i386 and x86-64 registers System RAM as IORESOURCE_MEM | IORESOURCE_BUSY. But ia64 registers it as IORESOURCE_MEM only. In addition, memory hotplug code registers new memory as IORESOURCE_MEM too. This difference causes a failure of memory unplug of x86-64. This patch fixes it. This patch adds IORESOURCE_BUSY to avoid potential overlap mapping by PCI device. Signed-off-by: Yasunori Goto <[email protected]> Signed-off-by: Badari Pulavarty <[email protected]> Cc: Luck, Tony" <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-11-14memory hotremove: unset migrate type "ISOLATE" after removalKAMEZAWA Hiroyuki1-2/+2
We should unset migrate type "ISOLATE" when we successfully removed memory. But current code has BUG and cannot works well. This patch also includes bugfix? to change get_pageblock_flags to get_pageblock_migratetype(). Thanks to Badari Pulavarty for finding this. Signed-off-by: KAMEZAWA Hiroyuki <[email protected]> Acked-by: Badari Pulavarty <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-22memory hotplug: rearrange memory hotplug notifierYasunori Goto1-3/+45
Current memory notifier has some defects yet. (Fortunately, nothing uses it.) This patch is to fix and rearrange for them. - Add information of start_pfn, nr_pages, and node id if node status is changes from/to memoryless node for callback functions. Callbacks can't do anything without those information. - Add notification going-online status. It is necessary for creating per node structure before the node's pages are available. - Move GOING_OFFLINE status notification after page isolation. It is good place for return memory like cache for callback, because returned page is not used again. - Make CANCEL events for rollingback when error occurs. - Delete MEM_MAPPING_INVALID notification. It will be not used. - Fix compile error of (un)register_memory_notifier(). Signed-off-by: Yasunori Goto <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-20spelling fixes: mm/Simon Arlott1-1/+1
Spelling fixes in mm/. Signed-off-by: Simon Arlott <[email protected]> Signed-off-by: Adrian Bunk <[email protected]>
2007-10-16fix memory hot remove not configured case.KAMEZAWA Hiroyuki1-0/+6
Now, arch dependent code around CONFIG_MEMORY_HOTREMOVE is a mess. This patch cleans up them. This is against 2.6.23-rc6-mm1. - fix compile failure on ia64/ CONFIG_MEMORY_HOTPLUG && !CONFIG_MEMORY_HOTREMOVE case. - For !CONFIG_MEMORY_HOTREMOVE, add generic no-op remove_memory(), which returns -EINVAL. - removed remove_pages() only used in powerpc. - removed no-op remove_memory() in i386, sh, sparc64, x86_64. - only powerpc returns -ENOSYS at memory hot remove(no-op). changes it to return -EINVAL. Note: Currently, only ia64 supports CONFIG_MEMORY_HOTREMOVE. I welcome other archs if there are requirements and testers. Signed-off-by: KAMEZAWA Hiroyuki <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-16memory unplug: page offlineKAMEZAWA Hiroyuki1-0/+254
Logic. - set all pages in [start,end) as isolated migration-type. by this, all free pages in the range will be not-for-use. - Migrate all LRU pages in the range. - Test all pages in the range's refcnt is zero or not. Todo: - allocate migration destination page from better area. - confirm page_count(page)== 0 && PageReserved(page) page is safe to be freed.. (I don't like this kind of page but.. - Find out pages which cannot be migrated. - more running tests. - Use reclaim for unplugging other memory type area. Signed-off-by: KAMEZAWA Hiroyuki <[email protected]> Signed-off-by: Yasunori Goto <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-16memory unplug: memory hotplug cleanupKAMEZAWA Hiroyuki1-26/+19
A clean up patch for "scanning memory resource [start, end)" operation. Now, find_next_system_ram() function is used in memory hotplug, but this interface is not easy to use and codes are complicated. This patch adds walk_memory_resouce(start,len,arg,func) function. The function 'func' is called per valid memory resouce range in [start,pfn). [[email protected]: Error handling in walk_memory_resource()] Signed-off-by: KAMEZAWA Hiroyuki <[email protected]> Signed-off-by: Badari Pulavarty <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-10-16Memoryless nodes: introduce mask of nodes with memoryChristoph Lameter1-3/+4
It is necessary to know if nodes have memory since we have recently begun to add support for memoryless nodes. For that purpose we introduce a two new node states: N_HIGH_MEMORY and N_NORMAL_MEMORY. A node has its bit in N_HIGH_MEMORY set if it has any memory regardless of the type of mmemory. If a node has memory then it has at least one zone defined in its pgdat structure that is located in the pgdat itself. A node has its bit in N_NORMAL_MEMORY set if it has a lower zone than ZONE_HIGHMEM. This means it is possible to allocate memory that is not subject to kmap. N_HIGH_MEMORY and N_NORMAL_MEMORY can then be used in various places to insure that we do the right thing when we encounter a memoryless node. [[email protected]: build fix] [[email protected]: update N_HIGH_MEMORY node state for memory hotadd] [[email protected]: Fix memory hotplug + sparsemem build] Signed-off-by: Lee Schermerhorn <[email protected]> Signed-off-by: Nishanth Aravamudan <[email protected]> Signed-off-by: Christoph Lameter <[email protected]> Acked-by: Bob Picco <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Mel Gorman <[email protected]> Signed-off-by: Yasunori Goto <[email protected]> Signed-off-by: Paul Mundt <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-06-01memory hotplug: fix unnecessary calling of init_currenty_empty_zone()Yasunori Goto1-1/+1
zone->present_pages is updated in online_pages(). But, __add_zone() can be called twice or more before calling online_pages(). So, init_currenty_empty_zone() can be called unnecessary times. It is cause of memory leak of zone's wait_table. Signed-off-by: Yasunori Goto <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2007-01-11[PATCH] Fix sparsemem on CellDave Hansen1-2/+4
Fix an oops experienced on the Cell architecture when init-time functions, early_*(), are called at runtime. It alters the call paths to make sure that the callers explicitly say whether the call is being made on behalf of a hotplug even, or happening at boot-time. It has been compile tested on ppc64, ia64, s390, i386 and x86_64. Acked-by: Arnd Bergmann <[email protected]> Signed-off-by: Dave Hansen <[email protected]> Cc: Yasunori Goto <[email protected]> Acked-by: Andy Whitcroft <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Martin Schwidefsky <[email protected]> Acked-by: Heiko Carstens <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-12-07[PATCH] Get rid of zone_table[]Christoph Lameter1-1/+0
The zone table is mostly not needed. If we have a node in the page flags then we can get to the zone via NODE_DATA() which is much more likely to be already in the cpu cache. In case of SMP and UP NODE_DATA() is a constant pointer which allows us to access an exact replica of zonetable in the node_zones field. In all of the above cases there will be no need at all for the zone table. The only remaining case is if in a NUMA system the node numbers do not fit into the page flags. In that case we make sparse generate a table that maps sections to nodes and use that table to to figure out the node number. This table is sized to fit in a single cache line for the known 32 bit NUMA platform which makes it very likely that the information can be obtained without a cache miss. For sparsemem the zone table seems to be have been fairly large based on the maximum possible number of sections and the number of zones per node. There is some memory saving by removing zone_table. The main benefit is to reduce the cache foootprint of the VM from the frequent lookups of zones. Plus it simplifies the page allocator. [[email protected]: build fix] Signed-off-by: Christoph Lameter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Andy Whitcroft <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-01[PATCH] hot-add-mem x86_64: use CONFIG_MEMORY_HOTPLUG_RESERVEKeith Mannthey1-30/+30
The api for hot-add memory already has a construct for finding nodes based on an address, memory_add_physaddr_to_nid. This patch allows the fucntion to do something besides return 0. It uses the nodes_add infomation to lookup to node info for a hot add event. Signed-off-by: Keith Mannthey <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Andi Kleen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-01[PATCH] hot-add-mem x86_64: use CONFIG_MEMORY_HOTPLUG_SPARSEKeith Mannthey1-0/+2
Migate CONFIG_MEMORY_HOTPLUG to CONFIG_MEMORY_HOTPLUG_SPARSE where needed. Signed-off-by: Keith Mannthey <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Andi Kleen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-10-01[PATCH] hot-add-mem x86_64: fixup externsKeith Mannthey1-4/+0
Fix up externs in memory_hotplug.c. Cleanup. Signed-off-by: Keith Mannthey <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Andi Kleen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-09-29[PATCH] call mm/page-writeback.c:set_ratelimit() when new pages are hot-addedChandra Seetharaman1-0/+2
ratelimit_pages in page-writeback.c is recalculated (in set_ratelimit()) every time a CPU is hot-added/removed. But this value is not recalculated when new pages are hot-added. This patch fixes that problem by calling set_ratelimit() when new pages are hot-added. [[email protected]: cleanups] Signed-off-by: Chandra Seetharaman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-09-29[PATCH] cpuset: top_cpuset tracks hotplug changes to node_online_mapPaul Jackson1-0/+3
Change the list of memory nodes allowed to tasks in the top (root) nodeset to dynamically track what cpus are online, using a call to a cpuset hook from the memory hotplug code. Make this top cpus file read-only. On systems that have cpusets configured in their kernel, but that aren't actively using cpusets (for some distros, this covers the majority of systems) all tasks end up in the top cpuset. If that system does support memory hotplug, then these tasks cannot make use of memory nodes that are added after system boot, because the memory nodes are not allowed in the top cpuset. This is a surprising regression over earlier kernels that didn't have cpusets enabled. One key motivation for this change is to remain consistent with the behaviour for the top_cpuset's 'cpus', which is also read-only, and which automatically tracks the cpu_online_map. This change also has the minor benefit that it fixes a long standing, little noticed, minor bug in cpusets. The cpuset performance tweak to short circuit the cpuset_zone_allowed() check on systems with just a single cpuset (see 'number_of_cpusets', in linux/cpuset.h) meant that simply changing the 'mems' of the top_cpuset had no affect, even though the change (the write system call) appeared to succeed. With the following change, that write to the 'mems' file fails -EACCES, and the 'mems' file stubbornly refuses to be changed via user space writes. Thus no one should be mislead into thinking they've changed the top_cpusets's 'mems' when in affect they haven't. In order to keep the behaviour of cpusets consistent between systems actively making use of them and systems not using them, this patch changes the behaviour of the 'mems' file in the top (root) cpuset, making it read only, and making it automatically track the value of node_online_map. Thus tasks in the top cpuset will have automatic use of hot plugged memory nodes allowed by their cpuset. [[email protected]: build fix] [[email protected]: build fix] Signed-off-by: Paul Jackson <[email protected]> Signed-off-by: Adrian Bunk <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-08-06[PATCH] memory hotadd fixes: enhance collision checkKAMEZAWA Hiroyuki1-7/+22
This patch is for collision check enhancement for memory hot add. It's better to do resouce collision check before doing memory hot add, which will touch memory management structures. And add_section() should check section exists or not before calling sparse_add_one_section(). (sparse_add_one_section() will do another check anyway. but checking in memory_hotplug.c will be easy to understand.) Signed-off-by: KAMEZAWA Hiroyuki <[email protected]> Cc: keith mannthey <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-08-06[PATCH] memory hotadd fixes: find_next_system_ram catch range fixKAMEZAWA Hiroyuki1-1/+1
find_next_system_ram() is used to find available memory resource at onlining newly added memory. This patch fixes following problem. find_next_system_ram() cannot catch this case. Resource: (start)-------------(end) Section : (start)-------------(end) Signed-off-by: KAMEZAWA Hiroyuki <[email protected]> Cc: Keith Mannthey <[email protected]> Cc: Yasunori Goto <[email protected]> Cc: Dave Hansen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-08-06[PATCH] memory hotadd fixes: not-aligned memory hotadd handling fixKAMEZAWA Hiroyuki1-7/+16
ioresouce handling code in memory hotplug allows not-aligned memory hot add. But when memmap and other memory structures are initialized, parameters should be aligned. (if not aligned, initialization of mem_map will do wrong, it assumes parameters are aligned.) This patch fix it. And this patch allows ioresource collision check to handle -EEXIST. Signed-off-by: KAMEZAWA Hiroyuki <[email protected]> Cc: Keith Mannthey <[email protected]> Cc: Yasunori Goto <[email protected]> Cc: Dave Hansen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-06-30Remove obsolete #include <linux/config.h>Jörn Engel1-1/+0
Signed-off-by: Jörn Engel <[email protected]> Signed-off-by: Adrian Bunk <[email protected]>
2006-06-27[PATCH] Register sysfs file for hotplugged new nodeYasunori Goto1-1/+11
When new node becomes enable by hot-add, new sysfs file must be created for new node. So, if new node is enabled by add_memory(), register_one_node() is called to create it. In addition, I386's arch_register_node() and a part of register_nodes() of powerpc are consolidated to register_one_node() as a generic_code(). This is tested by Tiger4(IPF) with node hot-plug emulation. Signed-off-by: Keiichiro Tokunaga <[email protected]> Signed-off-by: Yasunori Goto <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-06-27[PATCH] catch valid mem range at onlining memoryKAMEZAWA Hiroyuki1-4/+24
This patch allows hot-add memory which is not aligned to section. Now, hot-added memory has to be aligned to section size. Considering big section sized archs, this is not useful. When hot-added memory is registerd as iomem resoruce by iomem resource patch, we can make use of that information to detect valid memory range. Note: With this, not-aligned memory can be registerd. To allow hot-add memory with holes, we have to do more work around add_memory(). (It doesn't allows add memory to already existing mem section.) Signed-off-by: KAMEZAWA Hiroyuki <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-06-27[PATCH] register hot-added memory to iomem resourceKAMEZAWA Hiroyuki1-0/+25
Register hot-added memory to iomem_resource. With this, /proc/iomem can show hot-added memory. Note: kdump uses /proc/iomem to catch memory range when it is installed. So, kdump should be re-installed after /proc/iomem change. Signed-off-by: KAMEZAWA Hiroyuki <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Greg KH <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-06-27[PATCH] pgdat allocation for new node add (call pgdat allocation)Yasunori Goto1-0/+52
Add node-hot-add support to add_memory(). node hotadd uses this sequence. 1. allocate pgdat. 2. refresh NODE_DATA() 3. call free_area_init_node() to initialize 4. create sysfs entry 5. add memory (old add_memory()) 6. set node online 7. run kswapd for new node. (8). update zonelist after pages are onlined. (This is already merged in -mm due to update phase is difference.) Note: To make common function as much as possible, there is 2 changes from v2. - The old add_memory(), which is defiend by each archs, is renamed to arch_add_memory(). New add_memory becomes caller of arch dependent function as a common code. - This patch changes add_memory()'s interface From: add_memory(start, end) TO : add_memory(nid, start, end). It was cause of similar code that finding node id from physical address is inside of old add_memory() on each arch. In addition, acpi memory hotplug driver can find node id easier. In v2, it must walk DSDT'S _CRS by matching physical address to get the handle of its memory device, then get _PXM and node id. Because input is just physical address. However, in v3, the acpi driver can use handle to get _PXM and node id for the new memory device. It can pass just node id to add_memory(). Fix interface of arch_add_memory() is in next patche. Signed-off-by: Yasunori Goto <[email protected]> Signed-off-by: KAMEZAWA Hiroyuki <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "Brown, Len" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-06-27[PATCH] pgdat allocation for new node add (specify node id)Yasunori Goto1-0/+11
Change the name of old add_memory() to arch_add_memory. And use node id to get pgdat for the node at NODE_DATA(). Note: Powerpc's old add_memory() is defined as __devinit. However, add_memory() is usually called only after bootup. I suppose it may be redundant. But, I'm not well known about powerpc. So, I keep it. (But, __meminit is better at least.) Signed-off-by: Yasunori Goto <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "Brown, Len" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-06-23[PATCH] update vm_total_pages at memory hotaddKAMEZAWA Hiroyuki1-1/+1
Signed-off-by: KAMEZAWA Hiroyuki <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-06-23[PATCH] wait_table and zonelist initializing for memory hotadd: update zonelistsYasunori Goto1-0/+12
In current code, zonelist is considered to be build once, no modification. But MemoryHotplug can add new zone/pgdat. It must be updated. This patch modifies build_all_zonelists(). By this, build_all_zonelist() can reconfig pgdat's zonelists. To update them safety, this patch use stop_machine_run(). Other cpus don't touch among updating them by using it. In old version (V2 of node hotadd), kernel updated them after zone initialization. But present_page of its new zone is still 0, because online_page() is not called yet at this time. Build_zonelists() checks present_pages to find present zone. It was too early. So, I changed it after online_pages(). Signed-off-by: Yasunori Goto <[email protected]> Signed-off-by: KAMEZAWA Hiroyuki <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-06-23[PATCH] wait_table and zonelist initializing for memory hotadd: add return ↵Yasunori Goto1-2/+13
code for init_current_empty_zone When add_zone() is called against empty zone (not populated zone), we have to initialize the zone which didn't initialize at boot time. But, init_currently_empty_zone() may fail due to allocation of wait table. So, this patch is to catch its error code. Changes against wait_table is in the next patch. Signed-off-by: KAMEZAWA Hiroyuki <[email protected]> Signed-off-by: Yasunori Goto <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-05-31[PATCH] spanned_pages is not updated at a case of memory hot-addYasunori Goto1-4/+4
From: Yasunori Goto <[email protected]> If hot-added memory's address is smaller than old area, spanned_pages will not be updated. It must be fixed. example) Old zone_start_pfn = 0x60000, and spanned_pages = 0x10000 Added new memory's start_pfn = 0x50000, and end_pfn = 0x60000 new spanned_pages will be still 0x10000 by old code. (It should be updated to 0x20000.) Because old_zone_end_pfn will be 0x70000, and end_pfn smaller than it. So, spanned_pages will not be updated. In current code, spanned_pages is updated only when end_pfn is updated. But, it should be updated by subtraction between bigger end_pfn and new zone_start_pfn. Signed-off-by: Yasunori Goto <[email protected]> Signed-off-by: Dave Hansen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-05-01[PATCH] spufs: fix for CONFIG_NUMAJoel H Schopp1-1/+5
Based on an older patch from Mike Kravetz <[email protected]> We need to have a mem_map for high addresses in order to make fops->no_page work on spufs mem and register files. So far, we have used the memory_present() function during early bootup, but that did not work when CONFIG_NUMA was enabled. We now use the __add_pages() function to add the mem_map when loading the spufs module, which is a lot nicer. Signed-off-by: Arnd Bergmann <[email protected]> Cc: Paul Mackerras <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-03-09[PATCH] memory hotadd: pgdat->node_present_pages fixYasunori Goto1-0/+1
When pages are onlined, not only zone->present_pages but also pgdat->node_present_pages should be refreshed. This parameter is used to show information at /sys/device/system/node/nodeX/meminfo via si_meminfo_node(). So, it shows strange value for MemUsed which is calculated (node_present_pages - all zones free pages). Signed-off-by: Yasunori Goto <[email protected]> Cc: Dave Hansen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2006-01-06[PATCH] memhotplug: __add_section remove unused pgdat definitionAndy Whitcroft1-1/+0
__add_section defines an unused pointer to the zones pgdat. Remove this definition. This fixes a compile warning. Signed-off-by: Andy Whitcroft <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2005-12-13[PATCH] Fix calculation of grow_pgdat_span() in mm/memory_hotplug.cYasunori Goto1-1/+1
The calculation for node_spanned_pages at grow_pgdat_span() is clearly wrong. This is patch for it. (Please see grow_zone_span() to compare. It is correct.) Signed-off-by: Yasunori Goto <[email protected]> Acked-by: Dave Hansen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2005-10-29[PATCH] memory hotplug: call setup_per_zone_pages_min after hotplugDave Hansen1-0/+2
From: IWAMOTO Toshihiro <[email protected]> > I found the tests does not work well with Dave's patchset. > I've found the followings: > > - setup_per_zone_pages_min() calls should be added in > capture_page_range() and online_pages() > - lru_add_drain() should be called before try_to_migrate_pages() The following patch deals with the first item. Signed-off-by: IWAMOTO Toshihiro <[email protected]> Signed-off-by: Dave Hansen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2005-10-29[PATCH] memory hotplug: move section_mem_map alloc to sparse.cDave Hansen1-45/+3
This basically keeps up from having to extern __kmalloc_section_memmap(). The vaddr_in_vmalloc_area() helper could go in a vmalloc header, but that header gets hard to work with, because it needs some arch-specific macros. Just stick it in here for now, instead of creating another header. Signed-off-by: Dave Hansen <[email protected]> Signed-off-by: Lion Vollnhals <[email protected]> Signed-off-by: Jiri Slaby <[email protected]> Signed-off-by: Yasunori Goto <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2005-10-29[PATCH] memory hotplug: sysfs and add/remove functionsDave Hansen1-0/+178
This adds generic memory add/remove and supporting functions for memory hotplug into a new file as well as a memory hotplug kernel config option. Individual architecture patches will follow. For now, disable memory hotplug when swsusp is enabled. There's a lot of churn there right now. We'll fix it up properly once it calms down. Signed-off-by: Matt Tolentino <[email protected]> Signed-off-by: Dave Hansen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>