blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2013-07-09	ipc,msg: introduce msgctl_nolock	Davidlohr Bueso	1	-15/+34
	Similar to semctl, when calling msgctl, the _INFO and _STAT commands can be performed without acquiring the ipc object. Add a msgctl_nolock() function and move the logic of _INFO and _STAT out of msgctl(). This change still takes the lock and it will be properly lockless in the next patch Signed-off-by: Davidlohr Bueso <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	ipc,msg: shorten critical region in msgctl_down	Davidlohr Bueso	1	-5/+7
	Instead of holding the ipc lock for the entire function, use the ipcctl_pre_down_nolock and only acquire the lock for specific commands: RMID and SET. Signed-off-by: Davidlohr Bueso <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	ipc: move locking out of ipcctl_pre_down_nolock	Davidlohr Bueso	4	-39/+56
	This function currently acquires both the rw_mutex and the rcu lock on successful lookups, leaving the callers to explicitly unlock them, creating another two level locking situation. Make the callers (including those that still use ipcctl_pre_down()) explicitly lock and unlock the rwsem and rcu lock. Signed-off-by: Davidlohr Bueso <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	ipc: close open coded spin lock calls	Davidlohr Bueso	4	-12/+12
	Signed-off-by: Davidlohr Bueso <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	ipc: introduce ipc object locking helpers	Davidlohr Bueso	1	-5/+15
	Simple helpers around the (kern_ipc_perm *)->lock spinlock. Signed-off-by: Davidlohr Bueso <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	ipc: move rcu lock out of ipc_addid	Davidlohr Bueso	3	-7/+8
	This patchset continues the work that began in the sysv ipc semaphore scaling series, see https://lkml.org/lkml/2013/3/20/546 Just like semaphores used to be, sysv shared memory and msg queues also abuse the ipc lock, unnecessarily holding it for operations such as permission and security checks. This patchset mostly deals with mqueues, and while shared mem can be done in a very similar way, I want to get these patches out in the open first. It also does some pending cleanups, mostly focused on the two level locking we have in ipc code, taking care of ipc_addid() and ipcctl_pre_down_nolock() - yes there are still functions that need to be updated as well. This patch: Make all callers explicitly take and release the RCU read lock. This addresses the two level locking seen in newary(), newseg() and newqueue(). For the last two, explicitly unlock the ipc object and the rcu lock, instead of calling the custom shm_unlock and msg_unlock functions. The next patch will deal with the open coded locking for ->perm.lock Signed-off-by: Davidlohr Bueso <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Rik van Riel <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	ipc/shmc.c: eliminate ugly 80-col tricks	Andrew Morton	2	-4/+4
	Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	ptrace/x86: flush_ptrace_hw_breakpoint() shoule clear the virtual debug ↵	Oleg Nesterov	1	-0/+3
	registers flush_ptrace_hw_breakpoint() destroys the counters set by ptrace, but "leaks" ->debugreg6 and ->ptrace_dr7. The problem is minor, but still it doesn't look right and flush_thread() did this until commit 66cb59172959 ("hw-breakpoints: use the new wrapper routines to access debug registers in process/thread code"). Now that PTRACE_DETACH does flush_ too this makes even more sense. Signed-off-by: Oleg Nesterov <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jan Kratochvil <[email protected]> Cc: Michael Neuling <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Will Deacon <[email protected]> Cc: Prasad <[email protected]> Cc: Russell King <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	ptrace: PTRACE_DETACH should do flush_ptrace_hw_breakpoint(child)	Oleg Nesterov	1	-0/+1
	Change ptrace_detach() to call flush_ptrace_hw_breakpoint(child). This frees the slots for non-ptrace PERF_TYPE_BREAKPOINT users, and this ensures that the tracee won't be killed by SIGTRAP triggered by the active breakpoints. Test-case: unsigned long encode_dr7(int drnum, int enable, unsigned int type, unsigned int len) { unsigned long dr7; dr7 = ((len \| type) & 0xf) << (DR_CONTROL_SHIFT + drnum * DR_CONTROL_SIZE); if (enable) dr7 \|= (DR_GLOBAL_ENABLE << (drnum * DR_ENABLE_SIZE)); return dr7; } int write_dr(int pid, int dr, unsigned long val) { return ptrace(PTRACE_POKEUSER, pid, offsetof (struct user, u_debugreg[dr]), val); } void func(void) { } int main(void) { int pid, stat; unsigned long dr7; pid = fork(); if (!pid) { assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); kill(getpid(), SIGHUP); func(); return 0x13; } assert(pid == waitpid(-1, &stat, 0)); assert(WSTOPSIG(stat) == SIGHUP); assert(write_dr(pid, 0, (long)func) == 0); dr7 = encode_dr7(0, 1, DR_RW_EXECUTE, DR_LEN_1); assert(write_dr(pid, 7, dr7) == 0); assert(ptrace(PTRACE_DETACH, pid, 0,0) == 0); assert(pid == waitpid(-1, &stat, 0)); assert(stat == 0x1300); return 0; } Before this patch the child is killed after PTRACE_DETACH. Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jan Kratochvil <[email protected]> Cc: Michael Neuling <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Will Deacon <[email protected]> Cc: Prasad <[email protected]> Cc: Russell King <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	ptrace/x86: cleanup ptrace_set_debugreg()	Oleg Nesterov	1	-18/+8
	ptrace_set_debugreg() is trivial but looks horrible. Kill the unnecessary goto's and return's to cleanup the code. This matches ptrace_get_debugreg() which also needs the trivial whitespace cleanups. Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jan Kratochvil <[email protected]> Cc: Michael Neuling <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Will Deacon <[email protected]> Cc: Prasad <[email protected]> Cc: Russell King <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	ptrace/x86: ptrace_write_dr7() should create bp if !disabled	Oleg Nesterov	1	-7/+10
	Commit 24f1e32c60c4 ("hw-breakpoints: Rewrite the hw-breakpoints layer on top of perf events") introduced the minor regression. Before this commit PTRACE_POKEUSER DR7, enableDR0 PTRACE_POKEUSER DR0, address was perfectly valid, now PTRACE_POKEUSER(DR7) fails if DR0 was not previously initialized by PTRACE_POKEUSER(DR0). Change ptrace_write_dr7() to do ptrace_register_breakpoint(addr => 0) if !bp && !disabled. This fixes watchpoint-zeroaddr from ptrace-tests, see https://bugzilla.redhat.com/show_bug.cgi?id=660204. Signed-off-by: Oleg Nesterov <[email protected]> Reported-by: Jan Kratochvil <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Michael Neuling <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Will Deacon <[email protected]> Cc: Prasad <[email protected]> Cc: Russell King <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	ptrace/x86: introduce ptrace_register_breakpoint()	Oleg Nesterov	1	-36/+50
	No functional changes, preparation. Extract the "register breakpoint" code from ptrace_get_debugreg() into the new/generic helper, ptrace_register_breakpoint(). It will have more users. The patch also adds another simple helper, ptrace_fill_bp_fields(), to factor out the arch_bp_generic_fields() logic in register/modify. Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jan Kratochvil <[email protected]> Cc: Michael Neuling <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Will Deacon <[email protected]> Cc: Prasad <[email protected]> Cc: Russell King <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	ptrace/x86: dont delay "disable" till second pass in ptrace_write_dr7()	Oleg Nesterov	1	-33/+20
	ptrace_write_dr7() skips ptrace_modify_breakpoint(disabled => true) unless second_pass, this buys nothing but complicates the code and means that we always do the main loop twice even if "disabled" was never true. The comment says: Don't unregister the breakpoints right-away, unless all register_user_hw_breakpoint() requests have succeeded. Firstly, we do not do register_user_hw_breakpoint(), it was removed by commit 24f1e32c60c4 ("hw-breakpoints: Rewrite the hw-breakpoints layer on top of perf events"). We are going to restore register_user_hw_breakpoint() (see the next patch) but this doesn't matter: after commit 44234adcdce3 ("hw-breakpoints: Modify breakpoints without unregistering them") perf_event_disable() can not hurt, hw_breakpoint_del() does not free the slot. Remove the "second_pass" check from the main loop and simplify the code. Since we have to check "bp != NULL" anyway, the patch also removes the same check in ptrace_modify_breakpoint() and moves the comment into ptrace_write_dr7(). With this patch the second pass is only needed to restore the saved old_dr7. This should never fail, so the patch adds WARN_ON() to catch the potential problems as Frederic suggested. Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jan Kratochvil <[email protected]> Cc: Michael Neuling <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Will Deacon <[email protected]> Cc: Prasad <[email protected]> Cc: Russell King <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	ptrace/x86: simplify the "disable" logic in ptrace_write_dr7()	Oleg Nesterov	1	-25/+15
	ptrace_write_dr7() looks unnecessarily overcomplicated. We can factor out ptrace_modify_breakpoint() and do not do "continue" twice, just we need to pass the proper "disabled" argument to ptrace_modify_breakpoint(). Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jan Kratochvil <[email protected]> Cc: Michael Neuling <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Will Deacon <[email protected]> Cc: Prasad <[email protected]> Cc: Russell King <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	ptrace: revert "Prepare to fix racy accesses on task breakpoints"	Oleg Nesterov	4	-30/+1
	This reverts commit bf26c018490c ("Prepare to fix racy accesses on task breakpoints"). The patch was fine but we can no longer race with SIGKILL after commit 9899d11f6544 ("ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL"), the __TASK_TRACED tracee can't be woken up and ->ptrace_bps[] can't go away. Now that ptrace_get_breakpoints/ptrace_put_breakpoints have no callers, we can kill them and remove task->ptrace_bp_refcnt. Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Acked-by: Michael Neuling <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jan Kratochvil <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Will Deacon <[email protected]> Cc: Prasad <[email protected]> Cc: Russell King <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	ptrace/sh: revert "hw_breakpoints: Fix racy access to ptrace breakpoints"	Oleg Nesterov	1	-4/+0
	This reverts commit e0ac8457d020 ("hw_breakpoints: Fix racy access to ptrace breakpoints"). The patch was fine but we can no longer race with SIGKILL after commit 9899d11f6544 ("ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL"), the __TASK_TRACED tracee can't be woken up and ->ptrace_bps[] can't go away. Signed-off-by: Oleg Nesterov <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jan Kratochvil <[email protected]> Cc: Michael Neuling <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Will Deacon <[email protected]> Cc: Prasad <[email protected]> Cc: Russell King <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	ptrace/arm: revert "hw_breakpoints: Fix racy access to ptrace breakpoints"	Oleg Nesterov	1	-8/+0
	This reverts commit bf0b8f4b55e5 ("hw_breakpoints: Fix racy access to ptrace breakpoints"). The patch was fine but we can no longer race with SIGKILL after commit 9899d11f6544 ("ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL"), the __TASK_TRACED tracee can't be woken up and ->ptrace_bps[] can't go away. Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Will Deacon <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jan Kratochvil <[email protected]> Cc: Michael Neuling <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Prasad <[email protected]> Cc: Russell King <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	ptrace/powerpc: revert "hw_breakpoints: Fix racy access to ptrace breakpoints"	Oleg Nesterov	1	-26/+4
	This reverts commit 07fa7a0a8a58 ("hw_breakpoints: Fix racy access to ptrace breakpoints") and removes ptrace_get/put_breakpoints() added by other commits. The patch was fine but we can no longer race with SIGKILL after commit 9899d11f6544 ("ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL"), the __TASK_TRACED tracee can't be woken up and ->ptrace_bps[] can't go away. Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Michael Neuling <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jan Kratochvil <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Will Deacon <[email protected]> Cc: Prasad <[email protected]> Cc: Russell King <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	ptrace/x86: revert "hw_breakpoints: Fix racy access to ptrace breakpoints"	Oleg Nesterov	1	-23/+5
	This reverts commit 87dc669ba257 ("hw_breakpoints: Fix racy access to ptrace breakpoints"). The patch was fine but we can no longer race with SIGKILL after commit 9899d11f6544 ("ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL"), the __TASK_TRACED tracee can't be woken up and ->ptrace_bps[] can't go away. The patch only removes ptrace_get_breakpoints/ptrace_put_breakpoints and does a couple of "while at it" cleanups, it doesn't remove other changes from the reverted commit. Signed-off-by: Oleg Nesterov <[email protected]> Acked-by: Ingo Molnar <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Jan Kratochvil <[email protected]> Cc: Michael Neuling <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Will Deacon <[email protected]> Cc: Prasad <[email protected]> Cc: Russell King <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	fatfs: add FAT_IOCTL_GET_VOLUME_ID	Mike Lockwood	4	-0/+31
	This patch, originally from Android kernel, adds vfat ioctl command FAT_IOCTL_GET_VOLUME_ID, with this command we can get the vfat volume ID using following code: ioctl(fd, FAT_IOCTL_GET_VOLUME_ID, &volume_ID) This patch is a modified version of the patch by Mike Lockwood, with changes from Dmitry Pervushin, who noticed the original patch makes some volume IDs abiguous with error returns: for example, if volume id is 0xFFFFFDAD, that matches -ENOIOCTLCMD, we get "FFFFFFFF" from the user space. So add a parameter to ioctl to get the correct volume ID. Android uses vfat volume ID to identify different sd card, when a new sd card is inserted to device, android can scan the media on it and pop up new contents. Signed-off-by: Bintian Wang <[email protected]> Cc: dmitry pervushin <[email protected]> Cc: Mike Lockwood <[email protected]> Cc: Colin Cross <[email protected]> Acked-by: OGAWA Hirofumi <[email protected]> Cc: John Stultz <[email protected]> Cc: Sean McNeil <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	drivers/rtc/rtc-stmp3xxx.c: check the return value from stmp_reset_block()	Fabio Estevam	1	-1/+6
	stmp_reset_block() may fail, so let's check its return value and propagate it in the case of error. Signed-off-by: Fabio Estevam <[email protected]> Acked-by: Shawn Guo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	ncpfs: fix error return code in ncp_parse_options()	Wei Yongjun	1	-3/+9
	Fix to return -EINVAL from the option parse error handling case instead of 0, as done elsewhere in this function. Signed-off-by: Wei Yongjun <[email protected]> Cc: Petr Vandrovec <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	checkpatch: make the CamelCase cache work for non-git trees too	Joe Perches	1	-19/+35
	Might as well check include timestamps and cache the include file CamelCase uses for the non-git case too. The camelcase cache file is now named: for git: .checkpatch-camelcase.git.<commit_id> for non-git: .checkpatch-camelcase.date.<YYYYMMDDhhmm> All .checkpatch-camelcase* files are deleted if not current. Signed-off-by: Joe Perches <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	panic: add cpu/pid to warn_slowpath_common in WARNING printk()s	Alex Thorlton	1	-2/+3
	Add the cpu/pid that called WARN() so that the stack traces can be matched up with the WARNING messages. [[email protected]: remove stray quote] Signed-off-by: Alex Thorlton <[email protected]> Reviewed-by: Robin Holt <[email protected]> Cc: Stephen Boyd <[email protected]> Cc: Vikram Mulukutla <[email protected]> Cc: Rusty Russell <[email protected]> Cc: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	mm/memory_hotplug.c: fix return value of online_pages()	Toshi Kani	1	-3/+3
	online_pages() is called from memory_block_action() when a user requests to online a memory block via sysfs. This function needs to return a proper error value in case of error. Signed-off-by: Toshi Kani <[email protected]> Cc: Yasuaki Ishimatsu <[email protected]> Cc: Tang Chen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	mm: honor min_free_kbytes set by user	Michal Hocko	1	-7/+17
	min_free_kbytes is updated during memory hotplug (by init_per_zone_wmark_min) currently which is right thing to do in most cases but this could be unexpected if admin increased the value to prevent from allocation failures and the new min_free_kbytes would be decreased as a result of memory hotadd. This patch saves the user defined value and allows updating min_free_kbytes only if it is higher than the saved one. A warning is printed when the new value is ignored. Signed-off-by: Michal Hocko <[email protected]> Cc: Mel Gorman <[email protected]> Acked-by: Zhang Yanfei <[email protected]> Acked-by: KOSAKI Motohiro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	memcg: don't need to free memcg via RCU or workqueue	Li Zefan	1	-46/+5
	Now memcg has the same life cycle with its corresponding cgroup, and a cgroup is freed via RCU and then mem_cgroup_css_free() will be called in a work function, so we can simply call __mem_cgroup_free() in mem_cgroup_css_free(). This actually reverts commit 59927fb984d ("memcg: free mem_cgroup by RCU to fix oops"). Signed-off-by: Li Zefan <[email protected]> Cc: Hugh Dickins <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: KAMEZAWA Hiroyuki <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Glauber Costa <[email protected]> Cc: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	memcg: kill memcg refcnt	Li Zefan	1	-17/+1
	Now memcg has the same life cycle as its corresponding cgroup. Kill the useless refcnt. Signed-off-by: Li Zefan <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: KAMEZAWA Hiroyuki <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Glauber Costa <[email protected]> Cc: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	memcg: don't need to get a reference to the parent	Li Zefan	1	-16/+3
	The cgroup core guarantees it's always safe to access the parent. Signed-off-by: Li Zefan <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: KAMEZAWA Hiroyuki <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Glauber Costa <[email protected]> Cc: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	memcg: use css_get/put for swap memcg	Li Zefan	1	-10/+16
	Use css_get/put instead of mem_cgroup_get/put. A simple replacement will do. The historical reason that memcg has its own refcnt instead of always using css_get/put, is that cgroup couldn't be removed if there're still css refs, so css refs can't be used as long-lived reference. The situation has changed so that rmdir a cgroup will succeed regardless css refs, but won't be freed until css refs goes down to 0. Signed-off-by: Li Zefan <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: KAMEZAWA Hiroyuki <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Glauber Costa <[email protected]> Cc: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	memcg: use css_get/put when charging/uncharging kmem	Li Zefan	1	-26/+54
	Use css_get/put instead of mem_cgroup_get/put. We can't do a simple replacement, because here mem_cgroup_put() is called during mem_cgroup_css_free(), while mem_cgroup_css_free() won't be called until css refcnt goes down to 0. Instead we increment css refcnt in mem_cgroup_css_offline(), and then check if there's still kmem charges. If not, css refcnt will be decremented immediately, otherwise the refcnt will be released after the last kmem allocation is uncahred. [[email protected]: tweak comment] Signed-off-by: Li Zefan <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: KAMEZAWA Hiroyuki <[email protected]> Reviewed-by: Tejun Heo <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Glauber Costa <[email protected]> Cc: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	memcg: don't use mem_cgroup_get() when creating a kmemcg cache	Li Zefan	1	-5/+5
	Use css_get()/css_put() instead of mem_cgroup_get()/mem_cgroup_put(). There are two things being done in the current code: First, we acquired a css_ref to make sure that the underlying cgroup would not go away. That is a short lived reference, and it is put as soon as the cache is created. At this point, we acquire a long-lived per-cache memcg reference count to guarantee that the memcg will still be alive. so it is: enqueue: css_get create : memcg_get, css_put destroy: memcg_put So we only need to get rid of the memcg_get, change the memcg_put to css_put, and get rid of the now extra css_put. (This changelog is mostly written by Glauber) Signed-off-by: Li Zefan <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: KAMEZAWA Hiroyuki <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Glauber Costa <[email protected]> Cc: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	memcg: use css_get() in sock_update_memcg()	Li Zefan	1	-4/+4
	Use css_get/css_put instead of mem_cgroup_get/put. Note, if at the same time someone is moving @current to a different cgroup and removing the old cgroup, css_tryget() may return false, and sock->sk_cgrp won't be initialized, which is fine. Signed-off-by: Li Zefan <[email protected]> Acked-by: KAMEZAWA Hiroyuki <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Glauber Costa <[email protected]> Cc: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	memcg, kmem: fix reference count handling on the error path	Michal Hocko	1	-8/+0
	mem_cgroup_css_online calls mem_cgroup_put if memcg_init_kmem fails. This is not correct because only memcg_propagate_kmem takes an additional reference while mem_cgroup_sockets_init is allowed to fail as well (although no current implementation fails) but it doesn't take any reference. This all suggests that it should be memcg_propagate_kmem that should clean up after itself so this patch moves mem_cgroup_put over there. Unfortunately this is not that easy (as pointed out by Li Zefan) because memcg_kmem_mark_dead marks the group dead (KMEM_ACCOUNTED_DEAD) if it is marked active (KMEM_ACCOUNTED_ACTIVE) which is the case even if memcg_propagate_kmem fails so the additional reference is dropped in that case in kmem_cgroup_destroy which means that the reference would be dropped two times. The easiest way then would be to simply remove mem_cgrroup_put from mem_cgroup_css_online and rely on kmem_cgroup_destroy doing the right thing. Signed-off-by: Michal Hocko <[email protected]> Signed-off-by: Li Zefan <[email protected]> Acked-by: KAMEZAWA Hiroyuki <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Glauber Costa <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: <[email protected]> [3.8] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	Revert "memcg: avoid dangling reference count in creation failure"	Michal Hocko	1	-2/+0
	This reverts commit e4715f01be697a. mem_cgroup_put is hierarchy aware so mem_cgroup_put(memcg) already drops an additional reference from all parents so the additional mem_cgrroup_put(parent) potentially causes use-after-free. Signed-off-by: Michal Hocko <[email protected]> Signed-off-by: Li Zefan <[email protected]> Acked-by: KAMEZAWA Hiroyuki <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Glauber Costa <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: <[email protected]> [3.9+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	mmap: allow MAP_HUGETLB for hugetlbfs files v2	Jörn Engel	1	-2/+4
	It is counterintuitive at best that mmap'ing a hugetlbfs file with MAP_HUGETLB fails, while mmap'ing it without will a) succeed and b) return huge pages. v2: use is_file_hugepages(), as suggested by Jianguo Signed-off-by: Joern Engel <[email protected]> Cc: Jianguo Wu <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	mm: vmscan: do not scale writeback pages when deciding whether to set ↵	Mel Gorman	1	-15/+1
	ZONE_WRITEBACK After the patch "mm: vmscan: Flatten kswapd priority loop" was merged the scanning priority of kswapd changed. The priority now rises until it is scanning enough pages to meet the high watermark. shrink_inactive_list sets ZONE_WRITEBACK if a number of pages were encountered under writeback but this value is scaled based on the priority. As kswapd frequently scans with a higher priority now it is relatively easy to set ZONE_WRITEBACK. This patch removes the scaling and treates writeback pages similar to how it treats unqueued dirty pages and congested pages. The user-visible effect should be that kswapd will writeback fewer pages from reclaim context. Signed-off-by: Mel Gorman <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Dave Chinner <[email protected]> Cc: Kamezawa Hiroyuki <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	mm: vmscan: do not continue scanning if reclaim was aborted for compaction	Mel Gorman	1	-3/+5
	Direct reclaim is not aborting to allow compaction to go ahead properly. do_try_to_free_pages is told to abort reclaim which is happily ignores and instead increases priority instead until it reaches 0 and starts shrinking file/anon equally. This patch corrects the situation by aborting reclaim when requested instead of raising priority. Signed-off-by: Mel Gorman <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Dave Chinner <[email protected]> Cc: Kamezawa Hiroyuki <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	mm/memory_hotplug.c: fix a comment typo in register_page_bootmem_info_node()	Tang Chen	1	-2/+2
	Signed-off-by: Tang Chen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	mm/memblock.c: fix wrong comment in __next_free_mem_range()	Tang Chen	1	-1/+1
	Remove one redundant "nid" in the comment. Signed-off-by: Tang Chen <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	page migration: fix wrong comment in address_space_operations.migratepage()	Tang Chen	1	-2/+2
	There is no parameter "sync" in address_space_operations->migratepage(). It should be migrate_mode. And the comment is for MIGRATE_ASYNC. Signed-off-by: Tang Chen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	mm/vmalloc.c: fix an overflow bug in alloc_vmap_area()	Zhang Yanfei	1	-3/+3
	When searching a vmap area in the vmalloc space, we use (addr + size - 1) to check if the value is less than addr, which is an overflow. But we assign (addr + size) to vmap_area->va_end. So if we come across the below case: (addr + size - 1) : not overflow (addr + size) : overflow we will assign an overflow value (e.g 0) to vmap_area->va_end, And this will trigger BUG in __insert_vmap_area, causing system panic. So using (addr + size) to check the overflow should be the correct behaviour, not (addr + size - 1). Signed-off-by: Zhang Yanfei <[email protected]> Reported-by: Ghennadi Procopciuc <[email protected]> Tested-by: Daniel Baluta <[email protected]> Cc: David Rientjes <[email protected]> Cc: Minchan Kim <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	mm: remove unused VM_<READfoo> macros and expand other in-place	Joe Perches	4	-11/+5
	These VM_<READfoo> macros aren't used very often and three of them aren't used at all. Expand the ones that are used in-place, and remove all the now unused #define VM_<foo> macros. VM_READHINTMASK, VM_NormalReadHint and VM_ClearReadHint were added just before 2.4 and appears have never been used. Signed-off-by: Joe Perches <[email protected]> Acked-by: KOSAKI Motohiro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	mm/pgtable: don't accumulate addr during pgd prepopulate pmd	Wanpeng Li	1	-3/+1
	The old codes accumulate addr to get right pmd, however, currently pmds are preallocated and transfered as a parameter, there is unnecessary to accumulate addr variable any more, this patch remove it. Signed-off-by: Wanpeng Li <[email protected]> Reviewed-by: Michal Hocko <[email protected]> Reviewed-by: Zhang Yanfei <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	mm/thp: fix doc for transparent huge zero page	Wanpeng Li	1	-2/+2
	Transparent huge zero page is used during the page fault instead of in khugepaged. # ls /sys/kernel/mm/transparent_hugepage/ defrag enabled khugepaged use_zero_page # ls /sys/kernel/mm/transparent_hugepage/khugepaged/ alloc_sleep_millisecs defrag full_scans max_ptes_none pages_collapsed pages_to_scan scan_sleep_millisecs This patch corrects the documentation just like the codes done. Signed-off-by: Wanpeng Li <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	mm/page_alloc: fix doc for numa_zonelist_order	Wanpeng Li	1	-1/+1
	The default zonelist order selecter will select "node" order if any nodes DMA zone comprises greater than 70% of its local memory instead of 60%, according to default_zonelist_order::low_kmem_size > total * 70/100. Signed-off-by: Wanpeng Li <[email protected]> Reviewed-by: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	mm/writeback: commit reason of WB_REASON_FORKER_THREAD mismatch name	Wanpeng Li	1	-0/+6
	After commit 839a8e8660b6 ("writeback: replace custom worker pool implementation with unbound workqueue"), there is no bdi forker thread any more. However, WB_REASON_FORKER_THREAD is still used due to it is TPs userland visible and we won't be exposing exactly the same information with just a different name. Signed-off-by: Wanpeng Li <[email protected]> Reviewed-by: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	mm/writeback: don't check force_wait to handle bdi->work_list	Wanpeng Li	1	-8/+2
	After commit 839a8e8660b6 ("writeback: replace custom worker pool implementation with unbound workqueue"), bdi_writeback_workfn runs off bdi_writeback->dwork, on each execution, it processes bdi->work_list and reschedules if there are more things to do instead of flush any work that race with us existing. It is unecessary to check force_wait in wb_do_writeback since it is always 0 after the mentioned commit. This patch remove the force_wait in wb_do_writeback. Signed-off-by: Wanpeng Li <[email protected]> Reviewed-by: Tejun Heo <[email protected]> Reviewed-by: Fengguang Wu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	mm/writeback: remove wb_reason_name	Wanpeng Li	1	-1/+0
	wb_reason_name is not used any more - remove it. Signed-off-by: Wanpeng Li <[email protected]> Reviewed-by: Tejun Heo <[email protected]> Reviewed-by: Fengguang Wu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2013-07-09	fs/fs-writeback.c: : make wb_do_writeback() as static	Haicheng Li	2	-2/+1
	It's not used globally and could be static. Signed-off-by: Haicheng Li <[email protected]> Cc: Jan Kara <[email protected]> Cc: Wu Fengguang <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>