blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message	Author	Files	Lines
2015-08-22	Merge branch 'irq-urgent-for-linus' of ↵	Linus Torvalds	1	-1/+18
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "A series of small fixlets for a regression visible on OMAP devices caused by the conversion of the OMAP interrupt chips to hierarchical interrupt domains. Mostly one liners on the driver side plus a small helper function in the core to avoid open coded mess in the drivers" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/crossbar: Restore set_wake functionality irqchip/crossbar: Restore the mask on suspend behaviour ARM: OMAP: wakeupgen: Restore the irq_set_type() mechanism irqchip/crossbar: Restore the irq_set_type() mechanism genirq: Introduce irq_chip_set_type_parent() helper genirq: Don't return ENOSYS in irq_chip_retrigger_hierarchy
2015-08-22	Merge branch 'timers-urgent-for-linus' of ↵	Linus Torvalds	1	-2/+2
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "Two minimalistic fixes for 4.2 regressions: - Eric fixed a thinko in the timer_list base switching code caused by the overhaul of the timer wheel. It can cause a cpu to see the wrong base for a timer while we move the timer around. - Guenter fixed a regression for IMX if booted w/o device tree, where the timer interrupt is not initialized and therefore the machine fails to boot" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource/imx: Fix boot with non-DT systems timer: Write timer->flags atomically
2015-08-22	hrtimer: Handle failure of tick_init_highres() gracefully	Guenter Roeck	1	-0/+1
	Commit 75e3b37d0598 ("hrtimer: Drop return code of hrtimer_switch_to_hres()") drops the return code of hrtimer_switch_to_hres(). While doing so, it also drops the return statement itself on failure. This may cause a system hang. Seen when running arm:multi_v7_defconfig in qemu with devicetree file vexpress-v2p-ca9. Fixes: 75e3b37d0598 ("hrtimer: Drop return code of hrtimer_switch_to_hres()") Cc: Luiz Capitulino <[email protected]> Signed-off-by: Guenter Roeck <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-08-21	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller	4	-30/+84
	Conflicts: drivers/net/usb/qmi_wwan.c Overlapping additions of new device IDs to qmi_wwan.c Signed-off-by: David S. Miller <[email protected]>
2015-08-20	Merge branch 'fortglx/4.3/time' of ↵	Thomas Gleixner	4	-29/+40
	https://git.linaro.org/people/john.stultz/linux into timers/core - A handful of y2038 related items - A walltime to monotonic limit - Small fixes for timespec_trunc() and timer_list output
2015-08-20	Merge branch 'perf/urgent' into perf/core, to pick up fixes before adding ↵	Ingo Molnar	5	-9/+29
	more changes Signed-off-by: Ingo Molnar <[email protected]>
2015-08-20	genirq: Introduce irq_chip_set_type_parent() helper	Grygorii Strashko	1	-0/+17
	This helper is required for irq chips which do not implement an irq_set_type callback and need to call down the irq domain hierarchy for the actual trigger type change. This helper is required to fix further wreckage caused by the conversion of TI OMAP to hierarchical irq domains and is therefore tagged for stable. [ tglx: Massaged changelog ] Signed-off-by: Grygorii Strashko <[email protected]> Cc: Sudeep Holla <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: [email protected] # 4.1 Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-08-20	genirq: Don't return ENOSYS in irq_chip_retrigger_hierarchy	Grygorii Strashko	1	-1/+1
	irq_chip_retrigger_hierarchy() returns -ENOSYS if it was not able to find at least one .irq_retrigger() callback implemented in the IRQ domain hierarchy. That's wrong, because check_irq_resend() expects a 0 return value from the callback in case that the hardware assisted resend was not possible. If the return value is non zero the core code assumes hardware resend success and the software resend is not invoked. This results in lost interrupts on platforms where none of the parent irq chips in the hierarchy implements the retrigger callback. This is observable on TI OMAP, where the hierarchy is: ARM GIC <- OMAP wakeupgen <- TI Crossbar Return 0 instead so the software resend mechanism gets invoked. [ tglx: Massaged changelog ] Fixes: 85f08c17de26 ('genirq: Introduce helper functions...') Signed-off-by: Grygorii Strashko <[email protected]> Reviewed-by: Marc Zyngier <[email protected]> Reviewed-by: Jiang Liu <[email protected]> Cc: Sudeep Holla <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: [email protected] # 4.1 Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
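The return-value convention described in this commit message can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the kernel's code: `hw_retrigger_fn`, `resend_irq()` and the three `retrigger_*()` callbacks are hypothetical names invented for illustration. It shows why a `-ENOSYS` return (non-zero) was mistaken for a successful hardware resend, while `0` lets the software fallback run.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a hardware retrigger callback: non-zero means
 * "hardware resend done", 0 means "hardware resend not possible". */
typedef int (*hw_retrigger_fn)(void);

struct resend_result {
	bool hw_resent;	/* caller believed the hardware resent the irq */
	bool sw_resent;	/* software resend fallback was used */
};

int retrigger_ok(void)          { return 1; }
int retrigger_unsupported(void) { return 0; }		/* behaviour after the fix */
int retrigger_enosys(void)      { return -ENOSYS; }	/* behaviour before the fix */

/* Model of the caller: any non-zero return counts as hardware success,
 * so the software path only runs when the callback returns 0. */
struct resend_result resend_irq(hw_retrigger_fn retrigger)
{
	struct resend_result r = { false, false };

	assert(retrigger != NULL);
	if (retrigger() != 0)
		r.hw_resent = true;
	else
		r.sw_resent = true;
	return r;
}
```

With this model, passing `retrigger_enosys` leaves `sw_resent` false (the interrupt would be lost), whereas `retrigger_unsupported` triggers the software fallback.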
2015-08-18	cgroup: introduce cgroup_subsys->legacy_name	Tejun Heo	1	-11/+18
	This allows cgroup subsystems to use a different name on the unified hierarchy. cgroup_subsys->name is used on the unified hierarchy, ->legacy_name elsewhere. If ->legacy_name is not explicitly set, it's automatically set to ->name and the userland visible behavior remains unchanged. v2: Make parse_cgroupfs_options() only consider ->legacy_name as mount options are used only on legacy hierarchies. Suggested by Li Zefan. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Li Zefan <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: [email protected]
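The defaulting rule described here (an unset ->legacy_name falls back to ->name) can be sketched as follows. The struct and function names (`subsys_model`, `subsys_init_names()`, `subsys_visible_name()`) are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced model of a cgroup subsystem descriptor. */
struct subsys_model {
	const char *name;		/* used on the unified hierarchy */
	const char *legacy_name;	/* used elsewhere; may be left unset */
};

/* If no legacy name was set explicitly, reuse the regular name so the
 * userland-visible behaviour stays unchanged. */
void subsys_init_names(struct subsys_model *ss)
{
	assert(ss != NULL && ss->name != NULL);
	if (ss->legacy_name == NULL)
		ss->legacy_name = ss->name;
}

/* Pick the name that applies to the given kind of hierarchy. */
const char *subsys_visible_name(const struct subsys_model *ss, int on_unified)
{
	return on_unified ? ss->name : ss->legacy_name;
}
```

A subsystem that sets only `name` therefore reports the same string on both kinds of hierarchy, and one that sets both reports them separately.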
2015-08-18	cgroup: don't print subsystems for the default hierarchy	Tejun Heo	1	-6/+9
	It doesn't make sense to print subsystems on mount option or /proc/PID/cgroup for the default hierarchy. * cgroup.controllers file at the root of the default hierarchy lists the currently attached controllers. * The default hierarchy is catch-all for unmounted subsystems. * The default hierarchy doesn't accept any mount options. Suppress subsystem printing on mount options and /proc/PID/cgroup for the default hierarchy. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Li Zefan <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: [email protected]
2015-08-18	hrtimer: Unconfuse switch_hrtimer_base() a bit	Frederic Weisbecker	1	-8/+17
	The variable called "this_base" is confusing because its name suggests it's of "struct hrtimer_clock_base" type, along with "base" and "new_base" which doesn't help understanding this complicated function. Make its name clearer and fix the misleading comment while at it. [ tglx: Fixed the comment for real ] Signed-off-by: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-08-18	hrtimer: Simplify get_target_base() by returning current base	Frederic Weisbecker	1	-2/+2
	Instead of fetching again the current cpu base, just take it from the parameter. Signed-off-by: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-08-18	timer: Write timer->flags atomically	Eric Dumazet	1	-2/+2
	lock_timer_base() cannot prevent the following:

	    CPU1 (in __mod_timer())
	    timer->flags |= TIMER_MIGRATING;
	    spin_unlock(&base->lock);
	    base = new_base;
	    spin_lock(&base->lock);
	    // The next line clears TIMER_MIGRATING
	    timer->flags &= ~TIMER_BASEMASK;
	                                      CPU2 (in lock_timer_base())
	                                      see timer base is cpu0 base
	                                      spin_lock_irqsave(&base->lock, *flags);
	                                      if (timer->flags == tf)
	                                           return base; // oops, wrong base
	    timer->flags |= base->cpu // too late

	We must write timer->flags in one go, otherwise we can fool other cpus.
	Fixes: bc7a34b8b9eb ("timer: Reduce timer migration overhead if disabled")
	Signed-off-by: Eric Dumazet <[email protected]>
	Cc: Jon Christopherson <[email protected]>
	Cc: David Miller <[email protected]>
	Cc: [email protected]
	Cc: [email protected]
	Cc: Sander Eikelenboom <[email protected]>
	Link: http://lkml.kernel.org/r/[email protected]
	Signed-off-by: Thomas Gleixner <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
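The "write timer->flags in one go" idea can be illustrated with C11 atomics in userspace. This is a sketch only: the bit layout (`MODEL_CPUMASK`, `MODEL_MIGRATING`, `MODEL_BASEMASK`) and the function `timer_set_base()` are assumptions made for the example, and the kernel uses its own primitives rather than `<stdatomic.h>`. The point is that the new value is computed in a local variable and published with a single store, so a concurrent reader sees either the old flags (still marked as migrating) or the complete new flags, never the intermediate state with the base bits cleared.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical bit layout: low bits hold the CPU of the timer base,
 * one extra bit marks a timer that is being migrated. */
#define MODEL_CPUMASK	0x0000FFFFu
#define MODEL_MIGRATING	0x00010000u
#define MODEL_BASEMASK	(MODEL_CPUMASK | MODEL_MIGRATING)

/* Retarget the timer to new_cpu with one store. The caller is assumed
 * to hold the lock of the new base, so there is only one writer. */
void timer_set_base(_Atomic uint32_t *flags, uint32_t new_cpu)
{
	uint32_t old, updated;

	assert((new_cpu & ~MODEL_CPUMASK) == 0);
	old = atomic_load_explicit(flags, memory_order_relaxed);
	updated = (old & ~MODEL_BASEMASK) | new_cpu;
	atomic_store_explicit(flags, updated, memory_order_relaxed);
}
```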
2015-08-18	Merge branch 'x86/urgent' into x86/asm to fix up conflicts and to pick up fixes	Ingo Molnar	6	-36/+101
	Conflicts: arch/x86/entry/entry_64_compat.S arch/x86/math-emu/get_address.c Signed-off-by: Ingo Molnar <[email protected]>
2015-08-17	Merge branch 'for-4.2-fixes' of ↵	Linus Torvalds	1	-1/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fix from Tejun Heo: "A fix for a subtle bug introduced back during 3.17 cycle which interferes with setting configurations under specific conditions" * 'for-4.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cpuset: use trialcs->mems_allowed as a temp variable
2015-08-17	hrtimer: Drop return code of hrtimer_switch_to_hres()	Luiz Capitulino	1	-4/+2
	It's not checked by the caller. Signed-off-by: Luiz Capitulino <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2015-08-17	time: Introduce timespec64_to_jiffies()/jiffies_to_timespec64()	Baolin Wang	1	-8/+13
	The conversion between struct timespec and jiffies is not year 2038 safe on 32bit systems. Introduce timespec64_to_jiffies() and jiffies_to_timespec64() functions which use struct timespec64 to make it ready for 2038 issue. Cc: Prarit Bhargava <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Baolin Wang <[email protected]> Signed-off-by: John Stultz <[email protected]>
2015-08-17	time: Introduce current_kernel_time64()	Baolin Wang	1	-3/+3
	The current_kernel_time() is not year 2038 safe on 32bit systems since it returns a timespec value. Introduce current_kernel_time64() which returns a timespec64 value. Cc: Prarit Bhargava <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Baolin Wang <[email protected]> Signed-off-by: John Stultz <[email protected]>
2015-08-17	time: Add the common weak version of update_persistent_clock()	Xunlei Pang	1	-0/+5
	The weak update_persistent_clock64() calls update_persistent_clock(), if the architecture defines an update_persistent_clock64() to replace and remove its update_persistent_clock() version, when building the kernel the linker will throw an undefined symbol error, that is, any arch that switches to update_persistent_clock64() will have this issue. To solve the issue, we add the common weak update_persistent_clock(). Cc: Prarit Bhargava <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Arnd Bergmann <[email protected]> Signed-off-by: Xunlei Pang <[email protected]> Signed-off-by: John Stultz <[email protected]>
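The linking problem and its fix rest on weak symbols. The sketch below models the arrangement in userspace with the GCC/Clang `__attribute__((weak))` extension. All names carry a `_model` suffix because they are illustrative stand-ins, and the `-ENODEV` default return value is an assumption of this example. Because a weak default exists for both functions, an architecture may provide a strong definition of either one without leaving the other undefined at link time.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the 32-bit and 64-bit time structures. */
struct ts32_model { long tv_sec; long tv_nsec; };
struct ts64_model { long long tv_sec; long tv_nsec; };

/* Weak default for the old interface: used only when no strong
 * definition of the same symbol is linked in. */
__attribute__((weak)) int update_persistent_clock_model(struct ts32_model now)
{
	(void)now;
	return -ENODEV;	/* assumed "no persistent clock" result */
}

/* Weak default for the 64-bit interface: falls back to the old one. */
__attribute__((weak)) int update_persistent_clock64_model(struct ts64_model now64)
{
	struct ts32_model now = { (long)now64.tv_sec, now64.tv_nsec };

	assert(now64.tv_nsec >= 0);
	return update_persistent_clock_model(now);
}
```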
2015-08-17	time: Always make sure wall_to_monotonic isn't positive	Wang YanQing	1	-3/+10
	Two issues were found on an IMX6 development board without an enabled RTC device (resulting in the boot time and monotonic time being initialized to 0).

	Issue 1: exportfs -a generates: "exportfs: /opt/nfs/arm does not support NFS export"
	Issue 2: cat /proc/stat shows: "btime 4294967236"

	The same issues can be reproduced on x86 after running the following code:

	    #include <stddef.h>
	    #include <sys/time.h>

	    int main(void)
	    {
	            struct timeval val;
	            int ret;

	            /* Set the wall clock to the Epoch (requires privileges). */
	            val.tv_sec = 0;
	            val.tv_usec = 0;
	            ret = settimeofday(&val, NULL);
	            return 0;
	    }

	The two issues are different symptoms of the same problem: a positive wall_to_monotonic pushes the boot time back to before the Epoch, so getboottime() returns a negative value.
	In symptom 1: the negative boot time causes get_expiry() to overflow time_t when the input expire time is 2147483647, so cache_flush() always clears entries just added in ip_map_parse.
	In symptom 2: show_stat() uses "unsigned long" to print the negative btime value returned by getboottime().
	This patch fixes the problem by prohibiting time from being set to a value which would cause a negative boot time. As a result one can't set the CLOCK_REALTIME time prior to (1970 + system uptime).
	Cc: Prarit Bhargava <[email protected]>
	Cc: Richard Cochran <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Signed-off-by: Wang YanQing <[email protected]>
	[jstultz: reworded commit message]
	Signed-off-by: John Stultz <[email protected]>
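The constraint stated above, that CLOCK_REALTIME cannot be set earlier than (1970 + system uptime), amounts to rejecting any new wall time that would make wall_to_monotonic positive. A minimal sketch of that check, using plain nanosecond counters instead of the kernel's time structures; `check_settime()` and the `-EINVAL` return value are assumptions made for this example:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Returns 0 if the requested wall time is acceptable, -EINVAL if it
 * would place the boot time before the Epoch.
 *
 * Model: monotonic = realtime + wall_to_monotonic, so
 *        wall_to_monotonic = monotonic - realtime,
 * and a positive wall_to_monotonic means boot time < 0. */
int check_settime(int64_t new_realtime_ns, int64_t monotonic_ns)
{
	int64_t wall_to_monotonic_ns;

	assert(monotonic_ns >= 0);
	wall_to_monotonic_ns = monotonic_ns - new_realtime_ns;
	return wall_to_monotonic_ns > 0 ? -EINVAL : 0;
}
```

For example, setting the clock to the Epoch on a machine that has been up for 60 seconds is rejected, while setting it to exactly the uptime (or anything later) is accepted.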
2015-08-17	time: Fix nanosecond file time rounding in timespec_trunc()	Karsten Blees	1	-14/+8
	timespec_trunc() avoids rounding if granularity <= nanoseconds-per-jiffie (or TICK_NSEC). This optimization assumes that:
	1. current_kernel_time().tv_nsec is already rounded to TICK_NSEC (i.e. with HZ=1000 you'd get 1000000, 2000000, 3000000... but never 1000001). This is no longer true (probably since hrtimers introduced in 2.6.16).
	2. TICK_NSEC is evenly divisible by all possible granularities. This may be true for HZ=100, 250, 1000, but obviously not for HZ=300 / TICK_NSEC=3333333 (introduced in 2.6.20).
	Thus, sub-second portions of in-core file times are not rounded to on-disk granularity. I.e. file times may change when the inode is re-read from disk or when the file system is remounted.
	This affects all file systems with file time granularities > 1 ns and < 1s, e.g. CEPH (1000 ns), UDF (1000 ns), CIFS (100 ns), NTFS (100 ns) and FUSE (configurable from user mode via struct fuse_init_out.time_gran).
	Steps to reproduce with e.g. UDF:

	    $ dd if=/dev/zero of=udfdisk count=10000 && mkudffs udfdisk
	    $ mkdir udf && mount udfdisk udf
	    $ touch udf/test && stat -c %y udf/test
	    2015-06-09 10:22:56.130006767 +0200
	    $ umount udf && mount udfdisk udf
	    $ stat -c %y udf/test
	    2015-06-09 10:22:56.130006000 +0200

	Remounting truncates the mtime to 1 µs.
	Fix the rounding in timespec_trunc() and update the documentation.
	timespec_trunc() is exclusively used to calculate inode's [acm]time (mostly via current_fs_time()), and always with super_block.s_time_gran as second argument. So this can safely be changed without side effects.
	Note: This does _not_ fix the issue for FAT's 2 second mtime resolution, as super_block.s_time_gran isn't prepared to handle different ctime / mtime / atime resolutions nor resolutions > 1 second.
	Cc: Prarit Bhargava <[email protected]>
	Cc: Richard Cochran <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Signed-off-by: Karsten Blees <[email protected]>
	Signed-off-by: John Stultz <[email protected]>
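The corrected rounding can be sketched in userspace as below. This is an illustration of the behaviour the message describes, not a copy of the kernel function: `timespec_trunc_model()` and `NSEC_PER_SEC_MODEL` are hypothetical names, and granularities outside the range 1 ns to 1 s are simply left untouched here. The sketch always truncates tv_nsec to a multiple of the granularity and makes no assumption about TICK_NSEC.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC_MODEL 1000000000L

/* Truncate the sub-second part of t to a multiple of gran nanoseconds. */
struct timespec timespec_trunc_model(struct timespec t, unsigned int gran)
{
	assert(t.tv_nsec >= 0 && t.tv_nsec < NSEC_PER_SEC_MODEL);

	if (gran == 1) {
		/* nanosecond granularity: nothing to do */
	} else if (gran == NSEC_PER_SEC_MODEL) {
		/* one-second granularity: drop the sub-second part */
		t.tv_nsec = 0;
	} else if (gran > 1 && gran < NSEC_PER_SEC_MODEL) {
		/* general case: round down to a multiple of gran */
		t.tv_nsec -= t.tv_nsec % (long)gran;
	}
	/* any other granularity is left unchanged in this sketch */
	return t;
}
```

Applied to the UDF example from the message (granularity 1000 ns), a tv_nsec of 130006767 becomes 130006000 immediately, so the in-core value already matches what is read back after a remount.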
2015-08-17	timer_list: Add the base offset so remaining nsecs are accurate for non ↵	John Stultz	1	-1/+1
	monotonic timers
	I noticed for non-monotonic timers in timer_list, some of the output looked a little confusing. For example:

	    #1: <0000000000000000>, posix_timer_fn, S:01, hrtimer_start_range_ns, leap-a-day/2360
	    # expires at 1434412800000000000-1434412800000000000 nsecs [in 1434410725062375469 to 1434410725062375469 nsecs]

	You'll note the relative time till the expiration "[in xxx to yyy nsecs]" is incorrect. This is because it's printing the delta between the CLOCK_MONOTONIC time and the CLOCK_REALTIME expiration.
	This patch fixes this issue by adding the clock offset to the "now" time which we use to calculate the delta.
	Cc: Prarit Bhargava <[email protected]>
	Cc: Daniel Bristot de Oliveira <[email protected]>
	Cc: Richard Cochran <[email protected]>
	Cc: Jan Kara <[email protected]>
	Cc: Jiri Bohac <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Cc: Shuah Khan <[email protected]>
	Signed-off-by: John Stultz <[email protected]>
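The arithmetic behind this fix is small enough to show directly. In the sketch below, `remaining_ns()` is a hypothetical helper and `base_offset_ns` stands for the offset between CLOCK_MONOTONIC and the timer's own clock base (zero for monotonic timers). The expiry is compared against "now" expressed in the same clock domain.

```c
#include <assert.h>
#include <stdint.h>

/* Time left until a timer expires, measured in the timer's own clock
 * domain. base_offset_ns converts a CLOCK_MONOTONIC "now" into that
 * domain (for example CLOCK_REALTIME). */
int64_t remaining_ns(int64_t expires_ns, int64_t now_monotonic_ns,
		     int64_t base_offset_ns)
{
	int64_t now_in_base;

	assert(now_monotonic_ns >= 0);
	now_in_base = now_monotonic_ns + base_offset_ns;
	return expires_ns - now_in_base;
}
```

Using the figures from the output above, an expiry of 1434412800000000000 and a reported delta of 1434410725062375469 imply a monotonic "now" of 2074937624531 ns. With a zero offset the helper reproduces that misleading delta, and with the realtime offset applied it yields the true remaining time.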
2015-08-15	percpu-rwsem: kill CONFIG_PERCPU_RWSEM	Oleg Nesterov	1	-2/+1
	Remove CONFIG_PERCPU_RWSEM, the next patch adds the unconditional user of percpu_rw_semaphore. Signed-off-by: Oleg Nesterov <[email protected]>
2015-08-15	percpu-rwsem: introduce percpu_down_read_trylock()	Oleg Nesterov	1	-0/+13
	Add percpu_down_read_trylock(), it will have the user soon. Signed-off-by: Oleg Nesterov <[email protected]>
2015-08-14	Merge branch 'perf-urgent-for-linus' of ↵	Linus Torvalds	2	-28/+73
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc fixes: PMU driver corner cases, tooling fixes, and an 'AUX' (Intel PT) race related core fix" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/cqm: Do not access cpu_data() from CPU_UP_PREPARE handler perf/x86/intel: Fix memory leak on hot-plug allocation fail perf: Fix PERF_EVENT_IOC_PERIOD migration race perf: Fix double-free of the AUX buffer perf: Fix fasync handling on inherited events perf tools: Fix test build error when bindir contains double slash perf stat: Fix transaction lenght metrics perf: Fix running time accounting
2015-08-14	Merge branch 'locking-urgent-for-linus' of ↵	Linus Torvalds	1	-1/+10
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Ingo Molnar: "A single fix for a locking self-test crash" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/pvqspinlock: Fix kernel panic in locking-selftest
2015-08-13	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller	3	-7/+18
	Conflicts: drivers/net/ethernet/cavium/Kconfig The cavium conflict was overlapping dependency changes. Signed-off-by: David S. Miller <[email protected]>
2015-08-12	bpf: fix bpf_perf_event_read() loop upper bound	Wei-Chun Chao	1	-1/+1
	Verifier rejects programs incorrectly. Fixes: 35578d798400 ("bpf: Implement function bpf_perf_event_read()") Cc: Kaixu Xia <[email protected]> Cc: Alexei Starovoitov <[email protected]> Signed-off-by: Wei-Chun Chao <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-12	userns,pidns: Force thread group sharing, not signal handler sharing.	Eric W. Biederman	2	-6/+6
	The code that places signals in signal queues computes the uids, gids, and pids at the time the signals are enqueued. Which means that tasks that share signal queues must be in the same pid and user namespaces. Sharing signal handlers is fine, but bizarre. So make the code in fork and userns_install clearer by only testing for what is functionally necessary. Also update the comment in unshare about unsharing a user namespace to be a little more explicit and make a little more sense. Acked-by: Oleg Nesterov <[email protected]> Signed-off-by: "Eric W. Biederman" <[email protected]>
2015-08-12	unshare: Unsharing a thread does not require unsharing a vm	Eric W. Biederman	1	-10/+18
	The logic in the initial commit of unshare made creating a new thread group for a process contingent upon creating a new memory address space for that process. That is wrong. Two separate processes in different thread groups can share a memory address space and clone allows creation of such processes. This is significant because it was observed that mm_users > 1 does not mean that a process is multi-threaded, as reading /proc/PID/maps temporarily increments mm_users, which allows other processes to (accidentally) interfere with unshare() calls. Correct the checks in check_unshare_flags() to test for !thread_group_empty() for CLONE_THREAD, CLONE_SIGHAND, and CLONE_VM; for sighand->count > 1 for CLONE_SIGHAND and CLONE_VM; and for !current_is_single_threaded() instead of mm_users > 1 for CLONE_VM. Using the correct checks in unshare removes the possibility of an accidental denial of service attack. Additionally, using the correct checks in unshare ensures that only an explicit unshare(CLONE_VM) can possibly trigger the slow path of current_is_single_threaded(). As an explicit unshare(CLONE_VM) is pointless, it is not expected that many applications make that call. Cc: [email protected] Fixes: b2e0d98705e60e45bbb3c0032c48824ad7ae0704 userns: Implement unshare of the user namespace Reported-by: Ricky Zhou <[email protected]> Reported-by: Kees Cook <[email protected]> Reviewed-by: Kees Cook <[email protected]> Signed-off-by: "Eric W. Biederman" <[email protected]>
2015-08-12	Merge branch 'for-mingo' of ↵	Ingo Molnar	14	-498/+620
	git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU changes from Paul E. McKenney: - The combination of tree geometry-initialization simplifications and OS-jitter-reduction changes to expedited grace periods. These two are stacked due to the large number of conflicts that would otherwise result. [ With one addition, a temporary commit to silence a lockdep false positive. Additional changes to the expedited grace-period primitives (queued for 4.4) remove the cause of this false positive, and therefore include a revert of this temporary commit. ] - Documentation updates. - Torture-test updates. - Miscellaneous fixes. Signed-off-by: Ingo Molnar <[email protected]>
2015-08-12	sched/deadline: Fix comment in enqueue_task_dl()	Andrea Parri	1	-1/+1
	The "dl_boosted" flag is set by comparing absolute deadlines (c.f., rt_mutex_setprio()). Signed-off-by: Andrea Parri <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-08-12	sched/deadline: Fix comment in push_dl_tasks()	Andrea Parri	1	-1/+1
	The comment is "misleading"; fix it by adapting a comment from push_rt_tasks(). Signed-off-by: Andrea Parri <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-08-12	sched: Change the sched_class::set_cpus_allowed() calling context	Peter Zijlstra	3	-81/+26
	Change the calling context of sched_class::set_cpus_allowed() such that we can assume the task is inactive. This allows us to easily make changes that affect accounting done by enqueue/dequeue. This does in fact completely remove set_cpus_allowed_rt() and greatly reduces set_cpus_allowed_dl(). Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-08-12	sched: Make sched_class::set_cpus_allowed() unconditional	Peter Zijlstra	7	-18/+36
	Give every class a set_cpus_allowed() method, this enables some small optimization in the RT,DL implementation by avoiding a double cpumask_weight() call. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-08-12	sched: Fix a race between __kthread_bind() and sched_setaffinity()	Peter Zijlstra	3	-11/+51
	Because sched_setaffinity() checks p->flags & PF_NO_SETAFFINITY without locks, a caller might observe an old value and race with the set_cpus_allowed_ptr() call from __kthread_bind() and effectively undo it:

	    __kthread_bind()
	      do_set_cpus_allowed()
	                                    <SYSCALL>
	                                      sched_setaffinity()
	                                        if (p->flags & PF_NO_SETAFFINITY)
	                                        set_cpus_allowed_ptr()
	      p->flags |= PF_NO_SETAFFINITY

	Fix the bug by putting everything under the regular scheduler locks.
	This also closes a hole in the serialization of task_struct::{nr_,}cpus_allowed.
	Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
	Acked-by: Tejun Heo <[email protected]>
	Cc: Linus Torvalds <[email protected]>
	Cc: Mike Galbraith <[email protected]>
	Cc: Oleg Nesterov <[email protected]>
	Cc: Peter Zijlstra <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Cc: [email protected]
	Cc: [email protected]
	Cc: [email protected]
	Cc: [email protected]
	Cc: [email protected]
	Link: http://lkml.kernel.org/r/[email protected]
	Signed-off-by: Ingo Molnar <[email protected]>
2015-08-12	sched: Ensure a task has a non-normalized vruntime when returning back to CFS	Byungchul Park	1	-2/+17
	Current code ensures that a task has a normalized vruntime when switching away from the fair class, but it does not ensure the task has a non-normalized vruntime when switching back to the fair class. This is an example breaking this consistency: 1. a task is in fair class and !queued 2. changes its class to RT class (still !queued) 3. changes its class to fair class again (still !queued) Signed-off-by: Byungchul Park <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-08-12	sched/numa: Fix NUMA_DIRECT topology identification	Aravind Gopalakrishnan	1	-1/+3
	Systems which have all nodes at a distance of at most 1 hop should be identified as 'NUMA_DIRECT'. However, the scheduler incorrectly identifies it as 'NUMA_BACKPLANE'. This is because 'n' is assigned to sched_max_numa_distance but the code (mis)interprets it to mean 'number of hops'. Rik had actually used sched_domains_numa_levels for detecting a 'NUMA_DIRECT' topology: http://marc.info/?l=linux-kernel&m=141279712429834&w=2 But that was changed when he removed the hops table in the subsequent version: http://marc.info/?l=linux-kernel&m=141353106106771&w=2 Fixing the issue here. Signed-off-by: Aravind Gopalakrishnan <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Rik van Riel <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-08-12	locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics	Will Deacon	1	-12/+12
	The qrwlock implementation is slightly heavy in its use of memory barriers, mainly through the use of _cmpxchg() and _return() atomics, which imply full barrier semantics. This patch modifies the qrwlock code to use the more relaxed atomic routines so that we can reduce the unnecessary barrier overhead on weakly-ordered architectures. Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
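The idea of replacing fully ordered atomics with acquire, release and relaxed variants can be illustrated with C11 `<stdatomic.h>`. This is an analogy in portable userspace C, not the qrwlock code: the lock layout, `MODEL_WRITER_LOCKED`, and both functions are hypothetical. Taking the lock only needs acquire ordering, dropping it only needs release ordering, and a failed attempt needs no ordering at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_WRITER_LOCKED 0xffu	/* hypothetical "writer holds lock" value */

struct rwlock_model {
	atomic_uint cnts;	/* 0 means unlocked in this reduced model */
};

/* Try to take the lock for writing. Success uses acquire ordering so
 * the critical section cannot move before the lock; failure is relaxed. */
bool write_trylock_model(struct rwlock_model *l)
{
	unsigned int expected = 0;

	return atomic_compare_exchange_strong_explicit(
		&l->cnts, &expected, MODEL_WRITER_LOCKED,
		memory_order_acquire, memory_order_relaxed);
}

/* Release the lock. Release ordering keeps the critical section from
 * moving past the unlock; no full barrier is required. */
void write_unlock_model(struct rwlock_model *l)
{
	assert(atomic_load_explicit(&l->cnts, memory_order_relaxed) ==
	       MODEL_WRITER_LOCKED);
	atomic_store_explicit(&l->cnts, 0, memory_order_release);
}
```

On strongly ordered machines the generated code may be identical to the fully ordered version; the saving shows up on weakly ordered architectures, which is the case the commit message targets.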
2015-08-12	perf/ring-buffer: Clarify the use of page::private for high-order AUX ↵	Alexander Shishkin	1	-1/+4
	allocations A question [1] was raised about the use of page::private in AUX buffer allocations, so let's add a clarification about its intended use. The private field and flag are used by perf's rb_alloc_aux() path to tell the pmu driver the size of each high-order allocation, so that the driver can program those appropriately into its hardware. This only matters for PMUs that don't support hardware scatter tables. Otherwise, every page in the buffer is just a page. This patch adds a comment about the private field to the AUX buffer allocation path. [1] http://marc.info/?l=linux-kernel&m=143803696607968 Reported-by: Mathieu Poirier <[email protected]> Signed-off-by: Alexander Shishkin <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/1438063204-665-1-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar <[email protected]>
2015-08-12	Merge branch 'perf/urgent' into perf/core, to pick up fixes before applying ↵	Ingo Molnar	2	-26/+71
	new changes Signed-off-by: Ingo Molnar <[email protected]>
2015-08-12	perf: Fix PERF_EVENT_IOC_PERIOD migration race	Peter Zijlstra	1	-20/+55
	I ran the perf fuzzer, which triggered some WARN()s which are due to trying to stop/restart an event on the wrong CPU. Use the normal IPI pattern to ensure we run the code on the correct CPU. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Vince Weaver <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Fixes: bad7192b842c ("perf: Fix PERF_EVENT_IOC_PERIOD to force-reset the period") Signed-off-by: Ingo Molnar <[email protected]>
2015-08-12	perf: Fix double-free of the AUX buffer	Ben Hutchings	1	-4/+6
	If rb->aux_refcount is decremented to zero before rb->refcount, __rb_free_aux() may be called twice resulting in a double free of rb->aux_pages. Fix this by adding a check to __rb_free_aux(). Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Fixes: 57ffc5ca679f ("perf: Fix AUX buffer refcounting") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
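A guard of the kind described ("adding a check to __rb_free_aux()") can be sketched as follows. Everything here is a hypothetical userspace model: `struct aux_model`, its fields, and the choice of `nr_pages == 0` as the "already freed" marker are assumptions for illustration, and the `free_calls` counter exists only so the effect can be observed. The free routine resets its state after releasing the pages, so a second call becomes a no-op instead of a double free.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, reduced model of a buffer with separately freed pages. */
struct aux_model {
	void **pages;
	int nr_pages;
	int free_calls;	/* instrumentation: counts real frees */
};

int aux_alloc_model(struct aux_model *rb, int nr_pages)
{
	assert(rb != NULL && nr_pages > 0);
	rb->pages = calloc((size_t)nr_pages, sizeof(*rb->pages));
	if (rb->pages == NULL)
		return -1;
	rb->nr_pages = nr_pages;
	for (int i = 0; i < nr_pages; i++)
		rb->pages[i] = malloc(64);	/* free(NULL) is harmless on failure */
	return 0;
}

void aux_free_model(struct aux_model *rb)
{
	assert(rb != NULL);
	if (rb->nr_pages == 0)
		return;		/* the guard: nothing left to free */

	for (int i = 0; i < rb->nr_pages; i++)
		free(rb->pages[i]);
	free(rb->pages);
	rb->pages = NULL;
	rb->nr_pages = 0;	/* mark as freed so a second call is a no-op */
	rb->free_calls++;
}
```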
2015-08-10	cpuset: use trialcs->mems_allowed as a temp variable	Alban Crequy	1	-1/+1
	The comment says it's using trialcs->mems_allowed as a temp variable but it didn't match the code. Change the code to match the comment.
	This fixes an issue when writing in cpuset.mems when a sub-directory exists: we need to write several times for the information to persist:

	    | root@alban:/sys/fs/cgroup/cpuset# mkdir footest9
	    | root@alban:/sys/fs/cgroup/cpuset# cd footest9
	    | root@alban:/sys/fs/cgroup/cpuset/footest9# mkdir aa
	    | root@alban:/sys/fs/cgroup/cpuset/footest9# cat cpuset.mems
	    |
	    | root@alban:/sys/fs/cgroup/cpuset/footest9# echo 0 > cpuset.mems
	    | root@alban:/sys/fs/cgroup/cpuset/footest9# cat cpuset.mems
	    |
	    | root@alban:/sys/fs/cgroup/cpuset/footest9# echo 0 > cpuset.mems
	    | root@alban:/sys/fs/cgroup/cpuset/footest9# cat cpuset.mems
	    | 0
	    | root@alban:/sys/fs/cgroup/cpuset/footest9# cat aa/cpuset.mems
	    |
	    | root@alban:/sys/fs/cgroup/cpuset/footest9# echo 0 > aa/cpuset.mems
	    | root@alban:/sys/fs/cgroup/cpuset/footest9# cat aa/cpuset.mems
	    | 0
	    | root@alban:/sys/fs/cgroup/cpuset/footest9#

	This should help to fix the following issue in Docker: https://github.com/opencontainers/runc/issues/133
	In some conditions, a Docker container needs to be started twice in order to work.
	Signed-off-by: Alban Crequy <[email protected]>
	Tested-by: Iago López Galeiras <[email protected]>
	Cc: <[email protected]> # 3.17+
	Acked-by: Li Zefan <[email protected]>
	Signed-off-by: Tejun Heo <[email protected]>
2015-08-10	kernel: broadcast-hrtimer: Migrate to new 'set-state' interface	Viresh Kumar	1	-29/+20
	Migrate broadcast-hrtimer driver to the new 'set-state' interface provided by clockevents core, the earlier 'set-mode' interface is marked obsolete now. Cc: Thomas Gleixner <[email protected]> Signed-off-by: Viresh Kumar <[email protected]> Signed-off-by: Daniel Lezcano <[email protected]>
2015-08-09	bpf: Implement function bpf_perf_event_read() that get the selected hardware ↵	Kaixu Xia	2	-15/+64
	PMU conuter According to the perf_event_map_fd and index, the function bpf_perf_event_read() can convert the corresponding map value to the pointer to struct perf_event and return the Hardware PMU counter value. Signed-off-by: Kaixu Xia <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-09	bpf: Add new bpf map type to store the pointer to struct perf_event	Kaixu Xia	1	-0/+57
	Introduce a new bpf map type 'BPF_MAP_TYPE_PERF_EVENT_ARRAY'. This map only stores the pointer to struct perf_event. The user space event FDs from perf_event_open() syscall are converted to the pointer to struct perf_event and stored in map. Signed-off-by: Kaixu Xia <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-09	bpf: Make the bpf_prog_array_map more generic	Wang Nan	3	-33/+51
	All the map backends are of generic nature. In order to avoid adding much special code into the eBPF core, rewrite part of the bpf_prog_array map code and make it more generic. So the new perf_event_array map type can reuse most of code with bpf_prog_array map and add fewer lines of special code. Signed-off-by: Wang Nan <[email protected]> Signed-off-by: Kaixu Xia <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-09	perf: add the necessary core perf APIs when accessing events counters in ↵	Kaixu Xia	1	-0/+78
	eBPF programs
	This patch adds three core perf APIs:
	- perf_event_attrs(): export the struct perf_event_attr from struct perf_event;
	- perf_event_get(): get the struct perf_event from the given fd;
	- perf_event_read_local(): read the events counters active on the current CPU;
	These APIs are needed when accessing events counters in eBPF programs.
	The API perf_event_read_local() comes from Peter, so I added the corresponding SOB.
	Signed-off-by: Kaixu Xia <[email protected]>
	Signed-off-by: Peter Zijlstra <[email protected]>
	Signed-off-by: David S. Miller <[email protected]>
2015-08-09	Merge 4.2-rc6 into char-misc-next	Greg Kroah-Hartman	3	-7/+18
	We want the fixes in Linus's tree in here as well. Signed-off-by: Greg Kroah-Hartman <[email protected]>