blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2010-04-14	rcu: Better explain the condition parameter of rcu_dereference_check()	David Howells	1	-5/+23
	Better explain the condition parameter of rcu_dereference_check() that describes the conditions under which the dereference is permitted to take place (and incorporate Yong Zhang's suggestion). This condition is only checked under lockdep proving. Signed-off-by: David Howells <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-04-14	rcu: Add rcu_access_pointer and rcu_dereference_protected	Paul E. McKenney	1	-0/+32
	This patch adds variants of rcu_dereference() that handle situations where the RCU-protected data structure cannot change, perhaps due to our holding the update-side lock, or where the RCU-protected pointer is only to be fetched, not dereferenced. These are needed due to some performance concerns with using rcu_dereference() where it is not required, aside from the need for lockdep/sparse checking. The new rcu_access_pointer() primitive is for the case where the pointer is be fetch and not dereferenced. This primitive may be used without protection, RCU or otherwise, due to the fact that it uses ACCESS_ONCE(). The new rcu_dereference_protected() primitive is for the case where updates are prevented, for example, due to holding the update-side lock. This primitive does neither ACCESS_ONCE() nor smp_read_barrier_depends(), so can only be used when updates are somehow prevented. Suggested-by: David Howells <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-04-14	ARM: 6052/1: kdump: make kexec work in interrupt context	Mika Westerberg	1	-4/+6
	When crash happens in interrupt context there is no userspace context. We always use current->active_mm in those cases. Signed-off-by: Mika Westerberg <[email protected]> Signed-off-by: Russell King <[email protected]>
2010-04-14	ARM: 6051/1: VFP: preserve the HW context when calling signal handlers	Imre Deak	3	-17/+111
	From: Imre Deak <[email protected]> Signal handlers can use floating point, so prevent them to corrupt the main thread's VFP context. So far there were two signal stack frame formats defined based on the VFP implementation, but the user struct used for ptrace covers all posibilities, so use it for the signal stack too. Introduce also a new user struct for VFP exception registers. In this too fields not relevant to the current VFP architecture are ignored. Support to save / restore the exception registers was added by Will Deacon. Signed-off-by: Imre Deak <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Russell King <[email protected]>
2010-04-14	ARM: 6050/1: VFP: fix the SMP versions of vfp_{sync,flush}_hwstate	Imre Deak	1	-21/+10
	From: Imre Deak <[email protected]> Recently the UP versions of these functions were refactored and as a side effect it became possible to call them for the current thread. This isn't true for the SMP versions however, so fix this up. Signed-off-by: Imre Deak <[email protected]> Signed-off-by: Russell King <[email protected]>
2010-04-14	ARM: 6007/1: fix highmem with VIPT cache and DMA	Nicolas Pitre	6	-20/+122
	The VIVT cache of a highmem page is always flushed before the page is unmapped. This cache flush is explicit through flush_cache_kmaps() in flush_all_zero_pkmaps(), or through __cpuc_flush_dcache_area() in kunmap_atomic(). There is also an implicit flush of those highmem pages that were part of a process that just terminated making those pages free as the whole VIVT cache has to be flushed on every task switch. Hence unmapped highmem pages need no cache maintenance in that case. However unmapped pages may still be cached with a VIPT cache because the cache is tagged with physical addresses. There is no need for a whole cache flush during task switching for that reason, and despite the explicit cache flushes in flush_all_zero_pkmaps() and kunmap_atomic(), some highmem pages that were mapped in user space end up still cached even when they become unmapped. So, we do have to perform cache maintenance on those unmapped highmem pages in the context of DMA when using a VIPT cache. Unfortunately, it is not possible to perform that cache maintenance using physical addresses as all the L1 cache maintenance coprocessor functions accept virtual addresses only. Therefore we have no choice but to set up a temporary virtual mapping for that purpose. And of course the explicit cache flushing when unmapping a highmem page on a system with a VIPT cache now can go, which should increase performance. While at it, because the code in __flush_dcache_page() has to be modified anyway, let's also make sure the mapped highmem pages are pinned with kmap_high_get() for the duration of the cache maintenance operation. Because kunmap() does unmap highmem pages lazily, it was reported by Gary King <[email protected]> that those pages ended up being unmapped during cache maintenance on SMP causing segmentation faults. Signed-off-by: Nicolas Pitre <[email protected]> Signed-off-by: Russell King <[email protected]>
2010-04-14	ARM: 5975/1: AT91 slow-clock suspend: don't wait when turning PLLs off	Anders Larsen	1	-4/+0
	From: Julien Langer <[email protected]> AT91: when turning off the PLLs during suspend, don't wait for the lock flag to be set. Previously the code would always run into the loop limitation of 1000 iterations because the flag is never set when turning the PLLs off. Comments from Anders Larsen: (in http://marc.info/?l=linux-kernel&m=127058929724193&w=2) Signed-off-by: Julien Langer <[email protected]> Signed-off-by: Anders Larsen <[email protected]> Acked-by: Andrew Victor <[email protected]> Signed-off-by: Russell King <[email protected]>
2010-04-13	Revert "Input: wacom - merge out and in prox events"	Dmitry Torokhov	1	-59/+104
	This reverts commit 776943fd6f104a6e8457dc95a17282e69e963666 as it causes issues with ISDv4 E3 touchscreens: https://bugzilla.kernel.org/show_bug.cgi?id=15670 Signed-off-by: Dmitry Torokhov <[email protected]>
2010-04-13	forcedeth: fix tx limit2 flag check	Ayaz Abdulla	1	-1/+1
	This is a fix for bug 572201 @ bugs.debian.org This patch fixes the TX_LIMIT feature flag. The previous logic check for TX_LIMIT2 also took into account a device that only had TX_LIMIT set. Reported-by: Stephen Mulcahu <[email protected]> Reported-by: Ben Huchings <[email protected]> Signed-off-by: Ayaz Abdulla <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-04-13	Merge branch 'pm-fixes' of ↵	Linus Torvalds	1	-1/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6 * 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: PM / Hibernate: user.c, fix SNAPSHOT_SET_SWAP_AREA handling
2010-04-13	Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6	Linus Torvalds	6	-23/+39
	* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: NFSv4: fix delegated locking NFS: Ensure that the WRITE and COMMIT RPC calls are always uninterruptible NFS: Fix a race with the new commit code NFS: Ensure that writeback_single_inode() calls write_inode() when syncing NFS: Fix the mode calculation in nfs_find_open_context NFSv4: Fall back to ordinary lookup if nfs4_atomic_open() returns EISDIR
2010-04-13	ceph: use separate class for ceph sockets' sk_lock	Sage Weil	1	-0/+8
	Use a separate class for ceph sockets to prevent lockdep confusion. Because ceph sockets only get passed kernel pointers, there is no dependency from sk_lock -> mmap_sem. If we share the same class as other sockets, lockdep detects a circular dependency from mmap_sem (page fault) -> fs mutex -> sk_lock -> mmap_sem because dependencies are noted from both ceph and user contexts. Using a separate class prevents the sk_lock(ceph) -> mmap_sem dependency and makes lockdep happy. Signed-off-by: Sage Weil <[email protected]>
2010-04-13	ceph: reserve one more caps space when doing readdir	Yehuda Sadeh	1	-1/+1
	We were missing space for the directory cap. The result was a BUG at fs/ceph/caps.c:2178. Signed-off-by: Yehuda Sadeh <[email protected]> Signed-off-by: Sage Weil <[email protected]>
2010-04-13	ceph: queue_cap_snap should always queue dirty context	Sage Weil	2	-11/+8
	This simplifies the calling convention, and fixes a bug where we queue a capsnap with a context other than i_head_snapc (the one that matches the dirty pages). The result was a BUG at fs/ceph/caps.c:2178 on writeback completion when a capsnap matching the writeback snapc could not be found. Signed-off-by: Sage Weil <[email protected]>
2010-04-13	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6	Linus Torvalds	17	-135/+206
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc64: Add some more commentary to __raw_local_irq_save() sparc64: Fix memory leak in pci_register_iommu_region(). sparc64: Add kmemleak annotation to sun4v_build_virq() sparc64: Support kmemleak. sparc64: Add function graph tracer support. sparc64: Give a stack frame to the ftrace call sites. sparc64: Use a seperate counter for timer interrupts and NMI checks, like x86. sparc64: Remove profiling from some low-level bits. sparc64: Kill unnecessary static on local var in ftrace_call_replace(). sparc64: Kill CONFIG_STACK_DEBUG code. sparc64: Add HAVE_FUNCTION_TRACE_MCOUNT_TEST and tidy up. sparc64: Adjust __raw_local_irq_save() to cooperate in NMIs. sparc64: Use kstack_valid() in die_if_kernel().
2010-04-13	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6	Linus Torvalds	30	-138/+348
	* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (25 commits) smc91c92_cs: define multicast_table as unsigned char can: avoids a false warning e1000e: stop cleaning when we reach tx_ring->next_to_use igb: restrict WoL for 82576 ET2 Quad Port Server Adapter virtio_net: missing sg_init_table Revert "tcp: Set CHECKSUM_UNNECESSARY in tcp_init_nondata_skb" iwlwifi: need check for valid qos packet before free tcp: Set CHECKSUM_UNNECESSARY in tcp_init_nondata_skb udp: fix for unicast RX path optimization myri10ge: fix rx_pause in myri10ge_set_pauseparam net: corrected documentation for hardware time stamping stmmac: use resource_size() x.25 attempts to negotiate invalid throughput x25: Patch to fix bug 15678 - x25 accesses fields beyond end of packet. bridge: Fix IGMP3 report parsing cnic: Fix crash during bnx2x MTU change. qlcnic: fix set mac addr r6040: fix r6040_multicast_list vhost-net: fix vq_memory_access_ok error checking ath9k: fix double calls to ath_radio_enable ...
2010-04-13	Merge branch 'iommu/fixes' of ↵	Ingo Molnar	8	-34/+68
	git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent
2010-04-13	smc91c92_cs: define multicast_table as unsigned char	Ken Kawasaki	1	-7/+6
	smc91c92_cs: * define multicast_table as unsigned char * remove unnecessary "#ifndef final_version" Signed-off-by: Ken Kawasaki <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-04-13	can: avoids a false warning	Eric Dumazet	1	-1/+1
	At this point optlen == sizeof(sfilter) but some compilers are dumb. Reported-by: Németh Márton <[email protected] Signed-off-by: Eric Dumazet <[email protected]> Acked-by: Oliver Hartkopp <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-04-13	e1000e: stop cleaning when we reach tx_ring->next_to_use	Terry Loftin	1	-0/+2
	Tx ring buffers after tx_ring->next_to_use are volatile and could change, possibly causing a crash. Stop cleaning when we hit tx_ring->next_to_use. Signed-off-by: Terry Loftin <[email protected]> Acked-by: Bruce Allan <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-04-13	igb: restrict WoL for 82576 ET2 Quad Port Server Adapter	Stefan Assmann	2	-0/+2
	Restrict Wake-on-LAN to first port on 82576 ET2 quad port NICs, as it is only supported there. Signed-off-by: Stefan Assmann <[email protected]> Acked-by: Alexander Duyck <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-04-13	sparc64: Add some more commentary to __raw_local_irq_save()	David S. Miller	1	-0/+7
	Suggested by Peter Zijlstra Signed-off-by: David S. Miller <[email protected]>
2010-04-13	ALSA: aaci - Fix alignment faults on ARM Cortex introduced by commit 29a4f2d3	Philby John	1	-2/+5
	The commit 29a4f2d3 used writel() at offset 0x26 which is half-word aligned causing unaligned exceptions on a Cortex-A8. The original patch solved the "aaci-pl041 fpga:04: ac97 read back fail" issue on a soft reset. Reading from any arbitrary aaci register seems to solve this issue. Signed-off-by: Philby John <[email protected]> Acked-by: Russell King <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2010-04-13	Merge branch 'master' of /home/davem/src/GIT/linux-2.6/	David S. Miller	4482	-3286/+22501
	Conflicts: lib/Kconfig.debug
2010-04-12	sparc64: Fix memory leak in pci_register_iommu_region().	David S. Miller	1	-3/+8
	Found by kmemleak. If request_resource() fails, we leak the struct resource we allocated to represent the IOMMU mapping area. This actually happens on sun4v machines because the IOMEM area is only reported sans the IOMMU region, unlike all previous systems. I'll need to fix that at some point, but for now fix the leak. Signed-off-by: David S. Miller <[email protected]>
2010-04-12	sparc64: Add kmemleak annotation to sun4v_build_virq()	David S. Miller	1	-0/+8
	The only reference we store to this memory is in the form of a physical address, so kmemleak can't see it. Add a kmemleak_not_leak() annotation. It's probably useful to be able to look at a dump of these things either via debugfs or similar, and thus we could at some point store them in some kind of table and therefore get rid of this annotation. Signed-off-by: David S. Miller <[email protected]>
2010-04-12	sparc64: Support kmemleak.	David S. Miller	2	-1/+5
	Only missing thing was an _sdata marker in vmlinux.lds.S Signed-off-by: David S. Miller <[email protected]>
2010-04-12	sparc64: Add function graph tracer support.	David S. Miller	10	-15/+132
	Signed-off-by: David S. Miller <[email protected]>
2010-04-12	sparc64: Give a stack frame to the ftrace call sites.	David S. Miller	1	-15/+16
	It's the only way we'll be able to implement the function graph tracer properly. A positive is that we no longer have to worry about the linker over-optimizing the tail call, since we don't use a tail call any more. Signed-off-by: David S. Miller <[email protected]>
2010-04-12	sparc64: Use a seperate counter for timer interrupts and NMI checks, like x86.	David S. Miller	3	-3/+3
	This keeps us from having to use kstat_irqs_cpu() from the NMI handler, the former of which is a profiled function. Instead we use a currently empty slot in the cpu_data Signed-off-by: David S. Miller <[email protected]>
2010-04-12	sparc64: Remove profiling from some low-level bits.	David S. Miller	1	-1/+8
	These include the timer implementation, perf events support, and the performance counter register (pcr) programming layer. Signed-off-by: David S. Miller <[email protected]>
2010-04-12	sparc64: Kill unnecessary static on local var in ftrace_call_replace().	David S. Miller	1	-1/+1
	Signed-off-by: David S. Miller <[email protected]>
2010-04-12	sparc64: Kill CONFIG_STACK_DEBUG code.	David S. Miller	2	-78/+1
	The generic stack tracer does this job just as well. Signed-off-by: David S. Miller <[email protected]>
2010-04-12	sparc64: Add HAVE_FUNCTION_TRACE_MCOUNT_TEST and tidy up.	David S. Miller	2	-7/+16
	Check function_trace_stop at ftrace_caller Toss mcount_call and dummy call of ftrace_stub, unnecessary. Document problems we'll have if the final kernel image link ever turns on relaxation. Properly size 'ftrace_call' so it looks right when inspecting instructions under gdb et al. Signed-off-by: David S. Miller <[email protected]>
2010-04-12	sparc64: Adjust __raw_local_irq_save() to cooperate in NMIs.	David S. Miller	1	-2/+12
	If we are in an NMI then doing a plain raw_local_irq_disable() will write PIL_NORMAL_MAX into %pil, which is lower than PIL_NMI, and thus we'll re-enable NMIs and recurse. Doing a simple: %pil = %pil \| PIL_NORMAL_MAX does what we want, if we're already at PIL_NMI (15) we leave it at that setting, else we set it to PIL_NORMAL_MAX (14). This should get the function tracer working on sparc64. Signed-off-by: David S. Miller <[email protected]>
2010-04-12	sparc64: Use kstack_valid() in die_if_kernel().	David S. Miller	1	-23/+3
	This gets rid of a local function (is_kernel_stack()) which tries to do the same thing, yet poorly in that it doesn't handle IRQ stacks properly. Signed-off-by: David S. Miller <[email protected]>
2010-04-12	virtio_net: missing sg_init_table	Shirley Ma	1	-0/+2
	Add missing sg_init_table for sg_set_buf in virtio_net which induced in defer skb patch. Reported-by: Thomas Müller <[email protected]> Tested-by: Thomas Müller <[email protected]> Signed-off-by: Shirley Ma <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2010-04-12	Linux 2.6.34-rc4	Linus Torvalds	1	-1/+1

2010-04-12	Merge branch 'anonvma'	Linus Torvalds	2	-43/+84
	* anonvma: anonvma: when setting up page->mapping, we need to pick the _oldest_ anonvma anon_vma: clone the anon_vma chain in the right order vma_adjust: fix the copying of anon_vma chains Simplify and comment on anon_vma re-use for anon_vma_prepare()
2010-04-12	Merge master.kernel.org:/home/rmk/linux-2.6-arm	Linus Torvalds	20	-77/+422
	* master.kernel.org:/home/rmk/linux-2.6-arm: (21 commits) ARM: Fix ioremap_cached()/ioremap_wc() for SMP platforms ARM: 6043/1: AT91 slow-clock resume: Don't wait for a disabled PLL to lock ARM: 6031/1: fix Thumb-2 decompressor ARM: 6029/1: ep93xx: gpio.c: local functions should be static ARM: 6028/1: ARM: add MAINTAINERS for U300 ARM: 6024/1: bcmring: fix missing down on semaphore in dma.c MXC: mach_armadillo5x0: Add USB Host support. ARM mach-mx3: duplicated include ARM mach-mx3: duplicated include imx31: add watchdog device on litekit board. imx3: Add watchdog platform device support MXC: mach-mx31_3ds: add support for freescale mc13783 power management device. MXC: mach-mx31_3ds: Add SPI1 device support. MXC: mach-mx31_3ds: Add support for on board NAND Flash. MXC: mach-mx31_3ds: Update variable names over recent mach name modification. imx31: fix parent clock for rtc i.MX51: remove NFC AXI static mapping i.MX51: determine silicon revision dynamically i.MX51: map TZIC dynamically i.MX51: Use correct clock for gpt ...
2010-04-12	Merge branch 'master' of ↵	Linus Torvalds	2	-5/+21
	git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-unstable: Btrfs: make sure the chunk allocator doesn't create zero length chunks Btrfs: fix data enospc check overflow
2010-04-12	Merge branch 'for_linus' of ↵	Linus Torvalds	3	-6/+16
	git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: quota: Fix possible dq_flags corruption quota: Hide warnings about writes to the filesystem before quota was turned on ext3: symlink must be handled via filesystem specific operation ext2: symlink must be handled via filesystem specific operation
2010-04-12	Merge branch 'for_linus' of ↵	Linus Torvalds	5	-10/+16
	git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6: udf: add speciffic ->setattr callback udf: potential integer overflow
2010-04-12	Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus	Linus Torvalds	44	-614/+1046
	* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: (36 commits) MIPS: Calculate proper ebase value for 64-bit kernels MIPS: Alchemy: DB1200: Remove custom wait implementation MIPS: Big Sur: Make defconfig more useful. MIPS: Fix __vmalloc() etc. on MIPS for non-GPL modules MIPS: Sibyte: Fix M3 TLB exception handler workaround. MIPS: BCM63xx: Fix build failure in board_bcm963xx.c MIPS: uasm: Add OR instruction. MIPS: Sibyte: Apply M3 workaround only on affected chip types and versions. MIPS: BCM63xx: Initialize gpio_out_low & out_high to current value at boot. MIPS: BCM63xx: Register SSB SPROM fallback in board's first stage callback MIPS: BCM63xx: Fix typo in cpu-feature-overrides file. MIPS: BCM63xx: Add support for second uart. MIPS: BCM63xx: Fix double gpio registration. MIPS: BCM63xx: Add DWVS0 board MIPS: BCM63xx: Add the RTA1025W-16 BCM6348-based board to suppported boards. MIPS: BCM63xx: Fix BCM6338 and BCM6345 gpio count MIPS: libgcc.h: Checkpatch cleanup MIPS: Loongson-2F: Flush the branch target history in BTB and RAS MIPS: Move signal trampolines off of the stack. MIPS: Preliminary VDSO ...
2010-04-12	Merge branch 'for-2.6.34' of git://linux-nfs.org/~bfields/linux	Linus Torvalds	1	-1/+4
	* 'for-2.6.34' of git://linux-nfs.org/~bfields/linux: svcrdma: RDMA support not yet compatible with RPC6
2010-04-12	Merge branch 'for-linus' of ↵	Linus Torvalds	3	-3/+3
	git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2: nilfs2: fix typo "numer" -> "number" in alloc.c nilfs2: Remove an uninitialization warning in nilfs_btree_propagate_v() nilfs2: fix a wrong type conversion in nilfs_ioctl()
2010-04-12	anonvma: when setting up page->mapping, we need to pick the _oldest_ anonvma	Linus Torvalds	1	-2/+13
	Otherwise we might be mapping in a page in a new mapping, but that page (through the swapcache) would later be mapped into an old mapping too. The page->mapping must be the case that works for everybody, not just the mapping that happened to page it in first. Here's the scenario: - page gets allocated/mapped by process A. Let's call the anon_vma we associate the page with 'A' to keep it easy to track. - Process A forks, creating process B. The anon_vma in B is 'B', and has a chain that looks like 'B' -> 'A'. Everything is fine. - Swapping happens. The page (with mapping pointing to 'A') gets swapped out (perhaps not to disk - it's enough to assume that it's just not mapped any more, and lives entirely in the swap-cache) - Process B pages it in, which goes like this: do_swap_page -> page = lookup_swap_cache(entry); ... set_pte_at(mm, address, page_table, pte); page_add_anon_rmap(page, vma, address); And think about what happens here! In particular, what happens is that this will now be the "first" mapping of that page, so page_add_anon_rmap() used to do if (first) __page_set_anon_rmap(page, vma, address); and notice what anon_vma it will use? It will use the anon_vma for process B! What happens then? Trivial: process 'A' also pages it in (nothing happens, it's not the first mapping), and then process 'B' execve's or exits or unmaps, making anon_vma B go away. End result: process A has a page that points to anon_vma B, but anon_vma B does not exist any more. This can go on forever. Forget about RCU grace periods, forget about locking, forget anything like that. The bug is simply that page->mapping points to an anon_vma that was correct at one point, but was _not_ the one that was shared by all users of that possible mapping. Changing it to always use the deepest anon_vma in the anonvma chain gets us to the safest model. This can be improved in certain cases: if we know the page is private to just this particular mapping (for example, it's a new page, or it is the only swapcache entry), we could pick the top (most specific) anon_vma. But that's a future optimization. Make it _work_ reliably first. Reviewed-by: Rik van Riel <[email protected]> Acked-by: Johannes Weiner <[email protected]> Tested-by: Borislav Petkov <[email protected]> [ "What do you know, I think you fixed it!" ] Signed-off-by: Linus Torvalds <[email protected]>
2010-04-12	anon_vma: clone the anon_vma chain in the right order	Linus Torvalds	1	-1/+1
	We want to walk the chain in reverse order when cloning it, so that the order of the result chain will be the same as the order in the source chain. When we add entries to the chain, they go at the head of the chain, so we want to add the source head last. Reviewed-by: Rik van Riel <[email protected]> Acked-by: Johannes Weiner <[email protected]> Tested-by: Borislav Petkov <[email protected]> [ "No, it still oopses" ] Signed-off-by: Linus Torvalds <[email protected]>
2010-04-12	vma_adjust: fix the copying of anon_vma chains	Linus Torvalds	1	-16/+8
	When we move the boundaries between two vma's due to things like mprotect, we need to make sure that the anon_vma of the pages that got moved from one vma to another gets properly copied around. And that was not always the case, in this rather hard-to-follow code sequence. Clarify the code, and fix it so that it copies the anon_vma from the right source. Reviewed-by: Rik van Riel <[email protected]> Acked-by: Johannes Weiner <[email protected]> Tested-by: Borislav Petkov <[email protected]> [ "Yeah, not so much this one either" ] Signed-off-by: Linus Torvalds <[email protected]>
2010-04-12	Simplify and comment on anon_vma re-use for anon_vma_prepare()	Linus Torvalds	1	-24/+62
	This changes the anon_vma reuse case to require that we only reuse simple anon_vma's - ie the case when the vma only has a single anon_vma associated with it. This means that a reuse of an anon_vma from an adjacent vma will always guarantee that both vma's are associated not only with the same anon_vma, they will also have the same anon_vma chain (of just a single entry in this case). And since anon_vma re-use was the only case where the same anon_vma might be associated with different chains of anon_vma's, we now have the case that every vma that shares the same anon_vma will always also have the same chain. That makes it much easier to think about merging vma's that share the same anon_vma's: you can always just drop the other anon_vma chain in anon_vma_merge() since you know that they are always identical. This also splits up the function to validate the anon_vma re-use, and adds a lot of commentary about the possible races. Reviewed-by: Rik van Riel <[email protected]> Acked-by: Johannes Weiner <[email protected]> Tested-by: Borislav Petkov <[email protected]> [ "That didn't fix it" ] Signed-off-by: Linus Torvalds <[email protected]>