aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2011-05-13bridge: fix forwarding of IPv6Stephen Hemminger1-1/+1
The commit 6b1e960fdbd75dcd9bcc3ba5ff8898ff1ad30b6e bridge: Reset IPCB when entering IP stack on NF_FORWARD broke forwarding of IPV6 packets in bridge because it would call bp_parse_ip_options with an IPV6 packet. Reported-by: Noah Meyerhans <[email protected]> Signed-off-by: Stephen Hemminger <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-05-13drm/i915: Revert i915.semaphore=1 default from i915 mergeAndy Lutomirski1-1/+1
My Q67 / i7-2600 box has rev09 Sandy Bridge graphics. It hangs instantly when GNOME loads and it hangs so hard the reset button doesn't work. Setting i915.semaphore=0 fixes it. Semaphores were disabled in a1656b9090f7 ("drm/i915: Disable GPU semaphores by default") in 2.6.38 but were then re-enabled (by mistake?) by the merge 47ae63e0c2e5 ("Merge branch 'drm-intel-fixes' into drm-intel-next"). (It's worth noting that the offending change is i915_drv.c, which was not marked as a conflict - although a 'git show --cc' on the merge does show that neither parent had it set to 1) Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-13bonding,llc: Fix structure sizeof incompatibility for some PDUsVitalii Demianets2-9/+9
With some combinations of arch/compiler (e.g. arm-linux-gcc) the sizeof operator on structure returns value greater than expected. In cases when the structure is used for mapping PDU fields it may lead to unexpected results (such as holes and alignment problems in skb data). __packed prevents this undesired behavior. Signed-off-by: Vitalii Demianets <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-05-13vfs: micro-optimize acl_permission_check()Linus Torvalds1-1/+1
It's a hot function, and we're better off not mixing types in the mask calculations. The compiler just ends up mixing 16-bit and 32-bit operations, for no good reason. So do everything in 'unsigned int' rather than mixing 'unsigned int' masking with a 'umode_t' (16-bit) mode variable. This, together with the parent commit (47a150edc2ae: "Cache user_ns in struct cred") makes acl_permission_check() much nicer. Signed-off-by: Linus Torvalds <[email protected]>
2011-05-13Cache user_ns in struct credSerge E. Hallyn4-20/+27
If !CONFIG_USERNS, have current_user_ns() defined to (&init_user_ns). Get rid of _current_user_ns. This requires nsown_capable() to be defined in capability.c rather than as static inline in capability.h, so do that. Request_key needs init_user_ns defined at current_user_ns if !CONFIG_USERNS, so forward-declare that in cred.h if !CONFIG_USERNS at current_user_ns() define. Compile-tested with and without CONFIG_USERNS. Signed-off-by: Serge E. Hallyn <[email protected]> [ This makes a huge performance difference for acl_permission_check(), up to 30%. And that is one of the hottest kernel functions for loads that are pathname-lookup heavy. ] Signed-off-by: Linus Torvalds <[email protected]>
2011-05-13ocfs2/dlm: Target node death during resource migration leads to thread spinSunil Mushran1-0/+3
During resource migration, if the target node were to die, the thread doing the migration spins until the target node is not removed from the domain map. This patch slows the spin by making the thread wait for the recovery to kick in. Signed-off-by: Sunil Mushran <[email protected]> Signed-off-by: Joel Becker <[email protected]>
2011-05-13ocfs2: Skip mount recovery for hard-ro mountsSunil Mushran1-0/+3
Patch skips mount recovery for hard-ro mounts which otherwise leads to an oops. Signed-off-by: Sunil Mushran <[email protected]> Acked-by: Mark Fasheh <[email protected]> Signed-off-by: Joel Becker <[email protected]>
2011-05-13ocfs2/cluster: Heartbeat mismatch message improvedSunil Mushran1-17/+31
If o2hb finds unexpected values in the heartbeat slot, it prints a message "ERROR: Device "dm-6": another node is heartbeating in our slot!" This message could be misleading. This patch adds two more messages to help users better diagnose the problem. Signed-off-by: Sunil Mushran <[email protected]> Acked-by: Mark Fasheh <[email protected]> Signed-off-by: Joel Becker <[email protected]>
2011-05-13ocfs2/cluster: Increase the live threshold for global heartbeatSunil Mushran1-1/+12
We have seen isolated cases (very few, I might add) of o2hb not detecting all live nodes on startup. One plausible reasoning for it is that other node had a hb io delay at the same time. The live threshold set at 2 (as low as it can be) could be increased to ameliorate the situation. But increasing the threshold directly affects mount time. Currently it takes around 5 secs to mount a volume in o2cb cluster with local heartbeat. Increasing the threshold will make mounts even slower. As the issue itself is rare, we have left things as they are for the local heartbeat mode. However we can improve the situation for global heartbeat mode as in that mode, we start the heartbeat much before the mount. This patch doubles the live threshold for the start of the first region in global heartbeat mode. Addresses internal Oracle bug#10635585. Signed-off-by: Sunil Mushran <[email protected]> Acked-by: Mark Fasheh <[email protected]> Signed-off-by: Joel Becker <[email protected]>
2011-05-13ocfs2/dlm: Use negotiated o2dlm protocol versionSunil Mushran1-1/+2
Patch fixes a bug in the o2dlm protocol negotiation in that it is using the builtin version rather than the negotiated version during the domain join. This causes join errors when a node having kernel >= 2.6.37 joins a cluster with nodes having kernels < 2.6.37. This only affects the o2cb cluster stack. Signed-off-by: Sunil Mushran <[email protected]> Reported-by: Jacek Stepniewski <[email protected]> Acked-by: Mark Fasheh <[email protected]> Signed-off-by: Joel Becker <[email protected]>
2011-05-13ocfs2: skip existing hole when removing the last extent_rec in punching-hole ↵Tristan Ye1-0/+12
codes. In the case of removing a partial extent record which covers a hole, current punching-hole logic will try to remove more than the length of whole extent record, which leads to the failure of following assert(fs/ocfs2/alloc.c): 5507 BUG_ON(cpos < le32_to_cpu(rec->e_cpos) || trunc_range > rec_range); This patch tries to skip existing hole at the last attempt of removing a partial extent record, what's more, it also adds some necessary comments for better understanding of punching-hole codes. Signed-off-by: Tristan Ye <[email protected]> Signed-off-by: Joel Becker <[email protected]>
2011-05-13ocfs2: Initialize data_ac (might be used uninitialized)Marcus Meissner1-1/+1
CLANG found that there is a path that has data_ac uninitialized, this place 2917 /* This gets us the dx_root */ 2918 ret = ocfs2_reserve_new_metadata_blocks(osb, 1, &meta_ac); 2919 if (ret) { 3 Taking true branch 2920 mlog_errno(ret); 2921 goto out; 4 Control jumps to line 3168 2922 } Goes to the out: label without data_ac being initialized. Ciao, Marcus Signed-Off-By: Marcus Meissner <[email protected]> Signed-off-by: Mark Fasheh <[email protected]> Signed-off-by: Joel Becker <[email protected]>
2011-05-13x86, mce, AMD: Fix leaving freed data in a listJulia Lawall1-0/+1
b may be added to a list, but is not removed before being freed in the case of an error. This is done in the corresponding deallocation function, so the code here has been changed to follow that. The sematic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression E,E1,E2; identifier l; @@ *list_add(&E->l,E1); ... when != E1 when != list_del(&E->l) when != list_del_init(&E->l) when != E = E2 *kfree(E);// </smpl> Signed-off-by: Julia Lawall <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Robert Richter <[email protected]> Cc: Yinghai Lu <[email protected]> Cc: Andreas Herrmann <[email protected]> Cc: <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2011-05-13OMAP3: set the core dpll clk rate in its set_rate functionAvinash H.M1-0/+1
The debug l3_ick/rate is not displaying the actual rate of the clock in hardware. This is because, the core dpll set_rate function doesn't update the clk.rate. After fixing, the l3_ick/rate is displaying proper values. Signed-off-by: Shweta Gulati <[email protected]> Signed-off-by: Avinash.H.M <[email protected]> Cc: Rajendra Nayak <[email protected]> Cc: Paul Wamsley <[email protected]> Acked-by: Paul Walmsley <[email protected]> Signed-off-by: Tony Lindgren <[email protected]>
2011-05-13btrfs: quasi-round-robin for chunk allocationArne Jansen2-305/+177
In a multi device setup, the chunk allocator currently always allocates chunks on the devices in the same order. This leads to a very uneven distribution, especially with RAID1 or RAID10 and an uneven number of devices. This patch always sorts the devices before allocating, and allocates the stripes on the devices with the most available space, as long as there is enough space available. In a low space situation, it first tries to maximize striping. The patch also simplifies the allocator and reduces the checks for corner cases. The simplification is done by several means. First, it defines the properties of each RAID type upfront. These properties are used afterwards instead of differentiating cases in several places. Second, the old allocator defined a minimum stripe size for each block group type, tried to find a large enough chunk, and if this fails just allocates a smaller one. This is now done in one step. The largest possible chunk (up to max_chunk_size) is searched and allocated. Because we now have only one pass, the allocation of the map (struct map_lookup) is moved down to the point where the number of stripes is already known. This way we avoid reallocation of the map. We still avoid allocating stripes that are not a multiple of STRIPE_SIZE.
2011-05-13btrfs: heed alloc_startArne Jansen1-4/+1
currently alloc_start is disregarded if the requested chunk size is bigger than (device size - alloc_start), but smaller than the device size. The only situation where I see this could have made sense was when a chunk equal the size of the device has been requested. This was possible as the allocator failed to take alloc_start into account when calculating the request chunk size. As this gets fixed by this patch, the workaround is not necessary anymore.
2011-05-13btrfs: move btrfs_cmp_device_free_bytes to super.cArne Jansen3-28/+26
this function won't be used here anymore, so move it super.c where it is used for df-calculation
2011-05-13drm/radeon/kms: add some evergreen/ni safe regsAlex Deucher2-0/+2
need to programmed from the userspace drivers. Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
2011-05-13drm/radeon/kms: fix extended lvds info parsingAlex Deucher1-3/+11
On rev <= 1.1 tables, the offset is absolute, on newer tables, it's relative. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=700326 Signed-off-by: Alex Deucher <[email protected]> Reviewed-by: Jerome Glisse <[email protected]> Cc: [email protected] Signed-off-by: Dave Airlie <[email protected]>
2011-05-13drm/radeon/kms: fix tiling reg on fusionAlex Deucher2-1/+5
The location of MC_ARB_RAMCFG changed on fusion. I've diffed all the other regs in evergreend.h and this is the only other reg that changed. Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
2011-05-12rbd: fix leak of ops structSage Weil1-1/+5
The ops vector must be freed by the rbd_do_request caller. Signed-off-by: Sage Weil <[email protected]>
2011-05-12Merge branch 'for-linus' of ↵Linus Torvalds1-4/+0
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: SELinux: delete debugging printks from filename_trans rule processing
2011-05-12Merge branch 'for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: net/9p/protocol.c: Fix a memory leak
2011-05-12Merge branch 'for-2639-rc7/i2c-fixes' of git://git.fluff.org/bjdooks/linuxLinus Torvalds1-1/+1
* 'for-2639-rc7/i2c-fixes' of git://git.fluff.org/bjdooks/linux: i2c: pnx: Fix crash due to wrong init of timer->data
2011-05-13Merge branch 'for-linus' of git://git.infradead.org/users/eparis/selinux ↵James Morris1-4/+0
into for-linus
2011-05-13i2c: pnx: Fix crash due to wrong init of timer->dataWolfram Sang1-1/+1
alg_data is already a pointer which must be passed directly. Reported-by: Dieter Ripp <[email protected]> Signed-off-by: Wolfram Sang <[email protected]> Cc: Russell King <[email protected]> Cc: Ben Dooks <[email protected]> Signed-off-by: Ben Dooks <[email protected]>
2011-05-12ipv6: restore correct ECN handling on TCP xmitSteinar H. Gunderson1-3/+13
Since commit e9df2e8fd8fbc9 (Use appropriate sock tclass setting for routing lookup) we lost ability to properly add ECN codemarks to ipv6 TCP frames. It seems like TCP_ECN_send() calls INET_ECN_xmit(), which only sets the ECN bit in the IPv4 ToS field (inet_sk(sk)->tos), but after the patch, what's checked is inet6_sk(sk)->tclass, which is a completely different field. Close bug https://bugzilla.kernel.org/show_bug.cgi?id=34322 [Eric Dumazet] : added the INET_ECN_dontxmit() fix and replace macros by inline functions for clarity. Signed-off-by: Steinar H. Gunderson <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Cc: YOSHIFUJI Hideaki <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2011-05-12vsprintf: Turn kptr_restrict off by defaultIngo Molnar1-1/+1
kptr_restrict has been triggering bugs in apps such as perf, and it also makes the system less useful by default, so turn it off by default. This is how we generally handle security features that remove functionality, such as firewall code or SELinux - they have to be configured and activated from user-space. Distributions can turn kptr_restrict on again via this line in /etc/sysctrl.conf: kernel.kptr_restrict = 1 ( Also mark the variable __read_mostly while at it, as it's typically modified only once per bootup, or not at all. ) Signed-off-by: Ingo Molnar <[email protected]> Acked-by: David S. Miller <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-12net/9p/protocol.c: Fix a memory leakPedro Scarapicchia Junior1-0/+1
When p9pdu_readf() is called with "s" attribute, it allocates a pointer that will store a string. In p9dirent_read(), this pointer is not being released, leading to out of memory errors. This patch releases this pointer after string is copyed to dirent->d_name. Signed-off-by: Pedro Scarapicchia Junior <[email protected]> Signed-off-by: Eric Van Hensbergen <[email protected]>
2011-05-12x86: Fix UV BAU for non-consecutive nasidsCliff Wickman2-33/+76
This is a fix for the SGI Altix-UV Broadcast Assist Unit code, which is used for TLB flushing. Certain hardware configurations (that customers are ordering) cause nasids (numa address space id's) to be non-consecutive. Specifically, once you have more than 4 blades in a IRU (Individual Rack Unit - or 1/2 rack) but less than the maximum of 16, the nasid numbering becomes non-consecutive. This currently results in a 'catastrophic error' (CATERR) detected by the firmware during OS boot. The BAU is generating an 'INTD' request that is targeting a non-existent nasid value. Such configurations may also occur when a blade is configured off because of hardware errors. (There is one UV hub per blade.) This patch is required to support such configurations. The problem with the tlb_uv.c code is that is using the consecutive hub numbers as indices to the BAU distribution bit map. These are simply the ordinal position of the hub or blade within its partition. It should be using physical node numbers (pnodes), which correspond to the physical nasid values. Use of the hub number only works as long as the nasids in the partition are consecutive and increase with a stride of 1. This patch changes the index to be the pnode number, thus allowing nasids to be non-consecutive. It also provides a table in local memory for each cpu to translate target cpu number to target pnode and nasid. And it improves naming to properly reflect 'node' and 'uvhub' versus 'nasid'. Signed-off-by: Cliff Wickman <[email protected]> Cc: <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2011-05-12ne-h8300: Fix regression caused during net_device_ops conversionGeert Uytterhoeven2-9/+9
Changeset dcd39c90290297f6e6ed8a04bb20da7ac2b043c5 ("ne-h8300: convert to net_device_ops") broke ne-h8300 by adding 8390.o to the link. That meant that lib8390.c was included twice, once in ne-h8300.c and once in 8390.c, subject to different macros. This patch reverts that by avoiding the wrappers in 8390.c. Fix based on commits 217cbfa856dc1cbc2890781626c4032d9e3ec59f ("mac8390: fix regression caused during net_device_ops conversion") and 4e0168fa4842e27795a75b205a510f25b62181d9 ("mac8390: fix build with NET_POLL_CONTROLLER"). Signed-off-by: Geert Uytterhoeven <[email protected]> Cc: [email protected] Signed-off-by: David S. Miller <[email protected]>
2011-05-12hydra: Fix regression caused during net_device_ops conversionGeert Uytterhoeven2-8/+8
Changeset 5618f0d1193d6b051da9b59b0e32ad24397f06a4 ("hydra: convert to net_device_ops") broke hydra by adding 8390.o to the link. That meant that lib8390.c was included twice, once in hydra.c and once in 8390.c, subject to different macros. This patch reverts that by avoiding the wrappers in 8390.c. Fix based on commits 217cbfa856dc1cbc2890781626c4032d9e3ec59f ("mac8390: fix regression caused during net_device_ops conversion") and 4e0168fa4842e27795a75b205a510f25b62181d9 ("mac8390: fix build with NET_POLL_CONTROLLER"). Signed-off-by: Geert Uytterhoeven <[email protected]> Cc: [email protected] Signed-off-by: David S. Miller <[email protected]>
2011-05-12zorro8390: Fix regression caused during net_device_ops conversionGeert Uytterhoeven2-7/+7
Changeset b6114794a1c394534659f4a17420e48cf23aa922 ("zorro8390: convert to net_device_ops") broke zorro8390 by adding 8390.o to the link. That meant that lib8390.c was included twice, once in zorro8390.c and once in 8390.c, subject to different macros. This patch reverts that by avoiding the wrappers in 8390.c. Fix based on commits 217cbfa856dc1cbc2890781626c4032d9e3ec59f ("mac8390: fix regression caused during net_device_ops conversion") and 4e0168fa4842e27795a75b205a510f25b62181d9 ("mac8390: fix build with NET_POLL_CONTROLLER"). Reported-by: Christian T. Steigies <[email protected]> Suggested-by: Finn Thain <[email protected]> Signed-off-by: Geert Uytterhoeven <[email protected]> Tested-by: Christian T. Steigies <[email protected]> Cc: [email protected] Signed-off-by: David S. Miller <[email protected]>
2011-05-12Merge branch 'sfc-2.6.39' of ↵David S. Miller3-23/+53
git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-2.6
2011-05-12SELinux: delete debugging printks from filename_trans rule processingEric Paris1-4/+0
The filename_trans rule processing has some printk(KERN_ERR ) messages which were intended as debug aids in creating the code but weren't removed before it was submitted. Remove them. Reported-by: Paul Bolle <[email protected]> Signed-off-by: Eric Paris <[email protected]>
2011-05-12Merge branch 'fix/asoc' of ↵Linus Torvalds6-9/+15
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 * 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: ASoC: WM8903: Fix Digital Capture Volume range ASoC: UDA134x: Remove POWER_OFF_ON_STANDBY define. ASoC: SSM2602: Fix reg_cache_size ASoC: SSM2602: Fix 'Mic Boost2' control ASoC: SSM2602: Properly annotate i2c probe and remove functions ASoC: sst_platform: add hw_free callback to fix resource leak ASoC: Don't crash on PM operations ASoC: JZ4740: Fix i2s shutdown
2011-05-12Merge branch 'stable/bug-fixes-for-rc7' of ↵Linus Torvalds5-125/+54
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/bug-fixes-for-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: x86/mm: Fix section mismatch derived from native_pagetable_reserve() x86,xen: introduce x86_init.mapping.pagetable_reserve Revert "xen/mmu: Add workaround "x86-64, mm: Put early page table high""
2011-05-12Revert "drm/i915: Only enable the plane after setting the fb base (pre-ILK)"Linus Torvalds1-0/+2
This reverts commit 49183b2818de6899383bb82bc032f9344d6791ff. Quoth Franz Melchior: "This patch introduces a bug on my infamous "Acer Travelmate 5735Z-452G32Mnss": when KMS takes over, the frame buffer contents get completely garbled up on screen, with colored stripes and unreadable text (photo on request). Only when X11 is started, the screen gets restored again. Closing and re-opening the lid partly cures the mess, too: it makes the font readable, though horizontally stretched." Acked-by: Keith Packard <[email protected]> Cc: Chris Wilson <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Jesse Barnes <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-12Merge branch 'fbmem'Linus Torvalds2-26/+81
* fbmem: fbmem: make read/write/ioctl use the frame buffer at open time fbcon: add lifetime refcount to opened frame buffers
2011-05-12Merge branch 'for-linus' of ↵Linus Torvalds1-3/+10
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: ads7846 - remove unused variable from struct ads7845_ser_req Input: ads7846 - make transfer buffers DMA safe
2011-05-12x86/mm: Fix section mismatch derived from native_pagetable_reserve()Sedat Dilek1-1/+1
With CONFIG_DEBUG_SECTION_MISMATCH=y I see these warnings in next-20110415: LD vmlinux.o MODPOST vmlinux.o WARNING: vmlinux.o(.text+0x1ba48): Section mismatch in reference from the function native_pagetable_reserve() to the function .init.text:memblock_x86_reserve_range() The function native_pagetable_reserve() references the function __init memblock_x86_reserve_range(). This is often because native_pagetable_reserve lacks a __init annotation or the annotation of memblock_x86_reserve_range is wrong. This patch fixes the issue. Thanks to pipacs from PaX project for help on IRC. Acked-by: "H. Peter Anvin" <[email protected]> Signed-off-by: Sedat Dilek <[email protected]> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
2011-05-12x86,xen: introduce x86_init.mapping.pagetable_reserveStefano Stabellini5-2/+54
Introduce a new x86_init hook called pagetable_reserve that at the end of init_memory_mapping is used to reserve a range of memory addresses for the kernel pagetable pages we used and free the other ones. On native it just calls memblock_x86_reserve_range while on xen it also takes care of setting the spare memory previously allocated for kernel pagetable pages from RO to RW, so that it can be used for other purposes. A detailed explanation of the reason why this hook is needed follows. As a consequence of the commit: commit 4b239f458c229de044d6905c2b0f9fe16ed9e01e Author: Yinghai Lu <[email protected]> Date: Fri Dec 17 16:58:28 2010 -0800 x86-64, mm: Put early page table high at some point init_memory_mapping is going to reach the pagetable pages area and map those pages too (mapping them as normal memory that falls in the range of addresses passed to init_memory_mapping as argument). Some of those pages are already pagetable pages (they are in the range pgt_buf_start-pgt_buf_end) therefore they are going to be mapped RO and everything is fine. Some of these pages are not pagetable pages yet (they fall in the range pgt_buf_end-pgt_buf_top; for example the page at pgt_buf_end) so they are going to be mapped RW. When these pages become pagetable pages and are hooked into the pagetable, xen will find that the guest has already a RW mapping of them somewhere and fail the operation. The reason Xen requires pagetables to be RO is that the hypervisor needs to verify that the pagetables are valid before using them. The validation operations are called "pinning" (more details in arch/x86/xen/mmu.c). In order to fix the issue we mark all the pages in the entire range pgt_buf_start-pgt_buf_top as RO, however when the pagetable allocation is completed only the range pgt_buf_start-pgt_buf_end is reserved by init_memory_mapping. Hence the kernel is going to crash as soon as one of the pages in the range pgt_buf_end-pgt_buf_top is reused (b/c those ranges are RO). For this reason we need a hook to reserve the kernel pagetable pages we used and free the other ones so that they can be reused for other purposes. On native it just means calling memblock_x86_reserve_range, on Xen it also means marking RW the pagetable pages that we allocated before but that haven't been used before. Another way to fix this is without using the hook is by adding a 'if (xen_pv_domain)' in the 'init_memory_mapping' code and calling the Xen counterpart, but that is just nasty. Signed-off-by: Stefano Stabellini <[email protected]> Acked-by: Yinghai Lu <[email protected]> Acked-by: H. Peter Anvin <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
2011-05-12Revert "xen/mmu: Add workaround "x86-64, mm: Put early page table high""Konrad Rzeszutek Wilk1-123/+0
This reverts commit a38647837a411f7df79623128421eef2118b5884. It does not work with certain AMD machines. last_pfn = 0x100000 max_arch_pfn = 0x400000000 initial memory mapped : 0 - 02c3a000 Base memory trampoline at [ffff88000009b000] 9b000 size 20480 init_memory_mapping: 0000000000000000-0000000100000000 0000000000 - 0100000000 page 4k kernel direct mapping tables up to 100000000 @ ff7fb000-100000000 init_memory_mapping: 0000000100000000-00000001e0800000 0100000000 - 01e0800000 page 4k kernel direct mapping tables up to 1e0800000 @ 1df0f3000-1e0000000 xen: setting RW the range fffdc000 - 100000000 RAMDISK: 0203b000 - 02c3a000 No NUMA configuration found Faking a node at 0000000000000000-00000001e0800000 NUMA: Using 63 for the hash shift. Initmem setup node 0 0000000000000000-00000001e0800000 NODE_DATA [00000001dfffb000 - 00000001dfffffff] BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea PGD 0 Oops: 0003 [#1] SMP last sysfs file: CPU 0 Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.39-0-virtual #6~smb1 RIP: e030:[<ffffffff81cf6a75>] [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea RSP: e02b:ffffffff81c01e38 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 00000001e0800000 RCX: 0000000000001040 RDX: 0000000000004100 RSI: 0000000000000000 RDI: ffff8801dfffb000 RBP: ffffffff81c01e58 R08: 0000000000000020 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000bfe400 FS: 0000000000000000(0000) GS:ffffffff81cca000(0000) knlGS:0000000000000000 CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 0000000001c03000 CR4: 0000000000000660 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper (pid: 0, threadinfo ffffffff81c00000, task ffffffff81c0b020) Stack: 0000000000000040 0000000000000001 0000000000000000 ffffffffffffffff ffffffff81c01e88 ffffffff81cf6c25 0000000000000000 0000000000000000 ffffffff81cf687f 0000000000000000 ffffffff81c01ea8 ffffffff81cf6e45 Call Trace: [<ffffffff81cf6c25>] numa_register_memblks.constprop.3+0x150/0x181 [<ffffffff81cf687f>] ? numa_add_memblk+0x7c/0x7c [<ffffffff81cf6e45>] numa_init.part.2+0x1c/0x7c [<ffffffff81cf687f>] ? numa_add_memblk+0x7c/0x7c [<ffffffff81cf6f67>] numa_init+0x6c/0x70 [<ffffffff81cf7057>] initmem_init+0x39/0x3b [<ffffffff81ce5865>] setup_arch+0x64e/0x769 [<ffffffff815e43c1>] ? printk+0x51/0x53 [<ffffffff81cdf92b>] start_kernel+0xd4/0x3f3 [<ffffffff81cdf388>] x86_64_start_reservations+0x132/0x136 [<ffffffff81ce2ed4>] xen_start_kernel+0x588/0x58f Code: 41 00 00 48 8b 3c c5 a0 24 cc 81 31 c0 40 f6 c7 01 74 05 aa 66 ba ff 40 40 f6 c7 02 74 05 66 ab 83 ea 02 89 d1 c1 e9 02 f6 c2 02 <f3> ab 74 02 66 ab 80 e2 01 74 01 aa 49 63 c4 48 c1 eb 0c 44 89 RIP [<ffffffff81cf6a75>] setup_node_bootmem+0x18a/0x1ea RSP <ffffffff81c01e38> CR2: 0000000000000000 ---[ end trace a7919e7f17c0a725 ]--- Kernel panic - not syncing: Attempted to kill the idle task! Pid: 0, comm: swapper Tainted: G D 2.6.39-0-virtual #6~smb1 Reported-by: Stefan Bader <[email protected]> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
2011-05-12btrfs: use unsigned type for single bit bitfieldDavid Sterba1-4/+4
Signed-off-by: David Sterba <[email protected]>
2011-05-12btrfs: use printk_ratelimited instead of printk_ratelimitDavid Sterba2-24/+10
As per printk_ratelimit comment, it should not be used. Signed-off-by: David Sterba <[email protected]>
2011-05-12Merge branch 'for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: fix oops in revalidate when called with NULL nameidata
2011-05-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds5-9/+19
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc32: Fixed unaligned memory copying in function __csum_partial_copy_sparc_generic sparc32: fix sparcstation 5 boot sparc32: fix section mismatch warnings in apc, pmc and time_32
2011-05-12Merge branch 'fixes' of master.kernel.org:/home/rmk/linux-2.6-armLinus Torvalds10-71/+109
* 'fixes' of master.kernel.org:/home/rmk/linux-2.6-arm: ARM: 6870/1: The mandatory barrier rmb() must be a dsb() in for device accesses ARM: 6892/1: handle ptrace requests to change PC during interrupted system calls ARM: 6890/1: memmap: only free allocated memmap entries when using SPARSEMEM ARM: zImage: the page table memory must be considered before relocation ARM: zImage: make sure not to relocate on top of the relocation code ARM: zImage: Fix bad SP address after relocating kernel ARM: zImage: make sure the stack is 64-bit aligned ARM: RiscPC: acornfb: fix section mismatches ARM: RiscPC: etherh: fix section mismatches
2011-05-12fbmem: make read/write/ioctl use the frame buffer at open timeLinus Torvalds1-16/+34
read/write/ioctl on a fbcon file descriptor has traditionally used the fbcon not when it was opened, but as it was at the time of the call. That makes no sense, but the lack of sense is much more obvious now that we properly ref-count the usage - it means that the ref-counting doesn't actually protect operations we do on the frame buffer. This changes it to look at the fb_info that we got at open time, but in order to avoid using a frame buffer long after it has been unregistered, we do verify that it is still current, and return -ENODEV if not. Acked-by: Tim Gardner <[email protected]> Tested-by: Daniel J Blueman <[email protected]> Tested-by: Anca Emanuel <[email protected]> Cc: Bruno Prémont <[email protected]> Cc: Alan Cox <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Andy Whitcroft <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2011-05-12fbcon: add lifetime refcount to opened frame buffersLinus Torvalds2-10/+47
This just adds the refcount and the new registration lock logic. It does not (for example) actually change the read/write/ioctl routines to actually use the frame buffer that was opened: those function still end up alway susing whatever the current frame buffer is at the time of the call. Without this, if something holds the frame buffer open over a framebuffer switch, the close() operation after the switch will access a fb_info that has been free'd by the unregistering of the old frame buffer. (The read/write/ioctl operations will normally not cause problems, because they will - illogically - pick up the new fbcon instead. But a switch that happens just as one of those is going on might see problems too, the window is just much smaller: one individual op rather than the whole open-close sequence.) This use-after-free is apparently fairly easily triggered by the Ubuntu 11.04 boot sequence. Acked-by: Tim Gardner <[email protected]> Tested-by: Daniel J Blueman <[email protected]> Tested-by: Anca Emanuel <[email protected]> Cc: Bruno Prémont <[email protected]> Cc: Alan Cox <[email protected]> Cc: Paul Mundt <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Andy Whitcroft <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>