aboutsummaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)AuthorFilesLines
2015-09-08nios2: remove unused statistic countersBernd Weiberg1-18/+0
Removed some statistic counters to improve the performance of the handler. Signed-off-by: Bernd Weiberg <[email protected]> Signed-off-by: Ley Foon Tan <[email protected]>
2015-09-08nios2: fixed variable imm16 to s16Bernd Weiberg1-1/+1
Fxid variable imm16 to s16 instead of u16, offset might be negative. Signed-off-by: Bernd Weiberg <[email protected]> Signed-off-by: Ley Foon Tan <[email protected]>
2015-09-08nios2/time: Migrate to new 'set-state' interfaceViresh Kumar1-20/+29
Migrate nios2 driver to the new 'set-state' interface provided by clockevents core, the earlier 'set-mode' interface is marked obsolete now. This also enables us to implement callbacks for new states of clockevent devices, for example: ONESHOT_STOPPED. Cc: Ley Foon Tan <[email protected]> Cc: Tobias Klauser <[email protected]> Cc: Herbert Xu <[email protected]> Cc: Dmitry Torokhov <[email protected]> Cc: [email protected] Signed-off-by: Viresh Kumar <[email protected]> Acked-by: Ley Foon Tan <[email protected]>
2015-09-07ARM: 8429/1: disable GCC SRA optimizationArd Biesheuvel1-0/+8
While working on the 32-bit ARM port of UEFI, I noticed a strange corruption in the kernel log. The following snprintf() statement (in drivers/firmware/efi/efi.c:efi_md_typeattr_format()) snprintf(pos, size, "|%3s|%2s|%2s|%2s|%3s|%2s|%2s|%2s|%2s]", was producing the following output in the log: | | | | | |WB|WT|WC|UC] | | | | | |WB|WT|WC|UC] | | | | | |WB|WT|WC|UC] |RUN| | | | |WB|WT|WC|UC]* |RUN| | | | |WB|WT|WC|UC]* | | | | | |WB|WT|WC|UC] |RUN| | | | |WB|WT|WC|UC]* | | | | | |WB|WT|WC|UC] |RUN| | | | | | | |UC] |RUN| | | | | | | |UC] As it turns out, this is caused by incorrect code being emitted for the string() function in lib/vsprintf.c. The following code if (!(spec.flags & LEFT)) { while (len < spec.field_width--) { if (buf < end) *buf = ' '; ++buf; } } for (i = 0; i < len; ++i) { if (buf < end) *buf = *s; ++buf; ++s; } while (len < spec.field_width--) { if (buf < end) *buf = ' '; ++buf; } when called with len == 0, triggers an issue in the GCC SRA optimization pass (Scalar Replacement of Aggregates), which handles promotion of signed struct members incorrectly. This is a known but as yet unresolved issue. (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65932). In this particular case, it is causing the second while loop to be executed erroneously a single time, causing the additional space characters to be printed. So disable the optimization by passing -fno-ipa-sra. Cc: <[email protected]> Acked-by: Nicolas Pitre <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]> Signed-off-by: Russell King <[email protected]>
2015-09-07powerpc/powernv/pci-ioda: fix kdump with non-power-of-2 crashkernel=Nishanth Aravamudan1-2/+8
The 32-bit TCE table initialization relies on the DMA window having a size equal to a power of 2 (and checks for it explicitly). But crashkernel= has no constraint that requires a power-of-2 be specified. This causes the kdump kernel to fail to boot as none of the PCI devices (including the disk controller) are successfully initialized. After this change, the PCI devices successfully set up the 32-bit TCE table and kdump succeeds. Fixes: aca6913f5551 ("powerpc/powernv/ioda2: Introduce helpers to allocate TCE pages") Signed-off-by: Nishanth Aravamudan <[email protected]> Cc: [email protected] # 4.2 Tested-by: Jan Stancek <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2015-09-07powerpc/powernv/pci-ioda: fix 32-bit TCE table init in kdump kernelNishanth Aravamudan1-1/+9
When attempting to kdump with the 4.2 kernel, we see for each PCI device: pci 0003:01 : [PE# 000] Assign DMA32 space pci 0003:01 : [PE# 000] Setting up 32-bit TCE table at 0..80000000 pci 0003:01 : [PE# 000] Failed to create 32-bit TCE table, err -22 PCI: Domain 0004 has 8 available 32-bit DMA segments PCI: 4 PE# for a total weight of 70 pci 0004:01 : [PE# 002] Assign DMA32 space pci 0004:01 : [PE# 002] Setting up 32-bit TCE table at 0..80000000 pci 0004:01 : [PE# 002] Failed to create 32-bit TCE table, err -22 pci 0004:0d : [PE# 005] Assign DMA32 space pci 0004:0d : [PE# 005] Setting up 32-bit TCE table at 0..80000000 pci 0004:0d : [PE# 005] Failed to create 32-bit TCE table, err -22 pci 0004:0e : [PE# 006] Assign DMA32 space pci 0004:0e : [PE# 006] Setting up 32-bit TCE table at 0..80000000 pci 0004:0e : [PE# 006] Failed to create 32-bit TCE table, err -22 pci 0004:10 : [PE# 008] Assign DMA32 space pci 0004:10 : [PE# 008] Setting up 32-bit TCE table at 0..80000000 pci 0004:10 : [PE# 008] Failed to create 32-bit TCE table, err -22 and eventually the kdump kernel fails to boot as none of the PCI devices (including the disk controller) are successfully initialized. The EINVAL response is because the DMA window (the 2GB base window) is larger than the kdump kernel's reserved memory (crashkernel=, in this case specified to be 1024M). The check in question, if ((window_size > memory_hotplug_max()) || !is_power_of_2(window_size)) is a valid sanity check for pnv_pci_ioda2_table_alloc_pages(), so adjust the caller to pass in a smaller window size if our maximum memory value is smaller than the DMA window. After this change, the PCI devices successfully set up the 32-bit TCE table and kdump succeeds. The problem was seen on a Firestone machine originally. Fixes: aca6913f5551 ("powerpc/powernv/ioda2: Introduce helpers to allocate TCE pages") Cc: [email protected] # 4.2 Signed-off-by: Nishanth Aravamudan <[email protected]> Reviewed-by: Alexey Kardashevskiy <[email protected]> [mpe: Coding style pedantry, use u64, change the indentation] Signed-off-by: Michael Ellerman <[email protected]>
2015-09-06Silence compiler warning in arch/x86/kvm/emulate.cValdis Kletnieks1-1/+1
Compiler warning: CC [M] arch/x86/kvm/emulate.o arch/x86/kvm/emulate.c: In function "__do_insn_fetch_bytes": arch/x86/kvm/emulate.c:814:9: warning: "linear" may be used uninitialized in this function [-Wmaybe-uninitialized] GCC is smart enough to realize that the inlined __linearize may return before setting the value of linear, but not smart enough to realize the same X86EMU_CONTINUE blocks actual use of the value. However, the value of 'linear' can only be set to one value, so hoisting the one line of code upwards makes GCC happy with the code. Reported-by: Aruna Hewapathirane <[email protected]> Tested-by: Aruna Hewapathirane <[email protected]> Signed-off-by: Valdis Kletnieks <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2015-09-06kvm: compile process_smi_save_seg_64() only for x86_64Alexander Kuleshov1-0/+2
The process_smi_save_seg_64() function called only in the process_smi_save_state_64() if the CONFIG_X86_64 is set. This patch adds #ifdef CONFIG_X86_64 around process_smi_save_seg_64() to prevent following warning message: arch/x86/kvm/x86.c:5946:13: warning: ‘process_smi_save_seg_64’ defined but not used [-Wunused-function] static void process_smi_save_seg_64(struct kvm_vcpu *vcpu, char *buf, int n) ^ Signed-off-by: Alexander Kuleshov <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2015-09-06KVM: x86: avoid uninitialized variable warningPaolo Bonzini1-3/+4
This does not show up on all compiler versions, so it sneaked into the first 4.3 pull request. The fix is to mimic the logic of the "print sptes" loop in the "fill array" loop. Then leaf and root can be both initialized unconditionally. Note that "leaf" now points to the first unused element of the array, not the last filled element. Reported-by: Linus Torvalds <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2015-09-05Merge branch 'for-linus' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "In this one: - d_move fixes (Eric Biederman) - UFS fixes (me; locking is mostly sane now, a bunch of bugs in error handling ought to be fixed) - switch of sb_writers to percpu rwsem (Oleg Nesterov) - superblock scalability (Josef Bacik and Dave Chinner) - swapon(2) race fix (Hugh Dickins)" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (65 commits) vfs: Test for and handle paths that are unreachable from their mnt_root dcache: Reduce the scope of i_lock in d_splice_alias dcache: Handle escaped paths in prepend_path mm: fix potential data race in SyS_swapon inode: don't softlockup when evicting inodes inode: rename i_wb_list to i_io_list sync: serialise per-superblock sync operations inode: convert inode_sb_list_lock to per-sb inode: add hlist_fake to avoid the inode hash lock in evict writeback: plug writeback at a high level change sb_writers to use percpu_rw_semaphore shift percpu_counter_destroy() into destroy_super_work() percpu-rwsem: kill CONFIG_PERCPU_RWSEM percpu-rwsem: introduce percpu_rwsem_release() and percpu_rwsem_acquire() percpu-rwsem: introduce percpu_down_read_trylock() document rwsem_release() in sb_wait_write() fix the broken lockdep logic in __sb_start_write() introduce __sb_writers_{acquired,release}() helpers ufs_inode_get{frag,block}(): get rid of 'phys' argument ufs_getfrag_block(): tidy up a bit ...
2015-09-05Merge tag 'media/v4.3-1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media updates from Mauro Carvalho Chehab: - new DVB frontend drivers: ascot2e, cxd2841er, horus3a, lnbh25 - new HDMI capture driver: tc358743 - new driver for NetUP DVB new boards (netup_unidvb) - IR support for DVBSky cards (smipcie-ir) - Coda driver has gain macroblock tiling support - Renesas R-Car gains JPEG codec driver - new DVB platform driver for STi boards: c8sectpfe - added documentation for the media core kABI to device-drivers DocBook - lots of driver fixups, cleanups and improvements * tag 'media/v4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (297 commits) [media] c8sectpfe: Remove select on undefined LIBELF_32 [media] i2c: fix platform_no_drv_owner.cocci warnings [media] cx231xx: Use wake_up_interruptible() instead of wake_up_interruptible_nr() [media] tc358743: only queue subdev notifications if devnode is set [media] tc358743: add missing Kconfig dependency/select [media] c8sectpfe: Use %pad to print 'dma_addr_t' [media] DocBook media: Fix typo "the the" in xml files [media] tc358743: make reset gpio optional [media] tc358743: set direction of reset gpio using devm_gpiod_get [media] dvbdev: document most of the functions/data structs [media] dvb_frontend.h: document the struct dvb_frontend [media] dvb-frontend.h: document struct dtv_frontend_properties [media] dvb-frontend.h: document struct dvb_frontend_ops [media] dvb: Use DVBFE_ALGO_HW where applicable [media] dvb_frontend.h: document struct analog_demod_ops [media] dvb_frontend.h: Document struct dvb_tuner_ops [media] Docbook: Document struct analog_parameters [media] dvb_frontend.h: get rid of dvbfe_modcod [media] add documentation for struct dvb_tuner_info [media] dvb_frontend: document dvb_frontend_tune_settings ...
2015-09-05Merge branch 'akpm' (patches from Andrew)Linus Torvalds12-11/+24
Merge patch-bomb from Andrew Morton: - a few misc things - Andy's "ambient capabilities" - fs/nofity updates - the ocfs2 queue - kernel/watchdog.c updates and feature work. - some of MM. Includes Andrea's userfaultfd feature. [ Hadn't noticed that userfaultfd was 'default y' when applying the patches, so that got fixed in this merge instead. We do _not_ mark new features that nobody uses yet 'default y' - Linus ] * emailed patches from Andrew Morton <[email protected]>: (118 commits) mm/hugetlb.c: make vma_has_reserves() return bool mm/madvise.c: make madvise_behaviour_valid() return bool mm/memory.c: make tlb_next_batch() return bool mm/dmapool.c: change is_page_busy() return from int to bool mm: remove struct node_active_region mremap: simplify the "overlap" check in mremap_to() mremap: don't do uneccesary checks if new_len == old_len mremap: don't do mm_populate(new_addr) on failure mm: move ->mremap() from file_operations to vm_operations_struct mremap: don't leak new_vma if f_op->mremap() fails mm/hugetlb.c: make vma_shareable() return bool mm: make GUP handle pfn mapping unless FOLL_GET is requested mm: fix status code which move_pages() returns for zero page mm: memcontrol: bring back the VM_BUG_ON() in mem_cgroup_swapout() genalloc: add support of multiple gen_pools per device genalloc: add name arg to gen_pool_get() and devm_gen_pool_create() mm/memblock: WARN_ON when nid differs from overlap region Documentation/features/vm: add feature description and arch support status for batched TLB flush after unmap mm: defer flush of writable TLB entries mm: send one IPI per CPU to TLB flush all entries after unmapping pages ...
2015-09-05ARM: dts: AM437x: Add the internal and external clock nodes for rtcKeerthy4-0/+33
rtc can either be supplied from internal 32k clock or external crystal generated 32k clock. Internal clock is SOC specific and the external clock is board dependent. Adding the corresponding nodes. Signed-off-by: Keerthy <[email protected]> Acked-by: Tony Lindgren <[email protected]> Signed-off-by: Alexandre Belloni <[email protected]>
2015-09-05ARM: config: Switch PXA27x platforms to use PXA RTC driverRob Herring6-6/+6
With the SA1100 and PXA RTC drivers be mutually exclusive and no longer sharing hardware, PXA27x/PXA3xx platforms must use the PXA RTC driver as the SA1100 platform device is no longer registered. This change should be almost transparent to userspace. Former users of pxa-rtc should be aware that 2 RTCs will be available on their kernels, rtc0 being sa1100-rtc and rtc1 being pxa-rtc. Any userspace relying on the fact that rtc0 was pxa-rtc should be fixed. As a consequence: - the first reboot after the switch will have the wrong time, - on dual boot platform where the other OS programs some logic into the sa1100 rtc IP, a lack of fix in userspace, ie. a kernel changing sa1100-rtc thinking it is pxa-rtc could have dire consequence, such as wiping the other OS data partition. (Thanks to Robert Jarmik for help on the above commit text.) Signed-off-by: Rob Herring <[email protected]> Acked-by: Robert Jarzmik <[email protected]> Cc: Daniel Mack <[email protected]> Cc: Haojian Zhuang <[email protected]> Cc: Sergey Lapin <[email protected]> Cc: Russell King <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Philipp Zabel <[email protected]> Signed-off-by: Alexandre Belloni <[email protected]>
2015-09-05ARM: mmp: remove unused RTC register definitionsRob Herring1-23/+0
Now that register definitions have been moved to the driver, regs-rtc.h is no longer used and can be removed. Signed-off-by: Rob Herring <[email protected]> Cc: Eric Miao <[email protected]> Cc: Haojian Zhuang <[email protected]> Cc: Russell King <[email protected]> Cc: [email protected] Signed-off-by: Alexandre Belloni <[email protected]>
2015-09-05ARM: sa1100: remove unused RTC register definitionsRob Herring1-34/+0
Now that register definitions have been moved to the driver, we can remove them from machine specific code. Signed-off-by: Rob Herring <[email protected]> Cc: Russell King <[email protected]> Cc: [email protected] Signed-off-by: Alexandre Belloni <[email protected]>
2015-09-05ARM: pxa: add memory resource to SA1100 RTC deviceRob Herring1-16/+2
The drivers for the SA1100 and PXA RTCs are now mutually exclusive, so add the memory resource for the sa1100-rtc device. Since the memory resource is already present in the pxa_rtc_resources, that makes sa1100_rtc_resources and pxa_rtc_resources equivalent, so use pxa_rtc_resources for both devices and remove the duplicate sa1100_rtc_resources. Signed-off-by: Rob Herring <[email protected]> Cc: Daniel Mack <[email protected]> Cc: Haojian Zhuang <[email protected]> Acked-by: Robert Jarzmik <[email protected]> Cc: Russell King <[email protected]> Cc: [email protected] Signed-off-by: Alexandre Belloni <[email protected]>
2015-09-05rtc: pxa: convert to use shared sa1100 functionsRob Herring2-2/+0
Currently, the rtc-sa1100 and rtc-pxa drivers co-exist as rtc-pxa has a superset of functionality. Having 2 drivers sharing the same memory resource is not allowed by the driver model if resources are properly declared. This problem was avoided by not adding memory resources to the SA1100 RTC driver, but that prevents clean-up of the SA1100 driver. This commit converts the PXA RTC to use the exported SA1100 RTC functions. Now the sa1100-rtc and pxa-rtc devices are mutually exclusive, so we must remove the sa1100-rtc from pxa27x and pxa3xx. Signed-off-by: Rob Herring <[email protected]> Cc: Daniel Mack <[email protected]> Cc: Haojian Zhuang <[email protected]> Cc: Robert Jarzmik <[email protected]> Cc: Russell King <[email protected]> Cc: Alessandro Zummo <[email protected]> Cc: Alexandre Belloni <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Alexandre Belloni <[email protected]>
2015-09-05x86/vm86: Block non-root vm86(old) if mmap_min_addr != 0Andy Lutomirski1-0/+27
vm86 exposes an interesting attack surface against the entry code. Since vm86 is mostly useless anyway if mmap_min_addr != 0, just turn it off in that case. There are some reports that vbetool can work despite setting mmap_min_addr to zero. This shouldn't break that use case, as CAP_SYS_RAWIO already overrides mmap_min_addr. Suggested-by: Linus Torvalds <[email protected]> Signed-off-by: Andy Lutomirski <[email protected]> Cc: Arjan van de Ven <[email protected]> Cc: Austin S Hemmelgarn <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Josh Boyer <[email protected]> Cc: Kees Cook <[email protected]> Cc: Matthew Garrett <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stas Sergeev <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-09-05Merge branch 'linus' into x86/urgent, to be able to merge a dependent fixIngo Molnar1060-24161/+39678
Signed-off-by: Ingo Molnar <[email protected]>
2015-09-04genalloc: add name arg to gen_pool_get() and devm_gen_pool_create()Vladimir Zapolskiy4-4/+4
This change modifies gen_pool_get() and devm_gen_pool_create() client interfaces adding one more argument "name" of a gen_pool object. Due to implementation gen_pool_get() is capable to retrieve only one gen_pool associated with a device even if multiple gen_pools are created, fortunately right at the moment it is sufficient for the clients, hence provide NULL as a valid argument on both producer devm_gen_pool_create() and consumer gen_pool_get() sides. Because only one created gen_pool per device is addressable, explicitly add a restriction to devm_gen_pool_create() to create only one gen_pool per device, this implies two possible error codes returned by the function, account it on client side (only misc/sram). This completes client side changes related to genalloc updates. [[email protected]: gen_pool_get() cleanup] Signed-off-by: Vladimir Zapolskiy <[email protected]> Cc: Philipp Zabel <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Russell King <[email protected]> Cc: Nicolas Ferre <[email protected]> Cc: Alexandre Belloni <[email protected]> Cc: Jean-Christophe Plagniol-Villard <[email protected]> Cc: Shawn Guo <[email protected]> Cc: Sascha Hauer <[email protected]> Cc: Mauro Carvalho Chehab <[email protected]> Cc: Arnd Bergmann <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-09-04mm: send one IPI per CPU to TLB flush all entries after unmapping pagesMel Gorman2-0/+7
An IPI is sent to flush remote TLBs when a page is unmapped that was potentially accesssed by other CPUs. There are many circumstances where this happens but the obvious one is kswapd reclaiming pages belonging to a running process as kswapd and the task are likely running on separate CPUs. On small machines, this is not a significant problem but as machine gets larger with more cores and more memory, the cost of these IPIs can be high. This patch uses a simple structure that tracks CPUs that potentially have TLB entries for pages being unmapped. When the unmapping is complete, the full TLB is flushed on the assumption that a refill cost is lower than flushing individual entries. Architectures wishing to do this must give the following guarantee. If a clean page is unmapped and not immediately flushed, the architecture must guarantee that a write to that linear address from a CPU with a cached TLB entry will trap a page fault. This is essentially what the kernel already depends on but the window is much larger with this patch applied and is worth highlighting. The architecture should consider whether the cost of the full TLB flush is higher than sending an IPI to flush each individual entry. An additional architecture helper called flush_tlb_local is required. It's a trivial wrapper with some accounting in the x86 case. The impact of this patch depends on the workload as measuring any benefit requires both mapped pages co-located on the LRU and memory pressure. The case with the biggest impact is multiple processes reading mapped pages taken from the vm-scalability test suite. The test case uses NR_CPU readers of mapped files that consume 10*RAM. Linear mapped reader on a 4-node machine with 64G RAM and 48 CPUs 4.2.0-rc1 4.2.0-rc1 vanilla flushfull-v7 Ops lru-file-mmap-read-elapsed 159.62 ( 0.00%) 120.68 ( 24.40%) Ops lru-file-mmap-read-time_range 30.59 ( 0.00%) 2.80 ( 90.85%) Ops lru-file-mmap-read-time_stddv 6.70 ( 0.00%) 0.64 ( 90.38%) 4.2.0-rc1 4.2.0-rc1 vanilla flushfull-v7 User 581.00 611.43 System 5804.93 4111.76 Elapsed 161.03 122.12 This is showing that the readers completed 24.40% faster with 29% less system CPU time. From vmstats, it is known that the vanilla kernel was interrupted roughly 900K times per second during the steady phase of the test and the patched kernel was interrupts 180K times per second. The impact is lower on a single socket machine. 4.2.0-rc1 4.2.0-rc1 vanilla flushfull-v7 Ops lru-file-mmap-read-elapsed 25.33 ( 0.00%) 20.38 ( 19.54%) Ops lru-file-mmap-read-time_range 0.91 ( 0.00%) 1.44 (-58.24%) Ops lru-file-mmap-read-time_stddv 0.28 ( 0.00%) 0.47 (-65.34%) 4.2.0-rc1 4.2.0-rc1 vanilla flushfull-v7 User 58.09 57.64 System 111.82 76.56 Elapsed 27.29 22.55 It's still a noticeable improvement with vmstat showing interrupts went from roughly 500K per second to 45K per second. The patch will have no impact on workloads with no memory pressure or have relatively few mapped pages. It will have an unpredictable impact on the workload running on the CPU being flushed as it'll depend on how many TLB entries need to be refilled and how long that takes. Worst case, the TLB will be completely cleared of active entries when the target PFNs were not resident at all. [[email protected]: trace tlb flush after disabling preemption in try_to_unmap_flush] Signed-off-by: Mel Gorman <[email protected]> Reviewed-by: Rik van Riel <[email protected]> Cc: Dave Hansen <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: Linus Torvalds <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Cc: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-09-04x86, mm: trace when an IPI is about to be sentMel Gorman1-0/+1
When unmapping pages it is necessary to flush the TLB. If that page was accessed by another CPU then an IPI is used to flush the remote CPU. That is a lot of IPIs if kswapd is scanning and unmapping >100K pages per second. There already is a window between when a page is unmapped and when it is TLB flushed. This series increases the window so multiple pages can be flushed using a single IPI. This should be safe or the kernel is hosed already. Patch 1 simply made the rest of the series easier to write as ftrace could identify all the senders of TLB flush IPIS. Patch 2 tracks what CPUs potentially map a PFN and then sends an IPI to flush the entire TLB. Patch 3 tracks when there potentially are writable TLB entries that need to be batched differently Patch 4 increases SWAP_CLUSTER_MAX to further batch flushes The performance impact is documented in the changelogs but in the optimistic case on a 4-socket machine the full series reduces interrupts from 900K interrupts/second to 60K interrupts/second. This patch (of 4): It is easy to trace when an IPI is received to flush a TLB but harder to detect what event sent it. This patch makes it easy to identify the source of IPIs being transmitted for TLB flushes on x86. Signed-off-by: Mel Gorman <[email protected]> Reviewed-by: Rik van Riel <[email protected]> Reviewed-by: Dave Hansen <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: Linus Torvalds <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-09-04userfaultfd: activate syscallAndrea Arcangeli2-0/+2
This activates the userfaultfd syscall. [[email protected]: activate syscall fix] [[email protected]: don't enable userfaultfd on powerpc] Signed-off-by: Andrea Arcangeli <[email protected]> Acked-by: Pavel Emelyanov <[email protected]> Cc: Sanidhya Kashyap <[email protected]> Cc: [email protected] Cc: "Kirill A. Shutemov" <[email protected]> Cc: Andres Lagar-Cavilla <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Peter Feiner <[email protected]> Cc: "Dr. David Alan Gilbert" <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: "Huangpeng (Peter)" <[email protected]> Signed-off-by: Stephen Rothwell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-09-04watchdog: rename watchdog_suspend() and watchdog_resume()Ulrich Obergfell1-2/+2
Rename watchdog_suspend() to lockup_detector_suspend() and watchdog_resume() to lockup_detector_resume() to avoid confusion with the watchdog subsystem and to be consistent with the existing name lockup_detector_init(). Also provide comment blocks to explain the watchdog_running and watchdog_suspended variables and their relationship. Signed-off-by: Ulrich Obergfell <[email protected]> Reviewed-by: Aaron Tomlin <[email protected]> Cc: Guenter Roeck <[email protected]> Cc: Don Zickus <[email protected]> Cc: Ulrich Obergfell <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-09-04watchdog: use suspend/resume interface in fixup_ht_bug()Ulrich Obergfell1-2/+5
Remove watchdog_nmi_disable_all() and watchdog_nmi_enable_all() since these functions are no longer needed. If a subsystem has a need to deactivate the watchdog temporarily, it should utilize the watchdog_suspend() and watchdog_resume() functions. [[email protected]: fix build with CONFIG_LOCKUP_DETECTOR=m] Signed-off-by: Ulrich Obergfell <[email protected]> Reviewed-by: Aaron Tomlin <[email protected]> Cc: Guenter Roeck <[email protected]> Cc: Don Zickus <[email protected]> Cc: Ulrich Obergfell <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Chris Metcalf <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-09-04kernel/watchdog: move NMI function header declarations from watchdog.h to nmi.hGuenter Roeck1-1/+1
The kernel's NMI watchdog has nothing to do with the watchdog subsystem. Its header declarations should be in linux/nmi.h, not linux/watchdog.h. The code provided two sets of dummy functions if HARDLOCKUP_DETECTOR is not configured, one in the include file and one in kernel/watchdog.c. Remove the dummy functions from kernel/watchdog.c and use those from the include file. Signed-off-by: Guenter Roeck <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Peter Zijlstra (Intel) <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Don Zickus <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-09-04sh: use PFN_DOWN macroAlexander Kuleshov2-4/+4
Replace ((x) >> PAGE_SHIFT) with the predefined PFN_DOWN macro. Signed-off-by: Alexander Kuleshov <[email protected]> Acked-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2015-09-05CRISv10: delete unused lib/dmacopy.cRabin Vincent1-42/+0
This file is never built. Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Jesper Nilsson <[email protected]>
2015-09-05CRISv10: delete unused lib/old_checksum.cRabin Vincent1-86/+0
This file is never built. Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Jesper Nilsson <[email protected]>
2015-09-05CRIS: fix switch_mm() lockdep splatRabin Vincent1-1/+8
With lockdep support implemented on CRISv32, we get the following splat. switch_mm() can be called both from the scheduler() (with interrupts disabled) and from flush_old_exec (via activate_mm()), with interrupts enabled. Fix it by disabling interrupts in activate_mm(), similar to powerpc and hexagon. t====================================================== [ INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected ] 3.19.0-08802-g20bc9f1-dirty #323 Not tainted ------------------------------------------------------ init/1 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: (mmu_context_lock){+.+...}, at: [<c0009290>] switch_mm+0x22/0xc6 and this task is already holding: (&rq->lock){-.-.-.}, at: [<c01a0756>] __schedule+0x5e/0x648 which would create a new lock dependency: (&rq->lock){-.-.-.} -> (mmu_context_lock){+.+...} but this new dependency connects a HARDIRQ-irq-safe lock: (&rq->lock){-.-.-.} ... which became HARDIRQ-irq-safe at: [<c002b03c>] scheduler_tick+0x28/0x5e [<c0007c6c>] timer_interrupt+0x4e/0x6a [<c0043ac4>] handle_irq_event_percpu+0x54/0x13c [<c004343c>] generic_handle_irq+0x2a/0x36 to a HARDIRQ-irq-unsafe lock: (mmu_context_lock){+.+...} ... which became HARDIRQ-irq-unsafe at: ... [<c0039e60>] __lock_acquire+0x8f8/0x1d9c [<c0009290>] switch_mm+0x22/0xc6 [<c009c260>] flush_old_exec+0x500/0x5d4 [<c00da4c6>] load_elf_phdrs+0x7a/0x84 [<c00dbdb0>] load_elf_binary+0x21c/0x13b4 [<c009cdb6>] do_execve+0x22/0x2c [<c001dcf2>] ____call_usermodehelper+0x0/0x154 [<c000581e>] ret_from_kernel_thread+0xe/0x14 other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(mmu_context_lock); local_irq_disable(); lock(&rq->lock); lock(mmu_context_lock); <Interrupt> lock(&rq->lock); *** DEADLOCK *** 1 lock held by init/1: #0: (&rq->lock){-.-.-.}, at: [<c01a0756>] __schedule+0x5e/0x648 Call Trace: [<c019fe9e>] printk+0x0/0x4e [<c00368f8>] print_shortest_lock_dependencies+0x0/0x15c [<c0048628>] print_stack_trace+0x0/0x88 [<c0038912>] __lock_is_held+0x3e/0x5e [<c003b894>] lock_acquire+0x8a/0xcc [<c01a50c4>] _raw_spin_lock+0x44/0x7a [<c0009290>] switch_mm+0x22/0xc6 [<c01a06f8>] __schedule+0x0/0x648 [<c01a0d76>] schedule+0x36/0x7c [<c0037d04>] trace_hardirqs_on+0x0/0x1e [<c0004e18>] do_work_pending+0x30/0xd4 [<c000591a>] _work_pending+0xe/0x12 Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Jesper Nilsson <[email protected]>
2015-09-05CRISv32: enable LOCKDEP_SUPPORTRabin Vincent1-0/+4
Now that we have stack tracing and irq flags tracing support, we can also enable lockdep support Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Jesper Nilsson <[email protected]>
2015-09-05CRIS: add STACKTRACE_SUPPORTRabin Vincent4-0/+88
Add stacktrace support, which is required for lockdep and tracing. The stack tracing simply looks at all kernel text symbols found on the stack, similar to the trap stack dumping code, which can also be converted to use this. Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Jesper Nilsson <[email protected]>
2015-09-05CRISv32: annotate irq enable in idle loopRabin Vincent1-2/+2
Use a call to local_irq_enable() instead of incline asm so that the irqsoff latency tracer knows that interrupts are enabled here. Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Jesper Nilsson <[email protected]>
2015-09-05CRISv32: add support for irqflags tracingRabin Vincent3-1/+20
Add support irqflags tracing, which is required for things like lockdep and ftrace. Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Jesper Nilsson <[email protected]>
2015-09-05CRIS: UAPI: use generic types.hRabin Vincent3-13/+1
CRIS' types.h is functionally identical to the asm-generic version. Effective diff: +#ifndef _ASM_GENERIC_TYPES_H +#define _ASM_GENERIC_TYPES_H + #include <asm-generic/int-ll64.h> + +#endif Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Jesper Nilsson <[email protected]>
2015-09-05CRIS: UAPI: use generic shmbuf.hRabin Vincent2-42/+1
CRIS' shmbuf.h is equivalent to the asm-generic verison. Effective diff: -#ifndef _CRIS_SHMBUF_H -#define _CRIS_SHMBUF_H +#ifndef __ASM_GENERIC_SHMBUF_H +#define __ASM_GENERIC_SHMBUF_H + +#include <asm/bitsperlong.h> struct ipc64_perm shm_perm; size_t shm_segsz; __kernel_time_t shm_atime; +#if __BITS_PER_LONG != 64 unsigned long __unused1; +#endif __kernel_time_t shm_dtime; +#if __BITS_PER_LONG != 64 unsigned long __unused2; +#endif __kernel_time_t shm_ctime; +#if __BITS_PER_LONG != 64 unsigned long __unused3; +#endif __kernel_pid_t shm_cpid; __kernel_pid_t shm_lpid; - unsigned long shm_nattch; - unsigned long __unused4; - unsigned long __unused5; + __kernel_ulong_t shm_nattch; + __kernel_ulong_t __unused4; + __kernel_ulong_t __unused5; }; struct shminfo64 { - unsigned long shmmax; - unsigned long shmmin; - unsigned long shmmni; - unsigned long shmseg; - unsigned long shmall; - unsigned long __unused1; - unsigned long __unused2; - unsigned long __unused3; - unsigned long __unused4; + __kernel_ulong_t shmmax; + __kernel_ulong_t shmmin; + __kernel_ulong_t shmmni; + __kernel_ulong_t shmseg; + __kernel_ulong_t shmall; + __kernel_ulong_t __unused1; + __kernel_ulong_t __unused2; + __kernel_ulong_t __unused3; + __kernel_ulong_t __unused4; }; #endif Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Jesper Nilsson <[email protected]>
2015-09-05CRIS: UAPI: use generic msgbuf.hRabin Vincent2-33/+1
CRIS' msgbuf.h is equivalent to the asm-generic version. Effective diff: -#ifndef _CRIS_MSGBUF_H -#define _CRIS_MSGBUF_H - - +#ifndef __ASM_GENERIC_MSGBUF_H +#define __ASM_GENERIC_MSGBUF_H +#include <asm/bitsperlong.h> struct msqid64_ds { struct ipc64_perm msg_perm; __kernel_time_t msg_stime; +#if __BITS_PER_LONG != 64 unsigned long __unused1; +#endif __kernel_time_t msg_rtime; +#if __BITS_PER_LONG != 64 unsigned long __unused2; +#endif __kernel_time_t msg_ctime; +#if __BITS_PER_LONG != 64 unsigned long __unused3; - unsigned long msg_cbytes; - unsigned long msg_qnum; - unsigned long msg_qbytes; +#endif + __kernel_ulong_t msg_cbytes; + __kernel_ulong_t msg_qnum; + __kernel_ulong_t msg_qbytes; __kernel_pid_t msg_lspid; __kernel_pid_t msg_lrpid; - unsigned long __unused4; - unsigned long __unused5; + __kernel_ulong_t __unused4; + __kernel_ulong_t __unused5; }; #endif Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Jesper Nilsson <[email protected]>
2015-09-05CRIS: UAPI: use generic socket.hRabin Vincent2-92/+1
CRIS' socket.h is equivalent to the asm-generic version. Effective diff: -#ifndef _ASM_SOCKET_H -#define _ASM_SOCKET_H - - +#ifndef __ASM_GENERIC_SOCKET_H +#define __ASM_GENERIC_SOCKET_H #include <asm/sockios.h> #define SO_LINGER 13 #define SO_BSDCOMPAT 14 #define SO_REUSEPORT 15 +#ifndef SO_PASSCRED #define SO_PASSCRED 16 #define SO_PEERCRED 17 #define SO_RCVLOWAT 18 #define SO_SNDLOWAT 19 #define SO_RCVTIMEO 20 #define SO_SNDTIMEO 21 +#endif #define SO_SECURITY_AUTHENTICATION 22 Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Jesper Nilsson <[email protected]>
2015-09-05CRIS: UAPI: use generic sembuf.hRabin Vincent2-25/+1
CRIS's sembuf.h is equivalent to the asm-generic version. Effective diff: -#ifndef _CRIS_SEMBUF_H -#define _CRIS_SEMBUF_H +#ifndef __ASM_GENERIC_SEMBUF_H +#define __ASM_GENERIC_SEMBUF_H +#include <asm/bitsperlong.h> struct semid64_ds { struct ipc64_perm sem_perm; __kernel_time_t sem_otime; +#if __BITS_PER_LONG != 64 unsigned long __unused1; +#endif __kernel_time_t sem_ctime; +#if __BITS_PER_LONG != 64 unsigned long __unused2; +#endif unsigned long sem_nsems; unsigned long __unused3; unsigned long __unused4; Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Jesper Nilsson <[email protected]>
2015-09-05CRIS: UAPI: use generic sockios.hRabin Vincent2-13/+1
CRIS' sockios.h is equivalent to the asm-generic version. Effective diff: -#ifndef __ARCH_CRIS_SOCKIOS__ -#define __ARCH_CRIS_SOCKIOS__ +#ifndef __ASM_GENERIC_SOCKIOS_H +#define __ASM_GENERIC_SOCKIOS_H Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Jesper Nilsson <[email protected]>
2015-09-05CRIS: UAPI: use generic auxvec.hRabin Vincent2-4/+1
CRIS's auxvec.h is empty just like the asm-generic version. Effective diff: -#ifndef __ASMCRIS_AUXVEC_H -#define __ASMCRIS_AUXVEC_H +#ifndef __ASM_GENERIC_AUXVEC_H +#define __ASM_GENERIC_AUXVEC_H + Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Jesper Nilsson <[email protected]>
2015-09-05CRIS: UAPI: use generic headers via KbuildRabin Vincent12-31/+10
Use Kbuild magic to include the generic headers. Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Jesper Nilsson <[email protected]>
2015-09-04Merge branch 'drm-next' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds3-7/+6
Pull drm updates from Dave Airlie: "This is the main pull request for the drm for 4.3. Nouveau is probably the biggest amount of changes in here, since it missed 4.2. Highlights below, along with the usual bunch of fixes. All stuff outside drm should have applicable acks. Highlights: - new drivers: freescale dcu kms driver - core: more atomic fixes disable some dri1 interfaces on kms drivers drop fb panic handling, this was just getting more broken, as more locking was required. new core fbdev Kconfig support - instead of each driver enable/disabling it struct_mutex cleanups - panel: more new panels cleanup Kconfig - i915: Skylake support enabled by default legacy modesetting using atomic infrastructure Skylake fixes GEN9 workarounds - amdgpu: Fiji support CGS support for amdgpu Initial GPU scheduler - off by default Lots of bug fixes and optimisations. - radeon: DP fixes misc fixes - amdkfd: Add Carrizo support for amdkfd using amdgpu. - nouveau: long pending cleanup to complete driver, fully bisectable which makes it larger, perfmon work more reclocking improvements maxwell displayport fixes - vmwgfx: new DX device support, supports OpenGL 3.3 screen targets support - mgag200: G200eW support G200e new revision support - msm: dragonboard 410c support, msm8x94 support, msm8x74v1 support yuv format support dma plane support mdp5 rotation initial hdcp - sti: atomic support - exynos: lots of cleanups atomic modesetting/pageflipping support render node support - tegra: tegra210 support (dc, dsi, dp/hdmi) dpms with atomic modesetting support - atmel: support for 3 more atmel SoCs new input formats, PRIME support. - dwhdmi: preparing to add audio support - rockchip: yuv plane support" * 'drm-next' of git://people.freedesktop.org/~airlied/linux: (1369 commits) drm/amdgpu: rename gmc_v8_0_init_compute_vmid drm/amdgpu: fix vce3 instance handling drm/amdgpu: remove ib test for the second VCE Ring drm/amdgpu: properly enable VM fault interrupts drm/amdgpu: fix warning in scheduler drm/amdgpu: fix buffer placement under memory pressure drm/amdgpu/cz: fix cz_dpm_update_low_memory_pstate logic drm/amdgpu: fix typo in dce11 watermark setup drm/amdgpu: fix typo in dce10 watermark setup drm/amdgpu: use top down allocation for non-CPU accessible vram drm/amdgpu: be explicit about cpu vram access for driver BOs (v2) drm/amdgpu: set MEC doorbell range for Fiji drm/amdgpu: implement burst NOP for SDMA drm/amdgpu: add insert_nop ring func and default implementation drm/amdgpu: add amdgpu_get_sdma_instance helper function drm/amdgpu: add AMDGPU_MAX_SDMA_INSTANCES drm/amdgpu: add burst_nop flag for sdma drm/amdgpu: add count field for the SDMA NOP packet v2 drm/amdgpu: use PT for VM sync on unmap drm/amdgpu: make wait_event uninterruptible in push_job ...
2015-09-05CRIS: UAPI: fix elf.h exportRabin Vincent4-1/+8
CRIS userspace (uClibc for one) expects asm/elf.h to be exported but this header appears to have gone missing at some point. Move it to uapi/ and export it. Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Jesper Nilsson <[email protected]>
2015-09-05CRIS: don't make asm/elf.h depend on asm/user.hRabin Vincent3-8/+7
We're going to export asm/elf.h; remove its dependencies on the non-exported asm/user.h and the unused asm/system.h include. Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Jesper Nilsson <[email protected]>
2015-09-05CRIS: UAPI: fix ptrace.hRabin Vincent6-3/+8
The exported ptrace.h header on CRIS references an "arch" directory which does not exist. Fix this by having the variants in the same directory and including them conditionally, similar to other architectures. Signed-off-by: Rabin Vincent <[email protected]> Signed-off-by: Jesper Nilsson <[email protected]>
2015-09-05CRISv32: Squash compile warnings for axisflashmapJesper Nilsson1-4/+5
Signed-off-by: Jesper Nilsson <[email protected]>
2015-09-05CRISv32: Add GPIO driver to the default configsJesper Nilsson4-14/+12
Fix a number of small issues visible when GPIO is enabled: - Correct missing default for !ETRAXFS in Kconfig - Remove information on number of bits for some Kconfigs related to the GPIO, they are different in ETRAX FS and ARTPEC-3 - Fix compile warning in ARTPEC-3 GPIO driver Signed-off-by: Jesper Nilsson <[email protected]>
2015-09-05CRISv32: ETRAX FS: Squash warnings in pinmux driverJesper Nilsson1-2/+6
Squash the followng warnings arch/cris/arch-v32/mach-fs/pinmux.c: In function 'crisv32_pinmux_alloc_fixed': arch/cris/arch-v32/mach-fs/pinmux.c:104:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement] arch/cris/arch-v32/mach-fs/pinmux.c: In function 'crisv32_pinmux_dealloc_fixed': arch/cris/arch-v32/mach-fs/pinmux.c:238:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement] arch/cris/arch-v32/mach-fs/pinmux.c: In function '__crisv32_pinmux_alloc': arch/cris/arch-v32/mach-fs/pinmux.c:49:1: warning: control reaches end of non-void function [-Wreturn-type] Signed-off-by: Jesper Nilsson <[email protected]>