aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-06-26lightnvm: pblk: set metadata list for all I/OsJavier González2-38/+54
Set a dma area for all I/Os in order to read/write from/to the metadata stored on the per-sector out-of-bound area. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-06-26lightnvm: pblk: choose optimal victim GC lineJavier González1-1/+15
At the moment, we separate the closed lines on three different list based on their number of valid sectors. GC recycles lines from each list based on capacity. Lines from each list are taken in a FIFO fashion. Since the number of lines is limited (it corresponds to the number of blocks in a LUN, which is somewhere between 1000-2000), we can afford scanning the lists to choose the optimal line to be recycled. This helps specially in lines with a high number of valid sectors. If the number of blocks per LUN increases, we will consider a more efficient policy. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-06-26lightnvm: pblk: decouple bad block from line allocJavier González1-16/+37
Decouple bad block discovery from line allocation logic. This allows to return meaningful error codes in case of bad block discovery failure. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-06-26lightnvm: pblk: simplify meta. memory allocationJavier González4-8/+8
smeta size will always be suitable for a kmalloc allocation. Simplify the code and leave the vmalloc fallback only for emeta, where the pblk configuration has an impact. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-06-26lightnvm: pblk: issue multiplane reads if possibleJavier González4-12/+51
If a read request is sequential and its size aligns with a multi-plane page size, use the multi-plane hint to process the I/O in parallel in the controller. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-06-26lightnvm: pblk: delete redundant buffer pointerJavier González7-41/+11
After refactoring the metadata path, the backpointer controlling synced I/Os in a line becomes unnecessary; metadata is scheduled on the write thread, thus we know when the end of the line is reached and act on it directly. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-06-26lightnvm: pblk: delete redundant debug line statJavier González1-5/+3
Remove a legacy variable that helped verifying the consistency of the run-time metadata for the free line list. With the new metadata layout, this check is no longer necessary. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-06-26lightnvm: pblk: sched. metadata on write threadJavier González8-285/+673
At the moment, line metadata is persisted on a separate work queue, that is kicked each time that a line is closed. The assumption when designing this was that freeing the write thread from creating a new write request was better than the potential impact of writes colliding on the media (user I/O and metadata I/O). Experimentation has proven that this assumption is wrong; collision can cause up to 25% of bandwidth and introduce long tail latencies on the write thread, which potentially cause user write threads to spend more time spinning to get a free entry on the write buffer. This patch moves the metadata logic to the write thread. When a line is closed, remaining metadata is written in memory and is placed on a metadata queue. The write thread then takes the metadata corresponding to the previous line, creates the write request and schedules it to minimize collisions on the media. Using this approach, we see that we can saturate the media's bandwidth, which helps reducing both write latencies and the spinning time for user writer threads. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-06-26lightnvm: pblk: rename read request poolJavier González5-37/+38
Read requests allocate some extra memory to store its per I/O context. Instead of requiring yet another memory pool for other type of requests, generalize this context allocation (and change naming accordingly). Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-06-26lightnvm: pblk: generalize erase pathJavier González6-90/+116
Erase I/Os are scheduled with the following goals in mind: (i) minimize LUNs collisions with write I/Os, and (ii) even out the price of erasing on every write, instead of putting all the burden on when garbage collection runs. This works well on the current design, but is specific to the default mapping algorithm. This patch generalizes the erase path so that other mapping algorithms can select an arbitrary line to be erased instead. It also gets rid of the erase semaphore since it creates jittering for user writes. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-06-26lightnvm: pblk: expose max sec per write on sysfsJavier González4-1/+48
Allow to configure the number of maximum sectors per write command through sysfs. This makes it easier to tune write command sizes for different controller configurations. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-06-26lightnvm: pblk: add debug stat for read cache hitsJavier González4-1/+10
Add a new debug counter to measure cache hits on the read path Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-06-26lightnvm: pblk: spare double cpu_to_le64 calc.Javier González2-4/+5
Spare a double calculation on the fast write path. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-06-26lightnvm: propagate right error code to targetJavier González1-1/+1
If nvme_alloc_request fails, propagate the right error, instead of assuming ENOMEM. Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-06-26lightnvm: re-convert ppa format on I/O failureJavier González1-1/+7
In case of a failure when submitting a request, convert the ppa_list addresses to the target format so that it can interpret ppas for recovery Signed-off-by: Javier González <[email protected]> Signed-off-by: Matias Bjørling <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-06-26Merge tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreamingLinus Torvalds1-1/+1
Pull c6x fixlet from Mark Salter: "Update maintainer email" * tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming: MAINTAINERS: update email address for C6x maintainer
2017-06-26Merge branch 'for-linus' of ↵Linus Torvalds1-6/+1
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 bugfix from Martin Schwidefsky: "One last s390 patch for 4.12 Revert the re-IPL semantics back to the v4.7 state. It turned out that the memory layout may change due to memory hotplug if load-normal is used" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/ipl: revert Load Normal semantics for LPAR CCW-type re-IPL
2017-06-26perf machine: Fix segfault for kernel.kptr_restrict=2Jiri Olsa1-4/+6
Michael reported the segfault when kernel.kptr_restrict=2 is set. $ perf record ls ... perf: Segmentation fault Obtained 16 stack frames. ./perf(dump_stack+0x2d) [0x5068df] ./perf(sighandler_dump_stack+0x2d) [0x5069bf] ./perf() [0x43e47b] /lib64/libc.so.6(+0x3594f) [0x7f762004794f] /lib64/libc.so.6(strlen+0x26) [0x7f762009ef86] /lib64/libc.so.6(__strdup+0xd) [0x7f762009ecbd] ./perf(maps__set_kallsyms_ref_reloc_sym+0x4d) [0x51590f] ./perf(machine__create_kernel_maps+0x136) [0x50a7de] ./perf(perf_session__create_kernel_maps+0x2c) [0x510a81] ./perf(perf_session__new+0x13d) [0x510e23] ./perf() [0x43fd61] ./perf(cmd_record+0x704) [0x441823] ./perf() [0x4bc1a0] ./perf() [0x4bc40d] ./perf() [0x4bc55f] ./perf(main+0x2d5) [0x4bc939] Segmentation fault (core dumped) The reason is that with kernel.kptr_restrict=2, we don't get the symbol from machine__get_running_kernel_start, which we want to use in maps__set_kallsyms_ref_reloc_sym and we crash. Check the symbol name value before calling maps__set_kallsyms_ref_reloc_sym() and succeed without ref_reloc_sym being set. It's safe because we check its existence before we use it. Reported-by: Michael Petlan <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-06-26EDAC, pnd2: Make function sbi_send() staticColin Ian King1-2/+2
The function sbi_send() is local to just pnd2_edac.c and does not need to be in global scope, so make it static. Signed-off-by: Colin Ian King <[email protected]> Cc: Tony Luck <[email protected]> Cc: linux-edac <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Borislav Petkov <[email protected]>
2017-06-26x86/microcode: Make a couple of symbols staticColin Ian King2-2/+2
The helper function __load_ucode_amd() and pointer intel_ucode_patch do not need to be in global scope, so make them static. Fixes those sparse warnings: "symbol '__load_ucode_amd' was not declared. Should it be static?" "symbol 'intel_ucode_patch' was not declared. Should it be static?" Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected]
2017-06-26powerpc/32: Avoid miscompilation w/GCC 4.6.3 - don't inline copy_to/from_user()Michael Ellerman1-7/+1
Larry Finger reported that his Powerbook G4 was no longer booting with v4.12-rc, userspace was up but giving weird errors such as: udevd[64]: starting version 175 udevd[64]: Unable to receive ctrl message: Bad address. modprobe: chdir(4.12-rc1): No such file or directory He bisected the problem to commit 3448890c32c3 ("powerpc: get rid of zeroing, switch to RAW_COPY_USER"). Al identified that the problem is actually a miscompilation by GCC 4.6.3, which is exposed by the above commit. Al also pointed out that inlining copy_to/from_user() is probably of little or no benefit, which is correct. Using Anton's copy_to_user benchmark, with a pathological single byte copy, we see a small increase in performance by *removing* inlining: Before (inlined): # time ./copy_to_user -w -l 1 -i 10000000 ( x 3 ) real 0m22.063s real 0m22.059s real 0m22.076s After: # time ./copy_to_user -w -l 1 -i 10000000 ( x 3 ) real 0m21.325s real 0m21.299s real 0m21.364s So as a small performance improvement and to avoid the miscompilation, drop inlining copy_to/from_user() on 32-bit. Fixes: 3448890c32c3 ("powerpc: get rid of zeroing, switch to RAW_COPY_USER") Reported-by: Larry Finger <[email protected]> Suggested-by: Al Viro <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-06-26drm/vmwgfx: Free hash table allocated by cmdbuf managed res mgrDeepak Rawat1-0/+1
The hash table created during vmw_cmdbuf_res_man_create was never freed. This causes memory leak in context creation. Added the corresponding drm_ht_remove in vmw_cmdbuf_res_man_destroy. Tested for memory leak by running piglit overnight and kernel memory is not inflated which earlier was. Cc: <[email protected]> Signed-off-by: Deepak Rawat <[email protected]> Reviewed-by: Sinclair Yeh <[email protected]> Signed-off-by: Thomas Hellstrom <[email protected]>
2017-06-26x86/mm/hotplug: Fix BUG_ON() after hot-remove by not freeing PUDJérôme Glisse1-1/+7
Since commit: af2cf278ef4f ("x86/mm/hotplug: Don't remove PGD entries in remove_pagetable()") we no longer free PUDs so that we do not have to synchronize all PGDs on hot-remove/vfree(). But the new 5-level page table patchset reverted that for 4-level page tables, in the following commit: f2a6a7050109: ("x86: Convert the rest of the code to support p4d_t") This patch restores the damage and disables free_pud() if we are in the 4-level page table case, thus avoiding BUG_ON() after hot-remove. Signed-off-by: Jérôme Glisse <[email protected]> [ Clarified the changelog and the code comments. ] Reviewed-by: Kirill A. Shutemov <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Logan Gunthorpe <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-06-26drm/i915: Disable EXEC_OBJECT_ASYNC when doing relocationsChris Wilson1-3/+14
If we write a relocation into the buffer, we require our own implicit synchronisation added after the start of the execbuf, outside of the user's control. As we may end up clflushing, or doing the patch itself on the GPU, asynchronously we need to look at the implicit serialisation on obj->resv and hence need to disable EXEC_OBJECT_ASYNC for this object. If the user does trigger a stall for relocations, we make sure the stall is complete enough so that the batch is not submitted before we complete those relocations. Fixes: 77ae9957897d ("drm/i915: Enable userspace to opt-out of implicit fencing") Signed-off-by: Chris Wilson <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Jason Ekstrand <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> (cherry picked from commit 071750e550af46b5d3a84ad56c2a108c3e136284) [danvet: Resolve conflicts, resolution reviewed by Tvrtko on irc.] Signed-off-by: Daniel Vetter <[email protected]>
2017-06-26drm/i915: Hold struct_mutex for per-file stats in debugfs/i915_gem_objectChris Wilson1-1/+5
As we walk the obj->vma_list in per_file_stats(), we need to hold struct_mutex to prevent alteration of that list. Fixes: 1d2ac403ae3b ("drm: Protect dev->filelist with its own mutex") Fixes: c84455b4bacc ("drm/i915: Move debug only per-request pid tracking from request to ctx") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101460 Signed-off-by: Chris Wilson <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Tvrtko Ursulin <[email protected]> (cherry picked from commit 0caf81b5c53d9bd332a95dbcb44db8de0b397a7c) Signed-off-by: Daniel Vetter <[email protected]>
2017-06-26drm/i915: Retire the VMA's fence tracker before unbindingChris Wilson1-0/+5
Since we may track unfenced access (GPU access to the vma that explicitly requires no fence), vma->last_fence may be set without any attached fence (vma->fence) and so will not be flushed when we call i915_vma_put_fence(). Since we stopped doing a full retire of the activity trackers for unbind, we need to explicitly retire each tracker. Fixes: b0decaf75bd9 ("drm/i915: Track active vma requests") Signed-off-by: Chris Wilson <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Tvrtko Ursulin <[email protected]> (cherry picked from commit 760a898d8069111704e1bd43f00ebf369ae46e57) Signed-off-by: Daniel Vetter <[email protected]>
2017-06-25Linux 4.12-rc7Linus Torvalds1-1/+1
2017-06-25Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds1-2/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Thomas Gleixner: "A single fix to unbreak the vdso32 build for 64bit kernels caused by excess #includes in the mshyperv header" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mshyperv: Remove excess #includes from mshyperv.h
2017-06-25Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds7-33/+55
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "A few fixes for timekeeping and timers: - Plug a subtle race due to a missing READ_ONCE() in the timekeeping code where reloading of a pointer results in an inconsistent callback argument being supplied to the clocksource->read function. - Correct the CLOCK_MONOTONIC_RAW sub-nanosecond accounting in the time keeping core code, to prevent a possible discontuity. - Apply a similar fix to the arm64 vdso clock_gettime() implementation - Add missing includes to clocksource drivers, which relied on indirect includes which fails in certain configs. - Use the proper iomem pointer for read/iounmap in a probe function" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: arm64/vdso: Fix nsec handling for CLOCK_MONOTONIC_RAW time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting time: Fix clock->read(clock) race around clocksource changes clocksource: Explicitly include linux/clocksource.h when needed clocksource/drivers/arm_arch_timer: Fix read and iounmap of incorrect variable
2017-06-25Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds3-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "Three fixlets for perf: - Return the proper error code if aux buffers for a event are not supported. - Calculate the probe offset for inlined functions correctly - Update the Skylake DTLB load/store miss event so it can count 1G TLB entries as well" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf probe: Fix probe definition for inlined functions perf/x86/intel: Add 1G DTLB load/store miss support for SKL perf/aux: Correct return code of rb_alloc_aux() if !has_aux(ev)
2017-06-25Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Thomas Gleixner: "A single fix for the MIPS GIC to prevent ftrace recursion" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/mips-gic: Mark count and compare accessors notrace
2017-06-25Merge branch 'for-linus' of ↵Linus Torvalds3-16/+28
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: - a quirk to i8042 to ignore timeout bit on Lifebook AH544 - a fixup to Synaptics RMI function 54 that was breaking some Dells - a fix for memory leak in soc_button_array driver * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: synaptics-rmi4 - only read the F54 query registers which are used Input: i8042 - add Fujitsu Lifebook AH544 to notimeout list Input: soc_button_array - fix leaking the ACPI button descriptor buffer
2017-06-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pendingLinus Torvalds4-14/+35
Pull SCSI target fixes from Nicholas Bellinger: "Here are the target-pending fixes for v4.12-rc7 that have been queued up for the last 2 weeks. This includes: - Fix a TMR related kref underflow detected by the recent refcount_t conversion in upstream. - Fix a iscsi-target corner case during explicit connection logout timeout failure. - Address last fallout in iscsi-target immediate data handling from v4.4 target-core now allowing control CDB payload underflow" * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: iscsi-target: Reject immediate data underflow larger than SCSI transfer length iscsi-target: Fix delayed logout processing greater than SECONDS_FOR_LOGOUT_COMP target: Fix kref->refcount underflow in transport_cmd_finish_abort
2017-06-25tcp: reset sk_rx_dst in tcp_disconnect()WANG Cong1-0/+2
We have to reset the sk->sk_rx_dst when we disconnect a TCP connection, because otherwise when we re-connect it this dst reference is simply overridden in tcp_finish_connect(). This fixes a dst leak which leads to a loopback dev refcnt leak. It is a long-standing bug, Kevin reported a very similar (if not same) bug before. Thanks to Andrei for providing such a reliable reproducer which greatly narrows down the problem. Fixes: 41063e9dd119 ("ipv4: Early TCP socket demux.") Reported-by: Andrei Vagin <[email protected]> Reported-by: Kevin Xu <[email protected]> Signed-off-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-25net: ipv6: reset daddr and dport in sk if connect() failsWei Wang2-2/+9
In __ip6_datagram_connect(), reset sk->sk_v6_daddr and inet->dport if error occurs. In udp_v6_early_demux(), check for sk_state to make sure it is in TCP_ESTABLISHED state. Together, it makes sure unconnected UDP socket won't be considered as a valid candidate for early demux. v3: add TCP_ESTABLISHED state check in udp_v6_early_demux() v2: fix compilation error Fixes: 5425077d73e0 ("net: ipv6: Add early demux handler for UDP unicast") Signed-off-by: Wei Wang <[email protected]> Acked-by: Maciej Żenczykowski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-25bnx2x: Don't log mc removal needlesslyMintz, Yuval1-1/+1
When mc configuration changes bnx2x_config_mcast() can return 0 for success, negative for failure and positive for benign reason preventing its immediate work, e.g., when the command awaits the completion of a previously sent command. When removing all configured macs on a 578xx adapter, if a positive value would be returned driver would errneously log it as an error. Fixes: c7b7b483ccc9 ("bnx2x: Don't flush multicast MACs") Signed-off-by: Yuval Mintz <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-24Merge tag 'kbuild-fixes-v4.12-2' of ↵Linus Torvalds7-12/+21
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: "Nothing scary, just some random fixes: - fix warnings of host programs - fix "make tags" when COMPILED_SOURCE=1 is specified along with O= - clarify help message of C=1 option - fix dependency for ncurses compatibility check - fix "make headers_install" for fakechroot environment" * tag 'kbuild-fixes-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kconfig: fix sparse warnings in nconfig kbuild: fix header installation under fakechroot environment kconfig: Check for libncurses before menuconfig Kbuild: tiny correction on `make help` tags: honor COMPILED_SOURCE with apart output directory genksyms: add printf format attribute to error_with_pos()
2017-06-24Merge branch 'for-linus' of ↵Linus Torvalds1-6/+14
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull timer fix from Eric Biederman: "This fixes an issue of confusing injected signals with the signals from posix timers that has existed since posix timers have been in the kernel. This patch is slightly simpler than my earlier version of this patch as I discovered in testing that I had misspelled "#ifdef CONFIG_POSIX_TIMERS". So I deleted that unnecessary test and made setting of resched_timer uncondtional. I have tested this and verified that without this patch there is a nasty hang that is easy to trigger, and with this patch everything works properly" Thomas Gleixner dixit: "It fixes the problem at hand and covers the ptrace case as well, which I missed. Reviewed-and-tested-by: Thomas Gleixner <[email protected]>" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: signal: Only reschedule timers on signals timers have sent
2017-06-24sched/fair: Remove effective_load()Rik van Riel1-123/+1
The effective_load() function was only used by the NUMA balancing code, and not by the regular load balancing code. Now that the NUMA balancing code no longer uses it either, get rid of it. Signed-off-by: Rik van Riel <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-06-24sched/numa: Implement NUMA node level wake_affine()Rik van Riel1-59/+71
Since select_idle_sibling() can place a task anywhere on a socket, comparing loads between individual CPU cores makes no real sense for deciding whether to do an affine wakeup across sockets, either. Instead, compare the load between the sockets in a similar way the load balancer and the numa balancing code do. Signed-off-by: Rik van Riel <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-06-24sched/fair: Simplify wake_affine() for the single socket caseRik van Riel1-1/+12
Then 'this_cpu' and 'prev_cpu' are in the same socket, select_idle_sibling() will do its thing regardless of the return value of wake_affine(). Just return true and don't look at all the other things. Signed-off-by: Rik van Riel <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-06-24sched/numa: Override part of migrate_degrades_locality() when idle balancingRik van Riel1-0/+4
Several tests in the NAS benchmark seem to run a lot slower with NUMA balancing enabled, than with NUMA balancing disabled. The slower run time corresponds with increased idle time. Overriding the final test of migrate_degrades_locality (but still doing the other NUMA tests first) seems to improve performance of those benchmarks. Reported-by: Jirka Hladky <[email protected]> Signed-off-by: Rik van Riel <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-06-24Merge branch 'linus' into sched/core, to pick up fixesIngo Molnar119-665/+1158
Signed-off-by: Ingo Molnar <[email protected]>
2017-06-24x86/paravirt: Remove unnecessary return from void functionAnton Vasilyev1-1/+1
The patch removes unnecessary return from void function. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Anton Vasilyev <[email protected]> Cc: Alok Kataria <[email protected]> Cc: Chris Wright <[email protected]> Cc: Jeremy Fitzhardinge <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rusty Russell <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-06-24x86/boot: Add missing strchr() declarationTommy Nguyen1-0/+1
The Sparse static analyzer emits this warning: symbol 'strchr' was not declared. Should it be static? This patch adds the appropriate extern declaration to string.h to fix the warning. Signed-off-by: Tommy Nguyen <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/20170623143601.GA20743@NoChina Signed-off-by: Ingo Molnar <[email protected]>
2017-06-24x86/mshyperv: Remove excess #includes from mshyperv.hThomas Gleixner1-2/+1
A recent commit included linux/slab.h in linux/irq.h. This breaks the build of vdso32 on a 64-bit kernel. The reason is that linux/irq.h gets included into the vdso code via linux/interrupt.h which is included from asm/mshyperv.h. That makes the 32-bit vdso compile fail, because slab.h includes the pgtable headers for 64-bit on a 64-bit build. Neither linux/clocksource.h nor linux/interrupt.h are needed in the mshyperv.h header file itself - it has a dependency on <linux/atomic.h>. Remove the includes and unbreak the build. Reported-by: Ingo Molnar <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: K. Y. Srinivasan <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Vitaly Kuznetsov <[email protected]> Cc: [email protected] Fixes: dee863b571b0 ("hv: export current Hyper-V clocksource") Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1706231038460.2647@nanos Signed-off-by: Ingo Molnar <[email protected]>
2017-06-24x86/mmap, ASLR: Do not treat unlimited-stack tasks as legacy mmapMichal Hocko1-3/+0
Since the following commit in 2008: cc503c1b43e0 ("x86: PIE executable randomization") We added a heuristics to treat applications with RLIMIT_STACK configured to unlimited as legacy. This means: a) set the mmap_base to 1/3 of address space + randomization and b) mmap from bottom to top. This makes some sense as it allows the stack to grow really large. On the other hand it reduces the address space usable for default mmaps (without address hint) quite a lot. We have received a bug report that SAP HANA workload has hit into this limitation. We could argue that the user just got what he asked for when setting up the unlimited stack but to be realistic growing stack up to 1/6 TASK_SIZE (allowed by mmap_base) is pretty much unimited in the real life. This would give mmap 20TB of additional address space which is quite nice. Especially when it is much more likely to use that address space than the reserved stack. Digging into the history the original implementation of the randomization: 8817210d4d96 ("[PATCH] x86_64: Flexmap for 32bit and randomized mappings for 64bit") didn't have this restriction. So let's try and remove this assumption - hopefully nothing breaks. Signed-off-by: Michal Hocko <[email protected]> Acked-by: Jiri Kosina <[email protected]> Acked-by: Oleg Nesterov <[email protected]> Cc: Dave Jones <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] [ So I've applied this to tip:x86/mm with a wider Cc: list - if anyone objects to this change please holler. ] Signed-off-by: Ingo Molnar <[email protected]>
2017-06-23Merge tag 'powerpc-4.12-7' of ↵Linus Torvalds7-50/+166
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Some more powerpc fixes for 4.12. Most of these actually came in last week but got held up for some more testing. - three fixes for kprobes/ftrace/livepatch interactions. - properly handle data breakpoints when using the Radix MMU. - fix for perf sampling of registers during call_usermodehelper(). - properly initialise the thread_info on our emergency stacks - add an explicit flush when doing TLB invalidations for a process using NPU2. Thanks to: Alistair Popple, Naveen N. Rao, Nicholas Piggin, Ravi Bangoria, Masami Hiramatsu" * tag 'powerpc-4.12-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64: Initialise thread_info for emergency stacks powerpc/powernv/npu-dma: Add explicit flush when sending an ATSD powerpc/perf: Fix oops when kthread execs user process powerpc/64s: Handle data breakpoints in Radix mode powerpc/kprobes: Skip livepatch_handler() for jprobes powerpc/ftrace: Pass the correct stack pointer for DYNAMIC_FTRACE_WITH_REGS powerpc/kprobes: Pause function_graph tracing during jprobes handling
2017-06-23Merge tag 'acpi-4.12-rc7' of ↵Linus Torvalds2-31/+39
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "This fixes the ACPI-based enumeration of some I2C and SPI devices broken in 4.11. Specifics: - I2C and SPI devices are expected to be enumerated by the I2C and SPI subsystems, respectively, but due to a change made during the 4.11 cycle, in some cases the ACPI core marks them as already enumerated which causes the I2C and SPI subsystems to overlook them, so fix that (Jarkko Nikula)" * tag 'acpi-4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / scan: Fix enumeration for special SPI and I2C devices
2017-06-23Merge branch 'i2c/for-current-fixed' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fix from Wolfram Sang. * 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: imx: Use correct function to write to register