aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-02-06sched/rt: Use container_of() to get root domain in rto_push_irq_work_func()Steven Rostedt (VMware)1-7/+8
When the rto_push_irq_work_func() is called, it looks at the RT overloaded bitmask in the root domain via the runqueue (rq->rd). The problem is that during CPU up and down, nothing here stops rq->rd from changing between taking the rq->rd->rto_lock and releasing it. That means the lock that is released is not the same lock that was taken. Instead of using this_rq()->rd to get the root domain, as the irq work is part of the root domain, we can simply get the root domain from the irq work that is passed to the routine: container_of(work, struct root_domain, rto_push_work) This keeps the root domain consistent. Reported-by: Pavan Kondeti <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Fixes: 4bdced5c9a292 ("sched/rt: Simplify the IPI based RT balancing logic") Link: http://lkml.kernel.org/r/CAEU1=PkiHO35Dzna8EQqNSKW1fr1y1zRQ5y66X117MG06sQtNA@mail.gmail.com Signed-off-by: Ingo Molnar <[email protected]>
2018-02-06sched/core: Optimize update_stats_*()Peter Zijlstra2-16/+20
These functions are already gated by schedstats_enabled(), there is no point in then issuing another static_branch for every individual update in them. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-06sched/core: Optimize ttwu_stat()Peter Zijlstra2-8/+10
The whole of ttwu_stat() is guarded by a single schedstat_enabled(), there is absolutely no point in then issuing another static_branch for every single schedstat_inc() in there. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-05ibmvnic: fix empty firmware version and errors cleanupDesnes Augusto Nunes do Rosario1-10/+4
This patch makes sure that the firmware version is never NULL. Moreover, it also performs some cleanup on the error messages. Fixes: a107311d7fdf ("ibmvnic: fix firmware version when no firmware level has been provided by the VIOS server") Signed-off-by: Desnes A. Nunes do Rosario <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-02-05sctp: fix dst refcnt leak in sctp_v4_get_dstTommi Rantala1-6/+4
Fix dst reference count leak in sctp_v4_get_dst() introduced in commit 410f03831 ("sctp: add routing output fallback"): When walking the address_list, successive ip_route_output_key() calls may return the same rt->dst with the reference incremented on each call. The code would not decrement the dst refcount when the dst pointer was identical from the previous iteration, causing the dst refcnt leak. Testcase: ip netns add TEST ip netns exec TEST ip link set lo up ip link add dummy0 type dummy ip link add dummy1 type dummy ip link add dummy2 type dummy ip link set dev dummy0 netns TEST ip link set dev dummy1 netns TEST ip link set dev dummy2 netns TEST ip netns exec TEST ip addr add 192.168.1.1/24 dev dummy0 ip netns exec TEST ip link set dummy0 up ip netns exec TEST ip addr add 192.168.1.2/24 dev dummy1 ip netns exec TEST ip link set dummy1 up ip netns exec TEST ip addr add 192.168.1.3/24 dev dummy2 ip netns exec TEST ip link set dummy2 up ip netns exec TEST sctp_test -H 192.168.1.2 -P 20002 -h 192.168.1.1 -p 20000 -s -B 192.168.1.3 ip netns del TEST In 4.4 and 4.9 kernels this results to: [ 354.179591] unregister_netdevice: waiting for lo to become free. Usage count = 1 [ 364.419674] unregister_netdevice: waiting for lo to become free. Usage count = 1 [ 374.663664] unregister_netdevice: waiting for lo to become free. Usage count = 1 [ 384.903717] unregister_netdevice: waiting for lo to become free. Usage count = 1 [ 395.143724] unregister_netdevice: waiting for lo to become free. Usage count = 1 [ 405.383645] unregister_netdevice: waiting for lo to become free. Usage count = 1 ... Fixes: 410f03831 ("sctp: add routing output fallback") Fixes: 0ca50d12f ("sctp: fix src address selection if using secondary addresses") Signed-off-by: Tommi Rantala <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-02-05sctp: fix dst refcnt leak in sctp_v6_get_dst()Alexey Kodanev1-3/+7
When going through the bind address list in sctp_v6_get_dst() and the previously found address is better ('matchlen > bmatchlen'), the code continues to the next iteration without releasing currently held destination. Fix it by releasing 'bdst' before continue to the next iteration, and instead of introducing one more '!IS_ERR(bdst)' check for dst_release(), move the already existed one right after ip6_dst_lookup_flow(), i.e. we shouldn't proceed further if we get an error for the route lookup. Fixes: dbc2b5e9a09e ("sctp: fix src address selection if using secondary addresses for ipv6") Signed-off-by: Alexey Kodanev <[email protected]> Acked-by: Neil Horman <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-02-05Merge tag 'xfs-4.16-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds3-11/+24
Pull more xfs updates from Darrick Wong: "As promised, here's a (much smaller) second pull request for the second week of the merge cycle. This time around we have a couple patches shutting off unsupported fs configurations, and a couple of cleanups. Last, we turn off EXPERIMENTAL for the reverse mapping btree, since the primary downstream user of that information (online fsck) is now upstream and I haven't seen any major failures in a few kernel releases. Summary: - Print scrub build status in the xfs build info. - Explicitly call out the remaining two scenarios where we don't support reflink and never have. - Remove EXPERIMENTAL tag from reverse mapping btree!" * tag 'xfs-4.16-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: remove experimental tag for reverse mapping xfs: don't allow reflink + realtime filesystems xfs: don't allow DAX on reflink filesystems xfs: add scrub to XFS_BUILD_OPTIONS xfs: fix u32 type usage in sb validation function
2018-02-05Merge tag 'perf-urgent-for-mingo-4.16-20180205' of ↵Ingo Molnar14-13/+91
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: - Fix 'period' and 'freq' handling for 'perf record', also related: add Add PERF_SAMPLE_PERIOD into PEBS_FREERUNNING_FLAGS in the x86 perf kernel driver (Jiri Olsa) - Fix 'perf trace -i perf.data' callgraph handling (Ravi Bangoria) - Synchronize tooling headers for asound, s390 and powerpc KVM, sched and x86 features (Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2018-02-05Merge branch 'overlayfs-linus' of ↵Linus Torvalds15-409/+1905
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs Pull overlayfs updates from Miklos Szeredi: "This work from Amir adds NFS export capability to overlayfs. NFS exporting an overlay filesystem is a challange because we want to keep track of any copy-up of a file or directory between encoding the file handle and decoding it. This is achieved by indexing copied up objects by lower layer file handle. The index is already used for hard links, this patchset extends the use to NFS file handle decoding" * 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs: (51 commits) ovl: check ERR_PTR() return value from ovl_encode_fh() ovl: fix regression in fsnotify of overlay merge dir ovl: wire up NFS export operations ovl: lookup indexed ancestor of lower dir ovl: lookup connected ancestor of dir in inode cache ovl: hash non-indexed dir by upper inode for NFS export ovl: decode pure lower dir file handles ovl: decode indexed dir file handles ovl: decode lower file handles of unlinked but open files ovl: decode indexed non-dir file handles ovl: decode lower non-dir file handles ovl: encode lower file handles ovl: copy up before encoding non-connectable dir file handle ovl: encode non-indexed upper file handles ovl: decode connected upper dir file handles ovl: decode pure upper file handles ovl: encode pure upper file handles ovl: document NFS export vfs: factor out helpers d_instantiate_anon() and d_alloc_anon() ovl: store 'has_upper' and 'opaque' as bit flags ...
2018-02-05membarrier/selftest: Test private expedited sync core commandMathieu Desnoyers1-0/+73
Test the new MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE and MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED_SYNC_CORE commands. Signed-off-by: Mathieu Desnoyers <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Acked-by: Shuah Khan <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Alan Stern <[email protected]> Cc: Alice Ferrazzi <[email protected]> Cc: Andrea Parri <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Avi Kivity <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Dave Watson <[email protected]> Cc: David Sehr <[email protected]> Cc: Greg Hackmann <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Maged Michael <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Elder <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Russell King <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-05membarrier/arm64: Provide core serializing commandMathieu Desnoyers2-0/+5
Signed-off-by: Mathieu Desnoyers <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Andrea Parri <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Avi Kivity <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Dave Watson <[email protected]> Cc: David Sehr <[email protected]> Cc: Greg Hackmann <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Maged Michael <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Russell King <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-05membarrier/x86: Provide core serializing commandMathieu Desnoyers4-3/+14
There are two places where core serialization is needed by membarrier: 1) When returning from the membarrier IPI, 2) After scheduler updates curr to a thread with a different mm, before going back to user-space, since the curr->mm is used by membarrier to check whether it needs to send an IPI to that CPU. x86-32 uses IRET as return from interrupt, and both IRET and SYSEXIT to go back to user-space. The IRET instruction is core serializing, but not SYSEXIT. x86-64 uses IRET as return from interrupt, which takes care of the IPI. However, it can return to user-space through either SYSRETL (compat code), SYSRETQ, or IRET. Given that SYSRET{L,Q} is not core serializing, we rely instead on write_cr3() performed by switch_mm() to provide core serialization after changing the current mm, and deal with the special case of kthread -> uthread (temporarily keeping current mm into active_mm) by adding a sync_core() in that specific case. Use the new sync_core_before_usermode() to guarantee this. Signed-off-by: Mathieu Desnoyers <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Andrea Parri <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Avi Kivity <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Dave Watson <[email protected]> Cc: David Sehr <[email protected]> Cc: Greg Hackmann <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Maged Michael <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Russell King <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-05membarrier: Provide core serializing command, *_SYNC_COREMathieu Desnoyers5-18/+106
Provide core serializing membarrier command to support memory reclaim by JIT. Each architecture needs to explicitly opt into that support by documenting in their architecture code how they provide the core serializing instructions required when returning from the membarrier IPI, and after the scheduler has updated the curr->mm pointer (before going back to user-space). They should then select ARCH_HAS_MEMBARRIER_SYNC_CORE to enable support for that command on their architecture. Architectures selecting this feature need to either document that they issue core serializing instructions when returning to user-space, or implement their architecture-specific sync_core_before_usermode(). Signed-off-by: Mathieu Desnoyers <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Andrea Parri <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Avi Kivity <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Dave Watson <[email protected]> Cc: David Sehr <[email protected]> Cc: Greg Hackmann <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Maged Michael <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Russell King <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-05lockin/x86: Implement sync_core_before_usermode()Mathieu Desnoyers2-0/+29
Ensure that a core serializing instruction is issued before returning to user-mode. x86 implements return to user-space through sysexit, sysrel, and sysretq, which are not core serializing. Signed-off-by: Mathieu Desnoyers <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Andrea Parri <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Avi Kivity <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Dave Watson <[email protected]> Cc: David Sehr <[email protected]> Cc: Greg Hackmann <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Maged Michael <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Russell King <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-05locking: Introduce sync_core_before_usermode()Mathieu Desnoyers2-0/+24
Introduce an architecture function that ensures the current CPU issues a core serializing instruction before returning to usermode. This is needed for the membarrier "sync_core" command. Architectures defining the sync_core_before_usermode() static inline need to select ARCH_HAS_SYNC_CORE_BEFORE_USERMODE. Signed-off-by: Mathieu Desnoyers <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Andrea Parri <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Avi Kivity <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Dave Watson <[email protected]> Cc: David Sehr <[email protected]> Cc: Greg Hackmann <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Maged Michael <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Russell King <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-05membarrier/selftest: Test global expedited commandMathieu Desnoyers1-5/+54
Test the new MEMBARRIER_CMD_GLOBAL_EXPEDITED and MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED commands. Adapt to the MEMBARRIER_CMD_SHARED -> MEMBARRIER_CMD_GLOBAL rename. Signed-off-by: Mathieu Desnoyers <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Acked-by: Shuah Khan <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Alan Stern <[email protected]> Cc: Alice Ferrazzi <[email protected]> Cc: Andrea Parri <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Avi Kivity <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Dave Watson <[email protected]> Cc: David Sehr <[email protected]> Cc: Greg Hackmann <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Maged Michael <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Elder <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Russell King <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-05membarrier: Provide GLOBAL_EXPEDITED commandMathieu Desnoyers4-18/+153
Allow expedited membarrier to be used for data shared between processes through shared memory. Processes wishing to receive the membarriers register with MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED. Those which want to issue membarrier invoke MEMBARRIER_CMD_GLOBAL_EXPEDITED. This allows extremely simple kernel-level implementation: we have almost everything we need with the PRIVATE_EXPEDITED barrier code. All we need to do is to add a flag in the mm_struct that will be used to check whether we need to send the IPI to the current thread of each CPU. There is a slight downside to this approach compared to targeting specific shared memory users: when performing a membarrier operation, all registered "global" receivers will get the barrier, even if they don't share a memory mapping with the sender issuing MEMBARRIER_CMD_GLOBAL_EXPEDITED. This registration approach seems to fit the requirement of not disturbing processes that really deeply care about real-time: they simply should not register with MEMBARRIER_CMD_REGISTER_GLOBAL_EXPEDITED. In order to align the membarrier command names, the "MEMBARRIER_CMD_SHARED" command is renamed to "MEMBARRIER_CMD_GLOBAL", keeping an alias of MEMBARRIER_CMD_SHARED to MEMBARRIER_CMD_GLOBAL for UAPI header backward compatibility. Signed-off-by: Mathieu Desnoyers <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Andrea Parri <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Avi Kivity <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Dave Watson <[email protected]> Cc: David Sehr <[email protected]> Cc: Greg Hackmann <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Maged Michael <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Russell King <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-05membarrier: Document scheduler barrier requirementsMathieu Desnoyers3-11/+36
Document the membarrier requirement on having a full memory barrier in __schedule() after coming from user-space, before storing to rq->curr. It is provided by smp_mb__after_spinlock() in __schedule(). Document that membarrier requires a full barrier on transition from kernel thread to userspace thread. We currently have an implicit barrier from atomic_dec_and_test() in mmdrop() that ensures this. The x86 switch_mm_irqs_off() full barrier is currently provided by many cpumask update operations as well as write_cr3(). Document that write_cr3() provides this barrier. Signed-off-by: Mathieu Desnoyers <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Andrea Parri <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Avi Kivity <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Dave Watson <[email protected]> Cc: David Sehr <[email protected]> Cc: Greg Hackmann <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Maged Michael <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Russell King <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-05powerpc, membarrier: Skip memory barrier in switch_mm()Mathieu Desnoyers8-11/+58
Allow PowerPC to skip the full memory barrier in switch_mm(), and only issue the barrier when scheduling into a task belonging to a process that has registered to use expedited private. Threads targeting the same VM but which belong to different thread groups is a tricky case. It has a few consequences: It turns out that we cannot rely on get_nr_threads(p) to count the number of threads using a VM. We can use (atomic_read(&mm->mm_users) == 1 && get_nr_threads(p) == 1) instead to skip the synchronize_sched() for cases where the VM only has a single user, and that user only has a single thread. It also turns out that we cannot use for_each_thread() to set thread flags in all threads using a VM, as it only iterates on the thread group. Therefore, test the membarrier state variable directly rather than relying on thread flags. This means membarrier_register_private_expedited() needs to set the MEMBARRIER_STATE_PRIVATE_EXPEDITED flag, issue synchronize_sched(), and only then set MEMBARRIER_STATE_PRIVATE_EXPEDITED_READY which allows private expedited membarrier commands to succeed. membarrier_arch_switch_mm() now tests for the MEMBARRIER_STATE_PRIVATE_EXPEDITED flag. Signed-off-by: Mathieu Desnoyers <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Alan Stern <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Andrea Parri <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Avi Kivity <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Dave Watson <[email protected]> Cc: David Sehr <[email protected]> Cc: Greg Hackmann <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Maged Michael <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Russell King <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-05membarrier/selftest: Test private expedited commandMathieu Desnoyers1-16/+95
Test the new MEMBARRIER_CMD_PRIVATE_EXPEDITED and MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED commands. Add checks expecting specific error values on system calls expected to fail. Signed-off-by: Mathieu Desnoyers <[email protected]> Acked-by: Thomas Gleixner <[email protected]> Acked-by: Shuah Khan <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Alan Stern <[email protected]> Cc: Alice Ferrazzi <[email protected]> Cc: Andrea Parri <[email protected]> Cc: Andrew Hunter <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Avi Kivity <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Dave Watson <[email protected]> Cc: David Sehr <[email protected]> Cc: Greg Hackmann <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Maged Michael <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Paul Elder <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Russell King <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-02-05Merge tag 'rproc-v4.16' of git://github.com/andersson/remoteprocLinus Torvalds11-204/+107
Pull remoteproc updates from Bjorn Andersson: "This contains a few bug fixes and a cleanup up of the resource-table handling in the framework, which removes the need for drivers with no resource table to provide a fake one" * tag 'rproc-v4.16' of git://github.com/andersson/remoteproc: remoteproc: Reset table_ptr on stop remoteproc: Drop dangling find_rsc_table dummies remoteproc: Move resource table load logic to find remoteproc: Don't handle empty resource table remoteproc: Merge rproc_ops and rproc_fw_ops remoteproc: Clone rproc_ops in rproc_alloc() remoteproc: Cache resource table size remoteproc: Remove depricated crash completion virtio_remoteproc: correct put_device virtio_device.dev
2018-02-05Merge tag 'rpmsg-v4.16' of git://github.com/andersson/remoteprocLinus Torvalds4-19/+55
Pull rpmsg updates from Bjorn Andersson: "This fixes a few issues found in the SMD and GLINK drivers and corrects the handling of SMD channels that are found in an (previously) unexpected state" * tag 'rpmsg-v4.16' of git://github.com/andersson/remoteproc: rpmsg: smd: Fix double unlock in __qcom_smd_send() rpmsg: glink: Fix missing mutex_init() in qcom_glink_alloc_channel() rpmsg: smd: Don't hold the tx lock during wait rpmsg: smd: Fail send on a closed channel rpmsg: smd: Wake up all waiters rpmsg: smd: Create device for all channels rpmsg: smd: Perform handshake during open rpmsg: glink: smem: Ensure ordering during tx drivers: rpmsg: remove duplicate includes remoteproc: qcom: Use PTR_ERR_OR_ZERO() in glink prob
2018-02-05Merge tag 'mmc-v4.16-2' of ↵Linus Torvalds3-10/+161
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC host fixes from Ulf Hansson: - renesas_sdhi: Fix build error in case NO_DMA=y - sdhci: Implement a bounce buffer to address throughput regressions * tag 'mmc-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: MMC_SDHI_{SYS,INTERNAL}_DMAC should depend on HAS_DMA mmc: sdhci: Implement an SDHCI-specific bounce buffer
2018-02-05Merge tag 'pwm/for-4.16-rc1' of ↵Linus Torvalds4-1/+30
git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm Pull pwm updates from Thierry Reding: "The Meson PWM controller driver gains support for the AXG series and a minor bug is fixed for the STMPE driver. To round things off, the class is now set for PWM channels exported via sysfs which allows non-root access, provided that the system has been configured accordingly" * tag 'pwm/for-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: pwm: meson: Add clock source configuration for Meson-AXG dt-bindings: pwm: Update bindings for the Meson-AXG pwm: stmpe: Fix wrong register offset for hwpwm=2 case pwm: Set class for exported channels in sysfs
2018-02-05net: mediatek: Explicitly include pinctrl headersThierry Reding1-0/+1
The Mediatek ethernet driver fails to build after commit 23c35f48f5fb ("pinctrl: remove include file from <linux/device.h>") because it relies on the pinctrl/consumer.h and pinctrl/devinfo.h being pulled in by the device.h header implicitly. Include these headers explicitly to avoid the build failure. Cc: Linus Walleij <[email protected]> Signed-off-by: Thierry Reding <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-02-05mmc: meson-gx-mmc: Explicitly include pinctr/consumer.hThierry Reding1-0/+1
The Meson GX MMC driver fails to build after commit 23c35f48f5fb ("pinctrl: remove include file from <linux/device.h>") because it relies on the pinctrl/consumer.h being pulled in by the device.h header implicitly. Include the header explicitly to avoid the build failure. Cc: Linus Walleij <[email protected]> Signed-off-by: Thierry Reding <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-02-05drm/rockchip: lvds: Explicitly include pinctrl headersThierry Reding1-0/+1
The Rockchip LVDS driver fails to build after commit 23c35f48f5fb ("pinctrl: remove include file from <linux/device.h>") because it relies on the pinctrl/consumer.h and pinctrl/devinfo.h being pulled in by the device.h header implicitly. Include these headers explicitly to avoid the build failure. Cc: Linus Walleij <[email protected]> Signed-off-by: Thierry Reding <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-02-05pinctrl: files should directly include apis they useStephen Rothwell1-0/+1
Fixes: 23c35f48f5fb ("pinctrl: remove include file from <linux/device.h>") Signed-off-by: Stephen Rothwell <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-02-05perf tools: Add trace/beauty/generated/ into .gitignoreRavi Bangoria1-0/+1
No functionality changes. Signed-off-by: Ravi Bangoria <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-02-05perf trace: Fix call-graph outputRavi Bangoria1-1/+4
Recently, Arnaldo fixed global vs event specific --max-stack usage with commit bd3dda9ab0fb ("perf trace: Allow overriding global --max-stack per event"). This commit is having a regression when we don't use --max-stack at all with perf trace. Ex, $ ./perf trace record -g ls $ ./perf trace -i perf.data 0.076 ( 0.002 ms): ls/9109 brk( 0.196 ( 0.008 ms): ls/9109 access(filename: 0x9f998b70, mode: R 0.209 ( 0.031 ms): ls/9109 open(filename: 0x9f998978, flags: CLOEXEC This is missing call-traces. After patch: $ ./perf trace -i perf.data 0.076 ( 0.002 ms): ls/9109 brk( do_syscall_trace_leave ([kernel.kallsyms]) [0] ([unknown]) syscall_exit_work ([kernel.kallsyms]) brk (/usr/lib64/ld-2.17.so) _dl_sysdep_start (/usr/lib64/ld-2.17.so) _dl_start_final (/usr/lib64/ld-2.17.so) _dl_start (/usr/lib64/ld-2.17.so) _start (/usr/lib64/ld-2.17.so) 0.196 ( 0.008 ms): ls/9109 access(filename: 0x9f998b70, mode: R do_syscall_trace_leave ([kernel.kallsyms]) [0] ([unknown]) Signed-off-by: Ravi Bangoria <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Fixes: bd3dda9ab0fb ("perf trace: Allow overriding global --max-stack per event") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-02-05x86/events/intel/ds: Add PERF_SAMPLE_PERIOD into PEBS_FREERUNNING_FLAGSJiri Olsa1-1/+2
Stephane reported that we don't support period for enabling large PEBS data, which there's no reason for. Adding PERF_SAMPLE_PERIOD into freerunning flags. Tested it with: # perf record -e cycles:P -c 100 --no-timestamp -C 0 --period Reported-by: Stephane Eranian <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Tested-by: Kan Liang <[email protected]> Tested-by: Stephane Eranian <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-02-05net/mlx5: increase async EQ to avoid EQ overrunMax Gurtovoy1-1/+1
Currently the async EQ has 256 entries only. It might not be big enough for the SW to handle all the needed pending events. For example, in case of many QPs (let's say 1024) connected to a SRQ created using NVMeOF target and the target goes down, the FW will raise 1024 "last WQE reached" events and may cause EQ overrun. Increase the EQ to more reasonable size, that beyond it the FW should be able to delay the event and raise it later on using internal backpressure mechanism. Signed-off-by: Max Gurtovoy <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-02-05mlx5: fix mlx5_get_vector_affinity to start from completion vector 0Sagi Grimberg1-1/+1
The consumers of this routine expects the affinity map of of vector index relative to the first completion vector. The upper layers are not aware of internal/private completion vectors that mlx5 allocates for its own usage. Hence, return the affinity map of vector index relative to the first completion vector. Fixes: 05e0cc84e00c ("net/mlx5: Fix get vector affinity helper function") Reported-by: Logan Gunthorpe <[email protected]> Tested-by: Max Gurtovoy <[email protected]> Reviewed-by: Max Gurtovoy <[email protected]> Cc: <[email protected]> # v4.15 Signed-off-by: Sagi Grimberg <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-02-05RDMA/hns: Fix the endian problem for hnsoulijun8-320/+363
The hip06 and hip08 run on a little endian ARM, it needs to revise the annotations to indicate that the HW uses little endian data in the various DMA buffers, and flow the necessary swaps throughout. The imm_data use big endian mode. The cpu_to_le32/le32_to_cpu swaps are no-op for this, which makes the only substantive change the handling of imm_data which is now mandatory swapped. This also keep match with the userspace hns driver and resolve the warning by sparse. Signed-off-by: Lijun Ou <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-02-05perf record: Fix period option handlingJiri Olsa3-4/+11
Stephan reported we don't unset PERIOD sample type when --no-period is specified. Adding the unset check and reset PERIOD if --no-period is specified. Committer notes: Check the sample_type, it shouldn't have PERF_SAMPLE_PERIOD there when --no-period is used. Before: # perf record --no-period sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.018 MB perf.data (7 samples) ] # perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 # After: [root@jouet ~]# perf record --no-period sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.019 MB perf.data (17 samples) ] [root@jouet ~]# perf evlist -v cycles:ppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 [root@jouet ~]# Reported-by: Stephane Eranian <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Tested-by: Stephane Eranian <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-02-05perf evsel: Fix period/freq terms setupJiri Olsa1-0/+2
Stephane reported that we don't set properly PERIOD sample type for events with period term defined. Before: $ perf record -e cpu/cpu-cycles,period=1000/u ls $ perf evlist -v cpu/cpu-cycles,period=1000/u: ... sample_type: IP|TID|TIME|PERIOD, ... After: $ perf record -e cpu/cpu-cycles,period=1000/u ls $ perf evlist -v cpu/cpu-cycles,period=1000/u: ... sample_type: IP|TID|TIME, ... Setting PERIOD sample type based on period term setup. Committer note: When we use -c or a period=N term in the event definition, then we don't need to ask the kernel, for this event, via perf_event_attr.sample_type |= PERF_SAMPLE_PERIOD, to put the event period in each sample for this event, as we know it already, it is in perf_event_attr.sample_period. Reported-by: Stephane Eranian <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Tested-by: Stephane Eranian <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-02-05dwc-xlgmac: remove Jie Deng as co-maintainerJie Deng1-1/+0
Jose Abreu is working on this driver and I will leave Synopsys soon. Thus it does not seem appropriate for me to be a co-maintainer anymore. Signed-off-by: Jie Deng <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-02-05doc: Change the min default value of tcp_wmem/tcp_rmem.Tonghao Zhang1-2/+2
The SK_MEM_QUANTUM was changed from PAGE_SIZE to 4096. And the tcp_wmem/tcp_rmem min default values are 4096. Fixes: bd68a2a854ad ("net: set SK_MEM_QUANTUM to 4096") Cc: Eric Dumazet <[email protected]> Signed-off-by: Tonghao Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-02-05ovl: check ERR_PTR() return value from ovl_encode_fh()Amir Goldstein1-0/+3
Another fix for an issue reported by 0-day robot. Reported-by: Dan Carpenter <[email protected]> Fixes: 8ed5eec9d6c4 ("ovl: encode pure upper file handles") Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2018-02-05ovl: fix regression in fsnotify of overlay merge dirAmir Goldstein1-0/+2
A re-factoring patch in NFS export series has passed the wrong argument to ovl_get_inode() causing a regression in the very recent fix to fsnotify of overlay merge dir. The regression has caused merge directory inodes to be hashed by upper instead of lower real inode, when NFS export and directory indexing is disabled. That caused an inotify watch to become obsolete after directory copy up and drop caches. LTP test inotify07 was improved to catch this regression. The regression also caused multiple redirect dirs to same origin not to be detected on lookup with NFS export disabled. An xfstest was added to cover this case. Fixes: 0aceb53e73be ("ovl: do not pass overlay dentry to ovl_get_inode()") Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2018-02-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller28-141/+1716
Alexei Starovoitov says: ==================== pull-request: bpf 2018-02-02 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) support XDP attach in libbpf, from Eric. 2) minor fixes, from Daniel, Jakub, Yonghong, Alexei. ==================== Signed-off-by: David S. Miller <[email protected]>
2018-02-04Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds38-554/+977
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull spectre/meltdown updates from Thomas Gleixner: "The next round of updates related to melted spectrum: - The initial set of spectre V1 mitigations: - Array index speculation blocker and its usage for syscall, fdtable and the n180211 driver. - Speculation barrier and its usage in user access functions - Make indirect calls in KVM speculation safe - Blacklisting of known to be broken microcodes so IPBP/IBSR are not touched. - The initial IBPB support and its usage in context switch - The exposure of the new speculation MSRs to KVM guests. - A fix for a regression in x86/32 related to the cpu entry area - Proper whitelisting for known to be safe CPUs from the mitigations. - objtool fixes to deal proper with retpolines and alternatives - Exclude __init functions from retpolines which speeds up the boot process. - Removal of the syscall64 fast path and related cleanups and simplifications - Removal of the unpatched paravirt mode which is yet another source of indirect unproteced calls. - A new and undisputed version of the module mismatch warning - A couple of cleanup and correctness fixes all over the place Yet another step towards full mitigation. There are a few things still missing like the RBS underflow mitigation for Skylake and other small details, but that's being worked on. That said, I'm taking a belated christmas vacation for a week and hope that everything is magically solved when I'm back on Feb 12th" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits) KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES KVM/x86: Add IBPB support KVM/x86: Update the reverse_cpuid list to include CPUID_7_EDX x86/speculation: Fix typo IBRS_ATT, which should be IBRS_ALL x86/pti: Mark constant arrays as __initconst x86/spectre: Simplify spectre_v2 command line parsing x86/retpoline: Avoid retpolines for built-in __init functions x86/kvm: Update spectre-v1 mitigation KVM: VMX: make MSR bitmaps per-VCPU x86/paravirt: Remove 'noreplace-paravirt' cmdline option x86/speculation: Use Indirect Branch Prediction Barrier in context switch x86/cpuid: Fix up "virtual" IBRS/IBPB/STIBP feature bits on Intel x86/spectre: Fix spelling mistake: "vunerable"-> "vulnerable" x86/spectre: Report get_user mitigation for spectre_v1 nl80211: Sanitize array index in parse_txq_params vfs, fdtable: Prevent bounds-check bypass via speculative execution x86/syscall: Sanitize syscall table de-references under speculation x86/get_user: Use pointer masking to limit speculation ...
2018-02-04Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds6-6/+13
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A small set of changes: - a fixup for kexec related to 5-level paging mode. That covers most of the cases except kexec from a 5-level kernel to a 4-level kernel. The latter needs more work and is going to come in 4.17 - two trivial fixes for build warnings triggered by LTO and gcc-8" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/power: Fix swsusp_arch_resume prototype x86/dumpstack: Avoid uninitlized variable x86/kexec: Make kexec (mostly) work in 5-level paging mode
2018-02-04Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds4-6/+7
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "Two small changes: - a fix for a interrupt regression caused by the vector management changes in 4.15 affecting museum pieces which rely on interrupt probing for legacy (e.g. parallel port) devices. One of the startup calls in the autoprobe code was not changed to the new activate_and_startup() function resulting in a warning and as a consequence failing to discover the device interrupt. - a trivial update to the copyright/license header of the STM32 irq chip driver" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq: Make legacy autoprobing work again irqchip/stm32: Fix copyright
2018-02-04Merge tag 'for-linus-20180204' of git://git.kernel.dk/linux-blockLinus Torvalds13-30/+83
Pull more block updates from Jens Axboe: "Most of this is fixes and not new code/features: - skd fix from Arnd, fixing a build error dependent on sla allocator type. - blk-mq scheduler discard merging fixes, one from me and one from Keith. This fixes a segment miscalculation for blk-mq-sched, where we mistakenly think two segments are physically contigious even though the request isn't carrying real data. Also fixes a bio-to-rq merge case. - Don't re-set a bit on the buffer_head flags, if it's already set. This can cause scalability concerns on bigger machines and workloads. From Kemi Wang. - Add BLK_STS_DEV_RESOURCE return value to blk-mq, allowing us to distuingish between a local (device related) resource starvation and a global one. The latter might happen without IO being in flight, so it has to be handled a bit differently. From Ming" * tag 'for-linus-20180204' of git://git.kernel.dk/linux-block: block: skd: fix incorrect linux/slab_def.h inclusion buffer: Avoid setting buffer bits that are already set blk-mq-sched: Enable merging discard bio into request blk-mq: fix discard merge with scheduler attached blk-mq: introduce BLK_STS_DEV_RESOURCE
2018-02-04Merge tag 'ntb-4.16' of git://github.com/jonmason/ntbLinus Torvalds14-1954/+3550
Pull NTB updates from Jon Mason: "Bug fixes galore, removal of the ntb atom driver, and updates to the ntb tools and tests to support the multi-port interface" * tag 'ntb-4.16' of git://github.com/jonmason/ntb: (37 commits) NTB: ntb_perf: fix cast to restricted __le32 ntb_perf: Fix an error code in perf_copy_chunk() ntb_hw_switchtec: Make function switchtec_ntb_remove() static NTB: ntb_tool: fix memory leak on 'buf' on error exit path NTB: ntb_perf: fix printing of resource_size_t NTB: ntb_hw_idt: Set NTB_TOPO_SWITCH topology NTB: ntb_test: Update ntb_perf tests NTB: ntb_test: Update ntb_tool MW tests NTB: ntb_test: Add ntb_tool Message tests NTB: ntb_test: Update ntb_tool Scratchpad tests NTB: ntb_test: Update ntb_tool DB tests NTB: ntb_test: Update ntb_tool link tests NTB: ntb_test: Add ntb_tool port tests NTB: ntb_test: Safely use paths with whitespace NTB: ntb_perf: Add full multi-port NTB API support NTB: ntb_tool: Add full multi-port NTB API support NTB: ntb_pp: Add full multi-port NTB API support NTB: Fix UB/bug in ntb_mw_get_align() NTB: Set dma mask and dma coherent mask to NTB devices NTB: Rename NTB messaging API methods ...
2018-02-04Merge tag 'mailbox-v4.16' of ↵Linus Torvalds3-16/+51
git://git.linaro.org/landing-teams/working/fujitsu/integration Pull mailbox updates from Jassi Brar: "Misc driver changes only: - TI-MsgMgr: Fix print format for a printk - TI-MSgMgr: SPDX license switch for the driver - QCOM-IPC: Convert driver to use regmap - QCOM-IPC: Spawn sibling clock device from mailbox driver" * tag 'mailbox-v4.16' of git://git.linaro.org/landing-teams/working/fujitsu/integration: dt-bindings: mailbox: qcom: Document the APCS clock binding mailbox: qcom: Create APCS child device for clock controller mailbox: qcom: Convert APCS IPC driver to use regmap mailbox: ti-msgmgr: Use %zu for size_t print format mailbox: ti-msgmgr: Switch to SPDX Licensing
2018-02-04Merge branch 'i2c/for-4.16' of ↵Linus Torvalds45-1012/+1381
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c updates from Wolfram Sang: "I2C has the following changes for you: - new flag to mark DMA safe buffers in i2c_msg. Also, some infrastructure around it. And docs. - huge refactoring of the at24 driver led by the new maintainer Bartosz - update I2C bus recovery to send STOP after recovery - conversion from gpio to gpiod for I2C bus recovery - adding a fault-injector to the i2c-gpio driver - lots of small driver improvements, and bigger ones to i2c-sh_mobile" * 'i2c/for-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (99 commits) i2c: mv64xxx: Add myself as maintainer for this driver i2c: mv64xxx: Fix clock resource by adding an optional bus clock i2c: mv64xxx: Remove useless test before clk_disable_unprepare i2c: mxs: use true and false for boolean values i2c: meson: update doc description to fix build warnings i2c: meson: add configurable divider factors dt-bindings: i2c: update documentation for the Meson-AXG i2c: imx-lpi2c: add runtime pm support i2c: rcar: fix some trivial typos in comments i2c: davinci: fix the cpufreq transition i2c: rk3x: add proper kerneldoc header i2c: rk3x: account for const type of of_device_id.data i2c: acorn: remove outdated path from file header i2c: acorn: add MODULE_LICENSE tag i2c: rcar: implement bus recovery i2c: send STOP after successful bus recovery i2c: ensure SDA is released in recovery if SDA is controllable i2c: add 'set_sda' to bus_recovery_info i2c: add identifier in declarations for i2c_bus_recovery i2c: make kerneldoc about bus recovery more precise ...
2018-02-04Merge tag 'fscrypt_for_linus' of ↵Linus Torvalds17-500/+500
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt Pull fscrypt updates from Ted Ts'o: "Refactor support for encrypted symlinks to move common code to fscrypt" Ted also points out about the merge: "This makes the f2fs symlink code use the fscrypt_encrypt_symlink() from the fscrypt tree. This will end up dropping the kzalloc() -> f2fs_kzalloc() change, which means the fscrypt-specific allocation won't get tested by f2fs's kmalloc error injection system; which is fine" * tag 'fscrypt_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/fscrypt: (26 commits) fscrypt: fix build with pre-4.6 gcc versions fscrypt: remove 'ci' parameter from fscrypt_put_encryption_info() fscrypt: document symlink length restriction fscrypt: fix up fscrypt_fname_encrypted_size() for internal use fscrypt: define fscrypt_fname_alloc_buffer() to be for presented names fscrypt: calculate NUL-padding length in one place only fscrypt: move fscrypt_symlink_data to fscrypt_private.h fscrypt: remove fscrypt_fname_usr_to_disk() ubifs: switch to fscrypt_get_symlink() ubifs: switch to fscrypt ->symlink() helper functions ubifs: free the encrypted symlink target f2fs: switch to fscrypt_get_symlink() f2fs: switch to fscrypt ->symlink() helper functions ext4: switch to fscrypt_get_symlink() ext4: switch to fscrypt ->symlink() helper functions fscrypt: new helper function - fscrypt_get_symlink() fscrypt: new helper functions for ->symlink() fscrypt: trim down fscrypt.h includes fscrypt: move fscrypt_is_dot_dotdot() to fs/crypto/fname.c fscrypt: move fscrypt_valid_enc_modes() to fscrypt_private.h ...
2018-02-04IB/uverbs: Use the standard kConfig format for experimentalJason Gunthorpe1-1/+1
We really don't want people turning this on just yet, make it very clear with capital letters. Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Doug Ledford <[email protected]>