aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-10-18Revert "kprobes: Warn if optprobe handler tries to change execution path"Naveen N. Rao1-4/+1
This reverts commit: e863d539614641 ("kprobes: Warn if optprobe handler tries to change execution path") On PowerPC, we place a probe at kretprobe_trampoline to catch function returns and with CONFIG_OPTPROBES=y, this probe gets optimized. This works for us due to the way we handle the optprobe as described in commit: 762df10bad6954 ("powerpc/kprobes: Optimize kprobe in kretprobe_trampoline()") With the above commit, we end up with a warning. As such, revert this change. Reported-by: Michael Ellerman <[email protected]> Signed-off-by: Naveen N. Rao <[email protected]> Cc: Ananth N Mavinakayanahalli <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-10-18perf test shell trace+probe_libc_inet_pton.sh: Be compatible with Debian/UbuntuLi Zhijian1-3/+6
In debian/ubuntu, libc.so is located at a different place, /lib/x86_64-linux-gnu/libc-2.23.so, so it outputs like this when testing: PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.040 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.040/0.040/0.040/0.000 ms 0.000 probe_libc:inet_pton:(7f0e2db741c0)) __GI___inet_pton (/lib/x86_64-linux-gnu/libc-2.23.so) getaddrinfo (/lib/x86_64-linux-gnu/libc-2.23.so) [0xffffa9d40f34ff4d] (/bin/ping) Fix up the libc path to make sure this test works in more OSes. Committer testing: When this test fails one can use 'perf test -v', i.e. in verbose mode, where it'll show the expected backtrace, so, after applying this test: On Fedora 26: # perf test -v ping 62: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 23322 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.058 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.058/0.058/0.058/0.000 ms 0.000 probe_libc:inet_pton:(7fe344310d80)) __GI___inet_pton (/usr/lib64/libc-2.25.so) getaddrinfo (/usr/lib64/libc-2.25.so) _init (/usr/bin/ping) test child finished with 0 ---- end ---- probe libc's inet_pton & backtrace it with ping: Ok # Signed-off-by: Li Zhijian <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Kim Phillips <[email protected]> Cc: Li Zhijian <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Philip Li <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-10-18perf xyarray: Fix wrong processing when closing evsel fdJin Yao1-2/+2
In current xyarray code, xyarray__max_x() returns max_y, and xyarray__max_y() returns max_x. It's confusing and for code logic it looks not correct. Error happens when closing evsel fd. Let's see this scenario: 1. Allocate an fd (pseudo-code) perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthreads) { evsel->fd = xyarray__new(ncpus, nthreads, sizeof(int)); } xyarray__new(int xlen, int ylen, size_t entry_size) { size_t row_size = ylen * entry_size; struct xyarray *xy = zalloc(sizeof(*xy) + xlen * row_size); xy->entry_size = entry_size; xy->row_size = row_size; xy->entries = xlen * ylen; xy->max_x = xlen; xy->max_y = ylen; ...... } So max_x is ncpus, max_y is nthreads and row_size = nthreads * 4. 2. Use perf syscall and get the fd int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus, struct thread_map *threads) { for (cpu = 0; cpu < cpus->nr; cpu++) { for (thread = 0; thread < nthreads; thread++) { int fd, group_fd; fd = sys_perf_event_open(&evsel->attr, pid, cpus->map[cpu], group_fd, flags); FD(evsel, cpu, thread) = fd; } } static inline void *xyarray__entry(struct xyarray *xy, int x, int y) { return &xy->contents[x * xy->row_size + y * xy->entry_size]; } These codes don't have issues. The issue happens in the closing of fd. 3. Close fd. void perf_evsel__close_fd(struct perf_evsel *evsel) { int cpu, thread; for (cpu = 0; cpu < xyarray__max_x(evsel->fd); cpu++) for (thread = 0; thread < xyarray__max_y(evsel->fd); ++thread) { close(FD(evsel, cpu, thread)); FD(evsel, cpu, thread) = -1; } } Since xyarray__max_x() returns max_y (nthreads) and xyarry__max_y() returns max_x (ncpus), so above code is actually to be: for (cpu = 0; cpu < nthreads; cpu++) for (thread = 0; thread < ncpus; ++thread) { close(FD(evsel, cpu, thread)); FD(evsel, cpu, thread) = -1; } It's not correct! This change is introduced by "475fb533fb7d" ("perf evsel: Fix buffer overflow while freeing events") This fix is to let xyarray__max_x() return max_x (ncpus) and let xyarry__max_y() return max_y (nthreads) Committer note: This was also fixed by Ravi Bangoria, who provided the same patch, noticing the problem with 'perf record': <quote Ravi> I see 'perf record -p <pid>' crashes with following log: *** Error in `./perf': free(): invalid next size (normal): 0x000000000298b340 *** ======= Backtrace: ========= /lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f7fd85c87e5] /lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7f7fd85d137a] /lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f7fd85d553c] ./perf(perf_evsel__close+0xb4)[0x4b7614] ./perf(perf_evlist__delete+0x100)[0x4ab180] ./perf(cmd_record+0x1d9)[0x43a5a9] ./perf[0x49aa2f] ./perf(main+0x631)[0x427841] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f7fd8571830] ./perf(_start+0x29)[0x427a59] </> Signed-off-by: Jin Yao <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Fixes: d74be4767367 ("perf xyarray: Save max_x, max_y") Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/1508327446-15302-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-10-18Merge tag 'enforcement-4.14-rc6' of ↵Linus Torvalds2-0/+148
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull enforcement policy update from Greg KH: "Documentation: Add a file explaining the requested Linux kernel license enforcement policy Here's a new file to the kernel's Documentation directory. It adds a short document describing the views of how the Linux kernel community feels about enforcing the license of the kernel. The patch has been reviewed by a large number of kernel developers already, as seen by their acks on the patch, and their agreement of the statement with their names on it. The location of the file was also agreed upon by the Documentation maintainer, so all should be good there. For some background information about this statement, see this article written by some of the kernel developers involved in drafting it: http://kroah.com/log/blog/2017/10/16/linux-kernel-community-enforcement-statement/ and this article that answers a number of questions that came up in the discussion of this statement with the kernel developer community: http://kroah.com/log/blog/2017/10/16/linux-kernel-community-enforcement-statement-faq/ If anyone has any further questions about it, please let me, and the TAB members, know and we will be glad to help answer them" * tag 'enforcement-4.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: Documentation: Add a file explaining the Linux kernel license enforcement policy
2017-10-18Merge branch 'for-linus' of ↵Linus Torvalds2-0/+5
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Martin Schwidefsky: "Two bug fixes: - A fix for cputime accounting vs CPU hotplug - Add two options to zfcpdump_defconfig to make SCSI dump work again" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: fix zfcpdump-config s390/cputime: fix guest/irq/softirq times after CPU hotplug
2017-10-18Merge tag 'trace-v4.14-rc3' of ↵Linus Torvalds1-3/+11
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "Testing a new trace event format, I triggered a bug by doing: # modprobe trace-events-sample # echo 1 > /sys/kernel/debug/tracing/events/sample-trace/enable # rmmod trace-events-sample This would cause an oops. The issue is that I added another trace event sample that reused a reg function of another trace event to create a thread to call the tracepoints. The problem was that the reg function couldn't handle nested calls (reg; reg; unreg; unreg;) and created two threads (instead of one) and only removed one on exit. This isn't a critical bug as the bug is only in sample code. But sample code should be free of known bugs to prevent others from copying it. This is why this is also marked for stable" * tag 'trace-v4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing/samples: Fix creation and deletion of simple_thread_fn creation
2017-10-18ALSA: hda - Fix incorrect TLV callback check introduced during set_fs() removalTakashi Iwai3-42/+89
The commit 99b5c5bb9a54 ("ALSA: hda - Remove the use of set_fs()") converted the get_kctl_0dB_offset() call for killing set_fs() usage in HD-audio codec code. The conversion assumed that the TLV callback used in HD-audio code is only snd_hda_mixer_amp() and applies the TLV calculation locally. Although this assumption is correct, and all slave kctls are actually with that callback, the current code is still utterly buggy; it doesn't hit this condition and falls back to the next check. It's because the function gets called after adding slave kctls to vmaster. By assigning a slave kctl, the slave kctl object is faked inside vmaster code, and the whole kctl ops are overridden. Thus the callback op points to a different value from what we've assumed. More badly, as reported by the KERNEXEC and UDEREF features of PaX, the code flow turns into the unexpected pitfall. The next fallback check is SNDRV_CTL_ELEM_ACCESS_TLV_READ access bit, and this always hits for each kctl with TLV. Then it evaluates the callback function pointer wrongly as if it were a TLV array. Although currently its side-effect is fairly limited, this incorrect reference may lead to an unpleasant result. For addressing the regression, this patch introduces a new helper to vmaster code, snd_ctl_apply_vmaster_slaves(). This works similarly like the existing map_slaves() in hda_codec.c: it loops over the slave list of the given master, and applies the given function to each slave. Then the initializer function receives the right kctl object and we can compare the correct pointer instead of the faked one. Also, for catching the similar breakage in future, give an error message when the unexpected TLV callback is found and bail out immediately. Fixes: 99b5c5bb9a54 ("ALSA: hda - Remove the use of set_fs()") Reported-by: PaX Team <[email protected]> Cc: <[email protected]> # v4.13 Signed-off-by: Takashi Iwai <[email protected]>
2017-10-18ALSA: hda: Remove superfluous '-' added by printk conversionTakashi Iwai1-1/+1
While converting the error messages to the standard macros in the commit 4e76a8833fac ("ALSA: hda - Replace with standard printk"), a superfluous '-' slipped in the code mistakenly. Its influence is almost negligible, merely shows a dB value as negative integer instead of positive integer (or vice versa) in the rare error message. So let's kill this embarrassing byte to show more correct value. Fixes: 4e76a8833fac ("ALSA: hda - Replace with standard printk") Cc: <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2017-10-18ALSA: hda: Abort capability probe at invalid register readTakashi Iwai1-0/+5
The loop in snd_hdac_bus_parse_capabilities() may go to nirvana when it hits an invalid register value read: BUG: unable to handle kernel paging request at ffffad5dc41f3fff IP: pci_azx_readl+0x5/0x10 [snd_hda_intel] Call Trace: snd_hdac_bus_parse_capabilities+0x3c/0x1f0 [snd_hda_core] azx_probe_continue+0x7d5/0x940 [snd_hda_intel] ..... This happened on a new Intel machine, and we need to check the value and abort the loop accordingly. [Note: the fixes tag below indicates only the commit where this patch can be applied; the original problem was introduced even before that commit] Fixes: 6720b38420a0 ("ALSA: hda - move bus_parse_capabilities to core") Cc: <[email protected]> Acked-by: Vinod Koul <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2017-10-18ALSA: seq: Enable 'use' locking in all configurationsBen Hutchings2-16/+0
The 'use' locking macros are no-ops if neither SMP or SND_DEBUG is enabled. This might once have been OK in non-preemptible configurations, but even in that case snd_seq_read() may sleep while relying on a 'use' lock. So always use the proper implementations. Cc: [email protected] Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2017-10-18Revert "tools/power turbostat: stop migrating, unless '-m'"Len Brown1-9/+1
This reverts commit c91fc8519d87715a3a173475ea3778794c139996. That change caused a C6 and PC6 residency regression on large idle systems. Users also complained about new output indicating jitter: turbostat: cpu6 jitter 3794 9142 Signed-off-by: Len Brown <[email protected]> Cc: 4.13+ <[email protected]> # v4.13+ Signed-off-by: Rafael J. Wysocki <[email protected]>
2017-10-17Merge tag 'scsi-fixes' of ↵Linus Torvalds5-5/+15
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Four mostly error leg fixes and one more important regression in a prior commit (the qla2xxx one)" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: fc: check for rport presence in fc_block_scsi_eh scsi: qla2xxx: Fix uninitialized work element scsi: libiscsi: fix shifting of DID_REQUEUE host byte scsi: libfc: fix a deadlock in fc_rport_work scsi: fixup kernel warning during rmmod()
2017-10-17tracing/samples: Fix creation and deletion of simple_thread_fn creationSteven Rostedt (VMware)1-3/+11
Commit 7496946a8 ("tracing: Add samples of DECLARE_EVENT_CLASS() and DEFINE_EVENT()") added template examples for all the events. It created a DEFINE_EVENT_FN() example which reused the foo_bar_reg and foo_bar_unreg functions. Enabling both the TRACE_EVENT_FN() and DEFINE_EVENT_FN() example trace events caused the foo_bar_reg to be called twice, creating the test thread twice. The foo_bar_unreg would remove it only once, even if it was called multiple times, leaving a thread existing when the module is unloaded, causing an oops. Add a ref count and allow foo_bar_reg() and foo_bar_unreg() be called by multiple trace events. Cc: [email protected] Fixes: 7496946a8 ("tracing: Add samples of DECLARE_EVENT_CLASS() and DEFINE_EVENT()") Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2017-10-17fs: Avoid invalidation in interrupt context in dio_complete()Lukas Czerner1-6/+13
Currently we try to defer completion of async DIO to the process context in case there are any mapped pages associated with the inode so that we can invalidate the pages when the IO completes. However the check is racy and the pages can be mapped afterwards. If this happens we might end up calling invalidate_inode_pages2_range() in dio_complete() in interrupt context which could sleep. This can be reproduced by generic/451. Fix this by passing the information whether we can or can't invalidate to the dio_complete(). Thanks Eryu Guan for reporting this and Jan Kara for suggesting a fix. Fixes: 332391a9935d ("fs: Fix page cache inconsistency when mixing buffered and AIO DIO") Reported-by: Eryu Guan <[email protected]> Reviewed-by: Jan Kara <[email protected]> Tested-by: Eryu Guan <[email protected]> Signed-off-by: Lukas Czerner <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-10-17perf buildid-list: Fix crash when processing PERF_RECORD_NAMESPACENamhyung Kim1-0/+2
Thomas reported that 'perf buildid-list' gets a SEGFAULT due to NULL pointer deref when he ran it on a data with namespace events. It was because the buildid_id__mark_dso_hit_ops lacks the namespace event handler and perf_too__fill_default() didn't set it. Program received signal SIGSEGV, Segmentation fault. 0x0000000000000000 in ?? () Missing separate debuginfos, use: dnf debuginfo-install audit-libs-2.7.7-1.fc25.s390x bzip2-libs-1.0.6-21.fc25.s390x elfutils-libelf-0.169-1.fc25.s390x +elfutils-libs-0.169-1.fc25.s390x libcap-ng-0.7.8-1.fc25.s390x numactl-libs-2.0.11-2.ibm.fc25.s390x openssl-libs-1.1.0e-1.1.ibm.fc25.s390x perl-libs-5.24.1-386.fc25.s390x +python-libs-2.7.13-2.fc25.s390x slang-2.3.0-7.fc25.s390x xz-libs-5.2.3-2.fc25.s390x zlib-1.2.8-10.fc25.s390x (gdb) where #0 0x0000000000000000 in ?? () #1 0x00000000010fad6a in machines__deliver_event (machines=<optimized out>, machines@entry=0x2c6fd18, evlist=<optimized out>, event=event@entry=0x3fffdf00470, sample=0x3ffffffe880, sample@entry=0x3ffffffe888, tool=tool@entry=0x1312968 <build_id.mark_dso_hit_ops>, file_offset=1136) at util/session.c:1287 #2 0x00000000010fbf4e in perf_session__deliver_event (file_offset=1136, tool=0x1312968 <build_id.mark_dso_hit_ops>, sample=0x3ffffffe888, event=0x3fffdf00470, session=0x2c6fc30) at util/session.c:1340 #3 perf_session__process_event (session=0x2c6fc30, session@entry=0x0, event=event@entry=0x3fffdf00470, file_offset=file_offset@entry=1136) at util/session.c:1522 #4 0x00000000010fddde in __perf_session__process_events (file_size=11880, data_size=<optimized out>, data_offset=<optimized out>, session=0x0) at util/session.c:1899 #5 perf_session__process_events (session=0x0, session@entry=0x2c6fc30) at util/session.c:1953 #6 0x000000000103b2ac in perf_session__list_build_ids (with_hits=<optimized out>, force=<optimized out>) at builtin-buildid-list.c:83 #7 cmd_buildid_list (argc=<optimized out>, argv=<optimized out>) at builtin-buildid-list.c:115 #8 0x00000000010a026c in run_builtin (p=0x1311f78 <commands+24>, argc=argc@entry=2, argv=argv@entry=0x3fffffff3c0) at perf.c:296 #9 0x000000000102bc00 in handle_internal_command (argv=<optimized out>, argc=2) at perf.c:348 #10 run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:392 #11 main (argc=<optimized out>, argv=0x3fffffff3c0) at perf.c:536 (gdb) Fix it by adding a stub event handler for namespace event. Committer testing: Further clarifying, plain using 'perf buildid-list' will not end up in a SEGFAULT when processing a perf.data file with namespace info: # perf record -a --namespaces sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 2.024 MB perf.data (1058 samples) ] # perf buildid-list | wc -l 38 # perf buildid-list | head -5 e2a171c7b905826fc8494f0711ba76ab6abbd604 /lib/modules/4.14.0-rc3+/build/vmlinux 874840a02d8f8a31cedd605d0b8653145472ced3 /lib/modules/4.14.0-rc3+/kernel/arch/x86/kvm/kvm-intel.ko ea7223776730cd8a22f320040aae4d54312984bc /lib/modules/4.14.0-rc3+/kernel/drivers/gpu/drm/i915/i915.ko 5961535e6732a8edb7f22b3f148bb2fa2e0be4b9 /lib/modules/4.14.0-rc3+/kernel/drivers/gpu/drm/drm.ko f045f54aa78cf1931cc893f78b6cbc52c72a8cb1 /usr/lib64/libc-2.25.so # It is only when one asks for checking what of those entries actually had samples, i.e. when we use either -H or --with-hits, that we will process all the PERF_RECORD_ events, and since tools/perf/builtin-buildid-list.c neither explicitely set a perf_tool.namespaces() callback nor the default stub was set that we end up, when processing a PERF_RECORD_NAMESPACE record, causing a SEGFAULT: # perf buildid-list -H Segmentation fault (core dumped) ^C # Reported-and-Tested-by: Thomas-Mich Richter <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Hari Bathini <[email protected]> Cc: Hendrik Brueckner <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas-Mich Richter <[email protected]> Fixes: f3b3614a284d ("perf tools: Add PERF_RECORD_NAMESPACES to include namespaces related info") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-10-17perf record: Fix documentation for a inexistent option '-l'Taeung Song1-2/+2
'perf record' had a '-l' option that meant "scale counter values" a very long time ago, but it currently belongs to 'perf stat' as '-c'. So remove it. I found this problem in the below case. $ perf record -e cycles -l sleep 3 Error: unknown switch `l Signed-off-by: Taeung Song <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-10-17Merge tag 'media/v4.14-2' of ↵Linus Torvalds12-41/+153
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "Core fixes: - cec: Respond to unregistered initiators, when applicable - dvb_frontend: only use kref after initialized Driver-specific fixes: - qcom, camss: Make function vfe_set_selection static - qcom: VIDEO_QCOM_CAMSS should depend on HAS_DMA - s5p-cec: add NACK detection support - media: staging/imx: Fix uninitialized variable warning - dib3000mc: i2c transfers over usb cannot be done from stack - venus: init registered list on streamoff" * tag 'media/v4.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: dvb_frontend: only use kref after initialized media: platform: VIDEO_QCOM_CAMSS should depend on HAS_DMA media: cec: Respond to unregistered initiators, when applicable media: s5p-cec: add NACK detection support media: staging/imx: Fix uninitialized variable warning media: qcom: camss: Make function vfe_set_selection static media: venus: init registered list on streamoff media: dvb: i2c transfers over usb cannot be done from stack
2017-10-16xfs: move two more RT specific functions into CONFIG_XFS_RTArnd Bergmann1-24/+24
The last cleanup introduced two harmless warnings: fs/xfs/xfs_fsmap.c:480:1: warning: '__xfs_getfsmap_rtdev' defined but not used fs/xfs/xfs_fsmap.c:372:1: warning: 'xfs_getfsmap_rtdev_rtbitmap_helper' defined but not used This moves those two functions as well. Fixes: bb9c2e543325 ("xfs: move more RT specific code under CONFIG_XFS_RT") Signed-off-by: Arnd Bergmann <[email protected]> Reviewed-by: Brian Foster <[email protected]> Acked-by: Geert Uytterhoeven <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2017-10-16xfs: trim writepage mapping to within eofBrian Foster3-0/+25
The writeback rework in commit fbcc02561359 ("xfs: Introduce writeback context for writepages") introduced a subtle change in behavior with regard to the block mapping used across the ->writepages() sequence. The previous xfs_cluster_write() code would only flush pages up to EOF at the time of the writepage, thus ensuring that any pages due to file-extending writes would be handled on a separate cycle and with a new, updated block mapping. The updated code establishes a block mapping in xfs_writepage_map() that could extend beyond EOF if the file has post-eof preallocation. Because we now use the generic writeback infrastructure and pass the cached mapping to each writepage call, there is no implicit EOF limit in place. If eofblocks trimming occurs during ->writepages(), any post-eof portion of the cached mapping becomes invalid. The eofblocks code has no means to serialize against writeback because there are no pages associated with post-eof blocks. Therefore if an eofblocks trim occurs and is followed by a file-extending buffered write, not only has the mapping become invalid, but we could end up writing a page to disk based on the invalid mapping. Consider the following sequence of events: - A buffered write creates a delalloc extent and post-eof speculative preallocation. - Writeback starts and on the first writepage cycle, the delalloc extent is converted to real blocks (including the post-eof blocks) and the mapping is cached. - The file is closed and xfs_release() trims post-eof blocks. The cached writeback mapping is now invalid. - Another buffered write appends the file with a delalloc extent. - The concurrent writeback cycle picks up the just written page because the writeback range end is LLONG_MAX. xfs_writepage_map() attributes it to the (now invalid) cached mapping and writes the data to an incorrect location on disk (and where the file offset is still backed by a delalloc extent). This problem is reproduced by xfstests test generic/464, which triggers racing writes, appends, open/closes and writeback requests. To address this problem, trim the mapping used during writeback to within EOF when the mapping is validated. This ensures the mapping is revalidated for any pages encountered beyond EOF as of the time the current mapping was cached or last validated. Reported-by: Eryu Guan <[email protected]> Diagnosed-by: Eryu Guan <[email protected]> Signed-off-by: Brian Foster <[email protected]> Reviewed-by: Dave Chinner <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2017-10-16fs: invalidate page cache after end_io() in dio completionEryu Guan2-25/+36
Commit 332391a9935d ("fs: Fix page cache inconsistency when mixing buffered and AIO DIO") moved page cache invalidation from iomap_dio_rw() to iomap_dio_complete() for iomap based direct write path, but before the dio->end_io() call, and it re-introdued the bug fixed by commit c771c14baa33 ("iomap: invalidate page caches should be after iomap_dio_complete() in direct write"). I found this because fstests generic/418 started failing on XFS with v4.14-rc3 kernel, which is the regression test for this specific bug. So similarly, fix it by moving dio->end_io() (which does the unwritten extent conversion) before page cache invalidation, to make sure next buffer read reads the final real allocations not unwritten extents. I also add some comments about why should end_io() go first in case we get it wrong again in the future. Note that, there's no such problem in the non-iomap based direct write path, because we didn't remove the page cache invalidation after the ->direct_IO() in generic_file_direct_write() call, but I decided to fix dio_complete() too so we don't leave a landmine there, also be consistent with iomap_dio_complete(). Fixes: 332391a9935d ("fs: Fix page cache inconsistency when mixing buffered and AIO DIO") Signed-off-by: Eryu Guan <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Jan Kara <[email protected]> Reviewed-by: Lukas Czerner <[email protected]>
2017-10-16xfs: cancel dirty pages on invalidationDave Chinner1-12/+22
Recently we've had warnings arise from the vm handing us pages without bufferheads attached to them. This should not ever occur in XFS, but we don't defend against it properly if it does. The only place where we remove bufferheads from a page is in xfs_vm_releasepage(), but we can't tell the difference here between "page is dirty so don't release" and "page is dirty but is being invalidated so release it". In some places that are invalidating pages ask for pages to be released and follow up afterward calling ->releasepage by checking whether the page was dirty and then aborting the invalidation. This is a possible vector for releasing buffers from a page but then leaving it in the mapping, so we really do need to avoid dirty pages in xfs_vm_releasepage(). To differentiate between invalidated pages and normal pages, we need to clear the page dirty flag when invalidating the pages. This can be done through xfs_vm_invalidatepage(), and will result xfs_vm_releasepage() seeing the page as clean which matches the bufferhead state on the page after calling block_invalidatepage(). Hence we can re-add the page dirty check in xfs_vm_releasepage to catch the case where we might be releasing a page that is actually dirty and so should not have the bufferheads on it removed. This will remove one possible vector of "dirty page with no bufferheads" and so help narrow down the search for the root cause of that problem. Signed-Off-By: Dave Chinner <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2017-10-16perf tools: Add long time reviewers to MAINTAINERSArnaldo Carvalho de Melo1-0/+2
Jiri and Namhyung have long contributed a lot of code and time reviewing patches to tools/, so lets make that reflected in the MAINTAINERS file to encourage patch submitters to add them to the CC list, speeding up the process of tools/perf/ patch processing. Acked-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-10-16ALSA: usb-audio: Add native DSD support for Pro-Ject Pre Box S2 DigitalJussi Laako1-0/+1
Add native DSD support quirk for Pro-Ject Pre Box S2 Digital USB id 2772:0230. Signed-off-by: Jussi Laako <[email protected]> Cc: <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2017-10-16Documentation: Add a file explaining the Linux kernel license enforcement policyGreg Kroah-Hartman2-0/+148
This adds a short document describing the views of how the Linux kernel community feels about enforcing the license of the kernel. Acked-by: Al Viro <[email protected]> Acked-by: Alex Elder (Linaro) <[email protected]> Acked-by: Andrea Arcangeli <[email protected]> Acked-by: Andy Gross <[email protected]> Acked-by: Aneesh Kumar K.V <[email protected]> Acked-by: Anna Schumaker <[email protected]> Acked-by: Ard Biesheuvel <[email protected]> Acked-by: Arnd Bergmann <[email protected]> Acked-by: Arvind Yadav <[email protected]> Acked-by: Bart Van Assche <[email protected]> Acked-by: Bhumika Goyal <[email protected]> Acked-by: Bjorn Andersson <[email protected]> Acked-by: Borislav Petkov <[email protected]> Acked-by: Christian Borntraeger <[email protected]> Acked-by: Christian König <[email protected]> Acked-by: Christophe JAILLET <[email protected]> Acked-by: Chuck Lever <[email protected]> Acked-by: Colin Ian King <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Acked-by: Daniel Lezcano <[email protected]> Acked-by: Daniel Vetter <[email protected]> Acked-by: Darrick J. Wong (Oracle) <[email protected]> Acked-by: Darrick J. Wong <[email protected]> Acked-by: David Kershner <[email protected]> Acked-by: David S. Miller <[email protected]> Acked-by: Dmitry Torokhov <[email protected]> Acked-by: Doug Ledford <[email protected]> Acked-by: Fabio Estevam <[email protected]> Acked-by: Felipe Balbi <[email protected]> Acked-by: Florian Westphal <[email protected]> Acked-by: Geert Uytterhoeven <[email protected]> Acked-by: Guenter Roeck <[email protected]> Acked-by: Hannes Reinecke <[email protected]> Acked-by: Hans de Goede <[email protected]> Acked-by: Heiko Carstens <[email protected]> Acked-by: Heiko Stuebner <[email protected]> Acked-by: Heiner Kallweit <[email protected]> Acked-by: Ingo Molnar <[email protected]> Acked-by: Ivan Safonov <[email protected]> Acked-by: Jaegeuk Kim <[email protected]> Acked-by: Jan Kara (SUSE) <[email protected]> Acked-by: Javier Martinez Canillas <[email protected]> Acked-by: Jeff Kirsher <[email protected]> Acked-by: Jens Axboe <[email protected]> Acked-by: Jes Sorensen <[email protected]> Acked-by: Jiri Kosina <[email protected]> Acked-by: Jiri Pirko <[email protected]> Acked-by: Joe Perches <[email protected]> Acked-by: Joerg Roedel (SUSE) <[email protected]> Acked-by: Johan Hovold <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Acked-by: Juergen Gross <[email protected]> Acked-by: Julia Lawall <[email protected]> Acked-by: K. Y. Srinivasan <[email protected]> Acked-by: Khalid Aziz <[email protected]> Acked-by: Krzysztof Kozlowski <[email protected]> Acked-by: Kuninori Morimoto <[email protected]> Acked-by: Larry Finger <[email protected]> Acked-by: Laura Abbott <[email protected]> Acked-by: Lee Jones <[email protected]> Acked-by: Leon Romanovsky <[email protected]> Acked-by: Linus Walleij (Linaro) <[email protected]> Acked-by: Lv Zheng <[email protected]> Acked-by: Martin K. Petersen (Oracle) <[email protected]> Acked-by: Masahiro Yamada <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Acked-by: Mel Gorman <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: Mike Marshall <[email protected]> Acked-by: Namhyung Kim <[email protected]> Acked-by: Neil Armstrong <[email protected]> Acked-by: Olof Johansson <[email protected]> Acked-by: Pablo Neira Ayuso <[email protected]> Acked-by: Paolo Bonzini <[email protected]> Acked-by: Paul Burton <[email protected]> Acked-by: Paul E. McKenney <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Acked-by: Rafael J. Wysocki <[email protected]> Acked-by: Ralf Baechle <[email protected]> Acked-by: Richard Weinberger <[email protected]> Acked-by: Rik van Riel <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Rob Herring <[email protected]> Acked-by: Sebastian Reichel (Collabora) <[email protected]> Acked-by: Shawn Guo <[email protected]> Acked-by: Shuah Khan <[email protected]> Acked-by: Simon Horman <[email protected]> Acked-by: Srinivas Kandagatla <[email protected]> Acked-by: Steven Rostedt (VMware) <[email protected]> Acked-by: Sven Eckelmann <[email protected]> Acked-by: Takashi Iwai (SUSE) <[email protected]> Acked-by: Tejun Heo <[email protected]> Acked-by: Thierry Reding <[email protected]> Acked-by: Tony Luck <[email protected]> Acked-by: Ulf Hansson <[email protected]> Acked-by: Vinod Koul <[email protected]> Acked-by: Viresh Kumar <[email protected]> Acked-by: Vivien Didelot <[email protected]> Acked-by: Wei Yongjun <[email protected]> Acked-by: Xin Long <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2017-10-16s390: fix zfcpdump-configDimitri John Ledkov1-0/+2
zipl from s390-tools generates root=/dev/ram0 kernel cmdline for zfcpdump, thus BLK_DEV_RAM is required. zfcpdump initrd mounts DEBUG_FS, thus is also required. Bug-Ubuntu: https://launchpad.net/bugs/1722735 Bug-Ubuntu: https://launchpad.net/bugs/1719290 Signed-off-by: Dimitri John Ledkov <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2017-10-16s390/cputime: fix guest/irq/softirq times after CPU hotplugChristian Borntraeger1-0/+3
On CPU hotplug some cpu stats contain bogus values: $ cat /proc/stat cpu 0 0 49 1280 0 0 0 3 0 0 cpu0 0 0 49 618 0 0 0 3 0 0 cpu1 0 0 0 662 0 0 0 0 0 0 [...] $ echo 0 > /sys/devices/system/cpu/cpu1/online $ echo 1 > /sys/devices/system/cpu/cpu1/online $ cat /proc/stat cpu 0 0 49 3200 0 450359962737 450359962737 3 0 0 cpu0 0 0 49 1956 0 0 0 3 0 0 cpu1 0 0 0 1244 0 450359962737 450359962737 0 0 0 [...] pcpu_attach_task() needs the same assignments as vtime_task_switch. Signed-off-by: Christian Borntraeger <[email protected]> Fixes: b7394a5f4ce9 ("sched/cputime, s390: Implement delayed accounting of system time") Cc: [email protected] # 4.11+ Signed-off-by: Martin Schwidefsky <[email protected]>
2017-10-15Linux 4.14-rc5Linus Torvalds1-1/+1
2017-10-15Merge tag 'char-misc-4.14-rc5' of ↵Linus Torvalds8-81/+115
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are 4 patches to resolve some char/misc driver issues found these past weeks. One of them is a mei bugfix and another is a new mei device id. There is also a hyper-v fix for a reported issue, and a binder issue fix for a problem reported by a few people. All of these have been in my tree for a while, I don't know if linux-next is really testing much this month. But 0-day is happy with them :)" * tag 'char-misc-4.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: binder: fix use-after-free in binder_transaction() Drivers: hv: vmbus: Fix bugs in rescind handling mei: me: add gemini lake devices id mei: always use domain runtime pm callbacks.
2017-10-15Merge tag 'usb-4.14-rc5' of ↵Linus Torvalds15-27/+86
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are a handful of USB driver fixes for 4.14-rc5. There is the "usual" usb-serial fixes and device ids, USB gadget fixes, and some more fixes found by the fuzz testing that is happening on the USB layer right now. All of these have been in my tree this week with no reported issues" * tag 'usb-4.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: usbtest: fix NULL pointer dereference usb: gadget: configfs: Fix memory leak of interface directory data usb: gadget: composite: Fix use-after-free in usb_composite_overwrite_options usb: misc: usbtest: Fix overflow in usbtest_do_ioctl() usb: renesas_usbhs: Fix DMAC sequence for receiving zero-length packet USB: dummy-hcd: Fix deadlock caused by disconnect detection usb: phy: tegra: Fix phy suspend for UDC USB: serial: console: fix use-after-free after failed setup USB: serial: console: fix use-after-free on disconnect USB: serial: qcserial: add Dell DW5818, DW5819 USB: serial: cp210x: add support for ELV TFD500 USB: serial: cp210x: fix partnum regression USB: serial: option: add support for TP-Link LTE module USB: serial: ftdi_sio: add id for Cypress WICED dev board
2017-10-15Merge tag 'dmaengine-fix-4.14-rc5' of ↵Linus Torvalds3-19/+40
git://git.infradead.org/users/vkoul/slave-dma Pull dmaengine fixes from Vinod Koul: "Here are fixes for this round - fix spinlock usage amd fifo response for altera driver - fix ti crossbar race condition - fix edma memcpy align" * tag 'dmaengine-fix-4.14-rc5' of git://git.infradead.org/users/vkoul/slave-dma: dmaengine: altera: fix spinlock usage dmaengine: altera: fix response FIFO emptying dmaengine: ti-dma-crossbar: Fix possible race condition with dma_inuse dmaengine: edma: Align the memcpy acnt array size with the transfer
2017-10-14Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds16-88/+284
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "A landry list of fixes: - fix reboot breakage on some PCID-enabled system - fix crashes/hangs on some PCID-enabled systems - fix microcode loading on certain older CPUs - various unwinder fixes - extend an APIC quirk to more hardware systems and disable APIC related warning on virtualized systems - various Hyper-V fixes - a macro definition robustness fix - remove jprobes IRQ disabling - various mem-encryption fixes" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode: Do the family check first x86/mm: Flush more aggressively in lazy TLB mode x86/apic: Update TSC_DEADLINE quirk with additional SKX stepping x86/apic: Silence "FW_BUG TSC_DEADLINE disabled due to Errata" on hypervisors x86/mm: Disable various instrumentations of mm/mem_encrypt.c and mm/tlb.c x86/hyperv: Fix hypercalls with extended CPU ranges for TLB flushing x86/hyperv: Don't use percpu areas for pcpu_flush/pcpu_flush_ex structures x86/hyperv: Clear vCPU banks between calls to avoid flushing unneeded vCPUs x86/unwind: Disable unwinder warnings on 32-bit x86/unwind: Align stack pointer in unwinder dump x86/unwind: Use MSB for frame pointer encoding on 32-bit x86/unwind: Fix dereference of untrusted pointer x86/alternatives: Fix alt_max_short macro to really be a max() x86/mm/64: Fix reboot interaction with CR4.PCIDE kprobes/x86: Remove IRQ disabling from jprobe handlers kprobes/x86: Set up frame pointer in kprobe trampoline
2017-10-14Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds3-102/+49
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Three fixes that address an SMP balancing performance regression" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/core: Ensure load_balance() respects the active_mask sched/core: Address more wake_affine() regressions sched/core: Fix wake_affine() performance regression
2017-10-14Merge branch 'ras-urgent-for-linus' of ↵Linus Torvalds4-2/+10
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RAS fixes from Ingo Molnar: "A boot parameter fix, plus a header export fix" * 'ras-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mce: Hide mca_cfg RAS/CEC: Use the right length for "cec_disable"
2017-10-14Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds8-26/+74
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Some tooling fixes plus three kernel fixes: a memory leak fix, a statistics fix and a crash fix" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/uncore: Fix memory leaks on allocation failures perf/core: Fix cgroup time when scheduling descendants perf/core: Avoid freeing static PMU contexts when PMU is unregistered tools include uapi bpf.h: Sync kernel ABI header with tooling header perf pmu: Unbreak perf record for arm/arm64 with events with explicit PMU perf script: Add missing separator for "-F ip,brstack" (and brstackoff) perf callchain: Compare dsos (as well) for CCKEY_FUNCTION
2017-10-14Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds3-30/+24
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "Two lockdep fixes for bugs introduced by the cross-release dependency tracking feature - plus a commit that disables it because performance regressed in an absymal fashion on some systems" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/lockdep: Disable cross-release features for now locking/selftest: Avoid false BUG report locking/lockdep: Fix stacktrace mess
2017-10-14Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds3-2/+45
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Ingo Molnar: "A CPU hotplug related fix, plus two related sanity checks" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/cpuhotplug: Enforce affinity setting on startup of managed irqs genirq/cpuhotplug: Add sanity check for effective affinity mask genirq: Warn when effective affinity is not updated
2017-10-14Merge branch 'core-urgent-for-linus' of ↵Linus Torvalds1-1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fix from Ingo Molnar: "A single objtool fix: avoid silently broken ORC debuginfo builds and error out instead" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Upgrade libelf-devel warning to error for CONFIG_ORC_UNWINDER
2017-10-14x86/microcode: Do the family check firstBorislav Petkov1-9/+18
On CPUs like AMD's Geode, for example, we shouldn't even try to load microcode because they do not support the modern microcode loading interface. However, we do the family check *after* the other checks whether the loader has been disabled on the command line or whether we're running in a guest. So move the family checks first in order to exit early if we're being loaded on an unsupported family. Reported-and-tested-by: Sven Glodowski <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: <[email protected]> # 4.11.. Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://bugzilla.suse.com/show_bug.cgi?id=1061396 Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-10-14locking/lockdep: Disable cross-release features for nowIngo Molnar1-2/+2
Johan Hovold reported a big lockdep slowdown on his system, caused by lockdep: > I had noticed that the BeagleBone Black boot time appeared to have > increased significantly with 4.14 and yesterday I finally had time to > investigate it. > > Boot time (from "Linux version" to login prompt) had in fact doubled > since 4.13 where it took 17 seconds (with my current config) compared to > the 35 seconds I now see with 4.14-rc4. > > I quick bisect pointed to lockdep and specifically the following commit: > > 28a903f63ec0 ("locking/lockdep: Handle non(or multi)-acquisition of a crosslock") Because the final v4.14 release is close, disable the cross-release lockdep features for now. Bisected-by: Johan Hovold <[email protected]> Debugged-by: Johan Hovold <[email protected]> Reported-by: Johan Hovold <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Byungchul Park <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tony Lindgren <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-10-14Merge branch '4.14-fixes' of ↵Linus Torvalds5-26/+28
git://git.linux-mips.org/pub/scm/ralf/upstream-linus Pull MIPS fixes from Ralf Baechle: "More MIPS fixes for 4.14: - Loongson 1: Set the default number of RX and TX queues to accomodate for recent changes of stmmac driver. - BPF: Fix uninitialised target compiler error. - Fix cmpxchg on 32 bit signed ints for 64 bit kernels with !kernel_uses_llsc - Fix generic-board-config.sh for builds using O= - Remove pr_err() calls from fpu_emu() for a case which is not a kernel error" * '4.14-fixes' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: MIPS: math-emu: Remove pr_err() calls from fpu_emu() MIPS: Fix generic-board-config.sh for builds using O= MIPS: Fix cmpxchg on 32b signed ints for 64b kernel with !kernel_uses_llsc MIPS: loongson1: set default number of rx and tx queues for stmmac MIPS: bpf: Fix uninitialised target compiler error
2017-10-14x86/mm: Flush more aggressively in lazy TLB modeAndy Lutomirski3-49/+136
Since commit: 94b1b03b519b ("x86/mm: Rework lazy TLB mode and TLB freshness tracking") x86's lazy TLB mode has been all the way lazy: when running a kernel thread (including the idle thread), the kernel keeps using the last user mm's page tables without attempting to maintain user TLB coherence at all. From a pure semantic perspective, this is fine -- kernel threads won't attempt to access user pages, so having stale TLB entries doesn't matter. Unfortunately, I forgot about a subtlety. By skipping TLB flushes, we also allow any paging-structure caches that may exist on the CPU to become incoherent. This means that we can have a paging-structure cache entry that references a freed page table, and the CPU is within its rights to do a speculative page walk starting at the freed page table. I can imagine this causing two different problems: - A speculative page walk starting from a bogus page table could read IO addresses. I haven't seen any reports of this causing problems. - A speculative page walk that involves a bogus page table can install garbage in the TLB. Such garbage would always be at a user VA, but some AMD CPUs have logic that triggers a machine check when it notices these bogus entries. I've seen a couple reports of this. Boris further explains the failure mode: > It is actually more of an optimization which assumes that paging-structure > entries are in WB DRAM: > > "TlbCacheDis: cacheable memory disable. Read-write. 0=Enables > performance optimization that assumes PML4, PDP, PDE, and PTE entries > are in cacheable WB-DRAM; memory type checks may be bypassed, and > addresses outside of WB-DRAM may result in undefined behavior or NB > protocol errors. 1=Disables performance optimization and allows PML4, > PDP, PDE and PTE entries to be in any memory type. Operating systems > that maintain page tables in memory types other than WB- DRAM must set > TlbCacheDis to insure proper operation." > > The MCE generated is an NB protocol error to signal that > > "Link: A specific coherent-only packet from a CPU was issued to an > IO link. This may be caused by software which addresses page table > structures in a memory type other than cacheable WB-DRAM without > properly configuring MSRC001_0015[TlbCacheDis]. This may occur, for > example, when page table structure addresses are above top of memory. In > such cases, the NB will generate an MCE if it sees a mismatch between > the memory operation generated by the core and the link type." > > I'm assuming coherent-only packets don't go out on IO links, thus the > error. To fix this, reinstate TLB coherence in lazy mode. With this patch applied, we do it in one of two ways: - If we have PCID, we simply switch back to init_mm's page tables when we enter a kernel thread -- this seems to be quite cheap except for the cost of serializing the CPU. - If we don't have PCID, then we set a flag and switch to init_mm the first time we would otherwise need to flush the TLB. The /sys/kernel/debug/x86/tlb_use_lazy_mode debug switch can be changed to override the default mode for benchmarking. In theory, we could optimize this better by only flushing the TLB in lazy CPUs when a page table is freed. Doing that would require auditing the mm code to make sure that all page table freeing goes through tlb_remove_page() as well as reworking some data structures to implement the improved flush logic. Reported-by: Markus Trippelsdorf <[email protected]> Reported-by: Adam Borowski <[email protected]> Signed-off-by: Andy Lutomirski <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Eric Biggers <[email protected]> Cc: Johannes Hirte <[email protected]> Cc: Kees Cook <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Nadav Amit <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Roman Kagan <[email protected]> Cc: Thomas Gleixner <[email protected]> Fixes: 94b1b03b519b ("x86/mm: Rework lazy TLB mode and TLB freshness tracking") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-10-13Merge tag 'drm-fixes-for-v4.14-rc5' of ↵Linus Torvalds19-50/+119
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "Couple of the arm people seem to wake up so this has imx and msm fixes, along with a bunch of i915 stable bounds fixes and an amdgpu regression fix. All seems pretty okay for now" * tag 'drm-fixes-for-v4.14-rc5' of git://people.freedesktop.org/~airlied/linux: drm/msm: fix _NO_IMPLICIT fencing case drm/msm: fix error path cleanup drm/msm/mdp5: Remove extra pm_runtime_put call in mdp5_crtc_cursor_set() drm/msm/dsi: Use correct pm_runtime_put variant during host_init drm/msm: fix return value check in _msm_gem_kernel_new() drm/msm: use proper memory barriers for updating tail/head drm/msm/mdp5: add missing max size for 8x74 v1 drm/amdgpu: fix placement flags in amdgpu_ttm_bind drm/i915/bios: parse DDI ports also for CHV for HDMI DDC pin and DP AUX channel gpu: ipu-v3: pre: implement workaround for ERR009624 gpu: ipu-v3: prg: wait for double buffers to be filled on channel startup gpu: ipu-v3: Allow channel burst locking on i.MX6 only drm/i915: Read timings from the correct transcoder in intel_crtc_mode_get() drm/i915: Order two completing nop_submit_request drm/i915: Silence compiler warning for hsw_power_well_enable() drm/i915: Use crtc_state_is_legacy_gamma in intel_color_check drm/i915/edp: Increase the T12 delay quirk to 1300ms drm/i915/edp: Get the Panel Power Off timestamp after panel is off sync_file: Return consistent status in SYNC_IOC_FILE_INFO drm/atomic: Unref duplicated drm_atomic_state in drm_atomic_helper_resume()
2017-10-14Merge tag 'drm-intel-fixes-2017-10-11' of ↵Dave Airlie6-19/+26
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes drm/i915 fixes for 4.14-rc5: Three fixes for stable: - Use crtc_state_is_legacy_gamma in intel_color_check (Maarten) - Read timings from the correct transcoder (Ville). - Fix HDMI on BSW (Jani). Other fixes: - eDP fixes (Manasi) - Silence compiler warnings (Chris) - Order two completing nop_submit_request (Chris) * tag 'drm-intel-fixes-2017-10-11' of git://anongit.freedesktop.org/drm/drm-intel: drm/i915/bios: parse DDI ports also for CHV for HDMI DDC pin and DP AUX channel drm/i915: Read timings from the correct transcoder in intel_crtc_mode_get() drm/i915: Order two completing nop_submit_request drm/i915: Silence compiler warning for hsw_power_well_enable() drm/i915: Use crtc_state_is_legacy_gamma in intel_color_check drm/i915/edp: Increase the T12 delay quirk to 1300ms drm/i915/edp: Get the Panel Power Off timestamp after panel is off
2017-10-14Merge branch 'msm-fixes-4.14-rc4' of ↵Dave Airlie7-25/+35
git://people.freedesktop.org/~robclark/linux into drm-fixes bunch of msm fixes * 'msm-fixes-4.14-rc4' of git://people.freedesktop.org/~robclark/linux: drm/msm: fix _NO_IMPLICIT fencing case drm/msm: fix error path cleanup drm/msm/mdp5: Remove extra pm_runtime_put call in mdp5_crtc_cursor_set() drm/msm/dsi: Use correct pm_runtime_put variant during host_init drm/msm: fix return value check in _msm_gem_kernel_new() drm/msm: use proper memory barriers for updating tail/head drm/msm/mdp5: add missing max size for 8x74 v1
2017-10-13Merge branch 'akpm' (patches from Andrew)Linus Torvalds21-181/+245
Merge misc fixes from Andrew Morton: "18 fixes" * emailed patches from Andrew Morton <[email protected]>: mm, swap: use page-cluster as max window of VMA based swap readahead mm: page_vma_mapped: ensure pmd is loaded with READ_ONCE outside of lock kmemleak: clear stale pointers from task stacks fs/binfmt_misc.c: node could be NULL when evicting inode fs/mpage.c: fix mpage_writepage() for pages with buffers linux/kernel.h: add/correct kernel-doc notation tty: fall back to N_NULL if switching to N_TTY fails during hangup Revert "vmalloc: back off when the current task is killed" mm/cma.c: take __GFP_NOWARN into account in cma_alloc() scripts/kallsyms.c: ignore symbol type 'n' userfaultfd: selftest: exercise -EEXIST only in background transfer mm: only display online cpus of the numa node mm: remove unnecessary WARN_ONCE in page_vma_mapped_walk(). mm/mempolicy: fix NUMA_INTERLEAVE_HIT counter include/linux/of.h: provide of_n_{addr,size}_cells wrappers for !CONFIG_OF mm/madvise.c: add description for MADV_WIPEONFORK and MADV_KEEPONFORK lib/Kconfig.debug: kernel hacking menu: runtime testing: keep tests together mm/migrate: fix indexing bug (off by one) and avoid out of bound access
2017-10-13mm, swap: use page-cluster as max window of VMA based swap readaheadHuang Ying2-44/+7
When the VMA based swap readahead was introduced, a new knob /sys/kernel/mm/swap/vma_ra_max_order was added as the max window of VMA swap readahead. This is to make it possible to use different max window for VMA based readahead and original physical readahead. But Minchan Kim pointed out that this will cause a regression because setting page-cluster sysctl to zero cannot disable swap readahead with the change. To fix the regression, the page-cluster sysctl is used as the max window of both the VMA based swap readahead and original physical swap readahead. If more fine grained control is needed in the future, more knobs can be added as the subordinate knobs of the page-cluster sysctl. The vma_ra_max_order knob is deleted. Because the knob was introduced in v4.14-rc1, and this patch is targeting being merged before v4.14 releasing, there should be no existing users of this newly added ABI. Link: http://lkml.kernel.org/r/[email protected] Fixes: ec560175c0b6fce ("mm, swap: VMA based swap readahead") Signed-off-by: "Huang, Ying" <[email protected]> Reported-by: Minchan Kim <[email protected]> Acked-by: Minchan Kim <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Shaohua Li <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Fengguang Wu <[email protected]> Cc: Tim Chen <[email protected]> Cc: Dave Hansen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-10-13mm: page_vma_mapped: ensure pmd is loaded with READ_ONCE outside of lockWill Deacon1-15/+10
Loading the pmd without holding the pmd_lock exposes us to races with concurrent updaters of the page tables but, worse still, it also allows the compiler to cache the pmd value in a register and reuse it later on, even if we've performed a READ_ONCE in between and seen a more recent value. In the case of page_vma_mapped_walk, this leads to the following crash when the pmd loaded for the initial pmd_trans_huge check is all zeroes and a subsequent valid table entry is loaded by check_pmd. We then proceed into map_pte, but the compiler re-uses the zero entry inside pte_offset_map, resulting in a junk pointer being installed in pvmw->pte: PC is at check_pte+0x20/0x170 LR is at page_vma_mapped_walk+0x2e0/0x540 [...] Process doio (pid: 2463, stack limit = 0xffff00000f2e8000) Call trace: check_pte+0x20/0x170 page_vma_mapped_walk+0x2e0/0x540 page_mkclean_one+0xac/0x278 rmap_walk_file+0xf0/0x238 rmap_walk+0x64/0xa0 page_mkclean+0x90/0xa8 clear_page_dirty_for_io+0x84/0x2a8 mpage_submit_page+0x34/0x98 mpage_process_page_bufs+0x164/0x170 mpage_prepare_extent_to_map+0x134/0x2b8 ext4_writepages+0x484/0xe30 do_writepages+0x44/0xe8 __filemap_fdatawrite_range+0xbc/0x110 file_write_and_wait_range+0x48/0xd8 ext4_sync_file+0x80/0x4b8 vfs_fsync_range+0x64/0xc0 SyS_msync+0x194/0x1e8 This patch fixes the problem by ensuring that READ_ONCE is used before the initial checks on the pmd, and this value is subsequently used when checking whether or not the pmd is present. pmd_check is removed and the pmd_present check is inlined directly. Link: http://lkml.kernel.org/r/[email protected] Fixes: f27176cfc363 ("mm: convert page_mkclean_one() to use page_vma_mapped_walk()") Signed-off-by: Will Deacon <[email protected]> Tested-by: Yury Norov <[email protected]> Tested-by: Richard Ruigrok <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-10-13kmemleak: clear stale pointers from task stacksKonstantin Khlebnikov2-1/+5
Kmemleak considers any pointers on task stacks as references. This patch clears newly allocated and reused vmap stacks. Link: http://lkml.kernel.org/r/150728990124.744199.8403409836394318684.stgit@buzz Signed-off-by: Konstantin Khlebnikov <[email protected]> Acked-by: Catalin Marinas <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-10-13fs/binfmt_misc.c: node could be NULL when evicting inodeEryu Guan1-1/+1
inode->i_private is assigned by a Node pointer only after registering a new binary format, so it could be NULL if inode was created by bm_fill_super() (or iput() was called by the error path in bm_register_write()), and this could result in NULL pointer dereference when evicting such an inode. e.g. mount binfmt_misc filesystem then umount it immediately: mount -t binfmt_misc binfmt_misc /proc/sys/fs/binfmt_misc umount /proc/sys/fs/binfmt_misc will result in BUG: unable to handle kernel NULL pointer dereference at 0000000000000013 IP: bm_evict_inode+0x16/0x40 [binfmt_misc] ... Call Trace: evict+0xd3/0x1a0 iput+0x17d/0x1d0 dentry_unlink_inode+0xb9/0xf0 __dentry_kill+0xc7/0x170 shrink_dentry_list+0x122/0x280 shrink_dcache_parent+0x39/0x90 do_one_tree+0x12/0x40 shrink_dcache_for_umount+0x2d/0x90 generic_shutdown_super+0x1f/0x120 kill_litter_super+0x29/0x40 deactivate_locked_super+0x43/0x70 deactivate_super+0x45/0x60 cleanup_mnt+0x3f/0x70 __cleanup_mnt+0x12/0x20 task_work_run+0x86/0xa0 exit_to_usermode_loop+0x6d/0x99 syscall_return_slowpath+0xba/0xf0 entry_SYSCALL_64_fastpath+0xa3/0xa Fix it by making sure Node (e) is not NULL. Link: http://lkml.kernel.org/r/[email protected] Fixes: 83f918274e4b ("exec: binfmt_misc: shift filp_close(interp_file) from kill_node() to bm_evict_inode()") Signed-off-by: Eryu Guan <[email protected]> Acked-by: Oleg Nesterov <[email protected]> Cc: Alexander Viro <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-10-13fs/mpage.c: fix mpage_writepage() for pages with buffersMatthew Wilcox3-5/+16
When using FAT on a block device which supports rw_page, we can hit BUG_ON(!PageLocked(page)) in try_to_free_buffers(). This is because we call clean_buffers() after unlocking the page we've written. Introduce a new clean_page_buffers() which cleans all buffers associated with a page and call it from within bdev_write_page(). [[email protected]: s/PAGE_SIZE/~0U/ per Linus and Matthew] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox <[email protected]> Reported-by: Toshi Kani <[email protected]> Reported-by: OGAWA Hirofumi <[email protected]> Tested-by: Toshi Kani <[email protected]> Acked-by: Johannes Thumshirn <[email protected]> Cc: Ross Zwisler <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>