aboutsummaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)AuthorFilesLines
2019-02-20dma-mapping: remove dma_mark_declared_memory_occupiedChristoph Hellwig1-23/+0
This API is not used anywhere, so remove it. Signed-off-by: Christoph Hellwig <[email protected]>
2019-02-20dma-mapping: move CONFIG_DMA_CMA to kernel/dma/KconfigChristoph Hellwig1-0/+77
This is where all the related code already lives. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Greg Kroah-Hartman <[email protected]>
2019-02-20dma-mapping: improve selection of dma_declare_coherent availabilityChristoph Hellwig2-2/+2
This API is primarily used through DT entries, but two architectures and two drivers call it directly. So instead of selecting the config symbol for random architectures pull it in implicitly for the actual users. Also rename the Kconfig option to describe the feature better. Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Paul Burton <[email protected]> # MIPS Acked-by: Lee Jones <[email protected]> Reviewed-by: Greg Kroah-Hartman <[email protected]>
2019-02-20Merge tag 'gpio-v5.1-updates-for-linus-part-2' of ↵Linus Walleij1-0/+12
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into devel gpio: updates for v5.1 - part 2 - gpio-mockup updates improving the user-space testing interface and adding line state tracking for correct edge interrupts - interrupt simulator patch exposing the irq type configuration to users
2019-02-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller5-11/+27
Two easily resolvable overlapping change conflicts, one in TCP and one in the eBPF verifier. Signed-off-by: David S. Miller <[email protected]>
2019-02-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2-5/+14
Pull networking fixes from David Miller: 1) Fix suspend and resume in mt76x0u USB driver, from Stanislaw Gruszka. 2) Missing memory barriers in xsk, from Magnus Karlsson. 3) rhashtable fixes in mac80211 from Herbert Xu. 4) 32-bit MIPS eBPF JIT fixes from Paul Burton. 5) Fix for_each_netdev_feature() on big endian, from Hauke Mehrtens. 6) GSO validation fixes from Willem de Bruijn. 7) Endianness fix for dwmac4 timestamp handling, from Alexandre Torgue. 8) More strict checks in tcp_v4_err(), from Eric Dumazet. 9) af_alg_release should NULL out the sk after the sock_put(), from Mao Wenan. 10) Missing unlock in mac80211 mesh error path, from Wei Yongjun. 11) Missing device put in hns driver, from Salil Mehta. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits) sky2: Increase D3 delay again vhost: correctly check the return value of translate_desc() in log_used() net: netcp: Fix ethss driver probe issue net: hns: Fixes the missing put_device in positive leg for roce reset net: stmmac: Fix a race in EEE enable callback qed: Fix iWARP syn packet mac address validation. qed: Fix iWARP buffer size provided for syn packet processing. r8152: Add support for MAC address pass through on RTL8153-BD mac80211: mesh: fix missing unlock on error in table_path_del() net/mlx4_en: fix spelling mistake: "quiting" -> "quitting" net: crypto set sk to NULL when af_alg_release. net: Do not allocate page fragments that are not skb aligned mm: Use fixed constant in page_frag_alloc instead of size + 1 tcp: tcp_v4_err() should be more careful tcp: clear icsk_backoff in tcp_write_queue_purge() net: mv643xx_eth: disable clk on error path in mv643xx_eth_shared_probe() qmi_wwan: apply SET_DTR quirk to Sierra WP7607 net: stmmac: handle endianness in dwmac4_get_timestamp doc: Mention MSG_ZEROCOPY implementation for UDP mlxsw: __mlxsw_sp_port_headroom_set(): Fix a use of local variable ...
2019-02-19bpf: check that BPF programs run with preemption disabledPeter Zijlstra1-0/+28
Introduce cant_sleep() macro for annotation of functions that cannot sleep. Use it in BPF_PROG_RUN to catch execution of BPF programs in preemptable context. Suggested-by: Jann Horn <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Song Liu <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2019-02-19irq/irq_sim: add irq_set_type() callbackBartosz Golaszewski1-0/+12
Implement the irq_set_type() callback and call irqd_set_trigger_type() internally so that users interested in the configured trigger type can later retrieve it using irqd_get_trigger_type(). We only support edge trigger types. Signed-off-by: Bartosz Golaszewski <[email protected]> Acked-by: Marc Zyngier <[email protected]>
2019-02-19cpuset: remove unused task_has_mempolicy()Masahiro Yamada1-13/+0
This is a remnant of commit 5f155f27cb7f ("mm, cpuset: always use seqlock when changing task's nodemask"). Signed-off-by: Masahiro Yamada <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2019-02-18Merge tag 'trace-v5.0-rc4-3' of ↵Linus Torvalds2-9/+3
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "Two more tracing fixes - Have kprobes not use copy_from_user() to access kernel addresses, because kprobes can legitimately poke at bad kernel memory, which will fault. Copy from user code should never fault in kernel space. Using probe_mem_read() can handle kernel address space faulting. - Put back the entries counter in the tracing output that was accidentally removed" * tag 'trace-v5.0-rc4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix number of entries in trace header kprobe: Do not use uaccess functions to access kernel memory that can fault
2019-02-18swiotlb: remove swiotlb_dma_supportedChristoph Hellwig1-12/+0
The only user left is powerpc, but even there the generic dma-direct version works just as well, given that we guarantee that the swiotlb buffer must always be addressable. Signed-off-by: Christoph Hellwig <[email protected]> Tested-by: Christian Zigotzky <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18dma-mapping, powerpc: simplify the arch dma_set_mask overrideChristoph Hellwig2-2/+10
Instead of letting the architecture supply all of dma_set_mask just give it an additional hook selected by Kconfig. Signed-off-by: Christoph Hellwig <[email protected]> Tested-by: Christian Zigotzky <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18powerpc/dma: stop overriding dma_get_required_maskChristoph Hellwig1-2/+0
The ppc_md and pci_controller_ops methods are unused now and can be removed. The dma_nommu implementation is generic to the generic one except for using max_pfn instead of calling into the memblock API, and all other dma_map_ops instances implement a method of their own. Signed-off-by: Christoph Hellwig <[email protected]> Tested-by: Christian Zigotzky <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18dma-direct: we might need GFP_DMA for 32-bit dma masksChristoph Hellwig1-2/+1
If there is no ZONE_DMA32 we might need GFP_DMA to be able to allocate memory that satisfies a 32-bit DMA mask. Reported-by: Christian Zigotzky <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Tested-by: Christian Zigotzky <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2019-02-18genirq/affinity: Remove the leftovers of the original set supportThomas Gleixner1-16/+4
Now that the NVME driver is converted over to the calc_set() callback, the workarounds of the original set support can be removed. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Ming Lei <[email protected]> Acked-by: Marc Zyngier <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Jens Axboe <[email protected]> Cc: [email protected] Cc: Sagi Grimberg <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Keith Busch <[email protected]> Cc: Sumit Saxena <[email protected]> Cc: Kashyap Desai <[email protected]> Cc: Shivasharan Srikanteshwara <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-02-18genirq/affinity: Add new callback for (re)calculating interrupt setsMing Lei1-18/+44
The interrupt affinity spreading mechanism supports to spread out affinities for one or more interrupt sets. A interrupt set contains one or more interrupts. Each set is mapped to a specific functionality of a device, e.g. general I/O queues and read I/O queus of multiqueue block devices. The number of interrupts per set is defined by the driver. It depends on the total number of available interrupts for the device, which is determined by the PCI capabilites and the availability of underlying CPU resources, and the number of queues which the device provides and the driver wants to instantiate. The driver passes initial configuration for the interrupt allocation via a pointer to struct irq_affinity. Right now the allocation mechanism is complex as it requires to have a loop in the driver to determine the maximum number of interrupts which are provided by the PCI capabilities and the underlying CPU resources. This loop would have to be replicated in every driver which wants to utilize this mechanism. That's unwanted code duplication and error prone. In order to move this into generic facilities it is required to have a mechanism, which allows the recalculation of the interrupt sets and their size, in the core code. As the core code does not have any knowledge about the underlying device, a driver specific callback is required in struct irq_affinity, which can be invoked by the core code. The callback gets the number of available interupts as an argument, so the driver can calculate the corresponding number and size of interrupt sets. At the moment the struct irq_affinity pointer which is handed in from the driver and passed through to several core functions is marked 'const', but for the callback to be able to modify the data in the struct it's required to remove the 'const' qualifier. Add the optional callback to struct irq_affinity, which allows drivers to recalculate the number and size of interrupt sets and remove the 'const' qualifier. For simple invocations, which do not supply a callback, a default callback is installed, which just sets nr_sets to 1 and transfers the number of spreadable vectors to the set_size array at index 0. This is for now guarded by a check for nr_sets != 0 to keep the NVME driver working until it is converted to the callback mechanism. To make sure that the driver configuration is correct under all circumstances the callback is invoked even when there are no interrupts for queues left, i.e. the pre/post requirements already exhaust the numner of available interrupts. At the PCI layer irq_create_affinity_masks() has to be invoked even for the case where the legacy interrupt is used. That ensures that the callback is invoked and the device driver can adjust to that situation. [ tglx: Fixed the simple case (no sets required). Moved the sanity check for nr_sets after the invocation of the callback so it catches broken drivers. Fixed the kernel doc comments for struct irq_affinity and de-'This patch'-ed the changelog ] Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Marc Zyngier <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Jens Axboe <[email protected]> Cc: [email protected] Cc: Sagi Grimberg <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Keith Busch <[email protected]> Cc: Sumit Saxena <[email protected]> Cc: Kashyap Desai <[email protected]> Cc: Shivasharan Srikanteshwara <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-02-18genirq/affinity: Store interrupt sets size in struct irq_affinityMing Lei1-4/+12
The interrupt affinity spreading mechanism supports to spread out affinities for one or more interrupt sets. A interrupt set contains one or more interrupts. Each set is mapped to a specific functionality of a device, e.g. general I/O queues and read I/O queus of multiqueue block devices. The number of interrupts per set is defined by the driver. It depends on the total number of available interrupts for the device, which is determined by the PCI capabilites and the availability of underlying CPU resources, and the number of queues which the device provides and the driver wants to instantiate. The driver passes initial configuration for the interrupt allocation via a pointer to struct irq_affinity. Right now the allocation mechanism is complex as it requires to have a loop in the driver to determine the maximum number of interrupts which are provided by the PCI capabilities and the underlying CPU resources. This loop would have to be replicated in every driver which wants to utilize this mechanism. That's unwanted code duplication and error prone. In order to move this into generic facilities it is required to have a mechanism, which allows the recalculation of the interrupt sets and their size, in the core code. As the core code does not have any knowledge about the underlying device, a driver specific callback will be added to struct affinity_desc, which will be invoked by the core code. The callback will get the number of available interupts as an argument, so the driver can calculate the corresponding number and size of interrupt sets. To support this, two modifications for the handling of struct irq_affinity are required: 1) The (optional) interrupt sets size information is contained in a separate array of integers and struct irq_affinity contains a pointer to it. This is cumbersome and as the maximum number of interrupt sets is small, there is no reason to have separate storage. Moving the size array into struct affinity_desc avoids indirections and makes the code simpler. 2) At the moment the struct irq_affinity pointer which is handed in from the driver and passed through to several core functions is marked 'const'. With the upcoming callback to recalculate the number and size of interrupt sets, it's necessary to remove the 'const' qualifier. Otherwise the callback would not be able to update the data. Implement #1 and store the interrupt sets size in 'struct irq_affinity'. No functional change. [ tglx: Fixed the memcpy() size so it won't copy beyond the size of the source. Fixed the kernel doc comments for struct irq_affinity and de-'This patch'-ed the changelog ] Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Marc Zyngier <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Jens Axboe <[email protected]> Cc: [email protected] Cc: Sagi Grimberg <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Keith Busch <[email protected]> Cc: Sumit Saxena <[email protected]> Cc: Kashyap Desai <[email protected]> Cc: Shivasharan Srikanteshwara <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-02-18genirq/affinity: Code consolidationThomas Gleixner1-29/+27
All information and calculations in the interrupt affinity spreading code is strictly unsigned int. Though the code uses int all over the place. Convert it over to unsigned int. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Ming Lei <[email protected]> Acked-by: Marc Zyngier <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Jens Axboe <[email protected]> Cc: [email protected] Cc: Sagi Grimberg <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Keith Busch <[email protected]> Cc: Sumit Saxena <[email protected]> Cc: Kashyap Desai <[email protected]> Cc: Shivasharan Srikanteshwara <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-02-17Merge tag 'gpio-v5.1-updates-for-linus' of ↵Linus Walleij35-147/+423
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux into devel gpio updates for v5.1 - support for a new variant of pca953x - documentation fix from Wolfram - some tegra186 name changes - two minor fixes for madera and altera-a10sr
2019-02-17Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds2-1/+17
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Two fixes on the kernel side: fix an over-eager condition that failed larger perf ring-buffer sizes, plus fix crashes in the Intel BTS code for a corner case, found by fuzzing" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix impossible ring-buffer sizes warning perf/x86: Add check_period PMU callback
2019-02-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller2-44/+134
Alexei Starovoitov says: ==================== pull-request: bpf-next 2019-02-16 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) numerous libbpf API improvements, from Andrii, Andrey, Yonghong. 2) test all bpf progs in alu32 mode, from Jiong. 3) skb->sk access and bpf_sk_fullsock(), bpf_tcp_sock() helpers, from Martin. 4) support for IP encap in lwt bpf progs, from Peter. 5) remove XDP_QUERY_XSK_UMEM dead code, from Jan. ==================== Signed-off-by: David S. Miller <[email protected]>
2019-02-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller2-5/+14
Alexei Starovoitov says: ==================== pull-request: bpf 2019-02-16 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) fix lockdep false positive in bpf_get_stackid(), from Alexei. 2) several AF_XDP fixes, from Bjorn, Magnus, Davidlohr. 3) fix narrow load from struct bpf_sock, from Martin. 4) mips JIT fixes, from Paul. 5) gso handling fix in bpf helpers, from Willem. ==================== Signed-off-by: David S. Miller <[email protected]>
2019-02-16swiotlb: drop pointless static qualifier in swiotlb_create_debugfs()YueHaibing1-1/+1
There is no need to have the 'struct dentry *d_swiotlb_usage' variable static since new value always be assigned before use it. Signed-off-by: YueHaibing <[email protected]> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
2019-02-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller5-23/+121
The netfilter conflicts were rather simple overlapping changes. However, the cls_tcindex.c stuff was a bit more complex. On the 'net' side, Cong is fixing several races and memory leaks. Whilst on the 'net-next' side we have Vlad adding the rtnl-ness support. What I've decided to do, in order to resolve this, is revert the conversion over to using a workqueue that Cong did, bringing us back to pure RCU. I did it this way because I believe that either Cong's races don't apply with have Vlad did things, or Cong will have to implement the race fix slightly differently. Signed-off-by: David S. Miller <[email protected]>
2019-02-15cgroup, rstat: Don't flush subtree root unless necessaryTejun Heo1-4/+6
cgroup_rstat_cpu_pop_updated() is used to traverse the updated cgroups on flush. While it was only visiting updated ones in the subtree, it was visiting @root unconditionally. We can easily check whether @root is updated or not by looking at its ->updated_next just as with the cgroups in the subtree. * Remove the unnecessary cgroup_parent() test. The system root cgroup is never updated and thus its ->updated_next is always NULL. No need to test whether cgroup_parent() exists in addition to ->updated_next. * Terminate traverse if ->updated_next is NULL. This can only happen for subtree @root and there's no reason to visit it if it's not marked updated. This reduces cpu consumption when reading a lot of rstat backed files. In a micro benchmark reading stat from ~1600 cgroups, the sys time was lowered by >40%. Signed-off-by: Tejun Heo <[email protected]>
2019-02-15uprobes: convert uprobe.ref to refcount_tElena Reshetova1-4/+4
atomic_t variables are currently used to implement reference counters with the following properties: - counter is initialized to 1 using atomic_set() - a resource is freed upon counter reaching zero - once counter reaches zero, its further increments aren't allowed - counter schema uses basic atomic operations (set, inc, inc_not_zero, dec_and_test, etc.) Such atomic variables should be converted to a newly provided refcount_t type and API that prevents accidental counter overflows and underflows. This is important since overflows and underflows can lead to use-after-free situation and be exploitable. The variable uprobe.ref is used as pure reference counter. Convert it to refcount_t and fix up the operations. **Important note for maintainers: Some functions from refcount_t API defined in lib/refcount.c have different memory ordering guarantees than their atomic counterparts. The full comparison can be seen in https://lkml.org/lkml/2017/11/15/57 and it is hopefully soon in state to be merged to the documentation tree. Normally the differences should not matter since refcount_t provides enough guarantees to satisfy the refcounting use cases, but in some rare cases it might matter. Please double check that you don't have some undocumented memory guarantees for this variable usage. For the uprobe.ref it might make a difference in following places: - put_uprobe(): decrement in refcount_dec_and_test() only provides RELEASE ordering and control dependency on success vs. fully ordered atomic counterpart Link: http://lkml.kernel.org/r/[email protected] Suggested-by: Kees Cook <[email protected]> Acked-by: Oleg Nesterov <[email protected]> Reviewed-by: David Windsor <[email protected]> Reviewed-by: Hans Liljestrand <[email protected]> Reviewed-by: Srikar Dronamraju <[email protected]> Signed-off-by: Elena Reshetova <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2019-02-15ftrace: Allow enabling of filters via index of available_filter_functionsSteven Rostedt (VMware)3-0/+36
Enabling of large number of functions by echoing in a large subset of the functions in available_filter_functions can take a very long time. The process requires testing all functions registered by the function tracer (which is in the 10s of thousands), and doing a kallsyms lookup to convert the ip address into a name, then comparing that name with the string passed in. When a function causes the function tracer to crash the system, a binary bisect of the available_filter_functions can be done to find the culprit. But this requires passing in half of the functions in available_filter_functions over and over again, which makes it basically a O(n^2) operation. With 40,000 functions, that ends up bing 1,600,000,000 opertions! And enabling this can take over 20 minutes. As a quick speed up, if a number is passed into one of the filter files, instead of doing a search, it just enables the function at the corresponding line of the available_filter_functions file. That is: # echo 50 > set_ftrace_filter # cat set_ftrace_filter x86_pmu_commit_txn # head -50 available_filter_functions | tail -1 x86_pmu_commit_txn This allows setting of half the available_filter_functions to take place in less than a second! # time seq 20000 > set_ftrace_filter real 0m0.042s user 0m0.005s sys 0m0.015s # wc -l set_ftrace_filter 20000 set_ftrace_filter Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2019-02-15tracing: Fix number of entries in trace headerQuentin Perret1-0/+2
The following commit 441dae8f2f29 ("tracing: Add support for display of tgid in trace output") removed the call to print_event_info() from print_func_help_header_irq() which results in the ftrace header not reporting the number of entries written in the buffer. As this wasn't the original intent of the patch, re-introduce the call to print_event_info() to restore the orginal behaviour. Link: http://lkml.kernel.org/r/[email protected] Acked-by: Joel Fernandes <[email protected]> Cc: [email protected] Fixes: 441dae8f2f29 ("tracing: Add support for display of tgid in trace output") Signed-off-by: Quentin Perret <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2019-02-15kprobe: Do not use uaccess functions to access kernel memory that can faultChangbin Du1-9/+1
The userspace can ask kprobe to intercept strings at any memory address, including invalid kernel address. In this case, fetch_store_strlen() would crash since it uses general usercopy function, and user access functions are no longer allowed to access kernel memory. For example, we can crash the kernel by doing something as below: $ sudo kprobe 'p:do_sys_open +0(+0(%si)):string' [ 103.620391] BUG: GPF in non-whitelisted uaccess (non-canonical address?) [ 103.622104] general protection fault: 0000 [#1] SMP PTI [ 103.623424] CPU: 10 PID: 1046 Comm: cat Not tainted 5.0.0-rc3-00130-gd73aba1-dirty #96 [ 103.625321] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-2-g628b2e6-dirty-20190104_103505-linux 04/01/2014 [ 103.628284] RIP: 0010:process_fetch_insn+0x1ab/0x4b0 [ 103.629518] Code: 10 83 80 28 2e 00 00 01 31 d2 31 ff 48 8b 74 24 28 eb 0c 81 fa ff 0f 00 00 7f 1c 85 c0 75 18 66 66 90 0f ae e8 48 63 ca 89 f8 <8a> 0c 31 66 66 90 83 c2 01 84 c9 75 dc 89 54 24 34 89 44 24 28 48 [ 103.634032] RSP: 0018:ffff88845eb37ce0 EFLAGS: 00010246 [ 103.635312] RAX: 0000000000000000 RBX: ffff888456c4e5a8 RCX: 0000000000000000 [ 103.637057] RDX: 0000000000000000 RSI: 2e646c2f6374652f RDI: 0000000000000000 [ 103.638795] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [ 103.640556] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000 [ 103.642297] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 103.644040] FS: 0000000000000000(0000) GS:ffff88846f000000(0000) knlGS:0000000000000000 [ 103.646019] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 103.647436] CR2: 00007ffc79758038 CR3: 0000000463360006 CR4: 0000000000020ee0 [ 103.649147] Call Trace: [ 103.649781] ? sched_clock_cpu+0xc/0xa0 [ 103.650747] ? do_sys_open+0x5/0x220 [ 103.651635] kprobe_trace_func+0x303/0x380 [ 103.652645] ? do_sys_open+0x5/0x220 [ 103.653528] kprobe_dispatcher+0x45/0x50 [ 103.654682] ? do_sys_open+0x1/0x220 [ 103.655875] kprobe_ftrace_handler+0x90/0xf0 [ 103.657282] ftrace_ops_assist_func+0x54/0xf0 [ 103.658564] ? __call_rcu+0x1dc/0x280 [ 103.659482] 0xffffffffc00000bf [ 103.660384] ? __ia32_sys_open+0x20/0x20 [ 103.661682] ? do_sys_open+0x1/0x220 [ 103.662863] do_sys_open+0x5/0x220 [ 103.663988] do_syscall_64+0x60/0x210 [ 103.665201] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 103.666862] RIP: 0033:0x7fc22fadccdd [ 103.668034] Code: 48 89 54 24 e0 41 83 e2 40 75 32 89 f0 25 00 00 41 00 3d 00 00 41 00 74 24 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 77 33 f3 c3 66 0f 1f 84 00 00 00 00 00 48 8d 44 [ 103.674029] RSP: 002b:00007ffc7972c3a8 EFLAGS: 00000287 ORIG_RAX: 0000000000000101 [ 103.676512] RAX: ffffffffffffffda RBX: 0000562f86147a21 RCX: 00007fc22fadccdd [ 103.678853] RDX: 0000000000080000 RSI: 00007fc22fae1428 RDI: 00000000ffffff9c [ 103.681151] RBP: ffffffffffffffff R08: 0000000000000000 R09: 0000000000000000 [ 103.683489] R10: 0000000000000000 R11: 0000000000000287 R12: 00007fc22fce90a8 [ 103.685774] R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000 [ 103.688056] Modules linked in: [ 103.689131] ---[ end trace 43792035c28984a1 ]--- This can be fixed by using probe_mem_read() instead, as it can handle faulting kernel memory addresses, which kprobes can legitimately do. Link: http://lkml.kernel.org/r/[email protected] Cc: [email protected] Fixes: 9da3f2b7405 ("x86/fault: BUG() when uaccess helpers fault on kernel addresses") Signed-off-by: Changbin Du <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2019-02-15Merge branch 'for-linus' of ↵Linus Torvalds1-2/+5
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull signal fix from Eric Biederman: "Just a single patch that restores PTRACE_EVENT_EXIT functionality that was accidentally broken by last weeks fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: signal: Restore the stop PTRACE_EVENT_EXIT
2019-02-14Merge branch 'linus' into irq/coreThomas Gleixner24-92/+242
Pick up upstream changes to avoid conflicts for pending patches.
2019-02-14genirq: Fix wrong name in request_percpu_nmi() descriptionJulien Thierry1-2/+2
ready_percpu_nmi() was the previous name of prepare_percpu_nmi(). Update request_percpu_nmi() comment with the correct function name. Signed-off-by: Julien Thierry <[email protected]> Reported-by: Li Wei <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Marc Zyngier <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2019-02-13Merge tag 'trace-v5.0-rc4' of ↵Linus Torvalds1-2/+4
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "This fixes kprobes/uprobes dynamic processing of strings, where it processes the args but does not update the remaining length of the buffer that the string arguments will be placed in. It constantly passes in the total size of buffer used instead of passing in the remaining size of the buffer used. This could cause issues if the strings are larger than the max size of an event which could cause the strings to be written beyond what was reserved on the buffer" * tag 'trace-v5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: probeevent: Correctly update remaining space in dynamic area
2019-02-13dma-mapping: remove an incorrect __iommem annotationChristoph Hellwig1-1/+1
memmap return a regular void pointer, not and __iomem one. Signed-off-by: Christoph Hellwig <[email protected]>
2019-02-13dma-mapping: add a kconfig symbol for arch_teardown_dma_ops availabilityChristoph Hellwig1-0/+3
Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Catalin Marinas <[email protected]> # arm64
2019-02-13dma-mapping: add a kconfig symbol for arch_setup_dma_ops availabilityChristoph Hellwig1-0/+3
Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Paul Burton <[email protected]> # MIPS Acked-by: Catalin Marinas <[email protected]> # arm64
2019-02-13dma-mapping: move debug configuration options to kernel/dmaAndy Shevchenko1-0/+36
This is a follow up to the commit cf65a0f6f6ff ("dma-mapping: move all DMA mapping code to kernel/dma") which moved source code of DMA API to kernel/dma folder. Since there is no file left in the lib that require DMA API debugging options move the latter to kernel/dma as well. Cc: Christoph Hellwig <[email protected]> Signed-off-by: Andy Shevchenko <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2019-02-13signal: Restore the stop PTRACE_EVENT_EXITEric W. Biederman1-2/+5
In the middle of do_exit() there is there is a call "ptrace_event(PTRACE_EVENT_EXIT, code);" That call places the process in TACKED_TRACED aka "(TASK_WAKEKILL | __TASK_TRACED)" and waits for for the debugger to release the task or SIGKILL to be delivered. Skipping past dequeue_signal when we know a fatal signal has already been delivered resulted in SIGKILL remaining pending and TIF_SIGPENDING remaining set. This in turn caused the scheduler to not sleep in PTACE_EVENT_EXIT as it figured a fatal signal was pending. This also caused ptrace_freeze_traced in ptrace_check_attach to fail because it left a per thread SIGKILL pending which is what fatal_signal_pending tests for. This difference in signal state caused strace to report strace: Exit of unknown pid NNNNN ignored Therefore update the signal handling state like dequeue_signal would when removing a per thread SIGKILL, by removing SIGKILL from the per thread signal mask and clearing TIF_SIGPENDING. Acked-by: Oleg Nesterov <[email protected]> Reported-by: Oleg Nesterov <[email protected]> Reported-by: Ivan Delalande <[email protected]> Cc: [email protected] Fixes: 35634ffa1751 ("signal: Always notice exiting tasks") Signed-off-by: "Eric W. Biederman" <[email protected]>
2019-02-13genirq: introduce irq_chip_mask_ack_parent()Linus Walleij1-0/+11
The hierarchical irqchip never before ran into a situation where the parent is not "simple", i.e. does not implement .irq_ack() and .irq_mask() like most, but the qcom-pm8xxx.c happens to implement only .irq_mask_ack(). Since we want to make ssbi-gpio a hierarchical child of this irqchip, it must *also* only implement .irq_mask_ack() and call down to the parent, and for this we of course need irq_chip_mask_ack_parent(). Cc: Marc Zyngier <[email protected]> Cc: Thomas Gleixner <[email protected]> Acked-by: Marc Zyngier <[email protected]> Signed-off-by: Brian Masney <[email protected]> Signed-off-by: Linus Walleij <[email protected]>
2019-02-13genirq: introduce irq_domain_translate_twocellBrian Masney1-11/+34
Add a new function irq_domain_translate_twocell() that is to be used as the translate function in struct irq_domain_ops for the v2 IRQ API. This patch also changes irq_domain_xlate_twocell() from the v1 IRQ API to call irq_domain_translate_twocell() in the v2 IRQ API. This required changes to of_phandle_args_to_fwspec()'s arguments so that it can be called from multiple places. Cc: Thomas Gleixner <[email protected]> Reviewed-by: Marc Zyngier <[email protected]> Signed-off-by: Brian Masney <[email protected]> Signed-off-by: Linus Walleij <[email protected]>
2019-02-13Merge branch 'rcu-next' of ↵Ingo Molnar21-691/+409
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull the latest RCU tree from Paul E. McKenney: - Additional cleanups after RCU flavor consolidation - Grace-period forward-progress cleanups and improvements - Documentation updates - Miscellaneous fixes - spin_is_locked() conversions to lockdep - SPDX changes to RCU source and header files - SRCU updates - Torture-test updates, including nolibc updates and moving nolibc to tools/include Signed-off-by: Ingo Molnar <[email protected]>
2019-02-13sched/fair: Use non-atomic cpumask_{set,clear}_cpu()Viresh Kumar2-4/+4
The cpumasks updated here are not subject to concurrency and using atomic bitops for them is pointless and expensive. Use the non-atomic variants instead. Suggested-by: Peter Zijlstra <[email protected]> Signed-off-by: Viresh Kumar <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vincent Guittot <[email protected]> Link: http://lkml.kernel.org/r/2e2a10f84b9049a81eef94ed6d5989447c21e34a.1549963617.git.viresh.kumar@linaro.org Signed-off-by: Ingo Molnar <[email protected]>
2019-02-13kprobes: Prohibit probing on lockdep functionsMasami Hiramatsu1-1/+6
Some lockdep functions can be involved in breakpoint handling and probing on those functions can cause a breakpoint recursion. Prohibit probing on those functions by blacklist. Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andrea Righi <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/154998810578.31052.1680977921449292812.stgit@devbox Signed-off-by: Ingo Molnar <[email protected]>
2019-02-13kprobes: Prohibit probing on RCU debug routineMasami Hiramatsu1-0/+2
Since kprobe itself depends on RCU, probing on RCU debug routine can cause recursive breakpoint bugs. Prohibit probing on RCU debug routines. int3 ->do_int3() ->ist_enter() ->RCU_LOCKDEP_WARN() ->debug_lockdep_rcu_enabled() -> int3 Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andrea Righi <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/154998807741.31052.11229157537816341591.stgit@devbox Signed-off-by: Ingo Molnar <[email protected]>
2019-02-13kprobes: Prohibit probing on hardirq tracersMasami Hiramatsu2-2/+12
Since kprobes breakpoint handling involves hardirq tracer, probing these functions cause breakpoint recursion problem. Prohibit probing on those functions. Signed-off-by: Masami Hiramatsu <[email protected]> Acked-by: Steven Rostedt (VMware) <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andrea Righi <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/154998802073.31052.17255044712514564153.stgit@devbox Signed-off-by: Ingo Molnar <[email protected]>
2019-02-13kprobes: Search non-suffixed symbol in blacklistMasami Hiramatsu1-1/+20
Newer GCC versions can generate some different instances of a function with suffixed symbols if the function is optimized and only has a part of that. (e.g. .constprop, .part etc.) In this case, it is not enough to check the entry of kprobe blacklist because it only records non-suffixed symbol address. To fix this issue, search non-suffixed symbol in blacklist if given address is within a symbol which has a suffix. Note that this can cause false positive cases if a kprobe-safe function is optimized to suffixed instance and has same name symbol which is blacklisted. But I would like to chose a fail-safe design for this issue. Signed-off-by: Masami Hiramatsu <[email protected]> Reviewed-by: Steven Rostedt (VMware) <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andrea Righi <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/154998799234.31052.6136378903570418008.stgit@devbox Signed-off-by: Ingo Molnar <[email protected]>
2019-02-13x86/kprobes: Prohibit probing on functions before kprobe_int3_handler()Masami Hiramatsu1-0/+2
Prohibit probing on the functions called before kprobe_int3_handler() in do_int3(). More specifically, ftrace_int3_handler(), poke_int3_handler(), and ist_enter(). And since rcu_nmi_enter() is called by ist_enter(), it also should be marked as NOKPROBE_SYMBOL. Since those are handled before kprobe_int3_handler(), probing those functions can cause a breakpoint recursion and crash the kernel. Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andrea Righi <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Mathieu Desnoyers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/154998793571.31052.11301258949601150994.stgit@devbox Signed-off-by: Ingo Molnar <[email protected]>
2019-02-13perf/core: Fix impossible ring-buffer sizes warningIngo Molnar1-1/+1
The following commit: 9dff0aa95a32 ("perf/core: Don't WARN() for impossible ring-buffer sizes") results in perf recording failures with larger mmap areas: root@skl:/tmp# perf record -g -a failed to mmap with 12 (Cannot allocate memory) The root cause is that the following condition is buggy: if (order_base_2(size) >= MAX_ORDER) goto fail; The problem is that @size is in bytes and MAX_ORDER is in pages, so the right test is: if (order_base_2(size) >= PAGE_SHIFT+MAX_ORDER) goto fail; Fix it. Reported-by: "Jin, Yao" <[email protected]> Bisected-by: Borislav Petkov <[email protected]> Analyzed-by: Peter Zijlstra <[email protected]> Cc: Julien Thierry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: <[email protected]> Fixes: 9dff0aa95a32 ("perf/core: Don't WARN() for impossible ring-buffer sizes") Signed-off-by: Ingo Molnar <[email protected]>
2019-02-12audit: mark expected switch fall-throughGustavo A. R. Silva1-1/+1
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. This patch fixes the following warning: kernel/auditfilter.c: In function ‘audit_krule_to_data’: kernel/auditfilter.c:668:7: warning: this statement may fall through [-Wimplicit-fallthrough=] if (krule->pflags & AUDIT_LOGINUID_LEGACY && !f->val) { ^ kernel/auditfilter.c:674:3: note: here default: ^~~~~~~ Warning level 3 was used: -Wimplicit-fallthrough=3 Notice that, in this particular case, the code comment is modified in accordance with what GCC is expecting to find. This patch is part of the ongoing efforts to enable -Wimplicit-fallthrough. Signed-off-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: Paul Moore <[email protected]>
2019-02-12swiotlb: checking whether swiotlb buffer is full with io_tlb_usedDongli Zhang1-0/+4
This patch uses io_tlb_used to help check whether swiotlb buffer is full. io_tlb_used is no longer used for only debugfs. It is also used to help optimize swiotlb_tbl_map_single(). Suggested-by: Joe Jin <[email protected]> Signed-off-by: Dongli Zhang <[email protected]> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>