aboutsummaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)AuthorFilesLines
2018-01-30Merge tag v4.15 of ↵Jason Gunthorpe9-78/+148
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git To resolve conflicts in: drivers/infiniband/hw/mlx5/main.c drivers/infiniband/hw/mlx5/qp.c From patches merged into the -rc cycle. The conflict resolution matches what linux-next has been carrying. Signed-off-by: Jason Gunthorpe <[email protected]>
2018-01-29Merge branch 'for-4.16/block' of git://git.kernel.dk/linux-blockLinus Torvalds3-1/+132
Pull block updates from Jens Axboe: "This is the main pull request for block IO related changes for the 4.16 kernel. Nothing major in this pull request, but a good amount of improvements and fixes all over the map. This contains: - BFQ improvements, fixes, and cleanups from Angelo, Chiara, and Paolo. - Support for SMR zones for deadline and mq-deadline from Damien and Christoph. - Set of fixes for bcache by way of Michael Lyle, including fixes from himself, Kent, Rui, Tang, and Coly. - Series from Matias for lightnvm with fixes from Hans Holmberg, Javier, and Matias. Mostly centered around pblk, and the removing rrpc 1.2 in preparation for supporting 2.0. - A couple of NVMe pull requests from Christoph. Nothing major in here, just fixes and cleanups, and support for command tracing from Johannes. - Support for blk-throttle for tracking reads and writes separately. From Joseph Qi. A few cleanups/fixes also for blk-throttle from Weiping. - Series from Mike Snitzer that enables dm to register its queue more logically, something that's alwways been problematic on dm since it's a stacked device. - Series from Ming cleaning up some of the bio accessor use, in preparation for supporting multipage bvecs. - Various fixes from Ming closing up holes around queue mapping and quiescing. - BSD partition fix from Richard Narron, fixing a problem where we can't mount newer (10/11) FreeBSD partitions. - Series from Tejun reworking blk-mq timeout handling. The previous scheme relied on atomic bits, but it had races where we would think a request had timed out if it to reused at the wrong time. - null_blk now supports faking timeouts, to enable us to better exercise and test that functionality separately. From me. - Kill the separate atomic poll bit in the request struct. After this, we don't use the atomic bits on blk-mq anymore at all. From me. - sgl_alloc/free helpers from Bart. - Heavily contended tag case scalability improvement from me. - Various little fixes and cleanups from Arnd, Bart, Corentin, Douglas, Eryu, Goldwyn, and myself" * 'for-4.16/block' of git://git.kernel.dk/linux-block: (186 commits) block: remove smart1,2.h nvme: add tracepoint for nvme_complete_rq nvme: add tracepoint for nvme_setup_cmd nvme-pci: introduce RECONNECTING state to mark initializing procedure nvme-rdma: remove redundant boolean for inline_data nvme: don't free uuid pointer before printing it nvme-pci: Suspend queues after deleting them bsg: use pr_debug instead of hand crafted macros blk-mq-debugfs: don't allow write on attributes with seq_operations set nvme-pci: Fix queue double allocations block: Set BIO_TRACE_COMPLETION on new bio during split blk-throttle: use queue_is_rq_based block: Remove kblockd_schedule_delayed_work{,_on}() blk-mq: Avoid that blk_mq_delay_run_hw_queue() introduces unintended delays blk-mq: Rename blk_mq_request_direct_issue() into blk_mq_request_issue_directly() lib/scatterlist: Fix chaining support in sgl_alloc_order() blk-throttle: track read and write request individually block: add bdev_read_only() checks to common helpers block: fail op_is_write() requests to read-only partitions blk-throttle: export io_serviced_recursive, io_service_bytes_recursive ...
2018-01-29Merge tag 'mfd-next-4.16' of ↵Linus Torvalds1-1/+57
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd Pull MFD updates from Lee Jones: "New Drivers: - Add support for RAVE Supervisory Processor Moved drivers: - Move Realtek Card Reader Driver to Misc New Device Support: - Add support for Pinctrl to axp20x New Functionality: - Add resume support to atmel-flexcom Fix-ups: - Split MFD (mfd) and userspace handlers (platform) in cros_ec - Fix trivial (whitespace, spelling) issue(s) in pcf50633-core - Clean-up error handling in ab8500-debugfs - General tidying up in tmio_core - Kconfig fix-ups for qcom-pm8xxx - Licensing changes (SPDX) to stm32-lptimer, stm32-timers - Device Tree fixups in mc13xxx - Simplify/remove unused code in cros_ec_spi, axp20x, ti_am335x_tscadc, kempld-core, intel_soc_pmic_core.c, ab8500-debugfs" * tag 'mfd-next-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (32 commits) mfd: lpc_ich: Do not touch SPI-NOR write protection bit on Apollo Lake mfd: axp20x: Mark axp288 CHRG_BAK_CTRL register volatile mfd: ab8500: Introduce DEFINE_SHOW_ATTRIBUTE() macro atmel_flexcom: Support resuming after a chip reset mfd: Remove duplicate includes dt-bindings: mfd: mc13xxx: Add the unit address to sysled mfd: stm32: Adopt SPDX identifier mfd: axp20x: Add pinctrl cell for AXP813 mfd: pm8xxx: Make elegible for COMPILE_TEST mfd: kempld-core: Use resource_size function on resource object mfd: tmio: Move register macros to tmio_core.c mfd: cros ec: spi: Simplify delay handling between SPI messages mfd: palmas: Assign the right powerhold mask for tps65917 mfd: ab8500-debugfs: Use common error handling code in ab8500_print_modem_registers() mfd: ti_am335x_tscadc: Remove redundant assignment to node mfd: pcf50633: Fix spelling mistake: 'Falied' -> 'Failed' dt-bindings: watchdog: Add bindings for RAVE SP watchdog driver watchdog: Add RAVE SP watchdog driver mfd: Add driver for RAVE Supervisory Processor serdev: Introduce devm_serdev_device_open() ...
2018-01-26bpf: add further test cases around div/mod and othersDaniel Borkmann1-2/+6
Update selftests to relfect recent changes and add various new test cases. Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2018-01-26PCI: Add SPDX GPL-2.0 when no license was specifiedBjorn Helgaas1-0/+1
b24413180f56 ("License cleanup: add SPDX GPL-2.0 license identifier to files with no license") added SPDX GPL-2.0 to several PCI files that previously contained no license information. Add SPDX GPL-2.0 to all other PCI files that did not contain any license information and hence were under the default GPL version 2 license of the kernel. Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Greg Kroah-Hartman <[email protected]>
2018-01-23vsprintf: Do not have bprintf dereference pointersSteven Rostedt (VMware)1-13/+69
When trace_printk() was introduced, it was discussed that making it be as low overhead as possible, that the processing of the format string should be delayed until it is read. That is, a "trace_printk()" should not convert the %d into numbers and so on, but instead, save the fmt string and all the args in the buffer at the time of recording. When the trace_printk() data is read, it would then parse the format string and do the conversions of the saved arguments in the tracing buffer. The code to perform this was added to vsprintf where vbin_printf() would save the arguments of a specified format string in a buffer, then bstr_printf() could be used to convert the buffer with the same format string into the final output, as if vsprintf() was called in one go. The issue arises when dereferenced pointers are used. The problem is that something like %*pbl which reads a bitmask, will save the pointer to the bitmask in the buffer. Then the reading of the buffer via bstr_printf() will then look at the pointer to process the final output. Obviously the value of that pointer could have changed since the time it was recorded to the time the buffer is read. Worse yet, the bitmask could be unmapped, and the reading of the trace buffer could actually cause a kernel oops. Another problem is that user space tools such as perf and trace-cmd do not have access to the contents of these pointers, and they become useless when the tracing buffer is extracted. Instead of having vbin_printf() simply save the pointer in the buffer for later processing, have it perform the formatting at the time bin_printf() is called. This will fix the issue of dereferencing pointers at a later time, and has the extra benefit of having user space tools understand these values. Since perf and trace-cmd already can handle %p[sSfF] via saving kallsyms, their pointers are saved and not processed during vbin_printf(). If they were converted, it would break perf and trace-cmd, as they would not know how to deal with the conversion. Link: http://lkml.kernel.org/r/[email protected] Reported-by: Thomas Gleixner <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2018-01-23kobject: Export kobj_ns_grab_current() and kobj_ns_drop()Bart Van Assche1-0/+2
Make it possible to call these two functions from a kernel module. Note: despite their name, these two functions can be used meaningfully independent of kobjects. A later patch will add calls to these functions from the SRP driver because this patch series modifies the SRP driver such that it can hold a reference to a namespace that can last longer than the lifetime of the process through which the namespace reference was obtained. Signed-off-by: Bart Van Assche <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2018-01-22test_firmware: fix missing unlock on error in config_num_requests_store()Wei Yongjun1-0/+1
Add the missing unlock before return from function config_num_requests_store() in the error handling case. Fixes: c92316bf8e94 ("test_firmware: add batched firmware tests") Cc: stable <[email protected]> Signed-off-by: Wei Yongjun <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2018-01-22test_firmware: make local symbol test_fw_config staticWei Yongjun1-1/+1
Fixes the following sparse warnings: lib/test_firmware.c:99:20: warning: symbol 'test_fw_config' was not declared. Should it be static? Signed-off-by: Wei Yongjun <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2018-01-22Merge branch 'for-4.16-print-symbol' into for-4.16Petr Mladek1-2/+1
2018-01-19bpf: add couple of test cases for signed extended immsDaniel Borkmann1-0/+104
Add a couple of test cases for interpreter and JIT that are related to an issue we faced some time ago in Cilium [1], which is fixed in LLVM with commit e53750e1e086 ("bpf: fix bug on silently truncating 64-bit immediate"). Test cases were run-time checking kernel to behave as intended which should also provide some guidance for current or new JITs in case they should trip over this. Added for cBPF and eBPF. [1] https://github.com/cilium/cilium/pull/2162 Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2018-01-19lib/scatterlist: Fix chaining support in sgl_alloc_order()Bart Van Assche1-5/+27
This patch avoids that workloads with large block sizes (megabytes) can trigger the following call stack with the ib_srpt driver (that driver is the only driver that chains scatterlists allocated by sgl_alloc_order()): BUG: Bad page state in process kworker/0:1H pfn:2423a78 page:fffffb03d08e9e00 count:-3 mapcount:0 mapping: (null) index:0x0 flags: 0x57ffffc0000000() raw: 0057ffffc0000000 0000000000000000 0000000000000000 fffffffdffffffff raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000 page dumped because: nonzero _count CPU: 0 PID: 733 Comm: kworker/0:1H Tainted: G I 4.15.0-rc7.bart+ #1 Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015 Workqueue: ib-comp-wq ib_cq_poll_work [ib_core] Call Trace: dump_stack+0x5c/0x83 bad_page+0xf5/0x10f get_page_from_freelist+0xa46/0x11b0 __alloc_pages_nodemask+0x103/0x290 sgl_alloc_order+0x101/0x180 target_alloc_sgl+0x2c/0x40 [target_core_mod] srpt_alloc_rw_ctxs+0x173/0x2d0 [ib_srpt] srpt_handle_new_iu+0x61e/0x7f0 [ib_srpt] __ib_process_cq+0x55/0xa0 [ib_core] ib_cq_poll_work+0x1b/0x60 [ib_core] process_one_work+0x141/0x340 worker_thread+0x47/0x3e0 kthread+0xf5/0x130 ret_from_fork+0x1f/0x30 Fixes: e80a0af4759a ("lib/scatterlist: Introduce sgl_alloc() and sgl_free()") Reported-by: Laurence Oberman <[email protected]> Tested-by: Laurence Oberman <[email protected]> Signed-off-by: Bart Van Assche <[email protected]> Cc: Nicholas A. Bellinger <[email protected]> Cc: Laurence Oberman <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-01-15swiotlb: remove various exportsChristoph Hellwig1-13/+0
All these symbols are only used by arch dma_ops implementations or xen-swiotlb. None of which can be modular. Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Christian König <[email protected]>
2018-01-15swiotlb: refactor coherent buffer allocationChristoph Hellwig1-57/+65
Factor out a new swiotlb_alloc_buffer helper that allocates DMA coherent memory from the swiotlb bounce buffer. This allows to simplify the swiotlb_alloc implemenation that uses dma_direct_alloc to try to allocate a reachable buffer first. Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Christian König <[email protected]>
2018-01-15swiotlb: refactor coherent buffer freeingChristoph Hellwig1-14/+21
Factor out a new swiotlb_free_buffer helper that checks if an address is allocated from the swiotlb bounce buffer, and if yes frees it. This allows to simplify the swiotlb_free implemenation that uses dma_direct_free to free the non-bounce buffer allocations. Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Christian König <[email protected]>
2018-01-15swiotlb: wire up ->dma_supported in swiotlb_dma_opsChristoph Hellwig1-0/+1
To properly reject too small DMA masks based on the addressability of the bounce buffer. Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Christian König <[email protected]>
2018-01-15swiotlb: add common swiotlb_map_opsChristoph Hellwig1-0/+43
Currently all architectures that want to use swiotlb have to implement their own dma_map_ops instances. Provide a generic one based on the x86 implementation which first calls into dma_direct to try a full blown direct mapping implementation (including e.g. CMA) before falling back allocating from the swiotlb buffer. Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Christian König <[email protected]>
2018-01-15swiotlb: rename swiotlb_free to swiotlb_exitChristoph Hellwig1-1/+1
Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Christian König <[email protected]> Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
2018-01-15swiotlb: suppress warning when __GFP_NOWARN is setChristian König1-6/+9
TTM tries to allocate coherent memory in chunks of 2MB first to improve TLB efficiency and falls back to allocating 4K pages if that fails. Suppress the warning when the 2MB allocations fails since there is a valid fall back path. Signed-off-by: Christian König <[email protected]> Reported-by: Mike Galbraith <[email protected]> Acked-by: Konrad Rzeszutek Wilk <[email protected]> Bug: https://bugs.freedesktop.org/show_bug.cgi?id=104082 CC: [email protected] Signed-off-by: Christoph Hellwig <[email protected]>
2018-01-15dma-direct: reject too small dma masksChristoph Hellwig1-0/+19
Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Robin Murphy <[email protected]>
2018-01-15dma-direct: make dma_direct_{alloc,free} available to other implementationsChristoph Hellwig1-3/+3
So that they don't need to indirect through the operation vector. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Vladimir Murzin <[email protected]>
2018-01-15dma-direct: retry allocations using GFP_DMA for small masksChristoph Hellwig1-1/+24
If an attempt to allocate memory succeeded, but isn't inside the supported DMA mask, retry the allocation with GFP_DMA set as a last resort. Based on the x86 code, but an off by one error in what is now dma_coherent_ok has been fixed vs the x86 code. Signed-off-by: Christoph Hellwig <[email protected]>
2018-01-15dma-direct: add support for allocation from ZONE_DMA and ZONE_DMA32Christoph Hellwig1-0/+14
This allows to dip into zones for lower memory if they are available. If one of the zones is not available the corresponding GFP_* flag will evaluate to 0 so they won't change anything. We provide an arch tunable for those architectures that do not use GFP_DMA for the lowest 24-bits, given that there are a few. Roughly based on the x86 code. Signed-off-by: Christoph Hellwig <[email protected]>
2018-01-15dma-direct: use node local allocations for coherent memoryChristoph Hellwig1-1/+1
To preserve the x86 behavior. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Robin Murphy <[email protected]>
2018-01-15dma-direct: add support for CMA allocationChristoph Hellwig1-6/+18
Try the CMA allocator for coherent allocations if supported. Roughly modelled after the x86 code. Signed-off-by: Christoph Hellwig <[email protected]>
2018-01-15dma-direct: add dma address sanity checksChristoph Hellwig1-1/+30
Roughly based on the x86 pci-nommu implementation. Signed-off-by: Christoph Hellwig <[email protected]>
2018-01-15dma-direct: use phys_to_dmaChristoph Hellwig1-11/+7
This means it uses whatever linear remapping scheme that the architecture provides is used in the generic dma_direct ops. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Vladimir Murzin <[email protected]>
2018-01-15dma-direct: rename dma_noop to dma_directChristoph Hellwig3-22/+17
The trivial direct mapping implementation already does a virtual to physical translation which isn't strictly a noop, and will soon learn to do non-direct but linear physical to dma translations through the device offset and a few small tricks. Rename it to a better fitting name. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Vladimir Murzin <[email protected]>
2018-01-12error-injection: Support fault injection frameworkMasami Hiramatsu1-0/+10
Support in-kernel fault-injection framework via debugfs. This allows you to inject a conditional error to specified function using debugfs interfaces. Here is the result of test script described in Documentation/fault-injection/fault-injection.txt =========== # ./test_fail_function.sh 1+0 records in 1+0 records out 1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.0227404 s, 46.1 MB/s btrfs-progs v4.4 See http://btrfs.wiki.kernel.org for more information. Label: (null) UUID: bfa96010-12e9-4360-aed0-42eec7af5798 Node size: 16384 Sector size: 4096 Filesystem size: 1001.00MiB Block group profiles: Data: single 8.00MiB Metadata: DUP 58.00MiB System: DUP 12.00MiB SSD detected: no Incompat features: extref, skinny-metadata Number of devices: 1 Devices: ID SIZE PATH 1 1001.00MiB /dev/loop2 mount: mount /dev/loop2 on /opt/tmpmnt failed: Cannot allocate memory SUCCESS! =========== Signed-off-by: Masami Hiramatsu <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2018-01-12error-injection: Add injectable error typesMasami Hiramatsu1-7/+36
Add injectable error types for each error-injectable function. One motivation of error injection test is to find software flaws, mistakes or mis-handlings of expectable errors. If we find such flaws by the test, that is a program bug, so we need to fix it. But if the tester miss input the error (e.g. just return success code without processing anything), it causes unexpected behavior even if the caller is correctly programmed to handle any errors. That is not what we want to test by error injection. To clarify what type of errors the caller must expect for each injectable function, this introduces injectable error types: - EI_ETYPE_NULL : means the function will return NULL if it fails. No ERR_PTR, just a NULL. - EI_ETYPE_ERRNO : means the function will return -ERRNO if it fails. - EI_ETYPE_ERRNO_NULL : means the function will return -ERRNO (ERR_PTR) or NULL. ALLOW_ERROR_INJECTION() macro is expanded to get one of NULL, ERRNO, ERRNO_NULL to record the error type for each function. e.g. ALLOW_ERROR_INJECTION(open_ctree, ERRNO) This error types are shown in debugfs as below. ==== / # cat /sys/kernel/debug/error_injection/list open_ctree [btrfs] ERRNO io_ctl_init [btrfs] ERRNO ==== Signed-off-by: Masami Hiramatsu <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2018-01-12error-injection: Separate error-injection from kprobeMasami Hiramatsu3-0/+218
Since error-injection framework is not limited to be used by kprobes, nor bpf. Other kernel subsystems can use it freely for checking safeness of error-injection, e.g. livepatch, ftrace etc. So this separate error-injection framework from kprobes. Some differences has been made: - "kprobe" word is removed from any APIs/structures. - BPF_ALLOW_ERROR_INJECTION() is renamed to ALLOW_ERROR_INJECTION() since it is not limited for BPF too. - CONFIG_FUNCTION_ERROR_INJECTION is the config item of this feature. It is automatically enabled if the arch supports error injection feature for kprobe or ftrace etc. Signed-off-by: Masami Hiramatsu <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2018-01-12crypto: chacha20 - use rol32() macro from bitops.hEric Biggers1-37/+32
For chacha20_block(), use the existing 32-bit left-rotate function instead of defining one ourselves. Signed-off-by: Eric Biggers <[email protected]> Signed-off-by: Herbert Xu <[email protected]>
2018-01-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-4/+7
BPF alignment tests got a conflict because the registers are output as Rn_w instead of just Rn in net-next, and in net a fixup for a testcase prohibits logical operations on pointers before using them. Also, we should attempt to patch BPF call args if JIT always on is enabled. Instead, if we fail to JIT the subprogs we should pass an error back up and fail immediately. Signed-off-by: David S. Miller <[email protected]>
2018-01-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller1-4/+7
Daniel Borkmann says: ==================== pull-request: bpf 2018-01-09 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Prevent out-of-bounds speculation in BPF maps by masking the index after bounds checks in order to fix spectre v1, and add an option BPF_JIT_ALWAYS_ON into Kconfig that allows for removing the BPF interpreter from the kernel in favor of JIT-only mode to make spectre v2 harder, from Alexei. 2) Remove false sharing of map refcount with max_entries which was used in spectre v1, from Daniel. 3) Add a missing NULL psock check in sockmap in order to fix a race, from John. 4) Fix test_align BPF selftest case since a recent change in verifier rejects the bit-wise arithmetic on pointers earlier but test_align update was missing, from Alexei. ==================== Signed-off-by: David S. Miller <[email protected]>
2018-01-10dma-mapping: move swiotlb arch helpers to a new headerChristoph Hellwig1-1/+1
phys_to_dma, dma_to_phys and dma_capable are helpers published by architecture code for use of swiotlb and xen-swiotlb only. Drivers are not supposed to use these directly, but use the DMA API instead. Move these to a new asm/dma-direct.h helper, included by a linux/dma-direct.h wrapper that provides the default linear mapping unless the architecture wants to override it. In the MIPS case the existing dma-coherent.h is reused for now as untangling it will take a bit of work. Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Robin Murphy <[email protected]>
2018-01-09bpf: introduce BPF_JIT_ALWAYS_ON configAlexei Starovoitov1-4/+7
The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715. A quote from goolge project zero blog: "At this point, it would normally be necessary to locate gadgets in the host kernel code that can be used to actually leak data by reading from an attacker-controlled location, shifting and masking the result appropriately and then using the result of that as offset to an attacker-controlled address for a load. But piecing gadgets together and figuring out which ones work in a speculation context seems annoying. So instead, we decided to use the eBPF interpreter, which is built into the host kernel - while there is no legitimate way to invoke it from inside a VM, the presence of the code in the host kernel's text section is sufficient to make it usable for the attack, just like with ordinary ROP gadgets." To make attacker job harder introduce BPF_JIT_ALWAYS_ON config option that removes interpreter from the kernel in favor of JIT-only mode. So far eBPF JIT is supported by: x64, arm64, arm32, sparc64, s390, powerpc64, mips64 The start of JITed program is randomized and code page is marked as read-only. In addition "constant blinding" can be turned on with net.core.bpf_jit_harden v2->v3: - move __bpf_prog_ret0 under ifdef (Daniel) v1->v2: - fix init order, test_bpf and cBPF (Daniel's feedback) - fix offloaded bpf (Jakub's feedback) - add 'return 0' dummy in case something can invoke prog->bpf_func - retarget bpf tree. For bpf-next the patch would need one extra hunk. It will be sent when the trees are merged back to net-next Considered doing: int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT; but it seems better to land the patch as-is and in bpf-next remove bpf_jit_enable global variable from all JITs, consolidate in one place and remove this jit_init() function. Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2018-01-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller3-8/+34
2018-01-09treewide: Use DEVICE_ATTR_RWJoe Perches2-18/+10
Convert DEVICE_ATTR uses to DEVICE_ATTR_RW where possible. Done with perl script: $ git grep -w --name-only DEVICE_ATTR | \ xargs perl -i -e 'local $/; while (<>) { s/\bDEVICE_ATTR\s*\(\s*(\w+)\s*,\s*\(?(\s*S_IRUGO\s*\|\s*S_IWUSR|\s*S_IWUSR\s*\|\s*S_IRUGO\s*|\s*0644\s*)\)?\s*,\s*\1_show\s*,\s*\1_store\s*\)/DEVICE_ATTR_RW(\1)/g; print;}' Signed-off-by: Joe Perches <[email protected]> Acked-by: Felipe Balbi <[email protected]> Acked-by: Andy Shevchenko <[email protected]> Acked-by: Bartlomiej Zolnierkiewicz <[email protected]> Acked-by: Zhang Rui <[email protected]> Acked-by: Jarkko Nikula <[email protected]> Acked-by: Jani Nikula <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2018-01-09symbol lookup: introduce dereference_symbol_descriptor()Sergey Senozhatsky1-3/+2
dereference_symbol_descriptor() invokes appropriate ARCH specific function descriptor dereference callbacks: - dereference_kernel_function_descriptor() if the pointer is a kernel symbol; - dereference_module_function_descriptor() if the pointer is a module symbol. This is the last step needed to make '%pS/%ps' smart enough to handle function descriptor dereference on affected ARCHs and to retire '%pF/%pf'. To refresh it: Some architectures (ia64, ppc64, parisc64) use an indirect pointer for C function pointers - the function pointer points to a function descriptor and we need to dereference it to get the actual function pointer. Function descriptors live in .opd elf section and all affected ARCHs (ia64, ppc64, parisc64) handle it properly for kernel and modules. So we, technically, can decide if the dereference is needed by simply looking at the pointer: if it belongs to .opd section then we need to dereference it. The kernel and modules have their own .opd sections, obviously, that's why we need to split dereference_function_descriptor() and use separate kernel and module dereference arch callbacks. Link: http://lkml.kernel.org/r/20171206043649.GB15885@jagdpanzerIV Cc: Fenghua Yu <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: James Bottomley <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Jessica Yu <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Tested-by: Tony Luck <[email protected]> #ia64 Tested-by: Santosh Sivaraj <[email protected]> #powerpc Tested-by: Helge Deller <[email protected]> #parisc64 Signed-off-by: Petr Mladek <[email protected]>
2018-01-08lib/crc-ccitt: Add CCITT-FALSE CRC16 variantAndrew Morton1-1/+57
In support of a soon to be published MFD driver using serdev to talk to a supervisory processor that uses the CCITT-FALSE CRC16 variant in it's protocol, this patch was tested successfully on an i.MX6 ARM platform. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Andrey Vostrikov <[email protected]> Signed-off-by: Andrey Smirnov <[email protected]> Tested-by: Chris Healy <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Lee Jones <[email protected]>
2018-01-06lib/scatterlist: Introduce sgl_alloc() and sgl_free()Bart Van Assche2-0/+109
Many kernel drivers contain code that allocates and frees both a scatterlist and the pages that populate that scatterlist. Introduce functions in lib/scatterlist.c that perform these tasks instead of duplicating this functionality in multiple drivers. Only include these functions in the build if CONFIG_SGL_ALLOC=y to avoid that the kernel size increases if this functionality is not used. Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-01-05Merge branch 'linus' of ↵Linus Torvalds1-1/+17
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This fixes the following issues: - racy use of ctx->rcvused in af_alg - algif_aead crash in chacha20poly1305 - freeing bogus pointer in pcrypt - build error on MIPS in mpi - memory leak in inside-secure - memory overwrite in inside-secure - NULL pointer dereference in inside-secure - state corruption in inside-secure - build error without CRYPTO_GF128MUL in chelsio - use after free in n2" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: inside-secure - do not use areq->result for partial results crypto: inside-secure - fix request allocations in invalidation path crypto: inside-secure - free requests even if their handling failed crypto: inside-secure - per request invalidation lib/mpi: Fix umul_ppmm() for MIPS64r6 crypto: pcrypt - fix freeing pcrypt instances crypto: n2 - cure use after free crypto: af_alg - Fix race around ctx->rcvused by making it atomic_t crypto: chacha20poly1305 - validate the digest size crypto: chelsio - select CRYPTO_GF128MUL
2018-01-05lib: do not use print_symbol()Sergey Senozhatsky1-2/+1
print_symbol() is a very old API that has been obsoleted by %pS format specifier in a normal printk() call. Replace print_symbol() with a direct printk("%pS") call. Link: http://lkml.kernel.org/r/[email protected] To: Andrew Morton <[email protected]> To: Russell King <[email protected]> To: Catalin Marinas <[email protected]> To: Mark Salter <[email protected]> To: Tony Luck <[email protected]> To: David Howells <[email protected]> To: Yoshinori Sato <[email protected]> To: Guan Xuetao <[email protected]> To: Borislav Petkov <[email protected]> To: Greg Kroah-Hartman <[email protected]> To: Thomas Gleixner <[email protected]> To: Peter Zijlstra <[email protected]> To: Vineet Gupta <[email protected]> To: Fengguang Wu <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Petr Mladek <[email protected]> Cc: LKML <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> [[email protected]: updated commit message] Signed-off-by: Petr Mladek <[email protected]>
2018-01-03Merge branch 'for-mingo' of ↵Ingo Molnar2-29/+16
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: - Updates to use cond_resched() instead of cond_resched_rcu_qs() where feasible (currently everywhere except in kernel/rcu and in kernel/torture.c). Also a couple of fixes to avoid sending IPIs to offline CPUs. - Updates to simplify RCU's dyntick-idle handling. - Updates to remove almost all uses of smp_read_barrier_depends() and read_barrier_depends(). - Miscellaneous fixes. - Torture-test updates. Signed-off-by: Ingo Molnar <[email protected]>
2018-01-02Merge 4.15-rc6 into driver-core-nextGreg Kroah-Hartman5-40/+70
We want the fixes in here as well. Signed-off-by: Greg Kroah-Hartman <[email protected]>
2018-01-01errseq: Add to documentation treeMatthew Wilcox1-16/+21
- Move errseq.rst into core-api - Add errseq to the core-api index - Promote the header to a more prominent header type, otherwise we get three entries in the table of contents. - Reformat the table to look nicer and be a little more proportional in terms of horizontal width per bit (the SF bit is still disproportionately large, but there's no way to fix that). - Include errseq kernel-doc in the errseq.rst - Neaten some kernel-doc markup Signed-off-by: Matthew Wilcox <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Reviewed-by: Randy Dunlap <[email protected]> Signed-off-by: Jonathan Corbet <[email protected]>
2017-12-31Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds1-3/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "A pile of fixes for long standing issues with the timer wheel and the NOHZ code: - Prevent timer base confusion accross the nohz switch, which can cause unlocked access and data corruption - Reinitialize the stale base clock on cpu hotplug to prevent subtle side effects including rollovers on 32bit - Prevent an interrupt storm when the timer softirq is already pending caused by tick_nohz_stop_sched_tick() - Move the timer start tracepoint to a place where it actually makes sense - Add documentation to timerqueue functions as they caused confusion several times now" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timerqueue: Document return values of timerqueue_add/del() timers: Invoke timer_start_debug() where it makes sense nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick() timers: Reinitialize per cpu bases on hotplug timers: Use deferrable base independent of base::nohz_active
2017-12-31Merge tag 'driver-core-4.15-rc6' of ↵Linus Torvalds1-4/+12
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are two driver core fixes for 4.15-rc6, resolving some reported issues. The first is a cacheinfo fix for DT based systems to resolve a reported issue that has been around for a while, and the other is to resolve a regression in the kobject uevent code that showed up in 4.15-rc1. Both have been in linux-next for a while with no reported issues" * tag 'driver-core-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: kobject: fix suppressing modalias in uevents delivered over netlink drivers: base: cacheinfo: fix cache type for non-architected system cache
2017-12-29timerqueue: Document return values of timerqueue_add/del()Thomas Gleixner1-3/+5
The return values of timerqueue_add/del() are not documented in the kernel doc comment. Add proper documentation. Signed-off-by: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Sebastian Siewior <[email protected]> Cc: [email protected] Cc: Paul McKenney <[email protected]> Cc: Anna-Maria Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2017-12-22blk-mq: improve heavily contended tag caseJens Axboe1-1/+1
Even with a number of waitqueues, we can get into a situation where we are heavily contended on the waitqueue lock. I got a report on spc1 where we're spending seconds doing this. Arguably the use case is nasty, I reproduce it with one device and 1000 threads banging on the device. But that doesn't mean we shouldn't be handling it better. What ends up happening is that a thread will fail to get a tag, add itself to the waitqueue, and subsequently get woken up when a tag is freed - only to find itself going back to sleep on the waitqueue. Instead of waking all threads, use an exclusive wait and wake up our sbitmap batch count instead. This seems to work well for me (massive improvement for this use case), and it survives basic testing. But I haven't fully verified it yet. An additional improvement is running the queue and checking for a new tag BEFORE needing to add ourselves to the waitqueue. Signed-off-by: Jens Axboe <[email protected]>