aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2019-07-11s390/dasd: Make dasd_setup_queue() a discipline functionJan Höppner7-80/+103
ECKD, FBA, and the DIAG discipline use slightly different block layer settings. In preparation of even more diverse queue settings, make dasd_setup_queue() a discipline function. Signed-off-by: Jan Höppner <[email protected]> Reviewed-by: Stefan Haberland <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-07-11s390/dasd: Add new ioctl to release spaceJan Höppner5-0/+364
Userspace tools might have the need to release space for Extent Space Efficient (ESE) volumes when working with such a device. Provide the necessarry interface for such a task by implementing a new ioctl BIODASDRAS. The ioctl uses the format_data_t data structure for data input: typedef struct format_data_t { unsigned int start_unit; /* from track */ unsigned int stop_unit; /* to track */ unsigned int blksize; /* sectorsize */ unsigned int intensity; } format_data_t; If the intensity is set to 0x40, start_unit and stop_unit are ignored and space for the entire volume is released. Otherwise, if intensity is set to 0, the respective range is released (if possible). Signed-off-by: Jan Höppner <[email protected]> Reviewed-by: Stefan Haberland <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-07-11s390/dasd: Add dasd_sleep_on_queue_interruptible()Jan Höppner2-0/+10
There is dasd_sleep_on() and dasd_sleep_on_interruptible() to start CCW requests uninterruptible and interruptible. However, there is only dasd_sleep_on_queue() to start requests from CCW queues uninterruptible. Add dasd_sleep_on_queue_interruptible() to provide a way to start requests from CCW queues interruptible. _dasd_sleep_on_queue() already provides this functionality. Signed-off-by: Jan Höppner <[email protected]> Reviewed-by: Stefan Haberland <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-07-11s390/dasd: Add missing intensity definitionJan Höppner1-0/+1
The definition for the bit that removes the write permission for record zero when formatting was missing. Add it to complete the list. Signed-off-by: Jan Höppner <[email protected]> Reviewed-by: Stefan Haberland <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-07-11s390/dasd: Fix whitespaceJan Höppner1-75/+75
Signed-off-by: Jan Höppner <[email protected]> Reviewed-by: Stefan Haberland <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-07-11s390/dasd: Add dynamic formatting support for ESE volumesJan Höppner3-14/+239
A dynamic formatting is issued whenever a write request returns with either a No Record Found error (Command Mode), Incorrect Length error (Transport Mode), or File Protected error (Transport Mode). All three cases mean that the tracks in question haven't been initialized in a desired format yet. The part of the volume that was tried to be written on is then formatted and the original request is re-queued. As the formatting will happen during normal I/O operations, it is quite likely that there won't be any memory available to build the respective request. Another two pages of memory are allocated per volume specifically for the dynamic formatting. The dasd_eckd_build_format() function is extended to make sure that the original startdev is reused. Also, all formatting and format check functions use the new memory pool exclusively now to reduce complexity. Read operations will always return zero data when unformatted areas are read. Signed-off-by: Jan Höppner <[email protected]> Reviewed-by: Stefan Haberland <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-07-11s390/dasd: Recognise data for ESE volumesJan Höppner4-4/+430
In order to work with Extent Space Efficient (ESE) volumes, certain viable information about those volumes and the corresponding extent pool (such as extent size, configured space, allocated space, etc.) can be provided. Use the CCW commands Volume Storage Query and Logical Configuration Query to receive detailed information about ESE volumes and the extent pool respectively. These information are made accessible via internal functions for subsequent users, and via sysfs attributes for userpsace usage. The new sysfs attributes reside in separate directories called capacity and extent_pool. attributes: ese: 0/1 depending on whether the volume is an ESE volume Capacity related attributes: space_allocated: Space currently allocated by the volume (in cyl) space_configured: Remaining space in the extent pool (in cyl) logical_capacity: The entire addressable space for this volume (in cyl) Extent Pool related attributes: pool_id: ID of the extent pool the volume in question resides in pool_oos: Extent pool is out-of-space extent_size: Size of a single extent in this pool cap_at_warnlevel Extent pool capacity at warn level warn_threshold: Threshold at which percentage of remaining extent pool space a warning message is issued Signed-off-by: Jan Höppner <[email protected]> Reviewed-by: Stefan Haberland <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-07-11s390/dasd: Put sub-order definitions in a separate sectionJan Höppner1-2/+6
There are orders and sub-orders. Put them in different sections for a better overview. Signed-off-by: Jan Höppner <[email protected]> Reviewed-by: Stefan Haberland <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-07-11s390/dasd: Make layout analysis ESE compatibleJan Höppner1-6/+6
The disk layout and volume information of a DASD reside in the first two tracks of cylinder 0. When a DASD is set online, currently the first three tracks are read and analysed to confirm an expected layout. For CDL (Compatible Disk Layout) only count area data of the first track is evaluated and checked against expected key and data lengths. For LDL (Linux Disk Layout) the first and third track is evaluated. However, an LDL formatted volume is expected to be in the same format across all tracks. Checking the third track therefore doesn't have any more value than checking any other track at random. Now, an Extent Space Efficient (ESE) DASD is initialised by only formatting the first two tracks, as those tracks always contain all information necessarry. Checking the third track on an ESE volume will therefore most likely fail with a record not found error, as the third track will be empty. This in turn leads to the device being recognised with a volume size of 0. Attempts to write volume information on the first two tracks then fail with "no space left on device" errors. Initialising the first three tracks for an ESE volume is not a viable solution, because the third track is already a regular track and could contain user data. With that there is potential for data corruption. Instead, always only analyse the first two tracks, as it is sufficiant for both CDL and LDL, and allow ESE volumes to be recognised as well. Signed-off-by: Jan Höppner <[email protected]> Reviewed-by: Stefan Haberland <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-07-11s390/dasd: Remove old defines and functionJan Höppner1-21/+0
Commit 4d284cac76d0 ("[S390] Avoid excessive inlining.") removed bytes_per_record() which was the only user of the defines ECKD_C0 and ECKD_F*, and round_up_multiple(). Let's get rid of those. Signed-off-by: Jan Höppner <[email protected]> Reviewed-by: Stefan Haberland <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-07-11s390/dasd: Remove unused structs and function prototypesJan Höppner1-25/+0
There are structs that have never been used. There are also two function prototypes which were forgotton in commit f9f8d02fae0d ("[S390] dasd: revert LCU optimization"). Clean up and keep the header file tidy. Signed-off-by: Jan Höppner <[email protected]> Reviewed-by: Stefan Haberland <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-07-11Merge tag 'acpi-5.3-rc1-2' of ↵Linus Torvalds2-9/+17
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Revert a recent ACPICA commit causing systems to hang at boot time" * tag 'acpi-5.3-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: Revert "ACPICA: Update table load object initialization"
2019-07-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2056-113205/+104686
Pull networking updates from David Miller: "Some highlights from this development cycle: 1) Big refactoring of ipv6 route and neigh handling to support nexthop objects configurable as units from userspace. From David Ahern. 2) Convert explored_states in BPF verifier into a hash table, significantly decreased state held for programs with bpf2bpf calls, from Alexei Starovoitov. 3) Implement bpf_send_signal() helper, from Yonghong Song. 4) Various classifier enhancements to mvpp2 driver, from Maxime Chevallier. 5) Add aRFS support to hns3 driver, from Jian Shen. 6) Fix use after free in inet frags by allocating fqdirs dynamically and reworking how rhashtable dismantle occurs, from Eric Dumazet. 7) Add act_ctinfo packet classifier action, from Kevin Darbyshire-Bryant. 8) Add TFO key backup infrastructure, from Jason Baron. 9) Remove several old and unused ISDN drivers, from Arnd Bergmann. 10) Add devlink notifications for flash update status to mlxsw driver, from Jiri Pirko. 11) Lots of kTLS offload infrastructure fixes, from Jakub Kicinski. 12) Add support for mv88e6250 DSA chips, from Rasmus Villemoes. 13) Various enhancements to ipv6 flow label handling, from Eric Dumazet and Willem de Bruijn. 14) Support TLS offload in nfp driver, from Jakub Kicinski, Dirk van der Merwe, and others. 15) Various improvements to axienet driver including converting it to phylink, from Robert Hancock. 16) Add PTP support to sja1105 DSA driver, from Vladimir Oltean. 17) Add mqprio qdisc offload support to dpaa2-eth, from Ioana Radulescu. 18) Add devlink health reporting to mlx5, from Moshe Shemesh. 19) Convert stmmac over to phylink, from Jose Abreu. 20) Add PTP PHC (Physical Hardware Clock) support to mlxsw, from Shalom Toledo. 21) Add nftables SYNPROXY support, from Fernando Fernandez Mancera. 22) Convert tcp_fastopen over to use SipHash, from Ard Biesheuvel. 23) Track spill/fill of constants in BPF verifier, from Alexei Starovoitov. 24) Support bounded loops in BPF, from Alexei Starovoitov. 25) Various page_pool API fixes and improvements, from Jesper Dangaard Brouer. 26) Just like ipv4, support ref-countless ipv6 route handling. From Wei Wang. 27) Support VLAN offloading in aquantia driver, from Igor Russkikh. 28) Add AF_XDP zero-copy support to mlx5, from Maxim Mikityanskiy. 29) Add flower GRE encap/decap support to nfp driver, from Pieter Jansen van Vuuren. 30) Protect against stack overflow when using act_mirred, from John Hurley. 31) Allow devmap map lookups from eBPF, from Toke Høiland-Jørgensen. 32) Use page_pool API in netsec driver, Ilias Apalodimas. 33) Add Google gve network driver, from Catherine Sullivan. 34) More indirect call avoidance, from Paolo Abeni. 35) Add kTLS TX HW offload support to mlx5, from Tariq Toukan. 36) Add XDP_REDIRECT support to bnxt_en, from Andy Gospodarek. 37) Add MPLS manipulation actions to TC, from John Hurley. 38) Add sending a packet to connection tracking from TC actions, and then allow flower classifier matching on conntrack state. From Paul Blakey. 39) Netfilter hw offload support, from Pablo Neira Ayuso" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2080 commits) net/mlx5e: Return in default case statement in tx_post_resync_params mlx5: Return -EINVAL when WARN_ON_ONCE triggers in mlx5e_tls_resync(). net: dsa: add support for BRIDGE_MROUTER attribute pkt_sched: Include const.h net: netsec: remove static declaration for netsec_set_tx_de() net: netsec: remove superfluous if statement netfilter: nf_tables: add hardware offload support net: flow_offload: rename tc_cls_flower_offload to flow_cls_offload net: flow_offload: add flow_block_cb_is_busy() and use it net: sched: remove tcf block API drivers: net: use flow block API net: sched: use flow block API net: flow_offload: add flow_block_cb_{priv, incref, decref}() net: flow_offload: add list handling functions net: flow_offload: add flow_block_cb_alloc() and flow_block_cb_free() net: flow_offload: rename TCF_BLOCK_BINDER_TYPE_* to FLOW_BLOCK_BINDER_TYPE_* net: flow_offload: rename TC_BLOCK_{UN}BIND to FLOW_BLOCK_{UN}BIND net: flow_offload: add flow_block_cb_setup_simple() net: hisilicon: Add an tx_desc to adapt HI13X1_GMAC net: hisilicon: Add an rx_desc to adapt HI13X1_GMAC ...
2019-07-11Merge tag 'clone3-v5.3' of ↵Linus Torvalds17-41/+218
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull clone3 system call from Christian Brauner: "This adds the clone3 syscall which is an extensible successor to clone after we snagged the last flag with CLONE_PIDFD during the 5.2 merge window for clone(). It cleanly supports all of the flags from clone() and thus all legacy workloads. There are few user visible differences between clone3 and clone. First, CLONE_DETACHED will cause EINVAL with clone3 so we can reuse this flag. Second, the CSIGNAL flag is deprecated and will cause EINVAL to be reported. It is superseeded by a dedicated "exit_signal" argument in struct clone_args thus freeing up even more flags. And third, clone3 gives CLONE_PIDFD a dedicated return argument in struct clone_args instead of abusing CLONE_PARENT_SETTID's parent_tidptr argument. The clone3 uapi is designed to be easy to handle on 32- and 64 bit: /* uapi */ struct clone_args { __aligned_u64 flags; __aligned_u64 pidfd; __aligned_u64 child_tid; __aligned_u64 parent_tid; __aligned_u64 exit_signal; __aligned_u64 stack; __aligned_u64 stack_size; __aligned_u64 tls; }; and a separate kernel struct is used that uses proper kernel typing: /* kernel internal */ struct kernel_clone_args { u64 flags; int __user *pidfd; int __user *child_tid; int __user *parent_tid; int exit_signal; unsigned long stack; unsigned long stack_size; unsigned long tls; }; The system call comes with a size argument which enables the kernel to detect what version of clone_args userspace is passing in. clone3 validates that any additional bytes a given kernel does not know about are set to zero and that the size never exceeds a page. A nice feature is that this patchset allowed us to cleanup and simplify various core kernel codepaths in kernel/fork.c by making the internal _do_fork() function take struct kernel_clone_args even for legacy clone(). This patch also unblocks the time namespace patchset which wants to introduce a new CLONE_TIMENS flag. Note, that clone3 has only been wired up for x86{_32,64}, arm{64}, and xtensa. These were the architectures that did not require special massaging. Other architectures treat fork-like system calls individually and after some back and forth neither Arnd nor I felt confident that we dared to add clone3 unconditionally to all architectures. We agreed to leave this up to individual architecture maintainers. This is why there's an additional patch that introduces __ARCH_WANT_SYS_CLONE3 which any architecture can set once it has implemented support for clone3. The patch also adds a cond_syscall(clone3) for architectures such as nios2 or h8300 that generate their syscall table by simply including asm-generic/unistd.h. The hope is to get rid of __ARCH_WANT_SYS_CLONE3 and cond_syscall() rather soon" * tag 'clone3-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: arch: handle arches who do not yet define clone3 arch: wire-up clone3() syscall fork: add clone3
2019-07-11dlm: no need to check return value of debugfs_create functionsGreg Kroah-Hartman3-27/+7
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: David Teigland <[email protected]>
2019-07-11dlm: check if workqueues are NULL before flushing/destroyingDavid Windsor1-6/+12
If the DLM lowcomms stack is shut down before any DLM traffic can be generated, flush_workqueue() and destroy_workqueue() can be called on empty send and/or recv workqueues. Insert guard conditionals to only call flush_workqueue() and destroy_workqueue() on workqueues that are not NULL. Signed-off-by: David Windsor <[email protected]> Signed-off-by: David Teigland <[email protected]>
2019-07-11kconfig: remove meaningless if-conditional in conf_read()Masahiro Yamada1-4/+2
sym_is_choice(sym) has already been checked by previous if-block: if (sym_is_choice(sym) || (sym->flags & SYMBOL_NO_WRITE)) continue; Hence, the following code is redundant, and the comment is misleading: if (!sym_is_choice(sym)) continue; /* fall through */ It always takes 'continue', never falls though. Clean up the dead code. Signed-off-by: Masahiro Yamada <[email protected]>
2019-07-11kbuild: use -- separater intead of $(filter-out ...) for cc-cross-prefixMasahiro Yamada1-2/+2
arch/mips/Makefile passes prefixes that start with '-' to cc-cross-prefix when $(tool-archpref) evaluates to the empty string. They are filtered-out before the $(shell ...) invocation. Otherwise, 'command -v' would be confused. $ command -v -linux-gcc bash: command: -l: invalid option command: usage: command [-pVv] command [arg ...] Since commit 913ab9780fc0 ("kbuild: use more portable 'command -v' for cc-cross-prefix"), cc-cross-prefix throws away the stderr output, so the console is not polluted in any way. This is not a big deal in practice, but I see a slightly better taste in adding '--' to teach it that '-linux-gcc' is an argument instead of a command option. This will cause extra forking of subshell, but it will not be noticeable performance regression. Signed-off-by: Masahiro Yamada <[email protected]>
2019-07-11Merge tag 'kvm-arm-for-5.3' of ↵Paolo Bonzini17500-128453/+30919
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm updates for 5.3 - Add support for chained PMU counters in guests - Improve SError handling - Handle Neoverse N1 erratum #1349291 - Allow side-channel mitigation status to be migrated - Standardise most AArch64 system register accesses to msr_s/mrs_s - Fix host MPIDR corruption on 32bit
2019-07-11Documentation: virtual: Add toctree hooksLuke Nowakowski-Krijger2-0/+29
Added toctree hooks for indexing. Hooks added only for newly added files. The hook for the top of the tree will be added in a later patch series when a few more substantial changes have been added. Signed-off-by: Luke Nowakowski-Krijger <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2019-07-11Documentation: kvm: Convert cpuid.txt to .rstLuke Nowakowski-Krijger2-91/+107
Convert cpuid.txt to .rst format to be parsable by sphinx. Change format and spacing to make function definitions and return values much more clear. Also added a table that is parsable by sphinx and makes the information much more clean. Updated Author email to their new active email address. Added license identifier with the consent of the author. Signed-off-by: Luke Nowakowski-Krijger <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2019-07-11Documentation: virtual: Convert paravirt_ops.txt to .rstLuke Nowakowski-Krijger1-8/+11
Convert paravirt_opts.txt to .rst format to be able to be parsed by sphinx. Made some minor spacing and formatting corrections to make defintions much more clear and easy to read. Added default kernel license to the document. Signed-off-by: Luke Nowakowski-Krijger <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2019-07-11KVM: x86: Unconditionally enable irqs in guest contextSean Christopherson2-9/+12
On VMX, KVM currently does not re-enable irqs until after it has exited the guest context. As a result, a tick that fires in the window between VM-Exit and guest_exit_irqoff() will be accounted as system time. While said window is relatively small, it's large enough to be problematic in some configurations, e.g. if VM-Exits are consistently occurring a hair earlier than the tick irq. Intentionally toggle irqs back off so that guest_exit_irqoff() can be used in lieu of guest_exit() in order to avoid the save/restore of flags in guest_exit(). On my Haswell system, "nop; cli; sti" is ~6 cycles, versus ~28 cycles for "pushf; pop <reg>; cli; push <reg>; popf". Fixes: f2485b3e0c6c0 ("KVM: x86: use guest_exit_irqoff") Reported-by: Wei Yang <[email protected]> Signed-off-by: Sean Christopherson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2019-07-11KVM: x86: PMU Event FilterEric Hankland7-0/+110
Some events can provide a guest with information about other guests or the host (e.g. L3 cache stats); providing the capability to restrict access to a "safe" set of events would limit the potential for the PMU to be used in any side channel attacks. This change introduces a new VM ioctl that sets an event filter. If the guest attempts to program a counter for any blacklisted or non-whitelisted event, the kernel counter won't be created, so any RDPMC/RDMSR will show 0 instances of that event. Signed-off-by: Eric Hankland <[email protected]> [Lots of changes. All remaining bugs are probably mine. - Paolo] Signed-off-by: Paolo Bonzini <[email protected]>
2019-07-11x86/stacktrace: Prevent infinite loop in arch_stack_walk_user()Eiichi Tsukata1-5/+3
arch_stack_walk_user() checks `if (fp == frame.next_fp)` to prevent a infinite loop by self reference but it's not enogh for circular reference. Once a lack of return address is found, there is no point to continue the loop, so break out. Fixes: 02b67518e2b1 ("tracing: add support for userspace stacktraces in tracing/iter_ctrl") Signed-off-by: Eiichi Tsukata <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Linus Torvalds <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-07-10Merge tag 'pidfd-updates-v5.3' of ↵Linus Torvalds29-44/+571
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull pidfd updates from Christian Brauner: "This adds two main features. - First, it adds polling support for pidfds. This allows process managers to know when a (non-parent) process dies in a race-free way. The notification mechanism used follows the same logic that is currently used when the parent of a task is notified of a child's death. With this patchset it is possible to put pidfds in an {e}poll loop and get reliable notifications for process (i.e. thread-group) exit. - The second feature compliments the first one by making it possible to retrieve pollable pidfds for processes that were not created using CLONE_PIDFD. A lot of processes get created with traditional PID-based calls such as fork() or clone() (without CLONE_PIDFD). For these processes a caller can currently not create a pollable pidfd. This is a problem for Android's low memory killer (LMK) and service managers such as systemd. Both patchsets are accompanied by selftests. It's perhaps worth noting that the work done so far and the work done in this branch for pidfd_open() and polling support do already see some adoption: - Android is in the process of backporting this work to all their LTS kernels [1] - Service managers make use of pidfd_send_signal but will need to wait until we enable waiting on pidfds for full adoption. - And projects I maintain make use of both pidfd_send_signal and CLONE_PIDFD [2] and will use polling support and pidfd_open() too" [1] https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.9+backport%22 https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.14+backport%22 https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.19+backport%22 [2] https://github.com/lxc/lxc/blob/aab6e3eb73c343231cdde775db938994fc6f2803/src/lxc/start.c#L1753 * tag 'pidfd-updates-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: tests: add pidfd_open() tests arch: wire-up pidfd_open() pid: add pidfd_open() pidfd: add polling selftests pidfd: add polling support
2019-07-10Merge tag 'm68k-for-v5.3-tag2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k fix from Geert Uytterhoeven: "Don't select ARCH_HAS_DMA_PREP_COHERENT for nommu or coldfire. This is a fix for an issue detected in next, to avoid introducing build failures when merging Christoph's dma-mapping tree later" * tag 'm68k-for-v5.3-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: Don't select ARCH_HAS_DMA_PREP_COHERENT for nommu or coldfire
2019-07-10Merge branch 'for-next' of ↵Linus Torvalds21-187/+145
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu Pull m68nommu updates from Greg Ungerer: "A series of cleanups for the FLAT format binary loader, binfmt_flat, from Christoph. The end goal is to support no-MMU on RISC-V, and the last patch enables that" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: riscv: add binfmt_flat support binfmt_flat: don't offset the data start binfmt_flat: move the MAX_SHARED_LIBS definition to binfmt_flat.c binfmt_flat: remove the persistent argument from flat_get_addr_from_rp binfmt_flat: provide an asm-generic/flat.h binfmt_flat: make support for old format binaries optional binfmt_flat: add a ARCH_HAS_BINFMT_FLAT option binfmt_flat: add endianess annotations binfmt_flat: use fixed size type for the on-disk format binfmt_flat: consolidate two version of flat_v2_reloc_t binfmt_flat: remove the unused OLD_FLAT_FLAG_RAM definition binfmt_flat: remove the uapi <linux/flat.h> header binfmt_flat: replace flat_argvp_envp_on_stack with a Kconfig variable binfmt_flat: remove flat_old_ram_flag binfmt_flat: provide a default version of flat_get_relocate_addr binfmt_flat: remove flat_set_persistent binfmt_flat: remove flat_reloc_valid
2019-07-10Merge tag 'nfsd-5.3' of git://linux-nfs.org/~bfields/linuxLinus Torvalds30-256/+1034
Pull nfsd updates from Bruce Fields: "Highlights: - Add a new /proc/fs/nfsd/clients/ directory which exposes some long-requested information about NFSv4 clients (like open files) and allows forced revocation of client state. - Replace the global duplicate reply cache by a cache per network namespace; previously, a request in one network namespace could incorrectly match an entry from another, though we haven't seen this in production. This is the last remaining container bug that I'm aware of; at this point you should be able to run separate nfsd's in each network namespace, each with their own set of exports, and everything should work. - Cleanup and modify lock code to show the pid of lockd as the owner of NLM locks. This is the correct version of the bugfix originally attempted in b8eee0e90f97 ("lockd: Show pid of lockd for remote locks")" * tag 'nfsd-5.3' of git://linux-nfs.org/~bfields/linux: (34 commits) nfsd: Make __get_nfsdfs_client() static nfsd: Make two functions static nfsd: Fix misuse of strlcpy sunrpc/cache: remove the exporting of cache_seq_next nfsd: decode implementation id nfsd: create xdr_netobj_dup helper nfsd: allow forced expiration of NFSv4 clients nfsd: create get_nfsdfs_clp helper nfsd4: show layout stateids nfsd: show lock and deleg stateids nfsd4: add file to display list of client's opens nfsd: add more information to client info file nfsd: escape high characters in binary data nfsd: copy client's address including port number to cl_addr nfsd4: add a client info file nfsd: make client/ directory names small ints nfsd: add nfsd/clients directory nfsd4: use reference count to free client nfsd: rename cl_refcount nfsd: persist nfsd filesystem across mounts ...
2019-07-10Merge tag 'gfs2-for-5.3' of ↵Linus Torvalds23-230/+190
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Andreas Gruenbacher: "Some relatively minor changes for gfs2: - An initial batch of obvious cleanups and fixes from Bob's recovery patch queue. - Two iomap conversion patches and some cleanups from Christoph Hellwig. - A cosmetic cleanup from Kefeng Wang (Huawei). - Another minor fix and cleanup by me" * tag 'gfs2-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Remove unused gfs2_iomap_alloc argument gfs2: don't use buffer_heads in gfs2_allocate_page_backing gfs2: use iomap_bmap instead of generic_block_bmap gfs2: mark stuffed_readpage static gfs2: merge gfs2_writepage_common into gfs2_writepage gfs2: merge gfs2_writeback_aops and gfs2_ordered_aops gfs2: remove the unused gfs2_stuffed_write_end function gfs2: use page_offset in gfs2_page_mkwrite gfs2: replace more printk with calls to fs_info and friends gfs2: dump fsid when dumping glock problems gfs2: simplify gfs2_freeze by removing case gfs2: Rename SDF_SHUTDOWN to SDF_WITHDRAWN gfs2: Warn when a journal replay overwrites a rgrp with buffers gfs2: log which portion of the journal is replayed gfs2: eliminate tr_num_revoke_rm gfs2: kthread and remount improvements gfs2: Use IS_ERR_OR_NULL gfs2: Clean up freeing struct gfs2_sbd
2019-07-10Merge tag 'ext4_for_linus' of ↵Linus Torvalds23-234/+483
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "Many bug fixes and cleanups, and an optimization for case-insensitive lookups" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix coverity warning on error path of filename setup ext4: replace ktype default_attrs with default_groups ext4: rename htree_inline_dir_to_tree() to ext4_inlinedir_to_tree() ext4: refactor initialize_dirent_tail() ext4: rename "dirent_csum" functions to use "dirblock" ext4: allow directory holes jbd2: drop declaration of journal_sync_buffer() ext4: use jbd2_inode dirty range scoping jbd2: introduce jbd2_inode dirty range scoping mm: add filemap_fdatawait_range_keep_errors() ext4: remove redundant assignment to node ext4: optimize case-insensitive lookups ext4: make __ext4_get_inode_loc plug ext4: clean up kerneldoc warnigns when building with W=1 ext4: only set project inherit bit for directory ext4: enforce the immutable flag on open files ext4: don't allow any modifications to an immutable file jbd2: fix typo in comment of journal_submit_inode_data_buffers jbd2: fix some print format mistakes ext4: gracefully handle ext4_break_layouts() failure during truncate
2019-07-10Merge tag 'afs-next-20190628' of ↵Linus Torvalds14-82/+369
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull afs updates from David Howells: "A set of minor changes for AFS: - Remove an unnecessary check in afs_unlink() - Add a tracepoint for tracking callback management - Add a tracepoint for afs_server object usage - Use struct_size() - Add mappings for AFS UAE abort codes to Linux error codes, using symbolic names rather than hex numbers in the .c file" * tag 'afs-next-20190628' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Add support for the UAE error table fs/afs: use struct_size() in kzalloc() afs: Trace afs_server usage afs: Add some callback management tracepoints afs: afs_unlink() doesn't need to check dentry->d_inode
2019-07-10Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscryptLinus Torvalds14-286/+363
Pull fscrypt updates from Eric Biggers: - Preparations for supporting encryption on ext4 filesystems where the filesystem block size is smaller than PAGE_SIZE. - Don't allow setting encryption policies on dead directories. - Various cleanups. * tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt: fscrypt: document testing with xfstests fscrypt: remove selection of CONFIG_CRYPTO_SHA256 fscrypt: remove unnecessary includes of ratelimit.h fscrypt: don't set policy for a dead directory ext4: encrypt only up to last block in ext4_bio_write_page() ext4: decrypt only the needed block in __ext4_block_zero_page_range() ext4: decrypt only the needed blocks in ext4_block_write_begin() ext4: clear BH_Uptodate flag on decryption error fscrypt: decrypt only the needed blocks in __fscrypt_decrypt_bio() fscrypt: support decrypting multiple filesystem blocks per page fscrypt: introduce fscrypt_decrypt_block_inplace() fscrypt: handle blocksize < PAGE_SIZE in fscrypt_zeroout_range() fscrypt: support encrypting multiple filesystem blocks per page fscrypt: introduce fscrypt_encrypt_block_inplace() fscrypt: clean up some BUG_ON()s in block encryption/decryption fscrypt: rename fscrypt_do_page_crypto() to fscrypt_crypt_block() fscrypt: remove the "write" part of struct fscrypt_ctx fscrypt: simplify bounce page handling
2019-07-10Merge tag 'copy-file-range-fixes-1' of ↵Linus Torvalds9-100/+257
git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull copy_file_range updates from Darrick Wong: "This fixes numerous parameter checking problems and inconsistent behaviors in the new(ish) copy_file_range system call. Now the system call will actually check its range parameters correctly; refuse to copy into files for which the caller does not have sufficient privileges; update mtime and strip setuid like file writes are supposed to do; and allows copying up to the EOF of the source file instead of failing the call like we used to. Summary: - Create a generic copy_file_range handler and make individual filesystems responsible for calling it (i.e. no more assuming that do_splice_direct will work or is appropriate) - Refactor copy_file_range and remap_range parameter checking where they are the same - Install missing copy_file_range parameter checking(!) - Remove suid/sgid and update mtime like any other file write - Change the behavior so that a copy range crossing the source file's eof will result in a short copy to the source file's eof instead of EINVAL - Permit filesystems to decide if they want to handle cross-superblock copy_file_range in their local handlers" * tag 'copy-file-range-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: fuse: copy_file_range needs to strip setuid bits and update timestamps vfs: allow copy_file_range to copy across devices xfs: use file_modified() helper vfs: introduce file_modified() helper vfs: add missing checks to copy_file_range vfs: remove redundant checks from generic_remap_checks() vfs: introduce generic_file_rw_checks() vfs: no fallback for ->copy_file_range vfs: introduce generic_copy_file_range()
2019-07-10Merge tag 'iomap-5.3-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds5-37/+47
Pull iomap updates from Darrick Wong: "There are a few fixes for gfs2 but otherwise it's pretty quiet so far. - Only mark inode dirty at the end of writing to a file (instead of once for every page written). - Fix for an accounting error in the page_done callback" * tag 'iomap-5.3-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: iomap: fix page_done callback for short writes fs: fold __generic_write_end back into generic_write_end iomap: don't mark the inode dirty in iomap_write_end
2019-07-10Merge tag 'for_v5.3-rc1' of ↵Linus Torvalds9-151/+195
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext2, udf and quota updates from Jan Kara: - some ext2 fixes and cleanups - a fix of udf bug when extending files - a fix of quota Q_XGETQSTAT[V] handling * tag 'for_v5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: udf: Fix incorrect final NOT_ALLOCATED (hole) extent length ext2: Use kmemdup rather than duplicating its implementation quota: honor quota type in Q_XGETQSTAT[V] calls ext2: Always brelse bh on failure in ext2_iget() ext2: add missing brelse() in ext2_iget() ext2: Fix a typo in ext2_getattr argument ext2: fix a typo in comment ext2: add missing brelse() in ext2_new_inode() ext2: optimize ext2_xattr_get() ext2: introduce new helper for xattr entry comparison ext2: merge xattr next entry check to ext2_xattr_entry_valid() ext2: code cleanup for ext2_preread_inode() ext2: code cleanup by using test_opt() and clear_opt() doc: ext2: update description of quota options for ext2 ext2: Strengthen xattr block checks ext2: Merge loops in ext2_xattr_set() ext2: introduce helper for xattr entry validation ext2: introduce helper for xattr header validation quota: add dqi_dirty_list description to comment of Dquot List Management
2019-07-10Merge tag 'fsnotify_for_v5.3-rc1' of ↵Linus Torvalds16-71/+76
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify updates from Jan Kara: "This contains cleanups of the fsnotify name removal hook and also a patch to disable fanotify permission events for 'proc' filesystem" * tag 'fsnotify_for_v5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fsnotify: get rid of fsnotify_nameremove() fsnotify: move fsnotify_nameremove() hook out of d_delete() configfs: call fsnotify_rmdir() hook debugfs: call fsnotify_{unlink,rmdir}() hooks debugfs: simplify __debugfs_remove_file() devpts: call fsnotify_unlink() hook tracefs: call fsnotify_{unlink,rmdir}() hooks rpc_pipefs: call fsnotify_{unlink,rmdir}() hooks btrfs: call fsnotify_rmdir() hook fsnotify: add empty fsnotify_{unlink,rmdir}() hooks fanotify: Disallow permission events for proc filesystem
2019-07-10Merge tag 'locks-v5.3-1' of ↵Linus Torvalds3-22/+79
git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux Pull file locking updates from Jeff Layton: "Just a couple of small lease-related patches this cycle. One from Ira to add a new tracepoint that fires during lease conflict checks, and another patch from Amir to reduce false positives when checking for lease conflicts" * tag 'locks-v5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: locks: eliminate false positive conflicts for write lease locks: Add trace_leases_conflict
2019-07-10f2fs: improve print log in f2fs_sanity_check_ckpt()Chao Yu1-1/+3
As Park Ju Hyung suggested: "I'd like to suggest to write down an actual version of f2fs-tools here as we've seen older versions of fsck doing even more damage and the users might not have the latest f2fs-tools installed." This patch give a more detailed info of how we fix such corruption to user to avoid damageable repair with low version fsck. Signed-off-by: Park Ju Hyung <[email protected]> Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2019-07-10Revert "Merge tag 'keys-acl-20190703' of ↵Linus Torvalds46-992/+325
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs" This reverts merge 0f75ef6a9cff49ff612f7ce0578bced9d0b38325 (and thus effectively commits 7a1ade847596 ("keys: Provide KEYCTL_GRANT_PERMISSION") 2e12256b9a76 ("keys: Replace uid/gid/perm permissions checking with an ACL") that the merge brought in). It turns out that it breaks booting with an encrypted volume, and Eric biggers reports that it also breaks the fscrypt tests [1] and loading of in-kernel X.509 certificates [2]. The root cause of all the breakage is likely the same, but David Howells is off email so rather than try to work it out it's getting reverted in order to not impact the rest of the merge window. [1] https://lore.kernel.org/lkml/[email protected]/ [2] https://lore.kernel.org/lkml/[email protected]/ Link: https://lore.kernel.org/lkml/CAHk-=wjxoeMJfeBahnWH=9zShKp2bsVy527vo3_y8HfOdhwAAw@mail.gmail.com/ Reported-by: Eric Biggers <[email protected]> Cc: David Howells <[email protected]> Cc: James Morris <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2019-07-10f2fs: avoid out-of-range memory accessOcean Chen1-0/+5
blkoff_off might over 512 due to fs corrupt or security vulnerability. That should be checked before being using. Use ENTRIES_IN_SUM to protect invalid value in cur_data_blkoff. Signed-off-by: Ocean Chen <[email protected]> Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2019-07-10f2fs: fix to avoid long latency during umountHeng Xiao1-0/+4
In umount, we give an constand time to handle pending discard, previously, in __issue_discard_cmd() we missed to check timeout condition in loop, result in delaying long time, fix it. Signed-off-by: Heng Xiao <[email protected]> [Chao Yu: add commit message] Signed-off-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2019-07-10f2fs: allow all the users to pin a fileJaegeuk Kim1-3/+0
This patch allows users to pin files. Reviewed-by: Chao Yu <[email protected]> Signed-off-by: Jaegeuk Kim <[email protected]>
2019-07-10x86/asm: Move native_write_cr0/4() out of lineThomas Gleixner5-68/+61
The pinning of sensitive CR0 and CR4 bits caused a boot crash when loading the kvm_intel module on a kernel compiled with CONFIG_PARAVIRT=n. The reason is that the static key which controls the pinning is marked RO after init. The kvm_intel module contains a CR4 write which requires to update the static key entry list. That obviously does not work when the key is in a RO section. With CONFIG_PARAVIRT enabled this does not happen because the CR4 write uses the paravirt indirection and the actual write function is built in. As the key is intended to be immutable after init, move native_write_cr0/4() out of line. While at it consolidate the update of the cr4 shadow variable and store the value right away when the pinning is initialized on a booting CPU. No point in reading it back 20 instructions later. This allows to confine the static key and the pinning variable to cpu/common and allows to mark them static. Fixes: 8dbec27a242c ("x86/asm: Pin sensitive CR0 bits") Fixes: 873d50d58f67 ("x86/asm: Pin sensitive CR4 bits") Reported-by: Linus Torvalds <[email protected]> Reported-by: Xi Ruoyao <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Xi Ruoyao <[email protected]> Acked-by: Kees Cook <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-07-10timekeeping/vsyscall: Use __iter_div_u64_rem()Arnd Bergmann1-5/+1
On 32-bit x86 when building with clang-9, the 'division' loop gets turned back into an inefficient division that causes a link error: kernel/time/vsyscall.o: In function `update_vsyscall': vsyscall.c:(.text+0xe3): undefined reference to `__udivdi3' Use the existing __iter_div_u64_rem() function which is used to address the same issue in other places. Fixes: 44f57d788e7d ("timekeeping: Provide a generic update_vsyscall() implementation") Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Tested-by: Nathan Chancellor <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-07-10xfs: chain bios the right way around in xfs_rw_bdevChristoph Hellwig1-1/+1
We need to chain the earlier bios to the later ones, so that submit_bio_wait waits on the bio that all the completions are dispatched to. Fixes: 6ad5b3255b9e ("xfs: use bios directly to read and write the log recovery buffers") Reported-by: Dave Chinner <[email protected]> Tested-by: Dave Chinner <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Darrick J. Wong <[email protected]> Signed-off-by: Darrick J. Wong <[email protected]>
2019-07-10x86/pgtable/32: Fix LOWMEM_PAGES constantArnd Bergmann1-1/+1
clang points out that the computation of LOWMEM_PAGES causes a signed integer overflow on 32-bit x86: arch/x86/kernel/head32.c:83:20: error: signed shift result (0x100000000) requires 34 bits to represent, but 'int' only has 32 bits [-Werror,-Wshift-overflow] (PAGE_TABLE_SIZE(LOWMEM_PAGES) << PAGE_SHIFT); ^~~~~~~~~~~~ arch/x86/include/asm/pgtable_32.h:109:27: note: expanded from macro 'LOWMEM_PAGES' #define LOWMEM_PAGES ((((2<<31) - __PAGE_OFFSET) >> PAGE_SHIFT)) ~^ ~~ arch/x86/include/asm/pgtable_32.h:98:34: note: expanded from macro 'PAGE_TABLE_SIZE' #define PAGE_TABLE_SIZE(pages) ((pages) / PTRS_PER_PGD) Use the _ULL() macro to make it a 64-bit constant. Fixes: 1e620f9b23e5 ("x86/boot/32: Convert the 32-bit pgtable setup code from assembly to C") Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2019-07-11kbuild: Inform user to pass ARCH= for make mrproperGeert Uytterhoeven1-1/+1
When cross-compiling an out-of-tree build with an unclean source tree directory, the build fails with: /path/to/kernel/source/tree is not clean, please run 'make mrproper' in the '/path/to/kernel/source/tree' directory. However, doing so does not fix the problem, as "make mrproper" now requires passing the target architecture to the make command, else it won't remove $(srctree)/arch/$(SRCARCH)/include/generated. "git ls-files -o" doesn't give a clue, as it doesn't list (empty) directories, only files. Improve usability by including the ARCH= option in the error output. Fixes: a788b2ed81ab ("kbuild: check arch/$(SRCARCH)/include/generated before out-of-tree build") Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]>
2019-07-11kbuild: fix compression errors getting ignoredHarald Seiler1-5/+5
A missing compression utility or other errors were not picked up by make and an empty kernel image was produced. By removing the &&, errors will no longer be ignored. Signed-off-by: Harald Seiler <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]>
2019-07-11kbuild: add a flag to force absolute path for srctreeMasahiro Yamada3-2/+14
In old days, Kbuild always used an absolute path for $(srctree). Since commit 890676c65d69 ("kbuild: Use relative path when building in the source tree"), $(srctree) is '.' when O= was not passed from the command line. Yet, using absolute paths is useful in some cases even without O=, for instance, to create a cscope file with absolute path tags. 'O=.' was known to work as a workaround to force Kbuild to use absolute paths even when you are building in the source tree. Since commit 25b146c5b8ce ("kbuild: allow Kbuild to start from any directory"), Kbuild is too clever to be tricked. Even if you pass 'O=.' Kbuild notices you are building in the source tree, then use '.' for $(srctree). So, 'make O=. cscope' is no help to create absolute path tags. We cannot force one or the other according to commit e93bc1a0cab3 ("Revert "kbuild: specify absolute paths for cscope""). Both of relative path and absolute path have pros and cons. This commit adds a new flag KBUILD_ABS_SRCTREE to allow users to choose the absolute path for $(srctree). 'make KBUILD_ABS_SRCTREE=1 cscope' will work as a replacement of 'make O=. cscope'. Reported-by: Pawan Gupta <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]>