aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2019-11-30Merge tag 'afs-next-20191121' of ↵Linus Torvalds10-54/+50
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull AFS updates from David Howells: "Minor cleanups and fix: - Minor fix to make some debugging statements display information from the correct iov_iter. - Rename some members and variables to make things more obvious or consistent. - Provide a helper to wrap increments of the usage count on the afs_read struct. - Use scnprintf() to print into a stack buffer rather than sprintf(). - Remove some set but unused variables" * tag 'afs-next-20191121' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: afs: Remove set but not used variable 'ret' afs: Remove set but not used variables 'before', 'after' afs: xattr: use scnprintf afs: Introduce an afs_get_read() refcount helper afs: Rename desc -> req in afs_fetch_data() afs: Switch the naming of call->iter and call->_iter afs: Use call->_iter not &call->iter in debugging statements
2019-11-30Merge tag 'ext4_for_linus' of ↵Linus Torvalds30-1461/+1691
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "This merge window saw the the following new featuers added to ext4: - Direct I/O via iomap (required the iomap-for-next branch from Darrick as a prereq). - Support for using dioread-nolock where the block size < page size. - Support for encryption for file systems where the block size < page size. - Rework of journal credits handling so a revoke-heavy workload will not cause the journal to run out of space. - Replace bit-spinlocks with spinlocks in jbd2 Also included were some bug fixes and cleanups, mostly to clean up corner cases from fuzzed file systems and error path handling" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (59 commits) ext4: work around deleting a file with i_nlink == 0 safely ext4: add more paranoia checking in ext4_expand_extra_isize handling jbd2: make jbd2_handle_buffer_credits() handle reserved handles ext4: fix a bug in ext4_wait_for_tail_page_commit ext4: bio_alloc with __GFP_DIRECT_RECLAIM never fails ext4: code cleanup for get_next_id ext4: fix leak of quota reservations ext4: remove unused variable warning in parse_options() ext4: Enable encryption for subpage-sized blocks fs/buffer.c: support fscrypt in block_read_full_page() ext4: Add error handling for io_end_vec struct allocation jbd2: Fine tune estimate of necessary descriptor blocks jbd2: Provide trace event for handle restarts ext4: Reserve revoke credits for freed blocks jbd2: Make credit checking more strict jbd2: Rename h_buffer_credits to h_total_credits jbd2: Reserve space for revoke descriptor blocks jbd2: Drop jbd2_space_needed() jbd2: Account descriptor blocks into t_outstanding_credits jbd2: Factor out common parts of stopping and restarting a handle ...
2019-11-30Merge tag 'vfs-5.5-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds1-3/+11
Pull splice fix from Darrick Wong: "Fix another place in the splice code where a pipe could ask a filesystem for a longer read than the pipe actually has free buffer space" * tag 'vfs-5.5-merge-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: splice: only read in as much information as there is pipe buffer space
2019-11-30Merge tag 'iomap-5.5-merge-11' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds26-922/+1215
Pull iomap updates from Darrick Wong: "In this release, we hoisted as much of XFS' writeback code into iomap as was practicable, refactored the unshare file data function, added the ability to perform buffered io copy on write, and tweaked various parts of the directio implementation as needed to port ext4's directio code (that will be a separate pull). Summary: - Make iomap_dio_rw callers explicitly tell us if they want us to wait - Port the xfs writeback code to iomap to complete the buffered io library functions - Refactor the unshare code to share common pieces - Add support for performing copy on write with buffered writes - Other minor fixes - Fix unchecked return in iomap_bmap - Fix a type casting bug in a ternary statement in iomap_dio_bio_actor - Improve tracepoints for easier diagnostic ability - Fix pipe page leakage in directio reads" * tag 'iomap-5.5-merge-11' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (31 commits) iomap: Fix pipe page leakage during splicing iomap: trace iomap_appply results iomap: fix return value of iomap_dio_bio_actor on 32bit systems iomap: iomap_bmap should check iomap_apply return value iomap: Fix overflow in iomap_page_mkwrite fs/iomap: remove redundant check in iomap_dio_rw() iomap: use a srcmap for a read-modify-write I/O iomap: renumber IOMAP_HOLE to 0 iomap: use write_begin to read pages to unshare iomap: move the zeroing case out of iomap_read_page_sync iomap: ignore non-shared or non-data blocks in xfs_file_dirty iomap: always use AOP_FLAG_NOFS in iomap_write_begin iomap: remove the unused iomap argument to __iomap_write_end iomap: better document the IOMAP_F_* flags iomap: enhance writeback error message iomap: pass a struct page to iomap_finish_page_writeback iomap: cleanup iomap_ioend_compare iomap: move struct iomap_page out of iomap.h iomap: warn on inline maps in iomap_writepage_map iomap: lift the xfs writeback code to iomap ...
2019-11-30Revert "dt-bindings: remoteproc: stm32: add wakeup-source"Bjorn Andersson1-3/+0
The DeviceTree binding document was converted to YAML in a patch that is being merged through the devicetree tree, as such this patch needs to be rewritten and is currently cause for a merge conflict. This reverts commit 14ea1d04ed0f7bae60951bdb8eeaa55cdbb26c73. Signed-off-by: Bjorn Andersson <[email protected]>
2019-11-30net: sched: fix `tc -s class show` no bstats on class with nolock subqueuesDust Li4-5/+6
When a classful qdisc's child qdisc has set the flag TCQ_F_CPUSTATS (pfifo_fast for example), the child qdisc's cpu_bstats should be passed to gnet_stats_copy_basic(), but many classful qdisc didn't do that. As a result, `tc -s class show dev DEV` always return 0 for bytes and packets in this case. Pass the child qdisc's cpu_bstats to gnet_stats_copy_basic() to fix this issue. The qstats also has this problem, but it has been fixed in 5dd431b6b9 ("net: sched: introduce and use qstats read...") and bstats still remains buggy. Fixes: 22e0f8b9322c ("net: sched: make bstats per cpu and estimator RCU safe") Signed-off-by: Dust Li <[email protected]> Signed-off-by: Tony Lu <[email protected]> Acked-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-30Merge tag 'for-linus-hmm' of ↵Linus Torvalds31-2139/+1298
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull hmm updates from Jason Gunthorpe: "This is another round of bug fixing and cleanup. This time the focus is on the driver pattern to use mmu notifiers to monitor a VA range. This code is lifted out of many drivers and hmm_mirror directly into the mmu_notifier core and written using the best ideas from all the driver implementations. This removes many bugs from the drivers and has a very pleasing diffstat. More drivers can still be converted, but that is for another cycle. - A shared branch with RDMA reworking the RDMA ODP implementation - New mmu_interval_notifier API. This is focused on the use case of monitoring a VA and simplifies the process for drivers - A common seq-count locking scheme built into the mmu_interval_notifier API usable by drivers that call get_user_pages() or hmm_range_fault() with the VA range - Conversion of mlx5 ODP, hfi1, radeon, nouveau, AMD GPU, and Xen GntDev drivers to the new API. This deletes a lot of wonky driver code. - Two improvements for hmm_range_fault(), from testing done by Ralph" * tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: mm/hmm: remove hmm_range_dma_map and hmm_range_dma_unmap mm/hmm: make full use of walk_page_range() xen/gntdev: use mmu_interval_notifier_insert mm/hmm: remove hmm_mirror and related drm/amdgpu: Use mmu_interval_notifier instead of hmm_mirror drm/amdgpu: Use mmu_interval_insert instead of hmm_mirror drm/amdgpu: Call find_vma under mmap_sem nouveau: use mmu_interval_notifier instead of hmm_mirror nouveau: use mmu_notifier directly for invalidate_range_start drm/radeon: use mmu_interval_notifier_insert RDMA/hfi1: Use mmu_interval_notifier_insert for user_exp_rcv RDMA/odp: Use mmu_interval_notifier_insert() mm/hmm: define the pre-processor related parts of hmm.h even if disabled mm/hmm: allow hmm_range to be used with a mmu_interval_notifier or hmm_mirror mm/mmu_notifier: add an interval tree notifier mm/mmu_notifier: define the header pre-processor parts even if disabled mm/hmm: allow snapshot of the special zero page
2019-11-30net: ethernet: ti: ale: ensure vlan/mdb deleted when no membersGrygorii Strashko1-3/+9
The recently updated ALE APIs cpsw_ale_del_mcast() and cpsw_ale_del_vlan_modify() have an issue and will not delete ALE entry even if VLAN/mcast group has no more members. Hence fix it here and delete ALE entry if !port_mask. The issue affected only new cpsw switchdev driver. Fixes: e85c14370783 ("net: ethernet: ti: ale: modify vlan/mdb api for switchdev") Signed-off-by: Grygorii Strashko <[email protected]> Acked-by: Ilias Apalodimas <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-30net/mlx5e: Fix build error without IPV6YueHaibing1-36/+38
If IPV6 is not set and CONFIG_MLX5_ESWITCH is y, building fails: drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c:322:5: error: redefinition of mlx5e_tc_tun_create_header_ipv6 int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c:7:0: drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.h:67:1: note: previous definition of mlx5e_tc_tun_create_header_ipv6 was here mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv, ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Use #ifdef to guard this, also move mlx5e_route_lookup_ipv6 to cleanup unused warning. Reported-by: Hulk Robot <[email protected]> Fixes: e689e998e102 ("net/mlx5e: TC, Stub out ipv6 tun create header function") Signed-off-by: YueHaibing <[email protected]> Acked-by: Saeed Mahameed <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-30Merge tag 'drm-vmwgfx-coherent-2019-11-29' of ↵Linus Torvalds23-123/+1996
git://anongit.freedesktop.org/drm/drm Pull drm coherent memory support for vmwgfx from Dave Airlie: "This is a separate pull for the mm pagewalking + drm/vmwgfx work Thomas did and you were involved in, I've left it separate in case you don't feel as comfortable with it as the other stuff. It has mm acks/r-b in the right places from what I can see" * tag 'drm-vmwgfx-coherent-2019-11-29' of git://anongit.freedesktop.org/drm/drm: drm/vmwgfx: Add surface dirty-tracking callbacks drm/vmwgfx: Implement an infrastructure for read-coherent resources drm/vmwgfx: Use an RBtree instead of linked list for MOB resources drm/vmwgfx: Implement an infrastructure for write-coherent resources mm: Add write-protect and clean utilities for address space ranges mm: Add a walk_page_mapping() function to the pagewalk code mm: pagewalk: Take the pagetable lock in walk_pte_range() mm: Remove BUG_ON mmap_sem not held from xxx_trans_huge_lock() drm/ttm: Convert vm callbacks to helpers drm/ttm: Remove explicit typecasts of vm_private_data
2019-11-30x86/ioperm: Save an indentation level in tss_update_io_bitmap()Borislav Petkov1-26/+26
... for better readability. No functional changes. [ Minor edit. ] Signed-off-by: Borislav Petkov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2019-11-30s390/livepatch: Implement reliable stack tracing for the consistency modelMiroslav Benes2-0/+44
The livepatch consistency model requires reliable stack tracing architecture support in order to work properly. In order to achieve this, two main issues have to be solved. First, reliable and consistent call chain backtracing has to be ensured. Second, the unwinder needs to be able to detect stack corruptions and return errors. The "zSeries ELF Application Binary Interface Supplement" says: "The stack pointer points to the first word of the lowest allocated stack frame. If the "back chain" is implemented this word will point to the previously allocated stack frame (towards higher addresses), except for the first stack frame, which shall have a back chain of zero (NULL). The stack shall grow downwards, in other words towards lower addresses." "back chain" is optional. GCC option -mbackchain enables it. Quoting Martin Schwidefsky [1]: "The compiler is called with the -mbackchain option, all normal C function will store the backchain in the function prologue. All functions written in assembler code should do the same, if you find one that does not we should fix that. The end result is that a task that *voluntarily* called schedule() should have a proper backchain at all times. Dependent on the use case this may or may not be enough. Asynchronous interrupts may stop the CPU at the beginning of a function, if kernel preemption is enabled we can end up with a broken backchain. The production kernels for IBM Z are all compiled *without* kernel preemption. So yes, we might get away without the objtool support. On a side-note, we do have a line item to implement the ORC unwinder for the kernel, that includes the objtool support. Once we have that we can drop the -mbackchain option for the kernel build. That gives us a nice little performance benefit. I hope that the change from backchain to the ORC unwinder will not be too hard to implement in the livepatch tools." Since -mbackchain is enabled by default when the kernel is compiled, the call chain backtracing should be currently ensured and objtool should not be necessary for livepatch purposes. Regarding the second issue, stack corruptions and non-reliable states have to be recognized by the unwinder. Mainly it means to detect preemption or page faults, the end of the task stack must be reached, return addresses must be valid text addresses and hacks like function graph tracing and kretprobes must be properly detected. Unwinding a running task's stack is not a problem, because there is a livepatch requirement that every checked task is blocked, except for the current task. Due to that, the implementation can be much simpler compared to the existing non-reliable infrastructure. We can consider a task's kernel/thread stack only and skip the other stacks. [1] 20180912121106.31ffa97c@mschwideX1 [not archived on lore.kernel.org] Link: https://lkml.kernel.org/r/[email protected] Reviewed-by: Heiko Carstens <[email protected]> Tested-by: Miroslav Benes <[email protected]> Signed-off-by: Miroslav Benes <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/unwind: add stack pointer alignment sanity checksMiroslav Benes2-0/+8
ABI requires SP to be aligned 8 bytes, report unwinding error otherwise. Link: https://lkml.kernel.org/r/[email protected] Reviewed-by: Heiko Carstens <[email protected]> Tested-by: Miroslav Benes <[email protected]> Signed-off-by: Miroslav Benes <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/unwind: filter out unreliable bogus %r14Vasily Gorbik1-0/+5
Currently unwinder unconditionally returns %r14 from the first frame pointed by %r15 from pt_regs. A task could be interrupted when a function already allocated this frame (if it needs it) for its callees or to store local variables. In that case this frame would contain random values from stack or values stored there by a callee. As we are only interested in %r14 to get potential return address, skip bogus return addresses which doesn't belong to kernel text. This helps to avoid duplicating filtering logic in unwider users, most of which use unwind_get_return_address() and would choke on bogus 0 address returned by it otherwise. Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/unwind: start unwinding from reliable stateVasily Gorbik2-17/+31
A comment in arch/s390/include/asm/unwind.h says: > If 'first_frame' is not zero unwind_start skips unwind frames until it > reaches the specified stack pointer. > The end of the unwinding is indicated with unwind_done, this can be true > right after unwind_start, e.g. with first_frame!=0 that can not be found. > unwind_next_frame skips to the next frame. > Once the unwind is completed unwind_error() can be used to check if there > has been a situation where the unwinder could not correctly understand > the tasks call chain. With this change backchain unwinder now comply with behaviour described. As well as matches orc unwinder implementation. Now unwinder starts from reliable state, i.e. __unwind_start own stack frame is taken or stack frame generated by __switch_to (ksp) - both known to be valid. In case of pt_regs %r15 is better match for pt_regs psw, than sometimes random "sp" caller passed. Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/test_unwind: add program check context testsVasily Gorbik1-0/+47
Add unwinding from program check handler tests. Unwinder should be able to unwind through pt_regs stored by program check handler on task stack. Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/test_unwind: add irq context testsVasily Gorbik1-0/+45
Add unwinding from irq context tests. Unwinder should be able to unwind through irq stack to task stack up to task pt_regs. Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/test_unwind: print verbose unwinding resultsVasily Gorbik2-2/+11
Add stack name, sp and reliable information into test unwinding results. Also consider ip outside of kernel text as failure if the state is reported reliable. Acked-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/test_unwind: add CALL_ON_STACK testsVasily Gorbik1-7/+19
Add CALL_ON_STACK helper testing. Tests make sure that we can unwind from switched stack to original one up to task pt_regs (nodat -> task stack). UWM_SWITCH_STACK could not be used together with UWM_THREAD because get_stack_info explicitly restricts unwinding to task stack if task != current. Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390: fix register clobbering in CALL_ON_STACKVasily Gorbik1-2/+2
CALL_ON_STACK defines and initializes register variables. Inline assembly which follows might trigger compiler to generate memory access for "stack" argument (e.g. in case of S390_lowcore.nodat_stack). This memory access produces a function call under kasan with outline instrumentation which clobbers registers. Switch "stack" argument in CALL_ON_STACK helper to use memory reference constraint and perform load instead. Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/test_unwind: require that unwinding ended successfullyVasily Gorbik1-0/+4
Currently unwinder test passes if unwinding results contain unwindme_func2 and unwindme_func1 functions. Now that unwinder reports success upon reaching task pt_regs, check that unwinding ended successfully in every test. Acked-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/unwind: add a test for the internal APIIlya Leoshkevich3-0/+248
unwind_for_each_frame can take at least 8 different sets of parameters. Add a test to make sure they all are handled in a sane way. Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Ilya Leoshkevich <[email protected]> Co-developed-by: Vasily Gorbik <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/unwind: always inline get_stack_pointerVasily Gorbik1-2/+2
Always inline get_stack_pointer() to avoid potential problems due to compiler inlining decisions, i.e. getting stack pointer of get_stack_pointer() itself which is later reused. Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/pci: add error message on device number limitNiklas Schnelle1-0/+2
The config option CONFIG_PCI_NR_FUNCTIONS sets a limit on the number of PCI functions we can support. Previously on reaching this limit there was no indication why newly attached devices are not recognized by Linux which could be quite confusing. Thus this patch adds a pr_err() for this case. Reviewed-by: Peter Oberparleiter <[email protected]> Signed-off-by: Niklas Schnelle <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/pci: add error message for UID collisionNiklas Schnelle1-0/+3
When UID checking was turned off during runtime in the underlying hypervisor, a PCI device may be attached with the same UID. This is already detected but happens silently. Add an error message so it can more easily be understood why a device was not added. Reviewed-by: Peter Oberparleiter <[email protected]> Signed-off-by: Niklas Schnelle <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/cpum_sf: Check for SDBT and SDB consistencyThomas Richter1-2/+15
Each SBDT is located at a 4KB page and contains 512 entries. Each entry of a SDBT points to a SDB, a 4KB page containing sampled data. The last entry is a link to another SDBT page. When an event is created the function sequence executed is: __hw_perf_event_init() +--> allocate_buffers() +--> realloc_sampling_buffers() +---> alloc_sample_data_block() Both functions realloc_sampling_buffers() and alloc_sample_data_block() allocate pages and the allocation can fail. This is handled correctly and all allocated pages are freed and error -ENOMEM is returned to the top calling function. Finally the event is not created. Once the event has been created, the amount of initially allocated SDBT and SDB can be too low. This is detected during measurement interrupt handling, where the amount of lost samples is calculated. If the number of lost samples is too high considering sampling frequency and already allocated SBDs, the number of SDBs is enlarged during the next execution of cpumsf_pmu_enable(). If more SBDs need to be allocated, functions realloc_sampling_buffers() +---> alloc-sample_data_block() are called to allocate more pages. Page allocation may fail and the returned error is ignored. A SDBT and SDB setup already exists. However the modified SDBTs and SDBs might end up in a situation where the first entry of an SDBT does not point to an SDB, but another SDBT, basicly an SBDT without payload. This can not be handled by the interrupt handler, where an SDBT must have at least one entry pointing to an SBD. Add a check to avoid SDBTs with out payload (SDBs) when enlarging the buffer setup. Signed-off-by: Thomas Richter <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/cpum_sf: Use TEAR_REG macro consistantlyThomas Richter1-8/+1
The macro TEAR_REG() saves the last used SDBT address in the perf_hw_event structure. This is also done by function hw_reset_registers() which is a one-liner and simply uses macro TEAR_REG(). Remove function hw_reset_registers(), which is only used one time and use macro TEAR_REG() instead. This macro is used throughout the code anyway. Signed-off-by: Thomas Richter <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/cpum_sf: Remove unnecessary check for pending SDBsThomas Richter1-2/+1
In interrupt handling the function extend_sampling_buffer() is called after checking for a possibly extension. This check is not necessary as the called function itself performs this check again. Signed-off-by: Thomas Richter <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/cpum_sf: Replace function name in debug statementsThomas Richter2-52/+57
Replace hard coded function names in debug statements by the "%s ...", __func__ construct suggested by checkpatch.pl script. Use consistent debug print format of the form variable blank value. Also add leading 0x for all hex values. Print allocated page addresses consistantly as hex numbers with leading 0x. Signed-off-by: Thomas Richter <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/kaslr: store KASLR offset for early dumpsGerald Schaefer2-1/+6
The KASLR offset is added to vmcoreinfo in arch_crash_save_vmcoreinfo(), so that it can be found by crash when processing kernel dumps. However, arch_crash_save_vmcoreinfo() is called during a subsys_initcall, so if the kernel crashes before that, we have no vmcoreinfo and no KASLR offset. Fix this by storing the KASLR offset in the lowcore, where the vmcore_info pointer will be stored, and where it can be found by crash. In order to make it distinguishable from a real vmcore_info pointer, mark it as uneven (KASLR offset itself is aligned to THREAD_SIZE). When arch_crash_save_vmcoreinfo() stores the real vmcore_info pointer in the lowcore, it overwrites the KASLR offset. At that point, the KASLR offset is not yet added to vmcoreinfo, so we also need to move the mem_assign_absolute() behind the vmcoreinfo_append_str(). Fixes: b2d24b97b2a9 ("s390/kernel: add support for kernel address space layout randomization (KASLR)") Cc: <[email protected]> # v5.2+ Signed-off-by: Gerald Schaefer <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/unwind: stop gracefully at task pt_regsVasily Gorbik1-1/+7
Consider reaching task pt_regs graceful unwinder termination. Task pt_regs itself never contains a valid state to which a task might return within the kernel context (user task pt_regs is a special case). Since we already avoid printing user task pt_regs and in most cases we don't even bother filling task pt_regs psw and r15 with something reasonable simply skip task pt_regs altogether. With this change unwind_error() now accurately represent whether unwinder reached task pt_regs successfully or failed along the way. Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/head64: correct init_task stack setupVasily Gorbik1-1/+1
Add missing allocation of pt_regs at the bottom of the stack. This makes it consistent with other stack setup cases and also what stack unwinder expects. Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/unwind: make reuse_sp default when unwinding pt_regsVasily Gorbik2-15/+7
Currently unwinder yields 2 entries when pt_regs are met: sp="address of pt_regs itself" ip=pt_regs->psw sp=pt_regs->gprs[15] ip="r14 from stack frame pointed by pt_regs->gprs[15]" And neither of those 2 states (combination of sp and ip) ever happened. reuse_sp has been introduced by commit a1d863ac3e10 ("s390/unwind: fix mixing regs and sp"). reuse_sp=true makes unwinder keen to produce the following result, when pt_regs are given (as an arg to unwind_start): sp=pt_regs->gprs[15] ip=pt_regs->psw sp=pt_regs->gprs[15] ip="r14 from stack frame pointed by pt_regs->gprs[15]" The first state is an actual state in which a task was when pt_regs were collected. The second state is marked unreliable and is for debugging purposes to cover the case when a task has been interrupted in between stack frame allocation and writing back_chain - in this case r14 might show an actual caller. Make unwinder behaviour enabled via reuse_sp=true default and drop the special case handling. Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/unwind: report an error if pt_regs are not on stackVasily Gorbik1-1/+1
If unwinder is looking at pt_regs which is not on stack then something went wrong and an error has to be reported rather than successful unwinding termination. Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390: avoid misusing CALL_ON_STACK for task stack setupVasily Gorbik3-9/+13
CALL_ON_STACK is intended to be used for temporary stack switching with potential return to the caller. When CALL_ON_STACK is misused to switch from nodat stack to task stack back_chain information would later lead stack unwinder from task stack into (per cpu) nodat stack which is reused for other purposes. This would yield confusing unwinding result or errors. To avoid that introduce CALL_ON_STACK_NORETURN to be used instead. It makes sure that back_chain is zeroed and unwinder finishes gracefully ending up at task pt_regs. Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390: correct CALL_ON_STACK back_chain savingVasily Gorbik1-1/+14
Currently CALL_ON_STACK saves r15 as back_chain in the first stack frame of the stack we about to switch to. But if a function which uses CALL_ON_STACK calls other function it allocates a stack frame for a callee. In this case r15 is pointing to a callee stack frame and not a stack frame of function itself. This results in dummy unwinding entry with random sp and ip values. Introduce and utilize current_frame_address macro to get an address of actual function stack frame. Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/unwind: unify task is current checksVasily Gorbik3-6/+3
Avoid mixture of task == NULL and task == current meaning the same thing and simply always initialize task with current in unwind_start. Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390: disable preemption when switching to nodat stack with CALL_ON_STACKVasily Gorbik2-3/+11
Make sure preemption is disabled when temporary switching to nodat stack with CALL_ON_STACK helper, because nodat stack is per cpu. Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390: always inline disabled_waitVasily Gorbik1-1/+1
disabled_wait uses _THIS_IP_ and assumes that compiler would inline it. Make sure this assumption is always correct by utilizing __always_inline. Reviewed-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/vdso: fix getcpuHeiko Carstens4-10/+14
getcpu reads the required values for cpu and node with two instructions. This might lead to an inconsistent result if user space gets preempted and migrated to a different CPU between the two instructions. Fix this by using just a single instruction to read both values at once. This is currently rather a theoretical bug, since there is no real NUMA support available (except for NUMA emulation). Reviewed-by: Christian Borntraeger <[email protected]> Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/smp,vdso: fix ASCE handlingHeiko Carstens1-0/+5
When a secondary CPU is brought up it must initialize its control registers. CPU A which triggers that a secondary CPU B is brought up stores its control register contents into the lowcore of new CPU B, which then loads these values on startup. This is problematic in various ways: the control register which contains the home space ASCE will correctly contain the kernel ASCE; however control registers for primary and secondary ASCEs are initialized with whatever values were present in CPU A. Typically: - the primary ASCE will contain the user process ASCE of the process that triggered onlining of CPU B. - the secondary ASCE will contain the percpu VDSO ASCE of CPU A. Due to lazy ASCE handling we may also end up with other combinations. When then CPU B switches to a different process (!= idle) it will fixup the primary ASCE. However the problem is that the (wrong) ASCE from CPU A was loaded into control register 1: as soon as an ASCE is attached (aka loaded) a CPU is free to generate TLB entries using that address space. Even though it is very unlikey that CPU B will actually generate such entries, this could result in TLB entries of the address space of the process that ran on CPU A. These entries shouldn't exist at all and could cause problems later on. Furthermore the secondary ASCE of CPU B will not be updated correctly. This means that processes may see wrong results or even crash if they access VDSO data on CPU B. The correct VDSO ASCE will eventually be loaded on return to user space as soon as the kernel executed a call to strnlen_user or an atomic futex operation on CPU B. Fix both issues by intializing the to be loaded control register contents with the correct ASCEs and also enforce (re-)loading of the ASCEs upon first context switch and return to user space. Fixes: 0aaba41b58bc ("s390: remove all code using the access register mode") Cc: [email protected] # v4.15+ Signed-off-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390/zcrypt: handle new reply code FILTERED_BY_HYPERVISORHarald Freudenberger1-0/+2
This patch introduces support for a new architectured reply code 0x8B indicating that a hypervisor layer (if any) has rejected an ap message. Linux may run as a guest on top of a hypervisor like zVM or KVM. So the crypto hardware seen by the ap bus may be restricted by the hypervisor for example only a subset like only clear key crypto requests may be supported. Other requests will be filtered out - rejected by the hypervisor. The new reply code 0x8B will appear in such cases and needs to get recognized by the ap bus and zcrypt device driver zoo. Signed-off-by: Harald Freudenberger <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-30s390: implement perf_arch_fetch_caller_regsIlya Leoshkevich1-0/+7
On s390 bpf_get_stack_raw_tp() returns 0 entries for both kernel and user stacks. While there is no practical unwinding solution for userspace on s390 at this moment, there certainly is a kernel unwinder. However, it is not properly integrated with BPF. In order to start unwinding, bpf_get_stack_raw_tp() obtains the current kernel register values using perf_fetch_caller_regs(), which is not implemented for s390. The actual unwinding then happens by passing those registers to perf_callchain_kernel(). Implement perf_arch_fetch_caller_regs() for s390, where __builtin_frame_address(0) points to back_chain. Signed-off-by: Ilya Leoshkevich <[email protected]> Acked-by: Heiko Carstens <[email protected]> Signed-off-by: Vasily Gorbik <[email protected]>
2019-11-29xtensa: clean up system_call/xtensa_rt_sigreturn interactionMax Filippov3-10/+6
system_call assembly code always pushes pointer to struct pt_regs as the last additional parameter for all system calls. The only user of this feature is xtensa_rt_sigreturn. Avoid this special case. Define xtensa_rt_sigreturn as accepting no argiments. Use current_pt_regs to get pointer to struct pt_regs in xtensa_rt_sigreturn. Don't pass additional parameter from system_call code. Signed-off-by: Max Filippov <[email protected]>
2019-11-29xtensa: fix system_call interaction with ptraceMax Filippov2-4/+18
Don't overwrite return value if system call was cancelled at entry by ptrace. Return status code from do_syscall_trace_enter so that pt_regs::syscall doesn't need to be changed to skip syscall. Signed-off-by: Max Filippov <[email protected]>
2019-11-29xtensa: rearrange syscall tracingMax Filippov3-7/+4
system_call saves and restores syscall number across system call to make clone and execv entry and exit tracing match. This complicates things when syscall code may be changed by ptrace. Preserve syscall code in copy_thread and start_thread directly instead of doing tricks in system_call. Signed-off-by: Max Filippov <[email protected]>
2019-11-29selftests: pmtu: use -oneline for ip route list cacheThadeu Lima de Souza Cascardo1-3/+2
Some versions of iproute2 will output more than one line per entry, which will cause the test to fail, like: TEST: ipv6: list and flush cached exceptions [FAIL] can't list cached exceptions That happens, for example, with iproute2 4.15.0. When using the -oneline option, this will work just fine: TEST: ipv6: list and flush cached exceptions [ OK ] This also works just fine with a more recent version of iproute2, like 5.4.0. For some reason, two lines are printed for the IPv4 test no matter what version of iproute2 is used. Use the same -oneline parameter there instead of counting the lines twice. Fixes: b964641e9925 ("selftests: pmtu: Make list_flush_ipv6_exception test more demanding") Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]> Acked-by: Stefano Brivio <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-11-29Merge branch 'for-5.5/whiskers' into for-linusJiri Kosina1-43/+103
- robustification of tablet mode support in google-whiskers driver (Dmitry Torokhov)
2019-11-29Merge branch 'for-5.5/logitech' into for-linusJiri Kosina7-0/+990
- Support for Logitech G15 (Hans de Goede) - silencing of non-informative error flow in dmesg from logitechi-hiddpp (Hans de Goede)
2019-11-29Merge branch 'for-5.5/ish' into for-linusJiri Kosina1-1/+1
- typo fix (Geert Uytterhoeven)