aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-11-30Merge branch 'nvme-4.20' of git://git.infradead.org/nvme into for-linusJens Axboe4-4/+11
Pull NVMe fixes from Christoph: "Various fixlets all over." * 'nvme-4.20' of git://git.infradead.org/nvme: nvme-rdma: fix double freeing of async event data nvme: flush namespace scanning work just before removing namespaces nvme: warn when finding multi-port subsystems without multipathing enabled nvme-pci: fix surprise removal nvme-fc: initialize nvme_req(rq)->ctrl after calling __nvme_fc_init_request() nvme: Free ctrl device name on init failure
2018-11-30block: fix single range discard mergeMing Lei1-1/+1
There are actually two kinds of discard merge: - one is the normal discard merge, just like normal read/write request, and call it single-range discard - another is the multi-range discard, queue_max_discard_segments(rq->q) > 1 For the former case, queue_max_discard_segments(rq->q) is 1, and we should handle this kind of discard merge like the normal read/write request. This patch fixes the following kernel panic issue[1], which is caused by not removing the single-range discard request from elevator queue. Guangwu has one raid discard test case, in which this issue is a bit easier to trigger, and I verified that this patch can fix the kernel panic issue in Guangwu's test case. [1] kernel panic log from Jens's report BUG: unable to handle kernel NULL pointer dereference at 0000000000000148 PGD 0 P4D 0. Oops: 0000 [#1] SMP PTI CPU: 37 PID: 763 Comm: kworker/37:1H Not tainted \ 4.20.0-rc3-00649-ge64d9a554a91-dirty #14 Hardware name: Wiwynn \ Leopard-Orv2/Leopard-DDR BW, BIOS LBM08 03/03/2017 Workqueue: kblockd \ blk_mq_run_work_fn RIP: \ 0010:blk_mq_get_driver_tag+0x81/0x120 Code: 24 \ 10 48 89 7c 24 20 74 21 83 fa ff 0f 95 c0 48 8b 4c 24 28 65 48 33 0c 25 28 00 00 00 \ 0f 85 96 00 00 00 48 83 c4 30 5b 5d c3 <48> 8b 87 48 01 00 00 8b 40 04 39 43 20 72 37 \ f6 87 b0 00 00 00 02 RSP: 0018:ffffc90004aabd30 EFLAGS: 00010246 \ RAX: 0000000000000003 RBX: ffff888465ea1300 RCX: ffffc90004aabde8 RDX: 00000000ffffffff RSI: ffffc90004aabde8 RDI: 0000000000000000 RBP: 0000000000000000 R08: ffff888465ea1348 R09: 0000000000000000 R10: 0000000000001000 R11: 00000000ffffffff R12: ffff888465ea1300 R13: 0000000000000000 R14: ffff888465ea1348 R15: ffff888465d10000 FS: 0000000000000000(0000) GS:ffff88846f9c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000148 CR3: 000000000220a003 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: blk_mq_dispatch_rq_list+0xec/0x480 ? elv_rb_del+0x11/0x30 blk_mq_do_dispatch_sched+0x6e/0xf0 blk_mq_sched_dispatch_requests+0xfa/0x170 __blk_mq_run_hw_queue+0x5f/0xe0 process_one_work+0x154/0x350 worker_thread+0x46/0x3c0 kthread+0xf5/0x130 ? process_one_work+0x350/0x350 ? kthread_destroy_worker+0x50/0x50 ret_from_fork+0x1f/0x30 Modules linked in: sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel \ kvm switchtec irqbypass iTCO_wdt iTCO_vendor_support efivars cdc_ether usbnet mii \ cdc_acm i2c_i801 lpc_ich mfd_core ipmi_si ipmi_devintf ipmi_msghandler acpi_cpufreq \ button sch_fq_codel nfsd nfs_acl lockd grace auth_rpcgss oid_registry sunrpc nvme \ nvme_core fuse sg loop efivarfs autofs4 CR2: 0000000000000148 \ ---[ end trace 340a1fb996df1b9b ]--- RIP: 0010:blk_mq_get_driver_tag+0x81/0x120 Code: 24 10 48 89 7c 24 20 74 21 83 fa ff 0f 95 c0 48 8b 4c 24 28 65 48 33 0c 25 28 \ 00 00 00 0f 85 96 00 00 00 48 83 c4 30 5b 5d c3 <48> 8b 87 48 01 00 00 8b 40 04 39 43 \ 20 72 37 f6 87 b0 00 00 00 02 Fixes: 445251d0f4d329a ("blk-mq: fix discard merge with scheduler attached") Reported-by: Jens Axboe <[email protected]> Cc: Guangwu Zhang <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Jianchao Wang <[email protected]> Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-11-30stackleak: Disable function tracing and kprobes for stackleak_erase()Alexander Popov1-1/+3
The stackleak_erase() function is called on the trampoline stack at the end of syscall. This stack is not big enough for ftrace and kprobes operations, e.g. it can be exhausted if we use kprobe_events for stackleak_erase(). So let's disable function tracing and kprobes of stackleak_erase(). Reported-by: kernel test robot <[email protected]> Fixes: 10e9ae9fabaf ("gcc-plugins: Add STACKLEAK plugin for tracking the kernel stack") Signed-off-by: Alexander Popov <[email protected]> Reviewed-by: Steven Rostedt (VMware) <[email protected]> Reviewed-by: Masami Hiramatsu <[email protected]> Signed-off-by: Kees Cook <[email protected]>
2018-11-30Merge tag 'pstore-v4.20-rc5' of ↵Linus Torvalds2-10/+10
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull pstore fix from Kees Cook: "Fix corrupted compression due to unlucky size choice with ECC" * tag 'pstore-v4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: pstore/ram: Correctly calculate usable PRZ bytes
2018-11-30Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds11-123/+121
Pull rdma fixes from Jason Gunthorpe: "This is a bit later than usual for our first -rc but I'm not seeing anything worry-some in the RDMA tree right now. Quiet so far this -rc cycle, only a few internal driver related bugs and a small series fixing ODP bugs found by more advanced testing. A set of small driver and core code fixes: - Small series fixing longtime user triggerable bugs in the ODP processing inside mlx5 and core code - Various small driver malfunctions and crashes (use after, free, error unwind, implementation bugs) - A misfunction of the RDMA GID cache that can be triggered by the administrator" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/mlx5: Initialize return variable in case pagefault was skipped IB/mlx5: Fix page fault handling for MW IB/umem: Set correct address to the invalidation function IB/mlx5: Skip non-ODP MR when handling a page fault RDMA/hns: Bugfix pbl configuration for rereg mr iser: set sector for ambiguous mr status errors RDMA/rdmavt: Fix rvt_create_ah function signature IB/mlx5: Avoid load failure due to unknown link width IB/mlx5: Fix XRC QP support after introducing extended atomic RDMA/bnxt_re: Avoid accessing the device structure after it is freed RDMA/bnxt_re: Fix system hang when registration with L2 driver fails RDMA/core: Add GIDs while changing MAC addr only for registered ndev RDMA/mlx5: Fix fence type for IB_WR_LOCAL_INV WR net/mlx5: Fix XRC SRQ umem valid bits
2018-11-30nvme-rdma: fix double freeing of async event dataPrabhath Sajeepa1-0/+2
Some error paths in configuration of admin queue free data buffer associated with async request SQE without resetting the data buffer pointer to NULL, This buffer is also freed up again if the controller is shutdown or reset. Signed-off-by: Prabhath Sajeepa <[email protected]> Reviewed-by: Roland Dreier <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2018-11-30nvme: flush namespace scanning work just before removing namespacesSagi Grimberg1-1/+3
nvme_stop_ctrl can be called also for reset flow and there is no need to flush the scan_work as namespaces are not being removed. This can cause deadlock in rdma, fc and loop drivers since nvme_stop_ctrl barriers before controller teardown (and specifically I/O cancellation of the scan_work itself) takes place, but the scan_work will be blocked anyways so there is no need to flush it. Instead, move scan_work flush to nvme_remove_namespaces() where it really needs to flush. Reported-by: Ming Lei <[email protected]> Signed-off-by: Sagi Grimberg <[email protected]> Reviewed-by: Keith Busch <[email protected]> Reviewed by: James Smart <[email protected]> Tested-by: Ewan D. Milne <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>
2018-11-30nvme: warn when finding multi-port subsystems without multipathing enabledChristoph Hellwig1-0/+3
Without CONFIG_NVME_MULTIPATH enabled a multi-port subsystem might show up as invididual devices and cause problems, warn about it. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]>
2018-11-30fscache, cachefiles: remove redundant variable 'cache'Colin Ian King1-3/+0
Variable 'cache' is being assigned but is never used hence it is redundant and can be removed. Cleans up clang warning: warning: variable 'cache' set but not used [-Wunused-but-set-variable] Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: David Howells <[email protected]>
2018-11-30cachefiles: avoid deprecated get_seconds()Arnd Bergmann1-1/+1
get_seconds() returns an unsigned long can overflow on some architectures and is deprecated because of that. In cachefs, we cast that number to a a 32-bit integer, which will overflow in year 2106 on all architectures. As confirmed by David Howells, the overflow probably isn't harmful in the end, since the timestamps are only used to make the file names unique, but they don't strictly have to be in monotonically increasing order since the files only exist in order to be deleted as quickly as possible. Moving to ktime_get_real_seconds() avoids the deprecated interface. Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: David Howells <[email protected]>
2018-11-30cachefiles: Explicitly cast enumerated type in put_objectNathan Chancellor1-2/+4
Clang warns when one enumerated type is implicitly converted to another. fs/cachefiles/namei.c:247:50: warning: implicit conversion from enumeration type 'enum cachefiles_obj_ref_trace' to different enumeration type 'enum fscache_obj_ref_trace' [-Wenum-conversion] cache->cache.ops->put_object(&xobject->fscache, cachefiles_obj_put_wait_retry); Silence this warning by explicitly casting to fscache_obj_ref_trace, which is also done in put_object. Reported-by: Nick Desaulniers <[email protected]> Signed-off-by: Nathan Chancellor <[email protected]> Signed-off-by: David Howells <[email protected]>
2018-11-30fscache: fix race between enablement and dropping of objectNeilBrown1-0/+3
It was observed that a process blocked indefintely in __fscache_read_or_alloc_page(), waiting for FSCACHE_COOKIE_LOOKING_UP to be cleared via fscache_wait_for_deferred_lookup(). At this time, ->backing_objects was empty, which would normaly prevent __fscache_read_or_alloc_page() from getting to the point of waiting. This implies that ->backing_objects was cleared *after* __fscache_read_or_alloc_page was was entered. When an object is "killed" and then "dropped", FSCACHE_COOKIE_LOOKING_UP is cleared in fscache_lookup_failure(), then KILL_OBJECT and DROP_OBJECT are "called" and only in DROP_OBJECT is ->backing_objects cleared. This leaves a window where something else can set FSCACHE_COOKIE_LOOKING_UP and __fscache_read_or_alloc_page() can start waiting, before ->backing_objects is cleared There is some uncertainty in this analysis, but it seems to be fit the observations. Adding the wake in this patch will be handled correctly by __fscache_read_or_alloc_page(), as it checks if ->backing_objects is empty again, after waiting. Customer which reported the hang, also report that the hang cannot be reproduced with this fix. The backtrace for the blocked process looked like: PID: 29360 TASK: ffff881ff2ac0f80 CPU: 3 COMMAND: "zsh" #0 [ffff881ff43efbf8] schedule at ffffffff815e56f1 #1 [ffff881ff43efc58] bit_wait at ffffffff815e64ed #2 [ffff881ff43efc68] __wait_on_bit at ffffffff815e61b8 #3 [ffff881ff43efca0] out_of_line_wait_on_bit at ffffffff815e625e #4 [ffff881ff43efd08] fscache_wait_for_deferred_lookup at ffffffffa04f2e8f [fscache] #5 [ffff881ff43efd18] __fscache_read_or_alloc_page at ffffffffa04f2ffe [fscache] #6 [ffff881ff43efd58] __nfs_readpage_from_fscache at ffffffffa0679668 [nfs] #7 [ffff881ff43efd78] nfs_readpage at ffffffffa067092b [nfs] #8 [ffff881ff43efda0] generic_file_read_iter at ffffffff81187a73 #9 [ffff881ff43efe50] nfs_file_read at ffffffffa066544b [nfs] #10 [ffff881ff43efe70] __vfs_read at ffffffff811fc756 #11 [ffff881ff43efee8] vfs_read at ffffffff811fccfa #12 [ffff881ff43eff18] sys_read at ffffffff811fda62 #13 [ffff881ff43eff50] entry_SYSCALL_64_fastpath at ffffffff815e986e Signed-off-by: NeilBrown <[email protected]> Signed-off-by: David Howells <[email protected]>
2018-11-30fs: fix lost error code in dio_completeMaximilian Heyne1-2/+2
commit e259221763a40403d5bb232209998e8c45804ab8 ("fs: simplify the generic_write_sync prototype") reworked callers of generic_write_sync(), and ended up dropping the error return for the directio path. Prior to that commit, in dio_complete(), an error would be bubbled up the stack, but after that commit, errors passed on to dio_complete were eaten up. This was reported on the list earlier, and a fix was proposed in https://lore.kernel.org/lkml/[email protected]/, but never followed up with. We recently hit this bug in our testing where fencing io errors, which were previously erroring out with EIO, were being returned as success operations after this commit. The fix proposed on the list earlier was a little short -- it would have still called generic_write_sync() in case `ret` already contained an error. This fix ensures generic_write_sync() is only called when there's no pending error in the write. Additionally, transferred is replaced with ret to bring this code in line with other callers. Fixes: e259221763a4 ("fs: simplify the generic_write_sync prototype") Reported-by: Ravi Nankani <[email protected]> Signed-off-by: Maximilian Heyne <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> CC: Torsten Mehlan <[email protected]> CC: Uwe Dannowski <[email protected]> CC: Amit Shah <[email protected]> CC: David Woodhouse <[email protected]> CC: [email protected] Signed-off-by: Jens Axboe <[email protected]>
2018-11-29tracing/fgraph: Fix set_graph_function from showing interruptsSteven Rostedt (VMware)4-3/+62
The tracefs file set_graph_function is used to only function graph functions that are listed in that file (or all functions if the file is empty). The way this is implemented is that the function graph tracer looks at every function, and if the current depth is zero and the function matches something in the file then it will trace that function. When other functions are called, the depth will be greater than zero (because the original function will be at depth zero), and all functions will be traced where the depth is greater than zero. The issue is that when a function is first entered, and the handler that checks this logic is called, the depth is set to zero. If an interrupt comes in and a function in the interrupt handler is traced, its depth will be greater than zero and it will automatically be traced, even if the original function was not. But because the logic only looks at depth it may trace interrupts when it should not be. The recent design change of the function graph tracer to fix other bugs caused the depth to be zero while the function graph callback handler is being called for a longer time, widening the race of this happening. This bug was actually there for a longer time, but because the race window was so small it seldom happened. The Fixes tag below is for the commit that widen the race window, because that commit belongs to a series that will also help fix the original bug. Cc: [email protected] Fixes: 39eb456dacb5 ("function_graph: Use new curr_ret_depth to manage depth instead of curr_ret_stack") Reported-by: Joe Lawrence <[email protected]> Tested-by: Joe Lawrence <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2018-11-29tracepoint: Use __idx instead of idx in DO_TRACE macro to make it uniqueZenghui Yu1-3/+3
After enabling KVM event tracing, almost all of trace_kvm_exit()'s printk shows "kvm_exit: IRQ: ..." even if the actual exception_type is NOT IRQ. More specifically, trace_kvm_exit() is defined in virt/kvm/arm/trace.h by TRACE_EVENT. This slight problem may have existed after commit e6753f23d961 ("tracepoint: Make rcuidle tracepoint callers use SRCU"). There are two variables in trace_kvm_exit() and __DO_TRACE() which have the same name, *idx*. Thus the actual value of *idx* will be overwritten when tracing. Fix it by adding a simple prefix. Cc: Joel Fernandes <[email protected]> Cc: Wang Haibin <[email protected]> Cc: [email protected] Cc: [email protected] Fixes: e6753f23d961 ("tracepoint: Make rcuidle tracepoint callers use SRCU") Reviewed-by: Joel Fernandes (Google) <[email protected]> Signed-off-by: Zenghui Yu <[email protected]> Signed-off-by: Steven Rostedt (VMware) <[email protected]>
2018-11-29afs: Use d_instantiate() rather than d_add() and don't d_drop()David Howells1-3/+1
Use d_instantiate() rather than d_add() and don't d_drop() in afs_vnode_new_inode(). The dentry shouldn't be removed as it's not changing its name. Reported-by: Al Viro <[email protected]> Signed-off-by: David Howells <[email protected]> Signed-off-by: Al Viro <[email protected]>
2018-11-29afs: Fix missing net error handlingDavid Howells6-113/+135
kAFS can be given certain network errors (EADDRNOTAVAIL, EHOSTDOWN and ERFKILL) that it doesn't handle in its server/address rotation algorithms. They cause the probing and rotation to abort immediately rather than rotating. Fix this by: (1) Abstracting out the error prioritisation from the VL and FS rotation algorithms into a common function and expand usage into the server probing code. When multiple errors are available, this code selects the one we'd prefer to return. (2) Add handling for EADDRNOTAVAIL, EHOSTDOWN and ERFKILL. Fixes: 0fafdc9f888b ("afs: Fix file locking") Fixes: 0338747d8454 ("afs: Probe multiple fileservers simultaneously") Signed-off-by: David Howells <[email protected]> Signed-off-by: Al Viro <[email protected]>
2018-11-29afs: Fix validation/callback interactionDavid Howells1-6/+12
When afs_validate() is called to validate a vnode (inode), there are two unhandled cases in the fastpath at the top of the function: (1) If the vnode is promised (AFS_VNODE_CB_PROMISED is set), the break counters match and the data has expired, then there's an implicit case in which the vnode needs revalidating. This has no consequences since the default "valid = false" set at the top of the function happens to do the right thing. (2) If the vnode is not promised and it hasn't been deleted (AFS_VNODE_DELETED is not set) then there's a default case we're not handling in which the vnode is invalid. If the vnode is invalid, we need to bring cb_s_break and cb_v_break up to date before we refetch the status. As a consequence, once the server loses track of the client (ie. sufficient time has passed since we last sent it an operation), it will send us a CB.InitCallBackState* operation when we next try to talk to it. This calls afs_init_callback_state() which increments afs_server::cb_s_break, but this then doesn't propagate to the afs_vnode record. The result being that every afs_validate() call thereafter sends a status fetch operation to the server. Clarify and fix this by: (A) Setting valid in all the branches rather than initialising it at the top so that the compiler catches where we've missed. (B) Restructuring the logic in the 'promised' branch so that we set valid to false if the callback is due to expire (or has expired) and so that the final case is that the vnode is still valid. (C) Adding an else-statement that ups cb_s_break and cb_v_break if the promised and deleted cases don't match. Fixes: c435ee34551e ("afs: Overhaul the callback handling") Signed-off-by: David Howells <[email protected]> Signed-off-by: Al Viro <[email protected]>
2018-11-29Merge tag 'acpi-4.20-rc5' of ↵Linus Torvalds1-19/+2
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Fix a recent regression in ACPICA releted to the Generic Serial Bus protocol handling and causing it to read or write too little or too much data in some cases, so incorrect data may be written to hardware as a result (Hans de Goede)" * tag 'acpi-4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPICA: Fix handling of buffer-size in acpi_ex_write_data_to_field()
2018-11-29Merge tag 'pm-4.20-rc5' of ↵Linus Torvalds2-5/+2
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix two issues in the operating performance points (OPP) framework. Specifics: - Fix the handling of the "operating-points-v2" property to avoid failures if multiple phandles are present in it which is legitimate (Viresh Kumar). - Drop the unnecessary static initialization of the .owner field in the ti_opp_supply_driver structure (YueHaibing)" * tag 'pm-4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: OPP: Fix parsing of multiple phandles in "operating-points-v2" property opp: ti-opp-supply: Fix platform_no_drv_owner.cocci warnings
2018-11-29RDMA/mlx5: Initialize return variable in case pagefault was skippedLeon Romanovsky1-0/+1
Pagefaults occurred in non-ODP MR are completely valid events, so initialize return variable to 0. Fixes: 4d5422a309de ("IB/mlx5: Skip non-ODP MR when handling a page fault") Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2018-11-29pstore/ram: Correctly calculate usable PRZ bytesKees Cook2-10/+10
The actual number of bytes stored in a PRZ is smaller than the bytes requested by platform data, since there is a header on each PRZ. Additionally, if ECC is enabled, there are trailing bytes used as well. Normally this mismatch doesn't matter since PRZs are circular buffers and the leading "overflow" bytes are just thrown away. However, in the case of a compressed record, this rather badly corrupts the results. This corruption was visible with "ramoops.mem_size=204800 ramoops.ecc=1". Any stored crashes would not be uncompressable (producing a pstorefs "dmesg-*.enc.z" file), and triggering errors at boot: [ 2.790759] pstore: crypto_comp_decompress failed, ret = -22! Backporting this depends on commit 70ad35db3321 ("pstore: Convert console write to use ->write_buf") Reported-by: Joel Fernandes <[email protected]> Fixes: b0aad7a99c1d ("pstore: Add compression support to pstore") Signed-off-by: Kees Cook <[email protected]> Reviewed-by: Joel Fernandes (Google) <[email protected]>
2018-11-29Merge branch 'acpica-fixes'Rafael J. Wysocki1-19/+2
* acpica-fixes: ACPICA: Fix handling of buffer-size in acpi_ex_write_data_to_field()
2018-11-29Merge tag 'selinux-pr-20181129' of ↵Linus Torvalds1-1/+12
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull SELinux fix from Paul Moore: "One more SELinux fix for v4.20: add some missing netlink message to SELinux permission mappings. The netlink messages were added in v4.19, but unfortunately we didn't catch it then because the mechanism to catch these things was bypassed. In addition to adding the mappings, we're adding some comments to the code to hopefully prevent bypasses in the future" * tag 'selinux-pr-20181129' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: add support for RTM_NEWCHAIN, RTM_DELCHAIN, and RTM_GETCHAIN
2018-11-29Merge tag 's390-4.20-3' of ↵Linus Torvalds10-14/+32
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: - Add two missing kfree calls on error paths in the vfio-ccw code - Make sure that all data structures of a mediated vfio-ccw device are initialized before registering it - Fix a sparse warning in vfio-ccw - A followup patch for the pgtable_bytes accounting, the page table downgrade for compat processes missed a mm_dec_nr_pmds() - Reject sampling requests in the PMU init function of the CPU measurement counter facility - With the vfio AP driver an AP queue needs to be reset on every device probe as the alternative driver could have modified the device state * tag 's390-4.20-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/mm: correct pgtable_bytes on page table downgrade s390/zcrypt: reinit ap queue state machine during device probe s390/cpum_cf: Reject request for sampling in event initialization s390/cio: Fix cleanup when unsupported IDA format is used s390/cio: Fix cleanup of pfn_array alloc failure vfio: ccw: Register mediated device once all structures are initialized s390/cio: make vfio_ccw_io_region static
2018-11-29Merge tag 'sound-4.20-rc5' of ↵Linus Torvalds33-303/+456
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "As a usual pattern, we've got relatively large updates at rc5: - A fix for races in ALSA control user elements - ASoC DAPM regression due to component refactoring - A fix in error handling of ASoC iteration macro - ASoC Intel SST Skylake kconfig fix; a new Kconfig will appear as a consequence, but in the end it's a good cleanup - HD-audio and USB-audio quirks as always - Assort of ASoC driver fixes (pcm186x, Intel cht, rockchip, pcm3060, rsnd, omap, wm_adsp, qcom, sunxi, stm32)" * tag 'sound-4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (34 commits) ALSA: usb-audio: Add vendor and product name for Dell WD19 Dock ALSA: hda/realtek - Support ALC300 ALSA: hda/realtek - Add auto-mute quirk for HP Spectre x360 laptop ALSA: hda/realtek - fix the pop noise on headphone for lenovo laptops ALSA: control: Fix race between adding and removing a user element ALSA: sparc: Fix invalid snd_free_pages() at error path ALSA: wss: Fix invalid snd_free_pages() at error path ALSA: hda/realtek - fix headset mic detection for MSI MS-B171 ALSA: hda: Add ASRock N68C-S UCC the power_save blacklist ALSA: ac97: Fix incorrect bit shift at AC97-SPSA control write ASoC: omap-dmic: Add pm_qos handling to avoid overruns with CPU_IDLE ASoC: omap-mcpdm: Add pm_qos handling to avoid under/overruns with CPU_IDLE ASoC: omap-mcbsp: Fix latency value calculation for pm_qos ASoC: acpi: fix: continue searching when machine is ignored ASoC: Intel: Skylake: fix Kconfigs, make HDaudio codec optional MAINTAINERS: add ASoC maintainers for sound dt-bindings ASoC: pcm186x: Fix device reset-registers trigger value ASoC: dapm: Recalculate audio map forcely when card instantiated ASoC: omap-abe-twl6040: Fix missing audio card caused by deferred probing ASoC: pcm3060: Rename output widgets ...
2018-11-29Merge tag 'fixes_for_v4.20-rc5' of ↵Linus Torvalds4-10/+23
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext2 and udf fixes from Jan Kara: "Three small ext2 and udf fixes" * tag 'fixes_for_v4.20-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: ext2: fix potential use after free ext2: initialize opts.s_mount_opt as zero before using it udf: Allow mounting volumes with incorrect identification strings
2018-11-29pvcalls-front: fixes incorrect error handlingPan Bian1-2/+2
kfree() is incorrectly used to release the pages allocated by __get_free_page() and __get_free_pages(). Use the matching deallocators i.e., free_page() and free_pages(), respectively. Signed-off-by: Pan Bian <[email protected]> Reviewed-by: Stefano Stabellini <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2018-11-29Revert "xen/balloon: Mark unallocated host memory as UNUSABLE"Igor Druzhinin4-141/+13
This reverts commit b3cf8528bb21febb650a7ecbf080d0647be40b9f. That commit unintentionally broke Xen balloon memory hotplug with "hotplug_unpopulated" set to 1. As long as "System RAM" resource got assigned under a new "Unusable memory" resource in IO/Mem tree any attempt to online this memory would fail due to general kernel restrictions on having "System RAM" resources as 1st level only. The original issue that commit has tried to workaround fa564ad96366 ("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f, 60-7f)") also got amended by the following 03a551734 ("x86/PCI: Move and shrink AMD 64-bit window to avoid conflict") which made the original fix to Xen ballooning unnecessary. Signed-off-by: Igor Druzhinin <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2018-11-29xen: xlate_mmu: add missing header to fix 'W=1' warningSrikanth Boddepalli1-0/+1
Add a missing header otherwise compiler warns about missed prototype: drivers/xen/xlate_mmu.c:183:5: warning: no previous prototype for 'xen_xlate_unmap_gfn_range?' [-Wmissing-prototypes] int xen_xlate_unmap_gfn_range(struct vm_area_struct *vma, ^~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Srikanth Boddepalli <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Reviewed-by: Joey Pabalinas <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2018-11-29xen/x86: add diagnostic printout to xen_mc_flush() in case of errorJuergen Gross1-15/+20
Failure of an element of a Xen multicall is signalled via a WARN() only if the kernel is compiled with MC_DEBUG. It is impossible to know which element failed and why it did so. Change that by printing the related information even without MC_DEBUG, even if maybe in some limited form (e.g. without information which caller produced the failing element). Move the printing out of the switch statement in order to have the same information for a single call. Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2018-11-29arm64: ftrace: Fix to enable syscall events on arm64Masami Hiramatsu1-0/+13
Since commit 4378a7d4be30 ("arm64: implement syscall wrappers") introduced "__arm64_" prefix to all syscall wrapper symbols in sys_call_table, syscall tracer can not find corresponding metadata from syscall name. In the result, we have no syscall ftrace events on arm64 kernel, and some bpf testcases are failed on arm64. To fix this issue, this introduces custom arch_syscall_match_sym_name() which skips first 8 bytes when comparing the syscall and symbol names. Fixes: 4378a7d4be30 ("arm64: implement syscall wrappers") Reported-by: Naresh Kamboju <[email protected]> Signed-off-by: Masami Hiramatsu <[email protected]> Acked-by: Will Deacon <[email protected]> Tested-by: Naresh Kamboju <[email protected]> Cc: [email protected] Signed-off-by: Catalin Marinas <[email protected]>
2018-11-29arm64: Add workaround for Cortex-A76 erratum 1286807Catalin Marinas4-5/+45
On the affected Cortex-A76 cores (r0p0 to r3p0), if a virtual address for a cacheable mapping of a location is being accessed by a core while another core is remapping the virtual address to a new physical page using the recommended break-before-make sequence, then under very rare circumstances TLBI+DSB completes before a read using the translation being invalidated has been observed by other observers. The workaround repeats the TLBI+DSB operation and is shared with the Qualcomm Falkor erratum 1009 Reviewed-by: Suzuki K Poulose <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2018-11-29selinux: add support for RTM_NEWCHAIN, RTM_DELCHAIN, and RTM_GETCHAINPaul Moore1-1/+12
Commit 32a4f5ecd738 ("net: sched: introduce chain object to uapi") added new RTM_* definitions without properly updating SELinux, this patch adds the necessary SELinux support. While there was a BUILD_BUG_ON() in the SELinux code to protect from exactly this case, it was bypassed in the broken commit. In order to hopefully prevent this from happening in the future, add additional comments which provide some instructions on how to resolve the BUILD_BUG_ON() failures. Fixes: 32a4f5ecd738 ("net: sched: introduce chain object to uapi") Cc: <[email protected]> # 4.19 Acked-by: David S. Miller <[email protected]> Signed-off-by: Paul Moore <[email protected]>
2018-11-29dmaengine: at_hdmac: fix module unloadingRichard Genoud1-0/+2
of_dma_controller_free() was not called on module onloading. This lead to a soft lockup: watchdog: BUG: soft lockup - CPU#0 stuck for 23s! Modules linked in: at_hdmac [last unloaded: at_hdmac] when of_dma_request_slave_channel() tried to call ofdma->of_dma_xlate(). Cc: [email protected] Fixes: bbe89c8e3d59 ("at_hdmac: move to generic DMA binding") Acked-by: Ludovic Desroches <[email protected]> Signed-off-by: Richard Genoud <[email protected]> Signed-off-by: Vinod Koul <[email protected]>
2018-11-29dmaengine: at_hdmac: fix memory leak in at_dma_xlate()Richard Genoud1-1/+7
The leak was found when opening/closing a serial port a great number of time, increasing kmalloc-32 in slabinfo. Each time the port was opened, dma_request_slave_channel() was called. Then, in at_dma_xlate(), atslave was allocated with devm_kzalloc() and never freed. (Well, it was free at module unload, but that's not what we want). So, here, kzalloc is more suited for the job since it has to be freed in atc_free_chan_resources(). Cc: [email protected] Fixes: bbe89c8e3d59 ("at_hdmac: move to generic DMA binding") Reported-by: Mario Forner <[email protected]> Suggested-by: Alexandre Belloni <[email protected]> Acked-by: Alexandre Belloni <[email protected]> Acked-by: Ludovic Desroches <[email protected]> Signed-off-by: Richard Genoud <[email protected]> Signed-off-by: Vinod Koul <[email protected]>
2018-11-29Merge tag 'fixes-for-v4.20-rc4' of ↵Greg Kroah-Hartman3-67/+37
git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-linus Felipe writes: USB: fixes for v4.20-rc4 In this second set of fixes for the current -rc cycle, we have some regressions fixes for the old omap_udc driver done by Aaro Koskinen. We're also reverting an old patch on dwc3 which is, now, known to break USB certification in some cases. We have a fix on u_ether for an unsafe list iteration. * tag 'fixes-for-v4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb: usb: gadget: u_ether: fix unsafe list iteration USB: omap_udc: fix rejection of out transfers when DMA is used USB: omap_udc: fix USB gadget functionality on Palm Tungsten E USB: omap_udc: fix omap_udc_start() on 15xx machines USB: omap_udc: fix crashes on probe error and module removal USB: omap_udc: use devm_request_irq() Revert "usb: dwc3: gadget: skip Set/Clear Halt when invalid"
2018-11-28scsi: storvsc: Fix a race in sub-channel creation that can cause panicDexuan Cui1-31/+30
We can concurrently try to open the same sub-channel from 2 paths: path #1: vmbus_onoffer() -> vmbus_process_offer() -> handle_sc_creation(). path #2: storvsc_probe() -> storvsc_connect_to_vsp() -> -> storvsc_channel_init() -> handle_multichannel_storage() -> -> vmbus_are_subchannels_present() -> handle_sc_creation(). They conflict with each other, but it was not an issue before the recent commit ae6935ed7d42 ("vmbus: split ring buffer allocation from open"), because at the beginning of vmbus_open() we checked newchannel->state so only one path could succeed, and the other would return with -EINVAL. After ae6935ed7d42, the failing path frees the channel's ringbuffer by vmbus_free_ring(), and this causes a panic later. Commit ae6935ed7d42 itself is good, and it just reveals the longstanding race. We can resolve the issue by removing path #2, i.e. removing the second vmbus_are_subchannels_present() in handle_multichannel_storage(). BTW, the comment "Check to see if sub-channels have already been created" in handle_multichannel_storage() is incorrect: when we unload the driver, we first close the sub-channel(s) and then close the primary channel, next the host sends rescind-offer message(s) so primary->sc_list will become empty. This means the first vmbus_are_subchannels_present() in handle_multichannel_storage() is never useful. Fixes: ae6935ed7d42 ("vmbus: split ring buffer allocation from open") Cc: [email protected] Cc: Long Li <[email protected]> Cc: Stephen Hemminger <[email protected]> Cc: K. Y. Srinivasan <[email protected]> Cc: Haiyang Zhang <[email protected]> Signed-off-by: Dexuan Cui <[email protected]> Signed-off-by: K. Y. Srinivasan <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-11-29Merge tag 'drm-misc-fixes-2018-11-28-1' of ↵Dave Airlie5-8/+38
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes - mst: Don't try to validate ports while destroying them (Lyude) - Revert: Don't try to validate ports while destroying them (Lyude) - core: Don't set device to master unless set_master succeeds (Sergio) - meson: Do vblank_on/off on enable/disable (Neil) - meson: Use fast_io regmap option to avoid sleeping in irq ctx (Lyude) - meson: Don't walk off the end of the OSD EOTF LUTs (Lyude) Cc: Lyude Paul <[email protected]> Cc: Sergio Correia <[email protected]> Cc: Neil Armstrong <[email protected]> Signed-off-by: Dave Airlie <[email protected]> From: Sean Paul <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/20181128212936.GA21379@art_vandelay
2018-11-29Merge branch 'drm-fixes-4.20' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie3-12/+14
into drm-fixes Fixes for 4.20. Nothing major. - DC DP MST fix - GPUVM fix for huge page mapping - RLC fix for vega20 Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2018-11-29Merge tag 'drm-intel-fixes-2018-11-28' of ↵Dave Airlie3-3/+8
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes Just gvt-fixes-2018-11-26 ""One to correct MOCS registers load on engine list, one for rpm lock warning fix, and another for use-after-free fix for partial ggtt list destroy. " Signed-off-by: Dave Airlie <[email protected]> From: Joonas Lahtinen <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2018-11-29Merge tag 'du-fixes-20181126' of git://linuxtv.org/pinchartl/media into ↵Dave Airlie1-3/+18
drm-fixes R-Car DU v4.20 regression fix Signed-off-by: Dave Airlie <[email protected]> From: Laurent Pinchart <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/8134504.ZSXK7gKU4H@avalon
2018-11-28scsi: vmw_pscsi: Rearrange code to avoid multiple calls to free_irq during ↵Cathy Avery1-2/+2
unload Currently pvscsi_remove calls free_irq more than once as pvscsi_release_resources and __pvscsi_shutdown both call pvscsi_shutdown_intr. This results in a 'Trying to free already-free IRQ' warning and stack trace. To solve the problem pvscsi_shutdown_intr has been moved out of pvscsi_release_resources. Signed-off-by: Cathy Avery <[email protected]> Reviewed-by: Ewan D. Milne <[email protected]> Reviewed-by: Dan Carpenter <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-11-29drm/ast: fixed reading monitor EDID not stable issueY.C. Chen1-6/+30
v1: over-sample data to increase the stability with some specific monitors v2: refine to avoid infinite loop v3: remove un-necessary "volatile" declaration [airlied: fix two checkpatch warnings] Signed-off-by: Y.C. Chen <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2018-11-29drm/ast: Fix incorrect free on ioregsSam Bobroff1-1/+2
If the platform has no IO space, ioregs is placed next to the already allocated regs. In this case, it should not be separately freed. This prevents a kernel warning from __vunmap "Trying to vfree() nonexistent vm area" when unloading the driver. Fixes: 0dd68309b9c5 ("drm/ast: Try to use MMIO registers when PIO isn't supported") Signed-off-by: Sam Bobroff <[email protected]> Cc: <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
2018-11-28scsi: libiscsi: Fix NULL pointer dereference in iscsi_eh_session_resetFred Herard1-2/+2
This commit addresses NULL pointer dereference in iscsi_eh_session_reset. Reference should not be made to session->leadconn when session->state is set to ISCSI_STATE_TERMINATE. Signed-off-by: Fred Herard <[email protected]> Reviewed-by: Konrad Rzeszutek Wilk <[email protected]> Reviewed-by: Lee Duncan <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]>
2018-11-28Revert "drm/dp_mst: Skip validating ports during destruction, just ref"Lyude Paul1-13/+2
This reverts commit: c54c7374ff44 ("drm/dp_mst: Skip validating ports during destruction, just ref") ugh. In drm_dp_destroy_connector_work(), we have a pretty good chance of freeing the actual struct drm_dp_mst_port. However, after destroying things we send a hotplug through (*mgr->cbs->hotplug)(mgr) which is where the problems start. For i915, this calls all the way down to the fbcon probing helpers, which start trying to access the port in a modeset. [ 45.062001] ================================================================== [ 45.062112] BUG: KASAN: use-after-free in ex_handler_refcount+0x146/0x180 [ 45.062196] Write of size 4 at addr ffff8882b4b70968 by task kworker/3:1/53 [ 45.062325] CPU: 3 PID: 53 Comm: kworker/3:1 Kdump: loaded Tainted: G O 4.20.0-rc4Lyude-Test+ #3 [ 45.062442] Hardware name: LENOVO 20BWS1KY00/20BWS1KY00, BIOS JBET71WW (1.35 ) 09/14/2018 [ 45.062554] Workqueue: events drm_dp_destroy_connector_work [drm_kms_helper] [ 45.062641] Call Trace: [ 45.062685] dump_stack+0xbd/0x15a [ 45.062735] ? dump_stack_print_info.cold.0+0x1b/0x1b [ 45.062801] ? printk+0x9f/0xc5 [ 45.062847] ? kmsg_dump_rewind_nolock+0xe4/0xe4 [ 45.062909] ? ex_handler_refcount+0x146/0x180 [ 45.062970] print_address_description+0x71/0x239 [ 45.063036] ? ex_handler_refcount+0x146/0x180 [ 45.063095] kasan_report.cold.5+0x242/0x30b [ 45.063155] __asan_report_store4_noabort+0x1c/0x20 [ 45.063313] ex_handler_refcount+0x146/0x180 [ 45.063371] ? ex_handler_clear_fs+0xb0/0xb0 [ 45.063428] fixup_exception+0x98/0xd7 [ 45.063484] ? raw_notifier_call_chain+0x20/0x20 [ 45.063548] do_trap+0x6d/0x210 [ 45.063605] ? _GLOBAL__sub_I_65535_1_drm_dp_aux_unregister_devnode+0x2f/0x1c6 [drm_kms_helper] [ 45.063732] do_error_trap+0xc0/0x170 [ 45.063802] ? _GLOBAL__sub_I_65535_1_drm_dp_aux_unregister_devnode+0x2f/0x1c6 [drm_kms_helper] [ 45.063929] do_invalid_op+0x3b/0x50 [ 45.063997] ? _GLOBAL__sub_I_65535_1_drm_dp_aux_unregister_devnode+0x2f/0x1c6 [drm_kms_helper] [ 45.064103] invalid_op+0x14/0x20 [ 45.064162] RIP: 0010:_GLOBAL__sub_I_65535_1_drm_dp_aux_unregister_devnode+0x2f/0x1c6 [drm_kms_helper] [ 45.064274] Code: 00 48 c7 c7 80 fe 53 a0 48 89 e5 e8 5b 6f 26 e1 5d c3 48 8d 0e 0f 0b 48 8d 0b 0f 0b 48 8d 0f 0f 0b 48 8d 0f 0f 0b 49 8d 4d 00 <0f> 0b 49 8d 0e 0f 0b 48 8d 08 0f 0b 49 8d 4d 00 0f 0b 48 8d 0b 0f [ 45.064569] RSP: 0018:ffff8882b789ee10 EFLAGS: 00010282 [ 45.064637] RAX: ffff8882af47ae70 RBX: ffff8882af47aa60 RCX: ffff8882b4b70968 [ 45.064723] RDX: ffff8882af47ae70 RSI: 0000000000000008 RDI: ffff8882b788bdb8 [ 45.064808] RBP: ffff8882b789ee28 R08: ffffed1056f13db4 R09: ffffed1056f13db3 [ 45.064894] R10: ffffed1056f13db3 R11: ffff8882b789ed9f R12: ffff8882af47ad28 [ 45.064980] R13: ffff8882b4b70968 R14: ffff8882acd86728 R15: ffff8882b4b75dc8 [ 45.065084] drm_dp_mst_reset_vcpi_slots+0x12/0x80 [drm_kms_helper] [ 45.065225] intel_mst_disable_dp+0xda/0x180 [i915] [ 45.065361] intel_encoders_disable.isra.107+0x197/0x310 [i915] [ 45.065498] haswell_crtc_disable+0xbe/0x400 [i915] [ 45.065622] ? i9xx_disable_plane+0x1c0/0x3e0 [i915] [ 45.065750] intel_atomic_commit_tail+0x74e/0x3e60 [i915] [ 45.065884] ? intel_pre_plane_update+0xbc0/0xbc0 [i915] [ 45.065968] ? drm_atomic_helper_swap_state+0x88b/0x1d90 [drm_kms_helper] [ 45.066054] ? kasan_check_write+0x14/0x20 [ 45.066165] ? i915_gem_track_fb+0x13a/0x330 [i915] [ 45.066277] ? i915_sw_fence_complete+0xe9/0x140 [i915] [ 45.066406] ? __i915_sw_fence_complete+0xc50/0xc50 [i915] [ 45.066540] intel_atomic_commit+0x72e/0xef0 [i915] [ 45.066635] ? drm_dev_dbg+0x200/0x200 [drm] [ 45.066764] ? intel_atomic_commit_tail+0x3e60/0x3e60 [i915] [ 45.066898] ? intel_atomic_commit_tail+0x3e60/0x3e60 [i915] [ 45.067001] drm_atomic_commit+0xc4/0xf0 [drm] [ 45.067074] restore_fbdev_mode_atomic+0x562/0x780 [drm_kms_helper] [ 45.067166] ? drm_fb_helper_debug_leave+0x690/0x690 [drm_kms_helper] [ 45.067249] ? kasan_check_read+0x11/0x20 [ 45.067324] restore_fbdev_mode+0x127/0x4b0 [drm_kms_helper] [ 45.067364] ? kasan_check_read+0x11/0x20 [ 45.067406] drm_fb_helper_restore_fbdev_mode_unlocked+0x164/0x200 [drm_kms_helper] [ 45.067462] ? drm_fb_helper_hotplug_event+0x30/0x30 [drm_kms_helper] [ 45.067508] ? kasan_check_write+0x14/0x20 [ 45.070360] ? mutex_unlock+0x22/0x40 [ 45.073748] drm_fb_helper_set_par+0xb2/0xf0 [drm_kms_helper] [ 45.075846] drm_fb_helper_hotplug_event.part.33+0x1cd/0x290 [drm_kms_helper] [ 45.078088] drm_fb_helper_hotplug_event+0x1c/0x30 [drm_kms_helper] [ 45.082614] intel_fbdev_output_poll_changed+0x9f/0x140 [i915] [ 45.087069] drm_kms_helper_hotplug_event+0x67/0x90 [drm_kms_helper] [ 45.089319] intel_dp_mst_hotplug+0x37/0x50 [i915] [ 45.091496] drm_dp_destroy_connector_work+0x510/0x6f0 [drm_kms_helper] [ 45.093675] ? drm_dp_update_payload_part1+0x1220/0x1220 [drm_kms_helper] [ 45.095851] ? kasan_check_write+0x14/0x20 [ 45.098473] ? kasan_check_read+0x11/0x20 [ 45.101155] ? strscpy+0x17c/0x530 [ 45.103808] ? __switch_to_asm+0x34/0x70 [ 45.106456] ? syscall_return_via_sysret+0xf/0x7f [ 45.109711] ? read_word_at_a_time+0x20/0x20 [ 45.113138] ? __switch_to_asm+0x40/0x70 [ 45.116529] ? __switch_to_asm+0x34/0x70 [ 45.119891] ? __switch_to_asm+0x40/0x70 [ 45.123224] ? __switch_to_asm+0x34/0x70 [ 45.126540] ? __switch_to_asm+0x34/0x70 [ 45.129824] process_one_work+0x88d/0x15d0 [ 45.133172] ? pool_mayday_timeout+0x850/0x850 [ 45.136459] ? pci_mmcfg_check_reserved+0x110/0x128 [ 45.139739] ? wake_q_add+0xb0/0xb0 [ 45.143010] ? check_preempt_wakeup+0x652/0x1050 [ 45.146304] ? worker_enter_idle+0x29e/0x740 [ 45.149589] ? __schedule+0x1ec0/0x1ec0 [ 45.152937] ? kasan_check_read+0x11/0x20 [ 45.156179] ? _raw_spin_lock_irq+0xa3/0x130 [ 45.159382] ? _raw_read_unlock_irqrestore+0x30/0x30 [ 45.162542] ? kasan_check_write+0x14/0x20 [ 45.165657] worker_thread+0x1a5/0x1470 [ 45.168725] ? set_load_weight+0x2e0/0x2e0 [ 45.171755] ? process_one_work+0x15d0/0x15d0 [ 45.174806] ? __switch_to_asm+0x34/0x70 [ 45.177645] ? __switch_to_asm+0x40/0x70 [ 45.180323] ? __switch_to_asm+0x34/0x70 [ 45.182936] ? __switch_to_asm+0x40/0x70 [ 45.185539] ? __switch_to_asm+0x34/0x70 [ 45.188100] ? __switch_to_asm+0x40/0x70 [ 45.190628] ? __schedule+0x7d4/0x1ec0 [ 45.193143] ? save_stack+0xa9/0xd0 [ 45.195632] ? kasan_check_write+0x10/0x20 [ 45.198162] ? kasan_kmalloc+0xc4/0xe0 [ 45.200609] ? kmem_cache_alloc_trace+0xdd/0x190 [ 45.203046] ? kthread+0x9f/0x3b0 [ 45.205470] ? ret_from_fork+0x35/0x40 [ 45.207876] ? unwind_next_frame+0x43/0x50 [ 45.210273] ? __save_stack_trace+0x82/0x100 [ 45.212658] ? deactivate_slab.isra.67+0x3d4/0x580 [ 45.215026] ? default_wake_function+0x35/0x50 [ 45.217399] ? kasan_check_read+0x11/0x20 [ 45.219825] ? _raw_spin_lock_irqsave+0xae/0x140 [ 45.222174] ? __lock_text_start+0x8/0x8 [ 45.224521] ? replenish_dl_entity.cold.62+0x4f/0x4f [ 45.226868] ? __kthread_parkme+0x87/0xf0 [ 45.229200] kthread+0x2f7/0x3b0 [ 45.231557] ? process_one_work+0x15d0/0x15d0 [ 45.233923] ? kthread_park+0x120/0x120 [ 45.236249] ret_from_fork+0x35/0x40 [ 45.240875] Allocated by task 242: [ 45.243136] save_stack+0x43/0xd0 [ 45.245385] kasan_kmalloc+0xc4/0xe0 [ 45.247597] kmem_cache_alloc_trace+0xdd/0x190 [ 45.249793] drm_dp_add_port+0x1e0/0x2170 [drm_kms_helper] [ 45.252000] drm_dp_send_link_address+0x4a7/0x740 [drm_kms_helper] [ 45.254389] drm_dp_check_and_send_link_address+0x1a7/0x210 [drm_kms_helper] [ 45.256803] drm_dp_mst_link_probe_work+0x6f/0xb0 [drm_kms_helper] [ 45.259200] process_one_work+0x88d/0x15d0 [ 45.261597] worker_thread+0x1a5/0x1470 [ 45.264038] kthread+0x2f7/0x3b0 [ 45.266371] ret_from_fork+0x35/0x40 [ 45.270937] Freed by task 53: [ 45.273170] save_stack+0x43/0xd0 [ 45.275382] __kasan_slab_free+0x139/0x190 [ 45.277604] kasan_slab_free+0xe/0x10 [ 45.279826] kfree+0x99/0x1b0 [ 45.282044] drm_dp_free_mst_port+0x4a/0x60 [drm_kms_helper] [ 45.284330] drm_dp_destroy_connector_work+0x43e/0x6f0 [drm_kms_helper] [ 45.286660] process_one_work+0x88d/0x15d0 [ 45.288934] worker_thread+0x1a5/0x1470 [ 45.291231] kthread+0x2f7/0x3b0 [ 45.293547] ret_from_fork+0x35/0x40 [ 45.298206] The buggy address belongs to the object at ffff8882b4b70968 which belongs to the cache kmalloc-2k of size 2048 [ 45.303047] The buggy address is located 0 bytes inside of 2048-byte region [ffff8882b4b70968, ffff8882b4b71168) [ 45.308010] The buggy address belongs to the page: [ 45.310477] page:ffffea000ad2dc00 count:1 mapcount:0 mapping:ffff8882c080cf40 index:0x0 compound_mapcount: 0 [ 45.313051] flags: 0x8000000000010200(slab|head) [ 45.315635] raw: 8000000000010200 ffffea000aac2808 ffffea000abe8608 ffff8882c080cf40 [ 45.318300] raw: 0000000000000000 00000000000d000d 00000001ffffffff 0000000000000000 [ 45.320966] page dumped because: kasan: bad access detected [ 45.326312] Memory state around the buggy address: [ 45.329085] ffff8882b4b70800: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 45.331845] ffff8882b4b70880: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 45.334584] >ffff8882b4b70900: fc fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb [ 45.337302] ^ [ 45.340061] ffff8882b4b70980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 45.342910] ffff8882b4b70a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 45.345748] ================================================================== So, this definitely isn't a fix that we want. This being said; there's no real easy fix for this problem because of some of the catch-22's of the MST helpers current design. For starters; we always need to validate a port with drm_dp_get_validated_port_ref(), but validation relies on the lifetime of the port in the actual topology. So once the port is gone, it can't be validated again. If we were to try to make the payload helpers not use port validation, then we'd cause another problem: if the port isn't validated, it could be freed and we'd just start causing more KASAN issues. There are already hacks that attempt to workaround this in drm_dp_mst_destroy_connector_work() by re-initializing the kref so that it can be used again and it's memory can be freed once the VCPI helpers finish removing the port's respective payloads. But none of these really do anything helpful since the port still can't be validated since it's gone from the topology. Also, that workaround is immensely confusing to read through. What really needs to be done in order to fix this is to teach DRM how to track the lifetime of the structs for MST ports and branch devices separately from their lifetime in the actual topology. Simply put; this means having two different krefs-one that removes the port/branch device from the topology, and one that finally calls kfree(). This would let us simplify things, since we'd now be able to keep ports around without having to keep them in the topology at the same time, which is exactly what we need in order to teach our VCPI helpers to only validate ports when it's actually necessary without running the risk of trying to use unallocated memory. Such a fix is on it's way, but for now let's play it safe and just revert this. If this bug has been around for well over a year, we can wait a little while to get an actual proper fix here. Signed-off-by: Lyude Paul <[email protected]> Fixes: c54c7374ff44 ("drm/dp_mst: Skip validating ports during destruction, just ref") Cc: Daniel Vetter <[email protected]> Cc: Sean Paul <[email protected]> Cc: Jerry Zuo <[email protected]> Cc: Harry Wentland <[email protected]> Cc: [email protected] # v4.6+ Acked-by: Sean Paul <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2018-11-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds68-261/+1312
Pull networking fixes from David Miller: 1) ARM64 JIT fixes for subprog handling from Daniel Borkmann. 2) Various sparc64 JIT bug fixes (fused branch convergance, frame pointer usage detection logic, PSEODU call argument handling). 3) Fix to use BH locking in nf_conncount, from Taehee Yoo. 4) Fix race of TX skb freeing in ipheth driver, from Bernd Eckstein. 5) Handle return value of TX NAPI completion properly in lan743x driver, from Bryan Whitehead. 6) MAC filter deletion in i40e driver clears wrong state bit, from Lihong Yang. 7) Fix use after free in rionet driver, from Pan Bian. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (53 commits) s390/qeth: fix length check in SNMP processing net: hisilicon: remove unexpected free_netdev rapidio/rionet: do not free skb before reading its length i40e: fix kerneldoc for xsk methods ixgbe: recognize 1000BaseLX SFP modules as 1Gbps i40e: Fix deletion of MAC filters igb: fix uninitialized variables netfilter: nf_tables: deactivate expressions in rule replecement routine lan743x: Enable driver to work with LAN7431 tipc: fix lockdep warning during node delete lan743x: fix return value for lan743x_tx_napi_poll net: via: via-velocity: fix spelling mistake "alignement" -> "alignment" qed: fix spelling mistake "attnetion" -> "attention" net: thunderx: fix NULL pointer dereference in nic_remove sctp: increase sk_wmem_alloc when head->truesize is increased firestream: fix spelling mistake: "Inititing" -> "Initializing" net: phy: add workaround for issue where PHY driver doesn't bind to the device usbnet: ipheth: fix potential recvmsg bug and recvmsg bug 2 sparc: Adjust bpf JIT prologue for PSEUDO calls. bpf, doc: add entries of who looks over which jits ...
2018-11-28Merge tag 'xtensa-20181128' of git://github.com/jcmvbkbc/linux-xtensaLinus Torvalds3-13/+50
Pull Xtensa fixes from Max Filippov: - fix kernel exception on userspace access to a currently disabled coprocessor - fix coprocessor data saving/restoring in configurations with multiple coprocessors - fix ptrace access to coprocessor data on configurations with multiple coprocessors with high alignment requirements * tag 'xtensa-20181128' of git://github.com/jcmvbkbc/linux-xtensa: xtensa: fix coprocessor part of ptrace_{get,set}xregs xtensa: fix coprocessor context offset definitions xtensa: enable coprocessors that are being flushed
2018-11-28drm/amdgpu: Add delay after enable RLC ucodeshaoyunl1-3/+4
Driver shouldn't try to access any GFX registers until RLC is idle. During the test, it took 12 seconds for RLC to clear the BUSY bit in RLC_GPM_STAT register which is un-acceptable for driver. As per RLC engineer, it would take RLC Ucode less than 10,000 GFXCLK cycles to finish its critical section. In a lowest 300M enginer clock setting(default from vbios), 50 us delay is enough. This commit fix the hang when RLC introduce the work around for XGMI which requires more cycles to setup more registers than normal Signed-off-by: shaoyunl <[email protected]> Acked-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>