blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2017-06-06	sparc64: delete old wrap code	Pavel Tatashin	6	-45/+1
	The old method that is using xcall and softint to get new context id is deleted, as it is replaced by a method of using per_cpu_secondary_mm without xcall to perform the context wrap. Signed-off-by: Pavel Tatashin <[email protected]> Reviewed-by: Bob Picco <[email protected]> Reviewed-by: Steven Sistare <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-06	sparc64: new context wrap	Pavel Tatashin	1	-27/+54
	The current wrap implementation has a race issue: it is called outside of the ctx_alloc_lock, and also does not wait for all CPUs to complete the wrap. This means that a thread can get a new context with a new version and another thread might still be running with the same context. The problem is especially severe on CPUs with shared TLBs, like sun4v. I used the following test to very quickly reproduce the problem: - start over 8K processes (must be more than context IDs) - write and read values at a memory location in every process. Very quickly memory corruptions start happening, and what we read back does not equal what we wrote. Several approaches were explored before settling on this one: Approach 1: Move smp_new_mmu_context_version() inside ctx_alloc_lock, and wait for every process to complete the wrap. (Note: every CPU must WAIT before leaving smp_new_mmu_context_version_client() until every one arrives). This approach ends up with deadlocks, as some threads own locks which other threads are waiting for, and they never receive softint until these threads exit smp_new_mmu_context_version_client(). Since we do not allow the exit, deadlock happens. Approach 2: Handle wrap right during mondo interrupt. Use etrap/rtrap to enter into into C code, and issue new versions to every CPU. This approach adds some overhead to runtime: in switch_mm() we must add some checks to make sure that versions have not changed due to wrap while we were loading the new secondary context. (could be protected by PSTATE_IE but that degrades performance as on M7 and older CPUs as it takes 50 cycles for each access). Also, we still need a global per-cpu array of MMs to know where we need to load new contexts, otherwise we can change context to a thread that is going way (if we received mondo between switch_mm() and switch_to() time). Finally, there are some issues with window registers in rtrap() when context IDs are changed during CPU mondo time. The approach in this patch is the simplest and has almost no impact on runtime. We use the array with mm's where last secondary contexts were loaded onto CPUs and bump their versions to the new generation without changing context IDs. If a new process comes in to get a context ID, it will go through get_new_mmu_context() because of version mismatch. But the running processes do not need to be interrupted. And wrap is quicker as we do not need to xcall and wait for everyone to receive and complete wrap. Signed-off-by: Pavel Tatashin <[email protected]> Reviewed-by: Bob Picco <[email protected]> Reviewed-by: Steven Sistare <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-06	sparc64: add per-cpu mm of secondary contexts	Pavel Tatashin	2	-2/+4
	The new wrap is going to use information from this array to figure out mm's that currently have valid secondary contexts setup. Signed-off-by: Pavel Tatashin <[email protected]> Reviewed-by: Bob Picco <[email protected]> Reviewed-by: Steven Sistare <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-06	sparc64: redefine first version	Pavel Tatashin	2	-4/+4
	CTX_FIRST_VERSION defines the first context version, but also it defines first context. This patch redefines it to only include the first context version. Signed-off-by: Pavel Tatashin <[email protected]> Reviewed-by: Bob Picco <[email protected]> Reviewed-by: Steven Sistare <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-06	sparc64: combine activate_mm and switch_mm	Pavel Tatashin	1	-20/+1
	The only difference between these two functions is that in activate_mm we unconditionally flush context. However, there is no need to keep this difference after fixing a bug where cpumask was not reset on a wrap. So, in this patch we combine these. Signed-off-by: Pavel Tatashin <[email protected]> Reviewed-by: Bob Picco <[email protected]> Reviewed-by: Steven Sistare <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-06	sparc64: reset mm cpumask after wrap	Pavel Tatashin	1	-0/+2
	After a wrap (getting a new context version) a process must get a new context id, which means that we would need to flush the context id from the TLB before running for the first time with this ID on every CPU. But, we use mm_cpumask to determine if this process has been running on this CPU before, and this mask is not reset after a wrap. So, there are two possible fixes for this issue: 1. Clear mm cpumask whenever mm gets a new context id 2. Unconditionally flush context every time process is running on a CPU This patch implements the first solution Signed-off-by: Pavel Tatashin <[email protected]> Reviewed-by: Bob Picco <[email protected]> Reviewed-by: Steven Sistare <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-06	sparc/mm/hugepages: Fix setup_hugepagesz for invalid values.	Liam R. Howlett	1	-1/+2
	hugetlb_bad_size needs to be called on invalid values. Also change the pr_warn to a pr_err to better align with other platforms. Signed-off-by: Liam R. Howlett <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-06	sparc: Machine description indices can vary	James Clarke	2	-4/+65
	VIO devices were being looked up by their index in the machine description node block, but this often varies over time as devices are added and removed. Instead, store the ID and look up using the type, config handle and ID. Signed-off-by: James Clarke <[email protected]> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=112541 Signed-off-by: David S. Miller <[email protected]>
2017-06-06	sparc64: mm: fix copy_tsb to correctly copy huge page TSBs	Mike Kravetz	2	-6/+12
	When a TSB grows beyond its current capacity, a new TSB is allocated and copy_tsb is called to copy entries from the old TSB to the new. A hash shift based on page size is used to calculate the index of an entry in the TSB. copy_tsb has hard coded PAGE_SHIFT in these calculations. However, for huge page TSBs the value REAL_HPAGE_SHIFT should be used. As a result, when copy_tsb is called for a huge page TSB the entries are placed at the incorrect index in the newly allocated TSB. When doing hardware table walk, the MMU does not match these entries and we end up in the TSB miss handling code. This code will then create and write an entry to the correct index in the TSB. We take a performance hit for the table walk miss and recreation of these entries. Pass a new parameter to copy_tsb that is the page size shift to be used when copying the TSB. Suggested-by: Anthony Yznaga <[email protected]> Signed-off-by: Mike Kravetz <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-06	arch/sparc: support NR_CPUS = 4096	Jane Chu	2	-6/+15
	Linux SPARC64 limits NR_CPUS to 4064 because init_cpu_send_mondo_info() only allocates a single page for NR_CPUS mondo entries. Thus we cannot use all 4096 CPUs on some SPARC platforms. To fix, allocate (2^order) pages where order is set according to the size of cpu_list for possible cpus. Since cpu_list_pa and cpu_mondo_block_pa are not used in asm code, there are no imm13 offsets from the base PA that will break because they can only reach one page. Orabug: 25505750 Signed-off-by: Jane Chu <[email protected]> Reviewed-by: Bob Picco <[email protected]> Reviewed-by: Atish Patra <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-06	net: stmmac: fix a broken u32 less than zero check	Colin Ian King	1	-1/+1
	The check that queue is less or equal to zero is always true because queue is a u32; queue is decremented and will wrap around and never go -ve. Fix this by making queue an int. Detected by CoverityScan, CID#1428988 ("Unsigned compared against 0") Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-06	net: stmmac: fix completely hung TX when using TSO	Niklas Cassel	1	-1/+1
	stmmac_tso_allocator can fail to set the Last Descriptor bit on a descriptor that actually was the last descriptor. This happens when the buffer of the last descriptor ends up having a size of exactly TSO_MAX_BUFF_SIZE. When the IP eventually reaches the next last descriptor, which actually has the bit set, the DMA will hang. When the DMA hangs, we get a tx timeout, however, since stmmac does not do a complete reset of the IP in stmmac_tx_timeout, we end up in a state with completely hung TX. Signed-off-by: Niklas Cassel <[email protected]> Acked-by: Giuseppe Cavallaro <[email protected]> Acked-by: Alexandre TORGUE <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-06	net: ethoc: enable NAPI before poll may be scheduled	Max Filippov	1	-1/+2
	ethoc_reset enables device interrupts, ethoc_interrupt may schedule a NAPI poll before NAPI is enabled in the ethoc_open, which results in device being unable to send or receive anything until it's closed and reopened. In case the device is flooded with ingress packets it may be unable to recover at all. Move napi_enable above ethoc_reset in the ethoc_open to fix that. Fixes: a1702857724f ("net: Add support for the OpenCores 10/100 Mbps Ethernet MAC.") Signed-off-by: Max Filippov <[email protected]> Reviewed-by: Tobias Klauser <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-06	net: bridge: fix a null pointer dereference in br_afspec	Nikolay Aleksandrov	1	-1/+1
	We might call br_afspec() with p == NULL which is a valid use case if the action is on the bridge device itself, but the bridge tunnel code dereferences the p pointer without checking, so check if p is null first. Reported-by: Gustavo A. R. Silva <[email protected]> Fixes: efa5356b0d97 ("bridge: per vlan dst_metadata netlink support") Signed-off-by: Nikolay Aleksandrov <[email protected]> Acked-by: Roopa Prabhu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-06	ravb: Fix use-after-free on `ifconfig eth0 down`	Eugeniu Rosca	1	-12/+12
	Commit a47b70ea86bd ("ravb: unmap descriptors when freeing rings") has introduced the issue seen in [1] reproduced on H3ULCB board. Fix this by relocating the RX skb ringbuffer free operation, so that swiotlb page unmapping can be done first. Freeing of aligned TX buffers is not relevant to the issue seen in [1]. Still, reposition TX free calls as well, to have all kfree() operations performed consistently _after_ dma_unmap_()/dma_free_(). [1] Console screenshot with the problem reproduced: salvator-x login: root root@salvator-x:~# ifconfig eth0 up Micrel KSZ9031 Gigabit PHY e6800000.ethernet-ffffffff:00: \ attached PHY driver [Micrel KSZ9031 Gigabit PHY] \ (mii_bus:phy_addr=e6800000.ethernet-ffffffff:00, irq=235) IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready root@salvator-x:~# root@salvator-x:~# ifconfig eth0 down ================================================================== BUG: KASAN: use-after-free in swiotlb_tbl_unmap_single+0xc4/0x35c Write of size 1538 at addr ffff8006d884f780 by task ifconfig/1649 CPU: 0 PID: 1649 Comm: ifconfig Not tainted 4.12.0-rc4-00004-g112eb07287d1 #32 Hardware name: Renesas H3ULCB board based on r8a7795 (DT) Call trace: [<ffff20000808f11c>] dump_backtrace+0x0/0x3a4 [<ffff20000808f4d4>] show_stack+0x14/0x1c [<ffff20000865970c>] dump_stack+0xf8/0x150 [<ffff20000831f8b0>] print_address_description+0x7c/0x330 [<ffff200008320010>] kasan_report+0x2e0/0x2f4 [<ffff20000831eac0>] check_memory_region+0x20/0x14c [<ffff20000831f054>] memcpy+0x48/0x68 [<ffff20000869ed50>] swiotlb_tbl_unmap_single+0xc4/0x35c [<ffff20000869fcf4>] unmap_single+0x90/0xa4 [<ffff20000869fd14>] swiotlb_unmap_page+0xc/0x14 [<ffff2000080a2974>] __swiotlb_unmap_page+0xcc/0xe4 [<ffff2000088acdb8>] ravb_ring_free+0x514/0x870 [<ffff2000088b25dc>] ravb_close+0x288/0x36c [<ffff200008aaf8c4>] __dev_close_many+0x14c/0x174 [<ffff200008aaf9b4>] __dev_close+0xc8/0x144 [<ffff200008ac2100>] __dev_change_flags+0xd8/0x194 [<ffff200008ac221c>] dev_change_flags+0x60/0xb0 [<ffff200008ba2dec>] devinet_ioctl+0x484/0x9d4 [<ffff200008ba7b78>] inet_ioctl+0x190/0x194 [<ffff200008a78c44>] sock_do_ioctl+0x78/0xa8 [<ffff200008a7a128>] sock_ioctl+0x110/0x3c4 [<ffff200008365a70>] vfs_ioctl+0x90/0xa0 [<ffff200008365dbc>] do_vfs_ioctl+0x148/0xc38 [<ffff2000083668f0>] SyS_ioctl+0x44/0x74 [<ffff200008083770>] el0_svc_naked+0x24/0x28 The buggy address belongs to the page: page:ffff7e001b6213c0 count:0 mapcount:0 mapping: (null) index:0x0 flags: 0x4000000000000000() raw: 4000000000000000 0000000000000000 0000000000000000 00000000ffffffff raw: 0000000000000000 ffff7e001b6213e0 0000000000000000 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff8006d884f680: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff8006d884f700: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff >ffff8006d884f780: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ^ ffff8006d884f800: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff8006d884f880: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ================================================================== Disabling lock debugging due to kernel taint root@salvator-x:~# Fixes: a47b70ea86bd ("ravb: unmap descriptors when freeing rings") Signed-off-by: Eugeniu Rosca <[email protected]> Acked-by: Sergei Shtylyov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-06	net/ipv6: Fix CALIPSO causing GPF with datagram support	Richard Haines	1	-1/+5
	When using CALIPSO with IPPROTO_UDP it is possible to trigger a GPF as the IP header may have moved. Also update the payload length after adding the CALIPSO option. Signed-off-by: Richard Haines <[email protected]> Acked-by: Paul Moore <[email protected]> Signed-off-by: Huw Davies <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-06	net: stmmac: ensure jumbo_frm error return is correctly checked for -ve value	Colin Ian King	1	-1/+2
	The current comparison of entry < 0 will never be true since entry is an unsigned integer. Make entry an int to ensure -ve error return values from the call to jumbo_frm are correctly being caught. Detected by CoverityScan, CID#1238760 ("Macro compares unsigned to 0") Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-06-06	Merge tag 'wireless-drivers-for-davem-2017-06-06' of ↵	David S. Miller	19	-79/+118
	git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for 4.12 It has been a slow start of cycle and this the first set of fixes for 4.12. Nothing really major here. wcn36xx * fix an issue with module reload brcmfmac * fix aligment regression on 64 bit systems iwlwifi * fixes for memory leaks, runtime PM, memory initialisation and other smaller problems * fix IBSS on devices using DQA mode (7260 and up) * fix the minimum firmware API requirement for 7265D, 3168, 8000 and 8265 ==================== Signed-off-by: David S. Miller <[email protected]>
2017-06-06	Merge tag 'media/v4.12-2' of ↵	Linus Torvalds	19	-48/+45
	git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "Some bug fixes: - Don't fail build if atomisp has warnings - Some CEC Kconfig changes to allow it to be used by DRM without media dependencies - A race fix at RC initialization code - A driver fix at rainshadow-cec IMHO, the one that affects most people in this series is a build fix: if you try to build the Kernel with W=1 or using gcc7 and all[yes\|mod]config, build will fail due to -Werror at atomisp makefiles" * tag 'media/v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] rc-core: race condition during ir_raw_event_register() [media] cec: drop MEDIA_CEC_DEBUG [media] cec: rename MEDIA_CEC_NOTIFIER to CEC_NOTIFIER [media] cec: select CEC_CORE instead of depend on it [media] rainshadow-cec: ensure exit_loop is intialized [media] atomisp: don't treat warnings as errors
2017-06-06	Merge branch '40GbE' of ↵	David S. Miller	3	-20/+22
	git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2017-06-06 This series contains fixes to i40e and i40evf only. Mauro S. M. Rodrigues fixes a flood in the kernel log which was introduced in a previous commit because of a mistaken substitution of __I40E_VSI_DOWN instead of __I40E_DOWN when testing the state of the PF. Björn Töpel fixes an issue introduced in a previous commit where the offset was incorrect and could lead to data corruption for architectures using PAGE_SIZE larger than 8191. Fixed the issue by updating the page_offset correctly using the proper setting for truesize. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-06-06	Revert "sit: reload iphdr in ipip6_rcv"	David S. Miller	1	-1/+0
	This reverts commit b699d0035836f6712917a41e7ae58d84359b8ff9. As per Eric Dumazet, the pskb_may_pull() is a NOP in this particular case, so the 'iph' reload is unnecessary. Signed-off-by: David S. Miller <[email protected]>
2017-06-06	drm: kirin: Fix drm_of_find_panel_or_bridge conversion	John Stultz	1	-1/+1
	This fixes a regression introduced by ebc944613567 ("drm: convert drivers to use drm_of_find_panel_or_bridge") that was recently merged, causing HDMI output to not work. For the kirin driver, the port value should be 1 instead of 0, so this oneline patch fixes it and gets graphics working again. Cc: Rob Herring <[email protected]> Cc: Archit Taneja <[email protected]> Cc: Philipp Zabel <[email protected]> Cc: Maxime Ripard <[email protected]> Cc: Sean Paul <[email protected]> Cc: Dave Airlie <[email protected]> Cc: Xinliang Liu <[email protected]> Fix-suggested-by: Rob Herring <[email protected]> Signed-off-by: John Stultz <[email protected]> Reviewed-by: Xinliang Liu <[email protected]> Signed-off-by: Sean Paul <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2017-06-06	powerpc/perf: Fix Power9 test_adder fields	Madhavan Srinivasan	1	-2/+2
	Commit 8d911904f3ce4 ('powerpc/perf: Add restrictions to PMC5 in power9 DD1') was added to restrict the use of PMC5 in Power9 DD1. Intention was to disable the use of PMC5 using raw event code. But instead of updating the power9_isa207_pmu structure (used on DD1), the commit incorrectly updated the power9_pmu structure. Fix it. Fixes: 8d911904f3ce ("powerpc/perf: Add restrictions to PMC5 in power9 DD1") Reported-by: Shriya <[email protected]> Signed-off-by: Madhavan Srinivasan <[email protected]> Tested-by: Shriya <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-06-06	powerpc/numa: Fix percpu allocations to be NUMA aware	Michael Ellerman	2	-2/+16
	In commit 8c272261194d ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID"), we switched to the generic implementation of cpu_to_node(), which uses a percpu variable to hold the NUMA node for each CPU. Unfortunately we neglected to notice that we use cpu_to_node() in the allocation of our percpu areas, leading to a chicken and egg problem. In practice what happens is when we are setting up the percpu areas, cpu_to_node() reports that all CPUs are on node 0, so we allocate all percpu areas on node 0. This is visible in the dmesg output, as all pcpu allocs being in group 0: pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 pcpu-alloc: [0] 24 25 26 27 [0] 28 29 30 31 pcpu-alloc: [0] 32 33 34 35 [0] 36 37 38 39 pcpu-alloc: [0] 40 41 42 43 [0] 44 45 46 47 To fix it we need an early_cpu_to_node() which can run prior to percpu being setup. We already have the numa_cpu_lookup_table we can use, so just plumb it in. With the patch dmesg output shows two groups, 0 and 1: pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 pcpu-alloc: [1] 24 25 26 27 [1] 28 29 30 31 pcpu-alloc: [1] 32 33 34 35 [1] 36 37 38 39 pcpu-alloc: [1] 40 41 42 43 [1] 44 45 46 47 We can also check the data_offset in the paca of various CPUs, with the fix we see: CPU 0: data_offset = 0x0ffe8b0000 CPU 24: data_offset = 0x1ffe5b0000 And we can see from dmesg that CPU 24 has an allocation on node 1: node 0: [mem 0x0000000000000000-0x0000000fffffffff] node 1: [mem 0x0000001000000000-0x0000001fffffffff] Cc: [email protected] # v3.16+ Fixes: 8c272261194d ("powerpc/numa: Enable USE_PERCPU_NUMA_NODE_ID") Signed-off-by: Michael Ellerman <[email protected]> Reviewed-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-06-06	i40e/i40evf: proper update of the page_offset field	Björn Töpel	2	-2/+4
	In f8b45b74cc62 ("i40e/i40evf: Use build_skb to build frames") i40e_build_skb updates the page_offset field with an incorrect offset, which can lead to data corruption. This patch updates page_offset correctly, by properly setting truesize. Note that the bug only appears on architectures where PAGE_SIZE is 8192 or larger. Fixes: f8b45b74cc62 ("i40e/i40evf: Use build_skb to build frames") Signed-off-by: Björn Töpel <[email protected]> Acked-by: Alexander Duyck <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-06-06	i40e: Fix state flags for bit set and clean operations of PF	Mauro S. M. Rodrigues	1	-18/+18
	Commit 0da36b9774cc ("i40e: use DECLARE_BITMAP for state fields") introduced changes in the way i40e works with state flags converting them to bitmaps using kernel bitmap API. This change introduced a regression due to a mistaken substitution using __I40E_VSI_DOWN instead of __I40E_DOWN when testing state of a PF at i40e_reset_subtask() function. This caused a flood in the kernel log with the follow message: [49.013] i40e 0002:01:00.0: bad reset request 0x00000020 Commit d19cb64b9222 ("i40e: separate PF and VSI state flags") also introduced some misuse of the VSI and PF flags, so both could be considered as the offenders. This patch simply fixes the flags where it makes sense by changing __I40E_VSI_DOWN to __I40E_DOWN. Fixes: 0da36b9774cc ("i40e: use DECLARE_BITMAP for state fields") Fixes: d19cb64b9222 ("i40e: separate PF and VSI state flags") Reviewed-by: "Guilherme G. Piccoli" <[email protected]> Signed-off-by: "Mauro S. M. Rodrigues" <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2017-06-06	cxl: Avoid double free_irq() for psl,slice interrupts	Vaibhav Jain	1	-3/+11
	During an eeh call to cxl_remove can result in double free_irq of psl,slice interrupts. This can happen if perst_reloads_same_image == 1 and call to cxl_configure_adapter() fails during slot_reset callback. In such a case we see a kernel oops with following back-trace: Oops: Kernel access of bad area, sig: 11 [#1] Call Trace: free_irq+0x88/0xd0 (unreliable) cxl_unmap_irq+0x20/0x40 [cxl] cxl_native_release_psl_irq+0x78/0xd8 [cxl] pci_deconfigure_afu+0xac/0x110 [cxl] cxl_remove+0x104/0x210 [cxl] pci_device_remove+0x6c/0x110 device_release_driver_internal+0x204/0x2e0 pci_stop_bus_device+0xa0/0xd0 pci_stop_and_remove_bus_device+0x28/0x40 pci_hp_remove_devices+0xb0/0x150 pci_hp_remove_devices+0x68/0x150 eeh_handle_normal_event+0x140/0x580 eeh_handle_event+0x174/0x360 eeh_event_handler+0x1e8/0x1f0 This patch fixes the issue of double free_irq by checking that variables that hold the virqs (err_hwirq, serr_hwirq, psl_virq) are not '0' before un-mapping and resetting these variables to '0' when they are un-mapped. Cc: [email protected] Signed-off-by: Vaibhav Jain <[email protected]> Reviewed-by: Andrew Donnellan <[email protected]> Acked-by: Frederic Barrat <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-06-06	powerpc/kernel: Initialize load_tm on task creation	Breno Leitao	1	-0/+1
	Currently tsk->thread.load_tm is not initialized in the task creation and can contain garbage on a new task. This is an undesired behaviour, since it affects the timing to enable and disable the transactional memory laziness (disabling and enabling the MSR TM bit, which affects TM reclaim and recheckpoint in the scheduling process). Fixes: 5d176f751ee3 ("powerpc: tm: Enable transactional memory (TM) lazily for userspace") Cc: [email protected] # v4.9+ Signed-off-by: Breno Leitao <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2017-06-06	gpu: ipu-v3: Fix CSI selection for VDIC	Marek Vasut	1	-7/+8
	The description of the CSI_SEL bit in the i.MX6 reference manual is incorrect. It states "This bit defines which CSI is the input to the IC. This bit is effective only if IC_INPUT is bit cleared". From experiment it was found this is in fact not correct. The CSI_SEL bit selects which CSI is input to _both_ the VDIC _and_ the IC. If the IC_INPUT bit is set so that the IC is receiving from the VDIC, the IC ignores the CSI_SEL bit, but CSI_SEL still selects which CSI the VDIC receives from in that case. Signed-off-by: Marek Vasut <[email protected]> Signed-off-by: Steve Longerbeam <[email protected]> Signed-off-by: Philipp Zabel <[email protected]>
2017-06-06	drm/imx: imx-ldb: Accept drm_of_find_panel_or_bridge failure	Leonard Crestez	1	-1/+1
	Not having an endpoint bound in DT should not cause a failure here, there are fallbacks. So explicitly accept a missing endpoint. This behavior change was introduced by refactoring in drm_of parsing code and it should not require dts changes. In particular this fixes imx6qdl-sabreauto boards. Link: https://lists.freedesktop.org/archives/dri-devel/2017-May/141233.html Fixes: ebc944613567 ("drm: convert drivers to use drm_of_find_panel_or_bridge") Signed-off-by: Leonard Crestez <[email protected]> Signed-off-by: Philipp Zabel <[email protected]>
2017-06-06	gpu: ipu-v3: pre: only use internal clock gating	Lucas Stach	1	-8/+5
	By setting the SFTRST bit, the PRE will be held in the lowest power state with clocks to the internal blocks gated. When external clock gating is used (from the external clock controller, or by setting the CLKGATE bit) the PRE will sporadically fail to start. Signed-off-by: Lucas Stach <[email protected]> Fixes: d2a34232580a ("gpu: ipu-v3: add driver for Prefetch Resolve Engine") Signed-off-by: Philipp Zabel <[email protected]>
2017-06-06	Merge tag 'drm-misc-fixes-2017-06-02' of ↵	Dave Airlie	7	-20/+32
	git://anongit.freedesktop.org/git/drm-misc into drm-fixes Core Changes: - Grab locks in drm_atomic_helper_resume() (Daniel) - Fix oops when unplugging USB device (expand cleanup in drm_unplug_dev) (Hans) Driver Changes: - rockchip: Don't output 10-bit format to 8-bit encoders (Mark) Cc: Mark yao <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Hans de Goede <[email protected]> * tag 'drm-misc-fixes-2017-06-02' of git://anongit.freedesktop.org/git/drm-misc: drm: Fix oops + Xserver hang when unplugging USB drm devices drm: Fix locking in drm_atomic_helper_resume drm/rockchip: Correct vop out_mode configure
2017-06-06	Merge branch 'linux-4.12' of git://github.com/skeggsb/linux into drm-fixes	Dave Airlie	6	-38/+27
	4 nouveau regression fixes. * 'linux-4.12' of git://github.com/skeggsb/linux: drm/nouveau/tmr: fully separate alarm execution/pending lists drm/nouveau: enable autosuspend only when it'll actually be used drm/nouveau: replace multiple open-coded runpm support checks with function drm/nouveau/kms/nv50: add null check before pointer dereference
2017-06-06	drm/nouveau/tmr: fully separate alarm execution/pending lists	Ben Skeggs	2	-3/+5
	Reusing the list_head for both is a bad idea. Callback execution is done with the lock dropped so that alarms can be rescheduled from the callback, which means that with some unfortunate timing, lists can get corrupted. The execution list should not require its own locking, the single function that uses it can only be called from a single context. Signed-off-by: Ben Skeggs <[email protected]> Cc: [email protected]
2017-06-06	drm/nouveau: enable autosuspend only when it'll actually be used	Ben Skeggs	1	-2/+2
	This prevents a deadlock that somehow results from the suspend() -> forbid() -> resume() callchain. [ 125.266960] [drm] Initialized nouveau 1.3.1 20120801 for 0000:02:00.0 on minor 1 [ 370.120872] INFO: task kworker/4:1:77 blocked for more than 120 seconds. [ 370.120920] Tainted: G O 4.12.0-rc3 #20 [ 370.120947] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 370.120982] kworker/4:1 D13808 77 2 0x00000000 [ 370.120998] Workqueue: pm pm_runtime_work [ 370.121004] Call Trace: [ 370.121018] __schedule+0x2bf/0xb40 [ 370.121025] ? mark_held_locks+0x5f/0x90 [ 370.121038] schedule+0x3d/0x90 [ 370.121044] rpm_resume+0x107/0x870 [ 370.121052] ? finish_wait+0x90/0x90 [ 370.121065] ? pci_pm_runtime_resume+0xa0/0xa0 [ 370.121070] pm_runtime_forbid+0x4c/0x60 [ 370.121129] nouveau_pmops_runtime_suspend+0xaf/0xc0 [nouveau] [ 370.121139] pci_pm_runtime_suspend+0x5f/0x170 [ 370.121147] ? pci_pm_runtime_resume+0xa0/0xa0 [ 370.121152] __rpm_callback+0xb9/0x1e0 [ 370.121159] ? pci_pm_runtime_resume+0xa0/0xa0 [ 370.121166] rpm_callback+0x24/0x80 [ 370.121171] ? pci_pm_runtime_resume+0xa0/0xa0 [ 370.121176] rpm_suspend+0x138/0x6e0 [ 370.121192] pm_runtime_work+0x7b/0xc0 [ 370.121199] process_one_work+0x253/0x6a0 [ 370.121216] worker_thread+0x4d/0x3b0 [ 370.121229] kthread+0x133/0x150 [ 370.121234] ? process_one_work+0x6a0/0x6a0 [ 370.121238] ? kthread_create_on_node+0x70/0x70 [ 370.121246] ret_from_fork+0x2a/0x40 [ 370.121283] Showing all locks held in the system: [ 370.121291] 2 locks held by kworker/4:1/77: [ 370.121298] #0: ("pm"){.+.+.+}, at: [<ffffffffac0d3530>] process_one_work+0x1d0/0x6a0 [ 370.121315] #1: ((&dev->power.work)){+.+.+.}, at: [<ffffffffac0d3530>] process_one_work+0x1d0/0x6a0 [ 370.121330] 1 lock held by khungtaskd/81: [ 370.121333] #0: (tasklist_lock){.+.+..}, at: [<ffffffffac10fc8d>] debug_show_all_locks+0x3d/0x1a0 [ 370.121355] 1 lock held by dmesg/1639: [ 370.121358] #0: (&user->lock){+.+.+.}, at: [<ffffffffac124b6d>] devkmsg_read+0x4d/0x360 [ 370.121377] ============================================= Signed-off-by: Ben Skeggs <[email protected]>
2017-06-06	drm/nouveau: replace multiple open-coded runpm support checks with function	Ben Skeggs	3	-32/+18
	Signed-off-by: Ben Skeggs <[email protected]>
2017-06-06	drm/nouveau/kms/nv50: add null check before pointer dereference	Gustavo A. R. Silva	1	-1/+2
	Add null check before dereferencing pointer asyc Addresses-Coverity-ID: 1397932 Signed-off-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: Ben Skeggs <[email protected]>
2017-06-05	Merge branch 'for-4.12-fixes' of ↵	Linus Torvalds	4	-2/+28
	git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "Two cgroup fixes. One to address RCU delay of cpuset removal affecting userland visible behaviors. The other fixes a race condition between controller disable and cgroup removal" * 'for-4.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cpuset: consider dying css as offline cgroup: Prevent kill_css() from being called more than once
2017-06-05	Merge branch 'for-4.12-fixes' of ↵	Linus Torvalds	5	-11/+62
	git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata fixes from Tejun Heo: - Revert of sata_mv devm_ioremap_resource() conversion. It made init fail if there are overlapping resources which led to detection failures on some setups. - A workaround for an Acer laptop which sometimes reports corrupt port map. - Other non-critical fixes. * 'for-4.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: libata: fix error checking in in ata_parse_force_one() Revert "ata: sata_mv: Convert to devm_ioremap_resource()" ata: libahci: properly propagate return value of platform_get_irq() ata: sata_rcar: Handle return value of clk_prepare_enable ahci: Acer SA5-271 SSD Not Detected Fix
2017-06-05	Merge tag 'iwlwifi-for-kalle-2017-06-05' of ↵	Kalle Valo	17	-78/+115
	git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes Fixes for 4.12: * Some memory leaks; * IBSS support; * Some bugzilla bugs; * Some runtime PM fixes; * Rate-scaling issues; * Some locking problems;
2017-06-05	iwlwifi: fix host command memory leaks	Shahar S Matityahu	1	-3/+6
	Sending host command with CMD_WANT_SKB flag demands the release of the response buffer with iwl_free_resp function. The patch adds the memory release in all the relevant places Signed-off-by: Shahar S Matityahu <[email protected]> Signed-off-by: Luca Coelho <[email protected]>
2017-06-05	iwlwifi: fix min API version for 7265D, 3168, 8000 and 8265	Luca Coelho	2	-4/+4
	In a previous commit, we removed support for API versions earlier than 22 for these NICs. By mistake, the *_UCODE_API_MIN definitions were set to 17. Fix that. Fixes: 4b87e5af638b ("iwlwifi: remove support for fw older than -17 and -22") Signed-off-by: Luca Coelho <[email protected]>
2017-06-05	iwlwifi: mvm: clear new beacon command template struct	Johannes Berg	1	-1/+1
	Clear the struct so that all reserved fields are zero when we send the struct down to the device. Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Luca Coelho <[email protected]>
2017-06-05	iwlwifi: mvm: don't fail when removing a key from an inexisting sta	Luca Coelho	1	-8/+5
	The iwl_mvm_remove_sta_key() function handles removing a key when the sta doesn't exist anymore. Mistakenly, this was changed to return an error while fixing another bug. If the mvm_sta doesn't exist, we continue normally, but just don't try to remove the igtk key. Fixes: cd4d23c1ea9b ("iwlwifi: mvm: Fix removal of IGTK") Signed-off-by: Luca Coelho <[email protected]>
2017-06-05	iwlwifi: pcie: only use d0i3 in suspend/resume if system_pm is set to d0i3	Luca Coelho	1	-2/+4
	We only need to handle d0i3 entry and exit during suspend resume if system_pm is set to IWL_PLAT_PM_MODE_D0I3, otherwise d0i3 entry failures will cause suspend to fail. This fixes https://bugzilla.kernel.org/show_bug.cgi?id=194791 Signed-off-by: Luca Coelho <[email protected]>
2017-06-05	iwlwifi: mvm: fix firmware debug restart recording	Emmanuel Grumbach	4	-19/+27
	When we want to stop the recording of the firmware debug and restart it later without reloading the firmware we don't need to resend the configuration that comes with host commands. Sending those commands confused the hardware and led to an NMI 0x66. Change the flow as following: * read the relevant registers (DBGC_IN_SAMPLE, DBGC_OUT_CTRL) * clear those registers * wait for the hardware to complete its write to the buffer * get the data * restore the value of those registers (to restart the recording) For early start (where the configuration is already compiled in the firmware), we don't need to set those registers after the firmware has been loaded, but only when we want to restart the recording without having restarted the firmware. Signed-off-by: Emmanuel Grumbach <[email protected]> Signed-off-by: Luca Coelho <[email protected]>
2017-06-05	iwlwifi: tt: move ucode_loaded check under mutex	Johannes Berg	1	-3/+5
	The ucode_loaded check should be under the mutex, since it can otherwise change state after we looked at it and before we got the mutex. Fix that. Fixes: 5c89e7bc557e ("iwlwifi: mvm: add registration to cooling device") Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Luca Coelho <[email protected]>
2017-06-05	iwlwifi: mvm: support ibss in dqa mode	Liad Kaufman	1	-1/+12
	Allow working IBSS also when working in DQA mode. This is done by setting it to treat the queues the same as a BSS AP treats the queues. Fixes: 7948b87308a4 ("iwlwifi: mvm: enable dynamic queue allocation mode") Signed-off-by: Liad Kaufman <[email protected]> Signed-off-by: Luca Coelho <[email protected]>
2017-06-05	iwlwifi: mvm: Fix command queue number on d0i3 flow	Haim Dreyfuss	1	-1/+4
	During d0i3 flow we flush all the queue except from the command queue. Currently, in this flow the command queue is hard coded to 9. In DQA the command queue number has changed from 9 to 0. Fix that. This fixes a problem in runtime PM resume flow. Fixes: 097129c9e625 ("iwlwifi: mvm: move cmd queue to be #0 in dqa mode") Signed-off-by: Haim Dreyfuss <[email protected]> Signed-off-by: Luca Coelho <[email protected]>
2017-06-05	iwlwifi: mvm: rs: start using LQ command color	Gregory Greenman	6	-36/+47
	Up until now, the driver was comparing the rate reported by the FW and the rate of the latest LQ command to avoid processing data belonging to the old LQ command. Recently, FW changed the meaning of the initial rate field in tx response and it holds the actual rate (which is not necessarily the initial rate of LQ's rate table). Use instead LQ cmd color to be able to filter out tx responses/BA notifications which where sent during earlier LQ commands' time frame. This fixes some throughput degradation in noisy environments. Signed-off-by: Gregory Greenman <[email protected]> Signed-off-by: Luca Coelho <[email protected]>