aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-04-24perf stat: Print out hint for mixed PMU group errorKan Liang1-1/+34
Perf doesn't support mixed events from different PMUs (except software event) in a group. For this case, only "<not counted>" or "<not supported>" are printed out. There is no hint which guides users to fix the issue. Checking the PMU type of events to determine if they are from the same PMU. There may be false alarm for the checking. E.g. the core PMU has different PMU type. But it should not happen often. The false alarm can also be tolerated, because: - It only happens on error path. - It just provides a possible solution for the issue. Signed-off-by: Kan Liang <[email protected]> Cc: Agustin Vega-Frias <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ganapatrao Kulkarni <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Shaokun Zhang <[email protected]> Cc: Will Deacon <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-04-24perf pmu: Fix core PMU alias list for X86 platformKan Liang1-13/+7
When counting uncore event with alias, core event is mistakenly involved, for example: perf stat --no-merge -e "unc_m_cas_count.all" -C0 sleep 1 Performance counter stats for 'CPU(s) 0': 0 unc_m_cas_count.all [uncore_imc_4] 0 unc_m_cas_count.all [uncore_imc_2] 0 unc_m_cas_count.all [uncore_imc_0] 153,640 unc_m_cas_count.all [cpu] 0 unc_m_cas_count.all [uncore_imc_5] 25,026 unc_m_cas_count.all [uncore_imc_3] 0 unc_m_cas_count.all [uncore_imc_1] 1.001447890 seconds time elapsed The reason is that current implementation doesn't check PMU name of a event when adding its alias into the alias list for core PMU. The uncore event aliases are mistakenly added. This bug was introduced in: commit 14b22ae028de ("perf pmu: Add helper function is_pmu_core to detect PMU CORE devices") Checking the PMU name for all PMUs on X86 and other architectures except ARM. There is no behavior change for ARM. Signed-off-by: Kan Liang <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Agustin Vega-Frias <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ganapatrao Kulkarni <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Shaokun Zhang <[email protected]> Cc: Will Deacon <[email protected]> Fixes: 14b22ae028de ("perf pmu: Add helper function is_pmu_core to detect PMU CORE devices") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-04-24virtio_balloon: add array of stat namesMichael S. Tsirkin1-0/+15
Jason Wang points out that it's very hard for users to build an array of stat names. The naive thing is to use VIRTIO_BALLOON_S_NR but that breaks if we add more stats - as done e.g. recently by commit 6c64fe7f2 ("virtio_balloon: export hugetlb page allocation counts"). Let's add an array of reasonably readable names. Fixes: 6c64fe7f2 ("virtio_balloon: export hugetlb page allocation counts") Cc: Jason Wang <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Jonathan Helman <[email protected]>
2018-04-24arm64: support __int128 with clangJason A. Donenfeld1-0/+4
Commit fb8722735f50 ("arm64: support __int128 on gcc 5+") added support for arm64 __int128 with gcc with a version-conditional, but neglected to enable this for clang, which in fact appears to support aarch64 __int128. This commit therefore enables it if the compiler is clang, using the same type of makefile conditional used elsewhere in the tree. Signed-off-by: Jason A. Donenfeld <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2018-04-24arm64: only advance singlestep for user instruction trapsMark Rutland1-1/+2
Our arm64_skip_faulting_instruction() helper advances the userspace singlestep state machine, but this is also called by the kernel BRK handler, as used for WARN*(). Thus, if we happen to hit a WARN*() while the user singlestep state machine is in the active-no-pending state, we'll advance to the active-pending state without having executed a user instruction, and will take a step exception earlier than expected when we return to userspace. Let's fix this by only advancing the state machine when skipping a user instruction. Signed-off-by: Mark Rutland <[email protected]> Cc: Andrey Konovalov <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2018-04-24arm64/kernel: rename module_emit_adrp_veneer->module_emit_veneer_for_adrpKim Phillips3-3/+3
Commit a257e02579e ("arm64/kernel: don't ban ADRP to work around Cortex-A53 erratum #843419") introduced a function whose name ends with "_veneer". This clashes with commit bd8b22d2888e ("Kbuild: kallsyms: ignore veneers emitted by the ARM linker"), which removes symbols ending in "_veneer" from kallsyms. The problem was manifested as 'perf test -vvvvv vmlinux' failed, correctly claiming the symbol 'module_emit_adrp_veneer' was present in vmlinux, but not in kallsyms. ... ERR : 0xffff00000809aa58: module_emit_adrp_veneer not on kallsyms ... test child finished with -1 ---- end ---- vmlinux symtab matches kallsyms: FAILED! Fix the problem by renaming module_emit_adrp_veneer to module_emit_veneer_for_adrp. Now the test passes. Fixes: a257e02579e ("arm64/kernel: don't ban ADRP to work around Cortex-A53 erratum #843419") Acked-by: Ard Biesheuvel <[email protected]> Cc: Will Deacon <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Michal Marek <[email protected]> Signed-off-by: Kim Phillips <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2018-04-24arm64: ptrace: remove addr_limit manipulationMark Rutland1-6/+0
We transiently switch to KERNEL_DS in compat_ptrace_gethbpregs() and compat_ptrace_sethbpregs(), but in either case this is pointless as we don't perform any uaccess during this window. let's rip out the redundant addr_limit manipulation. Acked-by: Catalin Marinas <[email protected]> Signed-off-by: Mark Rutland <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2018-04-24ALSA: usb-audio: ADC3: Fix channel mapping conversion for ADC3.Michael Drake1-1/+1
The channel mapping is defined by bChRelationship, not bChPurpose. Fixes: 9a2fe9b801f5 ("ALSA: usb: initial USB Audio Device Class 3.0 support") Reviewed-by: Ruslan Bilovol <[email protected]> Signed-off-by: Michael Drake <[email protected]> Signed-off-by: Jorge Sanjuan <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2018-04-24RISC-V: build vdso-dummy.o with -no-pieAurelien Jarno1-1/+1
Debian toolcahin defaults to PIE, and I guess that will also be the case of most distributions. This causes the following build failure: AS arch/riscv/kernel/vdso/getcpu.o AS arch/riscv/kernel/vdso/flush_icache.o VDSOLD arch/riscv/kernel/vdso/vdso.so.dbg OBJCOPY arch/riscv/kernel/vdso/vdso.so AS arch/riscv/kernel/vdso/vdso.o VDSOLD arch/riscv/kernel/vdso/vdso-dummy.o LD arch/riscv/kernel/vdso/vdso-syms.o riscv64-linux-gnu-ld: attempted static link of dynamic object `arch/riscv/kernel/vdso/vdso-dummy.o' make[2]: *** [arch/riscv/kernel/vdso/Makefile:43: arch/riscv/kernel/vdso/vdso-syms.o] Error 1 make[1]: *** [scripts/Makefile.build:575: arch/riscv/kernel/vdso] Error 2 make: *** [Makefile:1018: arch/riscv/kernel] Error 2 While the root Makefile correctly passes "-fno-PIE" to build individual object files, the RISC-V kernel also builds vdso-dummy.o as an executable, which is therefore linked as PIE. Fix that by updating this specific link rule to also include "-no-pie". Signed-off-by: Aurelien Jarno <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2018-04-24riscv: there is no <asm/handle_irq.h>Christoph Hellwig1-1/+0
So don't list it as generic-y. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2018-04-24riscv: select DMA_DIRECT_OPS instead of redefining itChristoph Hellwig1-3/+1
DMA_DIRECT_OPS is defined in lib/Kconfig, so don't duplicate it in arch/riscv/Kconfig. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Palmer Dabbelt <[email protected]>
2018-04-24sfc: ARFS filter IDsEdward Cree6-46/+337
Associate an arbitrary ID with each ARFS filter, allowing to properly query for expiry. The association is maintained in a hash table, which is protected by a spinlock. v3: fix build warnings when CONFIG_RFS_ACCEL is disabled (thanks lkp-robot). v2: fixed uninitialised variable (thanks davem and lkp-robot). Fixes: 3af0f34290f6 ("sfc: replace asynchronous filter operations") Signed-off-by: Edward Cree <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-04-24net: ethtool: Add missing kernel doc for FEC parametersFlorian Fainelli1-0/+2
While adding support for ethtool::get_fecparam and set_fecparam, kernel doc for these functions was missed, add those. Fixes: 1a5f3da20bd9 ("net: ethtool: add support for forward error correction modes") Signed-off-by: Florian Fainelli <[email protected]> Acked-by: Roopa Prabhu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-04-24packet: fix bitfield update raceWillem de Bruijn2-21/+49
Updates to the bitfields in struct packet_sock are not atomic. Serialize these read-modify-write cycles. Move po->running into a separate variable. Its writes are protected by po->bind_lock (except for one startup case at packet_create). Also replace a textual precondition warning with lockdep annotation. All others are set only in packet_setsockopt. Serialize these updates by holding the socket lock. Analogous to other field updates, also hold the lock when testing whether a ring is active (pg_vec). Fixes: 8dc419447415 ("[PACKET]: Add optional checksum computation for recvmsg") Reported-by: DaeRyong Jeong <[email protected]> Reported-by: Byoungyoung Lee <[email protected]> Signed-off-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-04-24ice: Do not check INTEVENT bit for OICR interruptsBen Shelton2-6/+0
According to the hardware spec, checking the INTEVENT bit isn't a reliable way to detect if an OICR interrupt has occurred. This is because this bit can be cleared by the hardware/firmware before the interrupt service routine has run. So instead, just check for OICR events every time. Fixes: 940b61af02f4 ("ice: Initialize PF and setup miscellaneous interrupt") Signed-off-by: Ben Shelton <[email protected]> Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2018-04-24random: fix possible sleeping allocation from irq contextTheodore Ts'o1-1/+8
We can do a sleeping allocation from an irq context when CONFIG_NUMA is enabled. Fix this by initializing the NUMA crng instances in a workqueue. Reported-by: Tetsuo Handa <[email protected]> Reported-by: [email protected] Fixes: 8ef35c866f8862df ("random: set up the NUMA crng instances...") Cc: [email protected] Signed-off-by: Theodore Ts'o <[email protected]>
2018-04-24ice: Fix incorrect comment for action typeAnirudh Venkataramanan1-1/+1
Action type 5 defines large action generic values. Fix comment to reflect that better. Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2018-04-24ice: Fix initialization for num_nodes_addedAnirudh Venkataramanan1-2/+2
ice_sched_add_nodes_to_layer is used recursively, and so we start with num_nodes_added being 0. This way, in case of an error or if num_nodes is NULL, the function just returns 0 to indicate that no nodes were added. Fixes: 5513b920a4f7 ("ice: Update Tx scheduler tree for VSI multi-Tx queue support") Signed-off-by: Anirudh Venkataramanan <[email protected]> Tested-by: Tony Brelinski <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2018-04-24igb: Fix the transmission mode of queue 0 for Qav modeVinicius Costa Gomes1-1/+16
When Qav mode is enabled, queue 0 should be kept on Stream Reservation mode. From the i210 datasheet, section 8.12.19: "Note: Queue0 QueueMode must be set to 1b when TransmitMode is set to Qav." ("QueueMode 1b" represents the Stream Reservation mode) The solution is to give queue 0 the all the credits it might need, so it has priority over queue 1. A situation where this can happen is when cbs is "installed" only on queue 1, leaving queue 0 alone. For example: $ tc qdisc replace dev enp2s0 handle 100: parent root mqprio num_tc 3 \ map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 0 $ tc qdisc replace dev enp2s0 parent 100:2 cbs locredit -1470 \ hicredit 30 sendslope -980000 idleslope 20000 offload 1 Signed-off-by: Vinicius Costa Gomes <[email protected]> Tested-by: Aaron Brown <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2018-04-24mtd: cfi: cmdset_0002: Do not allow read/write to suspend erase block.Joakim Tjernlund1-3/+6
Currently it is possible to read and/or write to suspend EB's. Writing /dev/mtdX or /dev/mtdblockX from several processes may break the flash state machine. Taken from cfi_cmdset_0001 driver. Signed-off-by: Joakim Tjernlund <[email protected]> Cc: <[email protected]> Reviewed-by: Richard Weinberger <[email protected]> Signed-off-by: Boris Brezillon <[email protected]>
2018-04-24mtd: cfi: cmdset_0001: Workaround Micron Erase suspend bug.Joakim Tjernlund1-0/+17
Some Micron chips does not work well wrt Erase suspend for boot blocks. This avoids the issue by not allowing Erase suspend for the boot blocks for the 28F00AP30(1GBit) chip. Signed-off-by: Joakim Tjernlund <[email protected]> Cc: <[email protected]> Reviewed-by: Richard Weinberger <[email protected]> Signed-off-by: Boris Brezillon <[email protected]>
2018-04-24mtd: cfi: cmdset_0001: Do not allow read/write to suspend erase block.Joakim Tjernlund2-5/+12
Currently it is possible to read and/or write to suspend EB's. Writing /dev/mtdX or /dev/mtdblockX from several processes may break the flash state machine. Signed-off-by: Joakim Tjernlund <[email protected]> Cc: <[email protected]> Reviewed-by: Richard Weinberger <[email protected]> Signed-off-by: Boris Brezillon <[email protected]>
2018-04-24ext4: fix bitmap position validationLukas Czerner1-4/+5
Currently in ext4_valid_block_bitmap() we expect the bitmap to be positioned anywhere between 0 and s_blocksize clusters, but that's wrong because the bitmap can be placed anywhere in the block group. This causes false positives when validating bitmaps on perfectly valid file system layouts. Fix it by checking whether the bitmap is within the group boundary. The problem can be reproduced using the following mkfs -t ext3 -E stride=256 /dev/vdb1 mount /dev/vdb1 /mnt/test cd /mnt/test wget https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.16.3.tar.xz tar xf linux-4.16.3.tar.xz This will result in the warnings in the logs EXT4-fs error (device vdb1): ext4_validate_block_bitmap:399: comm tar: bg 84: block 2774529: invalid block bitmap [ Changed slightly for clarity and to not drop a overflow test -- TYT ] Signed-off-by: Lukas Czerner <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]> Reported-by: Ilya Dryomov <[email protected]> Fixes: 7dac4a1726a9 ("ext4: add validity checks for bitmap block numbers") Cc: [email protected]
2018-04-24ixgbevf: ensure xdp_ring resources are free'd on error exitColin Ian King1-1/+1
The current error handling for failed resource setup for xdp_ring data is a break out of the loop and returning 0 indicated everything was OK, when in fact it is not. Fix this by exiting via the error exit label err_setup_tx that will clean up the resources correctly and return and error status. Detected by CoverityScan, CID#1466879 ("Logically dead code") Fixes: 21092e9ce8b1 ("ixgbevf: Add support for XDP_TX action") Signed-off-by: Colin Ian King <[email protected]> Tested-by: Andrew Bowers <[email protected]> Signed-off-by: Jeff Kirsher <[email protected]>
2018-04-24SMB3: Fix 3.11 encryption to Windows and handle encrypted smb3 tconSteve French3-21/+22
Temporarily disable AES-GCM, as AES-CCM is only currently enabled mechanism on client side. This fixes SMB3.11 encrypted mounts to Windows. Also the tree connect request itself should be encrypted if requested encryption ("seal" on mount), in addition we should be enabling encryption in 3.11 based on whether we got any valid encryption ciphers back in negprot (the corresponding session flag is not set as it is in 3.0 and 3.02) Signed-off-by: Steve French <[email protected]> Reviewed-by: Pavel Shilovsky <[email protected]> Reviewed-by: Ronnie Sahlberg <[email protected]> CC: Stable <[email protected]>
2018-04-24CIFS: set *resp_buf_type to NO_BUFFER on errorSteve French1-1/+4
Dan Carpenter had pointed this out a while ago, but the code around this had changed so wasn't causing any problems since that field was not used in this error path. Still, it is cleaner to always initialize this field, so changing the error path to set it. Reviewed-by: Ronnie Sahlberg <[email protected]> CC: Dan Carpenter <[email protected]> Signed-off-by: Steve French <[email protected]>
2018-04-24team: fix netconsole setup over teamXin Long1-7/+12
The same fix in Commit dbe173079ab5 ("bridge: fix netconsole setup over bridge") is also needed for team driver. While at it, remove the unnecessary parameter *team from team_port_enable_netpoll(). v1->v2: - fix it in a better way, as does bridge. Fixes: 0fb52a27a04a ("team: cleanup netpoll clode") Reported-by: João Avelino Bellomo Filho <[email protected]> Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-04-24ALSA: dice: fix OUI for TC groupTakashi Sakamoto1-1/+1
OUI for TC Electronic is 0x000166, for TC GROUP A/S. 0x001486 is for Echo Digital Audio Corporation. Fixes: 7cafc65b3aa1 ('ALSA: dice: force to add two pcm devices for listed models') Cc: <[email protected]> # v4.6+ Reference: http://standards-oui.ieee.org/oui/oui.txt Signed-off-by: Takashi Sakamoto <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2018-04-24ALSA: usb-audio: Skip broken EU on Dell dock USB-audioTakashi Iwai1-0/+3
The Dell Dock USB-audio device with 0bda:4014 is behaving notoriously bad, and we have already applied some workaround to avoid the firmware hiccup. Yet we still need to skip one thing, the Extension Unit at ID 4, which doesn't react correctly to the mixer ctl access. Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1090658 Cc: <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2018-04-24ALSA: usb-audio: Fix missing endian conversionTakashi Iwai1-2/+2
The UAC2 jack detection support introduced the bmControls checks in a couple of places, but they forgot the endian conversion; the bmControls of UAC2 terminal descriptor is __le16, not a byte like in UAC1. Fixes: 5a222e849452 ("ALSA: usb-audio: UAC2 jack detection") Tested-by: Andrew Chant <[email protected]> Signed-off-by: Takashi Iwai <[email protected]>
2018-04-24ACPI / button: make module loadable when booted in non-ACPI modeArd Biesheuvel1-1/+23
Modules such as nouveau.ko and i915.ko have a link time dependency on acpi_lid_open(), and due to its use of acpi_bus_register_driver(), the button.ko module that provides it is only loadable when booted in ACPI mode. However, the ACPI button driver can be built into the core kernel as well, in which case the dependency can always be satisfied, and the dependent modules can be loaded regardless of whether the system was booted in ACPI mode or not. So let's fix this asymmetry by making the ACPI button driver loadable as a module even if not booted in ACPI mode, so it can provide the acpi_lid_open() symbol in the same way as when built into the kernel. Signed-off-by: Ard Biesheuvel <[email protected]> [ rjw: Minor adjustments of comments, whitespace and names. ] Signed-off-by: Rafael J. Wysocki <[email protected]>
2018-04-24drm/amdkfd: fix build, select MMU_NOTIFIERRandy Dunlap1-0/+1
When CONFIG_MMU_NOTIFIER is not enabled, struct mmu_notifier has an incomplete type definition, which causes build errors. ../drivers/gpu/drm/amd/amdkfd/kfd_priv.h:607:22: error: field 'mmu_notifier' has incomplete type ../include/linux/kernel.h:979:32: error: dereferencing pointer to incomplete type ../include/linux/kernel.h:980:18: error: dereferencing pointer to incomplete type ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:434:2: error: implicit declaration of function 'mmu_notifier_unregister_no_release' [-Werror=implicit-function-declaration] ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:435:2: error: implicit declaration of function 'mmu_notifier_call_srcu' [-Werror=implicit-function-declaration] ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:438:21: error: variable 'kfd_process_mmu_notifier_ops' has initializer but incomplete type ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:439:2: error: unknown field 'release' specified in initializer ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:439:2: warning: excess elements in struct initializer [enabled by default] ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:439:2: warning: (near initialization for 'kfd_process_mmu_notifier_ops') [enabled by default] ../drivers/gpu/drm/amd/amdkfd/kfd_process.c:534:2: error: implicit declaration of function 'mmu_notifier_register' [-Werror=implicit-function-declaration] Signed-off-by: Randy Dunlap <[email protected]> Tested-by: Anders Roxell <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2018-04-24cpufreq: brcmstb-avs-cpufreq: remove development debug supportMarkus Mayer2-332/+1
This debug code was helpful while developing the driver, but it isn't being used for anything anymore. Signed-off-by: Markus Mayer <[email protected]> Acked-by: Viresh Kumar <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2018-04-24drm/amdkfd: fix clock counter retrieval for node without GPUAndres Rodriguez1-6/+7
Currently if a user requests clock counters for a node without a GPU resource we will always return EINVAL. Instead if no GPU resource is attached, fill the gpu_clock_counter argument with zeroes so that we may proceed and return valid CPU counters. Signed-off-by: Andres Rodriguez <[email protected]> Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2018-04-24drm/amdkfd: Fix the error return code in kfd_ioctl_unmap_memory_from_gpu()Wei Yongjun1-1/+1
Passing NULL pointer to PTR_ERR will result in return value of 0 indicating success which is clearly not what it is intended here. This patch returns -EINVAL instead. v2: change ret code to -ENODEV Fixes: 5ec7e02854b3 ("drm/amdkfd: Add ioctls for GPUVM memory management") Signed-off-by: Wei Yongjun <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2018-04-24ACPI / watchdog: Prefer iTCO_wdt on Lenovo Z50-70Mika Westerberg1-10/+49
WDAT table on Lenovo Z50-70 is using RTC SRAM (ports 0x70 and 0x71) to store state of the timer. This conflicts with Linux RTC driver (rtc-cmos.c) who fails to reserve those ports for itself preventing RTC from functioning. In addition the WDAT table seems not to be fully functional because it does not reset the system when the watchdog times out. On this system iTCO_wdt works just fine so we simply prefer to use it instead of WDAT. This makes RTC working again and also results working watchdog via iTCO_wdt. Reported-by: Peter Milley <[email protected]> Link: https://bugzilla.kernel.org/show_bug.cgi?id=199033 Signed-off-by: Mika Westerberg <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2018-04-24drm/amdkfd: kfd_dev_is_large_bar() can be statickbuild test robot1-1/+1
Fixes: 5ec7e02854b3 ("drm/amdkfd: Add ioctls for GPUVM memory management") Signed-off-by: Fengguang Wu <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2018-04-24libceph: reschedule a tick in finish_hunting()Ilya Dryomov1-0/+1
If we go without an established session for a while, backoff delay will climb to 30 seconds. The keepalive timeout is also 30 seconds, so it's pretty easily hit after a prolonged hunting for a monitor: we don't get a chance to send out a keepalive in time, which means we never get back a keepalive ack in time, cutting an established session and attempting to connect to a different monitor every 30 seconds: [Sun Apr 1 23:37:05 2018] libceph: mon0 10.80.20.99:6789 session established [Sun Apr 1 23:37:36 2018] libceph: mon0 10.80.20.99:6789 session lost, hunting for new mon [Sun Apr 1 23:37:36 2018] libceph: mon2 10.80.20.103:6789 session established [Sun Apr 1 23:38:07 2018] libceph: mon2 10.80.20.103:6789 session lost, hunting for new mon [Sun Apr 1 23:38:07 2018] libceph: mon1 10.80.20.100:6789 session established [Sun Apr 1 23:38:37 2018] libceph: mon1 10.80.20.100:6789 session lost, hunting for new mon [Sun Apr 1 23:38:37 2018] libceph: mon2 10.80.20.103:6789 session established [Sun Apr 1 23:39:08 2018] libceph: mon2 10.80.20.103:6789 session lost, hunting for new mon The regular keepalive interval is 10 seconds. After ->hunting is cleared in finish_hunting(), call __schedule_delayed() to ensure we send out a keepalive after 10 seconds. Cc: [email protected] # 4.7+ Link: http://tracker.ceph.com/issues/23537 Signed-off-by: Ilya Dryomov <[email protected]> Reviewed-by: Jason Dillaman <[email protected]>
2018-04-24libceph: un-backoff on tick when we have a authenticated sessionIlya Dryomov1-3/+10
This means that if we do some backoff, then authenticate, and are healthy for an extended period of time, a subsequent failure won't leave us starting our hunting sequence with a large backoff. Mirrors ceph.git commit d466bc6e66abba9b464b0b69687cf45c9dccf383. Cc: [email protected] # 4.7+ Signed-off-by: Ilya Dryomov <[email protected]> Reviewed-by: Jason Dillaman <[email protected]>
2018-04-24arm64: mm: drop addr parameter from sync icache and dcacheShaokun Zhang2-3/+3
The addr parameter isn't used for anything. Let's simplify and get rid of it, like arm. Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Shaokun Zhang <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2018-04-24x86/microcode: Do not exit early from __reload_late()Borislav Petkov1-4/+2
Vitezslav reported a case where the "Timeout during microcode update!" panic would hit. After a deeper look, it turned out that his .config had CONFIG_HOTPLUG_CPU disabled which practically made save_mc_for_early() a no-op. When that happened, the discovered microcode patch wasn't saved into the cache and the late loading path wouldn't find any. This, then, lead to early exit from __reload_late() and thus CPUs waiting until the timeout is reached, leading to the panic. In hindsight, that function should have been written so it does not return before the post-synchronization. Oh well, I know better now... Fixes: bb8c13d61a62 ("x86/microcode: Fix CPU synchronization routine") Reported-by: Vitezslav Samel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Vitezslav Samel <[email protected]> Tested-by: Ashok Raj <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-04-24x86/microcode/intel: Save microcode patch unconditionallyBorislav Petkov1-2/+0
save_mc_for_early() was a no-op on !CONFIG_HOTPLUG_CPU but the generic_load_microcode() path saves the microcode patches it has found into the cache of patches which is used for late loading too. Regardless of whether CPU hotplug is used or not. Make the saving unconditional so that late loading can find the proper patch. Reported-by: Vitezslav Samel <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Vitezslav Samel <[email protected]> Tested-by: Ashok Raj <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected]
2018-04-24powerpc/mce: Fix a bug where mce loops on memory UE.Mahesh Salgaonkar1-5/+2
The current code extracts the physical address for UE errors and then hooks it up into memory failure infrastructure. On successful extraction of physical address it wrongly sets "handled = 1" which means this UE error has been recovered. Since MCE handler gets return value as handled = 1, it assumes that error has been recovered and goes back to same NIP. This causes MCE interrupt again and again in a loop leading to hard lockup. Also, initialize phys_addr to ULONG_MAX so that we don't end up queuing undesired page to hwpoison. Without this patch we see: Severe Machine check interrupt [Recovered] NIP: [000000001002588c] PID: 7109 Comm: find Initiator: CPU Error type: UE [Load/Store] Effective address: 00007fffd2755940 Physical address: 000020181a080000 ... Severe Machine check interrupt [Recovered] NIP: [000000001002588c] PID: 7109 Comm: find Initiator: CPU Error type: UE [Load/Store] Effective address: 00007fffd2755940 Physical address: 000020181a080000 Severe Machine check interrupt [Recovered] NIP: [000000001002588c] PID: 7109 Comm: find Initiator: CPU Error type: UE [Load/Store] Effective address: 00007fffd2755940 Physical address: 000020181a080000 Memory failure: 0x20181a08: recovery action for dirty LRU page: Recovered Memory failure: 0x20181a08: already hardware poisoned Memory failure: 0x20181a08: already hardware poisoned Memory failure: 0x20181a08: already hardware poisoned Memory failure: 0x20181a08: already hardware poisoned Memory failure: 0x20181a08: already hardware poisoned Memory failure: 0x20181a08: already hardware poisoned ... Watchdog CPU:38 Hard LOCKUP After this patch we see: Severe Machine check interrupt [Not recovered] NIP: [00007fffaae585f4] PID: 7168 Comm: find Initiator: CPU Error type: UE [Load/Store] Effective address: 00007fffaafe28ac Physical address: 00002017c0bd0000 find[7168]: unhandled signal 7 at 00007fffaae585f4 nip 00007fffaae585f4 lr 00007fffaae585e0 code 4 Memory failure: 0x2017c0bd: recovery action for dirty LRU page: Recovered Fixes: 01eaac2b0591 ("powerpc/mce: Hookup ierror (instruction) UE errors") Fixes: ba41e1e1ccb9 ("powerpc/mce: Hookup derror (load/store) UE errors") Cc: [email protected] # v4.15+ Signed-off-by: Mahesh Salgaonkar <[email protected]> Signed-off-by: Balbir Singh <[email protected]> Reviewed-by: Balbir Singh <[email protected]> Signed-off-by: Michael Ellerman <[email protected]>
2018-04-23Merge branch 'amd-xgbe-fixes'David S. Miller7-23/+233
aTom Lendacky says: ==================== amd-xgbe: AMD XGBE driver fixes 2018-04-23 This patch series addresses some issues in the AMD XGBE driver. The following fixes are included in this driver update series: - Improve KR auto-negotiation and training (2 patches) - Add pre and post auto-negotiation hooks - Use the pre and post auto-negotiation hooks to disable CDR tracking during auto-negotiation page exchange in KR mode - Check for SFP tranceiver signal support and only use the signal if the SFP indicates that it is supported This patch series is based on net. ==================== Signed-off-by: David S. Miller <[email protected]>
2018-04-23amd-xgbe: Only use the SFP supported transceiver signalsTom Lendacky1-17/+54
The SFP eeprom indicates the transceiver signals (Rx LOS, Tx Fault, etc.) that it supports. Update the driver to include checking the eeprom data when deciding whether to use a transceiver signal. Signed-off-by: Tom Lendacky <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-04-23amd-xgbe: Improve KR auto-negotiation and trainingTom Lendacky7-4/+160
Update xgbe-phy-v2.c to make use of the auto-negotiation (AN) phy hooks to improve the ability to successfully complete Clause 73 AN when running at 10gbps. Hardware can sometimes have issues with CDR lock when the AN DME page exchange is being performed. The AN and KR training hooks are used as follows: - The pre AN hook is used to disable CDR tracking in the PHY so that the DME page exchange can be successfully and consistently completed. - The post KR training hook is used to re-enable the CDR tracking so that KR training can successfully complete. - The post AN hook is used to check for an unsuccessful AN which will increase a CDR tracking enablement delay (up to a maximum value). Add two debugfs entries to allow control over use of the CDR tracking workaround. The debugfs entries allow the CDR tracking workaround to be disabled and determine whether to re-enable CDR tracking before or after link training has been initiated. Also, with these changes the receiver reset cycle that is performed during the link status check can be performed less often. Signed-off-by: Tom Lendacky <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-04-23amd-xgbe: Add pre/post auto-negotiation phy hooksTom Lendacky2-2/+19
Add hooks to the driver auto-negotiation (AN) flow to allow the different phy implementations to perform any steps necessary to improve AN. Signed-off-by: Tom Lendacky <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-04-23pppoe: check sockaddr length in pppoe_connect()Guillaume Nault1-0/+4
We must validate sockaddr_len, otherwise userspace can pass fewer data than we expect and we end up accessing invalid data. Fixes: 224cf5ad14c0 ("ppp: Move the PPP drivers") Reported-by: [email protected] Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-04-23l2tp: check sockaddr length in pppol2tp_connect()Guillaume Nault1-0/+7
Check sockaddr_len before dereferencing sp->sa_protocol, to ensure that it actually points to valid data. Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Reported-by: [email protected] Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-04-23net: phy: marvell: clear wol event before setting itJingju Hou1-0/+9
If WOL event happened once, the LED[2] interrupt pin will not be cleared unless we read the CSISR register. If interrupts are in use, the normal interrupt handling will clear the WOL event. Let's clear the WOL event before enabling it if !phy_interrupt_is_valid(). Signed-off-by: Jingju Hou <[email protected]> Signed-off-by: Jisheng Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]>