aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-05-05Merge tag 'powerpc-4.12-1' of ↵Linus Torvalds212-2790/+8698
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Highlights include: - Larger virtual address space on 64-bit server CPUs. By default we use a 128TB virtual address space, but a process can request access to the full 512TB by passing a hint to mmap(). - Support for the new Power9 "XIVE" interrupt controller. - TLB flushing optimisations for the radix MMU on Power9. - Support for CAPI cards on Power9, using the "Coherent Accelerator Interface Architecture 2.0". - The ability to configure the mmap randomisation limits at build and runtime. - Several small fixes and cleanups to the kprobes code, as well as support for KPROBES_ON_FTRACE. - Major improvements to handling of system reset interrupts, correctly treating them as NMIs, giving them a dedicated stack and using a new hypervisor call to trigger them, all of which should aid debugging and robustness. - Many fixes and other minor enhancements. Thanks to: Alastair D'Silva, Alexey Kardashevskiy, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anshuman Khandual, Anton Blanchard, Balbir Singh, Ben Hutchings, Benjamin Herrenschmidt, Bhupesh Sharma, Chris Packham, Christian Zigotzky, Christophe Leroy, Christophe Lombard, Daniel Axtens, David Gibson, Gautham R. Shenoy, Gavin Shan, Geert Uytterhoeven, Guilherme G. Piccoli, Hamish Martin, Hari Bathini, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Mahesh J Salgaonkar, Mahesh Salgaonkar, Masami Hiramatsu, Matt Brown, Matthew R. Ochs, Michael Neuling, Naveen N. Rao, Nicholas Piggin, Oliver O'Halloran, Pan Xinhui, Paul Mackerras, Rashmica Gupta, Russell Currey, Sukadev Bhattiprolu, Thadeu Lima de Souza Cascardo, Tobin C. Harding, Tyrel Datwyler, Uma Krishnan, Vaibhav Jain, Vipin K Parashar, Yang Shi" * tag 'powerpc-4.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (214 commits) powerpc/64s: Power9 has no LPCR[VRMASD] field so don't set it powerpc/powernv: Fix TCE kill on NVLink2 powerpc/mm/radix: Drop support for CPUs without lockless tlbie powerpc/book3s/mce: Move add_taint() later in virtual mode powerpc/sysfs: Move #ifdef CONFIG_HOTPLUG_CPU out of the function body powerpc/smp: Document irq enable/disable after migrating IRQs powerpc/mpc52xx: Don't select user-visible RTAS_PROC powerpc/powernv: Document cxl dependency on special case in pnv_eeh_reset() powerpc/eeh: Clean up and document event handling functions powerpc/eeh: Avoid use after free in eeh_handle_special_event() cxl: Mask slice error interrupts after first occurrence cxl: Route eeh events to all drivers in cxl_pci_error_detected() cxl: Force context lock during EEH flow powerpc/64: Allow CONFIG_RELOCATABLE if COMPILE_TEST powerpc/xmon: Teach xmon oops about radix vectors powerpc/mm/hash: Fix off-by-one in comment about kernel contexts ids powerpc/pseries: Enable VFIO powerpc/powernv: Fix iommu table size calculation hook for small tables powerpc/powernv: Check kzalloc() return value in pnv_pci_table_alloc powerpc: Add arch/powerpc/tools directory ...
2017-05-05Merge branch 'for-linus' of ↵Linus Torvalds11-21/+4
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace updates from Eric Biederman: "This is a set of small fixes that were mostly stumbled over during more significant development. This proc fix and the fix to posix-timers are the most significant of the lot. There is a lot of good development going on but unfortunately it didn't quite make the merge window" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: proc: Fix unbalanced hard link numbers signal: Make kill_proc_info static rlimit: Properly call security_task_setrlimit signal: Remove unused definition of sig_user_definied ia64: Remove unused IA64_TASK_SIGHAND_OFFSET and IA64_SIGHAND_SIGLOCK_OFFSET ipc: Remove unused declaration of recompute_msgmni posix-timers: Correct sanity check in posix_cpu_nsleep sysctl: Remove dead register_sysctl_root
2017-05-05nfs: use kmap/kunmap directlyFabian Frederick1-55/+12
This patch removes useless nfs_readdir_get_array() and nfs_readdir_release_array() as suggested by Trond Myklebust nfs_readdir() calls nfs_revalidate_mapping() before readdir_search_pagecache() , nfs_do_filldir(), uncached_readdir() so mapping should be correct. While kmap() can't fail, all subsequent error checks were removed as well as unused labels. Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2017-05-05NFS: always treat the invocation of nfs_getattr as cache hit when noac is onHou Tao1-1/+4
When using 'ls -l' to display a large directory, if noac option is used, in function nfs_getattr() nfs_need_revalidate_inode() will always be true for NFSv3 and the nfs_entry cache of the directory will be flushed. The flush will lead to a fully reread of the directory entries from server. To prevent the unnecessary RPCs, we need to check whether or not the noac option is used, and always report the invocation of nfs_getattr() as cache hit instead cache miss when it's on. Signed-off-by: Hou Tao <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2017-05-05Fix nfs_client refcounting if kmalloc fails in nfs4_proc_exchange_id and ↵Dave Wysochanski1-2/+6
nfs4_proc_async_renew If memory allocation fails for the callback data, we need to put the nfs_client or we end up with an elevated refcount. Signed-off-by: Dave Wysochanski <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2017-05-05NFSv4.1: RECLAIM_COMPLETE must handle NFS4ERR_CONN_NOT_BOUND_TO_SESSIONTrond Myklebust2-4/+13
If the server returns NFS4ERR_CONN_NOT_BOUND_TO_SESSION because we are trunking, then RECLAIM_COMPLETE must handle that by calling nfs4_schedule_session_recovery() and then retrying. Reported-by: Chuck Lever <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Tested-by: Chuck Lever <[email protected]>
2017-05-05tcp: randomize timestamps on syncookiesEric Dumazet8-52/+88
Whole point of randomization was to hide server uptime, but an attacker can simply start a syn flood and TCP generates 'old style' timestamps, directly revealing server jiffies value. Also, TSval sent by the server to a particular remote address vary depending on syncookies being sent or not, potentially triggering PAWS drops for innocent clients. Lets implement proper randomization, including for SYNcookies. Also we do not need to export sysctl_tcp_timestamps, since it is not used from a module. In v2, I added Florian feedback and contribution, adding tsoff to tcp_get_cookie_sock(). v3 removed one unused variable in tcp_v4_connect() as Florian spotted. Fixes: 95a22caee396c ("tcp: randomize tcp timestamp offsets for each connection") Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Florian Westphal <[email protected]> Tested-by: Florian Westphal <[email protected]> Cc: Yuchung Cheng <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-05-05fbdev: sti: don't select CONFIG_VTArnd Bergmann2-2/+1
While working on another build error, I ran into several variations of this dependency loop: subsection "Kconfig recursive dependency limitations" drivers/input/Kconfig:8: symbol INPUT is selected by VT For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/tty/Kconfig:12: symbol VT is selected by FB_STI For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/video/fbdev/Kconfig:677: symbol FB_STI depends on FB For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/video/fbdev/Kconfig:5: symbol FB is selected by DRM_KMS_FB_HELPER For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/gpu/drm/Kconfig:72: symbol DRM_KMS_FB_HELPER is selected by DRM_KMS_CMA_HELPER For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/gpu/drm/Kconfig:137: symbol DRM_KMS_CMA_HELPER is selected by DRM_HDLCD For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/gpu/drm/arm/Kconfig:6: symbol DRM_HDLCD depends on OF For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/of/Kconfig:4: symbol OF is selected by X86_INTEL_CE For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" arch/x86/Kconfig:523: symbol X86_INTEL_CE depends on X86_IO_APIC For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" arch/x86/Kconfig:1011: symbol X86_IO_APIC depends on X86_LOCAL_APIC For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" arch/x86/Kconfig:1005: symbol X86_LOCAL_APIC depends on X86_UP_APIC For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" arch/x86/Kconfig:980: symbol X86_UP_APIC depends on PCI_MSI For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/pci/Kconfig:11: symbol PCI_MSI is selected by AMD_IOMMU For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/iommu/Kconfig:106: symbol AMD_IOMMU depends on IOMMU_SUPPORT For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/iommu/Kconfig:5: symbol IOMMU_SUPPORT is selected by DRM_ETNAVIV For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/gpu/drm/etnaviv/Kconfig:2: symbol DRM_ETNAVIV depends on THERMAL For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/thermal/Kconfig:5: symbol THERMAL is selected by ACPI_VIDEO For a resolution refer to Documentation/kbuild/kconfig-language.txt subsection "Kconfig recursive dependency limitations" drivers/acpi/Kconfig:183: symbol ACPI_VIDEO is selected by INPUT This doesn't currently show up as I fixed the 'THERMAL' part of it, but I noticed that the FB_STI dependency should not be there but was introduced by slightly incorrect bug-fix patch that tried to fix a link error. Instead of selecting 'VT' to make us enter the drivers/video/console directory at compile-time, it's sufficient to build the drivers/video/console/sticore.c file by adding its directory to when CONFIG_FB_STI is enabled. Alternatively, we could move the sticore code to another directory that is always built when we have at STI_CONSOLE or FB_STI enabled. Fixes: 17085a934592 ("parisc: stifb: should depend on STI_CONSOLE") Signed-off-by: Arnd Bergmann <[email protected]> Cc: Helge Deller <[email protected]> Cc: "James E.J. Bottomley" <[email protected]> Cc: Alexander Beregalov <[email protected]> Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
2017-05-05bridge: netlink: account for IFLA_BRPORT_{B, M}CAST_FLOOD size and policyTobias Klauser1-0/+4
The attribute sizes for IFLA_BRPORT_MCAST_FLOOD and IFLA_BRPORT_BCAST_FLOOD weren't accounted for in br_port_info_size() when they were added. Do so now and also add the corresponding policy entries: Cc: Nikolay Aleksandrov <[email protected]> Cc: Mike Manning <[email protected]> Fixes: b6cb5ac8331b ("net: bridge: add per-port multicast flood flag") Fixes: 99f906e9ad7b ("bridge: add per-port broadcast flood flag") Signed-off-by: Tobias Klauser <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-05-05CIFS: add misssing SFM mapping for doublequoteBjörn Jacke2-0/+7
SFM is mapping doublequote to 0xF020 Without this patch creating files with doublequote fails to Windows/Mac Signed-off-by: Bjoern Jacke <[email protected]> Signed-off-by: Steve French <[email protected]> CC: stable <[email protected]>
2017-05-05Merge branch 'backup-thermal-shutdown' into nextZhang Rui3-1/+101
2017-05-05Merge branches 'thermal-core' and 'thermal-intel' into nextZhang Rui1-3/+6
2017-05-05arm64: Fix the DMA mmap and get_sgtable API with DMA_ATTR_FORCE_CONTIGUOUSCatalin Marinas1-16/+49
While honouring the DMA_ATTR_FORCE_CONTIGUOUS on arm64 (commit 44176bb38fa4: "arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU"), the existing uses of dma_mmap_attrs() and dma_get_sgtable() have been broken by passing a physically contiguous vm_struct with an invalid pages pointer through the common iommu API. Since the coherent allocation with DMA_ATTR_FORCE_CONTIGUOUS uses CMA, this patch simply reuses the existing swiotlb logic for mmap and get_sgtable. Note that the current implementation of get_sgtable (both swiotlb and iommu) is broken if dma_declare_coherent_memory() is used since such memory does not have a corresponding struct page. To be addressed in a subsequent patch. Fixes: 44176bb38fa4 ("arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU") Reported-by: Andrzej Hajda <[email protected]> Cc: Geert Uytterhoeven <[email protected]> Acked-by: Robin Murphy <[email protected]> Tested-by: Andrzej Hajda <[email protected]> Reviewed-by: Andrzej Hajda <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2017-05-05befs: make export work with cold dcacheFabian Frederick1-0/+15
based on commit b3b42c0deaa1 ("fs/affs: make export work with cold dcache") This adds get_parent function so that nfs client can still work after cache drop (Tested on NFS v4 with echo 3 > /proc/sys/vm/drop_caches) Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: Luis de Bethencourt <[email protected]>
2017-05-05ovl: update documentation w.r.t. constant inode numbersAmir Goldstein1-1/+8
Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2017-05-05ovl: persistent inode numbers for upper hardlinksAmir Goldstein1-0/+3
An upper type non directory dentry that is a copy up target should have a reference to its lower copy up origin. There are three ways for an upper type dentry to be instantiated: 1. A lower type dentry that is being copied up 2. An entry that is found in upper dir by ovl_lookup() 3. A negative dentry is hardlinked to an upper type dentry In the first case, the lower reference is set before copy up. In the second case, the lower reference is found by ovl_lookup(). In the last case of hardlinked upper dentry, it is not easy to update the lower reference of the negative dentry. Instead, drop the newly hardlinked negative dentry from dcache and let the next access call ovl_lookup() to find its lower reference. This makes sure that the inode number reported by stat(2) after the hardlink is created is the same inode number that will be reported by stat(2) after mount cycle, which is the inode number of the lower copy up origin of the hardlink source. NOTE that this does not fix breaking of lower hardlinks on copy up, but only fixes the case of lower nlink == 1, whose upper copy up inode is hardlinked in upper dir. Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2017-05-05ovl: merge getattr for dir and nondirMiklos Szeredi3-64/+30
Signed-off-by: Miklos Szeredi <[email protected]>
2017-05-05ovl: constant st_ino/st_dev across copy upAmir Goldstein1-1/+38
When all layers are on the same underlying filesystem, let stat(2) return st_dev/st_ino values of the copy up origin inode if it is known. This results in constant st_ino/st_dev representation of files in an overlay mount before and after copy up. When the underlying filesystem support NFS exportfs, the result is also persistent st_ino/st_dev representation before and after mount cycle. Lower hardlinks are broken on copy up to different upper files, so we cannot use the lower origin st_ino for those different files, even for the same fs case. When all overlay layers are on the same fs, use overlay st_dev for non-dirs to get the correct result from du -x. Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2017-05-05ovl: persistent inode number for directoriesAmir Goldstein1-4/+33
stat(2) on overlay directories reports the overlay temp inode number, which is constant across copy up, but is not persistent. When all layers are on the same fs, report the copy up origin inode number for directories. This inode number is persistent, unique across the overlay mount and constant across copy up. Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2017-05-05ovl: set the ORIGIN type flagAmir Goldstein2-4/+8
For directory entries, non zero oe->numlower implies OVL_TYPE_MERGE. Define a new type flag OVL_TYPE_ORIGIN to indicate that an entry holds a reference to its lower copy up origin. For directory entries ORIGIN := MERGE && UPPER. For non-dir entries ORIGIN means that a lower type dentry has been recently copied up or that we were able to find the copy up origin from overlay.origin xattr. Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2017-05-05ovl: lookup non-dir copy-up-origin by file handleAmir Goldstein1-0/+132
If overlay.origin xattr is found on a non-dir upper inode try to get lower dentry by calling exportfs_decode_fh(). On failure to lookup by file handle to lower layer, do not lookup the copy up origin by name, because the lower found by name could be another file in case the upper file was renamed. Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2017-05-05ovl: use an auxiliary var for overlay root entryAmir Goldstein1-5/+4
Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2017-05-05ovl: store file handle of lower inode on copy upAmir Goldstein2-0/+118
Sometimes it is interesting to know if an upper file is pure upper or a copy up target, and if it is a copy up target, it may be interesting to find the copy up origin. This will be used to preserve lower inode numbers across copy up. Store the lower inode file handle in upper inode extended attribute overlay.origin on copy up to use it later for these cases. Store the lower filesystem uuid along side the file handle, so we can validate that we are looking for the origin file in the original fs. If lower fs does not support NFS export ops store a zero sized xattr so we can always use the overlay.origin xattr to distinguish between a copy up and a pure upper inode. Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2017-05-05ovl: check if all layers are on the same fsAmir Goldstein4-0/+18
Some features can only work when all layers are on the same fs. Test this condition during mount time, so features can check them later. Add helper ovl_same_sb() to return the common super block in case all layers are on the same fs. Signed-off-by: Amir Goldstein <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]>
2017-05-05xen/x86: Do not call xen_init_time_ops() until shared_info is initializedBoris Ostrovsky2-3/+8
Routines that are set by xen_init_time_ops() use shared_info's pvclock_vcpu_time_info area. This area is not properly available until shared_info is mapped in xen_setup_shared_info(). This became especially problematic due to commit dd759d93f4dd ("x86/timers: Add simple udelay calibration") where we end up reading tsc_to_system_mul from xen_dummy_shared_info (i.e. getting zero value) and then trying to divide by it in pvclock_tsc_khz(). Signed-off-by: Boris Ostrovsky <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2017-05-05x86/xen: fix xsave capability settingJuergen Gross1-23/+9
Commit 690b7f10b4f9f ("x86/xen: use capabilities instead of fake cpuid values for xsave") introduced a regression as it tried to make use of the fixup feature before it being available. Fall back to the old variant testing via cpuid(). Signed-off-by: Juergen Gross <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
2017-05-05kvm: nVMX: Don't validate disabled secondary controlsJim Mattson1-3/+4
According to the SDM, if the "activate secondary controls" primary processor-based VM-execution control is 0, no checks are performed on the secondary processor-based VM-execution controls. Signed-off-by: Jim Mattson <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2017-05-05thermal: core: Add a back up thermal shutdown mechanismKeerthy3-0/+91
orderly_poweroff is triggered when a graceful shutdown of system is desired. This may be used in many critical states of the kernel such as when subsystems detects conditions such as critical temperature conditions. However, in certain conditions in system boot up sequences like those in the middle of driver probes being initiated, userspace will be unable to power off the system in a clean manner and leaves the system in a critical state. In cases like these, the /sbin/poweroff will return success (having forked off to attempt powering off the system. However, the system overall will fail to completely poweroff (since other modules will be probed) and the system is still functional with no userspace (since that would have shut itself off). However, there is no clean way of detecting such failure of userspace powering off the system. In such scenarios, it is necessary for a backup workqueue to be able to force a shutdown of the system when orderly shutdown is not successful after a configurable time period. Reported-by: Nishanth Menon <[email protected]> Signed-off-by: Keerthy <[email protected]> Acked-by: Eduardo Valentin <[email protected]> Signed-off-by: Zhang Rui <[email protected]>
2017-05-05thermal: core: Allow orderly_poweroff to be called only onceKeerthy1-1/+10
thermal_zone_device_check --> thermal_zone_device_update --> handle_thermal_trip --> handle_critical_trips --> orderly_poweroff The above sequence happens every 250/500 mS based on the configuration. The orderly_poweroff function is getting called every 250/500 mS. With a full fledged file system it takes at least 5-10 Seconds to power off gracefully. In that period due to the thermal_zone_device_check triggering periodically the thermal work queues bombard with orderly_poweroff calls multiple times eventually leading to failures in gracefully powering off the system. Make sure that orderly_poweroff is called only once. Signed-off-by: Keerthy <[email protected]> Acked-by: Eduardo Valentin <[email protected]> Signed-off-by: Zhang Rui <[email protected]>
2017-05-05Thermal: Intel SoC DTS: Change interrupt request behaviorBrian Bian1-3/+6
The interrupt request call in Intel SoC DTS driver may fail if there is no underlying BIOS support. However, the user space thermal daemon can still use the thermal zones created by the SoC DTS driver in polling mode, therefore, instead of bailing out on interrupt request failures, it is better just to log a warning message and continue the init process. Signed-off-by: Brian Bian <[email protected]> Signed-off-by: Zhang Rui <[email protected]>
2017-05-05trace: thermal: add another parameter 'power' to the tracing functionLukasz Luba2-5/+8
This patch adds another parameter to the trace function: trace_thermal_power_devfreq_get_power(). In case when we call directly driver's code for the real power, we do not have static/dynamic_power values. Instead we get total power in the '*power' value. The 'static_power' and 'dynamic_power' are set to 0. Therefore, we have to trace that '*power' value in this scenario. CC: Steven Rostedt <[email protected]> CC: Ingo Molnar <[email protected]> CC: Zhang Rui <[email protected]> CC: Eduardo Valentin <[email protected]> Acked-by: Javi Merino <[email protected]> Signed-off-by: Lukasz Luba <[email protected]>
2017-05-05thermal: devfreq_cooling: add new interface for direct power readLukasz Luba2-23/+101
This patch introduces a new interface for device drivers connected to devfreq_cooling in the thermal framework: get_real_power(). Some devices have more sophisticated methods (like power counters) to approximate the actual power that they use. In the previous implementation we had a pre-calculated power table which was then scaled by 'utilization' ('busy_time' and 'total_time' taken from devfreq 'last_status'). With this new interface the driver can provide more precise data regarding actual power to the thermal governor every time the power budget is calculated. We then use this value and calculate the real resource utilization scaling factor. Reviewed-by: Chris Diamand <[email protected]> Acked-by: Javi Merino <[email protected]> Signed-off-by: Lukasz Luba <[email protected]>
2017-05-05thermal: devfreq_cooling: refactor code and add get_voltage functionLukasz Luba1-17/+28
Move the code which gets the voltage for a given frequency. This code will be resused in few places. Acked-by: Javi Merino <[email protected]> Signed-off-by: Lukasz Luba <[email protected]>
2017-05-05initramfs: Always do fput() and load modules after rootfs populateStafford Horne1-6/+9
In OpenRISC we do not have a bootloader passed initrd, but the built in initramfs does contain the /init and other binaries, including modules. The previous commit 08865514805d2 ("initramfs: finish fput() before accessing any binary from initramfs") made a change to only call fput() if the bootloader initrd was available, this caused intermittent crashes for OpenRISC. This patch changes the fput() to happen unconditionally if any rootfs is loaded. Also, I added some comments to make it a bit more clear why we call unpack_to_rootfs() multiple times. Fixes: 08865514805d2 ("initramfs: finish fput() before accessing any binary from initramfs") Cc: [email protected] Cc: Lokesh Vutla <[email protected]> Cc: Al Viro <[email protected]> Acked-by: Al Viro <[email protected]> Signed-off-by: Stafford Horne <[email protected]>
2017-05-04Merge branch 'for-4.12/dax' into libnvdimm-for-nextDan Williams110-2054/+4759
2017-05-05x86/mm/kaslr: Use the _ASM_MUL macro for multiplication to work around Clang ↵Matthias Kaehlcke2-1/+3
incompatibility The constraint "rm" allows the compiler to put mix_const into memory. When the input operand is a memory location then MUL needs an operand size suffix, since Clang can't infer the multiplication width from the operand. Add and use the _ASM_MUL macro which determines the operand size and resolves to the NUL instruction with the corresponding suffix. This fixes the following error when building with clang: CC arch/x86/lib/kaslr.o /tmp/kaslr-dfe1ad.s: Assembler messages: /tmp/kaslr-dfe1ad.s:182: Error: no instruction mnemonic suffix given and no register operands; can't size instruction Signed-off-by: Matthias Kaehlcke <[email protected]> Cc: Grant Grundler <[email protected]> Cc: Greg Hackmann <[email protected]> Cc: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Michael Davidson <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-05-05powerpc/64e: Don't place the stack beyond TASK_SIZEScott Wood1-0/+5
Commit f4ea6dcb08ea ("powerpc/mm: Enable mappings above 128TB") increased the task size on book3s, and introduced a mechanism to dynamically control whether a task uses these larger addresses. While the change to the task size itself was ifdef-protected to only apply on book3s, the change to STACK_TOP_USER64 was not. On book3e, this had the effect of trying to use addresses up to 128TiB for the stack despite a 64TiB task size limit -- which broke 64-bit userspace producing the following errors: Starting init: /sbin/init exists but couldn't execute it (error -14) Starting init: /bin/sh exists but couldn't execute it (error -14) Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance. Fixes: f4ea6dcb08ea ("powerpc/mm: Enable mappings above 128TB") Cc: Aneesh Kumar K.V <[email protected]> Signed-off-by: Scott Wood <[email protected]>
2017-05-05x86/mm: Fix boot crash caused by incorrect loop count calculation in ↵Baoquan He1-6/+6
sync_global_pgds() Jeff Moyer reported that on his system with two memory regions 0~64G and 1T~1T+192G, and kernel option "memmap=192G!1024G" added, enabling KASLR will make the system hang intermittently during boot. While adding 'nokaslr' won't. The back trace is: Oops: 0000 [#1] SMP RIP: memcpy_erms() [ .... ] Call Trace: pmem_rw_page() bdev_read_page() do_mpage_readpage() mpage_readpages() blkdev_readpages() __do_page_cache_readahead() force_page_cache_readahead() page_cache_sync_readahead() generic_file_read_iter() blkdev_read_iter() __vfs_read() vfs_read() SyS_read() entry_SYSCALL_64_fastpath() This crash happens because the for loop count calculation in sync_global_pgds() is not correct. When a mapping area crosses PGD entries, we should calculate the starting address of region which next PGD covers and assign it to next for loop count, but not add PGDIR_SIZE directly. The old code works right only if the mapping area is an exact multiple of PGDIR_SIZE, otherwize the end region could be skipped so that it can't be synchronized to all other processes from kernel PGD init_mm.pgd. In Jeff's system, emulated pmem area [1024G, 1216G) is smaller than PGDIR_SIZE. While 'nokaslr' works because PAGE_OFFSET is 1T aligned, it makes this area be mapped inside one PGD entry. With KASLR enabled, this area could cross two PGD entries, then the next PGD entry won't be synced to all other processes. That is why we saw empty PGD. Fix it. Reported-by: Jeff Moyer <[email protected]> Signed-off-by: Baoquan He <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Dan Williams <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Young <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jinbum Park <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kees Cook <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Garnier <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Yasuaki Ishimatsu <[email protected]> Cc: Yinghai Lu <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-05-05Merge branch 'linus' into x86/urgent, to pick up dependent commitsIngo Molnar1741-49141/+61858
We are going to fix a bug introduced by a more recent commit, so refresh the tree. Signed-off-by: Ingo Molnar <[email protected]>
2017-05-05stackprotector: Increase the per-task stack canary's random range from 32 ↵Daniel Micay1-1/+1
bits to 64 bits on 64-bit platforms The stack canary is an 'unsigned long' and should be fully initialized to random data rather than only 32 bits of random data. Signed-off-by: Daniel Micay <[email protected]> Acked-by: Arjan van de Ven <[email protected]> Acked-by: Rik van Riel <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Arjan van Ven <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-05-05x86/asm: Don't use RBP as a temporary register in csum_partial_copy_generic()Josh Poimboeuf1-6/+6
Andrey Konovalov reported the following warning while fuzzing the kernel with syzkaller: WARNING: kernel stack regs at ffff8800686869f8 in a.out:4933 has bad 'bp' value c3fc855a10167ec0 The unwinder dump revealed that RBP had a bad value when an interrupt occurred in csum_partial_copy_generic(). That function saves RBP on the stack and then overwrites it, using it as a scratch register. That's problematic because it breaks stack traces if an interrupt occurs in the middle of the function. Replace the usage of RBP with another callee-saved register (R15) so stack traces are no longer affected. Reported-by: Andrey Konovalov <[email protected]> Tested-by: Andrey Konovalov <[email protected]> Signed-off-by: Josh Poimboeuf <[email protected]> Cc: Cong Wang <[email protected]> Cc: David S . Miller <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Kostya Serebryany <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Marcelo Ricardo Leitner <[email protected]> Cc: Neil Horman <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vlad Yasevich <[email protected]> Cc: [email protected] Cc: netdev <[email protected]> Cc: syzkaller <[email protected]> Link: http://lkml.kernel.org/r/4b03a961efda5ec9bfe46b7b9c9ad72d1efad343.1493909486.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <[email protected]>
2017-05-04tcmu: fix module removal due to stuck threadMike Christie1-0/+3
We need to do a kthread_should_stop to check when kthread_stop has been called. This was a regression added in b6df4b79a5514a9c6c53533436704129ef45bf76 tcmu: Add global data block pool support so not sure if you wanted to merge it in with that patch or what. Signed-off-by: Mike Christie <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]>
2017-05-04target: Don't force session reset if queue_depth does not changeNicholas Bellinger1-0/+7
Keeping in the idempotent nature of target_core_fabric_configfs.c, if a queue_depth value is set and it's the same as the existing value, don't attempt to force session reinstatement. Reported-by: Raghu Krishnamurthy <[email protected]> Cc: Raghu Krishnamurthy <[email protected]> Tested-by: Gary Guo <[email protected]> Cc: Gary Guo <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]>
2017-05-04iscsi-target: Set session_fall_back_to_erl0 when forcing reinstatementNicholas Bellinger3-0/+3
While testing modification of per se_node_acl queue_depth forcing session reinstatement via lio_target_nacl_cmdsn_depth_store() -> core_tpg_set_initiator_node_queue_depth(), a hung task bug triggered when changing cmdsn_depth invoked session reinstatement while an iscsi login was already waiting for session reinstatement to complete. This can happen when an outstanding se_cmd descriptor is taking a long time to complete, and session reinstatement from iscsi login or cmdsn_depth change occurs concurrently. To address this bug, explicitly set session_fall_back_to_erl0 = 1 when forcing session reinstatement, so session reinstatement is not attempted if an active session is already being shutdown. This patch has been tested with two scenarios. The first when iscsi login is blocked waiting for iscsi session reinstatement to complete followed by queue_depth change via configfs, and second when queue_depth change via configfs us blocked followed by a iscsi login driven session reinstatement. Note this patch depends on commit d36ad77f702 to handle multiple sessions per se_node_acl when changing cmdsn_depth, and for pre v4.5 kernels will need to be included for stable as well. Reported-by: Gary Guo <[email protected]> Tested-by: Gary Guo <[email protected]> Cc: Gary Guo <[email protected]> Cc: <[email protected]> # v4.1+ Signed-off-by: Nicholas Bellinger <[email protected]>
2017-05-04target: Fix compare_and_write_callback handling for non GOOD statusNicholas Bellinger1-1/+4
Following the bugfix for handling non SAM_STAT_GOOD COMPARE_AND_WRITE status during COMMIT phase in commit 9b2792c3da1, the same bug exists for the READ phase as well. This would manifest first as a lost SCSI response, and eventual hung task during fabric driver logout or re-login, as existing shutdown logic waited for the COMPARE_AND_WRITE se_cmd->cmd_kref to reach zero. To address this bug, compare_and_write_callback() has been changed to set post_ret = 1 and return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE as necessary to signal failure status. Reported-by: Bill Borsari <[email protected]> Cc: Bill Borsari <[email protected]> Tested-by: Gary Guo <[email protected]> Cc: Gary Guo <[email protected]> Cc: <[email protected]> # v4.1+ Signed-off-by: Nicholas Bellinger <[email protected]>
2017-05-04libnvdimm, pfn: fix 'npfns' vs section alignmentDan Williams1-2/+4
Fix failures to create namespaces due to the vmem_altmap not advertising enough free space to store the memmap. WARNING: CPU: 15 PID: 8022 at arch/x86/mm/init_64.c:656 arch_add_memory+0xde/0xf0 [..] Call Trace: dump_stack+0x63/0x83 __warn+0xcb/0xf0 warn_slowpath_null+0x1d/0x20 arch_add_memory+0xde/0xf0 devm_memremap_pages+0x244/0x440 pmem_attach_disk+0x37e/0x490 [nd_pmem] nd_pmem_probe+0x7e/0xa0 [nd_pmem] nvdimm_bus_probe+0x71/0x120 [libnvdimm] driver_probe_device+0x2bb/0x460 bind_store+0x114/0x160 drv_attr_store+0x25/0x30 In commit 658922e57b84 "libnvdimm, pfn: fix memmap reservation sizing" we arranged for the capacity to be allocated, but failed to also update the 'npfns' parameter. This leads to cases where there is enough capacity reserved to hold all the allocated sections, but vmemmap_populate_hugepages() still encounters -ENOMEM from altmap_alloc_block_buf(). This fix is a stop-gap until we can teach the core memory hotplug implementation to permit sub-section hotplug. Cc: <[email protected]> Fixes: 658922e57b84 ("libnvdimm, pfn: fix memmap reservation sizing") Reported-by: Anisha Allada <[email protected]> Signed-off-by: Dan Williams <[email protected]>
2017-05-04Merge tag 'char-misc-4.12-rc1' of ↵Linus Torvalds166-3259/+8064
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver updates from Greg KH: "Here is the big set of new char/misc driver drivers and features for 4.12-rc1. There's lots of new drivers added this time around, new firmware drivers from Google, more auxdisplay drivers, extcon drivers, fpga drivers, and a bunch of other driver updates. Nothing major, except if you happen to have the hardware for these drivers, and then you will be happy :) All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (136 commits) firmware: google memconsole: Fix return value check in platform_memconsole_init() firmware: Google VPD: Fix return value check in vpd_platform_init() goldfish_pipe: fix build warning about using too much stack. goldfish_pipe: An implementation of more parallel pipe fpga fr br: update supported version numbers fpga: region: release FPGA region reference in error path fpga altera-hps2fpga: disable/unprepare clock on error in alt_fpga_bridge_probe() mei: drop the TODO from samples firmware: Google VPD sysfs driver firmware: Google VPD: import lib_vpd source files misc: lkdtm: Add volatile to intentional NULL pointer reference eeprom: idt_89hpesx: Add OF device ID table misc: ds1682: Add OF device ID table misc: tsl2550: Add OF device ID table w1: Remove unneeded use of assert() and remove w1_log.h w1: Use kernel common min() implementation uio_mf624: Align memory regions to page size and set correct offsets uio_mf624: Refactor memory info initialization uio: Allow handling of non page-aligned memory regions hangcheck-timer: Fix typo in comment ...
2017-05-05drm: Document code of conductDaniel Vetter1-0/+11
freedesktop.org has adopted a formal&enforced code of conduct: https://www.fooishbar.org/blog/fdo-contributor-covenant/ https://www.freedesktop.org/wiki/CodeOfConduct/ Besides formalizing things a bit more I don't think this changes anything for us, we've already peer-enforced respectful and constructive interactions since a long time. But it's good to document things properly. v2: Drop confusing note from commit message and clarify the grammer (Chris, Alex and others). Cc: Daniel Stone <[email protected]> Cc: Keith Packard <[email protected]> Cc: [email protected] Signed-off-by: Daniel Vetter <[email protected]> Reviewed-by: Daniel Stone <[email protected]> Reviewed-by: Sumit Semwal <[email protected]> Acked-by: Archit Taneja <[email protected]> Reviewed-by: Martin Peres <[email protected]> Acked-by: Thierry Reding <[email protected]> Acked-by: Jani Nikula <[email protected]> Acked-by: Vincent Abriou <[email protected]> Acked-by: Neil Armstrong <[email protected]> Reviewed-by: Maarten Lankhorst <[email protected]> Acked-by: Brian Starkey <[email protected]> Acked-by: Rob Clark <[email protected]> Reviewed-by: David Herrmann <[email protected]> Acked-by: Sean Paul <[email protected]> Reviewed-by: Harry Wentland <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Acked-by: Alex Deucher <[email protected]> Acked-by: Gustavo Padovan <[email protected]> Acked-by: Michel Dänzer <[email protected]> Acked-by: Laurent Pinchart <[email protected]> Acked-by: Sumit Semwal <[email protected]> Acked-by: Keith Packard <[email protected]> Acked-by: Gabriel Krisman Bertazi <[email protected]> Acked-by: Adam Jackson <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
2017-05-05Merge tag 'drm/tegra/for-4.12-rc1' of ↵Dave Airlie25-202/+1517
git://anongit.freedesktop.org/tegra/linux into drm-next drm/tegra: Changes for v4.12-rc1 This contains various fixes to the host1x driver as well as a plug for a leak of kernel pointers to userspace. A fairly big addition this time around is the Video Image Composer (VIC) support that can be used to accelerate some 2D and image compositing operations. Furthermore the driver now supports FB modifiers, so we no longer rely on a custom IOCTL to set those. Finally this contains a few preparatory patches for Tegra186 support which unfortunately didn't quite make it this time, but will hopefully be ready for v4.13. * tag 'drm/tegra/for-4.12-rc1' of git://anongit.freedesktop.org/tegra/linux: gpu: host1x: Fix host1x driver shutdown gpu: host1x: Support module reset gpu: host1x: Sort includes alphabetically drm/tegra: Add VIC support dt-bindings: Add bindings for the Tegra VIC drm/tegra: Add falcon helper library drm/tegra: Add Tegra DRM allocation API drm/tegra: Add tiling FB modifiers drm/tegra: Don't leak kernel pointer to userspace drm/tegra: Protect IOMMU operations by mutex drm/tegra: Enable IOVA API when IOMMU support is enabled gpu: host1x: Add IOMMU support gpu: host1x: Fix potential out-of-bounds access iommu/iova: Fix compile error with CONFIG_IOMMU_IOVA=m iommu: Add dummy implementations for !IOMMU_IOVA MAINTAINERS: Add related headers to IOMMU section iommu/iova: Consolidate code for adding new node to iovad domain rbtree
2017-05-04Merge tag 'driver-core-4.12-rc1' of ↵Linus Torvalds6-9/+29
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Very tiny pull request for 4.12-rc1 for the driver core this time around. There are some documentation fixes, an eventpoll.h fixup to make it easier for the libc developers to take our header files directly, and some very minor driver core fixes and changes. All have been in linux-next for a very long time with no reported issues" * tag 'driver-core-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: Revert "kref: double kref_put() in my_data_handler()" driver core: don't initialize 'parent' in device_add() drivers: base: dma-mapping: use nth_page helper Documentation/ABI: add information about cpu_capacity debugfs: set no_llseek in DEFINE_DEBUGFS_ATTRIBUTE eventpoll.h: add missing epoll event masks eventpoll.h: fix epoll event masks