Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Highlights include:
- Larger virtual address space on 64-bit server CPUs. By default we
use a 128TB virtual address space, but a process can request access
to the full 512TB by passing a hint to mmap().
- Support for the new Power9 "XIVE" interrupt controller.
- TLB flushing optimisations for the radix MMU on Power9.
- Support for CAPI cards on Power9, using the "Coherent Accelerator
Interface Architecture 2.0".
- The ability to configure the mmap randomisation limits at build and
runtime.
- Several small fixes and cleanups to the kprobes code, as well as
support for KPROBES_ON_FTRACE.
- Major improvements to handling of system reset interrupts,
correctly treating them as NMIs, giving them a dedicated stack and
using a new hypervisor call to trigger them, all of which should
aid debugging and robustness.
- Many fixes and other minor enhancements.
Thanks to: Alastair D'Silva, Alexey Kardashevskiy, Alistair Popple,
Andrew Donnellan, Aneesh Kumar K.V, Anshuman Khandual, Anton
Blanchard, Balbir Singh, Ben Hutchings, Benjamin Herrenschmidt,
Bhupesh Sharma, Chris Packham, Christian Zigotzky, Christophe Leroy,
Christophe Lombard, Daniel Axtens, David Gibson, Gautham R. Shenoy,
Gavin Shan, Geert Uytterhoeven, Guilherme G. Piccoli, Hamish Martin,
Hari Bathini, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Mahesh J
Salgaonkar, Mahesh Salgaonkar, Masami Hiramatsu, Matt Brown, Matthew
R. Ochs, Michael Neuling, Naveen N. Rao, Nicholas Piggin, Oliver
O'Halloran, Pan Xinhui, Paul Mackerras, Rashmica Gupta, Russell
Currey, Sukadev Bhattiprolu, Thadeu Lima de Souza Cascardo, Tobin C.
Harding, Tyrel Datwyler, Uma Krishnan, Vaibhav Jain, Vipin K Parashar,
Yang Shi"
* tag 'powerpc-4.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (214 commits)
powerpc/64s: Power9 has no LPCR[VRMASD] field so don't set it
powerpc/powernv: Fix TCE kill on NVLink2
powerpc/mm/radix: Drop support for CPUs without lockless tlbie
powerpc/book3s/mce: Move add_taint() later in virtual mode
powerpc/sysfs: Move #ifdef CONFIG_HOTPLUG_CPU out of the function body
powerpc/smp: Document irq enable/disable after migrating IRQs
powerpc/mpc52xx: Don't select user-visible RTAS_PROC
powerpc/powernv: Document cxl dependency on special case in pnv_eeh_reset()
powerpc/eeh: Clean up and document event handling functions
powerpc/eeh: Avoid use after free in eeh_handle_special_event()
cxl: Mask slice error interrupts after first occurrence
cxl: Route eeh events to all drivers in cxl_pci_error_detected()
cxl: Force context lock during EEH flow
powerpc/64: Allow CONFIG_RELOCATABLE if COMPILE_TEST
powerpc/xmon: Teach xmon oops about radix vectors
powerpc/mm/hash: Fix off-by-one in comment about kernel contexts ids
powerpc/pseries: Enable VFIO
powerpc/powernv: Fix iommu table size calculation hook for small tables
powerpc/powernv: Check kzalloc() return value in pnv_pci_table_alloc
powerpc: Add arch/powerpc/tools directory
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull namespace updates from Eric Biederman:
"This is a set of small fixes that were mostly stumbled over during
more significant development. This proc fix and the fix to
posix-timers are the most significant of the lot.
There is a lot of good development going on but unfortunately it
didn't quite make the merge window"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
proc: Fix unbalanced hard link numbers
signal: Make kill_proc_info static
rlimit: Properly call security_task_setrlimit
signal: Remove unused definition of sig_user_definied
ia64: Remove unused IA64_TASK_SIGHAND_OFFSET and IA64_SIGHAND_SIGLOCK_OFFSET
ipc: Remove unused declaration of recompute_msgmni
posix-timers: Correct sanity check in posix_cpu_nsleep
sysctl: Remove dead register_sysctl_root
|
|
This patch removes useless nfs_readdir_get_array() and
nfs_readdir_release_array() as suggested by Trond Myklebust
nfs_readdir() calls nfs_revalidate_mapping() before
readdir_search_pagecache() , nfs_do_filldir(), uncached_readdir()
so mapping should be correct.
While kmap() can't fail, all subsequent error checks were removed
as well as unused labels.
Signed-off-by: Fabian Frederick <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
When using 'ls -l' to display a large directory, if noac option is used,
in function nfs_getattr() nfs_need_revalidate_inode() will always be true
for NFSv3 and the nfs_entry cache of the directory will be flushed. The
flush will lead to a fully reread of the directory entries from server.
To prevent the unnecessary RPCs, we need to check whether or not the
noac option is used, and always report the invocation of nfs_getattr()
as cache hit instead cache miss when it's on.
Signed-off-by: Hou Tao <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
nfs4_proc_async_renew
If memory allocation fails for the callback data, we need to put the nfs_client
or we end up with an elevated refcount.
Signed-off-by: Dave Wysochanski <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
If the server returns NFS4ERR_CONN_NOT_BOUND_TO_SESSION because we
are trunking, then RECLAIM_COMPLETE must handle that by calling
nfs4_schedule_session_recovery() and then retrying.
Reported-by: Chuck Lever <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
Tested-by: Chuck Lever <[email protected]>
|
|
Whole point of randomization was to hide server uptime, but an attacker
can simply start a syn flood and TCP generates 'old style' timestamps,
directly revealing server jiffies value.
Also, TSval sent by the server to a particular remote address vary
depending on syncookies being sent or not, potentially triggering PAWS
drops for innocent clients.
Lets implement proper randomization, including for SYNcookies.
Also we do not need to export sysctl_tcp_timestamps, since it is not
used from a module.
In v2, I added Florian feedback and contribution, adding tsoff to
tcp_get_cookie_sock().
v3 removed one unused variable in tcp_v4_connect() as Florian spotted.
Fixes: 95a22caee396c ("tcp: randomize tcp timestamp offsets for each connection")
Signed-off-by: Eric Dumazet <[email protected]>
Reviewed-by: Florian Westphal <[email protected]>
Tested-by: Florian Westphal <[email protected]>
Cc: Yuchung Cheng <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
While working on another build error, I ran into several variations of
this dependency loop:
subsection "Kconfig recursive dependency limitations"
drivers/input/Kconfig:8: symbol INPUT is selected by VT
For a resolution refer to Documentation/kbuild/kconfig-language.txt
subsection "Kconfig recursive dependency limitations"
drivers/tty/Kconfig:12: symbol VT is selected by FB_STI
For a resolution refer to Documentation/kbuild/kconfig-language.txt
subsection "Kconfig recursive dependency limitations"
drivers/video/fbdev/Kconfig:677: symbol FB_STI depends on FB
For a resolution refer to Documentation/kbuild/kconfig-language.txt
subsection "Kconfig recursive dependency limitations"
drivers/video/fbdev/Kconfig:5: symbol FB is selected by DRM_KMS_FB_HELPER
For a resolution refer to Documentation/kbuild/kconfig-language.txt
subsection "Kconfig recursive dependency limitations"
drivers/gpu/drm/Kconfig:72: symbol DRM_KMS_FB_HELPER is selected by DRM_KMS_CMA_HELPER
For a resolution refer to Documentation/kbuild/kconfig-language.txt
subsection "Kconfig recursive dependency limitations"
drivers/gpu/drm/Kconfig:137: symbol DRM_KMS_CMA_HELPER is selected by DRM_HDLCD
For a resolution refer to Documentation/kbuild/kconfig-language.txt
subsection "Kconfig recursive dependency limitations"
drivers/gpu/drm/arm/Kconfig:6: symbol DRM_HDLCD depends on OF
For a resolution refer to Documentation/kbuild/kconfig-language.txt
subsection "Kconfig recursive dependency limitations"
drivers/of/Kconfig:4: symbol OF is selected by X86_INTEL_CE
For a resolution refer to Documentation/kbuild/kconfig-language.txt
subsection "Kconfig recursive dependency limitations"
arch/x86/Kconfig:523: symbol X86_INTEL_CE depends on X86_IO_APIC
For a resolution refer to Documentation/kbuild/kconfig-language.txt
subsection "Kconfig recursive dependency limitations"
arch/x86/Kconfig:1011: symbol X86_IO_APIC depends on X86_LOCAL_APIC
For a resolution refer to Documentation/kbuild/kconfig-language.txt
subsection "Kconfig recursive dependency limitations"
arch/x86/Kconfig:1005: symbol X86_LOCAL_APIC depends on X86_UP_APIC
For a resolution refer to Documentation/kbuild/kconfig-language.txt
subsection "Kconfig recursive dependency limitations"
arch/x86/Kconfig:980: symbol X86_UP_APIC depends on PCI_MSI
For a resolution refer to Documentation/kbuild/kconfig-language.txt
subsection "Kconfig recursive dependency limitations"
drivers/pci/Kconfig:11: symbol PCI_MSI is selected by AMD_IOMMU
For a resolution refer to Documentation/kbuild/kconfig-language.txt
subsection "Kconfig recursive dependency limitations"
drivers/iommu/Kconfig:106: symbol AMD_IOMMU depends on IOMMU_SUPPORT
For a resolution refer to Documentation/kbuild/kconfig-language.txt
subsection "Kconfig recursive dependency limitations"
drivers/iommu/Kconfig:5: symbol IOMMU_SUPPORT is selected by DRM_ETNAVIV
For a resolution refer to Documentation/kbuild/kconfig-language.txt
subsection "Kconfig recursive dependency limitations"
drivers/gpu/drm/etnaviv/Kconfig:2: symbol DRM_ETNAVIV depends on THERMAL
For a resolution refer to Documentation/kbuild/kconfig-language.txt
subsection "Kconfig recursive dependency limitations"
drivers/thermal/Kconfig:5: symbol THERMAL is selected by ACPI_VIDEO
For a resolution refer to Documentation/kbuild/kconfig-language.txt
subsection "Kconfig recursive dependency limitations"
drivers/acpi/Kconfig:183: symbol ACPI_VIDEO is selected by INPUT
This doesn't currently show up as I fixed the 'THERMAL' part of it,
but I noticed that the FB_STI dependency should not be there but
was introduced by slightly incorrect bug-fix patch that tried to
fix a link error.
Instead of selecting 'VT' to make us enter the drivers/video/console
directory at compile-time, it's sufficient to build the
drivers/video/console/sticore.c file by adding its directory
to when CONFIG_FB_STI is enabled. Alternatively, we could move the
sticore code to another directory that is always built when we
have at STI_CONSOLE or FB_STI enabled.
Fixes: 17085a934592 ("parisc: stifb: should depend on STI_CONSOLE")
Signed-off-by: Arnd Bergmann <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: Alexander Beregalov <[email protected]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
|
|
The attribute sizes for IFLA_BRPORT_MCAST_FLOOD and
IFLA_BRPORT_BCAST_FLOOD weren't accounted for in br_port_info_size()
when they were added. Do so now and also add the corresponding policy
entries:
Cc: Nikolay Aleksandrov <[email protected]>
Cc: Mike Manning <[email protected]>
Fixes: b6cb5ac8331b ("net: bridge: add per-port multicast flood flag")
Fixes: 99f906e9ad7b ("bridge: add per-port broadcast flood flag")
Signed-off-by: Tobias Klauser <[email protected]>
Acked-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
SFM is mapping doublequote to 0xF020
Without this patch creating files with doublequote fails to Windows/Mac
Signed-off-by: Bjoern Jacke <[email protected]>
Signed-off-by: Steve French <[email protected]>
CC: stable <[email protected]>
|
|
|
|
|
|
While honouring the DMA_ATTR_FORCE_CONTIGUOUS on arm64 (commit
44176bb38fa4: "arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to
IOMMU"), the existing uses of dma_mmap_attrs() and dma_get_sgtable()
have been broken by passing a physically contiguous vm_struct with an
invalid pages pointer through the common iommu API.
Since the coherent allocation with DMA_ATTR_FORCE_CONTIGUOUS uses CMA,
this patch simply reuses the existing swiotlb logic for mmap and
get_sgtable.
Note that the current implementation of get_sgtable (both swiotlb and
iommu) is broken if dma_declare_coherent_memory() is used since such
memory does not have a corresponding struct page. To be addressed in a
subsequent patch.
Fixes: 44176bb38fa4 ("arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU")
Reported-by: Andrzej Hajda <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Acked-by: Robin Murphy <[email protected]>
Tested-by: Andrzej Hajda <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
|
|
based on commit b3b42c0deaa1
("fs/affs: make export work with cold dcache")
This adds get_parent function so that nfs client can still work after
cache drop (Tested on NFS v4 with echo 3 > /proc/sys/vm/drop_caches)
Signed-off-by: Fabian Frederick <[email protected]>
Signed-off-by: Luis de Bethencourt <[email protected]>
|
|
Signed-off-by: Amir Goldstein <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
An upper type non directory dentry that is a copy up target
should have a reference to its lower copy up origin.
There are three ways for an upper type dentry to be instantiated:
1. A lower type dentry that is being copied up
2. An entry that is found in upper dir by ovl_lookup()
3. A negative dentry is hardlinked to an upper type dentry
In the first case, the lower reference is set before copy up.
In the second case, the lower reference is found by ovl_lookup().
In the last case of hardlinked upper dentry, it is not easy to
update the lower reference of the negative dentry. Instead,
drop the newly hardlinked negative dentry from dcache and let
the next access call ovl_lookup() to find its lower reference.
This makes sure that the inode number reported by stat(2) after
the hardlink is created is the same inode number that will be
reported by stat(2) after mount cycle, which is the inode number
of the lower copy up origin of the hardlink source.
NOTE that this does not fix breaking of lower hardlinks on copy
up, but only fixes the case of lower nlink == 1, whose upper copy
up inode is hardlinked in upper dir.
Signed-off-by: Amir Goldstein <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
When all layers are on the same underlying filesystem, let stat(2) return
st_dev/st_ino values of the copy up origin inode if it is known.
This results in constant st_ino/st_dev representation of files in an
overlay mount before and after copy up.
When the underlying filesystem support NFS exportfs, the result is also
persistent st_ino/st_dev representation before and after mount cycle.
Lower hardlinks are broken on copy up to different upper files, so we
cannot use the lower origin st_ino for those different files, even for the
same fs case.
When all overlay layers are on the same fs, use overlay st_dev for non-dirs
to get the correct result from du -x.
Signed-off-by: Amir Goldstein <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
stat(2) on overlay directories reports the overlay temp inode
number, which is constant across copy up, but is not persistent.
When all layers are on the same fs, report the copy up origin inode
number for directories.
This inode number is persistent, unique across the overlay mount and
constant across copy up.
Signed-off-by: Amir Goldstein <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
For directory entries, non zero oe->numlower implies OVL_TYPE_MERGE.
Define a new type flag OVL_TYPE_ORIGIN to indicate that an entry holds a
reference to its lower copy up origin.
For directory entries ORIGIN := MERGE && UPPER. For non-dir entries ORIGIN
means that a lower type dentry has been recently copied up or that we were
able to find the copy up origin from overlay.origin xattr.
Signed-off-by: Amir Goldstein <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
If overlay.origin xattr is found on a non-dir upper inode try to get lower
dentry by calling exportfs_decode_fh().
On failure to lookup by file handle to lower layer, do not lookup the copy
up origin by name, because the lower found by name could be another file in
case the upper file was renamed.
Signed-off-by: Amir Goldstein <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
Signed-off-by: Amir Goldstein <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
Sometimes it is interesting to know if an upper file is pure upper or a
copy up target, and if it is a copy up target, it may be interesting to
find the copy up origin.
This will be used to preserve lower inode numbers across copy up.
Store the lower inode file handle in upper inode extended attribute
overlay.origin on copy up to use it later for these cases. Store the lower
filesystem uuid along side the file handle, so we can validate that we are
looking for the origin file in the original fs.
If lower fs does not support NFS export ops store a zero sized xattr so we
can always use the overlay.origin xattr to distinguish between a copy up
and a pure upper inode.
Signed-off-by: Amir Goldstein <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
Some features can only work when all layers are on the same fs. Test this
condition during mount time, so features can check them later.
Add helper ovl_same_sb() to return the common super block in case all
layers are on the same fs.
Signed-off-by: Amir Goldstein <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
|
|
Routines that are set by xen_init_time_ops() use shared_info's
pvclock_vcpu_time_info area. This area is not properly available until
shared_info is mapped in xen_setup_shared_info().
This became especially problematic due to commit dd759d93f4dd ("x86/timers:
Add simple udelay calibration") where we end up reading tsc_to_system_mul
from xen_dummy_shared_info (i.e. getting zero value) and then trying
to divide by it in pvclock_tsc_khz().
Signed-off-by: Boris Ostrovsky <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>
|
|
Commit 690b7f10b4f9f ("x86/xen: use capabilities instead of fake cpuid
values for xsave") introduced a regression as it tried to make use of
the fixup feature before it being available.
Fall back to the old variant testing via cpuid().
Signed-off-by: Juergen Gross <[email protected]>
Reviewed-by: Boris Ostrovsky <[email protected]>
Signed-off-by: Juergen Gross <[email protected]>
|
|
According to the SDM, if the "activate secondary controls" primary
processor-based VM-execution control is 0, no checks are performed on
the secondary processor-based VM-execution controls.
Signed-off-by: Jim Mattson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
orderly_poweroff is triggered when a graceful shutdown
of system is desired. This may be used in many critical states of the
kernel such as when subsystems detects conditions such as critical
temperature conditions. However, in certain conditions in system
boot up sequences like those in the middle of driver probes being
initiated, userspace will be unable to power off the system in a clean
manner and leaves the system in a critical state. In cases like these,
the /sbin/poweroff will return success (having forked off to attempt
powering off the system. However, the system overall will fail to
completely poweroff (since other modules will be probed) and the system
is still functional with no userspace (since that would have shut itself
off).
However, there is no clean way of detecting such failure of userspace
powering off the system. In such scenarios, it is necessary for a backup
workqueue to be able to force a shutdown of the system when orderly
shutdown is not successful after a configurable time period.
Reported-by: Nishanth Menon <[email protected]>
Signed-off-by: Keerthy <[email protected]>
Acked-by: Eduardo Valentin <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>
|
|
thermal_zone_device_check --> thermal_zone_device_update -->
handle_thermal_trip --> handle_critical_trips --> orderly_poweroff
The above sequence happens every 250/500 mS based on the configuration.
The orderly_poweroff function is getting called every 250/500 mS.
With a full fledged file system it takes at least 5-10 Seconds to
power off gracefully.
In that period due to the thermal_zone_device_check triggering
periodically the thermal work queues bombard with
orderly_poweroff calls multiple times eventually leading to
failures in gracefully powering off the system.
Make sure that orderly_poweroff is called only once.
Signed-off-by: Keerthy <[email protected]>
Acked-by: Eduardo Valentin <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>
|
|
The interrupt request call in Intel SoC DTS driver may fail if
there is no underlying BIOS support. However, the user space
thermal daemon can still use the thermal zones created by the
SoC DTS driver in polling mode, therefore, instead of bailing
out on interrupt request failures, it is better just to log
a warning message and continue the init process.
Signed-off-by: Brian Bian <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>
|
|
This patch adds another parameter to the trace function:
trace_thermal_power_devfreq_get_power().
In case when we call directly driver's code for the real power,
we do not have static/dynamic_power values. Instead we get total
power in the '*power' value. The 'static_power' and
'dynamic_power' are set to 0.
Therefore, we have to trace that '*power' value in this scenario.
CC: Steven Rostedt <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: Zhang Rui <[email protected]>
CC: Eduardo Valentin <[email protected]>
Acked-by: Javi Merino <[email protected]>
Signed-off-by: Lukasz Luba <[email protected]>
|
|
This patch introduces a new interface for device drivers connected to
devfreq_cooling in the thermal framework: get_real_power().
Some devices have more sophisticated methods (like power counters)
to approximate the actual power that they use.
In the previous implementation we had a pre-calculated power
table which was then scaled by 'utilization'
('busy_time' and 'total_time' taken from devfreq 'last_status').
With this new interface the driver can provide more precise data
regarding actual power to the thermal governor every time the power
budget is calculated. We then use this value and calculate the real
resource utilization scaling factor.
Reviewed-by: Chris Diamand <[email protected]>
Acked-by: Javi Merino <[email protected]>
Signed-off-by: Lukasz Luba <[email protected]>
|
|
Move the code which gets the voltage for a given frequency.
This code will be resused in few places.
Acked-by: Javi Merino <[email protected]>
Signed-off-by: Lukasz Luba <[email protected]>
|
|
In OpenRISC we do not have a bootloader passed initrd, but the built in
initramfs does contain the /init and other binaries, including modules.
The previous commit 08865514805d2 ("initramfs: finish fput() before
accessing any binary from initramfs") made a change to only call fput()
if the bootloader initrd was available, this caused intermittent crashes
for OpenRISC.
This patch changes the fput() to happen unconditionally if any rootfs is
loaded. Also, I added some comments to make it a bit more clear why we
call unpack_to_rootfs() multiple times.
Fixes: 08865514805d2 ("initramfs: finish fput() before accessing any binary from initramfs")
Cc: [email protected]
Cc: Lokesh Vutla <[email protected]>
Cc: Al Viro <[email protected]>
Acked-by: Al Viro <[email protected]>
Signed-off-by: Stafford Horne <[email protected]>
|
|
|
|
incompatibility
The constraint "rm" allows the compiler to put mix_const into memory.
When the input operand is a memory location then MUL needs an operand
size suffix, since Clang can't infer the multiplication width from the
operand.
Add and use the _ASM_MUL macro which determines the operand size and
resolves to the NUL instruction with the corresponding suffix.
This fixes the following error when building with clang:
CC arch/x86/lib/kaslr.o
/tmp/kaslr-dfe1ad.s: Assembler messages:
/tmp/kaslr-dfe1ad.s:182: Error: no instruction mnemonic suffix given and no register operands; can't size instruction
Signed-off-by: Matthias Kaehlcke <[email protected]>
Cc: Grant Grundler <[email protected]>
Cc: Greg Hackmann <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Michael Davidson <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Commit f4ea6dcb08ea ("powerpc/mm: Enable mappings above 128TB") increased
the task size on book3s, and introduced a mechanism to dynamically
control whether a task uses these larger addresses. While the change to
the task size itself was ifdef-protected to only apply on book3s, the
change to STACK_TOP_USER64 was not. On book3e, this had the effect of
trying to use addresses up to 128TiB for the stack despite a 64TiB task
size limit -- which broke 64-bit userspace producing the following errors:
Starting init: /sbin/init exists but couldn't execute it (error -14)
Starting init: /bin/sh exists but couldn't execute it (error -14)
Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
Fixes: f4ea6dcb08ea ("powerpc/mm: Enable mappings above 128TB")
Cc: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Scott Wood <[email protected]>
|
|
sync_global_pgds()
Jeff Moyer reported that on his system with two memory regions 0~64G and
1T~1T+192G, and kernel option "memmap=192G!1024G" added, enabling KASLR
will make the system hang intermittently during boot. While adding 'nokaslr'
won't.
The back trace is:
Oops: 0000 [#1] SMP
RIP: memcpy_erms()
[ .... ]
Call Trace:
pmem_rw_page()
bdev_read_page()
do_mpage_readpage()
mpage_readpages()
blkdev_readpages()
__do_page_cache_readahead()
force_page_cache_readahead()
page_cache_sync_readahead()
generic_file_read_iter()
blkdev_read_iter()
__vfs_read()
vfs_read()
SyS_read()
entry_SYSCALL_64_fastpath()
This crash happens because the for loop count calculation in sync_global_pgds()
is not correct. When a mapping area crosses PGD entries, we should
calculate the starting address of region which next PGD covers and assign
it to next for loop count, but not add PGDIR_SIZE directly. The old
code works right only if the mapping area is an exact multiple of PGDIR_SIZE,
otherwize the end region could be skipped so that it can't be synchronized
to all other processes from kernel PGD init_mm.pgd.
In Jeff's system, emulated pmem area [1024G, 1216G) is smaller than
PGDIR_SIZE. While 'nokaslr' works because PAGE_OFFSET is 1T aligned, it
makes this area be mapped inside one PGD entry. With KASLR enabled,
this area could cross two PGD entries, then the next PGD entry won't
be synced to all other processes. That is why we saw empty PGD.
Fix it.
Reported-by: Jeff Moyer <[email protected]>
Signed-off-by: Baoquan He <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Dave Young <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jinbum Park <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Garnier <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Yasuaki Ishimatsu <[email protected]>
Cc: Yinghai Lu <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
We are going to fix a bug introduced by a more recent commit, so
refresh the tree.
Signed-off-by: Ingo Molnar <[email protected]>
|
|
bits to 64 bits on 64-bit platforms
The stack canary is an 'unsigned long' and should be fully initialized to
random data rather than only 32 bits of random data.
Signed-off-by: Daniel Micay <[email protected]>
Acked-by: Arjan van de Ven <[email protected]>
Acked-by: Rik van Riel <[email protected]>
Acked-by: Kees Cook <[email protected]>
Cc: Arjan van Ven <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Andrey Konovalov reported the following warning while fuzzing the kernel
with syzkaller:
WARNING: kernel stack regs at ffff8800686869f8 in a.out:4933 has bad 'bp' value c3fc855a10167ec0
The unwinder dump revealed that RBP had a bad value when an interrupt
occurred in csum_partial_copy_generic().
That function saves RBP on the stack and then overwrites it, using it as
a scratch register. That's problematic because it breaks stack traces
if an interrupt occurs in the middle of the function.
Replace the usage of RBP with another callee-saved register (R15) so
stack traces are no longer affected.
Reported-by: Andrey Konovalov <[email protected]>
Tested-by: Andrey Konovalov <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: David S . Miller <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Kostya Serebryany <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Marcelo Ricardo Leitner <[email protected]>
Cc: Neil Horman <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vlad Yasevich <[email protected]>
Cc: [email protected]
Cc: netdev <[email protected]>
Cc: syzkaller <[email protected]>
Link: http://lkml.kernel.org/r/4b03a961efda5ec9bfe46b7b9c9ad72d1efad343.1493909486.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <[email protected]>
|
|
We need to do a kthread_should_stop to check when kthread_stop has been
called.
This was a regression added in
b6df4b79a5514a9c6c53533436704129ef45bf76
tcmu: Add global data block pool support
so not sure if you wanted to merge it in with that patch or what.
Signed-off-by: Mike Christie <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
|
|
Keeping in the idempotent nature of target_core_fabric_configfs.c,
if a queue_depth value is set and it's the same as the existing
value, don't attempt to force session reinstatement.
Reported-by: Raghu Krishnamurthy <[email protected]>
Cc: Raghu Krishnamurthy <[email protected]>
Tested-by: Gary Guo <[email protected]>
Cc: Gary Guo <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
|
|
While testing modification of per se_node_acl queue_depth forcing
session reinstatement via lio_target_nacl_cmdsn_depth_store() ->
core_tpg_set_initiator_node_queue_depth(), a hung task bug triggered
when changing cmdsn_depth invoked session reinstatement while an iscsi
login was already waiting for session reinstatement to complete.
This can happen when an outstanding se_cmd descriptor is taking a
long time to complete, and session reinstatement from iscsi login
or cmdsn_depth change occurs concurrently.
To address this bug, explicitly set session_fall_back_to_erl0 = 1
when forcing session reinstatement, so session reinstatement is
not attempted if an active session is already being shutdown.
This patch has been tested with two scenarios. The first when
iscsi login is blocked waiting for iscsi session reinstatement
to complete followed by queue_depth change via configfs, and
second when queue_depth change via configfs us blocked followed
by a iscsi login driven session reinstatement.
Note this patch depends on commit d36ad77f702 to handle multiple
sessions per se_node_acl when changing cmdsn_depth, and for
pre v4.5 kernels will need to be included for stable as well.
Reported-by: Gary Guo <[email protected]>
Tested-by: Gary Guo <[email protected]>
Cc: Gary Guo <[email protected]>
Cc: <[email protected]> # v4.1+
Signed-off-by: Nicholas Bellinger <[email protected]>
|
|
Following the bugfix for handling non SAM_STAT_GOOD COMPARE_AND_WRITE
status during COMMIT phase in commit 9b2792c3da1, the same bug exists
for the READ phase as well.
This would manifest first as a lost SCSI response, and eventual
hung task during fabric driver logout or re-login, as existing
shutdown logic waited for the COMPARE_AND_WRITE se_cmd->cmd_kref
to reach zero.
To address this bug, compare_and_write_callback() has been changed
to set post_ret = 1 and return TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE
as necessary to signal failure status.
Reported-by: Bill Borsari <[email protected]>
Cc: Bill Borsari <[email protected]>
Tested-by: Gary Guo <[email protected]>
Cc: Gary Guo <[email protected]>
Cc: <[email protected]> # v4.1+
Signed-off-by: Nicholas Bellinger <[email protected]>
|
|
Fix failures to create namespaces due to the vmem_altmap not advertising
enough free space to store the memmap.
WARNING: CPU: 15 PID: 8022 at arch/x86/mm/init_64.c:656 arch_add_memory+0xde/0xf0
[..]
Call Trace:
dump_stack+0x63/0x83
__warn+0xcb/0xf0
warn_slowpath_null+0x1d/0x20
arch_add_memory+0xde/0xf0
devm_memremap_pages+0x244/0x440
pmem_attach_disk+0x37e/0x490 [nd_pmem]
nd_pmem_probe+0x7e/0xa0 [nd_pmem]
nvdimm_bus_probe+0x71/0x120 [libnvdimm]
driver_probe_device+0x2bb/0x460
bind_store+0x114/0x160
drv_attr_store+0x25/0x30
In commit 658922e57b84 "libnvdimm, pfn: fix memmap reservation sizing"
we arranged for the capacity to be allocated, but failed to also update
the 'npfns' parameter. This leads to cases where there is enough
capacity reserved to hold all the allocated sections, but
vmemmap_populate_hugepages() still encounters -ENOMEM from
altmap_alloc_block_buf().
This fix is a stop-gap until we can teach the core memory hotplug
implementation to permit sub-section hotplug.
Cc: <[email protected]>
Fixes: 658922e57b84 ("libnvdimm, pfn: fix memmap reservation sizing")
Reported-by: Anisha Allada <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the big set of new char/misc driver drivers and features for
4.12-rc1.
There's lots of new drivers added this time around, new firmware
drivers from Google, more auxdisplay drivers, extcon drivers, fpga
drivers, and a bunch of other driver updates. Nothing major, except if
you happen to have the hardware for these drivers, and then you will
be happy :)
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (136 commits)
firmware: google memconsole: Fix return value check in platform_memconsole_init()
firmware: Google VPD: Fix return value check in vpd_platform_init()
goldfish_pipe: fix build warning about using too much stack.
goldfish_pipe: An implementation of more parallel pipe
fpga fr br: update supported version numbers
fpga: region: release FPGA region reference in error path
fpga altera-hps2fpga: disable/unprepare clock on error in alt_fpga_bridge_probe()
mei: drop the TODO from samples
firmware: Google VPD sysfs driver
firmware: Google VPD: import lib_vpd source files
misc: lkdtm: Add volatile to intentional NULL pointer reference
eeprom: idt_89hpesx: Add OF device ID table
misc: ds1682: Add OF device ID table
misc: tsl2550: Add OF device ID table
w1: Remove unneeded use of assert() and remove w1_log.h
w1: Use kernel common min() implementation
uio_mf624: Align memory regions to page size and set correct offsets
uio_mf624: Refactor memory info initialization
uio: Allow handling of non page-aligned memory regions
hangcheck-timer: Fix typo in comment
...
|
|
freedesktop.org has adopted a formal&enforced code of conduct:
https://www.fooishbar.org/blog/fdo-contributor-covenant/
https://www.freedesktop.org/wiki/CodeOfConduct/
Besides formalizing things a bit more I don't think this changes
anything for us, we've already peer-enforced respectful and
constructive interactions since a long time. But it's good to document
things properly.
v2: Drop confusing note from commit message and clarify the grammer
(Chris, Alex and others).
Cc: Daniel Stone <[email protected]>
Cc: Keith Packard <[email protected]>
Cc: [email protected]
Signed-off-by: Daniel Vetter <[email protected]>
Reviewed-by: Daniel Stone <[email protected]>
Reviewed-by: Sumit Semwal <[email protected]>
Acked-by: Archit Taneja <[email protected]>
Reviewed-by: Martin Peres <[email protected]>
Acked-by: Thierry Reding <[email protected]>
Acked-by: Jani Nikula <[email protected]>
Acked-by: Vincent Abriou <[email protected]>
Acked-by: Neil Armstrong <[email protected]>
Reviewed-by: Maarten Lankhorst <[email protected]>
Acked-by: Brian Starkey <[email protected]>
Acked-by: Rob Clark <[email protected]>
Reviewed-by: David Herrmann <[email protected]>
Acked-by: Sean Paul <[email protected]>
Reviewed-by: Harry Wentland <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Acked-by: Gustavo Padovan <[email protected]>
Acked-by: Michel Dänzer <[email protected]>
Acked-by: Laurent Pinchart <[email protected]>
Acked-by: Sumit Semwal <[email protected]>
Acked-by: Keith Packard <[email protected]>
Acked-by: Gabriel Krisman Bertazi <[email protected]>
Acked-by: Adam Jackson <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
git://anongit.freedesktop.org/tegra/linux into drm-next
drm/tegra: Changes for v4.12-rc1
This contains various fixes to the host1x driver as well as a plug for a
leak of kernel pointers to userspace.
A fairly big addition this time around is the Video Image Composer (VIC)
support that can be used to accelerate some 2D and image compositing
operations.
Furthermore the driver now supports FB modifiers, so we no longer rely
on a custom IOCTL to set those.
Finally this contains a few preparatory patches for Tegra186 support
which unfortunately didn't quite make it this time, but will hopefully
be ready for v4.13.
* tag 'drm/tegra/for-4.12-rc1' of git://anongit.freedesktop.org/tegra/linux:
gpu: host1x: Fix host1x driver shutdown
gpu: host1x: Support module reset
gpu: host1x: Sort includes alphabetically
drm/tegra: Add VIC support
dt-bindings: Add bindings for the Tegra VIC
drm/tegra: Add falcon helper library
drm/tegra: Add Tegra DRM allocation API
drm/tegra: Add tiling FB modifiers
drm/tegra: Don't leak kernel pointer to userspace
drm/tegra: Protect IOMMU operations by mutex
drm/tegra: Enable IOVA API when IOMMU support is enabled
gpu: host1x: Add IOMMU support
gpu: host1x: Fix potential out-of-bounds access
iommu/iova: Fix compile error with CONFIG_IOMMU_IOVA=m
iommu: Add dummy implementations for !IOMMU_IOVA
MAINTAINERS: Add related headers to IOMMU section
iommu/iova: Consolidate code for adding new node to iovad domain rbtree
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Very tiny pull request for 4.12-rc1 for the driver core this time
around.
There are some documentation fixes, an eventpoll.h fixup to make it
easier for the libc developers to take our header files directly, and
some very minor driver core fixes and changes.
All have been in linux-next for a very long time with no reported
issues"
* tag 'driver-core-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
Revert "kref: double kref_put() in my_data_handler()"
driver core: don't initialize 'parent' in device_add()
drivers: base: dma-mapping: use nth_page helper
Documentation/ABI: add information about cpu_capacity
debugfs: set no_llseek in DEFINE_DEBUGFS_ATTRIBUTE
eventpoll.h: add missing epoll event masks
eventpoll.h: fix epoll event masks
|