Age | Commit message (Collapse) | Author | Files | Lines |
|
oom_reaper relies on the mmap_sem for read to do its job. Many places
which might block readers have been converted to use down_write_killable
and that has reduced chances of the contention a lot. Some paths where
the mmap_sem is held for write can take other locks and they might
either be not prepared to fail due to fatal signal pending or too
impractical to be changed.
This patch introduces MMF_OOM_NOT_REAPABLE flag which gets set after the
first attempt to reap a task's mm fails. If the flag is present after
the failure then we set MMF_OOM_REAPED to hide this mm from the oom
killer completely so it can go and chose another victim.
As a result a risk of OOM deadlock when the oom victim would be blocked
indefinetly and so the oom killer cannot make any progress should be
mitigated considerably while we still try really hard to perform all
reclaim attempts and stay predictable in the behavior.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The 0-day robot has encountered the following:
Out of memory: Kill process 3914 (trinity-c0) score 167 or sacrifice child
Killed process 3914 (trinity-c0) total-vm:55864kB, anon-rss:1512kB, file-rss:1088kB, shmem-rss:25616kB
oom_reaper: reaped process 3914 (trinity-c0), now anon-rss:0kB, file-rss:0kB, shmem-rss:26488kB
oom_reaper: reaped process 3914 (trinity-c0), now anon-rss:0kB, file-rss:0kB, shmem-rss:26900kB
oom_reaper: reaped process 3914 (trinity-c0), now anon-rss:0kB, file-rss:0kB, shmem-rss:26900kB
oom_reaper: reaped process 3914 (trinity-c0), now anon-rss:0kB, file-rss:0kB, shmem-rss:27296kB
oom_reaper: reaped process 3914 (trinity-c0), now anon-rss:0kB, file-rss:0kB, shmem-rss:28148kB
oom_reaper is trying to reap the same task again and again.
This is possible only when the oom killer is bypassed because of
task_will_free_mem because we skip over tasks with MMF_OOM_REAPED
already set during select_bad_process. Teach task_will_free_mem to skip
over MMF_OOM_REAPED tasks as well because they will be unlikely to free
anything more.
Analyzed by Tetsuo Handa.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Cc: David Rientjes <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
task_will_free_mem is rather weak. It doesn't really tell whether the
task has chance to drop its mm. 98748bd72200 ("oom: consider
multi-threaded tasks in task_will_free_mem") made a first step into making
it more robust for multi-threaded applications so now we know that the
whole process is going down and probably drop the mm.
This patch builds on top for more complex scenarios where mm is shared
between different processes - CLONE_VM without CLONE_SIGHAND, or in kernel
use_mm().
Make sure that all processes sharing the mm are killed or exiting. This
will allow us to replace try_oom_reaper by wake_oom_reaper because
task_will_free_mem implies the task is reapable now. Therefore all paths
which bypass the oom killer are now reapable and so they shouldn't lock up
the oom killer.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Currently oom_kill_process skips both the oom reaper and SIG_KILL if a
process sharing the same mm is unkillable via OOM_ADJUST_MIN. After "mm,
oom_adj: make sure processes sharing mm have same view of oom_score_adj"
all such processes are sharing the same value so we shouldn't see such a
task at all (oom_badness would rule them out).
We can still encounter oom disabled vforked task which has to be killed as
well if we want to have other tasks sharing the mm reapable because it can
access the memory before doing exec. Killing such a task should be
acceptable because it is highly unlikely it has done anything useful
because it cannot modify any memory before it calls exec. An alternative
would be to keep the task alive and skip the oom reaper and risk all the
weird corner cases where the OOM killer cannot make forward progress
because the oom victim hung somewhere on the way to exit.
[[email protected] - drop printk when OOM_SCORE_ADJ_MIN killed task
the setting is inherently racy and we cannot do much about it without
introducing locks in hot paths]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
vforked tasks are not really sitting on any memory. They are sharing the
mm with parent until they exec into a new code. Until then it is just
pinning the address space. OOM killer will kill the vforked task along
with its parent but we still can end up selecting vforked task when the
parent wouldn't be selected. E.g. init doing vfork to launch a task or
vforked being a child of oom unkillable task with an updated oom_score_adj
to be killable.
Add a new helper to check whether a task is in the vfork sharing memory
with its parent and use it in oom_badness to skip over these tasks.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
oom_score_adj is shared for the thread groups (via struct signal) but this
is not sufficient to cover processes sharing mm (CLONE_VM without
CLONE_SIGHAND) and so we can easily end up in a situation when some
processes update their oom_score_adj and confuse the oom killer. In the
worst case some of those processes might hide from the oom killer
altogether via OOM_SCORE_ADJ_MIN while others are eligible. OOM killer
would then pick up those eligible but won't be allowed to kill others
sharing the same mm so the mm wouldn't release the mm and so the memory.
It would be ideal to have the oom_score_adj per mm_struct because that is
the natural entity OOM killer considers. But this will not work because
some programs are doing
vfork()
set_oom_adj()
exec()
We can achieve the same though. oom_score_adj write handler can set the
oom_score_adj for all processes sharing the same mm if the task is not in
the middle of vfork. As a result all the processes will share the same
oom_score_adj. The current implementation is rather pessimistic and
checks all the existing processes by default if there is more than 1
holder of the mm but we do not have any reliable way to check for external
users yet.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Currently we have two proc interfaces to set oom_score_adj. The legacy
/proc/<pid>/oom_adj and /proc/<pid>/oom_score_adj which both have their
specific handlers. Big part of the logic is duplicated so extract the
common code into __set_oom_adj helper. Legacy knob still expects some
details slightly different so make sure those are handled same way - e.g.
the legacy mode ignores oom_score_adj_min and it warns about the usage.
This patch shouldn't introduce any functional changes.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Oleg has pointed out that can simplify both oom_adj_{read,write} and
oom_score_adj_{read,write} even further and drop the sighand lock. The
main purpose of the lock was to protect p->signal from going away but this
will not happen since ea6d290ca34c ("signals: make task_struct->signal
immutable/refcountable").
The other role of the lock was to synchronize different writers,
especially those with CAP_SYS_RESOURCE. Introduce a mutex for this
purpose. Later patches will need this lock anyway.
Suggested-by: Oleg Nesterov <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Series "Handle oom bypass more gracefully", V5
The following 10 patches should put some order to very rare cases of mm
shared between processes and make the paths which bypass the oom killer
oom reapable and therefore much more reliable finally. Even though mm
shared outside of thread group is rare (either vforked tasks for a short
period, use_mm by kernel threads or exotic thread model of
clone(CLONE_VM) without CLONE_SIGHAND) it is better to cover them. Not
only it makes the current oom killer logic quite hard to follow and
reason about it can lead to weird corner cases. E.g. it is possible to
select an oom victim which shares the mm with unkillable process or
bypass the oom killer even when other processes sharing the mm are still
alive and other weird cases.
Patch 1 drops bogus task_lock and mm check from oom_{score_}adj_write.
This can be considered a bug fix with a low impact as nobody has noticed
for years.
Patch 2 drops sighand lock because it is not needed anymore as pointed
by Oleg.
Patch 3 is a clean up of oom_score_adj handling and a preparatory work
for later patches.
Patch 4 enforces oom_adj_score to be consistent between processes
sharing the mm to behave consistently with the regular thread groups.
This can be considered a user visible behavior change because one thread
group updating oom_score_adj will affect others which share the same mm
via clone(CLONE_VM). I argue that this should be acceptable because we
already have the same behavior for threads in the same thread group and
sharing the mm without signal struct is just a different model of
threading. This is probably the most controversial part of the series,
I would like to find some consensus here. There were some suggestions
to hook some counter/oom_score_adj into the mm_struct but I feel that
this is not necessary right now and we can rely on proc handler +
oom_kill_process to DTRT. I can be convinced otherwise but I strongly
think that whatever we do the userspace has to have a way to see the
current oom priority as consistently as possible.
Patch 5 makes sure that no vforked task is selected if it is sharing the
mm with oom unkillable task.
Patch 6 ensures that all user tasks sharing the mm are killed which in
turn makes sure that all oom victims are oom reapable.
Patch 7 guarantees that task_will_free_mem will always imply reapable
bypass of the oom killer.
Patch 8 is new in this version and it addresses an issue pointed out by
0-day OOM report where an oom victim was reaped several times.
Patch 9 puts an upper bound on how many times oom_reaper tries to reap a
task and hides it from the oom killer to move on when no progress can be
made. This will give an upper bound to how long an oom_reapable task
can block the oom killer from selecting another victim if the oom_reaper
is not able to reap the victim.
Patch 10 tries to plug the (hopefully) last hole when we can still lock
up when the oom victim is shared with oom unkillable tasks (kthreads and
global init). We just try to be best effort in that case and rather
fallback to kill something else than risk a lockup.
This patch (of 10):
Both oom_adj_write and oom_score_adj_write are using task_lock, check for
task->mm and fail if it is NULL. This is not needed because the
oom_score_adj is per signal struct so we do not need mm at all. The code
has been introduced by 3d5992d2ac7d ("oom: add per-mm oom disable count")
but we do not do per-mm oom disable since c9f01245b6a7 ("oom: remove
oom_disable_count").
The task->mm check is even not correct because the current thread might
have exited but the thread group might be still alive - e.g. thread group
leader would lead that echo $VAL > /proc/pid/oom_score_adj would always
fail with EINVAL while /proc/pid/task/$other_tid/oom_score_adj would
succeed. This is unexpected at best.
Remove the lock along with the check to fix the unexpected behavior and
also because there is not real need for the lock in the first place.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Reviewed-by: Vladimir Davydov <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Tetsuo Handa <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Pull dmaengine updates from Vinod Koul:
"This time we have bit of largish changes: two new drivers, bunch of
updates and cleanups to existing set. Nothing super exciting though.
New drivers:
- Xilinx zynqmp dma engine driver
- Marvell xor2 driver
Updates:
- dmatest sg support
- updates and enhancements to Xilinx drivers, adding of cyclic mode
- clock handling fixes across drivers
- removal of OOM messages on kzalloc across subsystem
- interleaved transfers support in omap driver
- runtime pm support in qcom bam dma
- tasklet kill freeup across drivers
- irq cleanup on remove across drivers"
* tag 'dmaengine-4.8-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (94 commits)
dmaengine: k3dma: add missing clk_disable_unprepare() on error in k3_dma_probe()
dmaengine: zynqmp_dma: add missing MODULE_LICENSE
dmaengine: qcom_hidma: use for_each_matching_node() macro
dmaengine: zynqmp_dma: Fix static checker warning
dmaengine: omap-dma: Support for interleaved transfer
dmaengine: ioat: statify symbol
dmaengine: pxa_dma: implement device_synchronize
dmaengine: imx-sdma: remove assignment never used
dmaengine: imx-sdma: remove dummy assignment
dmaengine: cppi: remove unused and bogus check
dmaengine: qcom_hidma_lli: kill the tasklets upon exit
dmaengine: pxa_dma: remove owner assignment
dmaengine: fsl_raid: remove owner assignment
dmaengine: coh901318: remove owner assignment
dmaengine: qcom_hidma: kill the tasklets upon exit
dmaengine: txx9dmac: explicitly freeup irq
dmaengine: sirf-dma: kill the tasklets upon exit
dmaengine: s3c24xx: kill the tasklets upon exit
dmaengine: s3c24xx: explicitly freeup irq
dmaengine: pl330: explicitly freeup irq
...
|
|
Pull hwspinlock updates from Bjorn Andersson:
"Add missing of_node_put() in the Qualcomm driver and update
MAINTAINERS to make sure all hwspinlock related files have a
maintainer listed"
* tag 'hwlock-v4.8' of git://github.com/andersson/remoteproc:
MAINTAINERS: Update hwspinlock paths
hwspinlock: qcom_hwspinlock: add missing of_node_put after calling of_parse_phandle
|
|
Pull remoteproc updates from Bjorn Andersson:
"Introduce remoteproc driver for controlling the modem/DSP Hexagon CPU
found in a multitude of Qualcomm platform.
Also cleans up a race condition/potential leak during registration of
remoteprocs and includes devicetree bindings in the MAINTAINERS entry"
* tag 'rproc-v4.8' of git://github.com/andersson/remoteproc:
remoteproc: qcom: hexagon: Clean up mpss validation
remoteproc: qcom: remove redundant dev_err call in q6v5_init_mem()
remoteproc: qcom: Driver for the self-authenticating Hexagon v5
dt-binding: remoteproc: Introduce Hexagon loader binding
remoteproc: Fix potential race condition in rproc_add
MAINTAINERS: Add file patterns for remoteproc device tree bindings
|
|
This prevents a double-fetch from user space that can lead to to an
undersized allocation and heap overflow.
Fixes: 54dbc1517237 ("vfs: hoist the btrfs deduplication ioctl to the vfs")
Signed-off-by: Scott Bauer <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID updates from Jiri Kosina:
- new hid-alps driver for ALPS Touchpad-Stick device, from Masaki Ota
- much improved and generalized HID led handling, and merge of
specialized hid-thingm driver into this generic hid-led one, from
Heiner Kallweit
- i2c-hid power management improvements from Fu Zhonghui and Guohua
Zhong
- uhid initialization race fix from Roderick Colenbrander
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (21 commits)
HID: add usb device id for Apple Magic Keyboard
HID: hid-led: fix Delcom support on big endian systems
HID: hid-led: add support for Greynut Luxafor
HID: hid-led: add support for Delcom Visual Signal Indicator G2
HID: hid-led: remove report id from struct hidled_config
HID: alps: a few cleanups
HID: remove ThingM blink(1) driver
HID: hid-led: add support for ThingM blink(1)
HID: hid-led: add support for reading from LED devices
HID: hid-led: add support for devices with multiple independent LEDs
HID: i2c-hid: set power sleep before shutdown
HID: alps: match alps devices in core
HID: thingm: simplify debug output code
HID: alps: pass correct sizes to hid_hw_raw_request()
HID: alps: struct u1_dev *priv is internal to the driver
HID: add Alps I2C HID Touchpad-Stick support
HID: led: fix config
usb: misc: remove outdated USB LED driver
HID: migrate USB LED driver from usb misc to hid
HID: i2c_hid: enable i2c-hid devices to suspend/resume asynchronously
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
fat: fix error message for bogus number of directory entries
fat: fix typo s/supeblock/superblock/
ASoC: max9877: Remove unused function declaration
dw2102: don't output spurious blank lines to the kernel log
init: fix Kconfig text
ARM: io: fix comment grammar
ocfs: fix ocfs2_xattr_user_get() argument name
scsi/qla2xxx: Remove erroneous unused macro qla82xx_get_temp_val1()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull quota update from Jan Kara:
"time64 support for quota"
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
quota: use time64_t internally
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random
Pull random driver fix from Ted Ts'o:
"Fix a boot failure on systems with non-contiguous NUMA id's"
* tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
random: use for_each_online_node() to iterate over NUMA nodes
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs updates from Al Viro:
"Assorted cleanups and fixes.
Probably the most interesting part long-term is ->d_init() - that will
have a bunch of followups in (at least) ceph and lustre, but we'll
need to sort the barrier-related rules before it can get used for
really non-trivial stuff.
Another fun thing is the merge of ->d_iput() callers (dentry_iput()
and dentry_unlink_inode()) and a bunch of ->d_compare() ones (all
except the one in __d_lookup_lru())"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits)
fs/dcache.c: avoid soft-lockup in dput()
vfs: new d_init method
vfs: Update lookup_dcache() comment
bdev: get rid of ->bd_inodes
Remove last traces of ->sync_page
new helper: d_same_name()
dentry_cmp(): use lockless_dereference() instead of smp_read_barrier_depends()
vfs: clean up documentation
vfs: document ->d_real()
vfs: merge .d_select_inode() into .d_real()
unify dentry_iput() and dentry_unlink_inode()
binfmt_misc: ->s_root is not going anywhere
drop redundant ->owner initializations
ufs: get rid of redundant checks
orangefs: constify inode_operations
missed comment updates from ->direct_IO() prototype change
file_inode(f)->i_mapping is f->f_mapping
trim fsnotify hooks a bit
9p: new helper - v9fs_parent_fid()
debugfs: ->d_parent is never NULL or negative
...
|
|
LTP madvise05 was generating mm splat
| [ARCLinux]# /sd/ltp/testcases/bin/madvise05
| BUG: Bad page map in process madvise05 pte:80e08211 pmd:9f7d4000
| page:9fdcfc90 count:1 mapcount:-1 mapping: (null) index:0x0 flags: 0x404(referenced|reserved)
| page dumped because: bad pte
| addr:200b8000 vm_flags:00000070 anon_vma: (null) mapping: (null) index:1005c
| file: (null) fault: (null) mmap: (null) readpage: (null)
| CPU: 2 PID: 6707 Comm: madvise05
And for newer kernels, the system was rendered unusable afterwards.
The problem was mprotect->pte_modify() clearing PTE_SPECIAL (which is
set to identify the special zero page wired to the pte).
When pte was finally unmapped, special casing for zero page was not
done, and instead it was treated as a "normal" page, tripping on the
map counts etc.
This fixes ARC STAR 9001053308
Cc: <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
|
|
This changes the vfs dentry hashing to mix in the parent pointer at the
_beginning_ of the hash, rather than at the end.
That actually improves both the hash and the code generation, because we
can move more of the computation to the "static" part of the dcache
setup, and do less at lookup runtime.
It turns out that a lot of other hash users also really wanted to mix in
a base pointer as a 'salt' for the hash, and so the slightly extended
interface ends up working well for other cases too.
Users that want a string hash that is purely about the string pass in a
'salt' pointer of NULL.
* merge branch 'salted-string-hash':
fs/dcache.c: Save one 32-bit multiply in dcache lookup
vfs: make the string hashes salt the hash
|
|
A LAYOUTCOMMIT then subsequent GETATTR may both return the same attributes,
and in that case NFS_INO_INVALID_ATTR is never set on the second pass
through nfs_update_inode(). The existing check to skip the clearing of
NFS_INO_INVALID_ATTR if a LAYOUTCOMMIT is outstanding does not help in this
case (see commit 10b7e9ad4488: "pNFS: Don't mark the inode as revalidated
if a LAYOUTCOMMIT is outstanding"). We know that if a LAYOUTCOMMIT is
outstanding then attributes will need upating, so always set
NFS_INO_INVALID_ATTR.
Signed-off-by: Benjamin Coddington <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
|
|
On the tear-down path, the dead CPU callback for the timers was
misplaced within the 'cpuhp_state' enumeration. There is a hidden
dependency between the timers and block multiqueue. The timers
callback must happen before the block multiqueue callback otherwise a
RCU stall occurs.
Move the timers callback to the proper place in the state machine.
Reported-and-tested-by: Jon Hunter <[email protected]>
Reported-by: kbuild test robot <[email protected]>
Fixes: 24f73b99716a ("timers/core: Convert to hotplug state machine")
Signed-off-by: Richard Cochran <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Rasmus Villemoes <[email protected]>
Cc: John Stultz <[email protected]>
Cc: [email protected]
Cc: Oleg Nesterov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
|
|
The md device might not have personality (for example, ddf raid array). The
issue is introduced by 8430e7e0af9a15(md: disconnect device from personality
before trying to remove it)
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Shaohua Li <[email protected]>
|
|
Fix format and type mismatches in a couple debug prints in the
Broadcom PDC driver. Use %pad for dma_addr_t and %pa for
resource_size_t.
Signed-off-by: Rob Rice <[email protected]>
Reported-by: Fengguang Wu <[email protected]>
Signed-off-by: Jassi Brar <[email protected]>
|
|
|
|
Some of the checks didn't handle frev 2 tables properly.
amdgpu doesn't support any tables pre-frev 2, so drop
the checks.
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
Some of the checks didn't handle frev 2 tables properly.
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
The genksyms helper in the kernel cannot parse a type definition
like "typeof(((type *)0)->keyfld)" that is used in the DEFINE_RB_FUNCS
helper, causing the following EXPORT_SYMBOL() statement to be ignored
when computing the crcs, and triggering a warning about this:
WARNING: "ceph_monc_do_statfs" [fs/ceph/ceph.ko] has no CRC
To work around the problem, we can rewrite the type to reference
an undefined 'extern' symbol instead of a NULL pointer. This is
evidently ok for genksyms, and it no longer complains about the
line when calling it with 'genksyms -w'.
I've looked briefly into extending genksyms instead, but it seems
really hard to do. Jan Beulich introduced basic support for 'typeof'
a while ago in dc53324060f3 ("genksyms: fix typeof() handling"),
but that is not sufficient for the expression we have here.
Signed-off-by: Arnd Bergmann <[email protected]>
Fixes: fcd00b68bbe2 ("libceph: DEFINE_RB_FUNCS macro")
Cc: Jan Beulich <[email protected]>
Cc: Michal Marek <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>
|
|
Fix to return error code -EINVAL from the error handling
case instead of 0, as done elsewhere in this function.
Fixes: 3c31760e760c ('drm/arm: mali-dp: Set crtc.port to the port
instead of the endpoint')
Signed-off-by: Wei Yongjun <[email protected]>
Acked-by: Brian Starkey <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
There is a error message within devm_ioremap_resource
already, so remove the DRM_ERROR call to avoid redundant
error message.
Signed-off-by: Wei Yongjun <[email protected]>
Acked-by: Liviu Dudau <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Conflicts:
drivers/hid/hid-thingm.c
|
|
'for-4.8/uhid-offload-hid-device-add' and 'for-4.8/upstream' into for-linus
|
|
Stub implementation of fb_ioctl can be omitted, because function
do_fb_ioctl already returns -ENOTTY when fb_ioctl is not assigned.
Signed-off-by: Stefan Christ <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Convert asciidoc-formatted docs to rst in accordance with Jonathan's and
Jani's effort to use sphinx for kernel-doc rendering in 4.8.
Cc: Jonathan Corbet <[email protected]>
Cc: Jani Nikula <[email protected]>
Signed-off-by: Lukas Wunner <[email protected]>
Acked-by: Darren Hart <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/4c1b29986fa77772156b1af0c965d3799e43a47b.1467628307.git.lukas@wunner.de
|
|
It turns out that if the guest does a H_CEDE while the CPU is in
a transactional state, and the H_CEDE does a nap, and the nap
loses the architected state of the CPU (which is is allowed to do),
then we lose the checkpointed state of the virtual CPU. In addition,
the transactional-memory state recorded in the MSR gets reset back
to non-transactional, and when we try to return to the guest, we take
a TM bad thing type of program interrupt because we are trying to
transition from non-transactional to transactional with a hrfid
instruction, which is not permitted.
The result of the program interrupt occurring at that point is that
the host CPU will hang in an infinite loop with interrupts disabled.
Thus this is a denial of service vulnerability in the host which can
be triggered by any guest (and depending on the guest kernel, it can
potentially triggered by unprivileged userspace in the guest).
This vulnerability has been assigned the ID CVE-2016-5412.
To fix this, we save the TM state before napping and restore it
on exit from the nap, when handling a H_CEDE in real mode. The
case where H_CEDE exits to host virtual mode is already OK (as are
other hcalls which exit to host virtual mode) because the exit
path saves the TM state.
Cc: [email protected] # v3.15+
Signed-off-by: Paul Mackerras <[email protected]>
|
|
This moves the transactional memory state save and restore sequences
out of the guest entry/exit paths into separate procedures. This is
so that these sequences can be used in going into and out of nap
in a subsequent patch.
The only code changes here are (a) saving and restore LR on the
stack, since these new procedures get called with a bl instruction,
(b) explicitly saving r1 into the PACA instead of assuming that
HSTATE_HOST_R1(r13) is already set, and (c) removing an unnecessary
and redundant setting of MSR[TM] that should have been removed by
commit 9d4d0bdd9e0a ("KVM: PPC: Book3S HV: Add transactional memory
support", 2013-09-24) but wasn't.
Cc: [email protected] # v3.15+
Signed-off-by: Paul Mackerras <[email protected]>
|
|
We accidentally take the "port->lock" twice in a row. This old code
was supposed to be deleted.
Fixes: e58e241c1788 ('sparc: serial: Clean up the locking for -rt')
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Smatch complains that these tests are off by one, which is true but not
life threatening.
arch/sparc/kernel/irq_32.c:169 irq_link()
error: buffer overflow 'irq_map' 384 <= 384
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
So far, the cache of the ahash requests was updated from the 'complete'
operation. This complete operation is called from mv_cesa_tdma_process
before the cleanup operation, which means that the content of req->src
can be read and copied when it is still mapped. This commit fixes the
issue by moving this cache update from mv_cesa_ahash_complete to
mv_cesa_ahash_req_cleanup, so the copy is done once the sglist is
unmapped.
Fixes: 1bf6682cb31d ("crypto: marvell - Add a complete operation for..")
Signed-off-by: Romain Perier <[email protected]>
Acked-by: Boris Brezillon <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
|
|
The flag CRYPTO_TFM_REQ_MAY_BACKLOG is optional and can be set from the
user to put requests into the backlog queue when the main cryptographic
queue is full. Before calling mv_cesa_tdma_chain we must check the value
of the return status to be sure that the current request has been
correctly queued or added to the backlog.
Fixes: 85030c5168f1 ("crypto: marvell - Add support for chaining...")
Signed-off-by: Romain Perier <[email protected]>
Acked-by: Boris Brezillon <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
|
|
So far in mv_cesa_ablkcipher_dma_req_init, if an error is thrown while
the tdma chain is built there is a memory leak. This issue exists
because the chain is assigned later at the end of the function, so the
cleanup function is called with the wrong version of the chain.
Fixes: db509a45339f ("crypto: marvell/cesa - add TDMA support")
Signed-off-by: Romain Perier <[email protected]>
Acked-by: Boris Brezillon <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
|
|
|
|
The Broadcom PDC mailbox driver is a mailbox controller that
manages data transfers to and from one or more offload engines.
Signed-off-by: Rob Rice <[email protected]>
Reviewed-by: Scott Branden <[email protected]>
Reviewed-by: Ray Jui <[email protected]>
Signed-off-by: Jassi Brar <[email protected]>
|
|
Add the device tree binding documentation for the PDC hardware
in Broadcom iProc SoCs.
Signed-off-by: Rob Rice <[email protected]>
Acked-by: Rob Herring <[email protected]>
Reviewed-by: Ray Jui <[email protected]>
Reviewed-by: Anup Patel <[email protected]>
Reviewed-by: Scott Branden <[email protected]>
Signed-off-by: Jassi Brar <[email protected]>
|
|
During following a symbolic link we received err_buf from SMB2_open().
While the validity of SMB2 error response is checked previously
in smb2_check_message() a symbolic link payload is not checked at all.
Fix it by adding such checks.
Cc: Dan Carpenter <[email protected]>
CC: Stable <[email protected]>
Signed-off-by: Pavel Shilovsky <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
if, when mounting //HOST/share/sub/dir/foo we can query /sub/dir/foo but
not any of the path components above:
- store the /sub/dir/foo prefix in the cifs super_block info
- in the superblock, set root dentry to the subpath dentry (instead of
the share root)
- set a flag in the superblock to remember it
- use prefixpath when building path from a dentry
fixes bso#8950
Signed-off-by: Aurelien Aptel <[email protected]>
CC: Stable <[email protected]>
Reviewed-by: Pavel Shilovsky <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
This fixes a crash on s390 with fake NUMA enabled.
Reported-by: Heiko Carstens <[email protected]>
Fixes: 1e7f583af67b ("random: make /dev/urandom scalable for silly userspace programs")
Signed-off-by: Theodore Ts'o <[email protected]>
|
|
Some of our "for_each_xyz()" macro constructs make gcc unhappy about
lack of braces around if-statements inside or outside the loop, because
the loop construct itself has a "if-then-else" statement inside of it.
The resulting warnings look something like this:
drivers/gpu/drm/i915/i915_debugfs.c: In function ‘i915_dump_lrc’:
drivers/gpu/drm/i915/i915_debugfs.c:2103:6: warning: suggest explicit braces to avoid ambiguous ‘else’ [-Wparentheses]
if (ctx != dev_priv->kernel_context)
^
even if the code itself is fine.
Since the warning is fairly easy to avoid by adding a braces around the
if-statement near the for_each_xyz() construct, do so, rather than
disabling the otherwise potentially useful warning.
(The if-then-else statements used in the "for_each_xyz()" constructs are
designed to be inherently safe even with no braces, but in this case
it's quite understandable that gcc isn't really able to tell that).
This finally leaves the standard "allmodconfig" build with just a
handful of remaining warnings, so new and valid warnings hopefully will
stand out.
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Currently IS_ENABLED() produces an expression surrounded by parentheses,
which allows this code to compile, generating eg:
else if (1 || 0)
hpte_init_native();
However a change to the macro in the kbuild tree will break this in
future by removing the parentheses.
Fixes: 7353644fa9df ("powerpc/mm: Fix build break when PPC_NATIVE=n")
Signed-off-by: Stephen Rothwell <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|