Age | Commit message (Collapse) | Author | Files | Lines |
|
While adding handling for dying task group leaders c03cd7738a83
("cgroup: Include dying leaders with live threads in PROCS
iterations") added an inverted cset skip condition to
css_task_iter_advance_css_set(). It should skip cset if it's
completely empty but was incorrectly testing for the inverse condition
for the dying_tasks list. Fix it.
Signed-off-by: Tejun Heo <[email protected]>
Fixes: c03cd7738a83 ("cgroup: Include dying leaders with live threads in PROCS iterations")
Reported-by: [email protected]
|
|
So long as a struct hmm pointer exists, so should the struct mm it is
linked too. Hold the mmgrab() as soon as a hmm is created, and mmdrop() it
once the hmm refcount goes to zero.
Since mmdrop() (ie a 0 kref on struct mm) is now impossible with a !NULL
mm->hmm delete the hmm_hmm_destroy().
Signed-off-by: Jason Gunthorpe <[email protected]>
Reviewed-by: Jérôme Glisse <[email protected]>
Reviewed-by: John Hubbard <[email protected]>
Reviewed-by: Ralph Campbell <[email protected]>
Reviewed-by: Ira Weiny <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Tested-by: Philip Yang <[email protected]>
|
|
There's some discussion on how to do this the best, and Tejun prefers
that BFQ just create the file itself instead of having cgroups support
a symlink feature.
Hence revert commit 54b7b868e826 and 19e9da9e86c4 for 5.2, and this
can be done properly for 5.3.
Signed-off-by: Jens Axboe <[email protected]>
|
|
This adds the clone3 system call.
As mentioned several times already (cf. [7], [8]) here's the promised
patchset for clone3().
We recently merged the CLONE_PIDFD patchset (cf. [1]). It took the last
free flag from clone().
Independent of the CLONE_PIDFD patchset a time namespace has been discussed
at Linux Plumber Conference last year and has been sent out and reviewed
(cf. [5]). It is expected that it will go upstream in the not too distant
future. However, it relies on the addition of the CLONE_NEWTIME flag to
clone(). The only other good candidate - CLONE_DETACHED - is currently not
recyclable as we have identified at least two large or widely used
codebases that currently pass this flag (cf. [2], [3], and [4]). Given that
CLONE_PIDFD grabbed the last clone() flag the time namespace is effectively
blocked. clone3() has the advantage that it will unblock this patchset
again. In general, clone3() is extensible and allows for the implementation
of new features.
The idea is to keep clone3() very simple and close to the original clone(),
specifically, to keep on supporting old clone()-based workloads.
We know there have been various creative proposals how a new process
creation syscall or even api is supposed to look like. Some people even
going so far as to argue that the traditional fork()+exec() split should be
abandoned in favor of an in-kernel version of spawn(). Independent of
whether or not we personally think spawn() is a good idea this patchset has
and does not want to have anything to do with this.
One stance we take is that there's no real good alternative to
clone()+exec() and we need and want to support this model going forward;
independent of spawn().
The following requirements guided clone3():
- bump the number of available flags
- move arguments that are currently passed as separate arguments
in clone() into a dedicated struct clone_args
- choose a struct layout that is easy to handle on 32 and on 64 bit
- choose a struct layout that is extensible
- give new flags that currently need to abuse another flag's dedicated
return argument in clone() their own dedicated return argument
(e.g. CLONE_PIDFD)
- use a separate kernel internal struct kernel_clone_args that is
properly typed according to current kernel conventions in fork.c and is
different from the uapi struct clone_args
- port _do_fork() to use kernel_clone_args so that all process creation
syscalls such as fork(), vfork(), clone(), and clone3() behave identical
(Arnd suggested, that we can probably also port do_fork() itself in a
separate patchset.)
- ease of transition for userspace from clone() to clone3()
This very much means that we do *not* remove functionality that userspace
currently relies on as the latter is a good way of creating a syscall
that won't be adopted.
- do not try to be clever or complex: keep clone3() as dumb as possible
In accordance with Linus suggestions (cf. [11]), clone3() has the following
signature:
/* uapi */
struct clone_args {
__aligned_u64 flags;
__aligned_u64 pidfd;
__aligned_u64 child_tid;
__aligned_u64 parent_tid;
__aligned_u64 exit_signal;
__aligned_u64 stack;
__aligned_u64 stack_size;
__aligned_u64 tls;
};
/* kernel internal */
struct kernel_clone_args {
u64 flags;
int __user *pidfd;
int __user *child_tid;
int __user *parent_tid;
int exit_signal;
unsigned long stack;
unsigned long stack_size;
unsigned long tls;
};
long sys_clone3(struct clone_args __user *uargs, size_t size)
clone3() cleanly supports all of the supported flags from clone() and thus
all legacy workloads.
The advantage of sticking close to the old clone() is the low cost for
userspace to switch to this new api. Quite a lot of userspace apis (e.g.
pthreads) are based on the clone() syscall. With the new clone3() syscall
supporting all of the old workloads and opening up the ability to add new
features should make switching to it for userspace more appealing. In
essence, glibc can just write a simple wrapper to switch from clone() to
clone3().
There has been some interest in this patchset already. We have received a
patch from the CRIU corner for clone3() that would set the PID/TID of a
restored process without /proc/sys/kernel/ns_last_pid to eliminate a race.
/* User visible differences to legacy clone() */
- CLONE_DETACHED will cause EINVAL with clone3()
- CSIGNAL is deprecated
It is superseeded by a dedicated "exit_signal" argument in struct
clone_args freeing up space for additional flags.
This is based on a suggestion from Andrei and Linus (cf. [9] and [10])
/* References */
[1]: b3e5838252665ee4cfa76b82bdf1198dca81e5be
[2]: https://dxr.mozilla.org/mozilla-central/source/security/sandbox/linux/SandboxFilter.cpp#343
[3]: https://git.musl-libc.org/cgit/musl/tree/src/thread/pthread_create.c#n233
[4]: https://sources.debian.org/src/blcr/0.8.5-2.3/cr_module/cr_dump_self.c/?hl=740#L740
[5]: https://lore.kernel.org/lkml/[email protected]/
[6]: https://lore.kernel.org/lkml/[email protected]/
[7]: https://lore.kernel.org/lkml/CAHrFyr5HxpGXA2YrKza-oB-GGwJCqwPfyhD-Y5wbktWZdt0sGQ@mail.gmail.com/
[8]: https://lore.kernel.org/lkml/[email protected]/
[9]: https://lore.kernel.org/lkml/[email protected]/
[10]: https://lore.kernel.org/lkml/CAHk-=whQP-Ykxi=zSYaV9iXsHsENa+2fdj-zYKwyeyed63Lsfw@mail.gmail.com/
[11]: https://lore.kernel.org/lkml/CAHk-=wieuV4hGwznPsX-8E0G2FKhx3NjZ9X3dTKh5zKd+iqOBw@mail.gmail.com/
Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Christian Brauner <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
Acked-by: Serge Hallyn <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Pavel Emelyanov <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: David Howells <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Adrian Reber <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Andrei Vagin <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Florian Weimer <[email protected]>
Cc: [email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull yet more SPDX updates from Greg KH:
"Another round of SPDX header file fixes for 5.2-rc4
These are all more "GPL-2.0-or-later" or "GPL-2.0-only" tags being
added, based on the text in the files. We are slowly chipping away at
the 700+ different ways people tried to write the license text. All of
these were reviewed on the spdx mailing list by a number of different
people.
We now have over 60% of the kernel files covered with SPDX tags:
$ ./scripts/spdxcheck.py -v 2>&1 | grep Files
Files checked: 64533
Files with SPDX: 40392
Files with errors: 0
I think the majority of the "easy" fixups are now done, it's now the
start of the longer-tail of crazy variants to wade through"
* tag 'spdx-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (159 commits)
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 450
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 449
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 448
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 446
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 445
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 444
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 443
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 442
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 440
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 438
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 437
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 436
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 435
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 434
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 433
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 432
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 431
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 430
treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 429
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are some small char and misc driver fixes for 5.2-rc4 to resolve
a number of reported issues.
The most "notable" one here is the kernel headers in proc^Wsysfs
fixes. Those changes move the header file info into sysfs and fixes
the build issues that you reported.
Other than that, a bunch of small habanalabs driver fixes, some fpga
driver fixes, and a few other tiny driver fixes.
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
habanalabs: Read upper bits of trace buffer from RWPHI
habanalabs: Fix virtual address access via debugfs for 2MB pages
fpga: zynqmp-fpga: Correctly handle error pointer
habanalabs: fix bug in checking huge page optimization
habanalabs: Avoid using a non-initialized MMU cache mutex
habanalabs: fix debugfs code
uapi/habanalabs: add opcode for enable/disable device debug mode
habanalabs: halt debug engines on user process close
test_firmware: Use correct snprintf() limit
genwqe: Prevent an integer overflow in the ioctl
parport: Fix mem leak in parport_register_dev_model
fpga: dfl: expand minor range when registering chrdev region
fpga: dfl: Add lockdep classes for pdata->lock
fpga: dfl: afu: Pass the correct device to dma_mapping_error()
fpga: stratix10-soc: fix use-after-free on s10_init()
w1: ds2408: Fix typo after 49695ac46861 (reset on output_write retry with readback)
kheaders: Do not regenerate archive if config is not changed
kheaders: Move from proc to sysfs
lkdtm/bugs: Adjust recursion test to avoid elision
lkdtm/usercopy: Moves the KERNEL_DS test to non-canonical
|
|
Pull block fixes from Jens Axboe:
- Allow symlink from the bfq.weight cgroup parameter to the general
weight (Angelo)
- Damien is new skd maintainer (Bart)
- NVMe pull request from Sagi, with a few small fixes.
- Ensure we set DMA segment size properly, dma-debug is now tripping on
these (Christoph)
- Remove useless debugfs_create() return check (Greg)
- Remove redundant unlikely() check on IS_ERR() (Kefeng)
- Fixup request freeing on exit (Ming)
* tag 'for-linus-20190608' of git://git.kernel.dk/linux-block:
block, bfq: add weight symlink to the bfq.weight cgroup parameter
cgroup: let a symlink too be created with a cftype file
block: free sched's request pool in blk_cleanup_queue
nvme-rdma: use dynamic dma mapping per command
nvme: Fix u32 overflow in the number of namespace list calculation
mmc: also set max_segment_size in the device
mtip32xx: also set max_segment_size in the device
rsxx: don't call dma_set_max_seg_size
nvme-pci: don't limit DMA segement size
block: Drop unlikely before IS_ERR(_OR_NULL)
block: aoe: no need to check return value of debugfs_create functions
nvmet: fix data_len to 0 for bdev-backed write_zeroes
MAINTAINERS: Hand over skd maintainership
nvme-tcp: fix queue mapping when queue count is limited
nvme-rdma: fix queue mapping when queue count is limited
|
|
Daniel Borkmann says:
====================
pull-request: bpf 2019-06-07
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix several bugs in riscv64 JIT code emission which forgot to clear high
32-bits for alu32 ops, from Björn and Luke with selftests covering all
relevant BPF alu ops from Björn and Jiong.
2) Two fixes for UDP BPF reuseport that avoid calling the program in case of
__udp6_lib_err and UDP GRO which broke reuseport_select_sock() assumption
that skb->data is pointing to transport header, from Martin.
3) Two fixes for BPF sockmap: a use-after-free from sleep in psock's backlog
workqueue, and a missing restore of sk_write_space when psock gets dropped,
from Jakub and John.
4) Fix unconnected UDP sendmsg hook API which is insufficient as-is since it
breaks standard applications like DNS if reverse NAT is not performed upon
receive, from Daniel.
5) Fix an out-of-bounds read in __bpf_skc_lookup which in case of AF_INET6
fails to verify that the length of the tuple is long enough, from Lorenz.
6) Fix libbpf's libbpf__probe_raw_btf to return an fd instead of 0/1 (for
{un,}successful probe) as that is expected to be propagated as an fd to
load_sk_storage_btf() and thus closing the wrong descriptor otherwise,
from Michal.
7) Fix bpftool's JSON output for the case when a lookup fails, from Krzesimir.
8) Minor misc fixes in docs, samples and selftests, from various others.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix a crash during resume from hibernation introduced during the
4.19 cycle, cause the new Performance and Energy Bias Hint (EPB) code
to be built only if CONFIG_PM is set and add a few missing kerneldoc
comments.
Specifics:
- Fix a crash that occurs when a kernel with 'nosmt' in the command
line is used to resume the system from hibernation (as the
"restore" kernel), because memory mapping differences between the
restore and image kernels cause SMT siblings to be woken up from
idle states and subsequently they try to fetch instructions from
incorrect memory locations (Jiri Kosina).
- Cause the new Performance and Energy Bias Hint (EPB) code to be
built only if CONFIG_PM is set, because that code is not really
necessary otherwise (Rafael Wysocki).
- Add kerneldoc comments to documents some helper functions related
to system-wide suspend to avoid possible confusion regarding their
purpose (Rafael Wysocki)"
* tag 'pm-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
x86/power: Fix 'nosmt' vs hibernation triple fault during resume
PM: sleep: Add kerneldoc comments to some functions
x86: intel_epb: Do not build when CONFIG_PM is unset
|
|
Some ISDN files that got removed in net-next had some changes
done in mainline, take the removals.
Signed-off-by: David S. Miller <[email protected]>
|
|
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
struct module_sect_attrs {
...
struct module_sect_attr attrs[0];
};
Make use of the struct_size() helper instead of an open-coded version
in order to avoid any potential type mistakes.
So, replace the following form:
sizeof(*sect_attrs) + nloaded * sizeof(sect_attrs->attrs[0]
with:
struct_size(sect_attrs, attrs, nloaded)
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <[email protected]>
Signed-off-by: Jessica Yu <[email protected]>
|
|
* pm-x86:
x86/power: Fix 'nosmt' vs hibernation triple fault during resume
x86: intel_epb: Do not build when CONFIG_PM is unset
|
|
This commit enables a cftype to have a symlink (of any name) that
points to the file associated with the cftype.
Signed-off-by: Angelo Ruocco <[email protected]>
Signed-off-by: Paolo Valente <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Intention of cgroup bind/connect/sendmsg BPF hooks is to act transparently
to applications as also stated in original motivation in 7828f20e3779 ("Merge
branch 'bpf-cgroup-bind-connect'"). When recently integrating the latter
two hooks into Cilium to enable host based load-balancing with Kubernetes,
I ran into the issue that pods couldn't start up as DNS got broken. Kubernetes
typically sets up DNS as a service and is thus subject to load-balancing.
Upon further debugging, it turns out that the cgroupv2 sendmsg BPF hooks API
is currently insufficient and thus not usable as-is for standard applications
shipped with most distros. To break down the issue we ran into with a simple
example:
# cat /etc/resolv.conf
nameserver 147.75.207.207
nameserver 147.75.207.208
For the purpose of a simple test, we set up above IPs as service IPs and
transparently redirect traffic to a different DNS backend server for that
node:
# cilium service list
ID Frontend Backend
1 147.75.207.207:53 1 => 8.8.8.8:53
2 147.75.207.208:53 1 => 8.8.8.8:53
The attached BPF program is basically selecting one of the backends if the
service IP/port matches on the cgroup hook. DNS breaks here, because the
hooks are not transparent enough to applications which have built-in msg_name
address checks:
# nslookup 1.1.1.1
;; reply from unexpected source: 8.8.8.8#53, expected 147.75.207.207#53
;; reply from unexpected source: 8.8.8.8#53, expected 147.75.207.208#53
;; reply from unexpected source: 8.8.8.8#53, expected 147.75.207.207#53
[...]
;; connection timed out; no servers could be reached
# dig 1.1.1.1
;; reply from unexpected source: 8.8.8.8#53, expected 147.75.207.207#53
;; reply from unexpected source: 8.8.8.8#53, expected 147.75.207.208#53
;; reply from unexpected source: 8.8.8.8#53, expected 147.75.207.207#53
[...]
; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> 1.1.1.1
;; global options: +cmd
;; connection timed out; no servers could be reached
For comparison, if none of the service IPs is used, and we tell nslookup
to use 8.8.8.8 directly it works just fine, of course:
# nslookup 1.1.1.1 8.8.8.8
1.1.1.1.in-addr.arpa name = one.one.one.one.
In order to fix this and thus act more transparent to the application,
this needs reverse translation on recvmsg() side. A minimal fix for this
API is to add similar recvmsg() hooks behind the BPF cgroups static key
such that the program can track state and replace the current sockaddr_in{,6}
with the original service IP. From BPF side, this basically tracks the
service tuple plus socket cookie in an LRU map where the reverse NAT can
then be retrieved via map value as one example. Side-note: the BPF cgroups
static key should be converted to a per-hook static key in future.
Same example after this fix:
# cilium service list
ID Frontend Backend
1 147.75.207.207:53 1 => 8.8.8.8:53
2 147.75.207.208:53 1 => 8.8.8.8:53
Lookups work fine now:
# nslookup 1.1.1.1
1.1.1.1.in-addr.arpa name = one.one.one.one.
Authoritative answers can be found from:
# dig 1.1.1.1
; <<>> DiG 9.11.3-1ubuntu1.7-Ubuntu <<>> 1.1.1.1
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 51550
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;1.1.1.1. IN A
;; AUTHORITY SECTION:
. 23426 IN SOA a.root-servers.net. nstld.verisign-grs.com. 2019052001 1800 900 604800 86400
;; Query time: 17 msec
;; SERVER: 147.75.207.207#53(147.75.207.207)
;; WHEN: Tue May 21 12:59:38 UTC 2019
;; MSG SIZE rcvd: 111
And from an actual packet level it shows that we're using the back end
server when talking via 147.75.207.20{7,8} front end:
# tcpdump -i any udp
[...]
12:59:52.698732 IP foo.42011 > google-public-dns-a.google.com.domain: 18803+ PTR? 1.1.1.1.in-addr.arpa. (38)
12:59:52.698735 IP foo.42011 > google-public-dns-a.google.com.domain: 18803+ PTR? 1.1.1.1.in-addr.arpa. (38)
12:59:52.701208 IP google-public-dns-a.google.com.domain > foo.42011: 18803 1/0/0 PTR one.one.one.one. (67)
12:59:52.701208 IP google-public-dns-a.google.com.domain > foo.42011: 18803 1/0/0 PTR one.one.one.one. (67)
[...]
In order to be flexible and to have same semantics as in sendmsg BPF
programs, we only allow return codes in [1,1] range. In the sendmsg case
the program is called if msg->msg_name is present which can be the case
in both, connected and unconnected UDP.
The former only relies on the sockaddr_in{,6} passed via connect(2) if
passed msg->msg_name was NULL. Therefore, on recvmsg side, we act in similar
way to call into the BPF program whenever a non-NULL msg->msg_name was
passed independent of sk->sk_state being TCP_ESTABLISHED or not. Note
that for TCP case, the msg->msg_name is ignored in the regular recvmsg
path and therefore not relevant.
For the case of ip{,v6}_recv_error() paths, picked up via MSG_ERRQUEUE,
the hook is not called. This is intentional as it aligns with the same
semantics as in case of TCP cgroup BPF hooks right now. This might be
better addressed in future through a different bpf_attach_type such
that this case can be distinguished from the regular recvmsg paths,
for example.
Fixes: 1cedee13d25a ("bpf: Hooks for sys_sendmsg")
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Andrey Ignatov <[email protected]>
Acked-by: Martin KaFai Lau <[email protected]>
Acked-by: Martynas Pumputis <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
b636fd38dc40 ("cgroup: Implement css_task_iter_skip()") introduced
css_task_iter_skip() which is used to fix task iterations skipping
dying threadgroup leaders with live threads. Skipping is implemented
as a subportion of full advancing but css_task_iter_next() forgot to
fully advance a skipped iterator before determining the next task to
visit causing it to return invalid task pointers.
Fix it by making css_task_iter_next() fully advance the iterator if it
has been skipped since the previous iteration.
Signed-off-by: Tejun Heo <[email protected]>
Reported-by: syzbot
Link: http://lkml.kernel.org/r/[email protected]
Fixes: b636fd38dc40 ("cgroup: Implement css_task_iter_skip()")
|
|
While the TIF_SYSCALL_EMU is set in ptrace_resume independent of any
architecture, currently only powerpc and x86 unset the TIF_SYSCALL_EMU
flag in ptrace_disable which gets called from ptrace_detach.
Let's move the clearing of TIF_SYSCALL_EMU flag to __ptrace_unlink
which gets executed from ptrace_detach and also keep it along with
or close to clearing of TIF_SYSCALL_TRACE.
Cc: Paul Mackerras <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Ingo Molnar <[email protected]>
Acked-by: Oleg Nesterov <[email protected]>
Signed-off-by: Sudeep Holla <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
|
|
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license as published by
the free software foundation version 2 of the license
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 315 file(s).
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Allison Randal <[email protected]>
Reviewed-by: Armijn Hemel <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Based on 1 normalized pattern(s):
distributed under the terms of the gnu gpl version 2
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 2 file(s).
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Armijn Hemel <[email protected]>
Reviewed-by: Allison Randal <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Based on 1 normalized pattern(s):
distribute under gplv2
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 8 file(s).
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Armijn Hemel <[email protected]>
Reviewed-by: Allison Randal <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Based on 1 normalized pattern(s):
this file is released under the gplv2
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 68 file(s).
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Armijn Hemel <[email protected]>
Reviewed-by: Allison Randal <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Based on 1 normalized pattern(s):
released under terms in gpl version 2 see copying
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 5 file(s).
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Armijn Hemel <[email protected]>
Reviewed-by: Allison Randal <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation this program is
distributed in the hope that it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details you should have received a copy of the gnu general
public license along with this program if not write to the free
software foundation inc 59 temple place suite 330 boston ma 02111
1307 usa
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 136 file(s).
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Alexios Zavras <[email protected]>
Reviewed-by: Allison Randal <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms of version 2 of the gnu general public license as
published by the free software foundation this program is
distributed in the hope that it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 64 file(s).
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Alexios Zavras <[email protected]>
Reviewed-by: Allison Randal <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Based on 1 normalized pattern(s):
this software is licensed under the terms of the gnu general public
license version 2 as published by the free software foundation and
may be copied distributed and modified under those terms this
program is distributed in the hope that it will be useful but
without any warranty without even the implied warranty of
merchantability or fitness for a particular purpose see the gnu
general public license for more details
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 285 file(s).
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Alexios Zavras <[email protected]>
Reviewed-by: Allison Randal <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
The err_buf array uses 128 bytes of stack space. Move it off the stack
by making it static. It's safe to use a shared buffer because
klp_try_switch_task() is called under klp_mutex.
Acked-by: Miroslav Benes <[email protected]>
Acked-by: Josh Poimboeuf <[email protected]>
Reviewed-by: Kamalesh Babulal <[email protected]>
Signed-off-by: Petr Mladek <[email protected]>
|
|
Improve the comments for pidfd_send_signal().
First, the comment still referred to a file descriptor for a process as a
"task file descriptor" which stems from way back at the beginning of the
discussion. Replace this with "pidfd" for consistency.
Second, the wording for the explanation of the arguments to the syscall
was a bit inconsistent, e.g. some used the past tense some used present
tense. Make the wording more consistent.
Signed-off-by: Christian Brauner <[email protected]>
|
|
Microsoft HyperV disables the X86_FEATURE_SMCA bit on AMD systems, and
linux guests boot with repeated errors:
amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2)
amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2)
amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2)
amd64_edac_mod: Unknown symbol amd_unregister_ecc_decoder (err -2)
amd64_edac_mod: Unknown symbol amd_register_ecc_decoder (err -2)
amd64_edac_mod: Unknown symbol amd_report_gart_errors (err -2)
The warnings occur because the module code erroneously returns -EEXIST
for modules that have failed to load and are in the process of being
removed from the module list.
module amd64_edac_mod has a dependency on module edac_mce_amd. Using
modules.dep, systemd will load edac_mce_amd for every request of
amd64_edac_mod. When the edac_mce_amd module loads, the module has
state MODULE_STATE_UNFORMED and once the module load fails and the state
becomes MODULE_STATE_GOING. Another request for edac_mce_amd module
executes and add_unformed_module() will erroneously return -EEXIST even
though the previous instance of edac_mce_amd has MODULE_STATE_GOING.
Upon receiving -EEXIST, systemd attempts to load amd64_edac_mod, which
fails because of unknown symbols from edac_mce_amd.
add_unformed_module() must wait to return for any case other than
MODULE_STATE_LIVE to prevent a race between multiple loads of
dependent modules.
Signed-off-by: Prarit Bhargava <[email protected]>
Signed-off-by: Barret Rhoden <[email protected]>
Cc: David Arcari <[email protected]>
Cc: Jessica Yu <[email protected]>
Cc: Heiko Carstens <[email protected]>
Signed-off-by: Jessica Yu <[email protected]>
|
|
The variable err is assigned with the value -EINVAL that is never
read and it is re-assigned a new value later on. The assignment is
redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <[email protected]>
Acked-by: Jesper Dangaard Brouer <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
Also delete the dentry variable as it is never needed.
Acked-by: Peter Oberparleiter <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
A few architectures support uncached kernel segments. In that case we get
an uncached mapping for a given physica address by using an offset in the
uncached segement. Implement support for this scheme in the generic
dma-direct code instead of duplicating it in arch hooks.
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
The addresses within a single page are always contiguous, so it's
not so necessary to always allocate one single page from CMA area.
Since the CMA area has a limited predefined size of space, it may
run out of space in heavy use cases, where there might be quite a
lot CMA pages being allocated for single pages.
However, there is also a concern that a device might care where a
page comes from -- it might expect the page from CMA area and act
differently if the page doesn't.
This patch tries to use the fallback alloc_pages path, instead of
one-page size allocations from the global CMA area in case that a
device does not have its own CMA area. This'd save resources from
the CMA global area for more CMA allocations, and also reduce CMA
fragmentations resulted from trivial allocations.
Signed-off-by: Nicolin Chen <[email protected]>
Tested-by: dann frazier <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
Both dma_alloc_from_contiguous() and dma_release_from_contiguous() are
very simply implemented, but requiring callers to pass certain
parameters like count and align, and taking a boolean parameter to check
__GFP_NOWARN in the allocation flags. So every function call duplicates
similar work:
unsigned long order = get_order(size);
size_t count = size >> PAGE_SHIFT;
page = dma_alloc_from_contiguous(dev, count, order,
gfp & __GFP_NOWARN);
[...]
dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
Additionally, as CMA can be used only in the context which permits
sleeping, most of callers do a gfpflags_allow_blocking() check and a
corresponding fallback allocation of normal pages upon any false result:
if (gfpflags_allow_blocking(flag))
page = dma_alloc_from_contiguous();
if (!page)
page = alloc_pages();
[...]
if (!dma_release_from_contiguous(dev, page, count))
__free_pages(page, get_order(size));
So this patch simplifies those function calls by abstracting these
operations into the two new functions: dma_{alloc,free}_contiguous.
As some callers of dma_{alloc,release}_from_contiguous() might be
complicated, this patch just implements these two new functions to
kernel/dma/direct.c only as an initial step.
Suggested-by: Christoph Hellwig <[email protected]>
Signed-off-by: Nicolin Chen <[email protected]>
Tested-by: dann frazier <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
|
|
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
Cc: "Naveen N. Rao" <[email protected]>
Cc: Anil S Keshavamurthy <[email protected]>
Cc: "David S. Miller" <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
Cc: Kees Cook <[email protected]>
Cc: Josef Bacik <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: "Naveen N. Rao" <[email protected]>
Cc: zhong jiang <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
Cc: Jens Axboe <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Instead of playing silly games with CONFIG_DEBUG_PREEMPT toggling
between this_cpu_*() and __this_cpu_*() use raw_cpu_*(), which is
exactly what we want here.
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: huang ying <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The sequence
static DEFINE_WW_CLASS(test_ww_class);
struct ww_acquire_ctx ww_ctx;
struct ww_mutex ww_lock_a;
struct ww_mutex ww_lock_b;
struct ww_mutex ww_lock_c;
struct mutex lock_c;
ww_acquire_init(&ww_ctx, &test_ww_class);
ww_mutex_init(&ww_lock_a, &test_ww_class);
ww_mutex_init(&ww_lock_b, &test_ww_class);
ww_mutex_init(&ww_lock_c, &test_ww_class);
mutex_init(&lock_c);
ww_mutex_lock(&ww_lock_a, &ww_ctx);
mutex_lock(&lock_c);
ww_mutex_lock(&ww_lock_b, &ww_ctx);
ww_mutex_lock(&ww_lock_c, &ww_ctx);
mutex_unlock(&lock_c); (*)
ww_mutex_unlock(&ww_lock_c);
ww_mutex_unlock(&ww_lock_b);
ww_mutex_unlock(&ww_lock_a);
ww_acquire_fini(&ww_ctx); (**)
will trigger the following error in __lock_release() when calling
mutex_release() at **:
DEBUG_LOCKS_WARN_ON(depth <= 0)
The problem is that the hlock merging happening at * updates the
references for test_ww_class incorrectly to 3 whereas it should've
updated it to 4 (representing all the instances for ww_ctx and
ww_lock_[abc]).
Fix this by updating the references during merging correctly taking into
account that we can have non-zero references (both for the hlock that we
merge into another hlock or for the hlock we are merging into).
Signed-off-by: Imre Deak <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: =?UTF-8?q?Ville=20Syrj=C3=A4l=C3=A4?= <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The sequence
static DEFINE_WW_CLASS(test_ww_class);
struct ww_acquire_ctx ww_ctx;
struct ww_mutex ww_lock_a;
struct ww_mutex ww_lock_b;
struct mutex lock_c;
struct mutex lock_d;
ww_acquire_init(&ww_ctx, &test_ww_class);
ww_mutex_init(&ww_lock_a, &test_ww_class);
ww_mutex_init(&ww_lock_b, &test_ww_class);
mutex_init(&lock_c);
ww_mutex_lock(&ww_lock_a, &ww_ctx);
mutex_lock(&lock_c);
ww_mutex_lock(&ww_lock_b, &ww_ctx);
mutex_unlock(&lock_c); (*)
ww_mutex_unlock(&ww_lock_b);
ww_mutex_unlock(&ww_lock_a);
ww_acquire_fini(&ww_ctx);
triggers the following WARN in __lock_release() when doing the unlock at *:
DEBUG_LOCKS_WARN_ON(curr->lockdep_depth != depth - 1);
The problem is that the WARN check doesn't take into account the merging
of ww_lock_a and ww_lock_b which results in decreasing curr->lockdep_depth
by 2 not only 1.
Note that the following sequence doesn't trigger the WARN, since there
won't be any hlock merging.
ww_acquire_init(&ww_ctx, &test_ww_class);
ww_mutex_init(&ww_lock_a, &test_ww_class);
ww_mutex_init(&ww_lock_b, &test_ww_class);
mutex_init(&lock_c);
mutex_init(&lock_d);
ww_mutex_lock(&ww_lock_a, &ww_ctx);
mutex_lock(&lock_c);
mutex_lock(&lock_d);
ww_mutex_lock(&ww_lock_b, &ww_ctx);
mutex_unlock(&lock_d);
ww_mutex_unlock(&ww_lock_b);
ww_mutex_unlock(&ww_lock_a);
mutex_unlock(&lock_c);
ww_acquire_fini(&ww_ctx);
In general both of the above two sequences are valid and shouldn't
trigger any lockdep warning.
Fix this by taking the decrement due to the hlock merging into account
during lock release and hlock class re-setting. Merging can't happen
during lock downgrading since there won't be a new possibility to merge
hlocks in that case, so add a WARN if merging still happens then.
Signed-off-by: Imre Deak <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
As explained in
0cc3cd21657b ("cpu/hotplug: Boot HT siblings at least once")
we always, no matter what, have to bring up x86 HT siblings during boot at
least once in order to avoid first MCE bringing the system to its knees.
That means that whenever 'nosmt' is supplied on the kernel command-line,
all the HT siblings are as a result sitting in mwait or cpudile after
going through the online-offline cycle at least once.
This causes a serious issue though when a kernel, which saw 'nosmt' on its
commandline, is going to perform resume from hibernation: if the resume
from the hibernated image is successful, cr3 is flipped in order to point
to the address space of the kernel that is being resumed, which in turn
means that all the HT siblings are all of a sudden mwaiting on address
which is no longer valid.
That results in triple fault shortly after cr3 is switched, and machine
reboots.
Fix this by always waking up all the SMT siblings before initiating the
'restore from hibernation' process; this guarantees that all the HT
siblings will be properly carried over to the resumed kernel waiting in
resume_play_dead(), and acted upon accordingly afterwards, based on the
target kernel configuration.
Symmetricaly, the resumed kernel has to push the SMT siblings to mwait
again in case it has SMT disabled; this means it has to online all
the siblings when resuming (so that they come out of hlt) and offline
them again to let them reach mwait.
Cc: 4.19+ <[email protected]> # v4.19+
Debugged-by: Thomas Gleixner <[email protected]>
Fixes: 0cc3cd21657b ("cpu/hotplug: Boot HT siblings at least once")
Signed-off-by: Jiri Kosina <[email protected]>
Acked-by: Pavel Machek <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Reviewed-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
Adding attr_update attribute group into pmu, to allow
having multiple attribute groups for same group name.
This will allow us to update "events" or "format"
directories with attributes that depend on various
HW conditions.
For example having group_format_extra group that updates
"format" directory only if pmu version is 2 and higher:
static umode_t
exra_is_visible(struct kobject *kobj, struct attribute *attr, int i)
{
return x86_pmu.version >= 2 ? attr->mode : 0;
}
static struct attribute_group group_format_extra = {
.name = "format",
.is_visible = exra_is_visible,
};
Signed-off-by: Jiri Olsa <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Currently, non-privileged user could only use uprobe with
kernel.perf_event_paranoid = -1
However, setting perf_event_paranoid to -1 leaks other users' processes to
non-privileged uprobes.
To introduce proper permission control of uprobes, we are building the
following system:
A daemon with CAP_SYS_ADMIN is in charge to create uprobes via tracefs;
Users asks the daemon to create uprobes;
Then user can attach uprobe only to processes owned by the user.
This patch allows non-privileged user to attach uprobe to processes owned
by the user.
The following example shows how to use uprobe with non-privileged user.
This is based on Brendan's blog post [1]
1. Create uprobe with root:
sudo perf probe -x 'readline%return +0($retval):string'
2. Then non-root user can use the uprobe as:
perf record -vvv -e probe_bash:readline__return -p <pid> sleep 20
perf script
[1] http://www.brendangregg.com/blog/2015-06-28/linux-ftrace-uprobe.html
Signed-off-by: Song Liu <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
In mark_lock_irq(), the following checks are performed:
----------------------------------
| -> | unsafe | read unsafe |
|----------------------------------|
| safe | F B | F* B* |
|----------------------------------|
| read safe | F? B* | - |
----------------------------------
Where:
F: check_usage_forwards
B: check_usage_backwards
*: check enabled by STRICT_READ_CHECKS
?: check enabled by the !dir condition
From checking point of view, the special F? case does not make sense,
whereas it perhaps is made for peroformance concern. As later patch will
address this issue, remove this exception, which makes the checks
consistent later.
With STRICT_READ_CHECKS = 1 which is default, there is no functional
change.
Signed-off-by: Yuyang Du <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The new bit can be any possible lock usage except it is garbage, so the
cases in switch can be made simpler. Warn early on if wrong usage bit is
passed without taking locks. No functional change.
Signed-off-by: Yuyang Du <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Lock usage bit initialization is consolidated into one function
mark_usage(). Trivial readability improvement. No functional change.
Signed-off-by: Yuyang Du <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
As Peter has put it all sound and complete for the cause, I simply quote:
"It (check_redundant) was added for cross-release (which has since been
reverted) which would generate a lot of redundant links (IIRC) but
having it makes the reports more convoluted -- basically, if we had an
A-B-C relation, then A-C will not be added to the graph because it is
already covered. This then means any report will include B, even though
a shorter cycle might have been possible."
This would increase the number of direct dependencies. For a simple workload
(make clean; reboot; make vmlinux -j8), the data looks like this:
CONFIG_LOCKDEP_SMALL: direct dependencies: 6926
!CONFIG_LOCKDEP_SMALL: direct dependencies: 9052 (+30.7%)
Suggested-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Yuyang Du <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
These two functions now handle different check results themselves. A new
check_path function is added to check whether there is a path in the
dependency graph. No functional change.
Signed-off-by: Yuyang Du <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The @nested is not used in __release_lock so remove it despite that it
is not used in lock_release in the first place.
Signed-off-by: Yuyang Du <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
In check_deadlock(), the third argument read comes from the second
argument hlock so that it can be removed. No functional change.
Signed-off-by: Yuyang Du <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The breadth-first search is implemented as flat-out non-recursive now, but
the comments are still describing it as recursive, update the comments in
that regard.
Signed-off-by: Yuyang Du <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|