Age | Commit message (Collapse) | Author | Files | Lines |
|
In symbol__disassemble(), DSO_BINARY_TYPE__BPF_PROG_INFO dso calls into
a new function symbol__disassemble_bpf(), where annotation line
information is filled based on the bpf_prog_info and btf data saved in
given perf_env.
symbol__disassemble_bpf() uses binutils's libopcodes to disassemble bpf
programs.
Committer testing:
After fixing this:
- u64 *addrs = (u64 *)(info_linear->info.jited_ksyms);
+ u64 *addrs = (u64 *)(uintptr_t)(info_linear->info.jited_ksyms);
Detected when crossbuilding to a 32-bit arch.
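The cast fix above can be illustrated with a minimal userspace sketch. The
struct and helper names here are hypothetical stand-ins, not the perf or
libbpf definitions; the only point is that a pointer carried in a 64-bit
integer field should go through uintptr_t so it is narrowed to pointer
width before being turned back into a pointer:

```c
#include <stdint.h>

/* Hypothetical stand-in for a struct that carries a pointer in a u64 field. */
struct prog_info_sketch {
	uint64_t jited_ksyms;	/* pointer value stored as a 64-bit integer */
};

/* Store a pointer into a 64-bit integer field. */
static uint64_t ptr_to_u64_sketch(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/*
 * Recover the pointer. Casting through uintptr_t first converts the
 * 64-bit value to an integer of pointer width, which avoids the
 * int-to-pointer size mismatch a direct cast produces on 32-bit builds.
 */
static uint64_t *ksyms_from_info(const struct prog_info_sketch *info)
{
	return (uint64_t *)(uintptr_t)info->jited_ksyms;
}
```

On a 64-bit build both casts are no-ops, which is why this kind of problem
tends to show up only when cross-building.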
And making all this dependent on HAVE_LIBBFD_SUPPORT and
HAVE_LIBBPF_SUPPORT:
1) Have a BPF program running, one that has BTF info, etc. I used
the tools/perf/examples/bpf/augmented_raw_syscalls.c put in place
by 'perf trace'.
# grep -B1 augmented_raw ~/.perfconfig
[trace]
add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c
#
# perf trace -e *mmsg
dnf/6245 sendmmsg(20, 0x7f5485a88030, 2, MSG_NOSIGNAL) = 2
NetworkManager/10055 sendmmsg(22<socket:[1056822]>, 0x7f8126ad1bb0, 2, MSG_NOSIGNAL) = 2
2) Then do a 'perf record' system wide for a while:
# perf record -a
^C[ perf record: Woken up 68 times to write data ]
[ perf record: Captured and wrote 19.427 MB perf.data (366891 samples) ]
#
3) Check that we captured BPF and BTF info in the perf.data file:
# perf report --header-only | grep 'b[pt]f'
# event : name = cycles:ppp, , id = { 294789, 294790, 294791, 294792, 294793, 294794, 294795, 294796 }, size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, read_format = ID, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, ksymbol = 1, bpf_event = 1
# bpf_prog_info of id 13
# bpf_prog_info of id 14
# bpf_prog_info of id 15
# bpf_prog_info of id 16
# bpf_prog_info of id 17
# bpf_prog_info of id 18
# bpf_prog_info of id 21
# bpf_prog_info of id 22
# bpf_prog_info of id 41
# bpf_prog_info of id 42
# btf info of id 2
#
4) Check which programs got recorded:
# perf report | grep bpf_prog | head
0.16% exe bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter
0.14% exe bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
0.08% fuse-overlayfs bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter
0.07% fuse-overlayfs bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
0.01% clang-4.0 bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
0.01% clang-4.0 bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter
0.00% clang bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
0.00% runc bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter
0.00% clang bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter
0.00% sh bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
#
This was with the default --sort order for 'perf report', which is:
--sort comm,dso,symbol
If we just look for the symbol, for instance:
# perf report --sort symbol | grep bpf_prog | head
0.26% [k] bpf_prog_819967866022f1e1_sys_enter - -
0.24% [k] bpf_prog_c1bd85c092d6e4aa_sys_exit - -
#
or the DSO:
# perf report --sort dso | grep bpf_prog | head
0.26% bpf_prog_819967866022f1e1_sys_enter
0.24% bpf_prog_c1bd85c092d6e4aa_sys_exit
#
We'll see the two BPF programs that augmented_raw_syscalls.o puts in
place, one attached to the raw_syscalls:sys_enter and another to the
raw_syscalls:sys_exit tracepoints, as expected.
Now we can finally do, from the command line, annotation for one of
those two symbols, with the original BPF program source code intermixed
with the disassembled JITed code:
# perf annotate --stdio2 bpf_prog_819967866022f1e1_sys_enter
Samples: 950 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 553756947, [percent: local period]
bpf_prog_819967866022f1e1_sys_enter() bpf_prog_819967866022f1e1_sys_enter
Percent int sys_enter(struct syscall_enter_args *args)
53.41 push %rbp
0.63 mov %rsp,%rbp
0.31 sub $0x170,%rsp
1.93 sub $0x28,%rbp
7.02 mov %rbx,0x0(%rbp)
3.20 mov %r13,0x8(%rbp)
1.07 mov %r14,0x10(%rbp)
0.61 mov %r15,0x18(%rbp)
0.11 xor %eax,%eax
1.29 mov %rax,0x20(%rbp)
0.11 mov %rdi,%rbx
return bpf_get_current_pid_tgid();
2.02 → callq *ffffffffda6776d9
2.76 mov %eax,-0x148(%rbp)
mov %rbp,%rsi
int sys_enter(struct syscall_enter_args *args)
add $0xfffffffffffffeb8,%rsi
return bpf_map_lookup_elem(pids, &pid) != NULL;
movabs $0xffff975ac2607800,%rdi
1.26 → callq *ffffffffda6789e9
cmp $0x0,%rax
2.43 → je 0
add $0x38,%rax
0.21 xor %r13d,%r13d
if (pid_filter__has(&pids_filtered, getpid()))
0.81 cmp $0x0,%rax
→ jne 0
mov %rbp,%rdi
probe_read(&augmented_args.args, sizeof(augmented_args.args), args);
2.22 add $0xfffffffffffffeb8,%rdi
0.11 mov $0x40,%esi
0.32 mov %rbx,%rdx
2.74 → callq *ffffffffda658409
syscall = bpf_map_lookup_elem(&syscalls, &augmented_args.args.syscall_nr);
0.22 mov %rbp,%rsi
1.69 add $0xfffffffffffffec0,%rsi
syscall = bpf_map_lookup_elem(&syscalls, &augmented_args.args.syscall_nr);
movabs $0xffff975bfcd36000,%rdi
add $0xd0,%rdi
0.21 mov 0x0(%rsi),%eax
0.93 cmp $0x200,%rax
→ jae 0
0.10 shl $0x3,%rax
0.11 add %rdi,%rax
0.11 → jmp 0
xor %eax,%eax
if (syscall == NULL || !syscall->enabled)
1.07 cmp $0x0,%rax
→ je 0
if (syscall == NULL || !syscall->enabled)
6.57 movzbq 0x0(%rax),%rdi
if (syscall == NULL || !syscall->enabled)
cmp $0x0,%rdi
0.95 → je 0
mov $0x40,%r8d
switch (augmented_args.args.syscall_nr) {
mov -0x140(%rbp),%rdi
switch (augmented_args.args.syscall_nr) {
cmp $0x2,%rdi
→ je 0
cmp $0x101,%rdi
→ je 0
cmp $0x15,%rdi
→ jne 0
case SYS_OPEN: filename_arg = (const void *)args->args[0];
mov 0x10(%rbx),%rdx
→ jmp 0
case SYS_OPENAT: filename_arg = (const void *)args->args[1];
mov 0x18(%rbx),%rdx
if (filename_arg != NULL) {
cmp $0x0,%rdx
→ je 0
xor %edi,%edi
augmented_args.filename.reserved = 0;
mov %edi,-0x104(%rbp)
augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
mov %rbp,%rdi
add $0xffffffffffffff00,%rdi
augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
mov $0x100,%esi
→ callq *ffffffffda658499
mov $0x148,%r8d
augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
mov %eax,-0x108(%rbp)
augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
mov %rax,%rdi
shl $0x20,%rdi
shr $0x20,%rdi
if (augmented_args.filename.size < sizeof(augmented_args.filename.value)) {
cmp $0xff,%rdi
→ ja 0
len -= sizeof(augmented_args.filename.value) - augmented_args.filename.size;
add $0x48,%rax
len &= sizeof(augmented_args.filename.value) - 1;
and $0xff,%rax
mov %rax,%r8
mov %rbp,%rcx
return perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, &augmented_args, len);
add $0xfffffffffffffeb8,%rcx
mov %rbx,%rdi
movabs $0xffff975fbd72d800,%rsi
mov $0xffffffff,%edx
→ callq *ffffffffda658ad9
mov %rax,%r13
}
mov %r13,%rax
0.72 mov 0x0(%rbp),%rbx
mov 0x8(%rbp),%r13
1.16 mov 0x10(%rbp),%r14
0.10 mov 0x18(%rbp),%r15
0.42 add $0x28,%rbp
0.54 leaveq
0.54 ← retq
#
Please see 'man perf-config' for how to control what is shown via the
~/.perfconfig [annotate] section; for instance, one can suppress the
source code and see just the disassembly, etc.
Alternatively, use the TUI by just using 'perf annotate', press
'/bpf_prog' to see the bpf symbols, press enter and do the interactive
annotation, which allows for dumping to a file after selecting the
various output tunables, for instance, the above without source code
intermixed, plus showing all the instruction offsets:
# perf annotate bpf_prog_819967866022f1e1_sys_enter
Then press: 's' to hide the source code + 'O' twice to show all
instruction offsets, then 'P' to print to the
bpf_prog_819967866022f1e1_sys_enter.annotation file, which will have:
# cat bpf_prog_819967866022f1e1_sys_enter.annotation
bpf_prog_819967866022f1e1_sys_enter() bpf_prog_819967866022f1e1_sys_enter
Event: cycles:ppp
53.41 0: push %rbp
0.63 1: mov %rsp,%rbp
0.31 4: sub $0x170,%rsp
1.93 b: sub $0x28,%rbp
7.02 f: mov %rbx,0x0(%rbp)
3.20 13: mov %r13,0x8(%rbp)
1.07 17: mov %r14,0x10(%rbp)
0.61 1b: mov %r15,0x18(%rbp)
0.11 1f: xor %eax,%eax
1.29 21: mov %rax,0x20(%rbp)
0.11 25: mov %rdi,%rbx
2.02 28: → callq *ffffffffda6776d9
2.76 2d: mov %eax,-0x148(%rbp)
33: mov %rbp,%rsi
36: add $0xfffffffffffffeb8,%rsi
3d: movabs $0xffff975ac2607800,%rdi
1.26 47: → callq *ffffffffda6789e9
4c: cmp $0x0,%rax
2.43 50: → je 0
52: add $0x38,%rax
0.21 56: xor %r13d,%r13d
0.81 59: cmp $0x0,%rax
5d: → jne 0
63: mov %rbp,%rdi
2.22 66: add $0xfffffffffffffeb8,%rdi
0.11 6d: mov $0x40,%esi
0.32 72: mov %rbx,%rdx
2.74 75: → callq *ffffffffda658409
0.22 7a: mov %rbp,%rsi
1.69 7d: add $0xfffffffffffffec0,%rsi
84: movabs $0xffff975bfcd36000,%rdi
8e: add $0xd0,%rdi
0.21 95: mov 0x0(%rsi),%eax
0.93 98: cmp $0x200,%rax
9f: → jae 0
0.10 a1: shl $0x3,%rax
0.11 a5: add %rdi,%rax
0.11 a8: → jmp 0
aa: xor %eax,%eax
1.07 ac: cmp $0x0,%rax
b0: → je 0
6.57 b6: movzbq 0x0(%rax),%rdi
bb: cmp $0x0,%rdi
0.95 bf: → je 0
c5: mov $0x40,%r8d
cb: mov -0x140(%rbp),%rdi
d2: cmp $0x2,%rdi
d6: → je 0
d8: cmp $0x101,%rdi
df: → je 0
e1: cmp $0x15,%rdi
e5: → jne 0
e7: mov 0x10(%rbx),%rdx
eb: → jmp 0
ed: mov 0x18(%rbx),%rdx
f1: cmp $0x0,%rdx
f5: → je 0
f7: xor %edi,%edi
f9: mov %edi,-0x104(%rbp)
ff: mov %rbp,%rdi
102: add $0xffffffffffffff00,%rdi
109: mov $0x100,%esi
10e: → callq *ffffffffda658499
113: mov $0x148,%r8d
119: mov %eax,-0x108(%rbp)
11f: mov %rax,%rdi
122: shl $0x20,%rdi
126: shr $0x20,%rdi
12a: cmp $0xff,%rdi
131: → ja 0
133: add $0x48,%rax
137: and $0xff,%rax
13d: mov %rax,%r8
140: mov %rbp,%rcx
143: add $0xfffffffffffffeb8,%rcx
14a: mov %rbx,%rdi
14d: movabs $0xffff975fbd72d800,%rsi
157: mov $0xffffffff,%edx
15c: → callq *ffffffffda658ad9
161: mov %rax,%r13
164: mov %r13,%rax
0.72 167: mov 0x0(%rbp),%rbx
16b: mov 0x8(%rbp),%r13
1.16 16f: mov 0x10(%rbp),%r14
0.10 173: mov 0x18(%rbp),%r15
0.42 177: add $0x28,%rbp
0.54 17b: leaveq
0.54 17c: ← retq
Another cool way to test all this is to simply use 'perf top', look for
those symbols, go there and press enter to annotate it live :-)
Signed-off-by: Song Liu <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stanislav Fomichev <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Commit 003ca0fd2286 ("Refactor disassembler selection") in the binutils
repo changed the disassembler() function signature, so we must use the
feature test introduced in fb982666e380 ("tools/bpftool: fix bpftool
build with bintutils >= 2.9") to deal with that.
Committer testing:
After adding the missing function call to test-all.c, and:
FEATURE_CHECK_LDFLAGS-disassembler-four-args = -lbfd -lopcodes
And the fallbacks for cases where we need -liberty and sometimes -lz to
tools/perf/Makefile.config, we get:
$ make -C tools/perf O=/tmp/build/perf install-bin
make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j8' parallel build
Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... gtk2: [ on ]
... libaudit: [ on ]
... libbfd: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libslang: [ on ]
... libcrypto: [ on ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ on ]
... bpf: [ on ]
... libaio: [ on ]
... disassembler-four-args: [ on ]
CC /tmp/build/perf/jvmti/libjvmti.o
CC /tmp/build/perf/builtin-bench.o
<SNIP>
$
$
The feature detection test-all.bin gets successfully built and linked:
$ ls -la /tmp/build/perf/feature/test-all.bin
-rwxrwxr-x. 1 acme acme 2680352 Mar 19 11:07 /tmp/build/perf/feature/test-all.bin
$ nm /tmp/build/perf/feature/test-all.bin | grep -w disassembler
0000000000061f90 T disassembler
$
Time to move on to the patches that make use of this disassembler()
routine in binutils's libopcodes.
Signed-off-by: Song Liu <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Stanislav Fomichev <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ split from a larger patch, added missing FEATURE_CHECK_LDFLAGS-disassembler-four-args ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This patch fixes the PORT_SYNC_MODE_MASTER_SELECT macro
to correctly do the left shifting to set the port sync
master select correctly.
I have tested this fix on ICL.
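A minimal sketch of the kind of field-encoding macro being fixed. The shift
value, field width and all names below are assumptions chosen for
illustration, not the i915 register definitions; the point is only that the
macro must shift the value into the field position rather than do anything
else with it:

```c
#include <stdint.h>

/* Assumed field position and width, for illustration only. */
#define SKETCH_MASTER_SELECT_SHIFT	18
#define SKETCH_MASTER_SELECT_MASK	(0x3u << SKETCH_MASTER_SELECT_SHIFT)

/* Place the selector value into its field with a left shift. */
#define SKETCH_MASTER_SELECT(x) \
	(((uint32_t)(x) << SKETCH_MASTER_SELECT_SHIFT) & SKETCH_MASTER_SELECT_MASK)

/* Read-modify-write helper: clear the field, then set the new value. */
static uint32_t set_master_select(uint32_t reg, uint32_t master)
{
	reg &= ~SKETCH_MASTER_SELECT_MASK;
	reg |= SKETCH_MASTER_SELECT(master);
	return reg;
}
```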
Fixes: 49edbd49786e ("drm/i915/icl: Define TRANS_DDI_FUNC_CTL DSI registers")
Cc: Madhav Chauhan <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: <[email protected]> # v5.0+
Signed-off-by: Manasi Navare <[email protected]>
Reviewed-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
We found out that for v2 hw, a SATA disk can not be written to after the
system comes up.
In commit ffb1c820b8b6 ("scsi: hisi_sas: remove the check of sas_dev status
in hisi_sas_I_T_nexus_reset()"), we introduced a path where we may issue an
internal abort for a SATA device, but without following it with a
softreset.
We need to always follow an internal abort with a software reset, as per HW
programming flow, so add this.
Fixes: ffb1c820b8b6 ("scsi: hisi_sas: remove the check of sas_dev status in hisi_sas_I_T_nexus_reset()")
Signed-off-by: Luo Jiaxing <[email protected]>
Signed-off-by: John Garry <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
The lpi_range_list is supposed to be sorted in ascending order of
->base_id (at least if the range merging is to work), but the current
comparison function returns a positive value if rb->base_id >
ra->base_id, which means that list_sort() will put A after B in that
case - and vice versa, of course.
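The comparator convention can be sketched in userspace with qsort(), which
uses the same sign convention as list_sort(): a positive return value means
the first argument sorts after the second. The struct and function names
are hypothetical stand-ins, not the irqchip code:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a range keyed by base_id. */
struct range_sketch {
	uint32_t base_id;
	uint32_t span;
};

/*
 * Ascending order: return a positive value when a->base_id is larger,
 * so that a is placed after b.
 */
static int range_cmp(const void *a, const void *b)
{
	const struct range_sketch *ra = a, *rb = b;

	return (ra->base_id > rb->base_id) - (ra->base_id < rb->base_id);
}

/* Sort an array of ranges in ascending base_id order. */
static void sort_ranges(struct range_sketch *r, size_t n)
{
	qsort(r, n, sizeof(*r), range_cmp);
}
```

With the operands swapped in the comparison the result is a descending
list, in which adjacent ranges no longer line up for merging.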
Fixes: 880cb3cddd16 (irqchip/gic-v3-its: Refactor LPI allocator)
Cc: [email protected] (v4.19+)
Signed-off-by: Rasmus Villemoes <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC updates from Vineet Gupta:
- unaligned access support for HS cores
- Removed extra memory barrier around spinlock code
- HSDK platform updates: enable dmac, reset
- some more boot logging updates
- misc minor fixes
* tag 'arc-5.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
arch: arc: Kconfig: pedantic formatting
ARCv2: spinlock: remove the extra smp_mb before lock, after unlock
ARC: unaligned: relax the check for gcc supporting -mno-unaligned-access
ARC: boot log: cut down on verbosity
ARCv2: boot log: refurbish HS core/release identification
arc: hsdk_defconfig: Enable CONFIG_BLK_DEV_RAM
ARC: u-boot args: check that magic number is correct
ARC: perf: bpok condition only exists for ARCompact
ARCv2: Add explcit unaligned access support (and ability to disable too)
ARCv2: lib: introduce memcpy optimized for unaligned access
ARC: [plat-hsdk]: Enable AXI DW DMAC support
ARC: [plat-hsdk]: Add reset controller handle to manage USB reset
ARC: DTB: [scripted] fix node name and address spelling
|
|
Switch to bitmap_zalloc() to show clearly what we are allocating.
Besides that, it returns a pointer of bitmap type instead of an opaque void *.
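For readers unfamiliar with the helper, here is a rough userspace sketch of
what a zeroed bitmap allocation amounts to. The names are hypothetical and
calloc() stands in for the kernel allocator; the kernel helper additionally
takes allocation flags:

```c
#include <limits.h>
#include <stdlib.h>

#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
/* Number of unsigned longs needed to hold n bits, rounded up. */
#define SKETCH_BITS_TO_LONGS(n) \
	(((n) + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/*
 * Allocate a zero-filled bitmap of nbits bits. The return type says
 * "bitmap" (unsigned long *) rather than an opaque void *.
 */
static unsigned long *bitmap_zalloc_sketch(size_t nbits)
{
	return calloc(SKETCH_BITS_TO_LONGS(nbits), sizeof(unsigned long));
}
```

The caller passes a bit count, so the rounding to whole longs no longer has
to be open-coded at each call site.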
Signed-off-by: Andy Shevchenko <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Switch to bitmap_zalloc() to show clearly what we are allocating.
Besides that, it returns a pointer of bitmap type instead of an opaque void *.
Signed-off-by: Andy Shevchenko <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The arm64 config selects MULTI_IRQ_HANDLER, which was renamed to
GENERIC_IRQ_MULTI_HANDLER by commit 4c301f9b6a94 ("ARM: Convert
to GENERIC_IRQ_MULTI_HANDLER"). The 'new' option is already
selected, so just remove the obsolete entry.
Signed-off-by: Matthias Kaehlcke <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
|
|
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Jason Dillaman <[email protected]>
|
|
Because map updates are distributed lazily, an OSD may not know about
the new blacklist for quite some time after "osd blacklist add" command
is completed. This makes it possible for a blacklisted but still alive
client to overwrite a post-blacklist update, resulting in data
corruption.
Waiting for latest osdmap in ceph_monc_blacklist_add() and thus using
the post-blacklist epoch for all post-blacklist requests ensures that
all such requests "wait" for the blacklist to come into force on their
respective OSDs.
Cc: [email protected]
Fixes: 6305a3b41515 ("libceph: support for blacklisting clients")
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Jason Dillaman <[email protected]>
|
|
skl_update_pipe_wm() is quite pointless now. Just inline it into
skl_compute_wm().
v2: s/skl_build_pipe_wm/skl_update_pipe_wm/ in the commit message (Matt)
Cc: Neel Desai <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Matt Roper <[email protected]>
|
|
{skl,icl}_build_plane_wm() don't need to be passed the pipe_wm, so
don't. And skl_build_pipe_wm() can easily dig it out itself.
Cc: Neel Desai <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Matt Roper <[email protected]>
|
|
Clean up skl_allocate_pipe_ddb() a bit by moving the 'wm' variable
to tighter scope. We'll also constify it where appropriate.
Also initialize plane_alloc/uv_plane_alloc when declaring them
rather than later.
v2: Update commit message (Matt)
Cc: Neel Desai <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Matt Roper <[email protected]>
|
|
Currently we disable all the watermarks above the selected max
level for every plane. That would mean that the cursor's watermarks
may also get modified when another plane causes the selected
max watermark level to change. That is not so great as we would
like to keep the cursor as independent as possible to avoid
having to throttle it in response to other plane activity.
To avoid that let's keep the watermarks enabled even for levels
above the max selected watermark level, iff the plane has enough
ddb for that particular level. This way the cursor's enabled
watermarks only depend on the cursor itself. This is safe because
the hardware will never choose to use a watermark level unless
all enabled planes have also enabled that level.
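The safety argument can be modelled with a tiny sketch. This is a
simplified model with hypothetical names, not the driver code: each plane
has a bitmask of enabled watermark levels, and a level is only usable when
every enabled plane has it enabled, i.e. the bitwise AND of the masks:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * plane_masks[i] has bit n set when watermark level n is enabled for
 * enabled plane i. A level is usable only if all planes enable it.
 */
static uint8_t usable_levels(const uint8_t *plane_masks, size_t nplanes)
{
	uint8_t usable = 0xff;

	for (size_t i = 0; i < nplanes; i++)
		usable &= plane_masks[i];

	return usable;
}
```

In this model a cursor that keeps all eight levels enabled (0xff) never
widens the usable set beyond what the other planes allow, so its mask can
stay fixed while theirs change.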
Cc: Neel Desai <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Matt Roper <[email protected]>
|
|
We use a fixed ddb allocation for the cursor. Now the calculation
actually makes sure we have enough ddb space, but let's double check
anyway.
Cc: Neel Desai <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Matt Roper <[email protected]>
|
|
Currently we just assume that 32 or 8 blocks of ddb is sufficient
for the cursor. The 32 might be, but the 8 is certainly not. The
minimum we need is at least what level 0 watermarks need, but that
is a bit restrictive, so instead let's calculate what level 7
would need for a 256x256 cursor. We'll use that to determine the
fixed ddb allocation for the cursor. This way the cursor will never
be responsible for missing out on deeper power saving states.
v2: Loop to make sure this works even if some wm levels are
totally disabled (latency==0)
Cc: Neel Desai <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Reviewed-by: Matt Roper <[email protected]> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Extract the meat of skl_compute_plane_wm_params() into a lower
level helper that doesn't depend on the plane state. We'll
reuse this for the cursor ddb allocation calculations.
Cc: Neel Desai <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Matt Roper <[email protected]>
|
|
skl_compute_plane_wm() doesn't actually need the plane state. While
it would logically make sense to pass it, we shall need to reuse
skl_compute_plane_wm() to compute the minimum ddb allocation for
the cursor before the cursor may be enabled. Thus we can't rely
on the plane state. The alternative would be to duplicate a lot of
the wm calculations for the cursor ddb allocation case, which doesn't
appeal to me.
Cc: Neel Desai <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Matt Roper <[email protected]>
|
|
If the minimum required ddb space for all the planes equals the
total ddb space available we are allowed to use the relevant
watermark level.
Cc: Neel Desai <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Matt Roper <[email protected]>
|
|
To allow unsetting .is_mobile for the desktop variant
of PNV fix up the cdclk code to select the mobile HPLLVCO register
for both PNV variants.
Cc: Tvrtko Ursulin <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Tvrtko Ursulin <[email protected]>
|
|
We want to allow the desktop PNV to not have .is_mobile set. To
that end let's add a small helper to determine if the platform
has the ASLE interrupt (or equivalent). Supposedly both PNV
variants have it.
Cc: Tvrtko Ursulin <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Tvrtko Ursulin <[email protected]>
|
|
Add a small helper to determine if we have the panel power
sequencer or not. We'll make PNV an exceptional case so
that we can unset .is_mobile for the desktop variant.
Cc: Tvrtko Ursulin <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Tvrtko Ursulin <[email protected]>
|
|
Make the code self-documenting by introducing i9xx_has_pfit().
Also make PNV an exceptional case so that we can unset
.is_mobile for the desktop variant.
v2: s/gen4/gen>=4/ (Tvrtko)
Cc: Tvrtko Ursulin <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Tvrtko Ursulin <[email protected]>
|
|
g33/i964g/g45 are the exceptional cases when it comes to
the swizzle detection. Let's reorder the code to handle
them first and let everything else be handled by the
else branch. This allows us to unset .is_mobile for the
desktop PNV variant (which supposedly must follow the
"mobile" path here).
Cc: Tvrtko Ursulin <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Tvrtko Ursulin <[email protected]>
|
|
MAX_PHYSMEM_BITS only needs to be defined if CONFIG_SPARSEMEM is
enabled, and that was the case before commit 4ffe713b7587
("powerpc/mm: Increase the max addressable memory to 2PB").
On 32-bit systems, where CONFIG_SPARSEMEM is not enabled, we now
define it as 46. That is larger than the real number of physical
address bits, and breaks calculations in zsmalloc:
mm/zsmalloc.c:130:49: warning: right shift count is negative
MAX(32, (ZS_MAX_PAGES_PER_ZSPAGE << PAGE_SHIFT >> OBJ_INDEX_BITS))
^~
...
mm/zsmalloc.c:253:21: error: variably modified 'size_class' at file scope
struct size_class *size_class[ZS_SIZE_CLASSES];
^~~~~~~~~~
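The negative shift count follows from simple arithmetic. The sketch below
mirrors, as far as I understand the zsmalloc layout, how an object handle
packs a PFN, an object index and one tag bit into a single unsigned long;
the function name is hypothetical and a PAGE_SHIFT of 12 is an assumption
used in the example values:

```c
/*
 * Bits left over for the object index once the PFN bits
 * (max_physmem_bits - page_shift) and one tag bit are taken out of
 * an unsigned long of bits_per_long bits.
 */
static int obj_index_bits(int bits_per_long, int max_physmem_bits,
			  int page_shift)
{
	const int obj_tag_bits = 1;
	int pfn_bits = max_physmem_bits - page_shift;

	return bits_per_long - pfn_bits - obj_tag_bits;
}
```

With 32-bit longs, 46 physical address bits and 4K pages this gives
32 - 34 - 1 = -3, hence the "right shift count is negative" warning; with
32 physical address bits it gives 11.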
Fixes: 4ffe713b7587 ("powerpc/mm: Increase the max addressable memory to 2PB")
Cc: [email protected] # v4.20+
Signed-off-by: Ben Hutchings <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
|
|
Exercise acquiring and releasing forcewake around register reads. In
order to read a register behind a GT powerwell, we need to instruct that
powerwell to wake up using a forcewake. When we no longer require the GT
powerwell, we tell the GT to release our forcewake. Inside the
forcewake, the register read should work but outside it should just
return garbage, 0 being the most common garbage. Thus we can detect when
we are inside and outside of the forcewake with just a simple register
read, and so can verify that the GT powerwell is released when we say
so.
v2: Picking the right forcewaked register to return 0 outside of
forcewake is an art.
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Buffers passed to spi_sync() must be dma-safe even for tiny buffers since
some SPI controllers use DMA for all transfers.
Example splat with CONFIG_DMA_API_DEBUG enabled:
[ 23.750467] DMA-API: dw_dmac_pci 0000:00:15.0: device driver maps memory from stack [probable addr=000000001e49185d]
[ 23.750529] WARNING: CPU: 1 PID: 1296 at kernel/dma/debug.c:1161 check_for_stack+0xb7/0x190
[ 23.750533] Modules linked in: mmc_block(+) spi_pxa2xx_platform(+) pwm_lpss_pci pwm_lpss spi_pxa2xx_pci sdhci_pci cqhci intel_mrfld_pwrbtn extcon_intel_mrfld sdhci intel_mrfld_adc led_class mmc_core ili9341 mipi_dbi tinydrm backlight ti_ads7950 industrialio_triggered_buffer kfifo_buf intel_soc_pmic_mrfld hci_uart btbcm
[ 23.750599] CPU: 1 PID: 1296 Comm: modprobe Not tainted 5.0.0-rc7+ #236
[ 23.750605] Hardware name: Intel Corporation Merrifield/BODEGA BAY, BIOS 542 2015.01.21:18.19.48
[ 23.750620] RIP: 0010:check_for_stack+0xb7/0x190
[ 23.750630] Code: 8b 6d 50 4d 85 ed 75 04 4c 8b 6d 10 48 89 ef e8 2f 8b 44 00 48 89 c6 4a 8d 0c 23 4c 89 ea 48 c7 c7 88 d0 82 b4 e8 40 7c f9 ff <0f> 0b 8b 05 79 00 4b 01 85 c0 74 07 5b 5d 41 5c 41 5d c3 8b 05 54
[ 23.750637] RSP: 0000:ffff97bbc0292fa0 EFLAGS: 00010286
[ 23.750646] RAX: 0000000000000000 RBX: ffff97bbc0290000 RCX: 0000000000000006
[ 23.750652] RDX: 0000000000000007 RSI: 0000000000000002 RDI: ffff94b33e115450
[ 23.750658] RBP: ffff94b33c8578b0 R08: 0000000000000002 R09: 00000000000201c0
[ 23.750664] R10: 00000006ecb0ccc6 R11: 0000000000034f38 R12: 000000000000316c
[ 23.750670] R13: ffff94b33c84b250 R14: ffff94b33dedd5a0 R15: 0000000000000001
[ 23.750679] FS: 0000000000000000(0000) GS:ffff94b33e100000(0063) knlGS:00000000f7faf690
[ 23.750686] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
[ 23.750691] CR2: 00000000f7f54faf CR3: 000000000722c000 CR4: 00000000001006e0
[ 23.750696] Call Trace:
[ 23.750713] debug_dma_map_sg+0x100/0x340
[ 23.750727] ? dma_direct_map_sg+0x3b/0xb0
[ 23.750739] spi_map_buf+0x25a/0x300
[ 23.750751] __spi_pump_messages+0x2a4/0x680
[ 23.750762] __spi_sync+0x1dd/0x1f0
[ 23.750773] spi_sync+0x26/0x40
[ 23.750790] mipi_dbi_typec3_command_read+0x14d/0x240 [mipi_dbi]
[ 23.750802] ? spi_finalize_current_transfer+0x10/0x10
[ 23.750821] mipi_dbi_typec3_command+0x1bc/0x1d0 [mipi_dbi]
Reported-by: Andy Shevchenko <[email protected]>
Signed-off-by: Noralf Trønnes <[email protected]>
Tested-by: Andy Shevchenko <[email protected]>
Acked-by: Andy Shevchenko <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Remove the include of <linux/version.h>, which is not needed.
Signed-off-by: YueHaibing <[email protected]>
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Hans de Goede <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
If a test fails, we quite often mark the device as wedged. Provide the
stub functions so that we can wedge the mock device, and avoid exploding
on test failures.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109981
Signed-off-by: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Reviewed-by: Mika Kuoppala <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Now that the DMC register range is no longer in the bindings, remove any
mention of it and exclusively use the meson-canvas module.
Signed-off-by: Maxime Jourdan <[email protected]>
Acked-by: Neil Armstrong <[email protected]>
Signed-off-by: Neil Armstrong <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
When the DRM driver for the meson platform was created, the bindings
required that the DMC register region was provided.
Through those DMC registers, the display driver could configure an IP
called "canvas", a video lookup table used by the display IP.
It was later discovered that "canvas" is actually an IP shared by
components other than display (video decoder, 2D engine, ...) and that it
wasn't possible to keep the canvas code in DRM.
Over the past few months, incremental efforts have been deployed to
create a standalone meson-canvas driver [1], and the DRM driver was
patched to optionally use it if present [2].
This is the final step of those efforts where we simply remove any
control over DMC that the meson DRM driver has.
Please note that this breaks compatibility with older DTs that only
provide the DMC register range but not the amlogic,canvas node.
[1] https://patchwork.kernel.org/cover/10573771/
[2] https://patchwork.freedesktop.org/series/52076/
Signed-off-by: Maxime Jourdan <[email protected]>
Reviewed-by: Neil Armstrong <[email protected]>
Signed-off-by: Neil Armstrong <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
When calling vmw_fb_set_par(), the mode stored in par->set_mode gets free'd
twice. The first free is in vmw_fb_kms_detach(), the second is near the
end of vmw_fb_set_par() under the name of 'old_mode'. The mode-setting code
only works correctly if the mode doesn't actually change. Removing
'old_mode' in favor of using par->set_mode directly fixes the problem.
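The ownership pattern behind the fix can be sketched in userspace as
follows. The structs and functions are hypothetical stand-ins, not the
vmwgfx code: the mode is owned through a single pointer, the detach path
frees it and clears that pointer, and no second copy of the pointer is kept
around to be freed later:

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the driver structures. */
struct mode_sketch {
	int hdisplay, vdisplay;
};

struct par_sketch {
	struct mode_sketch *set_mode;	/* sole owner of the current mode */
};

/* Release the current mode through its single owner and clear it. */
static void detach_sketch(struct par_sketch *par)
{
	free(par->set_mode);
	par->set_mode = NULL;
}

/*
 * Install a new mode. There is no saved 'old_mode' pointer, so the
 * old mode is freed exactly once, inside detach_sketch().
 */
static int set_par_sketch(struct par_sketch *par, int w, int h)
{
	struct mode_sketch *mode = malloc(sizeof(*mode));

	if (!mode)
		return -1;
	mode->hdisplay = w;
	mode->vdisplay = h;

	detach_sketch(par);
	par->set_mode = mode;
	return 0;
}
```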
Cc: <[email protected]>
Fixes: a278724aa23c ("drm/vmwgfx: Implement fbdev on kms v2")
Signed-off-by: Thomas Zimmermann <[email protected]>
Reviewed-by: Deepak Rawat <[email protected]>
Signed-off-by: Thomas Hellstrom <[email protected]>
|
|
If it's not a system error and the get_node implementation can't
accommodate the buffer object, then it should return 0 with
mem::mm_node set to NULL.
v2: Test for id != -ENOMEM instead of id == -ENOSPC.
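A minimal sketch of that return convention, with hypothetical names and a
plain integer standing in for the result of the ID allocation. Only
-ENOMEM is treated as a system error; any other failure to obtain an ID is
reported as "does not fit" by returning 0 with the node left NULL:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the memory region descriptor. */
struct mem_sketch {
	void *mm_node;
};

/*
 * id is the result of an ID allocation: non-negative on success,
 * a negative errno on failure. node is what gets attached on success.
 */
static int get_node_sketch(int id, struct mem_sketch *mem, void *node)
{
	mem->mm_node = NULL;

	if (id < 0)
		return (id != -ENOMEM) ? 0 : -ENOMEM;

	mem->mm_node = node;
	return 0;
}
```

A caller can then tell the three outcomes apart: a negative return is a
real error, 0 with a NULL node means the region is full, and 0 with a node
set means success.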
Cc: <[email protected]>
Fixes: 4eb085e42fde ("drm/vmwgfx: Convert to new IDA API")
Signed-off-by: Deepak Rawat <[email protected]>
Reviewed-by: Thomas Hellstrom <[email protected]>
Signed-off-by: Thomas Hellstrom <[email protected]>
|
|
Comet Lake PCH is based off of Cannon Point (CNP).
Add PCI ID for Comet Lake PCH.
v2: Code cleanup (DK)
v3: Comment cleanup (Jani)
Cc: Jani Nikula <[email protected]>
Cc: Dhinakaran Pandiyan <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Signed-off-by: Anusha Srivatsa <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Comet Lake is an Intel processor containing Gen9
Intel HD Graphics. This patch adds the initial set of
PCI IDs. Comet Lake is derived from Coffee Lake, so the
IDs are added to the Coffee Lake ID list.
More support and features will be in the patches that follow.
v2: Split IDs according to GT. (Rodrigo)
v3: Update IDs.
Cc: Rodrigo Vivi <[email protected]>
Cc: Lucas De Marchi <[email protected]>
Signed-off-by: Anusha Srivatsa <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Lockdep warns that prepare_lock and genpd->mlock can cause a deadlock.
The deadlock scenario is as follows:
First thread is probing cs2000
cs2000_probe()
clk_register()
__clk_core_init()
clk_prepare_lock() ----> acquires prepare_lock
cs2000_recalc_rate()
i2c_smbus_read_byte_data()
rcar_i2c_master_xfer()
dma_request_chan()
rcar_dmac_of_xlate()
rcar_dmac_alloc_chan_resources()
pm_runtime_get_sync()
__pm_runtime_resume()
rpm_resume()
rpm_callback()
genpd_runtime_resume() ----> acquires genpd->mlock
Second thread is attaching any device to the same PM domain
genpd_add_device()
genpd_lock() ----> acquires genpd->mlock
cpg_mssr_attach_dev()
of_clk_get_from_provider()
__of_clk_get_from_provider()
__clk_create_clk()
clk_prepare_lock() ----> acquires prepare_lock
Since no PM provider currently accesses genpd's critical section in the
.attach_dev and .detach_dev callbacks, there is no need to protect
these two callbacks with genpd->mlock.
This patch avoids a potential deadlock by moving .attach_dev and .detach_dev
out of the genpd->mlock critical section, so that genpd->mlock is not held
when prepare_lock is acquired in .attach_dev and .detach_dev.
Signed-off-by: Jiada Wang <[email protected]>
Reviewed-by: Ulf Hansson <[email protected]>
Tested-by: Geert Uytterhoeven <[email protected]>
Reviewed-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
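
The reordering described above, running the provider callback before taking the domain lock, can be sketched in a single thread. Everything here is a hypothetical model: `toy_lock` is a plain flag rather than a real mutex, and `domain_add_device()` and `example_attach()` are illustrative names, not the genpd API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy lock: a flag is enough to check ordering in a single thread. */
struct toy_lock { int held; };

void toy_lock_acquire(struct toy_lock *l) { assert(!l->held); l->held = 1; }
void toy_lock_release(struct toy_lock *l) { assert(l->held); l->held = 0; }

struct domain {
    struct toy_lock mlock;               /* stands in for genpd->mlock */
    int device_count;                    /* state that mlock protects */
    int (*attach_dev)(struct domain *d); /* provider callback */
};

/* The callback runs first, without mlock held, so any lock it takes
 * (prepare_lock in the report above) is never nested inside mlock. */
int domain_add_device(struct domain *d)
{
    int ret = d->attach_dev ? d->attach_dev(d) : 0;

    if (ret)
        return ret;

    toy_lock_acquire(&d->mlock);
    d->device_count++; /* only the domain's own state is locked */
    toy_lock_release(&d->mlock);
    return 0;
}

/* Example callback that records whether mlock was held when it ran. */
int saw_lock_held = -1;

int example_attach(struct domain *d)
{
    saw_lock_held = d->mlock.held;
    return 0;
}
```

With the callback outside the critical section, the only remaining lock order is prepare_lock followed by mlock, so the inverted order from the second thread cannot occur.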
|
|
When commit 8661423eea1a ("ACPI / utils: Add new acpi_dev_present
helper") introduced acpi_dev_present(), it missed the fact that
bus_find_device() took a reference on the device found by it and
the callers of acpi_dev_present() don't drop that reference.
Drop the reference on the device in acpi_dev_present().
Fixes: 8661423eea1a ("ACPI / utils: Add new acpi_dev_present helper")
Signed-off-by: Andy Shevchenko <[email protected]>
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
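
The reference leak fixed above follows a common pattern: a lookup helper returns its match with an extra reference, and a caller that only needs a yes/no answer must drop that reference itself. A minimal sketch with a hypothetical `toy_device` type and illustrative ID strings, not the ACPI code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_device { const char *hid; int refcount; };

/* Modeled on a find helper that returns the match with its reference
 * count already incremented. */
struct toy_device *toy_find_device(struct toy_device *devs, size_t n,
                                   const char *hid)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(devs[i].hid, hid) == 0) {
            devs[i].refcount++;
            return &devs[i];
        }
    }
    return NULL;
}

/* Dropping a reference on NULL is a no-op. */
void toy_put_device(struct toy_device *dev)
{
    if (dev)
        dev->refcount--;
}

/* Presence check: drop the reference taken by the lookup before
 * returning, because callers never see the device pointer. */
int toy_dev_present(struct toy_device *devs, size_t n, const char *hid)
{
    struct toy_device *dev;

    assert(hid != NULL);
    dev = toy_find_device(devs, n, hid);
    toy_put_device(dev);
    return dev != NULL;
}
```

After the call the reference count is back at its starting value whether or not the device was found.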
|
|
A userptr may cross two VMAs if a forked child process (one that does not
call exec after fork) mallocs a buffer, frees it, and then mallocs a larger
buffer: the kernel creates a new VMA adjacent to the old VMA that was cloned
from the parent process, so some pages of the userptr are in the first VMA
and the remaining pages are in the second VMA.
HMM expects a range to cover only one VMA, so loop over all VMAs in the
address range and create multiple ranges to handle this case. See
is_mergeable_anon_vma in mm/mmap.c for details.
Signed-off-by: Philip Yang <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
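
The per-VMA splitting described above can be sketched as an interval walk. This is a userspace illustration under stated assumptions: `toy_vma`, `toy_range` and `split_by_vma()` are hypothetical, the VMA array is sorted by start address and non-overlapping, and all intervals are half-open.

```c
#include <assert.h>
#include <stddef.h>

struct toy_vma { unsigned long start, end; };   /* [start, end) */
struct toy_range { unsigned long start, end; }; /* [start, end) */

/* Split [start, end) into one range per VMA it touches. Returns the
 * number of ranges written, or -1 if the span has a hole or 'out' is
 * too small. */
int split_by_vma(const struct toy_vma *vmas, size_t nvmas,
                 unsigned long start, unsigned long end,
                 struct toy_range *out, size_t max_out)
{
    size_t n = 0;
    unsigned long addr = start;

    assert(start <= end);
    for (size_t i = 0; i < nvmas && addr < end; i++) {
        if (vmas[i].end <= addr)
            continue;  /* VMA lies entirely below the cursor */
        if (vmas[i].start > addr)
            return -1; /* unmapped hole inside the span */
        if (n == max_out)
            return -1; /* caller's array is full */

        out[n].start = addr;
        out[n].end = vmas[i].end < end ? vmas[i].end : end;
        addr = out[n].end; /* continue at the VMA boundary */
        n++;
    }
    return addr < end ? -1 : (int)n;
}
```

A buffer that straddles two adjacent VMAs comes back as two ranges that meet at the VMA boundary, each of which lies within a single VMA.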
|
|
Userptr restore may race with a concurrent userptr invalidation after
hmm_vma_fault adds the range to the hmm->ranges list. In that case
hmm_vma_range_done must be called to remove the range from the hmm->ranges
list before the restore worker is rescheduled. Otherwise hmm_vma_fault will
add the same range to the list again, which creates a loop in the list
because range->next points to the range itself.
Add function untrack_invalid_user_pages to reduce code duplication.
Signed-off-by: Philip Yang <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Otherwise we won't be able to cleanly handle page faults.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Make sure that not only the entities are flushed, but that
we also wait for the HW to finish all processing.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Having a dead pointer in the IDR is a bug; silently returning
is the worst we can do.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Remove the chash implementation for now since it isn't used any more.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Further testing showed that the idea with the chash doesn't work as expected.
In particular, we can't predict when we can remove the entries from the hash again.
So replace the chash with a ring buffer/hash mix where entries in the container
age automatically based on their timestamp.
v2: use ring buffer / hash mix
v3: check the timeout to make sure all entries age
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]> (v2)
Signed-off-by: Alex Deucher <[email protected]>
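
The idea of a container whose entries age by timestamp can be sketched as below. This is a much-simplified model, not the amdgpu implementation: each hash bucket remembers only the most recent ring slot instead of a chain, timestamps are assumed to be monotonic, and the names and constants (`fault_filter`, `RING_SIZE`, `HASH_SIZE`, `FAULT_TIMEOUT`) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE     8   /* number of remembered faults */
#define HASH_SIZE     16  /* number of hash buckets */
#define FAULT_TIMEOUT 100 /* entries older than this are ignored */

struct fault_entry { uint64_t key; uint64_t timestamp; int used; };

struct fault_filter {
    struct fault_entry ring[RING_SIZE];
    unsigned int hash[HASH_SIZE]; /* bucket -> most recent ring slot */
    unsigned int next;            /* next ring slot to overwrite */
};

/* Returns 1 if 'key' was recorded less than FAULT_TIMEOUT ago.
 * Otherwise records it in the ring and returns 0. Old entries are
 * never deleted explicitly: they expire by age or get overwritten
 * when the ring wraps. */
int fault_filter_seen(struct fault_filter *f, uint64_t key, uint64_t now)
{
    unsigned int bucket = (unsigned int)(key % HASH_SIZE);
    struct fault_entry *e;

    assert(f != NULL);
    e = &f->ring[f->hash[bucket]];
    if (e->used && e->key == key && now - e->timestamp < FAULT_TIMEOUT)
        return 1;

    e = &f->ring[f->next];
    e->key = key;
    e->timestamp = now;
    e->used = 1;
    f->hash[bucket] = f->next;
    f->next = (f->next + 1) % RING_SIZE;
    return 0;
}
```

Because the ring slot is reused in order and validity is decided by the timestamp, there is no need to know when an entry may be removed, which is the difficulty the commit message attributes to the chash approach.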
|
|
Only process a maximum of 32 IVs before writing back the RPTR. This improves
hw handling when we get close to an overflow in the ring buffer.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
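
The batching described above can be sketched as a bounded inner loop with a read-pointer write-back after each batch. The ring structure and names are hypothetical: `hw_rptr` stands in for the register write, and `writebacks` exists only so the behaviour can be observed.

```c
#include <assert.h>

#define MAX_IVS_PER_BATCH 32

struct toy_ih_ring {
    unsigned int rptr;       /* software read pointer */
    unsigned int wptr;       /* write pointer advanced by the producer */
    unsigned int size;       /* number of slots in the ring */
    unsigned int hw_rptr;    /* last read pointer reported back */
    unsigned int writebacks; /* how often the read pointer was reported */
};

/* Drain the ring, reporting the read pointer after at most
 * MAX_IVS_PER_BATCH entries so the producer regains space early.
 * Returns the number of entries handled. */
unsigned int ih_process(struct toy_ih_ring *ih)
{
    unsigned int handled = 0;

    assert(ih->size > 0);
    while (ih->rptr != ih->wptr) {
        unsigned int n;

        for (n = 0; n < MAX_IVS_PER_BATCH && ih->rptr != ih->wptr; n++) {
            /* a real handler would decode and dispatch the entry here */
            ih->rptr = (ih->rptr + 1) % ih->size;
            handled++;
        }

        ih->hw_rptr = ih->rptr; /* write the read pointer back */
        ih->writebacks++;
    }
    return handled;
}
```

With 70 pending entries this reports the read pointer three times (after 32, 64 and 70 entries) instead of once at the end.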
|
|
That doesn't seem to have any negative effects.
Signed-off-by: Christian König <[email protected]>
Acked-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The doorbells should already be reserved, just enable them.
Signed-off-by: Christian König <[email protected]>
Acked-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Disable overflow and enable full drain. This makes fault handling on ring 1
much more reliable since we don't generate back pressure any more.
Signed-off-by: Christian König <[email protected]>
Acked-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|