Age | Commit message (Collapse) | Author | Files | Lines |
|
Signed-off-by: Aric Cyr <[email protected]>
Reviewed-by: Aric Cyr <[email protected]>
Acked-by: Bhawanpreet Lakha <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[why]
There exist displays with EDIDs > 512 bytes, existing code
will cause us to ignore all extension blocks.
Signed-off-by: Jun Lei <[email protected]>
Reviewed-by: Wenjing Liu <[email protected]>
Acked-by: Bhawanpreet Lakha <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Audio was unmuted for HDMI only, need to do so for DP as well.
Signed-off-by: Wesley Chalmers <[email protected]>
Reviewed-by: Chris Park <[email protected]>
Reviewed-by: Eric Bernstein <[email protected]>
Acked-by: Bhawanpreet Lakha <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
Flickering is observed on some displays when the number of inserted BTR
frames changes frequently.
[How]
Add in a margin of drift to prevent the inserted number of frames from
jumping around too frequently.
Signed-off-by: Aric Cyr <[email protected]>
Reviewed-by: Nicholas Kazlauskas <[email protected]>
Acked-by: Bhawanpreet Lakha <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Description]
when programming VID_TIMING, we were using the original VESA timing for DP_VIDM/N.
for YCbCr420 or compressed YCbCr422, using half rate as YCbCr444.
Signed-off-by: Charlene Liu <[email protected]>
Reviewed-by: Nikola Cornij <[email protected]>
Acked-by: Leo Li <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
Need to set VID_N_MUL for 4:2:0 cases
[How]
Move setting to enc1_stream_encoder_dp_unblank and
ensure it is also set for non-4:2:0 cases.
Signed-off-by: Eric Bernstein <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Move out of header to dc_helper.c, in preparation for future
implementations.
Signed-off-by: Yongqiang Sun <[email protected]>
Reviewed-by: Tony Cheng <[email protected]>
Acked-by: Bhawanpreet Lakha <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
To help prevent plane state not being set to the correct default
value if any new properties are added in the future.
[How]
Use the drm helper - which seems to be the common solution among other
DRM drivers.
Signed-off-by: Nicholas Kazlauskas <[email protected]>
Reviewed-by: Sun peng Li <[email protected]>
Acked-by: Bhawanpreet Lakha <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
we were only checking the return value in one place, thus changing
generic_reg_wait from int to void and reading the register instead of
getting it from generic_reg_wait, when we need the return value.
Signed-off-by: Yongqiang Sun <[email protected]>
Reviewed-by: Tony Cheng <[email protected]>
Acked-by: Bhawanpreet Lakha <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
The plane_reset callback is subclassed but hasn't been updated since
the drm helper got updated to include resetting alpha related state
(state->alpha and state->pixel_blend_mode). The overlay planes
exposed by amdgpu_dm were therefore being rendered as invisible by
default ever since supported was exposed for alpha blending properties
on overlays.
This caused regressions in igt@kms_plane_multiple@atomic-tiling-none
and igt@kms_plane@plane-position-covered-pipe tests.
[How]
Reset the plane state values to their correct values as defined in
the drm helper.
This fixes the IGT test regression.
Signed-off-by: Nicholas Kazlauskas <[email protected]>
Reviewed-by: Harry Wentland <[email protected]>
Acked-by: Bhawanpreet Lakha <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
To aid recoverable page faults.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
We need the first paging queue to handle page faults.
v2: handle any number of SDMA instances gracefully
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Now that we have re-reoute faults to the other IH
ring we can enable retries again.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
add amdgpu_amdkfd_pre_reset and amdgpu_amdkfd_post_reset inside amdgpu_device_reset_sriov.
Signed-off-by: Wentao Lou <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
Incorrect hardcoded assumptions are made regarding luma and chroma
alignment. The actual values set for the DRM framebuffer should be used
when programming the address.
[How]
Respect the given pitch for both luma and chroma planes - it's not like
we can force the alignment to anything else at this point anyway.
Use the FB offset for the chroma planes directly. DRM already
provides this to us so there's no need to calculate it manually.
While we don't actually use the chroma surface size parameters on Raven,
these should have technically been fb->width / 2 and fb->height / 2
since the chroma plane is half size of the luma plane for NV12.
Leave a TODO indicating that those should be set based on the actual
surface format instead since this is only correct for YUV420 formats.
Signed-off-by: Nicholas Kazlauskas <[email protected]>
Reviewed-by: Harry Wentland <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
When page table are updated by the CPU, synchronize with the
allocation and initialization of newly allocated page tables.
Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
If using old kernel config file, CONFIG_ZONE_DEVICE is not selected,
so CONFIG_HMM and CONFIG_HMM_MIRROR is not enabled, the current driver
error message "Failed to register MMU notifier" is not clear. Inform
user with more descriptive message on how to fix the missing kernel
config option.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109808
Signed-off-by: Philip Yang <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
MIXER on Exynos5 SoCs uses different synchronisation method than Exynos4
to update internal state (shadow registers).
Apparently the driver implements it incorrectly. The rule should be
as follows:
- do not request updating registers until previous request was finished,
ie. MXR_CFG_LAYER_UPDATE_COUNT must be 0.
- before setting registers synchronisation on VSYNC should be turned off,
ie. MXR_STATUS_SYNC_ENABLE should be reset,
- after finishing MXR_STATUS_SYNC_ENABLE should be set again.
The patch hopefully implements it correctly.
Below sample kernel log from page fault caused by the bug:
[ 25.670038] exynos-sysmmu 14650000.sysmmu: 14450000.mixer: PAGE FAULT occurred at 0x2247b800
[ 25.677888] ------------[ cut here ]------------
[ 25.682164] kernel BUG at ../drivers/iommu/exynos-iommu.c:450!
[ 25.687971] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
[ 25.693778] Modules linked in:
[ 25.696816] CPU: 5 PID: 1553 Comm: fb-release_test Not tainted 5.0.0-rc7-01157-g5f86b1566bdd #136
[ 25.705646] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[ 25.711710] PC is at exynos_sysmmu_irq+0x1c0/0x264
[ 25.716470] LR is at lock_is_held_type+0x44/0x64
v2: added missing MXR_CFG_LAYER_UPDATE bit setting in mixer_enable_sync
Reported-by: Marian Mihailescu <[email protected]>
Signed-off-by: Andrzej Hajda <[email protected]>
Signed-off-by: Inki Dae <[email protected]>
|
|
The event pool used for queueing commands is destroyed fairly early in the
ibmvscsi_remove() code path. Since, this happens prior to the call so
scsi_remove_host() it is possible for further calls to queuecommand to be
processed which manifest as a panic due to a NULL pointer dereference as
seen here:
PANIC: "Unable to handle kernel paging request for data at address
0x00000000"
Context process backtrace:
DSISR: 0000000042000000 ????Syscall Result: 0000000000000000
4 [c000000002cb3820] memcpy_power7 at c000000000064204
[Link Register] [c000000002cb3820] ibmvscsi_send_srp_event at d000000003ed14a4
5 [c000000002cb3920] ibmvscsi_send_srp_event at d000000003ed14a4 [ibmvscsi] ?(unreliable)
6 [c000000002cb39c0] ibmvscsi_queuecommand at d000000003ed2388 [ibmvscsi]
7 [c000000002cb3a70] scsi_dispatch_cmd at d00000000395c2d8 [scsi_mod]
8 [c000000002cb3af0] scsi_request_fn at d00000000395ef88 [scsi_mod]
9 [c000000002cb3be0] __blk_run_queue at c000000000429860
10 [c000000002cb3c10] blk_delay_work at c00000000042a0ec
11 [c000000002cb3c40] process_one_work at c0000000000dac30
12 [c000000002cb3cd0] worker_thread at c0000000000db110
13 [c000000002cb3d80] kthread at c0000000000e3378
14 [c000000002cb3e30] ret_from_kernel_thread at c00000000000982c
The kernel buffer log is overfilled with this log:
[11261.952732] ibmvscsi: found no event struct in pool!
This patch reorders the operations during host teardown. Start by calling
the SRP transport and Scsi_Host remove functions to flush any outstanding
work and set the host offline. LLDD teardown follows including destruction
of the event pool, freeing the Command Response Queue (CRQ), and unmapping
any persistent buffers. The event pool destruction is protected by the
scsi_host lock, and the pool is purged prior of any requests for which we
never received a response. Finally, move the removal of the scsi host from
our global list to the end so that the host is easily locatable for
debugging purposes during teardown.
Cc: <[email protected]> # v2.6.12+
Signed-off-by: Tyrel Datwyler <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
For each ibmvscsi host created during a probe or destroyed during a remove
we either add or remove that host to/from the global ibmvscsi_head
list. This runs the risk of concurrent modification.
This patch adds a simple spinlock around the list modification calls to
prevent concurrent updates as is done similarly in the ibmvfc driver and
ipr driver.
Fixes: 32d6e4b6e4ea ("scsi: ibmvscsi: add vscsi hosts to global list_head")
Cc: <[email protected]> # v4.10+
Signed-off-by: Tyrel Datwyler <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
This allows us to ditch i915 in some more places.
v2: use local var in check_vgpu (Paulo)
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Cc: Paulo Zanoni <[email protected]>
Reviewed-by: Paulo Zanoni <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
This will allow futher simplifications in the uncore handling.
v2: move register access setup under uncore (Chris)
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Cc: Paulo Zanoni <[email protected]>
Cc: Chris Wilson <[email protected]>
Reviewed-by: Paulo Zanoni <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Use a local variable where it makes sense.
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Cc: Paulo Zanoni <[email protected]>
Reviewed-by: Paulo Zanoni <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Remove unneeded usage of dev_priv from 1 extra function.
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Cc: Paulo Zanoni <[email protected]>
Reviewed-by: Paulo Zanoni <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Move the init, fini, prune, suspend, resume function to work on
intel_uncore instead of dev_priv.
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Cc: Paulo Zanoni <[email protected]>
Reviewed-by: Paulo Zanoni <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Now that the internal code all works on intel_uncore, flip the
external-facing interface.
v2: fix GVT.
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Cc: Paulo Zanoni <[email protected]>
Reviewed-by: Paulo Zanoni <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Get/put functions used outside of uncore.c are updated in the next
patch for a nicer split.
v2: use dev_priv where we still have it (Paulo)
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Cc: Paulo Zanoni <[email protected]>
Reviewed-by: Paulo Zanoni <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Avoid that the following warnings are reported when building with W=1:
block/blk-cgroup.c:1755: warning: Function parameter or member 'q' not described in 'blkcg_schedule_throttle'
block/blk-cgroup.c:1755: warning: Function parameter or member 'use_memdelay' not described in 'blkcg_schedule_throttle'
block/blk-cgroup.c:1779: warning: Function parameter or member 'blkg' not described in 'blkcg_add_delay'
block/blk-cgroup.c:1779: warning: Function parameter or member 'now' not described in 'blkcg_add_delay'
block/blk-cgroup.c:1779: warning: Function parameter or member 'delta' not described in 'blkcg_add_delay'
Signed-off-by: Bart Van Assche <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Upper bits are reserved on gen6, so no issue if we write them. Note that
we're already doing this in the non-MT case of IVB, which uses the same
register.
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Cc: Paulo Zanoni <[email protected]>
Cc: Chris Wilson <[email protected]>
Reviewed-by: Paulo Zanoni <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
This patch avoids that the following warning is reported when building
with W=1:
block/blk-iolatency.c:734:5: warning: no previous prototype for 'blk_iolatency_init' [-Wmissing-prototypes]
Cc: Josef Bacik <[email protected]>
Fixes: d70675121546 ("block: introduce blk-iolatency io controller") # v4.19
Signed-off-by: Bart Van Assche <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
This function is not used outside the block layer core. Hence unexport it.
Cc: Christoph Hellwig <[email protected]>
Cc: Ming Lei <[email protected]>
Signed-off-by: Bart Van Assche <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
unexpected value
For q->poll_nsec == -1, means doing classic poll, not hybrid poll.
We introduce a new flag BLK_MQ_POLL_CLASSIC to replace -1, which
may make code much easier to read.
Additionally, since val is an int obtained with kstrtoint(), val can be
a negative value other than -1, so return -EINVAL for that case.
Thanks to Damien Le Moal for some good suggestion.
Reviewed-by: Damien Le Moal <[email protected]>
Signed-off-by: Yufen Yu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
In symbol__disassemble(), DSO_BINARY_TYPE__BPF_PROG_INFO dso calls into
a new function symbol__disassemble_bpf(), where annotation line
information is filled based on the bpf_prog_info and btf data saved in
given perf_env.
symbol__disassemble_bpf() uses binutils's libopcodes to disassemble bpf
programs.
Committer testing:
After fixing this:
- u64 *addrs = (u64 *)(info_linear->info.jited_ksyms);
+ u64 *addrs = (u64 *)(uintptr_t)(info_linear->info.jited_ksyms);
Detected when crossbuilding to a 32-bit arch.
And making all this dependent on HAVE_LIBBFD_SUPPORT and
HAVE_LIBBPF_SUPPORT:
1) Have a BPF program running, one that has BTF info, etc, I used
the tools/perf/examples/bpf/augmented_raw_syscalls.c put in place
by 'perf trace'.
# grep -B1 augmented_raw ~/.perfconfig
[trace]
add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c
#
# perf trace -e *mmsg
dnf/6245 sendmmsg(20, 0x7f5485a88030, 2, MSG_NOSIGNAL) = 2
NetworkManager/10055 sendmmsg(22<socket:[1056822]>, 0x7f8126ad1bb0, 2, MSG_NOSIGNAL) = 2
2) Then do a 'perf record' system wide for a while:
# perf record -a
^C[ perf record: Woken up 68 times to write data ]
[ perf record: Captured and wrote 19.427 MB perf.data (366891 samples) ]
#
3) Check that we captured BPF and BTF info in the perf.data file:
# perf report --header-only | grep 'b[pt]f'
# event : name = cycles:ppp, , id = { 294789, 294790, 294791, 294792, 294793, 294794, 294795, 294796 }, size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, read_format = ID, disabled = 1, inherit = 1, mmap = 1, comm = 1, freq = 1, task = 1, precise_ip = 3, sample_id_all = 1, exclude_guest = 1, mmap2 = 1, comm_exec = 1, ksymbol = 1, bpf_event = 1
# bpf_prog_info of id 13
# bpf_prog_info of id 14
# bpf_prog_info of id 15
# bpf_prog_info of id 16
# bpf_prog_info of id 17
# bpf_prog_info of id 18
# bpf_prog_info of id 21
# bpf_prog_info of id 22
# bpf_prog_info of id 41
# bpf_prog_info of id 42
# btf info of id 2
#
4) Check which programs got recorded:
# perf report | grep bpf_prog | head
0.16% exe bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter
0.14% exe bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
0.08% fuse-overlayfs bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter
0.07% fuse-overlayfs bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
0.01% clang-4.0 bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
0.01% clang-4.0 bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter
0.00% clang bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
0.00% runc bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter
0.00% clang bpf_prog_819967866022f1e1_sys_enter [k] bpf_prog_819967866022f1e1_sys_enter
0.00% sh bpf_prog_c1bd85c092d6e4aa_sys_exit [k] bpf_prog_c1bd85c092d6e4aa_sys_exit
#
This was with the default --sort order for 'perf report', which is:
--sort comm,dso,symbol
If we just look for the symbol, for instance:
# perf report --sort symbol | grep bpf_prog | head
0.26% [k] bpf_prog_819967866022f1e1_sys_enter - -
0.24% [k] bpf_prog_c1bd85c092d6e4aa_sys_exit - -
#
or the DSO:
# perf report --sort dso | grep bpf_prog | head
0.26% bpf_prog_819967866022f1e1_sys_enter
0.24% bpf_prog_c1bd85c092d6e4aa_sys_exit
#
We'll see the two BPF programs that augmented_raw_syscalls.o puts in
place, one attached to the raw_syscalls:sys_enter and another to the
raw_syscalls:sys_exit tracepoints, as expected.
Now we can finally do, from the command line, annotation for one of
those two symbols, with the original BPF program source coude intermixed
with the disassembled JITed code:
# perf annotate --stdio2 bpf_prog_819967866022f1e1_sys_enter
Samples: 950 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 553756947, [percent: local period]
bpf_prog_819967866022f1e1_sys_enter() bpf_prog_819967866022f1e1_sys_enter
Percent int sys_enter(struct syscall_enter_args *args)
53.41 push %rbp
0.63 mov %rsp,%rbp
0.31 sub $0x170,%rsp
1.93 sub $0x28,%rbp
7.02 mov %rbx,0x0(%rbp)
3.20 mov %r13,0x8(%rbp)
1.07 mov %r14,0x10(%rbp)
0.61 mov %r15,0x18(%rbp)
0.11 xor %eax,%eax
1.29 mov %rax,0x20(%rbp)
0.11 mov %rdi,%rbx
return bpf_get_current_pid_tgid();
2.02 → callq *ffffffffda6776d9
2.76 mov %eax,-0x148(%rbp)
mov %rbp,%rsi
int sys_enter(struct syscall_enter_args *args)
add $0xfffffffffffffeb8,%rsi
return bpf_map_lookup_elem(pids, &pid) != NULL;
movabs $0xffff975ac2607800,%rdi
1.26 → callq *ffffffffda6789e9
cmp $0x0,%rax
2.43 → je 0
add $0x38,%rax
0.21 xor %r13d,%r13d
if (pid_filter__has(&pids_filtered, getpid()))
0.81 cmp $0x0,%rax
→ jne 0
mov %rbp,%rdi
probe_read(&augmented_args.args, sizeof(augmented_args.args), args);
2.22 add $0xfffffffffffffeb8,%rdi
0.11 mov $0x40,%esi
0.32 mov %rbx,%rdx
2.74 → callq *ffffffffda658409
syscall = bpf_map_lookup_elem(&syscalls, &augmented_args.args.syscall_nr);
0.22 mov %rbp,%rsi
1.69 add $0xfffffffffffffec0,%rsi
syscall = bpf_map_lookup_elem(&syscalls, &augmented_args.args.syscall_nr);
movabs $0xffff975bfcd36000,%rdi
add $0xd0,%rdi
0.21 mov 0x0(%rsi),%eax
0.93 cmp $0x200,%rax
→ jae 0
0.10 shl $0x3,%rax
0.11 add %rdi,%rax
0.11 → jmp 0
xor %eax,%eax
if (syscall == NULL || !syscall->enabled)
1.07 cmp $0x0,%rax
→ je 0
if (syscall == NULL || !syscall->enabled)
6.57 movzbq 0x0(%rax),%rdi
if (syscall == NULL || !syscall->enabled)
cmp $0x0,%rdi
0.95 → je 0
mov $0x40,%r8d
switch (augmented_args.args.syscall_nr) {
mov -0x140(%rbp),%rdi
switch (augmented_args.args.syscall_nr) {
cmp $0x2,%rdi
→ je 0
cmp $0x101,%rdi
→ je 0
cmp $0x15,%rdi
→ jne 0
case SYS_OPEN: filename_arg = (const void *)args->args[0];
mov 0x10(%rbx),%rdx
→ jmp 0
case SYS_OPENAT: filename_arg = (const void *)args->args[1];
mov 0x18(%rbx),%rdx
if (filename_arg != NULL) {
cmp $0x0,%rdx
→ je 0
xor %edi,%edi
augmented_args.filename.reserved = 0;
mov %edi,-0x104(%rbp)
augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
mov %rbp,%rdi
add $0xffffffffffffff00,%rdi
augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
mov $0x100,%esi
→ callq *ffffffffda658499
mov $0x148,%r8d
augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
mov %eax,-0x108(%rbp)
augmented_args.filename.size = probe_read_str(&augmented_args.filename.value,
mov %rax,%rdi
shl $0x20,%rdi
shr $0x20,%rdi
if (augmented_args.filename.size < sizeof(augmented_args.filename.value)) {
cmp $0xff,%rdi
→ ja 0
len -= sizeof(augmented_args.filename.value) - augmented_args.filename.size;
add $0x48,%rax
len &= sizeof(augmented_args.filename.value) - 1;
and $0xff,%rax
mov %rax,%r8
mov %rbp,%rcx
return perf_event_output(args, &__augmented_syscalls__, BPF_F_CURRENT_CPU, &augmented_args, len);
add $0xfffffffffffffeb8,%rcx
mov %rbx,%rdi
movabs $0xffff975fbd72d800,%rsi
mov $0xffffffff,%edx
→ callq *ffffffffda658ad9
mov %rax,%r13
}
mov %r13,%rax
0.72 mov 0x0(%rbp),%rbx
mov 0x8(%rbp),%r13
1.16 mov 0x10(%rbp),%r14
0.10 mov 0x18(%rbp),%r15
0.42 add $0x28,%rbp
0.54 leaveq
0.54 ← retq
#
Please see 'man perf-config' to see how to control what should be seen,
via ~/.perfconfig [annotate] section, for instance, one can suppress the
source code and see just the disassembly, etc.
Alternatively, use the TUI bu just using 'perf annotate', press
'/bpf_prog' to see the bpf symbols, press enter and do the interactive
annotation, which allows for dumping to a file after selecting the
the various output tunables, for instance, the above without source code
intermixed, plus showing all the instruction offsets:
# perf annotate bpf_prog_819967866022f1e1_sys_enter
Then press: 's' to hide the source code + 'O' twice to show all
instruction offsets, then 'P' to print to the
bpf_prog_819967866022f1e1_sys_enter.annotation file, which will have:
# cat bpf_prog_819967866022f1e1_sys_enter.annotation
bpf_prog_819967866022f1e1_sys_enter() bpf_prog_819967866022f1e1_sys_enter
Event: cycles:ppp
53.41 0: push %rbp
0.63 1: mov %rsp,%rbp
0.31 4: sub $0x170,%rsp
1.93 b: sub $0x28,%rbp
7.02 f: mov %rbx,0x0(%rbp)
3.20 13: mov %r13,0x8(%rbp)
1.07 17: mov %r14,0x10(%rbp)
0.61 1b: mov %r15,0x18(%rbp)
0.11 1f: xor %eax,%eax
1.29 21: mov %rax,0x20(%rbp)
0.11 25: mov %rdi,%rbx
2.02 28: → callq *ffffffffda6776d9
2.76 2d: mov %eax,-0x148(%rbp)
33: mov %rbp,%rsi
36: add $0xfffffffffffffeb8,%rsi
3d: movabs $0xffff975ac2607800,%rdi
1.26 47: → callq *ffffffffda6789e9
4c: cmp $0x0,%rax
2.43 50: → je 0
52: add $0x38,%rax
0.21 56: xor %r13d,%r13d
0.81 59: cmp $0x0,%rax
5d: → jne 0
63: mov %rbp,%rdi
2.22 66: add $0xfffffffffffffeb8,%rdi
0.11 6d: mov $0x40,%esi
0.32 72: mov %rbx,%rdx
2.74 75: → callq *ffffffffda658409
0.22 7a: mov %rbp,%rsi
1.69 7d: add $0xfffffffffffffec0,%rsi
84: movabs $0xffff975bfcd36000,%rdi
8e: add $0xd0,%rdi
0.21 95: mov 0x0(%rsi),%eax
0.93 98: cmp $0x200,%rax
9f: → jae 0
0.10 a1: shl $0x3,%rax
0.11 a5: add %rdi,%rax
0.11 a8: → jmp 0
aa: xor %eax,%eax
1.07 ac: cmp $0x0,%rax
b0: → je 0
6.57 b6: movzbq 0x0(%rax),%rdi
bb: cmp $0x0,%rdi
0.95 bf: → je 0
c5: mov $0x40,%r8d
cb: mov -0x140(%rbp),%rdi
d2: cmp $0x2,%rdi
d6: → je 0
d8: cmp $0x101,%rdi
df: → je 0
e1: cmp $0x15,%rdi
e5: → jne 0
e7: mov 0x10(%rbx),%rdx
eb: → jmp 0
ed: mov 0x18(%rbx),%rdx
f1: cmp $0x0,%rdx
f5: → je 0
f7: xor %edi,%edi
f9: mov %edi,-0x104(%rbp)
ff: mov %rbp,%rdi
102: add $0xffffffffffffff00,%rdi
109: mov $0x100,%esi
10e: → callq *ffffffffda658499
113: mov $0x148,%r8d
119: mov %eax,-0x108(%rbp)
11f: mov %rax,%rdi
122: shl $0x20,%rdi
126: shr $0x20,%rdi
12a: cmp $0xff,%rdi
131: → ja 0
133: add $0x48,%rax
137: and $0xff,%rax
13d: mov %rax,%r8
140: mov %rbp,%rcx
143: add $0xfffffffffffffeb8,%rcx
14a: mov %rbx,%rdi
14d: movabs $0xffff975fbd72d800,%rsi
157: mov $0xffffffff,%edx
15c: → callq *ffffffffda658ad9
161: mov %rax,%r13
164: mov %r13,%rax
0.72 167: mov 0x0(%rbp),%rbx
16b: mov 0x8(%rbp),%r13
1.16 16f: mov 0x10(%rbp),%r14
0.10 173: mov 0x18(%rbp),%r15
0.42 177: add $0x28,%rbp
0.54 17b: leaveq
0.54 17c: ← retq
Another cool way to test all this is to symple use 'perf top' look for
those symbols, go there and press enter, annotate it live :-)
Signed-off-by: Song Liu <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stanislav Fomichev <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Commit 003ca0fd2286 ("Refactor disassembler selection") in the binutils
repo, which changed the disassembler() function signature, so we must
use the feature test introduced in fb982666e380 ("tools/bpftool: fix
bpftool build with bintutils >= 2.9") to deal with that.
Committer testing:
After adding the missing function call to test-all.c, and:
FEATURE_CHECK_LDFLAGS-disassembler-four-args = -bfd -lopcodes
And the fallbacks for cases where we need -liberty and sometimes -lz to
tools/perf/Makefile.config, we get:
$ make -C tools/perf O=/tmp/build/perf install-bin
make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j8' parallel build
Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... gtk2: [ on ]
... libaudit: [ on ]
... libbfd: [ on ]
... libelf: [ on ]
... libnuma: [ on ]
... numa_num_possible_cpus: [ on ]
... libperl: [ on ]
... libpython: [ on ]
... libslang: [ on ]
... libcrypto: [ on ]
... libunwind: [ on ]
... libdw-dwarf-unwind: [ on ]
... zlib: [ on ]
... lzma: [ on ]
... get_cpuid: [ on ]
... bpf: [ on ]
... libaio: [ on ]
... disassembler-four-args: [ on ]
CC /tmp/build/perf/jvmti/libjvmti.o
CC /tmp/build/perf/builtin-bench.o
<SNIP>
$
$
The feature detection test-all.bin gets successfully built and linked:
$ ls -la /tmp/build/perf/feature/test-all.bin
-rwxrwxr-x. 1 acme acme 2680352 Mar 19 11:07 /tmp/build/perf/feature/test-all.bin
$ nm /tmp/build/perf/feature/test-all.bin | grep -w disassembler
0000000000061f90 T disassembler
$
Time to move on to the patches that make use of this disassembler()
routine in binutils's libopcodes.
Signed-off-by: Song Liu <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Roman Gushchin <[email protected]>
Cc: Stanislav Fomichev <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ split from a larger patch, added missing FEATURE_CHECK_LDFLAGS-disassembler-four-args ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This patch fixes the PORT_SYNC_MODE_MASTER_SELECT macro
to correctly do the left shifting to set the port sync
master select correctly.
I have tested this fix on ICL.
Fixes: 49edbd49786e ("drm/i915/icl: Define TRANS_DDI_FUNC_CTL DSI registers")
Cc: Madhav Chauhan <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: <[email protected]> # v5.0+
Signed-off-by: Manasi Navare <[email protected]>
Reviewed-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
We found out that for v2 hw, a SATA disk can not be written to after the
system comes up.
In commit ffb1c820b8b6 ("scsi: hisi_sas: remove the check of sas_dev status
in hisi_sas_I_T_nexus_reset()"), we introduced a path where we may issue an
internal abort for a SATA device, but without following it with a
softreset.
We need to always follow an internal abort with a software reset, as per HW
programming flow, so add this.
Fixes: ffb1c820b8b6 ("scsi: hisi_sas: remove the check of sas_dev status in hisi_sas_I_T_nexus_reset()")
Signed-off-by: Luo Jiaxing <[email protected]>
Signed-off-by: John Garry <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
|
|
The lpi_range_list is supposed to be sorted in ascending order of
->base_id (at least if the range merging is to work), but the current
comparison function returns a positive value if rb->base_id >
ra->base_id, which means that list_sort() will put A after B in that
case - and vice versa, of course.
Fixes: 880cb3cddd16 (irqchip/gic-v3-its: Refactor LPI allocator)
Cc: [email protected] (v4.19+)
Signed-off-by: Rasmus Villemoes <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC updates from Vineet Gupta:
- unaligned access support for HS cores
- Removed extra memory barrier around spinlock code
- HSDK platform updates: enable dmac, reset
- some more boot logging updates
- misc minor fixes
* tag 'arc-5.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
arch: arc: Kconfig: pedantic formatting
ARCv2: spinlock: remove the extra smp_mb before lock, after unlock
ARC: unaligned: relax the check for gcc supporting -mno-unaligned-access
ARC: boot log: cut down on verbosity
ARCv2: boot log: refurbish HS core/release identification
arc: hsdk_defconfig: Enable CONFIG_BLK_DEV_RAM
ARC: u-boot args: check that magic number is correct
ARC: perf: bpok condition only exists for ARCompact
ARCv2: Add explcit unaligned access support (and ability to disable too)
ARCv2: lib: introduce memcpy optimized for unaligned access
ARC: [plat-hsdk]: Enable AXI DW DMAC support
ARC: [plat-hsdk]: Add reset controller handle to manage USB reset
ARC: DTB: [scripted] fix node name and address spelling
|
|
Switch to bitmap_zalloc() to show clearly what we are allocating.
Besides that it returns pointer of bitmap type instead of opaque void *.
Signed-off-by: Andy Shevchenko <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Switch to bitmap_zalloc() to show clearly what we are allocating.
Besides that it returns pointer of bitmap type instead of opaque void *.
Signed-off-by: Andy Shevchenko <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The arm64 config selects MULTI_IRQ_HANDLER, which was renamed to
GENERIC_IRQ_MULTI_HANDLER by commit 4c301f9b6a94 ("ARM: Convert
to GENERIC_IRQ_MULTI_HANDLER"). The 'new' option is already
selected, so just remove the obsolete entry.
Signed-off-by: Matthias Kaehlcke <[email protected]>
Signed-off-by: Catalin Marinas <[email protected]>
|
|
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Jason Dillaman <[email protected]>
|
|
Because map updates are distributed lazily, an OSD may not know about
the new blacklist for quite some time after "osd blacklist add" command
is completed. This makes it possible for a blacklisted but still alive
client to overwrite a post-blacklist update, resulting in data
corruption.
Waiting for latest osdmap in ceph_monc_blacklist_add() and thus using
the post-blacklist epoch for all post-blacklist requests ensures that
all such requests "wait" for the blacklist to come into force on their
respective OSDs.
Cc: [email protected]
Fixes: 6305a3b41515 ("libceph: support for blacklisting clients")
Signed-off-by: Ilya Dryomov <[email protected]>
Reviewed-by: Jason Dillaman <[email protected]>
|
|
skl_update_pipe_wm() is quite pointless now. Just inline it into
skl_compute_wm().
v2: s/skl_build_pipe_wm/skl_update_pipe_wm/ in the commit message (Matt)
Cc: Neel Desai <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Matt Roper <[email protected]>
|
|
{skl,icl}_build_plane_wm() don't need to be passed the pipe_wm, so
don't. And skl_build_pipe_wm() can easily dig it out itself.
Cc: Neel Desai <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Matt Roper <[email protected]>
|
|
Clean up skl_allocate_pipe_ddb() a bit by moving the 'wm' variable
to tighter scope. We'll also consitify it where appropriate.
Also initialize plane_alloc/uv_plane_alloc when decrlaring them
rather than later.
v2: Update commit message (Matt)
Cc: Neel Desai <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Matt Roper <[email protected]>
|
|
Currently we disable all the watermarks above the selected max
level for every plane. That would mean that the cursor's watermarks
may also get modified when another plane causes the selected
max watermark level to change. That is not so great as we would
like to keep the cursor as indepenedent as possible to avoid
having to throttle it in resposne to other plane activity.
To avoid that let's keep the watermarks enabled even for levels
above the max selected watermark level, iff the plane has enough
ddb for that particular level. This way the cursor's enabled
watermarks only depend on the cursor itself. This is safe because
the hardware will never choose to use a watermark level unless
all enabled planes have also enabled that level.
Cc: Neel Desai <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Matt Roper <[email protected]>
|
|
We use a fixed ddb allocation for the cursor. Now the calculation
actually makes sure we have enough ddb space, but let's double check
anyway.
Cc: Neel Desai <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Matt Roper <[email protected]>
|
|
Currently we just assume that 32 or 8 blocks of ddb is sufficient
for the cursor. The 32 might be, but the 8 is certainly not. The
minimum we need is at least what level 0 watermarks need, but that
is a bit restrictive, so instead let's calculate what level 7
would need for a 256x256 cursor. We'll use that to determine the
fixed ddb allocation for the cursor. This way the cursor will never
be responsible for missing out on deeper power saving states.
v2: Loop to make sure this works even if some wm levels are
totally disabled (latency==0)
Cc: Neel Desai <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Reviewed-by: Matt Roper <[email protected]> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Extract the meat of skl_compute_plane_wm_params() into a lower
level helper that doesn't depend on the plane state. We'll
reuse this for the cursor ddb allocation calculations.
Cc: Neel Desai <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Matt Roper <[email protected]>
|