Age | Commit message (Collapse) | Author | Files | Lines |
|
The commit af7ec1383361 ("bpf: Add bpf_skc_to_tcp6_sock() helper")
introduces RET_PTR_TO_BTF_ID_OR_NULL and
the commit eaa6bcb71ef6 ("bpf: Introduce bpf_per_cpu_ptr()")
introduces RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL.
Note that for RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL, the reg0->type
could become PTR_TO_MEM_OR_NULL which is not covered by
BPF_PROBE_MEM.
The BPF_REG_0 will then hold a _OR_NULL pointer type. This _OR_NULL
pointer type requires the bpf program to explicitly do a NULL check first.
After NULL check, the verifier will mark all registers having
the same reg->id as safe to use. However, the reg->id
is not set for those new _OR_NULL return types. One of the ways
that may be wrong is, checking NULL for one btf_id typed pointer will
end up validating all other btf_id typed pointers because
all of them have id == 0. The later tests will exercise
this path.
To fix it and also avoid similar issue in the future, this patch
moves the id generation logic out of each individual RET type
test in check_helper_call(). Instead, it does one
reg_type_may_be_null() test and then do the id generation
if needed.
This patch also adds a WARN_ON_ONCE in mark_ptr_or_null_reg()
to catch future breakage.
The _OR_NULL pointer usage in the bpf_iter_reg.ctx_arg_info is
fine because it just happens that the existing id generation after
check_ctx_access() has covered it. It is also using the
reg_type_may_be_null() to decide if id generation is needed or not.
Fixes: af7ec1383361 ("bpf: Add bpf_skc_to_tcp6_sock() helper")
Fixes: eaa6bcb71ef6 ("bpf: Introduce bpf_per_cpu_ptr()")
Signed-off-by: Martin KaFai Lau <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Pull more xfs updates from Darrick Wong:
"The second large pile of new stuff for 5.10, with changes even more
monumental than last week!
We are formally announcing the deprecation of the V4 filesystem format
in 2030. All users must upgrade to the V5 format, which contains
design improvements that greatly strengthen metadata validation,
supports reflink and online fsck, and is the intended vehicle for
handling timestamps past 2038. We're also deprecating the old Irix
behavioral tweaks in September 2025.
Coming along for the ride are two design changes to the deferred
metadata ops subsystem. One of the improvements is to retain correct
logical ordering of tasks and subtasks, which is a more logical design
for upper layers of XFS and will become necessary when we add atomic
file range swaps and commits. The second improvement to deferred ops
improves the scalability of the log by helping the log tail to move
forward during long-running operations. This reduces log contention
when there are a large number of threads trying to run transactions.
In addition to that, this fixes numerous small bugs in log recovery;
refactors logical intent log item recovery to remove the last
remaining place in XFS where we could have nested transactions; fixes
a couple of ways that intent log item recovery could fail in ways that
wouldn't have happened in the regular commit paths; fixes a deadlock
vector in the GETFSMAP implementation (which improves its performance
by 20%); and fixes serious bugs in the realtime growfs, fallocate, and
bitmap handling code.
Summary:
- Deprecate the V4 filesystem format, some disused mount options, and
some legacy sysctl knobs now that we can support dates into the
25th century. Note that removal of V4 support will not happen until
the early 2030s.
- Fix some probles with inode realtime flag propagation.
- Fix some buffer handling issues when growing a rt filesystem.
- Fix a problem where a BMAP_REMAP unmap call would free rt extents
even though the purpose of BMAP_REMAP is to avoid freeing the
blocks.
- Strengthen the dabtree online scrubber to check hash values on
child dabtree blocks.
- Actually log new intent items created as part of recovering log
intent items.
- Fix a bug where quotas weren't attached to an inode undergoing bmap
intent item recovery.
- Fix a buffer overrun problem with specially crafted log buffer
headers.
- Various cleanups to type usage and slightly inaccurate comments.
- More cleanups to the xattr, log, and quota code.
- Don't run the (slower) shared-rmap operations on attr fork
mappings.
- Fix a bug where we failed to check the LSN of finobt blocks during
replay and could therefore overwrite newer data with older data.
- Clean up the ugly nested transaction mess that log recovery uses to
stage intent item recovery in the correct order by creating a
proper data structure to capture recovered chains.
- Use the capture structure to resume intent item chains with the
same log space and block reservations as when they were captured.
- Fix a UAF bug in bmap intent item recovery where we failed to
maintain our reference to the incore inode if the bmap operation
needed to relog itself to continue.
- Rearrange the defer ops mechanism to finish newly created subtasks
of a parent task before moving on to the next parent task.
- Automatically relog intent items in deferred ops chains if doing so
would help us avoid pinning the log tail. This will help fix some
log scaling problems now and will facilitate atomic file updates
later.
- Fix a deadlock in the GETFSMAP implementation by using an internal
memory buffer to reduce indirect calls and copies to userspace,
thereby improving its performance by ~20%.
- Fix various problems when calling growfs on a realtime volume would
not fully update the filesystem metadata.
- Fix broken Kconfig asking about deprecated XFS when XFS is
disabled"
* tag 'xfs-5.10-merge-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (48 commits)
xfs: fix Kconfig asking about XFS_SUPPORT_V4 when XFS_FS=n
xfs: fix high key handling in the rt allocator's query_range function
xfs: annotate grabbing the realtime bitmap/summary locks in growfs
xfs: make xfs_growfs_rt update secondary superblocks
xfs: fix realtime bitmap/summary file truncation when growing rt volume
xfs: fix the indent in xfs_trans_mod_dquot
xfs: do the ASSERT for the arguments O_{u,g,p}dqpp
xfs: fix deadlock and streamline xfs_getfsmap performance
xfs: limit entries returned when counting fsmap records
xfs: only relog deferred intent items if free space in the log gets low
xfs: expose the log push threshold
xfs: periodically relog deferred intent items
xfs: change the order in which child and parent defer ops are finished
xfs: fix an incore inode UAF in xfs_bui_recover
xfs: clean up xfs_bui_item_recover iget/trans_alloc/ilock ordering
xfs: clean up bmap intent item recovery checking
xfs: xfs_defer_capture should absorb remaining transaction reservation
xfs: xfs_defer_capture should absorb remaining block reservations
xfs: proper replay of deferred ops queued during log recovery
xfs: remove XFS_LI_RECOVERED
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse updates from Miklos Szeredi:
- Support directly accessing host page cache from virtiofs. This can
improve I/O performance for various workloads, as well as reducing
the memory requirement by eliminating double caching. Thanks to Vivek
Goyal for doing most of the work on this.
- Allow automatic submounting inside virtiofs. This allows unique
st_dev/ st_ino values to be assigned inside the guest to files
residing on different filesystems on the host. Thanks to Max Reitz
for the patches.
- Fix an old use after free bug found by Pradeep P V K.
* tag 'fuse-update-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (25 commits)
virtiofs: calculate number of scatter-gather elements accurately
fuse: connection remove fix
fuse: implement crossmounts
fuse: Allow fuse_fill_super_common() for submounts
fuse: split fuse_mount off of fuse_conn
fuse: drop fuse_conn parameter where possible
fuse: store fuse_conn in fuse_req
fuse: add submount support to <uapi/linux/fuse.h>
fuse: fix page dereference after free
virtiofs: add logic to free up a memory range
virtiofs: maintain a list of busy elements
virtiofs: serialize truncate/punch_hole and dax fault path
virtiofs: define dax address space operations
virtiofs: add DAX mmap support
virtiofs: implement dax read/write operations
virtiofs: introduce setupmapping/removemapping commands
virtiofs: implement FUSE_INIT map_alignment field
virtiofs: keep a list of free dax memory ranges
virtiofs: add a mount option to enable dax
virtiofs: set up virtio_fs dax_device
...
|
|
workaround
alignment_handler currently only tests the unaligned cases but it can
also be useful for testing the workaround for the P9N DD2.1 vector CI
load issue fixed by p9_hmi_special_emu(). This workaround was
introduced in 5080332c2c89 ("powerpc/64s: Add workaround for P9 vector
CI load issue").
This changes the loop to start from offset 0 rather than 1 so that we
test the kernel emulation in p9_hmi_special_emu().
Signed-off-by: Michael Neuling <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
__get_user_atomic_128_aligned() stores to kaddr using stvx which is a
VMX store instruction, hence kaddr must be 16 byte aligned otherwise
the store won't occur as expected.
Unfortunately when we call __get_user_atomic_128_aligned() in
p9_hmi_special_emu(), the buffer we pass as kaddr (ie. vbuf) isn't
guaranteed to be 16B aligned. This means that the write to vbuf in
__get_user_atomic_128_aligned() has the bottom bits of the address
truncated. This results in other local variables being
overwritten. Also vbuf will not contain the correct data which results
in the userspace emulation being wrong and hence undetected user data
corruption.
In the past we've been mostly lucky as vbuf has ended up aligned but
this is fragile and isn't always true. CONFIG_STACKPROTECTOR in
particular can change the stack arrangement enough that our luck runs
out.
This issue only occurs on POWER9 Nimbus <= DD2.1 bare metal.
The fix is to align vbuf to a 16 byte boundary.
Fixes: 5080332c2c89 ("powerpc/64s: Add workaround for P9 vector CI load issue")
Cc: [email protected] # v4.15+
Signed-off-by: Michael Neuling <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs
Pull zonefs updates from Damien Le Moal:
"Add an 'explicit-open' mount option to automatically issue a
REQ_OP_ZONE_OPEN command to the device whenever a sequential zone file
is open for writing for the first time.
This avoids 'insufficient zone resources' errors for write operations
on some drives with limited zone resources or on ZNS drives with a
limited number of active zones. From Johannes"
* tag 'zonefs-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
zonefs: document the explicit-open mount option
zonefs: open/close zone on file open/close
zonefs: provide no-lock zonefs_io_error variant
zonefs: introduce helper for zone management
|
|
Set range and remove the set_time check. This is a classic BCD RTC.
Signed-off-by: Alexandre Belloni <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
This allows further improvement of the driver.
Signed-off-by: Alexandre Belloni <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
tm_wday is never checked for validity and it is not read back in
r9701_get_datetime. Avoid setting it to stop tripping static checkers:
drivers/rtc/rtc-r9701.c:109 r9701_set_datetime()
error: undefined (user controlled) shift '1 << dt->tm_wday'
Reported-by: Dan Carpenter <[email protected]>
Signed-off-by: Alexandre Belloni <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The RTC core already sets to zero the struct rtc_tie it passes to the
driver, avoid doing it a second time.
Signed-off-by: Alexandre Belloni <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
It doesn't make sense to set the RTC to a default value at probe time. Let
the core handle invalid date and time.
Signed-off-by: Alexandre Belloni <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Commit 22652ba72453 ("rtc: stop validating rtc_time in .read_time") removed
the code but not the associated comment.
Signed-off-by: Alexandre Belloni <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
New driver for the Microcrystal RV-3032, including support for:
- Date/time
- Alarms
- Low voltage detection
- Trickle charge
- Trimming
- Clkout
- RAM
- EEPROM
- Temperature sensor
Signed-off-by: Alexandre Belloni <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Document the Microcrystal RV-3032 device tree bindings
Signed-off-by: Alexandre Belloni <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Some RTCs have a trickle charge that is able to output different voltages
depending on the type of the connected auxiliary power (battery, supercap,
...). Add a property allowing to specify the necessary voltage.
Signed-off-by: Alexandre Belloni <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
In crypt_message, when smb2_get_enc_key returns error, we need to
return the error back to the caller. If not, we end up processing
the message further, causing a kernel oops due to unwarranted access
of memory.
Call Trace:
smb3_receive_transform+0x120/0x870 [cifs]
cifs_demultiplex_thread+0xb53/0xc20 [cifs]
? cifs_handle_standard+0x190/0x190 [cifs]
kthread+0x116/0x130
? kthread_park+0x80/0x80
ret_from_fork+0x1f/0x30
Signed-off-by: Shyam Prasad N <[email protected]>
Reviewed-by: Pavel Shilovsky <[email protected]>
Reviewed-by: Ronnie Sahlberg <[email protected]>
CC: Stable <[email protected]>
Signed-off-by: Steve French <[email protected]>
|
|
update smb encryption code to set 32 byte key length and to
set gcm256 when requested on mount.
Signed-off-by: Steve French <[email protected]>
|
|
Now that 256 bit encryption can be negotiated, update
names of the nonces to match the updated official protocol
documentation (e.g. AES_GCM_NONCE instead of AES_128GCM_NONCE)
since they apply to both 128 bit and 256 bit encryption.
Signed-off-by: Steve French <[email protected]>
Reviewed-by: Pavel Shilovsky <[email protected]>
|
|
If server does not support AES-256-GCM and it was required on mount, print
warning message. Also log and return a different error message (EOPNOTSUPP)
when encryption mechanism is not supported vs the case when an unknown
unrequested encryption mechanism could be returned (EINVAL).
Signed-off-by: Steve French <[email protected]>
Reviewed-by: Pavel Shilovsky <[email protected]>
|
|
io_link_timeout_fn() removes REQ_F_LINK_TIMEOUT from the link head's
flags, it's not atomic and may race with what the head is doing.
If io_link_timeout_fn() doesn't clear the flag, as forced by this patch,
then it may happen that for "req -> link_timeout1 -> link_timeout2",
__io_kill_linked_timeout() would find link_timeout2 and try to cancel
it, so miscounting references. Teach it to ignore such double timeouts
by marking the active one with a new flag in io_prep_linked_timeout().
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Move INIT_HLIST_NODE(&req->hash_node) into __io_arm_poll_handler(), so
that it doesn't duplicated and common poll code would be responsible for
it.
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
io_poll_task_handler() doesn't add clarity, inline it in its only user.
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
io_poll_add_prep() doesn't need to verify ->file because it's already
done in io_init_req().
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
ctx->cached_cq_overflow is changed only under completion_lock. Convert
it from atomic_t to just int, and mark all places when it's read without
lock with READ_ONCE, which guarantees atomicity (relaxed ordering).
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Inline io_fail_links() and kill extra io_cqring_ev_posted().
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Don't take an identity on personality/creds init only to drop it a few
lines after. Extract a function which prepares req->work but leaves it
without identity.
Note: it's safe to not check REQ_F_WORK_INITIALIZED there because it's
nobody had a chance to init it before io_init_req().
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Use IO_WQ_WORK_CREDS to figure out if req has creds to be used.
Since recently it should rely only on flags, but not value of
work.creds.
Signed-off-by: Pavel Begunkov <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
A break is not needed if it is preceded by a return.
Signed-off-by: Tom Rix <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
On Tigerlake, we are seeing a repeat of commit d8f505311717 ("drm/i915/icl:
Forcibly evict stale csb entries") where, presumably, due to a missing
Global Observation Point synchronisation, the write pointer of the CSB
ringbuffer is updated _prior_ to the contents of the ringbuffer. That is
we see the GPU report more context-switch entries for us to parse, but
those entries have not been written, leading us to process stale events,
and eventually report a hung GPU.
However, this effect appears to be much more severe than we previously
saw on Icelake (though it might be best if we try the same approach
there as well and measure), and Bruce suggested the good idea of resetting
the CSB entry after use so that we can detect when it has been updated by
the GPU. By instrumenting how long that may be, we can set a reliable
upper bound for how long we should wait for:
513 late, avg of 61 retries (590 ns), max of 1061 retries (10099 ns)
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2045
References: d8f505311717 ("drm/i915/icl: Forcibly evict stale csb entries")
References: HSDES#22011327657, HSDES#1508287568
Suggested-by: Bruce Chang <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Cc: Bruce Chang <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Cc: [email protected] # v5.4
Reviewed-by: Mika Kuoppala <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 233c1ae3c83f21046c6c4083da904163ece8f110)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
A CSB entry is 64b, and it is simpler for us to treat it as an array of
64b entries than as an array of pairs of 32b entries.
Signed-off-by: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Reviewed-by: Mika Kuoppala <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit f24a44e52fbc9881fc5f3bcef536831a15a439f3)
(cherry picked from commit 3d4dbe0e0f0d04ebcea917b7279586817da8cf46)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Commits
ca0e22d4f011 ("x86/boot/compressed/64: Always switch to own page table")
8570978ea030 ("x86/boot/compressed/64: Don't pre-map memory in KASLR code")
set up a new page table in the decompressor stub, but without explicit
mappings for boot_params and the kernel command line, relying on the #PF
handler instead.
This is fragile, as boot_params and the command line mappings are
required for the main kernel. If EARLY_PRINTK and RANDOMIZE_BASE are
disabled, a QEMU/OVMF boot never accesses the command line in the
decompressor stub, and so it never gets mapped. The main kernel accesses
it from the identity mapping if AMD_MEM_ENCRYPT is enabled, and will
crash.
Fix this by adding back the explicit mapping of boot_params and the
command line.
Note: the changes also removed the explicit mapping of the main kernel,
with the result that .bss and .brk may not be in the identity mapping,
but those don't get accessed by the main kernel before it switches to
its own page tables.
[ bp: Pass boot_params with a MOV %rsp... instead of PUSH/POP. Use
block formatting for the comment. ]
Signed-off-by: Arvind Sankar <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Joerg Roedel <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
perf_event.h has macros that define the field offsets in the
data_src bitmask in perf records. The SNOOPX and REMOTE offsets
were both 37. These are distinct fields, and the bitfield layout
in perf_mem_data_src confirms that SNOOPX should be at offset 38.
Fixes: 52839e653b5629bd ("perf tools: Add support for printing new mem_info encodings")
Signed-off-by: Al Grant <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
During error capture, we need to take a reference to the vma from before
the reset in order to catpure the contents of the vma later. Currently
we are using both an active reference and a kref, but due to nature of
the i915_vma reference handling, that kref is on the vma->obj and not
the vma itself. This means the vma may be destroyed as soon as it is
idle, that is in between the i915_active_release(&vma->active) and the
i915_vma_put(vma):
<3> [197.866181] BUG: KASAN: use-after-free in intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
<3> [197.866339] Read of size 8 at addr ffff8881258cb800 by task gem_exec_captur/1041
<3> [197.866467]
<4> [197.866512] CPU: 2 PID: 1041 Comm: gem_exec_captur Not tainted 5.9.0-g5e4234f97efba-kasan_200+ #1
<4> [197.866521] Hardware name: Intel Corp. Broxton P/Apollolake RVP1A, BIOS APLKRVPA.X64.0150.B11.1608081044 08/08/2016
<4> [197.866530] Call Trace:
<4> [197.866549] dump_stack+0x99/0xd0
<4> [197.866760] ? intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
<4> [197.866783] print_address_description.constprop.8+0x3e/0x60
<4> [197.866797] ? kmsg_dump_rewind_nolock+0xd4/0xd4
<4> [197.866819] ? lockdep_hardirqs_off+0xd4/0x120
<4> [197.867037] ? intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
<4> [197.867249] ? intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
<4> [197.867270] kasan_report.cold.10+0x1f/0x37
<4> [197.867492] ? intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
<4> [197.867710] intel_engine_coredump_add_vma+0x36c/0x4a0 [i915]
<4> [197.867949] i915_gpu_coredump.part.29+0x150/0x7b0 [i915]
<4> [197.868186] i915_capture_error_state+0x5e/0xc0 [i915]
<4> [197.868396] intel_gt_handle_error+0x6eb/0xa20 [i915]
<4> [197.868624] ? intel_gt_reset_global+0x370/0x370 [i915]
<4> [197.868644] ? check_flags+0x50/0x50
<4> [197.868662] ? __lock_acquire+0xd59/0x6b00
<4> [197.868678] ? register_lock_class+0x1ad0/0x1ad0
<4> [197.868944] i915_wedged_set+0xcf/0x1b0 [i915]
<4> [197.869147] ? i915_wedged_get+0x90/0x90 [i915]
<4> [197.869371] ? i915_wedged_get+0x90/0x90 [i915]
<4> [197.869398] simple_attr_write+0x153/0x1c0
<4> [197.869428] full_proxy_write+0xee/0x180
<4> [197.869442] ? __sb_start_write+0x1f3/0x310
<4> [197.869465] vfs_write+0x1a3/0x640
<4> [197.869492] ksys_write+0xec/0x1c0
<4> [197.869507] ? __ia32_sys_read+0xa0/0xa0
<4> [197.869525] ? lockdep_hardirqs_on_prepare+0x32b/0x4e0
<4> [197.869541] ? syscall_enter_from_user_mode+0x1c/0x50
<4> [197.869566] do_syscall_64+0x33/0x80
<4> [197.869579] entry_SYSCALL_64_after_hwframe+0x44/0xa9
<4> [197.869590] RIP: 0033:0x7fd8b7aee281
<4> [197.869604] Code: c3 0f 1f 84 00 00 00 00 00 48 8b 05 59 8d 20 00 c3 0f 1f 84 00 00 00 00 00 8b 05 8a d1 20 00 85 c0 75 16 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 57 f3 c3 0f 1f 44 00 00 41 54 55 49 89 d4 53
<4> [197.869613] RSP: 002b:00007ffea3b72008 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
<4> [197.869625] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fd8b7aee281
<4> [197.869633] RDX: 0000000000000002 RSI: 00007fd8b81a82e7 RDI: 000000000000000d
<4> [197.869641] RBP: 0000000000000002 R08: 0000000000000000 R09: 0000000000000034
<4> [197.869650] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fd8b81a82e7
<4> [197.869658] R13: 000000000000000d R14: 0000000000000000 R15: 0000000000000000
<3> [197.869707]
<3> [197.869757] Allocated by task 1041:
<4> [197.869833] kasan_save_stack+0x19/0x40
<4> [197.869843] __kasan_kmalloc.constprop.5+0xc1/0xd0
<4> [197.869853] kmem_cache_alloc+0x106/0x8e0
<4> [197.870059] i915_vma_instance+0x212/0x1930 [i915]
<4> [197.870270] eb_lookup_vmas+0xe06/0x1d10 [i915]
<4> [197.870475] i915_gem_do_execbuffer+0x131d/0x4080 [i915]
<4> [197.870682] i915_gem_execbuffer2_ioctl+0x103/0x5d0 [i915]
<4> [197.870701] drm_ioctl_kernel+0x1d2/0x270
<4> [197.870710] drm_ioctl+0x40d/0x85c
<4> [197.870721] __x64_sys_ioctl+0x10d/0x170
<4> [197.870731] do_syscall_64+0x33/0x80
<4> [197.870740] entry_SYSCALL_64_after_hwframe+0x44/0xa9
<3> [197.870748]
<3> [197.870798] Freed by task 22:
<4> [197.870865] kasan_save_stack+0x19/0x40
<4> [197.870875] kasan_set_track+0x1c/0x30
<4> [197.870884] kasan_set_free_info+0x1b/0x30
<4> [197.870894] __kasan_slab_free+0x111/0x160
<4> [197.870903] kmem_cache_free+0xcd/0x710
<4> [197.871109] i915_vma_parked+0x618/0x800 [i915]
<4> [197.871307] __gt_park+0xdb/0x1e0 [i915]
<4> [197.871501] ____intel_wakeref_put_last+0xb1/0x190 [i915]
<4> [197.871516] process_one_work+0x8dc/0x15d0
<4> [197.871525] worker_thread+0x82/0xb30
<4> [197.871535] kthread+0x36d/0x440
<4> [197.871545] ret_from_fork+0x22/0x30
<3> [197.871553]
<3> [197.871602] The buggy address belongs to the object at ffff8881258cb740
which belongs to the cache i915_vma of size 968
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2553
Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: <[email protected]> # v5.5+
Reviewed-by: Matthew Auld <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 178536b8292ecd118f59d2fac4509c7e70b99854)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
We may try to preempt the currently executing request, only to find that
after unravelling all the dependencies that the original executing
context is still the earliest in the topological sort and re-submitted
back to HW (if we do detect some change in the ELSP that requires
re-submission). However, due to the way we check for wrap-around during
the unravelling, we mark any context that has been submitted just once
(i.e. with the rq->wa_tail set, but the ring->tail earlier) as
potentially wrapping and requiring a forced restore on resubmission.
This was expected to be not a problem, as it was anticipated that most
unwinding for preemption would result in a context switch and the few
that did not would be lost in the noise. It did not take long for
someone to find one particular workload where the cost of those extra
context restores was measurable.
However, since we know the wa_tail is of fixed size, and we know that a
request must be larger than the wa_tail itself, we can safely maintain
the check for request wrapping and check against a slightly future point
in the ring that includes an expected wa_tail. (That is if the
ring->tail is already set to rq->wa_tail, including another 8 bytes in
the check does not invalidate the incremental wrap detection.)
Fixes: 8ab3a3812aa9 ("drm/i915/gt: Incrementally check for rewinding")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Cc: Bruce Chang <[email protected]>
Cc: Ramalingam C <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: <[email protected]> # v5.4+
Reviewed-by: Mika Kuoppala <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit bb65548e3c6e299175a9e8c3e24b2b9577656a5d)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
When running gem_exec_nop, it floods the system with many requests (with
the goal of userspace submitting faster than the HW can process a single
empty batch). This causes the driver to continually resubmit new
requests onto the end of an active context, a flood of lite-restore
preemptions. If we time this just right, Tigerlake hangs.
Inserting a small delay between the processing of CS events and
submitting the next context, prevents the hang. Naturally it does not
occur with debugging enabled. The suspicion then is that this is related
to the issues with the CS event buffer, and inserting an mmio read of
the CS pointer status appears to be very successful in preventing the
hang. Other registers, or uncached reads, or plain mb, do not prevent
the hang, suggesting that register is key -- but that the hang can be
prevented by a simple udelay, suggests it is just a timing issue like
that encountered by commit 233c1ae3c83f ("drm/i915/gt: Wait for CSB
entries on Tigerlake"). Also note that the hang is not prevented by
applying CTX_DESC_FORCE_RESTORE, or by inserting a delay on the GPU
between requests.
Signed-off-by: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Cc: Bruce Chang <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: [email protected]
Acked-by: Mika Kuoppala <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 6ca7217dffaf1abba91558e67a2efb655ac91405)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Matthew Auld noted that on more recent systems (such as the parser for
gen9) we may have objects that are larger than expected by the GEM uAPI
(i.e. greater than u32). These objects would have incorrect implicit
batch lengths, causing the parser to reject them for being incomplete,
or worse.
Based on a patch by Matthew Auld.
Reported-by: Matthew Auld <[email protected]>
Fixes: 435e8fc059db ("drm/i915: Allow parsing of unsized batches")
Testcase: igt/gem_exec_params/larger-than-life-batch
Signed-off-by: Chris Wilson <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Cc: Jon Bloomfield <[email protected]>
Reviewed-by: Matthew Auld <[email protected]>
Cc: [email protected]
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 57b2d834bf235daab388c3ba12d035c820ae09c6)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
during fbdev init
Currently we leave the cache_level of the initial fb obj
set to NONE. This means on eLLC machines the first pin_to_display()
will try to switch it to WT which requires a vma unbind+bind.
If that happens during the fbdev initialization rcu does not
seem operational which causes the unbind to get stuck. To
most appearances this looks like a dead machine on boot.
Avoid the unbind by already marking the object cache_level
as WT when creating it. We still do an excplicit ggtt pin
which will rewrite the PTEs anyway, so they will match whatever
cache level we set.
Cc: <[email protected]> # v5.7+
Suggested-by: Chris Wilson <[email protected]>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381
Signed-off-by: Ville Syrjälä <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit d46b60a2e8d246f1f0faa38e52f4f5a73858c338)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
In order to avoid functional breakage of mis-programmed applications that
have grown to depend on unused MOCS entries, we are programming
those entries to be equal to fully cached ("L3 + LLC") entry.
These reserved and unspecified entries should not be used as they may be
changed to less performant variants with better coherency in the future
if more entries are needed.
v2: As suggested by Lucas De Marchi to utilise __init_mocs_table for
programming default value, setting I915_MOCS_PTE index of tgl_mocs_table
with desired value.
Cc: Chris Wilson <[email protected]>
Cc: Lucas De Marchi <[email protected]>
Cc: Tomasz Lis <[email protected]>
Cc: Matt Roper <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Francisco Jerez <[email protected]>
Cc: Mathew Alwin <[email protected]>
Cc: Mcguire Russell W <[email protected]>
Cc: Spruit Neil R <[email protected]>
Cc: Zhou Cheng <[email protected]>
Cc: Benemelis Mike G <[email protected]>
Signed-off-by: Ayaz A Siddiqui <[email protected]>
Reviewed-by: Lucas De Marchi <[email protected]>
Acked-by: Joonas Lahtinen <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Cc: [email protected]
(cherry picked from commit 4d8a5cfe3b131f60903949f998c5961cc922e0b0)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
In commit 79946723092b ("drm/i915: Assume 100% brightness when not in
DPCD control mode"), we fixed the brightness level when DPCD control was
not active to max brightness. This is as good as we can guess since most
backlights go on full when uncontrolled.
However in doing so we changed the semantics of the initial
'backlight.enabled' value. At least on Pixelbooks, they were relying
on the brightness level in DP_EDP_BACKLIGHT_BRIGHTNESS_MSB to be 0 on
boot such that enabled would be false. This causes the device to be
enabled when the brightness is set. Without this, brightness control
doesn't work. So by changing brightness to max, we also flipped enabled
to be true on boot.
To fix this, make enabled a function of brightness and backlight control
mechanism.
Fixes: 79946723092b ("drm/i915: Assume 100% brightness when not in DPCD control mode")
Cc: Lyude Paul <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Juha-Pekka Heikkila <[email protected]>
Cc: "Ville Syrjälä" <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Kevin Chowski <[email protected]>>
Signed-off-by: Sean Paul <[email protected]>
Reviewed-by: Lyude Paul <[email protected]>
Signed-off-by: Lyude Paul <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 4ade8f31c25bef7ce7ed4d7cbac17df7c4bad850)
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Fix an inverted flag for intercepting x2APIC MSRs and intercept writes
by default, even when APICV is enabled.
Fixes: 3eb900173c71 ("KVM: x86: VMX: Prevent MSR passthrough when MSR access is denied")
Co-developed-by: Peter Xu <[email protected]>
[sean: added changelog]
Signed-off-by: Sean Christopherson <[email protected]>
Message-Id: <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
|
|
Recently, in commit 6735b4632def ("Fonts: Support FONT_EXTRA_WORDS macros
for built-in fonts"), we wrapped each of our built-in data buffers in a
`font_data` structure, in order to use the following macros on them, see
include/linux/font.h:
#define REFCOUNT(fd) (((int *)(fd))[-1])
#define FNTSIZE(fd) (((int *)(fd))[-2])
#define FNTCHARCNT(fd) (((int *)(fd))[-3])
#define FNTSUM(fd) (((int *)(fd))[-4])
#define FONT_EXTRA_WORDS 4
Do the same thing to our new 6x8 font. For built-in fonts, currently we
only use FNTSIZE(). Since this is only a temporary solution for an
out-of-bounds issue in the framebuffer layer (see commit 5af08640795b
("fbcon: Fix global-out-of-bounds read in fbcon_get_font()")), all the
three other fields are intentionally set to zero in order to discourage
using these negative-indexing macros.
Signed-off-by: Peilin Ye <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/926453876c92caac34cba8545716a491754d04d5.1603037079.git.yepeilin.cs@gmail.com
|
|
Recently we added a new 6x8 font in commit e2028c8e6bf9 ("lib/fonts: add
font 6x8 for OLED display"). Add its name to the "compiled-in fonts"
list.
Signed-off-by: Peilin Ye <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Signed-off-by: Hubert Jasudowicz <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
We have the raw cached freq to reduce the chance in calling cpufreq
driver where it could be costly in some arch/SoC.
Currently, the raw cached freq is reset in sugov_update_single() when
it avoids frequency reduction (which is not desirable sometimes), but
it is better to restore the previous value of it in that case,
because it may not change in the next cycle and it is not necessary
to change the CPU frequency then.
Adapted from https://android-review.googlesource.com/1352810/
Signed-off-by: Wei Wang <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
[ rjw: Subject edit and changelog rewrite ]
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
Mikhail reported a lockdep spat detailing how __zram_bvec_read() and
__zram_bvec_write() use zstrm->lock and zspage->lock in opposite order.
Reported-by: Mikhail Gavrilov <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Mikhail Gavrilov <[email protected]>
Acked-by: Minchan Kim <[email protected]>
Acked-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
acpi-cpufreq has a old quirk that overrides the _PSD table supplied by
BIOS on AMD CPUs. However the _PSD table of new AMD CPUs (Family 19h+)
now accurately reports the P-state dependency of CPU cores. Hence this
quirk needs to be fixed in order to support new CPUs' frequency control.
Fixes: acd316248205 ("acpi-cpufreq: Add quirk to disable _PSD usage on all AMD CPUs")
Signed-off-by: Wei Huang <[email protected]>
[ rjw: Subject edit ]
Cc: 3.10+ <[email protected]> # 3.10+
Signed-off-by: Rafael J. Wysocki <[email protected]>
|
|
commit 021a24460dc2 ("block: add QUEUE_FLAG_NOWAIT") adds a new helper
function blk_queue_nowait() to check if the bdev supports handling of
REQ_NOWAIT or not. Since then bio-based dm device can also support
REQ_NOWAIT, and currently only dm-linear supports that since
commit 6abc49468eea ("dm: add support for REQ_NOWAIT and enable it for
linear target").
Signed-off-by: Jeffle Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
The eventfd context is used as our irqbypass token, therefore if an
eventfd is re-used, our token is the same. The irqbypass code will
return an -EBUSY in this case, but we'll still attempt to unregister
the producer, where if that duplicate token still exists, results in
removing the wrong object. Clear the token of failed producers so
that they harmlessly fall out when unregistered.
Fixes: 6d7425f109d2 ("vfio: Register/unregister irq_bypass_producer")
Reported-by: guomin chen <[email protected]>
Tested-by: guomin chen <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>
|
|
The vfio_fsl_mc_reflck_attach function may return, on success path,
an uninitialized variable. Fix the problem by initializing the return
variable to 0.
Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: f2ba7e8c947b ("vfio/fsl-mc: Added lock support in preparation for interrupt handling")
Reported-by: Colin Ian King <[email protected]>
Signed-off-by: Diana Craciun <[email protected]>
Signed-off-by: Alex Williamson <[email protected]>
|
|
Type casts needed on 32-bit systems are missing in two places in the
GPE register access code, so add them.
Fixes: 7a8379eb41a4 ("ACPICA: Add support for using logical addresses of GPE blocks")
Reported-and-tested-by: Matthieu Baerts <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
|