<feed xmlns='http://www.w3.org/2005/Atom'>
<title>blaster4385/linux-IllusionX/io_uring, branch v6.12.1</title>
<subtitle>Linux kernel with personal config changes for arch linux</subtitle>
<id>https://git.tablaster.dev/blaster4385/linux-IllusionX/atom?h=v6.12.1</id>
<link rel='self' href='https://git.tablaster.dev/blaster4385/linux-IllusionX/atom?h=v6.12.1'/>
<link rel='alternate' type='text/html' href='https://git.tablaster.dev/blaster4385/linux-IllusionX/'/>
<updated>2024-10-31T14:21:02Z</updated>
<entry>
<title>io_uring/rw: fix missing NOWAIT check for O_DIRECT start write</title>
<updated>2024-10-31T14:21:02Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2024-10-31T14:05:44Z</published>
<link rel='alternate' type='text/html' href='https://git.tablaster.dev/blaster4385/linux-IllusionX/commit/?id=1d60d74e852647255bd8e76f5a22dc42531e4389'/>
<id>urn:sha1:1d60d74e852647255bd8e76f5a22dc42531e4389</id>
<content type='text'>
When io_uring starts a write, it'll call kiocb_start_write() to bump the
super block rwsem, preventing any freezes from happening while that
write is in-flight. The freeze side will grab that rwsem for writing,
excluding any new writers from happening and waiting for existing writes
to finish. But io_uring unconditionally uses kiocb_start_write(), which
will block if someone is currently attempting to freeze the mount point.
This causes a deadlock where freeze is waiting for previous writes to
complete, but the previous writes cannot complete, as the task that is
supposed to complete them is blocked waiting on starting a new write.
This results in the following stuck trace showing that dependency with
the write blocked starting a new write:

task:fio             state:D stack:0     pid:886   tgid:886   ppid:876
Call trace:
 __switch_to+0x1d8/0x348
 __schedule+0x8e8/0x2248
 schedule+0x110/0x3f0
 percpu_rwsem_wait+0x1e8/0x3f8
 __percpu_down_read+0xe8/0x500
 io_write+0xbb8/0xff8
 io_issue_sqe+0x10c/0x1020
 io_submit_sqes+0x614/0x2110
 __arm64_sys_io_uring_enter+0x524/0x1038
 invoke_syscall+0x74/0x268
 el0_svc_common.constprop.0+0x160/0x238
 do_el0_svc+0x44/0x60
 el0_svc+0x44/0xb0
 el0t_64_sync_handler+0x118/0x128
 el0t_64_sync+0x168/0x170
INFO: task fsfreeze:7364 blocked for more than 15 seconds.
      Not tainted 6.12.0-rc5-00063-g76aaf945701c #7963

with the attempting freezer stuck trying to grab the rwsem:

task:fsfreeze        state:D stack:0     pid:7364  tgid:7364  ppid:995
Call trace:
 __switch_to+0x1d8/0x348
 __schedule+0x8e8/0x2248
 schedule+0x110/0x3f0
 percpu_down_write+0x2b0/0x680
 freeze_super+0x248/0x8a8
 do_vfs_ioctl+0x149c/0x1b18
 __arm64_sys_ioctl+0xd0/0x1a0
 invoke_syscall+0x74/0x268
 el0_svc_common.constprop.0+0x160/0x238
 do_el0_svc+0x44/0x60
 el0_svc+0x44/0xb0
 el0t_64_sync_handler+0x118/0x128
 el0t_64_sync+0x168/0x170

Fix this by having the io_uring side honor IOCB_NOWAIT, and only attempt a
blocking grab of the super block rwsem if it isn't set. For normal issue
where IOCB_NOWAIT would always be set, this returns -EAGAIN which will
have io_uring core issue a blocking attempt of the write. That will in
turn also get completions run, ensuring forward progress.

Since freezing requires CAP_SYS_ADMIN in the first place, this isn't
something that can be triggered by a regular user.

Cc: stable@vger.kernel.org # 5.10+
Reported-by: Peter Mann &lt;peter.mann@sh.cz&gt;
Link: https://lore.kernel.org/io-uring/38c94aec-81c9-4f62-b44e-1d87f5597644@sh.cz
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/rw: fix wrong NOWAIT check in io_rw_init_file()</title>
<updated>2024-10-19T15:25:45Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2024-10-19T15:16:51Z</published>
<link rel='alternate' type='text/html' href='https://git.tablaster.dev/blaster4385/linux-IllusionX/commit/?id=ae6a888a4357131c01d85f4c91fb32552dd0bf70'/>
<id>urn:sha1:ae6a888a4357131c01d85f4c91fb32552dd0bf70</id>
<content type='text'>
A previous commit improved how !FMODE_NOWAIT is dealt with, but
inadvertently negated a check whilst doing so. This caused -EAGAIN to be
returned from reading files with O_NONBLOCK set. Fix up the check for
REQ_F_SUPPORT_NOWAIT.

Reported-by: Julian Orth &lt;ju.orth@gmail.com&gt;
Link: https://github.com/axboe/liburing/issues/1270
Fixes: f7c913438533 ("io_uring/rw: allow pollable non-blocking attempts for !FMODE_NOWAIT")
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/sqpoll: ensure task state is TASK_RUNNING when running task_work</title>
<updated>2024-10-17T14:38:04Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2024-10-17T14:31:56Z</published>
<link rel='alternate' type='text/html' href='https://git.tablaster.dev/blaster4385/linux-IllusionX/commit/?id=8f7033aa4089fbaf7a33995f0f2ee6c9d7b9ca1b'/>
<id>urn:sha1:8f7033aa4089fbaf7a33995f0f2ee6c9d7b9ca1b</id>
<content type='text'>
When the sqpoll is exiting and cancels pending work items, it may need
to run task_work. If this happens from within io_uring_cancel_generic(),
then it may be under waiting for the io_uring_task waitqueue. This
results in the below splat from the scheduler, as the ring mutex may be
attempted grabbed while in a TASK_INTERRUPTIBLE state.

Ensure that the task state is set appropriately for that, just like what
is done for the other cases in io_run_task_work().

do not call blocking ops when !TASK_RUNNING; state=1 set at [&lt;0000000029387fd2&gt;] prepare_to_wait+0x88/0x2fc
WARNING: CPU: 6 PID: 59939 at kernel/sched/core.c:8561 __might_sleep+0xf4/0x140
Modules linked in:
CPU: 6 UID: 0 PID: 59939 Comm: iou-sqp-59938 Not tainted 6.12.0-rc3-00113-g8d020023b155 #7456
Hardware name: linux,dummy-virt (DT)
pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
pc : __might_sleep+0xf4/0x140
lr : __might_sleep+0xf4/0x140
sp : ffff80008c5e7830
x29: ffff80008c5e7830 x28: ffff0000d93088c0 x27: ffff60001c2d7230
x26: dfff800000000000 x25: ffff0000e16b9180 x24: ffff80008c5e7a50
x23: 1ffff000118bcf4a x22: ffff0000e16b9180 x21: ffff0000e16b9180
x20: 000000000000011b x19: ffff80008310fac0 x18: 1ffff000118bcd90
x17: 30303c5b20746120 x16: 74657320313d6574 x15: 0720072007200720
x14: 0720072007200720 x13: 0720072007200720 x12: ffff600036c64f0b
x11: 1fffe00036c64f0a x10: ffff600036c64f0a x9 : dfff800000000000
x8 : 00009fffc939b0f6 x7 : ffff0001b6327853 x6 : 0000000000000001
x5 : ffff0001b6327850 x4 : ffff600036c64f0b x3 : ffff8000803c35bc
x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000e16b9180
Call trace:
 __might_sleep+0xf4/0x140
 mutex_lock+0x84/0x124
 io_handle_tw_list+0xf4/0x260
 tctx_task_work_run+0x94/0x340
 io_run_task_work+0x1ec/0x3c0
 io_uring_cancel_generic+0x364/0x524
 io_sq_thread+0x820/0x124c
 ret_from_fork+0x10/0x20

Cc: stable@vger.kernel.org
Fixes: af5d68f8892f ("io_uring/sqpoll: manage task_work privately")
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/rsrc: ignore dummy_ubuf for buffer cloning</title>
<updated>2024-10-16T13:09:25Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2024-10-16T13:09:25Z</published>
<link rel='alternate' type='text/html' href='https://git.tablaster.dev/blaster4385/linux-IllusionX/commit/?id=858e686a30d7bffba3f3527add4f78766a4389d0'/>
<id>urn:sha1:858e686a30d7bffba3f3527add4f78766a4389d0</id>
<content type='text'>
For placeholder buffers, &amp;dummy_ubuf is assigned which is a static
value. When buffers are attempted cloned, don't attempt to grab a
reference to it, as we both don't need it and it'll actively fail as
dummy_ubuf doesn't have a valid reference count setup.

Link: https://lore.kernel.org/io-uring/Zw8dkUzsxQ5LgAJL@ly-workstation/
Reported-by: Lai, Yi &lt;yi1.lai@linux.intel.com&gt;
Fixes: 7cc2a6eadcd7 ("io_uring: add IORING_REGISTER_COPY_BUFFERS method")
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/sqpoll: close race on waiting for sqring entries</title>
<updated>2024-10-15T15:13:51Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2024-10-15T14:58:25Z</published>
<link rel='alternate' type='text/html' href='https://git.tablaster.dev/blaster4385/linux-IllusionX/commit/?id=28aabffae6be54284869a91cd8bccd3720041129'/>
<id>urn:sha1:28aabffae6be54284869a91cd8bccd3720041129</id>
<content type='text'>
When an application uses SQPOLL, it must wait for the SQPOLL thread to
consume SQE entries, if it fails to get an sqe when calling
io_uring_get_sqe(). It can do so by calling io_uring_enter(2) with the
flag value of IORING_ENTER_SQ_WAIT. In liburing, this is generally done
with io_uring_sqring_wait(). There's a natural expectation that once
this call returns, a new SQE entry can be retrieved, filled out, and
submitted. However, the kernel uses the cached sq head to determine if
the SQRING is full or not. If the SQPOLL thread is currently in the
process of submitting SQE entries, it may have updated the cached sq
head, but not yet committed it to the SQ ring. Hence the kernel may find
that there are SQE entries ready to be consumed, and return successfully
to the application. If the SQPOLL thread hasn't yet committed the SQ
ring entries by the time the application returns to userspace and
attempts to get a new SQE, it will fail getting a new SQE.

Fix this by having io_sqring_full() always use the user visible SQ ring
head entry, rather than the internally cached one.

Cc: stable@vger.kernel.org # 5.10+
Link: https://github.com/axboe/liburing/discussions/1267
Reported-by: Benedek Thaler &lt;thaler@thaler.hu&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/rw: allow pollable non-blocking attempts for !FMODE_NOWAIT</title>
<updated>2024-10-07T02:58:53Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2024-10-06T16:40:36Z</published>
<link rel='alternate' type='text/html' href='https://git.tablaster.dev/blaster4385/linux-IllusionX/commit/?id=f7c9134385331c5ef36252895130aa01a92de907'/>
<id>urn:sha1:f7c9134385331c5ef36252895130aa01a92de907</id>
<content type='text'>
The checking for whether or not io_uring can do a non-blocking read or
write attempt is gated on FMODE_NOWAIT. However, if the file is
pollable, it's feasible to just check if it's currently in a state in
which it can sanely receive or send _some_ data.

This avoids unnecessary io-wq punts, and repeated worthless retries
before doing that punt, by assuming that some data can get delivered
or received if poll tells us that is true. It also allows multishot
reads to properly work with these types of files, enabling a bit of
a cleanup of the logic that:

c9d952b9103b ("io_uring/rw: fix cflags posting for single issue multishot read")

had to put in place.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/rw: fix cflags posting for single issue multishot read</title>
<updated>2024-10-06T14:05:47Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2024-10-06T01:06:50Z</published>
<link rel='alternate' type='text/html' href='https://git.tablaster.dev/blaster4385/linux-IllusionX/commit/?id=c9d952b9103b600ddafc5d1c0e2f2dbd30f0b805'/>
<id>urn:sha1:c9d952b9103b600ddafc5d1c0e2f2dbd30f0b805</id>
<content type='text'>
If multishot gets disabled, and hence the request will get terminated
rather than persist for more iterations, then posting the CQE with the
right cflags is still important. Most notably, the buffer reference
needs to be included.

Refactor the return of __io_read() a bit, so that the provided buffer
is always put correctly, and hence returned to the application.

Reported-by: Sharon Rosner &lt;Sharon Rosner&gt;
Link: https://github.com/axboe/liburing/issues/1257
Cc: stable@vger.kernel.org
Fixes: 2a975d426c82 ("io_uring/rw: don't allow multishot reads without NOWAIT support")
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring/net: harden multishot termination case for recv</title>
<updated>2024-09-30T14:26:59Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2024-09-26T13:08:10Z</published>
<link rel='alternate' type='text/html' href='https://git.tablaster.dev/blaster4385/linux-IllusionX/commit/?id=c314094cb4cfa6fc5a17f4881ead2dfebfa717a7'/>
<id>urn:sha1:c314094cb4cfa6fc5a17f4881ead2dfebfa717a7</id>
<content type='text'>
If the recv returns zero, or an error, then it doesn't matter if more
data has already been received for this buffer. A condition like that
should terminate the multishot receive. Rather than pass in the
collected return value, pass in whether to terminate or keep the recv
going separately.

Note that this isn't a bug right now, as the only way to get there is
via setting MSG_WAITALL with multishot receive. And if an application
does that, then -EINVAL is returned anyway. But it seems like an easy
bug to introduce, so let's make it a bit more explicit.

Link: https://github.com/axboe/liburing/issues/1246
Cc: stable@vger.kernel.org
Fixes: b3fdea6ecb55 ("io_uring: multishot recv")
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring: fix casts to io_req_flags_t</title>
<updated>2024-09-24T19:31:04Z</updated>
<author>
<name>Min-Hua Chen</name>
<email>minhuadotchen@gmail.com</email>
</author>
<published>2024-09-22T10:41:29Z</published>
<link rel='alternate' type='text/html' href='https://git.tablaster.dev/blaster4385/linux-IllusionX/commit/?id=17ea56b752b6ba58fdd1fcfd24f0fd2fa2b0ade2'/>
<id>urn:sha1:17ea56b752b6ba58fdd1fcfd24f0fd2fa2b0ade2</id>
<content type='text'>
Apply __force cast to restricted io_req_flags_t type to fix
the following sparse warning:

io_uring/io_uring.c:2026:23: sparse: warning: cast to restricted io_req_flags_t

No functional changes intended.

Signed-off-by: Min-Hua Chen &lt;minhuadotchen@gmail.com&gt;
Link: https://lore.kernel.org/r/20240922104132.157055-1-minhuadotchen@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>io_uring: fix memory leak when cache init fail</title>
<updated>2024-09-24T19:31:00Z</updated>
<author>
<name>Guixin Liu</name>
<email>kanie@linux.alibaba.com</email>
</author>
<published>2024-09-23T10:05:12Z</published>
<link rel='alternate' type='text/html' href='https://git.tablaster.dev/blaster4385/linux-IllusionX/commit/?id=3a87e264290d71ec86a210ab3e8d23b715ad266d'/>
<id>urn:sha1:3a87e264290d71ec86a210ab3e8d23b715ad266d</id>
<content type='text'>
Exit the percpu ref when cache init fails to free the data memory with
in struct percpu_ref.

Fixes: 206aefde4f88 ("io_uring: reduce/pack size of io_ring_ctx")
Signed-off-by: Guixin Liu &lt;kanie@linux.alibaba.com&gt;
Reviewed-by: Gabriel Krisman Bertazi &lt;krisman@suse.de&gt;
Link: https://lore.kernel.org/r/20240923100512.64638-1-kanie@linux.alibaba.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
