Age | Commit message (Collapse) | Author | Files | Lines |
|
We don't post events in __io_commit_cqring_flush() anymore but send all
requests to tw, so no need to do io_commit_cqring() there.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/f2481e32375e749be89c42e4804268b608722cef.1655637157.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
Commit 3a3d47fa9cfd ("io_uring: make io_uring_types.h public") moved
a bunch of io_uring types to a kernel wide header, so we could make
tracing a bit saner rather than pass in a ton of arguments.
However, there are a few types in there that are not really needed to
be system wide. Move the cancel data and mapped buffers back to the
appropriate io_uring local headers.
Signed-off-by: Jens Axboe <[email protected]>
|
|
We have lots of trace events accepting an io_uring request and wanting
to print some of its fields like user_data, opcode, flags and so on.
However, as trace points were unaware of io_uring structures, we had to
pass all the fields as arguments. Teach trace/events/io_uring.h about
struct io_kiocb and stop the misery of passing a horde of arguments to
trace helpers.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/40ff72f92798114e56d400f2b003beb6cde6ef53.1655384063.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
Move io_uring types to linux/include, need them public so tracing can
see the definitions and we can clean trace/events/io_uring.h
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/a15f12e8cb7289b2de0deaddcc7518d98a132d17.1655384063.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
io_uring/io_uring.h already includes io_uring_types.h, no need to
include it every time. Kill it in a bunch of places, it prepares us for
following patches.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/94d8c943fbe0ef949981c508ddcee7fc1c18850f.1655384063.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
With IORING_SETUP_CQE32 ->cqe_cached doesn't store a real address but
rather an implicit offset into cqes. Store the real cqe pointer and
increment it accordingly if CQE32.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/1ee1838cba16bed96381a006950b36ba640d998c.1655455613.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
Deduplicate calls to io_get_cqe() from __io_fill_cqe_req().
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/4fa077986cc3abab7c59ff4e7c390c783885465f.1655455613.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
Deduplicate two trace_io_uring_complete() calls in __io_fill_cqe_req().
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/277ed85dba5189ab7d932164b314013a0f0b0fdc.1655455613.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
__io_fill_cqe_req() is hot and inlined, we want it to be as small as
possible. Add io_req_cqe_overflow() accepting only a request and doing
all overflow accounting, and replace with it two calls to 6 argument
io_cqring_event_overflow().
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/048b9fbcce56814d77a1a540409c98c3d383edcb.1655455613.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
__io_get_cqe() is not as hot as io_get_cqe(), no need to inline it, it
sheds ~500B from the binary.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/c1ac829198a881b7af8710926f99a3559b9f24c0.1655455613.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
Deduplicate some code and add a helper for filling an aux CQE, locking
and notification.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/b7c6557c8f9dc5c4cfb01292116c682a0ff61081.1655455613.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
Add comments to explain why it is always under uring lock when
incrementing head in __io_kbuf_recycle. And rectify one comemnt about
kbuf consuming in iowq case.
Signed-off-by: Hao Xu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jens Axboe <[email protected]>
|
|
Currently we do two extra spin lock/unlock pairs to add a poll/apoll
request to the cancellation hash table and remove it from there.
On the submission side we often already hold ->uring_lock and tw
completion is likely to hold it as well. Add a second cancellation hash
table protected by ->uring_lock. In concerns for latency because of a
need to have the mutex locked on the completion side, use the new table
only in following cases:
1) IORING_SETUP_SINGLE_ISSUER: only one task grabs uring_lock, so there
is little to no contention and so the main tw hander will almost
always end up grabbing it before calling callbacks.
2) IORING_SETUP_SQPOLL: same as with single issuer, only one task is
a major user of ->uring_lock.
3) apoll: we normally grab the lock on the completion side anyway to
execute the request, so it's free.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/1bbad9c78c454b7b92f100bbf46730a37df7194f.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Poll cancellation will be soon need to grab ->uring_lock inside, pass
the locking state, i.e. issue_flags, inside the cancellation functions.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/b86781d047727c07163443b57551a3fa57c7c5e1.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Instead of passing around a pointer to hash buckets, add a bit of type
safety and wrap it into a structure.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/d65bc3faba537ec2aca9eabf334394936d44bd28.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
In preparation for having multiple cancellation hash tables, pass a
table pointer into io_poll_find() and other poll cancel functions.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/a31c88502463dce09254240fa037352927d7ecc3.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Add a new IORING_SETUP_SINGLE_ISSUER flag and the userspace visible part
of it, i.e. put limitations of submitters. Also, don't allow it together
with IOPOLL as we're not going to put it to good use.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/4bcc41ee467fdf04c8aab8baf6ce3ba21858c3d4.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Use io_req_task_complete() for poll request completions, so it can
utilise state completions and save lots of unnecessary locking.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/ced94cb5a728d8e386c640d052fd3da3f5d6891a.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Add a variable for the number of hash buckets in io_ring_ctx_alloc(),
makes it more readable.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/993926ed0d614ba9a76b2a85bebae2babcb13983.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Don't allocate to many hash/cancellation buckets, there might be too
many, clamp it to 8 bits, or 256 * 64B = 16KB. We don't usually have too
many requests, and 256 buckets should be enough, especially since we
do hash search only in the cancellation path.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/b9620c8072ba61a2d50eba894b89bd93a94a9abd.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Get rid of an unnecessary extra goto in io_try_cancel() and simplify the
function.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/48cf5417b43a8386c6c364dba1ad9b4c7382d158.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Instead of using implicit knowledge of what is locked or not after
io_poll_find() and co returns, pass back a pointer to the locked
bucket if any. If set the user must to unlock the spinlock.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/dae1dc5749aa34367812ecf62f82fd3f053aae44.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Add a new io_hash_bucket structure so that each bucket in cancel_hash
has separate spinlock. Use per entry lock for cancel_hash, this removes
some completion lock invocation and remove contension between different
cancel_hash entries.
Signed-off-by: Hao Xu <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/05d1e135b0c8bce9d1441e6346776589e5783e26.1655371007.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
We now don't need to set req->refcount for poll requests since the
reworked poll code ensures no request release race.
Signed-off-by: Hao Xu <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/ec6fee45705890bdb968b0c175519242753c0215.1655371007.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
io_put_kbuf() is huge, don't bloat the kernel with inlining.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/2e21ccf0be471ffa654032914b9430813cae53f8.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Clean up io_req_task_complete() and deduplicate io_put_kbuf() calls.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/ae3148ac7eb5cce3e06895cde306e9e959d6f6ae.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
REQ_F_COMPLETE_INLINE is only needed to delay queueing into the
completion list to io_queue_sqe() as __io_req_complete() is inlined and
we don't want to bloat the kernel.
As now we complete in a more centralised fashion in io_issue_sqe() we
can get rid of the flag and queue to the list directly.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/600ba20a9338b8a39b249b23d3d177803613dde4.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
io_issue_sqe() from the io_uring core knows how to complete requests
based on the returned error code, we can delegate io_read()/io_write()
completion to it. Make kiocb_done() to return the right completion
code and propagate it.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/32ef005b45d23bf6b5e6837740dc0331bb051bd4.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
|
|
Signed-off-by: Jens Axboe <[email protected]>
|
|
io_req_task_complete() enqueues requests for state completion itself, no
need for REQ_F_COMPLETE_INLINE, which is only serve the purpose of not
bloating the kernel.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/aca80f71464ad02c06f1311d998a2d6ee0b31573.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
All ctx->check_cq events are slow path, don't test every single flag one
by one in the hot path, but add a common guarding if.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/dff026585cea7ff3a172a7c83894a3b0111bbf6a.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
Luckily, nnobody completes multi-apoll requests outside the polling
functions, but don't set IO_URING_F_COMPLETE_DEFER in any case as
there is nobody who is catching REQ_F_COMPLETE_INLINE, and so will leak
requests if used.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/a65ed3f5effd9321ee06e6edea294a03be3e15a0.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
There can be only 16 registered rings, no need to allocate an array for
them separately but store it in tctx.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/495f0b953c87994dd9e13de2134019054fa5830d.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
Add a comment on why we keep ->cancel_seq in struct io_wq_work instead
of struct io_kiocb despite it needed only by io_uring but not io-wq.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/988e87eec9dc700b5dae933df3aefef303502f6c.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
There is a bunch of inline helpers that will be useful not only to the
core of io_uring, move them to headers.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/22df99c83723e44cba7e945e8519e64e3642c064.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
Shove all slow path data at the end of ctx and get rid of extra
indention.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/bcaf200298dd469af20787650550efc66d89bef2.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
Following timeout fields access patterns, move all of them into a
separate cache line inside ctx, so they don't intervene with normal
completion caching, especially since timeout removals and completion
are separated and the later is done via tw.
It also sheds some bytes from io_ring_ctx, 1216B -> 1152B
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/4b163793072840de53b3cb66e0c2995e7226ff78.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
draining is slow path, move defer_list to the end where slow data lives
inside the context.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/e16379391ca72b490afdd24e8944baab849b4a7b.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
The default (i.e. empty) state of register buffer is dummy_ubuf, so set
it to dummy on init instead of NULL.
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/c5456aecf03d9627fbd6e65e100e2b5293a6151e.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>
|
|
As far as we know, nobody ever adopted the epoll_ctl management via
io_uring. Deprecate it now with a warning, and plan on removing it in
a later kernel version. When we do remove it, we can revert the following
commits as well:
39220e8d4a2a ("eventpoll: support non-blocking do_epoll_ctl() calls")
58e41a44c488 ("eventpoll: abstract out epoll_ctl() handler")
Suggested-by: Linus Torvalds <[email protected]>
Link: https://lore.kernel.org/io-uring/CAHk-=wiTyisXBgKnVHAGYCNvkmjk=50agS2Uk6nr+n3ssLZg2w@mail.gmail.com/
Signed-off-by: Jens Axboe <[email protected]>
|
|
By default, the POLL_ADD command does edge triggered poll - if we get
a non-zero mask on the initial poll attempt, we complete the request
successfully.
Support level triggered by always waiting for a notification, regardless
of whether or not the initial mask matches the file state.
Signed-off-by: Jens Axboe <[email protected]>
|
|
We already have the declarations in opdef.h, move the rest into its own
file rather than in the main io_uring.c file.
Signed-off-by: Jens Axboe <[email protected]>
|
|
Signed-off-by: Jens Axboe <[email protected]>
|
|
Signed-off-by: Jens Axboe <[email protected]>
|
|
Signed-off-by: Jens Axboe <[email protected]>
|
|
Move both the opcodes related to it, and the internals code dealing with
it.
Signed-off-by: Jens Axboe <[email protected]>
|
|
This also helps cleanup the io_uring.h cancel parts, as we can make
things static in the cancel.c file, mostly.
Signed-off-by: Jens Axboe <[email protected]>
|
|
Add a io_poll_issue() rather than export the general task_work locking
and io_issue_sqe(), and put the io_op_defs definition and structure into
a separate header file so that poll can use it.
Signed-off-by: Jens Axboe <[email protected]>
|
|
This kills the last per-op switch.
Signed-off-by: Jens Axboe <[email protected]>
|
|
Remove some dead headers we no longer need, and get rid of the
io_ring_ctx and io_uring_fops forward declarations.
Signed-off-by: Jens Axboe <[email protected]>
|