Age  Commit message  Author  (files changed, lines -/+)
2022-07-24  io_uring: clean up io_ring_ctx_alloc  (Pavel Begunkov, 1 file, -4/+6)
Add a variable for the number of hash buckets in io_ring_ctx_alloc(), which makes it more readable.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/993926ed0d614ba9a76b2a85bebae2babcb13983.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: limit the number of cancellation buckets  (Pavel Begunkov, 1 file, -5/+5)
Don't allocate too many hash/cancellation buckets; clamp the number to 8 bits, i.e. 256 buckets * 64B = 16KB. We don't usually have that many in-flight requests, and 256 buckets should be enough, especially since we only do hash lookups on the cancellation path.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/b9620c8072ba61a2d50eba894b89bd93a94a9abd.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

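Taken together with the nr_buckets cleanup above, the allocation ends up along these lines. A minimal sketch of io_ring_ctx_alloc(); variable names and the exact sizing heuristic are illustrative, not necessarily the verbatim kernel code:

    int hash_bits = ilog2(p->cq_entries) - 5;
    unsigned int nr_buckets;

    /* At most 8 bits: 256 buckets * 64B = 16KB of hash table. */
    hash_bits = clamp(hash_bits, 1, 8);
    nr_buckets = 1U << hash_bits;
    ctx->cancel_hash_bits = hash_bits;
    ctx->cancel_hash = kmalloc(nr_buckets * sizeof(struct io_hash_bucket),
                               GFP_KERNEL);
    if (!ctx->cancel_hash)
        goto err;
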
2022-07-24  io_uring: clean up io_try_cancel  (Pavel Begunkov, 1 file, -2/+2)
Get rid of an unnecessary extra goto in io_try_cancel() and simplify the function.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/48cf5417b43a8386c6c364dba1ad9b4c7382d158.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: pass poll_find lock back  (Pavel Begunkov, 1 file, -20/+26)
Instead of relying on implicit knowledge of what is locked or not after io_poll_find() and co return, pass back a pointer to the locked bucket, if any. If it is set, the caller must unlock the spinlock.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/dae1dc5749aa34367812ecf62f82fd3f053aae44.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

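The pattern looks roughly like the sketch below, using the io_hash_bucket introduced by the per-entry spinlock change further down; simplified, with field and helper names approximating the io_uring internals of this series:

    static struct io_kiocb *io_poll_find(struct io_ring_ctx *ctx,
                                         struct io_cancel_data *cd,
                                         struct io_hash_bucket **out_bucket)
    {
        unsigned int index = hash_long(cd->data, ctx->cancel_hash_bits);
        struct io_hash_bucket *hb = &ctx->cancel_hash[index];
        struct io_kiocb *req;

        *out_bucket = NULL;
        spin_lock(&hb->lock);
        hlist_for_each_entry(req, &hb->list, hash_node) {
            if (cd->data != req->cqe.user_data)
                continue;
            /* Found: hand the still-locked bucket back to the caller,
             * which is now responsible for spin_unlock(&hb->lock). */
            *out_bucket = hb;
            return req;
        }
        spin_unlock(&hb->lock);
        return NULL;
    }
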
2022-07-24  io_uring: switch cancel_hash to use per entry spinlock  (Hao Xu, 6 files, -40/+80)
Add a new io_hash_bucket structure so that each bucket in cancel_hash has its own spinlock. Using a per-entry lock for cancel_hash removes some completion_lock invocations and eliminates contention between different cancel_hash entries.

Signed-off-by: Hao Xu <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/05d1e135b0c8bce9d1441e6346776589e5783e26.1655371007.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

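The per-bucket pairing of lock and list is the core of it; a minimal sketch, close to (but not guaranteed to be verbatim) the structure this commit introduces:

    struct io_hash_bucket {
        spinlock_t        lock;
        struct hlist_head list;
    } ____cacheline_aligned_in_smp;

Cancellation then only ever takes the one bucket lock a request hashes to, instead of the ctx-wide completion_lock.
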
2022-07-24  io_uring: poll: remove unnecessary req->ref set  (Hao Xu, 1 file, -1/+0)
We now don't need to set req->refcount for poll requests since the reworked poll code ensures no request release race.

Signed-off-by: Hao Xu <[email protected]>
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/ec6fee45705890bdb968b0c175519242753c0215.1655371007.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: don't inline io_put_kbuf  (Pavel Begunkov, 2 files, -32/+39)
io_put_kbuf() is huge; don't bloat the kernel by inlining it.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/2e21ccf0be471ffa654032914b9430813cae53f8.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: refactor io_req_task_complete()  (Pavel Begunkov, 1 file, -6/+10)
Clean up io_req_task_complete() and deduplicate io_put_kbuf() calls.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/ae3148ac7eb5cce3e06895cde306e9e959d6f6ae.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: kill REQ_F_COMPLETE_INLINE  (Pavel Begunkov, 3 files, -19/+7)
REQ_F_COMPLETE_INLINE is only needed to delay queueing into the completion list until io_queue_sqe(), as __io_req_complete() is inlined and we don't want to bloat the kernel. Now that we complete in a more centralised fashion in io_issue_sqe(), we can get rid of the flag and queue to the list directly.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/600ba20a9338b8a39b249b23d3d177803613dde4.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: rw: delegate sync completions to core io_uring  (Pavel Begunkov, 1 file, -22/+19)
io_issue_sqe() from the io_uring core knows how to complete requests based on the returned error code, so we can delegate io_read()/io_write() completion to it. Make kiocb_done() return the right completion code and propagate it.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/32ef005b45d23bf6b5e6837740dc0331bb051bd4.1655371007.git.asml.silence@gmail.com
Reviewed-by: Hao Xu <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

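The resulting division of labour can be sketched as follows, assuming the IOU_OK and IOU_ISSUE_SKIP_COMPLETE return codes the core uses; simplified from io_issue_sqe(), not the exact diff:

    /* In io_issue_sqe(): the opcode handler reports what the core should do. */
    ret = def->issue(req, issue_flags);        /* e.g. io_read()/io_write() */

    if (ret == IOU_OK)
        __io_req_complete(req, issue_flags);   /* core completes the request */
    else if (ret != IOU_ISSUE_SKIP_COMPLETE)
        return ret;                            /* a real error */
    /* IOU_ISSUE_SKIP_COMPLETE: completion was, or will be, done elsewhere,
     * e.g. via the async ->ki_complete path. */
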
2022-07-24  io_uring: remove unused IO_REQ_CACHE_SIZE define  (Jens Axboe, 1 file, -1/+0)
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: don't set REQ_F_COMPLETE_INLINE in tw  (Pavel Begunkov, 1 file, -1/+0)
io_req_task_complete() enqueues requests for state completion itself; there is no need for REQ_F_COMPLETE_INLINE, which only serves the purpose of not bloating the kernel.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/aca80f71464ad02c06f1311d998a2d6ee0b31573.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: remove check_cq checking from hot paths  (Pavel Begunkov, 1 file, -15/+19)
All ctx->check_cq events are slow path, so don't test every single flag one by one in the hot path; guard them all with one common check instead.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/dff026585cea7ff3a172a7c83894a3b0111bbf6a.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

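The shape of the change, sketched with the existing IO_CHECK_CQ_* bit names; simplified, not the exact kernel diff:

    unsigned long check_cq = READ_ONCE(ctx->check_cq);

    if (unlikely(check_cq)) {
        /* Slow path: unpack the rare conditions one by one. */
        if (check_cq & BIT(IO_CHECK_CQ_OVERFLOW_BIT))
            __io_cqring_overflow_flush(ctx, false);
        if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
            return -EBADR;
    }
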
2022-07-24  io_uring: never defer-complete multi-apoll  (Pavel Begunkov, 1 file, -1/+1)
Luckily, nobody completes multi-apoll requests outside the polling functions, but don't set IO_URING_F_COMPLETE_DEFER in any case, as nobody catches REQ_F_COMPLETE_INLINE there, so requests would be leaked if it were used.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/a65ed3f5effd9321ee06e6edea294a03be3e15a0.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: inline ->registered_rings  (Pavel Begunkov, 2 files, -11/+2)
There can be only 16 registered rings, so there is no need to allocate an array for them separately; store them directly in tctx.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/495f0b953c87994dd9e13de2134019054fa5830d.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

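A sketch of the struct change, assuming the existing IO_RINGFD_REG_MAX limit of 16:

    #define IO_RINGFD_REG_MAX 16

    struct io_uring_task {
        /* ... */
        /* was: struct file **registered_rings, allocated separately;
         * now embedded, saving an allocation and a pointer chase */
        struct file *registered_rings[IO_RINGFD_REG_MAX];
    };
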
2022-07-24  io_uring: explain io_wq_work::cancel_seq placement  (Pavel Begunkov, 1 file, -0/+1)
Add a comment on why we keep ->cancel_seq in struct io_wq_work instead of struct io_kiocb, despite it being needed only by io_uring and not by io-wq.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/988e87eec9dc700b5dae933df3aefef303502f6c.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: move small helpers to headers  (Pavel Begunkov, 2 files, -17/+22)
There are a bunch of inline helpers that will be useful not only to the core of io_uring; move them to headers.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/22df99c83723e44cba7e945e8519e64e3642c064.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: refactor ctx slow data placement  (Pavel Begunkov, 1 file, -42/+39)
Shove all slow path data to the end of ctx and get rid of the extra indentation.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/bcaf200298dd469af20787650550efc66d89bef2.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: better caching for ctx timeout fields  (Pavel Begunkov, 1 file, -6/+9)
Following the timeout fields' access patterns, move all of them into a separate cache line inside ctx so that they don't interfere with normal completion caching, especially since timeout removal and completion are separated and the latter is done via task_work. It also sheds some bytes from io_ring_ctx: 1216B -> 1152B.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/4b163793072840de53b3cb66e0c2995e7226ff78.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

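The grouping is along these lines; a sketch of the io_ring_ctx layout with the field set taken from the commit description, exact members may differ:

    struct io_ring_ctx {
        /* ... completion data, CQ caching ... */

        /* timeouts: kept together on their own cache line so that
         * timeout handling doesn't dirty the completion-side lines */
        struct {
            spinlock_t       timeout_lock;
            atomic_t         cq_timeouts;
            struct list_head timeout_list;
            struct list_head ltimeout_list;
            unsigned         cq_last_tm_flush;
        } ____cacheline_aligned_in_smp;

        /* ... slow path data ... */
    };
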
2022-07-24  io_uring: move defer_list to slow data  (Pavel Begunkov, 1 file, -1/+4)
Draining is a slow path; move defer_list to the end of the context, where the slow data lives.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/e16379391ca72b490afdd24e8944baab849b4a7b.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: make reg buf init consistent  (Pavel Begunkov, 1 file, -6/+3)
The default (i.e. empty) state of a registered buffer is dummy_ubuf, so set it to dummy on init instead of NULL.

Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/c5456aecf03d9627fbd6e65e100e2b5293a6151e.1655310733.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: deprecate epoll_ctl support  (Jens Axboe, 1 file, -0/+4)
As far as we know, nobody ever adopted the epoll_ctl management via io_uring. Deprecate it now with a warning, and plan on removing it in a later kernel version. When we do remove it, we can revert the following commits as well:

39220e8d4a2a ("eventpoll: support non-blocking do_epoll_ctl() calls")
58e41a44c488 ("eventpoll: abstract out epoll_ctl() handler")

Suggested-by: Linus Torvalds <[email protected]>
Link: https://lore.kernel.org/io-uring/CAHk-=wiTyisXBgKnVHAGYCNvkmjk=50agS2Uk6nr+n3ssLZg2w@mail.gmail.com/
Signed-off-by: Jens Axboe <[email protected]>

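The deprecation itself is a one-shot warning at prep time, along these lines; a sketch, the exact message wording may differ:

    static int io_epoll_ctl_prep(struct io_kiocb *req,
                                 const struct io_uring_sqe *sqe)
    {
        pr_warn_once("%s: epoll_ctl support in io_uring is deprecated and "
                     "will be removed in a future release\n", current->comm);
        /* ... existing prep work ... */
        return 0;
    }
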
2022-07-24  io_uring: add support for level triggered poll  (Jens Axboe, 2 files, -5/+13)
By default, the POLL_ADD command does edge triggered poll: if we get a non-zero mask on the initial poll attempt, we complete the request successfully. Support level triggered polling by always waiting for a notification, regardless of whether or not the initial mask matches the file state.

Signed-off-by: Jens Axboe <[email protected]>

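Internally, io_uring's poll arming sets EPOLLET for the edge triggered default; the new mode is opt-in via a flag that skips that. A sketch, where the flag name and bit position are assumptions based on the existing IORING_POLL_* flags:

    /* Wait for a notification even if the initial poll mask already
     * matches (level triggered behaviour). */
    #define IORING_POLL_ADD_LEVEL   (1U << 3)

    static __poll_t io_poll_parse_events(const struct io_uring_sqe *sqe,
                                         unsigned int flags)
    {
        u32 events = READ_ONCE(sqe->poll32_events);

        /* Edge triggered is the default; only set EPOLLET when the
         * caller didn't ask for level triggered mode. */
        if (!(flags & IORING_POLL_ADD_LEVEL))
            events |= EPOLLET;
        return demangle_poll(events) |
               (events & (EPOLLEXCLUSIVE | EPOLLONESHOT | EPOLLET));
    }
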
2022-07-24  io_uring: move opcode table to opdef.c  (Jens Axboe, 5 files, -469/+501)
We already have the declarations in opdef.h, so move the rest into its own file rather than keeping it in the main io_uring.c file.

Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: move read/write related opcodes to its own file  (Jens Axboe, 5 files, -1231/+1263)
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: move remaining file table manipulation to filetable.c  (Jens Axboe, 4 files, -88/+90)
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: move rsrc related data, core, and commands  (Jens Axboe, 6 files, -1418/+1480)
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: split provided buffers handling into its own file  (Jens Axboe, 7 files, -636/+672)
Move both the opcodes related to provided buffers and the internal code dealing with them.

Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: move cancelation into its own file  (Jens Axboe, 6 files, -178/+204)
This also helps clean up the cancel parts of io_uring.h, as we can mostly make things static in the cancel.c file.

Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: move poll handling into its own file  (Jens Axboe, 6 files, -827/+879)
Add an io_poll_issue() helper rather than exporting the general task_work locking and io_issue_sqe(), and put the io_op_defs definition and structure into a separate header file so that poll can use it.

Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: add opcode name to io_op_defs  (Jens Axboe, 1 file, -98/+52)
This kills the last per-op switch.

Signed-off-by: Jens Axboe <[email protected]>

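With the name stored in the table, opcode-to-string lookup becomes a simple array access instead of a switch. A sketch of the shape, with entries abbreviated:

    const struct io_op_def io_op_defs[] = {
        [IORING_OP_NOP] = {
            .audit_skip = 1,
            .iopoll     = 1,
            .name       = "NOP",
            .prep       = io_nop_prep,
            .issue      = io_nop,
        },
        /* ... one entry per opcode ... */
    };

    const char *io_uring_get_opcode(u8 opcode)
    {
        if (opcode < IORING_OP_LAST)
            return io_op_defs[opcode].name;
        return "INVALID";
    }
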
2022-07-24  io_uring: include and forward-declaration sanitation  (Jens Axboe, 1 file, -12/+5)
Remove some dead headers we no longer need, and get rid of the io_ring_ctx and io_uring_fops forward declarations.

Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: move io_uring_task (tctx) helpers into its own file  (Jens Axboe, 5 files, -365/+396)
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: move fdinfo helpers to its own file  (Jens Axboe, 6 files, -208/+230)
This also means moving a bit more of the fixed file handling to the filetable side, which makes sense separately too.

Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: use io_is_uring_fops() consistently  (Jens Axboe, 1 file, -8/+8)
Convert the last spots that check for io_uring_fops to use the provided helper instead.

Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: move SQPOLL related handling into its own file  (Jens Axboe, 5 files, -462/+497)
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: move timeout opcodes and handling into its own file  (Jens Axboe, 6 files, -660/+701)
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: move our reference counting into a header  (Jens Axboe, 2 files, -42/+49)
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: move msg_ring into its own file  (Jens Axboe, 4 files, -55/+71)
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: split network related opcodes into its own file  (Jens Axboe, 5 files, -835/+884)
While at it, convert the handlers to just use io_eopnotsupp_prep() if CONFIG_NET isn't set.

Signed-off-by: Jens Axboe <[email protected]>

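The opcode table then selects the real handler or the stub at compile time; a sketch with the entry abbreviated:

    [IORING_OP_SENDMSG] = {
        .needs_file = 1,
        .pollout    = 1,
        .name       = "SENDMSG",
    #if defined(CONFIG_NET)
        .prep       = io_sendmsg_prep,
        .issue      = io_sendmsg,
    #else
        .prep       = io_eopnotsupp_prep,
    #endif
    },
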
2022-07-24  io_uring: move statx handling to its own file  (Jens Axboe, 4 files, -62/+82)
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: move epoll handler to its own file  (Jens Axboe, 4 files, -50/+70)
It would be nice to sort out Kconfig for this and not even compile epoll.c if epoll isn't configured.

Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: add a dummy -EOPNOTSUPP prep handler  (Jens Axboe, 1 file, -9/+14)
Add it and use it for the epoll handling, if epoll isn't configured.

Signed-off-by: Jens Axboe <[email protected]>

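The handler itself is trivial; something very close to:

    static int io_eopnotsupp_prep(struct io_kiocb *req,
                                  const struct io_uring_sqe *sqe)
    {
        return -EOPNOTSUPP;
    }

Pointing an op table entry's .prep at this stub makes an unsupported opcode fail cleanly at prep time, before any issue-side code runs.
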
2022-07-24  io_uring: move uring_cmd handling to its own file  (Jens Axboe, 5 files, -124/+142)
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: split out open/close operations  (Jens Axboe, 5 files, -298/+345)
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: separate out file table handling code  (Jens Axboe, 5 files, -93/+117)
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: split out fadvise/madvise operations  (Jens Axboe, 4 files, -85/+109)
Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: split out fs related sync/fallocate functions  (Jens Axboe, 4 files, -97/+124)
This splits out sync_file_range, fsync, and fallocate.

Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: split out splice related operations  (Jens Axboe, 5 files, -130/+154)
This splits out splice and tee support.

Signed-off-by: Jens Axboe <[email protected]>

2022-07-24  io_uring: split out filesystem related operations  (Jens Axboe, 4 files, -283/+316)
This splits out renameat, unlinkat, mkdirat, symlinkat, and linkat.

Signed-off-by: Jens Axboe <[email protected]>