Another weekend, another weekend read, this time all about the fundamental difference between epoll and io_uring.
I/O is a fundamental part of any operating system. In this issue of The Weekend Read, we will develop a dependable mental model of blocking and non-blocking I/O as well as synchronous and asynchronous I/O by discussing epoll and io_uring, two popular I/O mechanisms.
System Model
We will reason about the system as if there are two concurrent units of execution: the application and the kernel. An app is either runnable, i.e., it can be scheduled, or blocked on a condition, i.e., it cannot be scheduled until the condition becomes true.
Applications communicate with the kernel via system calls. When the app invokes a system call, it “stalls” (pauses temporarily) until it gets a response—here, “stalls” is deliberately used instead of “blocks.”
Understanding read()
Here, we are interested in the read() system call. read() accepts a file descriptor to read data from and a buffer to copy the data into.
read() can be blocking or non-blocking:
blocking read
If there is no additional data to read from the file descriptor, for example because no data has arrived on a socket yet, the kernel transitions the app from runnable to blocked, and the app resumes once data is available.
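A minimal sketch of a blocking read, assuming a pipe as a stand-in for a socket and a helper thread (the hypothetical `writer`) that makes data "arrive" after an arbitrary 200 ms delay:

```python
import os
import threading
import time

# A pipe stands in for a socket; its read end is in blocking mode by default
read_fd, write_fd = os.pipe()

def writer():
    time.sleep(0.2)                 # data "arrives" only after about 200 ms
    os.write(write_fd, b"hello")

writer_thread = threading.Thread(target=writer)
writer_thread.start()

start = time.monotonic()
data = os.read(read_fd, 4096)       # the app is blocked here until data is available
elapsed = time.monotonic() - start

print(data, f"blocked for about {elapsed:.2f}s")

# Clean up
writer_thread.join()
os.close(read_fd)
os.close(write_fd)
```

While the main thread sits inside `os.read`, it cannot do anything else, which is exactly the runnable-to-blocked transition described above.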
non-blocking read
If there is no additional data to read from the file descriptor, for example because no data has arrived on a socket yet, the kernel returns an error code like EAGAIN, and the app resumes immediately.
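A minimal sketch of a non-blocking read, again assuming a pipe as a stand-in for a socket. In Python, the EAGAIN error code surfaces as a `BlockingIOError` exception:

```python
import errno
import os

read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)     # switch the read end to non-blocking mode

try:
    os.read(read_fd, 4096)          # no data yet: the call returns immediately with an error
    got_eagain = False
except BlockingIOError as exc:
    # Python raises BlockingIOError for EAGAIN / EWOULDBLOCK
    got_eagain = exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK)

os.write(write_fd, b"hello")        # now data "arrives"
data = os.read(read_fd, 4096)       # data is available: the read succeeds

print(got_eagain, data)

# Clean up
os.close(read_fd)
os.close(write_fd)
```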
There is a difference between a non-blocking read and a read that will not block: A non-blocking, synchronous read will never block the app. A blocking, synchronous read will not block the app if there is data available.
In either case, read() is always synchronous: If there is data to read from the file descriptor, the app stalls until the data is copied into the buffer.
The problem with read()
On the one hand, a blocking, synchronous read limits the concurrency of your app: while the app is waiting to read, the app cannot process other tasks. On the other hand, a non-blocking, synchronous read requires periodic polling.
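The polling side of this trade-off might look like the following sketch. The pipe, the hypothetical `writer` thread, and the 10 ms poll interval are all assumptions made for illustration:

```python
import os
import threading
import time

read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)

def writer():
    time.sleep(0.1)                 # data "arrives" after about 100 ms
    os.write(write_fd, b"hello")

writer_thread = threading.Thread(target=writer)
writer_thread.start()

failed_polls = 0
data = None
while data is None:
    try:
        data = os.read(read_fd, 4096)
    except BlockingIOError:
        # Nothing to read yet; the app could process other tasks here
        failed_polls += 1
        time.sleep(0.01)            # arbitrary poll interval

print(data, f"after {failed_polls} unsuccessful polls")

# Clean up
writer_thread.join()
os.close(read_fd)
os.close(write_fd)
```

The poll interval is a compromise: a shorter interval wastes more system calls, while a longer one delays the moment the app notices the data.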
epoll and io_uring are designed to address that problem.
epoll
epoll is a notification mechanism: the application uses epoll syscalls to ask the kernel to monitor a set of file descriptors for a set of events, such as data availability. If epoll signals that data is available, the app may issue a synchronous read request that will not block but will stall until the data is copied into the buffer.
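import os
import select

# Create an epoll instance
epoll = select.epoll()

# Assume we have a list of non-blocking file descriptors;
# a pipe with one pending message stands in for real sockets here
read_fd, write_fd = os.pipe()
os.set_blocking(read_fd, False)
os.write(write_fd, b"hello")
fds = [read_fd]

# Register each file descriptor for read events
for fd in fds:
    epoll.register(fd, select.EPOLLIN)

# Wait for events and handle them
events = epoll.poll()
for fd, event in events:
    if event & select.EPOLLIN:
        # Perform a read operation that will not block
        # but will stall while the data is copied
        data = os.read(fd, 4096)

# Clean up
epoll.close()
os.close(read_fd)
os.close(write_fd)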
io_uring
io_uring is an execution mechanism: the application uses io_uring syscalls to ask the kernel to perform a non-blocking, asynchronous read. On data availability, the kernel copies data to the buffer and notifies the app of completion. The app does not block and the app does not stall during data copying.
io_uring uses a submission queue and a completion queue to coordinate requests and responses between the app and the kernel. Check out The Weekend Read Issue 33 - Exploring io_uring with python to learn more about io_uring’s programming model.
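To make the queue-based coordination concrete, here is a toy model that follows the system model from the beginning of this issue. It does not call io_uring at all: a Python thread plays the kernel, two `queue.Queue` objects play the submission and completion queues, and every name in it (`kernel`, `submission_queue`, `completion_queue`, the request tuple) is hypothetical.

```python
import os
import queue
import threading

# Toy model only: this is NOT the io_uring API
submission_queue = queue.Queue()
completion_queue = queue.Queue()

def kernel():
    while True:
        request = submission_queue.get()
        if request is None:                  # shutdown signal for this sketch
            break
        request_id, fd, buffer = request
        # The "kernel" waits for data and copies it into the app's buffer
        n = os.readv(fd, [buffer])
        completion_queue.put((request_id, n))

kernel_thread = threading.Thread(target=kernel)
kernel_thread.start()

read_fd, write_fd = os.pipe()                # stand-in for a socket
buffer = bytearray(4096)

# Submit a read request; the app neither blocks nor stalls here
submission_queue.put((1, read_fd, buffer))

# The app is free to do other work while the request is in flight
other_work = sum(range(1000))
os.write(write_fd, b"hello")                 # data "arrives"

# Later, the app collects the completion; the data is already in its buffer
request_id, n = completion_queue.get()
data = bytes(buffer[:n])
print(request_id, data)

# Clean up
submission_queue.put(None)
kernel_thread.join()
os.close(read_fd)
os.close(write_fd)
```

The point of the sketch is the division of labor: the app only submits requests and collects completions, while waiting for data and copying it into the buffer happen on the other side of the queues.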
Conclusion
epoll and io_uring both enable non-blocking I/O. However, there is a fundamental difference:
In the case of epoll, performing I/O happens synchronously: the app is stalled while the kernel transfers data into the app’s buffer.
In the case of io_uring, performing I/O happens asynchronously: the app is not stalled while the kernel transfers data into the app’s buffer.
epoll is non-blocking and synchronous; io_uring is non-blocking and asynchronous.
Happy Reading