blob: 4ba6c780934069a4c2323fb5ff075fb1f11e4ef5 [file] [view]
# Edge triggered async_wait (RFC-0009)
Field | Value
----------|--------------------------
Status | Accepted
Author(s) | steveaustin@google.com
Submitted | 2020-10-22
Reviewed | 2020-11-04
Issue | [fxbug.dev/62553](http://fxbug.dev/62553) [fxbug.dev/45709](http://fxbug.dev/45709)
[TOC]
## Summary
Waiting for signals to be asserted on an object is usually level-triggered and
a check is done at the start of `zx_object_wait_async` in case the signal is
already active, in which case a packet is immediately sent to the port.
This RFC concerns adding an option to `zx_object_wait_async`, `ZX_WAIT_ASYNC_EDGE`,
which does not perform that initial check and thus will only produce a packet
when the signal transitions from inactive to active after the call.
It may be that `zx_object_wait_async` is called with `ZX_WAIT_ASYNC_EDGE` with
the signals on the object already active. In this case, a packet will be queued
on the port only after the signal on the object becomes inactive and then
subsequently is asserted. In fact, this is how the `ZX_WAIT_ASYNC_EDGE` is commonly
used.
## Motivation
The `epoll` polling mechanism in Linux can function in two modes - level-triggered
and edge-triggered. Fuchsia's waiting features, particularly `zx_object_wait_async`
and `zx_port_wait` already make level-triggered polling possible. However, edge-triggered
polling requires the ability to wait on a signal on an object that is already
active is expected (through I/O) to become inactive and subsequently
active again, queuing a packet on the port on this subsequent signal transition.
This is the intent of `ZX_WAIT_ASYNC_EDGE`.
## Design
Implementation of the `ZX_WAIT_ASYNC_EDGE` flag of `zx_object_wait_async` is
fortunately quite simple.
At present, if one of the signal set is already active, the observer's `OnMatch`
method is called directly without any further action. Otherwise, if none of the
signal set is active, the set is added to the interest list of the `DispatchObject`
via the supplied `SignalObserver`.
The proposal is that, if the `ZX_WAIT_ASYNC_EDGE` flag is specified, the initial check
is omitted and the signal set added to the interest list of the `DispatchObject`
regardless of the initial signal state. In this mode of operation, one of the
signals must transition from inactive to active for a packet to be queued on
the supplied port (possibly requiring a signal to become inactive in the process).
### Use of ZX_WAIT_ASYNC_EDGE in epoll with EPOLLET edge triggering
The main use of this change is to enable edge-triggered polling with the EPOLLET flag
in epoll. Waiting in Zircon differs from polling in `epoll` in that file descriptors are
added to an `epoll` file descriptor using `epoll_ctl/EPOLL_CTL_ADD` and are continually
monitored until removed with `epoll_ctl/EPOLL_CTL_DEL`. Zircon waiting, especially
with `zx_object_wait_async`, is always one-shot and the file object must be "re-armed"
by calling `zx_object_wait_async` again after a signal has become active.
Because `epoll` use must operate by repeatedly calling `epoll_wait` (without necessarily
calling `epoll_ctl`), this re-arming call to `zx_object_wait_async` must occur somewhere
in `epoll_wait`.
For the default level-triggered polling, in `epoll_wait` once `zx_port_wait` returns with
a signalled file object, we cannot call `zx_object_wait_async` before returning, because
the signal on that object is actve and will generate a duplicate packet on the port. Therefore,
a list of active level-triggered file descriptors is maintained and `zx_object_wait_async`
is called on file descriptors in this list on entering `epoll_wait` prior to waiting in
`zx_port_wait`.
For edge-triggered polling, after `epoll_wait` returns, non-blocking I/O should be
performed until `EWOULDBLOCK` is returned. At that point the signal on the file object will
be inactive. At this point `epoll_wait` should be called. However, if the signal on the file
object becomes active between the I/O operation returning `EWOULDBLOCK` and `epoll_wait` being
called, that event will be lost unless `zx_object_wait_async` has already been called.
It follows that, in edge-triggered mode, the call to `zx_object_wait_async` to re-arm
the file object must be called *before* `epoll_wait` returns. This is where `ZX_WAIT_ASYNC_EDGE`
is necessary. The call to `zx_object_wait_async` can be called with this flag between
`zx_port_wait` returning and `epoll_wait` returning, because although the signal is active
at this point, the `ZX_WAIT_ASYNC_EDGE` skips the check that the signals are active
(which they are at this point), so no packet is immediately queued on the port.
This means that when the I/O occurs until `EWOULDBLOCK`, the file object is already being
monitored by `zx_object_wait_async` and there is no gap in coverage.
## Implementation
The addition of the `ZX_WAIT_ASYNC_EDGE` option to `zx_object_wait_async` has
already been implemented in [https://fuchsia-review.googlesource.com/c/fuchsia/+/438521]
and its use in epoll has been implemented in
[https://fuchsia-review.googlesource.com/c/fuchsia/+/438656].
## Performance
The performance impact will be negligible as only an extra added method parameter
and a check for that parameter is added to the existing code.
## Security considerations
N/A
## Privacy considerations
N/A
## Testing
Additional unit tests have been added.
## Documentation
Documentation has been added to `zx_object_wait_async` in the implementing CL.
[zx_object_wait_async](/docs/reference/syscalls/object_wait_async.md)
## Drawbacks, alternatives, and unknowns
This appears to be the simplest way of implementing this feature and
is analogous to how edge-triggered polling is implemented in other
operating systems.
Care should be taken to not miss I/O events when using this flag.
A signal may become active after performing I/O and before making
a call to `zx_object_wait_async`, in which case the transition from
unsignalled to signalled may be missed. In practice, the `ZX_WAIT_ASYNC_EDGE`
flag is used immediately `zx_port_wait` has returned indicating that
the signal is active on the object. In this way, after non-blocking
I/O is performed on the object until the signal is inactive (usually
until `ZX_ERR_SHOULD_WAIT` is returned) `zx_port_wait` can be called to
wait until additional I/O is ready.
Because of the pattern in edge-triggered `epoll_wait` with `EPOLLET` of (1)
`epoll_wait` (2) non-blocking I/O until fd is not-ready (3) `epoll_wait` again,
an alternative to `ZX_WAIT_ASYNC_EDGE` would be to perform a check in every
I/O operation to see if the file descriptor has been added to an epoll file
descriptor via `epoll_ctl` with `EPOLLET` (and has ceased to be ready) and
re-arms a wait. This would require considerable modification to `zxio` and
`fdio` for a somewhat rare use case.
## Prior art and references
The purpose of this change is to emulate the EPOLLET flag in Linux:
https://man7.org/linux/man-pages/man7/epoll.7.html