| //! Implementation of mio for Windows using IOCP |
| //! |
| //! This module uses I/O Completion Ports (IOCP) on Windows to implement mio's |
| //! Unix epoll-like interface. Unfortunately these two I/O models are |
| //! fundamentally incompatible: |
| //! |
| //! * IOCP is a completion-based model where work is submitted to the kernel and |
| //! a program is notified later when the work finished. |
| //! * epoll is a readiness-based model where the kernel is queried as to what |
| //! work can be done, and afterwards the work is done. |
| //! |
| //! As a result, this implementation for Windows is much less "low level" than |
| //! the Unix implementation of mio. This design decision was intentional, |
| //! however. |
| //! |
| //! ## What is IOCP? |
| //! |
| //! The [official docs][docs] have a comprehensive explanation of what IOCP is, |
| //! but at a high level it requires the following operations to be executed to |
| //! perform some I/O: |
| //! |
| //! 1. A completion port is created |
| //! 2. An I/O handle and a token is registered with this completion port |
| //! 3. Some I/O is issued on the handle. This generally means that an API was |
| //! invoked with a zeroed `OVERLAPPED` structure. The API will immediately |
| //! return. |
| //! 4. After some time, the application queries the I/O port for completed |
| //! events. The port will returned a pointer to the `OVERLAPPED` along with |
| //! the token presented at registration time. |
| //! |
| //! Many I/O operations can be fired off before waiting on a port, and the port |
| //! will block execution of the calling thread until an I/O event has completed |
| //! (or a timeout has elapsed). |
| //! |
| //! Currently all of these low-level operations are housed in a separate `miow` |
| //! crate to provide a 0-cost abstraction over IOCP. This crate uses that to |
| //! implement all fiddly bits so there's very few actual Windows API calls or |
| //! `unsafe` blocks as a result. |
| //! |
| //! [docs]: https://msdn.microsoft.com/en-us/library/windows/desktop/aa365198%28v=vs.85%29.aspx |
| //! |
| //! ## Safety of IOCP |
| //! |
| //! Unfortunately for us, IOCP is pretty unsafe in terms of Rust lifetimes and |
| //! such. When an I/O operation is submitted to the kernel, it involves handing |
| //! the kernel a few pointers like a buffer to read/write, an `OVERLAPPED` |
| //! structure pointer, and perhaps some other buffers such as for socket |
| //! addresses. These pointers all have to remain valid **for the entire I/O |
| //! operation's duration**. |
| //! |
| //! There's no way to define a safe lifetime for these pointers/buffers over |
| //! the span of an I/O operation, so we're forced to add a layer of abstraction |
| //! (not 0-cost) to make these APIs safe. Currently this implementation |
| //! basically just boxes everything up on the heap to give it a stable address |
| //! and then keys off that most of the time. |
| //! |
| //! ## From completion to readiness |
| //! |
| //! Translating a completion-based model to a readiness-based model is also no |
| //! easy task, and a significant portion of this implementation is managing this |
| //! translation. The basic idea behind this implementation is to issue I/O |
| //! operations preemptively and then translate their completions to a "I'm |
| //! ready" event. |
| //! |
| //! For example, in the case of reading a `TcpSocket`, as soon as a socket is |
| //! connected (or registered after an accept) a read operation is executed. |
| //! While the read is in progress calls to `read` will return `WouldBlock`, and |
| //! once the read is completed we translate the completion notification into a |
| //! `readable` event. Once the internal buffer is drained (e.g. all data from it |
| //! has been read) a read operation is re-issued. |
| //! |
| //! Write operations are a little different from reads, and the current |
| //! implementation is to just schedule a write as soon as `write` is first |
| //! called. While that write operation is in progress all future calls to |
| //! `write` will return `WouldBlock`. Completion of the write then translates to |
| //! a `writable` event. Note that this will probably want to add some layer of |
| //! internal buffering in the future. |
| //! |
| //! ## Buffer Management |
| //! |
| //! As there's lots of I/O operations in flight at any one point in time, |
| //! there's lots of live buffers that need to be juggled around (e.g. this |
| //! implementation's own internal buffers). |
| //! |
| //! Currently all buffers are created for the I/O operation at hand and are then |
| //! discarded when it completes (this is listed as future work below). |
| //! |
| //! ## Callback Management |
| //! |
| //! When the main event loop receives a notification that an I/O operation has |
| //! completed, some work needs to be done to translate that to a set of events |
| //! or perhaps some more I/O needs to be scheduled. For example after a |
| //! `TcpStream` is connected it generates a writable event and also schedules a |
| //! read. |
| //! |
| //! To manage all this the `Selector` uses the `OVERLAPPED` pointer from the |
| //! completion status. The selector assumes that all `OVERLAPPED` pointers are |
| //! actually pointers to the interior of a `selector::Overlapped` which means |
| //! that right after the `OVERLAPPED` itself there's a function pointer. This |
| //! function pointer is given the completion status as well as another callback |
| //! to push events onto the selector. |
| //! |
| //! The callback for each I/O operation doesn't have any environment, so it |
| //! relies on memory layout and unsafe casting to translate an `OVERLAPPED` |
| //! pointer (or in this case a `selector::Overlapped` pointer) to a type of |
| //! `FromRawArc<T>` (see module docs for why this type exists). |
| //! |
| //! ## Thread Safety |
| //! |
| //! Currently all of the I/O primitives make liberal use of `Arc` and `Mutex` |
| //! as an implementation detail. The main reason for this is to ensure that the |
| //! types are `Send` and `Sync`, but the implementations have not been stressed |
| //! in multithreaded situations yet. As a result, there are bound to be |
| //! functional surprises in using these concurrently. |
| //! |
| //! ## Future Work |
| //! |
| //! First up, let's take a look at unimplemented portions of this module: |
| //! |
| //! * The `PollOpt::level()` option is currently entirely unimplemented. |
| //! * Each `EventLoop` currently owns its completion port, but this prevents an |
| //! I/O handle from being added to multiple event loops (something that can be |
| //! done on Unix). Additionally, it hinders event loops moving across threads. |
| //! This should be solved by likely having a global `Selector` which all |
| //! others then communicate with. |
| //! * Although Unix sockets don't exist on Windows, there are named pipes and |
| //! those should likely be bound here in a similar fashion to `TcpStream`. |
| //! |
| //! Next up, there are a few performance improvements and optimizations that can |
| //! still be implemented |
| //! |
| //! * Buffer management right now is pretty bad, they're all just allocated |
| //! right before an I/O operation and discarded right after. There should at |
| //! least be some form of buffering buffers. |
| //! * No calls to `write` are internally buffered before being scheduled, which |
| //! means that writing performance is abysmal compared to Unix. There should |
| //! be some level of buffering of writes probably. |
| |
| use std::io; |
| use std::os::windows::prelude::*; |
| |
| use kernel32; |
| use winapi; |
| |
| mod awakener; |
| #[macro_use] |
| mod selector; |
| mod tcp; |
| mod udp; |
| mod from_raw_arc; |
| mod buffer_pool; |
| |
| pub use self::awakener::Awakener; |
| pub use self::selector::{Events, Selector, Overlapped, Binding}; |
| pub use self::tcp::{TcpStream, TcpListener}; |
| pub use self::udp::UdpSocket; |
| |
| #[derive(Copy, Clone)] |
| enum Family { |
| V4, V6, |
| } |
| |
| unsafe fn cancel(socket: &AsRawSocket, |
| overlapped: &Overlapped) -> io::Result<()> { |
| let handle = socket.as_raw_socket() as winapi::HANDLE; |
| let ret = kernel32::CancelIoEx(handle, overlapped.as_mut_ptr()); |
| if ret == 0 { |
| Err(io::Error::last_os_error()) |
| } else { |
| Ok(()) |
| } |
| } |
| |
| unsafe fn no_notify_on_instant_completion(handle: winapi::HANDLE) -> io::Result<()> { |
| // TODO: move those to winapi |
| const FILE_SKIP_COMPLETION_PORT_ON_SUCCESS: winapi::UCHAR = 1; |
| const FILE_SKIP_SET_EVENT_ON_HANDLE: winapi::UCHAR = 2; |
| |
| let flags = FILE_SKIP_COMPLETION_PORT_ON_SUCCESS | FILE_SKIP_SET_EVENT_ON_HANDLE; |
| |
| let r = kernel32::SetFileCompletionNotificationModes(handle, flags); |
| if r == winapi::TRUE { |
| Ok(()) |
| } else { |
| Err(io::Error::last_os_error()) |
| } |
| } |