Status: DRAFT
Author: jeffbrown@google.com
This document is a specification of the Fuchsia Interface Definition Language (FIDL) v2.0 data structure encoding.
See FIDL 2.0: Overview for more information about FIDL's overall purpose, goals, and requirements, as well as links to related documents.
WORK IN PROGRESS
A message is a contiguous data structure represented using the FIDL Wire Format, consisting of a single in-line primary object followed by a sequence of out-of-line secondary objects stored in traversal order.
Messages are aggregates of objects.
The primary object of a message is simply the first object it contains. It is always a struct of fixed size whose type (and size) is known from the context (such as by examining the method ordinal in the interface call header).
To store variable-size or optional data, the primary object may refer to secondary objects, such as string content, vector content, structs, and unions. Secondary objects are stored **out-of-line** sequentially in traversal order following the object which references them. In encoded messages, the presence of secondary objects is marked by a flag. In decoded messages, the flags are substituted with pointers to their location in memory (or null pointers when absent).
Primary and secondary objects are 8-byte aligned and stored sequentially in traversal order without gaps other than the minimum required padding to maintain object alignment.
Objects may also contain in-line objects which are aggregated within the body of the containing object, such as embedded structs and fixed-size arrays of structs. The alignment factor of in-line objects is determined by the alignment factor of their most restrictive member.
In the following example, each Rect structure contains two Point objects which are stored in-line whereas each Region structure contains a vector with a variable number of Rect objects which are stored sequentially out-of-line. In this case, the secondary object is the vector's content (as a unit).
    struct Region {
        vector<Rect> rects;
    };
    struct Rect {
        Point top_left;
        Point bottom_right;
    };
    struct Point {
        uint32 x, y;
    };
The traversal order of a message is determined by a recursive depth-first walk of all of the objects it contains, as obtained by following the chain of references.
Given the following structure:
    struct Cart {
        vector<Item> items;
    };
    struct Item {
        Product product;
        uint32 quantity;
    };
    struct Product {
        string sku;
        string name;
        string? description;
        uint32 price;
    };
The depth-first traversal order for a Cart message is defined by the following pseudo-code:
    visit Cart:
        for each Item in Cart.items vector data:
            visit Item.product:
                visit Product.sku
                visit Product.name
                visit Product.description
                visit Product.price
            visit Item.quantity
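The walk above can be sketched in Python. This is a minimal illustration rather than a bindings implementation: the dataclasses mirror the FIDL declarations, and the helper `traversal_order` and its label strings are hypothetical names introduced here. It assumes that Item and Product are stored in-line within the vector content, so only the vector content and the string contents become secondary objects.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Product:
    sku: str
    name: str
    description: Optional[str]  # string? in FIDL: may be absent
    price: int


@dataclass
class Item:
    product: Product
    quantity: int


@dataclass
class Cart:
    items: List[Item]


def traversal_order(cart: Cart) -> List[str]:
    """List the secondary objects of a Cart in depth-first traversal order."""
    # The vector content (all Items, stored as a unit) follows the primary object.
    order = ["Cart.items content"]
    for i, item in enumerate(cart.items):
        product = item.product
        order.append(f"items[{i}].product.sku content")
        order.append(f"items[{i}].product.name content")
        if product.description is not None:
            order.append(f"items[{i}].product.description content")
        # price and quantity are in-line primitives: no secondary objects.
    return order
```

For a cart with two items where only the second has a description, the sketch lists the vector content first, then the string contents of item 0, then those of item 1.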
The same message content can be expressed in two forms -- encoded and decoded -- which have the same size and overall layout but differ in terms of their representation of pointers (memory addresses) or handles (capabilities).
FIDL is designed such that encoding and decoding of messages can occur in place in memory assuming that objects have been stored in traversal order.
The representation of encoded messages is unambiguous. There is exactly one possible encoding for all messages of the same type with the same content.
An encoded message has been prepared for transfer to another process: it does not contain pointers (memory addresses) or handles (capabilities).
During encoding…
The resulting encoded message and handle vector can then be sent to another process using zx_channel_call() or a similar IPC mechanism.
A decoded message has been prepared for use within a process's address space: it may contain pointers (memory addresses) or handles (capabilities).
During decoding...
The resulting decoded message is ready to be consumed directly from memory.
The following primitive types are supported:
Category | Types |
---|---|
Boolean | bool |
Signed integer | int8 int16 int32 int64 |
Unsigned integer | uint8 uint16 uint32 uint64 |
IEEE 754 Floating-point | float32 float64 |
Number types are suffixed with their size in bits; bool is 1 byte.
From the perspective of the wire format, enums are just fancy names for primitive types.
For example, an enum whose underlying type is int32 is stored in exactly the same way as any ordinary int32 would be.
Arrays are denoted:
T can be any FIDL type.
Variable-length sequence of UTF-8 encoded characters representing text.
Nullable; null strings and empty strings are distinct.
Can specify a maximum size, e.g. string:40 for a string of at most 40 bytes.
String content does not have a null-terminator.[^1]
Stored as a 16 byte record consisting of:
When encoded for transfer, data indicates presence of content:
When decoded for consumption, data is a pointer to content.
Strings are denoted as follows:
Vectors are denoted as follows:
T can be any FIDL type.
Transfers a Zircon capability by handle value.
Stored as a 32-bit unsigned integer.
Nullable by encoding as a zero-valued[^2] handle (equivalent to ZX_HANDLE_INVALID).
When encoded for transfer, the stored integer indicates presence of handle:
When decoded for consumption, the stored integer is the handle value itself.
Handles are denoted:
H can be one of[^3]: channel, event, eventpair, fifo, job, process, port, resource, socket, thread, vmo
Record type consisting of a sequence of typed fields.
Fields are first-fit packed in declaration order according to their alignment factors.[^4]
Alignment factor of structure is defined by maximal alignment factor of any of its fields.
Structure is padded with zeroes so that its size is a multiple of its alignment factor.
In general, changing the definition of a struct will break binary compatibility; instead prefer to extend interfaces by adding new methods which use new structs.
Storage of a structure depends on whether it is nullable at point of reference.
Structs are denoted by their declared name (e.g. Circle) and nullability:
The following example shows how structs are laid out according to their fields.
    struct Circle {
        bool filled;
        Point center;    // Point will be stored in-line
        float32 radius;
        Color? color;    // Color will be stored out-of-line
        bool dashed;
    };
    struct Point {
        float32 x, y;
    };
    struct Color {
        float32 r, g, b;
    };
When laying out Circle, its fields are first-fit packed in declaration order, considering each field in turn: filled, center, radius, color, dashed. We can see how dashed fills an alignment gap left by the need to align center to a 4 byte boundary. The Color content is padded to the 8 byte secondary object alignment boundary.
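The packing of Circle can be reproduced with a small first-fit sketch. The function `first_fit_layout` is a hypothetical helper, not part of any FIDL tooling. It assumes that the nullable `Color?` reference occupies an 8-byte, 8-byte-aligned slot in the struct (this section does not state that size), and it takes Point as 8 bytes with 4-byte alignment, as follows from its two float32 fields.

```python
def first_fit_layout(fields):
    """Place fields in declaration order at the lowest aligned free offset.

    fields: list of (name, size, alignment) tuples.
    Returns (offsets by name, struct size, struct alignment).
    """
    placed = []   # occupied byte ranges as (start, end)
    offsets = {}
    for name, size, alignment in fields:
        offset = 0
        # Step through aligned offsets until the field overlaps nothing.
        while any(offset < end and start < offset + size for start, end in placed):
            offset += alignment
        placed.append((offset, offset + size))
        offsets[name] = offset
    struct_alignment = max(alignment for _, _, alignment in fields)
    end_of_fields = max(end for _, end in placed)
    # Pad the struct so its size is a multiple of its alignment factor.
    struct_size = -(-end_of_fields // struct_alignment) * struct_alignment
    return offsets, struct_size, struct_alignment


CIRCLE_FIELDS = [
    ("filled", 1, 1),   # bool
    ("center", 8, 4),   # Point, in-line
    ("radius", 4, 4),   # float32
    ("color", 8, 8),    # Color? reference (assumed 8 bytes, 8-byte aligned)
    ("dashed", 1, 1),   # bool
]

offsets, size, alignment = first_fit_layout(CIRCLE_FIELDS)
```

Under these assumptions `dashed` lands at offset 1, inside the gap between `filled` and `center`, and the struct occupies 24 bytes.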
Storage of a union depends on whether it is nullable at point of reference.
Unions are denoted by their declared name (e.g. Pattern) and nullability:
The following example shows how unions are laid out according to their options.
    struct Paint {
        Pattern fg;
        Pattern? bg;
    };
    union Pattern {
        Color color;
        Texture texture;
    };
    struct Color {
        float32 r, g, b;
    };
    struct Texture {
        string name;
    };
When laying out Pattern, space is first allotted to the tag (4 bytes) then to the selected option.
Messages which are sent directly through Zircon channels have a maximum total size (header + body) which is defined by the kernel (currently 64 KB, eventual intent may be 16 KB). Arbitrarily large messages can be transferred by writing the message body into a Zircon VMO, setting the FIDL_MSG_VMO flag in the message header, and sending the VMO through the channel along with the header.
It is possible to extend interfaces by declaring additional methods with unique ordinals. The language also supports creating derived interfaces provided the method ordinals remain unique down the hierarchy. Interface derivation is purely syntactic; it does not affect the wire format.
We'll use the following interface for the next few examples.
    interface Calculator {
        1: Add(int32 a, int32 b) -> (int32 sum);
        2: Divide(int32 dividend, int32 divisor) -> (int32 quotient, int32 remainder);
        3: Clear();
    };
FIDL does not provide a mechanism to determine the “version” of an interface; interface capabilities must be determined out-of-band such as by querying a ServiceProvider for an interface “version” by name or by other means.
The client of an interface sends method call messages to the implementor of the interface to invoke methods of that interface.
If a server receives an unknown, unsupported, or unexpected method call message, it must close the channel.
The message indicates the method to invoke by means of its ordinal index. The body of the message contains the method arguments as if they were packed in a struct.
The implementor of an interface sends method result messages to the client of the interface to indicate completion of a method invocation and to provide a (possibly empty) result.
If a client receives an unknown, unsupported, or unexpected method result message, it must close the channel.
Only two-way method calls which are defined to provide a (possibly empty) result in the FIDL interface declaration will elicit a method result message. One-way method calls must not produce a method result message.
A method result message provides the result associated with a prior method call. The body of the message contains the method results as if they were packed in a struct.
The message result header consists of uint32 txn_id, uint32 reserved, uint32 flags, and zx_status_t status, which represents protocol-level status. The txn_id must be equal to the txn_id of the method call that this is a response to. The flags must be zero. A status of ZX_OK indicates normal response. A status of ZX_ERR_NOT_SUPPORTED indicates that the ordinal of the method call is not supported by the server.
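As a rough illustration of the header rules above, the following sketch packs and checks a result header with Python's `struct` module. Several things here are assumptions not stated in this section: little-endian byte order, a signed 32-bit status, and the numeric values of the two status constants (taken to match Zircon's definitions). The helper names are hypothetical.

```python
import struct

# Assumed values, intended to match Zircon's zx_status_t constants.
ZX_OK = 0
ZX_ERR_NOT_SUPPORTED = -2

# txn_id, reserved, flags (uint32 each) and status (int32); little-endian assumed.
RESULT_HEADER = struct.Struct("<IIIi")


def pack_result_header(txn_id, status=ZX_OK):
    """Build a result header with reserved and flags set to zero."""
    return RESULT_HEADER.pack(txn_id, 0, 0, status)


def check_result_header(data, expected_txn_id):
    """Check a received result header and return its status."""
    txn_id, _reserved, flags, status = RESULT_HEADER.unpack_from(data)
    if txn_id != expected_txn_id:
        raise ValueError("txn_id does not match the pending method call")
    if flags != 0:
        raise ValueError("flags must be zero")
    return status
```

A caller would treat a returned ZX_ERR_NOT_SUPPORTED as "the server does not implement this ordinal" and a raised error as grounds for closing the channel.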
Control messages support in-band signaling of events other than method calls and responses.
If a client or server receives an unknown, unsupported, or unexpected control message, it must discard it. This allows for future expansion of control messages in the protocol.
The maximum size of a valid control message is 512 bytes, including the header.
An epitaph is a message with ordinal 0x80000001 which a client or server sends just prior to closing the connection to provide an indication of why the connection is being closed. No further messages may be sent through the channel after the epitaph.
When a client or server receives an epitaph message, it can assume that it has received the last message and the channel is about to be closed. The contents of the epitaph message explain the disposition of the channel.
The body of an epitaph is described by the following structure:
    struct Epitaph {
        // Generic protocol status, represented as a zx_status_t.
        uint32 status;
        // Protocol-specific data, interpretation depends on the interface
        // associated with the channel.
        uint32 code;
        // Human-readable message.
        string:480 message;
    };
TODO: Should we allow custom epitaph structures as in the original proposal? On the other hand, making them system-defined greatly simplifies the bindings and is probably sufficient for the most common usage of simply indicating why a connection is being closed.
sizeof(T) denotes the size in bytes for an object of type T.
alignof(T) denotes the alignment factor in bytes to store an object of type T.
FIDL primitive types are stored at offsets in the message which are a multiple of their size in bytes. Thus for primitives T, alignof(T) == sizeof(T). This is called natural alignment. It has the nice property of satisfying typical alignment requirements of modern CPU architectures.
FIDL complex types, such as structs and arrays, are stored at offsets in the message which are a multiple of the maximum alignment factor of any of their fields. Thus for complex types T, alignof(T) == max(alignof(F:T)) over all fields F in T. It has the nice property of satisfying typical C structure packing requirements (which can be enforced using packing attributes in the generated code). The size of a complex type is the total number of bytes needed to store its members properly aligned plus padding up to the type's alignment factor.
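The two rules above can be written down as a pair of recursive functions. This is a simplified sketch: a struct is modelled as a tuple of field types, primitives as their FIDL type names, and fields are placed sequentially in declaration order, which matches first-fit packing only when no earlier gap can be back-filled. The names `alignof`, `sizeof`, `align_up`, `POINT`, and `RECT` are illustrative.

```python
PRIMITIVE_SIZES = {
    "bool": 1,
    "int8": 1, "int16": 2, "int32": 4, "int64": 8,
    "uint8": 1, "uint16": 2, "uint32": 4, "uint64": 8,
    "float32": 4, "float64": 8,
}


def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return -(-offset // alignment) * alignment


def alignof(t):
    """Natural alignment for primitives, max over fields for structs."""
    if isinstance(t, str):
        return PRIMITIVE_SIZES[t]
    return max(alignof(field) for field in t)


def sizeof(t):
    """Bytes needed for the members, padded to the type's alignment."""
    if isinstance(t, str):
        return PRIMITIVE_SIZES[t]
    offset = 0
    for field in t:
        offset = align_up(offset, alignof(field)) + sizeof(field)
    return align_up(offset, alignof(t))


POINT = ("uint32", "uint32")   # struct Point { uint32 x, y; }
RECT = (POINT, POINT)          # struct Rect { Point top_left; Point bottom_right; }
```

With these definitions Point has size 8 and alignment 4, and Rect has size 16 and alignment 4, since neither contains an 8-byte member.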
FIDL primary and secondary objects are aligned at 8-byte offsets within the message, regardless of their contents. The primary object of a FIDL message starts at offset 0. Secondary objects, which are the only possible referent of pointers within the message, always start at offsets which are a multiple of 8. (So all pointers within the message point at offsets which are a multiple of 8.)
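The 8-byte rule for primary and secondary objects amounts to rounding each object's start offset up to a multiple of 8. The sketch below computes those offsets from object sizes given in traversal order; `align8` and `secondary_offsets` are hypothetical helper names.

```python
def align8(offset):
    """Round offset up to the next multiple of 8."""
    return (offset + 7) & ~7


def secondary_offsets(primary_size, secondary_sizes):
    """Return the start offset of each secondary object and the total message size.

    The primary object starts at offset 0; each secondary object starts at
    the next 8-byte boundary after the object that precedes it.
    """
    offsets = []
    cursor = align8(primary_size)
    for size in secondary_sizes:
        offsets.append(cursor)
        cursor = align8(cursor + size)
    return offsets, cursor
```

For example, a 16-byte primary object followed by secondary objects of 5, 12, and 8 bytes gives start offsets 16, 24, and 40 and a padded total of 48 bytes.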
FIDL in-line objects (complex types embedded within primary or secondary objects) are aligned according to their type. They are not forced to 8 byte alignment.
The creator of a message must fill all alignment padding gaps with zeros.
The consumer of a message may verify that padding contains zeroes (and generate an error if not) but it is not required to check as long as it does not actually read the padding bytes.
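Both sides of the padding rule can be sketched briefly. The producer helper pads a secondary object's content with zeros up to the 8-byte boundary, and the optional consumer check rejects non-zero padding. The function names and the idea of passing padding locations as explicit `(start, end)` ranges are assumptions made for illustration; real bindings would derive the ranges from the type layout.

```python
def pad_to_8(payload: bytes) -> bytes:
    """Append zero bytes so the length is a multiple of 8."""
    return payload + b"\x00" * (-len(payload) % 8)


def check_padding(message: bytes, padding_ranges):
    """Raise ValueError if any byte in the given padding ranges is non-zero."""
    for start, end in padding_ranges:
        if any(message[start:end]):
            raise ValueError(f"non-zero padding in bytes {start}..{end - 1}")


# A 5-byte content followed by 3 bytes of padding.
message = pad_to_8(b"hello")
check_padding(message, [(5, 8)])
```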
FIDL arrays, vectors, structures, and unions enable the construction of recursive messages. Left unchecked, processing excessively deep messages could lead to resource exhaustion of the consumer.
For safety, the maximum recursion depth for all FIDL messages is limited to 32 levels of nested complex objects. The FIDL validator must enforce this by keeping track of the current nesting level during message validation.
Complex objects are arrays, vectors, structures, or unions which contain pointers or handles which require fix-up. These are precisely the kinds of objects for which encoding tables must be generated. See FIDL 2.0: C Language Bindings for information about encoding tables. Therefore limiting the nesting depth of complex objects has the effect of limiting the recursion depth for traversal of encoding tables.
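One way a validator might track the nesting level is to pass the current depth down the recursive walk. In this sketch nested Python lists stand in for complex objects and any other value is a leaf; that simplification is an assumption, since a real validator would walk encoding tables and count only objects that need fix-up. `MAX_DEPTH` and `validate_depth` are illustrative names.

```python
MAX_DEPTH = 32  # maximum nesting of complex objects, per the limit above


def validate_depth(obj, depth=0):
    """Raise ValueError if complex objects are nested more than MAX_DEPTH deep."""
    if isinstance(obj, list):
        depth += 1
        if depth > MAX_DEPTH:
            raise ValueError("message exceeds maximum recursion depth")
        for child in obj:
            validate_depth(child, depth)


def nest(levels):
    """Build a value wrapped in the given number of nested lists."""
    obj = 0
    for _ in range(levels):
        obj = [obj]
    return obj
```

A value nested 32 levels deep passes, while one nested 33 levels deep is rejected before the walk goes any further.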
Formal definition:
The purpose of message validation is to discover wire format errors early before they have a chance to induce security or stability problems.
Message validation is required when decoding messages received from a peer to prevent bad data from propagating beyond the service entry point.
Message validation is optional but recommended when encoding messages to send to a peer to help localize violated integrity constraints.
To minimize runtime overhead, validation should generally be performed as part of a single pass message encoding or decoding process such that only a single traversal is needed. Since messages are encoded in depth-first traversal order, traversal exhibits good memory locality and should therefore be quite efficient.
For simple messages, validation may be trivial, amounting to no more than a few size checks. While programmers are encouraged to rely on their FIDL bindings library to validate messages on their behalf, validation can also be done manually if needed.
Conformant FIDL bindings must check all of the following integrity constraints:
Stricter FIDL bindings may perform some or all of the following additional safety checks:
See FIDL 2.0: I/O Sketch.
Grab bag of things we may or may not implement based on demand and experience with FIDL.
These may or may not be good ideas.
Define a vector with parallel content arrays of the same dimension but different types. Used to represent unordered associative arrays.
Sorted dictionary. Key must be a primitive type. Lookup via binary search. Validation verifies order and uniqueness of keys.
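Since this feature is only proposed, the following is a speculative sketch of the two operations named above: validating that keys are strictly increasing (which covers both order and uniqueness) and looking up a key by binary search. Representing the dictionary as parallel `keys` and `values` lists is an assumption, as are the function names.

```python
from bisect import bisect_left


def validate_keys(keys):
    """Return True if keys are strictly increasing (sorted and unique)."""
    return all(a < b for a, b in zip(keys, keys[1:]))


def lookup(keys, values, key):
    """Binary-search the sorted keys; return the matching value or None."""
    index = bisect_left(keys, key)
    if index < len(keys) and keys[index] == key:
        return values[index]
    return None
```

Lookup relies on validation having been performed first; on unsorted keys the binary search could miss entries that are present.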
An efficient way to encapsulate uninterpreted FIDL messages.
Envelopes are denoted as follows:
Note: This could also be represented as a pair containing a vector and a vector but that would require a minimum of 32 bytes instead of 16 bytes.
A tagged structure whose fields are all tagged with ordinal indices. New fields can be added at any time. Clients and servers can skip over unrecognized fields.
Tables are denoted by their declared name (e.g. Station) and nullability:
The following example shows how tables are laid out according to their fields.
    table Station {
        1: string name;
        3: bool encrypted;
        2: uint32 channel;
    };
When encoding the table, the field information records for each of its fields are stored in increasing tag order and padded to an 8-byte boundary. The actual values of the fields are then stored successively following these records. When encoding field information records:
When decoding the table, the field information records are processed in order:
After decoding a table, the consumer of the table can find individual fields by performing a binary search upon the field information records to locate the field by its tag, ignoring any fields that do not bear the FIDL_TABLE_FIELD_DECODED flag. The value of that field can then be retrieved from the decoded envelope by dereferencing its value pointer.
This representation is optimized for ease of generation and access, not for space efficiency.
Allow primitives to be optionally encoded, either by recording a pointer in their place (costly) or by encoding a bool field indicating presence or absence (much cheaper, preferred).
Whereas T would normally force a struct to be stored in-line, T* forces it to be out-of-line and instead encodes a pointer to it. T* is non-optional unlike T?.
Support sending unsolicited messages from the server back to the client.
    interface Calculator {
        1: Add(int32 a, int32 b) -> (int32 sum);
        2: Divide(int32 dividend, int32 divisor) -> (int32 quotient, int32 remainder);
        3: Clear();
        4: event Error(uint32 status_code);
    };
The implementor of an interface sends unsolicited event messages to the client of the interface to indicate that an asynchronous event occurred as specified by the FIDL interface declaration.
Events may be used to let the client observe significant state changes without having to create an additional channel to receive the response.
In the Calculator example, we can imagine that an attempt to divide by zero would cause the Error() event to be sent with a “divide by zero” status code prior to the connection being closed. This allows the client to distinguish between the connection being closed due to an error as opposed to for other reasons (such as the calculator process terminating abnormally).
TODO: Consider whether a client could acknowledge events with a result, possibly just for flow control.
The body of the message contains the event arguments as if they were packed in a struct.
It might be useful to provide a way for clients to acknowledge receipt of unsolicited event messages so that the implementor can perform flow control. However this does add some complexity.
[^1]: Justification for unterminated strings. Since strings can contain embedded null characters, it is safer to encode the size explicitly and to make no guarantees about null-termination, thereby defeating incorrect assumptions that clients might make. Modern APIs generally use sized strings as a security precaution. It's important that data always have one unambiguous interpretation.

[^2]: Defining the zero handle to mean “there is no handle” makes it safe to default-initialize wire format structures to all zeroes. Zero is also the value of the ZX_HANDLE_INVALID constant.

[^3]: New handle types can easily be added to the language without affecting the wire format since all handles are transferred the same way.

[^4]: First-fit packing reduces padding overhead. It isn’t guaranteed to be optimal but it is much easier to understand and implement than a best-fit packing algorithm. When generating C-like structs from FIDL declarations, it may be necessary for the code generator to reorder fields or add packing annotations to ensure that the memory layout matches the wire format.