 {% set rfcid = "RFC-0050" %}
 {% include "docs/contribute/governance/rfcs/_common/_rfc_header.md" %}
 # {{ rfc.name }}: {{ rfc.title }}
 <!-- SET the `rfcid` VAR ABOVE. DO NOT EDIT ANYTHING ELSE ABOVE THIS LINE. -->

 Note: Formerly known as [FTP](../deprecated-ftp-process.md)-050.

 ## Summary

 We establish guiding principles for syntactic choices, and make a few syntax
 changes following these principles.

 ### Changes {#changes}

 *   Placing **types in second position**, e.g. in method parameters, names
     precede their respective types, and in table declarations, member names
     precede their respective types;
 *   Changing types to **separate layout from constraints** to place layout
     related type information on the left hand side of the `:` separator with
     constraints information on the right hand side, e.g. `array<T, 5>` vs
     `vector<T>:5` more clearly conveys that an array's size is layout
     impacting, whereas it is a constraint for vectors.
 *   Introduction of **anonymous layouts**. For instance `table { f1 uint8; f2
     uint16; }` can be used directly within a method parameter list.
 *   Declaration of top-level types is done by using anonymous layouts with the
     help of the **type introduction declarations** of the form `type Name =
     Layout;`.
 *   Lastly, for a protocol `P`, **renaming** `P` and `request<P>` to
     `client_end:P` and `server_end:P` respectively. Note that the protocol is a
     constraint of the client or server end, rather than the previous position
     which would incorrectly indicate a layout related concern.
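
 The changes above can be seen side by side in a small sketch. The library,
 type, and protocol names below are hypothetical and chosen purely for
 illustration; the prior syntax is shown in comments next to its proposed
 equivalent.

 ```fidl
 library fuchsia.examples.sketch;

 // Prior syntax:
 //   protocol Echo { Say(string:32 text) -> (string:32 reply); };
 protocol Echo {
     Say(text string:32) -> (reply string:32);
 };

 // Prior syntax:
 //   struct Frame {
 //       array<uint8>:4 header;
 //       vector<uint8>:64? payload;
 //       handle<vmo> backing;
 //       Echo client;
 //       request<Echo> server;
 //   };
 type Frame = struct {
     header array<uint8, 4>;
     payload vector<uint8>:<64, optional>;
     backing handle:vmo;
     client client_end:Echo;
     server server_end:Echo;
 };
 ```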

 ### Relation to other RFCs

 *   This RFC subsumes [RFC-0038: Separating Layout from Constraints][rfc-0038] and
     [RFC-0039: Types Come Second][rfc-0039] i.e. accepting RFC-0050 means rejecting
     both RFC-0038 and RFC-0039 as "obsolete".
 *   This RFC proposes an alternative solution to
     [RFC-0044: Extensible Method Arguments][rfc-0044], i.e. accepting RFC-0050
     means rejecting RFC-0044 as "obsolete".

 This RFC was later amended by:

 *   [RFC-0086: Updates to RFC-0050: FIDL Attributes Syntax](0086_rfc_0050_attributes.md)
 *   [RFC-0087: Updates to RFC-0050: FIDL Method Parameter Syntax](0087_fidl_method_syntax.md)
 *   [RFC-0088: Updates to RFC-0050: FIDL Bits, Enum, and Constraints Syntax](0088_rfc_0050_bits_enums_constraints.md)

 ## Motivation {#motivation}

 ### Introductory Examples

 #### Algebraic Data Types

 The syntax is versatile enough that representing algebraic data types (ADTs) can
 be done fluently, without requiring any more sugar. Consider for instance:

 ```fidl
 /// Describes simple algebraic expressions.
 type Expression = flexible union {
     1: value int64;
     2: bin_op struct {
         op flexible enum {
             ADD = 1;
             MUL = 2;
             DIV = 3;
         };
         left_exp Expression;
         right_exp Expression;
     };
     3: un_op struct {
         op flexible enum {
             NEG = 1;
         };
         exp Expression;
     };
 };
 ```

 Pattern-wise, we choose to use a `union` of `struct`: the `union` offers
 extensibility, so the variants themselves need not be extensible, and it is
 preferable to use a more rigid layout for them. Should we need to change a
 variant, we can instead add a new one wholesale, and migrate to using this new
 variant. (In other places where evolvability is needed, e.g. the list of binary
 or unary operators, a flexible enum is chosen.)

 Supporting ADTs requires much more than ergonomic syntax to describe data types.
 One of the key features expected, for instance, is easy construction and
 destruction (e.g. via pattern matching or visitor pattern).

 This RFC does not introduce new functionality to FIDL, and limitations on
 recursive types would prevent the example from compiling today. We plan to add
 support for generalized recursive types, and this extension will be the subject
 of a future RFC.

 #### Combining non-evolvable messages with evolvable messages more easily

 For instance, expressing an "extensible struct" which has both struct elements
 (compact, inline, fast encoding/decoding), as well as the possibility to be
 extended:

 ```fidl
 type Something = struct {
     ...

     /// Provide extension point, initially empty.
     extension table {};
 };
 ```

 As an example, the `fuchsia.test.breakpoints` library needs to define an
 extensible event dubbed `Invocation`. These events all share common values, as
 well as specific payload for each variant of the event. This can now be more
 directly and succinctly expressed as:

 ```fidl
 type Invocation = table {
     1: target_moniker string:MAX_MONIKER_LENGTH;
     2: handler Handler;
     3: payload InvocationPayload;
 };

 type InvocationPayload = union {
     1: start_instance struct{};
     2: routing table {
         1: protocol RoutingProtocol;
         2: capability_id string:MAX_CAPABILITY_ID_LENGTH;
         3: source CapabilitySource;
     };
 };
 ```

 #### Extensible method arguments {#extensible-method-args}

 For instance, extensible method arguments:

 ```fidl
 protocol Peripheral {
     StartAdvertising(table {
         1: data AdvertisingData;
         2: scan_response AdvertisingData;
         3: mode_hint AdvertisingModeHint;
         4: connectable bool;
         5: handle server_end:AdvertisingHandle;
     }) -> () error PeripheralError;
 };
 ```

 Using a `table` for arguments is not a "best practice". It may be appropriate,
 but comes with its set of issues, e.g. 2<sup>N</sup> possibilities with N
 fields, possibly adding a lot of complexity on recipients.

 ### Guiding principles {#principles}

 FIDL is concerned primarily with [Application Binary
 Interface](https://en.wikipedia.org/wiki/Application_binary_interface) (ABI)
 concerns, and secondarily with Application Programming Interface (API) concerns. This
 can result in a syntax that is more verbose than one may be accustomed to, or
 may be expecting when comparing to other programming languages. For instance, a
 `unit` variant of a union would be expressed as an empty struct as can be seen
 in the `InvocationPayload` example above. We could choose to introduce syntactic
 sugar to elide this type, but that would go against making ABI concerns
 front-and-center.

 #### Separating Layouts from Constraints {#layouts-constraints}

 Align on the syntax

 ```
     layout:constraint
 ```

 for types, i.e. anything that controls layout is before the colon, and anything
 that controls constraints is after the colon. The layout describes how the bytes
 are laid out, as opposed to how they are interpreted. The constraint restricts
 what can be represented given the layout; it is a validation step done during
 encoding/decoding.

 This syntax provides a simplified way to consider ABI implications of a change,
 and in particular leads to two shorthand rules:

 1. If two types have a different layout, it is not possible to soft transition
    from one to the other, and vice versa [[1]](#Footnote1), i.e. **changing the
    left hand side breaks ABI**
 2. Constraints can evolve, and as long as writers are more constrained than
    readers, things are compatible, i.e. **it is possible to evolve the right
    hand side and preserve ABI**
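
 The two shorthand rules can be illustrated with a hypothetical type and two
 candidate revisions of it. The type and member names are assumptions made for
 this sketch, not part of any existing library.

 ```fidl
 // Original declaration.
 type Record = struct {
     digest array<uint8, 16>;
     label string:32;
     samples vector<uint32>:8;
 };

 // Revision A only touches the right hand side of `:`. Readers built against
 // this revision accept everything that writers built against the original
 // declaration can produce, and the bytes are laid out identically.
 type RecordRevisionA = struct {
     digest array<uint8, 16>;
     label string:<64, optional>;
     samples vector<uint32>:16;
 };

 // Revision B touches the left hand side of `:`. The array grows from 16 to
 // 32 elements and `samples` changes its element type, so the bytes are laid
 // out differently and ABI is broken.
 type RecordRevisionB = struct {
     digest array<uint8, 32>;
     label string:32;
     samples vector<uint64>:8;
 };
 ```

 Note that the compatibility in revision A is directional: a reader still built
 against the original declaration may reject a 40 byte `label` sent by a writer
 using revision A, which is why writers must stay more constrained than readers
 during the transition.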

 Here are example changes following this principle:

 * `array<T>:N` _becomes_ `array<T, N>`
 * `handle<K>` _becomes_ `handle:K`
 * `vector<T>?` _becomes_ `vector<T>:optional`
 * `Struct?` _becomes_ `box<Struct>`
 * `Table?` _becomes_ `Table:optional`
 * `Union?` _becomes_ `Union:optional`

 Those changes are discussed in the [design](#design) section of this RFC.

 Note: This RFC does not introduce new functionality to FIDL, and so `box<T>` is
 only allowed for structs, and within structs. This puts the layout in front of
 users, whereas before it was hidden behind syntax. Similarly, optional tables
 are not allowed in FIDL, but that restriction is a semantic restriction, not a
 grammar restriction. We look to harmonize handling of optionality in another
 RFC, for instance by allowing optional primitives natively through boxing.

 Note: From the FIDL language perspective, i.e. the FIDL language specification,
 a `string` can be viewed purely as a new type of a bytes vector, along with the
 UTF-8 well formedness constraint, i.e. loosely `vector<uint8>:UTF-8`. We expect
 bindings to understand this special named type, and to map it to an ergonomic
 version in the target language, e.g. `std::string` in C++ bindings, `string` in
 Go, or `char*` in low level C-family bindings. This is not dissimilar to mapping
 the error syntax to a more ergonomic version, e.g. `std::result` in Rust.
 Essentially, from a FIDL language standpoint, nothing about string needs to be
 "special", it does not need to appear in the specification. The treatment of
 string is best left to the [FIDL bindings specification][bindings-spec].

 #### Binary wire format first {#binary-wire-format-first}

 While many formats can represent FIDL messages, the [FIDL Wire Format][wire-format]
 (or "FIDL Binary Wire Format") is the one which has preferential treatment, and
 is catered to first.

 This means that syntax choices meant to align syntax consistency with ABI
 consistency should consider ABI under the binary wire format (and not, say,
 other formats like JSON).

 As an example, names do not matter when it comes to types' ABI — names _do_
 matter for protocols and methods. While names might matter for a possible JSON
 format, we choose to over-rotate towards the binary ABI format when making syntax
 choices, and would not alter the syntax to advantage a textual representation if
 it hinders the understanding of ABI rules.

 #### Fewest features {#fewest-features}

 Wright's ["form and function should be one"](https://www.guggenheim.org/teaching-materials/the-architecture-of-the-solomon-r-guggenheim-museum/form-follows-function)
 makes us strive for similar looking constructs to have similar looking meaning,
 and vice versa. As an example, all extensible data, which internally leverage
 [envelopes][envelopes], are always presented with `ordinal:`.

 ```
 layout {
     ordinal: name type;
 };
 ```

 We strive to have the fewest features and rules, and aim to combine features to
 achieve use cases. In practice, when considering new features, we should first
 try to adapt or generalize other existing features rather than introduce new
 features. As an example, while special syntax can be designed for extensible
 method arguments (and returns) as discussed in
 [RFC-0044: Extensible Method Arguments][rfc-0044]
 we prefer leveraging `table` and the normal syntax for those.

 One could argue that we should even require anonymous `struct` layouts for
 method requests and responses rather than the current syntactic sugar for
 arguments borrowed from most programming languages. However, a competing design
 consideration is to help library authors in aggregate achieve consistency: in
 `enum` layout declaration, we prefer syntactic sugar over explicitly choosing a
 wrapped type, as having a sensible default provides greater consistency for enums
 across FIDL libraries. This in turn provides a migration path to switch enums
 down the road, e.g. should a library define a general purpose `ErrorStatus`
 enum, it could be replaced later by another 'better' general purpose
 `ErrorStatusV2`.

 ## Design {#design}

 ### Types

 Types follow the general form:

 ```
 Name<Param1, Param2, ...>:<Constraint1, Constraint2, ...>
 ```

 Empty type parameterization must omit `<` and `>`, i.e. `uint32` (not
 `uint32<>`).

 A type with no constraints must omit both the `:` separator, and `<`, `>`, i.e.
 `uint32` (not `uint32:<>`, nor `uint32:`).

 A type with a single constraint may omit `<` and `>`, i.e. `vector<uint32>:5`
 and `vector<uint32>:<5>` are both allowed, and equivalent.
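
 As a sketch of these rules, the hypothetical struct below uses each permitted
 form once. The struct and member names are made up for illustration.

 ```fidl
 type Shapes = struct {
     // No parameters and no constraints: neither `<>` nor `:` appears.
     plain uint32;
     // One layout parameter, no constraints.
     unbounded vector<uint32>;
     // Two layout parameters, no constraints.
     fixed array<uint32, 4>;
     // A single constraint, written without and with angle brackets; the two
     // members have equivalent types.
     short_form vector<uint32>:5;
     long_form vector<uint32>:<5>;
     // Several constraints require the angle brackets.
     bounded_optional vector<uint32>:<5, optional>;
 };
 ```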

 #### Built Ins

 The following **primitive types** are supported:

 *   Boolean `bool`
 *   Signed integer `int8`, `int16`, `int32`, `int64`
 *   Unsigned integer `uint8`, `uint16`, `uint32`, `uint64`
 *   IEEE 754 Floating-point `float32`, `float64`

 **Fixed sized repeated values**:

 ```
 array<T, N>
 ```

 Which can be thought of as a `struct` with `N` elements of type `T`.

 **Variable sized repeated values**:

 ```
 vector<T>
 vector<T>:N
 ```

 i.e. the size `N` can be omitted.

 **Variable sized UTF-8 strings**:

 ```
 string
 string:N
 ```

 i.e. the size `N` can be omitted.

 **References to kernel objects, i.e. handles**:

 ```
 handle
 handle:S
 ```

 Where the subtype `S` is one of `bti`, `buffer`, `channel`, `debuglog`, `event`,
 `eventpair`, `exception`, `fifo`, `guest`, `interrupt`, `iommu`, `job`, `pager`,
 `pcidevice`, `pmt`, `port`, `process`, `profile`, `resource`, `socket`,
 `suspendtoken`, `thread`, `timer`, `vcpu`, `vmar`, `vmo`.

 Handles with rights introduced in [RFC-0028: Handle Rights][rfc-0028]:

 ```
 handle:<S, R>
 ```

 Where the rights `R` are either a rights value, or a rights expression.

 **References to protocol objects, i.e. channel handles of targeted use**:

 ```
 client_end:P
 server_end:P
 ```

 e.g. `client_end:fuchsia.media.AudioCore` or `server_end:fuchsia.ui.scenic.Session`.

 Specifically, it is not legal to reference a protocol by itself: protocol
 declarations do not introduce a type, only what can be thought of as a kind of
 client or server ends. This is discussed at greater length in the [Transport
 generalization](#transport-generalization) section.
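
 A minimal sketch of how the two ends might appear in method parameters follows.
 The library and both protocols are hypothetical, and the comments describe the
 usual channel ownership convention rather than anything mandated by this RFC.

 ```fidl
 library fuchsia.examples.sketch;

 protocol Logger {
     Log(message string:256);
 };

 protocol LoggerRegistry {
     // The caller creates a channel, keeps the client end, and passes the
     // server end to be served by the callee.
     Connect(logger server_end:Logger);

     // The caller passes a client end, which the callee uses to call `Log`.
     // The end is optional here, with optionality placed last.
     SetSink(sink client_end:<Logger, optional>);

     // Not legal: a protocol declaration does not introduce a type.
     // Attach(logger Logger);
 };
 ```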

 ### Layouts {#layouts}

 In addition to the built in layouts, we have five layouts which can be
 configured to introduce new types:

 *   `enum`
 *   `bits`
 *   `struct`
 *   `table`
 *   `union`

 #### Finite layout

 Both `enum` and `bits` layout are expressed in similar ways:

 ```
 layout : WrappedType {
     MEMBER = expression;
     ...;
 };
 ```

 Where the `: WrappedType` is optional[^2], and defaults to `uint32` if omitted.

 An example `enum`:

 ```fidl
 enum {
     OTHER = 1;
     AUDIO = 2;
     VIDEO = 3;
     ...
 };
 ```

 An example `bits`:

 ```fidl
 bits : uint64 {
     TOTAL_BYTES = 0x1;
     USED_BYTES  = 0x2;
     TOTAL_NODES = 0x4;
     ...
 };
 ```
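
 Combined with a type introduction declaration, the same layouts yield named
 top-level types, and they can also be used anonymously as the type of a member.
 The sketch below uses hypothetical names throughout.

 ```fidl
 // Wrapped type omitted: defaults to uint32.
 type MediaKind = enum {
     OTHER = 1;
     AUDIO = 2;
     VIDEO = 3;
 };

 // Wrapped type given explicitly.
 type InfoQuery = bits : uint64 {
     TOTAL_BYTES = 0x1;
     USED_BYTES = 0x2;
     TOTAL_NODES = 0x4;
 };

 // Anonymous use: the layout appears directly as a member's type.
 type Track = struct {
     kind MediaKind;
     channel_layout enum : uint8 {
         MONO = 1;
         STEREO = 2;
     };
 };
 ```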

 #### Flexible layouts

 Both `table` and `union` layouts are expressed in similar ways:

 ```
 layout {
     ordinal: member_name type;
     ...;
 };
 ```

 Here, the `ordinal:` can be thought of as syntactic sugar to describe an
 `envelope<type>`.

 For tables, members are often referred to as fields. For unions, members are
 often referred to as variants. Additionally, members may be reserved:

 ```
 layout {
     ordinal: reserved;
     ...
 };
 ```
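
 For illustration, here is a hypothetical table and union written in this form,
 each with one reserved ordinal. The names are assumptions made for this sketch.

 ```fidl
 type Profile = table {
     1: name string:64;
     2: reserved;
     3: avatar vector<uint8>:1024;
 };

 type Contact = union {
     1: email string:128;
     2: reserved;
     3: phone string:32;
 };
 ```

 Marking ordinal `2` as reserved keeps it from being assigned to a new member by
 accident.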

 #### Rigid layouts

 The only rigid layout, `struct`, is expressed in a way that is close to flexible
 layouts, without the ordinal notation:

 ```
 layout {
     member_name type;
     ...;
 };
 ```

 For structs, members are often referred to as fields.

 #### Attributes

 A layout may be preceded by attributes for that layout:

 ```fidl
 [MaxBytes = "64"] struct {
     x uint32;
     y uint32;
 };
 ```

 This makes it possible to unambiguously attach attributes to both the member of
 a layout, and the type of that member:

 ```fidl
 table {
     [OnMember = "origin"]
     1: origin [OnLayout] struct {
         x uint32;
         y uint32;
     };
 };
 ```

 In the case of the introduction of a [new type](#newtypes) that is a layout, there are two
 possible placements for attributes on the newly introduced type:

 * On the new type: `[Attr] type MyStruct = struct { ... }`.
 * On the layout: `type MyStruct = [Attr] struct { ... }`.

 `fidlc` will consider these equivalent, and raise an error if attributes are
 specified in both places.

 Regardless of which placement is used to specify the attributes, the attributes
 are conceptually attached to the layout itself rather than the type stanza as a
 whole. An example of a practical application of this is that in any IR the
 preference would be to lower attributes on the type stanza down to the layout
 rather than hoist the attributes on the layout up to the type stanza.

 ### Naming context and use of layouts {#layout-naming-contexts}

 Layouts themselves do not carry names; in a way, all layouts are "anonymous".
 Instead, it is a specific use of a layout which determines the name it will have
 in the target language.

 For instance, the most common use of layouts is to introduce a new top-level
 type:

 ```fidl
 library fuchsia.mem;

 type Buffer = struct {
     vmo handle:vmo;
     size uint64;
 };
 ```

 Here, the struct layout is used in a "new type" declaration within the top-level
 library.

 An example use in an anonymous context was covered in the introductory notes to
 express extensible method arguments:

 ```fidl
 library fuchsia.bluetooth.le;

 protocol Peripheral {
     StartAdvertising(table {
         1: data AdvertisingData;
         2: scan_response AdvertisingData;
         3: mode_hint AdvertisingModeHint;
         4: connectable bool;
         5: handle server_end:AdvertisingHandle;
     }) -> () error PeripheralError;
 };
 ```

 Here, the table layout is used within the request of the `StartAdvertising`
 method, in the `Peripheral` protocol declaration.

 We refer to the list of names, from least specific to most specific, which
 identifies the use of a layout as its "naming context". In the two examples
 above, we have respectively `fuchsia.mem/Buffer` and
 `fuchsia.bluetooth.le/Peripheral, StartAdvertising, request` as the two naming
 contexts.

 In the JSON IR, layout declarations will include their naming context, i.e. the
 hierarchical list of names described above.

 #### Naming contexts {#naming-contexts}

 Within a library `some.library`, a `type Name = ` declaration introduces a
 naming context for `some.library/Name`.

 A use within a request (respectively a response) of a `Method` within
 `Protocol` introduces a naming context of `some.library/Protocol, Method,
 request/response`.

 A use within a layout adds the field name (or variant name) to the naming
 context. For instance:

 ```fidl
 type Outer = struct {
     inner struct {
         ...
     };
 };
 ```

 The first outer struct layout's naming context is `some.library/Outer`, and the
 second inner struct layout's naming context is `some.library/Outer, inner`.
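
 The rules above compose when layouts are nested more deeply. In the sketch
 below, each layout is annotated with the naming context these rules would give
 it. The member names are hypothetical, and the use of a single anonymous layout
 as a whole response is an assumption mirroring the request case.

 ```fidl
 library some.library;

 type Outer = struct {              // some.library/Outer
     inner struct {                 // some.library/Outer, inner
         choice union {             // some.library/Outer, inner, choice
             1: leaf table {};      // some.library/Outer, inner, choice, leaf
         };
     };
 };

 protocol Protocol {
     Method(table {                 // some.library/Protocol, Method, request
         1: options struct {};      // some.library/Protocol, Method, request, options
     }) -> (struct {                // some.library/Protocol, Method, response
         ok bool;
     });
 };
 ```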

 #### Generated flattened name {#flattened-name}

 Many target languages can represent naming context hierarchically. In C++ for
 instance, a type can be defined within an enclosing type. However, some target
 languages do not have this ability, and we must therefore consider name clashing
 caused by flattening naming contexts.

 Consider for instance the naming context `some.library/Protocol, Method,
 request`. This may be flattened to `some.library/MethodRequestOfProtocol` in
 Go. If some other definition happens to use the naming context
 `some.library/MethodRequestOfProtocol` then the Go bindings are faced with a
 conundrum: one of the two declarations must be renamed. Worse, should a library
 with one declaration (no name clash) evolve into a library with the two
 declarations (with a name clash), then the Go bindings must be consistent with
 what was generated before in order to avoid a source breaking change.

 Our experience has shown that these decisions are best left to the core FIDL
 compiler, rather than delegated down the toolchain to FIDL bindings. We will
 therefore compute and guarantee a stable flattened name.

 In the JSON IR, naming contexts will include a generated flattened name which
 the compiler guarantees is unique in global scope, i.e. the frontend compiler is
 responsible for generating flattened names, and verifying that flattened names
 do not clash with other declarations (be it other flattened names, or top-level
 declarations).

 Taking the example above, should a library author add a declaration `type
 MethodRequestOfProtocol = ...` which clashes with the generated flattened name of
 another declaration, compilation will fail.

 #### Use of naming contexts by bindings

 Bindings can be split in roughly two categories:

 1. Ability to represent naming context scoping in the target language, e.g.
    bindings for the C++ language;
 2. Inability to represent naming context scoping, with a fallback to the use of
    the generated flattened name, e.g. bindings for the Go language.

 That's an improvement over the situation today because we'll at least be
 consistent between bindings, and have compiler help on the frontend. Today, we
 have to generate some of the names late in the game (in the backend), which is a
 hazardous and error prone approach.

 For instance, consider the definition:

 ```fidl
 type BinOp = union {
     1: add struct {
         left uint32;
         right uint32;
     };
 };

 In C++ bindings, we could end up with:

 ```cpp
 class BinOp {
     class Add {
         ...
     };
 };
 ```

 The accessor to the variant `add` would be:

 ```cpp
 bin_op.add();  // `bin_op` is an instance of `BinOp`
 ```

 which does not clash with the class definition.

 Or in Go, with the use of flattened names:

 ```go
 type BinOp struct { ... };
 type BinOpAdd struct { ... };
 ```

 Should the library author later decide to introduce a top-level declaration
 named `BinOpAdd`, this would be caught by the frontend compiler and reported as
 an error. The library author is put in control to think through the
 ramifications of this change, and would have the option to decide to break
 source compatibility for the introduction of this new declaration. Again, this
 is an improvement over the current situation where such source compatibility
 breakages are discovered later, and farther from where the decision was made.
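
 As a sketch of that scenario, and assuming the anonymous struct is given the
 flattened name `BinOpAdd` as in the Go rendering above, the following library
 would be rejected by the frontend compiler. The second declaration and its
 members are hypothetical.

 ```fidl
 library some.library;

 type BinOp = union {
     1: add struct {
         left uint32;
         right uint32;
     };
 };

 // Clashes with the flattened name assumed for the anonymous struct used by
 // the `add` variant, so compilation would fail here.
 type BinOpAdd = struct {
     left uint64;
     right uint64;
 };
 ```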

 ### Type Aliasing, and New Type {#newtypes}

 In [RFC-0052: Type Aliasing and New Types][rfc-0052] we evolved type aliasing and
 new type declarations.

 Aliases are declared as:

 ```fidl
 alias NewName = AliasedType;
 ```

 i.e. unchanged from syntax proposed in RFC-0052.

 New types are declared as:

 ```fidl
 type NewType = WrappedType;
 ```

 i.e. the syntax for new types is the same whether the wrapped type is another
 existing type (wrapping) or some layout (new top-level type). This differs from
 the initially proposed syntax in [RFC-0052][rfc-0052].
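
 The three forms can be compared in one hypothetical library fragment. All names
 below are assumptions made for this sketch.

 ```fidl
 const MAX_PATH_LENGTH uint32 = 4096;

 // An alias: another name for the aliased type.
 alias Path = string:MAX_PATH_LENGTH;

 // A new type wrapping an existing type.
 type Celsius = float64;

 // A new type introduced from a layout, i.e. a new top-level type.
 type Reading = struct {
     source Path;
     temperature Celsius;
 };
 ```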

 ### Optionality

 Certain types are inherently capable of being optional: `vectors`, `strings`,
 `envelopes`, and layouts using such constructs, i.e. `table` which is a vector (of
 envelopes) and a `union` which is a tag plus an envelope. As a result, whether
 these types are optional or not is a constraint, and can be evolved into
 (becoming nullable, by relaxing the constraint), or evolved out of (becoming
 required, by tightening the constraint).

 On the other hand, types such as `int8` or `struct` layout are not inherently
 capable of being optional. In order to have optionality, one needs to introduce
 an indirection, for instance via an indirect reference in the struct case. As a
 result, unlike types which are inherently optional, no evolutionary path is
 possible.

 To distinguish between these two cases, and following the principle of keeping
 ABI concerns "on the left" and evolvable concerns "on the right", we have:

 | Naturally optional | Not naturally optional |
 |--------------------|------------------------|
 | `string:optional`  | `box<struct>`          |
 | `vector:optional`  |                        |
 | `union:optional`   |                        |

 Naming wise, we prefer the terms "optional", "required", "present", "absent".
 (We should avoid "nullable", "not nullable", "null fields".) In line with that
 naming preference, we choose `box<T>` rather than `pointer<T>`. A `box` is
 an optional by default structure, i.e. `box<struct>` in the new syntax is
 equivalent to `struct?` in the old syntax, and `box<struct>:optional` is
 redundant and may trigger a warning from the compiler or linter. This is to
 better match the use case we expect: users generally box structs to get
 optionality rather than to add indirection.

 ### Constants

 Constants are declared as:

 ```fidl
 const NAME type = expression;
 ```
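
 A few hypothetical constants in this form, one of which is then used as a size
 constraint, might look as follows. The names and values are illustrative only.

 ```fidl
 const MAX_NAME_LENGTH uint32 = 64;
 const DEFAULT_LABEL string = "untitled";
 const VERBOSE bool = false;

 type Named = struct {
     name string:MAX_NAME_LENGTH;
 };
 ```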

 ### Constraint ordering

 When parameterizing a type based on layouts and constraints, the ordering of
 these arguments is fixed for a given type. This RFC defines the following
 orders for constraints (no type has multiple layout arguments yet):

 * Handles: subtype, rights, optionality.
 * Protocol client/server_end: protocol, optionality.
 * Vector: size, optionality.
 * Unions: optionality.

 As a guiding principle, optionality always comes last, and, for handles,
 subtype before rights.

 As an example, consider this struct with all possible constraints defined on its
 members:

 ```fidl
 type Foo = struct {
   h1 zx.handle;
   h2 zx.handle:optional;
   h3 zx.handle:VMO;
   h4 zx.handle:<VMO,optional>;
   h5 zx.handle:<VMO,zx.READ>;
   h6 zx.handle:<VMO,zx.READ,optional>;
   p1 client_end:MyProtocol;
   p2 client_end:<MyProtocol,optional>;
   r1 server_end:MyProtocol;
   r2 server_end:<MyProtocol,optional>;
   s1 MyStruct;
   s2 box<MyStruct>;
   u1 MyUnion;
   u2 MyUnion:optional;
   v1 vector<bool>;
   v2 vector<bool>:optional;
   v3 vector<bool>:16;
   v4 vector<bool>:<16,optional>;
 };
 ```

 ## Future Direction {#future-directions}

 In addition to changes to the syntax of features which currently exist, we look
 ahead and set the direction for features which are expected to see the light of
 day in the near future. Here, the focus is on intended expressivity and its
 syntactic rendering (not on the precise semantics, which warrant separate RFCs). For
 instance, while we describe transport generalization, we do not discuss various
 thorny design issues (e.g. extent of configurability, representation in JSON
 IR).

 This section is also expected to be read as directional, and not as a future
 specification. As new features are introduced, their corresponding syntax will
 be evaluated along with the precise workings of those features.

 ### Contextual name resolution

 E.g.

 ```fidl
 const A_OR_B MyBits = MyBits.A | MyBits.B;
 ```

 Would be simplified to:

 ```fidl
 const A_OR_B MyBits = A | B;
 ```

 E.g.

 ```fidl
 zx.handle:<zx.VMO, zx.rights.READ_ONLY>
 ```

 Would be simplified to:

 ```fidl
 zx.handle:<VMO, READ_ONLY>
 ```

 ### Constraints

 #### Declaration site constraints

 ```fidl
 type CircleCoordinates = struct {
     x int32;
     y int32;
 }:x^2 + y^2 < 100;
 ```

 #### Use site constraints

 ```fidl
 type Small = struct {
     content fuchsia.mem.Buffer:vmo.size < 1024;
 };
 ```

 #### Standalone constraints

 ```fidl
 constraint Circular : Coordinates {
     x^2 + y^2 < 100
 };
 ```

 ### Constraints on envelopes

 The syntax of tables and extensible unions hides the use of envelopes:

 *   A `table` is a `vector<envelope<...>>`, and
 *   A `union` is a `struct { tag uint64; variant envelope<...>; }`.

 Right now, the `ordinal:` which appears in `table` and `union` declarations is
 the only place where envelopes exist, and it's useful to think of this syntax
 as the "sugared" introduction of an envelope. Essentially, we can de-sugar as
 follows:

 <table>
   <tr>
    <td colspan="2" ><strong>Desugaring tables and flexible unions</strong>
    </td>
   </tr>
   <tr>
    <td>
 <pre class="prettyprint">table ExampleTable {
     1: name string;
     2: size uint32;
 };</pre>
    </td>
    <td>
 <pre class="prettyprint">table ExampleTable {
     @1 name envelope&lt;string&gt;;
     @2 size envelope&lt;uint32&gt;;
 };</pre>
    </td>
   </tr>
   <tr>
    <td>
 <pre class="prettyprint">union ExampleUnion {
     1: name string;
     2: size uint32;
 };</pre>
    </td>
    <td>
 <pre class="prettyprint">union ExampleUnion {
     @1 name envelope&lt;string&gt;;
     @2 size envelope&lt;uint32&gt;;
 };</pre>
    </td>
   </tr>
 </table>

 Should we want to constrain the `envelope`, say to `require` an element, we
 would place this constraint on the ordinal `ordinal:C` such as:

 <table>
   <tr>
    <td colspan="2" ><strong>Desugaring tables and flexible unions</strong>
    </td>
   </tr>
   <tr>
    <td>
 <pre class="prettyprint">table ExampleTable {
     1:C1 name string:C2;
     2:C size uint32;
 };</pre>
    </td>
    <td>
 <pre class="prettyprint">table ExampleTable {
     @1 name envelope&lt;string:C2&gt;:C1;
     @2 size envelope&lt;uint32&gt;:C;
 };</pre>
    </td>
   </tr>
   <tr>
    <td>
 <pre class="prettyprint">union ExampleUnion {
     1:C1 name string:C2;
     2:C size uint32;
 };</pre>
    </td>
    <td>
 <pre class="prettyprint">union ExampleUnion {
     @1 name envelope&lt;string:C2&gt;:C1;
     @2 size envelope&lt;uint32&gt;:C;
 };</pre>
    </td>
   </tr>
 </table>

 ### Properties

 FIDL's type system is already one which has the concept of constraints. We have
 `vector<uint8>:8` to mean that a vector has at most 8 elements, or `string:optional`
 to relax the optionality constraint and allow the string to be optional.

 Various needs are pushing towards both more expressive constraints, and an
 opinionated view of how these constraints are unified and handled.

 For instance, [fuchsia.mem/Buffer][mem-buffer]
 notes "This size must not be greater than the physical size of the VMO." Work is
 ongoing to introduce [RFC-0028: Handle Rights][rfc-0028],
 i.e. constraining handles. Another idea is requiring table fields, i.e.
 constraining the presence of otherwise optional envelopes.

 Right now, there is no way to describe runtime properties of the values or
 entities being manipulated. While a `string` value has a size, it is not
 possible to name this. While a `handle` has rights associated with it, it is not
 possible to name these either.

 To properly solve the expressivity problem associated with constrained types, we
 must first bridge the runtime aspects of values, with the limited view which
 FIDL has of these values. We plan to introduce **properties** which can be
 thought of as virtual fields attached to values. Properties have no impact on
 the wire format, they are purely a language level construct, and appear in the
 JSON IR for bindings to give runtime meaning to them. Properties exist for the
 sole purpose of expressing constraints over them. Each and every property would
 need to be known to bindings, in a similar fashion that built ins are known to
 bindings.

 Continuing the example above, a `string` value may have a `size uint32`
 property, and a handle may have a `rights zx.rights` property.

 For instance:

 ```
 layout name {
     properties {
         size uint32;
     };
 };
 ```

 ### Transport generalization {#transport-generalization}

 Declaring a new transport would at least require defining a new name, specifying
 constraints for the messages the transport supports (e.g. "no handles", "no
 tables"), and specifying constraints for the protocol (e.g. only
 "fire-and-forget methods", "no events").

 The envisaged syntax resembles a configuration expressed in untyped FIDL Text:

 ```
 transport ipc = {
     methods: {
         fire_and_forget: true,
         request_response: true,
     },
     allowed_resources: [handle],
 };
 ```

 Note: the literal can be untyped because the target type is determined by
 assigning this to a transport.

 And then used as:

 ```
 protocol SomeProtocol over zx.ipc {
     ...
 };
 ```

 ### Handle generalization {#handle-generalization}

 Right now, handles are a purely Fuchsia specific concept: they are directly tied
 to the Zircon kernel, map to `zx_handle_t` (or its equivalent in languages
 other than C), and their kinds are only the objects exposed by the kernel such as
 `port`, `vmo`, `fifo`, etc.

 When considering other cases (e.g. in-process communication),
 one desirable extension point is to be able to define handles in FIDL directly,
 rather than have that be a part of the language definition.

 As an example, defining zircon handles:

 ```fidl
 library zx;

 resource handle : uint32 {
     properties {
         subtype handle_subtype;
         rights rights;
     };
 };

 type handle_subtype = enum {
     PROCESS = 1;
     THREAD = 2;
     VMO = 3;
     CHANNEL = 4;
 };

 type rights = bits {
     READ = ...;
     WRITE = ...;
 };
 ```

 This would allow `handle` or `handle:VMO` (or in another library
 `zx.handle:zx.handle.VMO`).

 An [experimental](https://fuchsia-review.googlesource.com/c/fuchsia/+/390333)
 implementation exists, and will be used to break the cyclic dependency between
 Zircon and FIDL (until this change, Zircon's API was described in FIDL, but FIDL
 was partly defined in terms of Zircon's API).

 ## Implementation Strategy

 A temporary "version declaration" will be added to the top of all `.fidl` files
 to be used by `fidlc` to detect whether a `.fidl` file is in the prior or new
 syntax.

 This token will immediately precede the library statement:

 ```fidl
 // Copyright notice...

 deprecated_syntax;

 library fidl.test;
 ...
 ```

 An explicit marker is preferred in order to simplify the role of `fidlc` in
 detecting the syntax and to improve readability. An example of a challenge from
 detecting syntax is the case where interpreting as either syntax leads to
 compilation errors. These scenarios would require a heuristic to decide between
 the old and new syntax, which could lead to surprising results.
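
As a minimal sketch of why an explicit marker keeps detection simple, the check
reduces to inspecting the first statement of the file. This is not how `fidlc`
is implemented: the function name `detect_syntax` and the line-based comment
skipping are assumptions made for illustration.

```python
import re

MARKER = "deprecated_syntax"  # marker token taken from the RFC text


def detect_syntax(source: str) -> str:
    """Return "old" if the file starts with the marker, else "new".

    Hypothetical helper: only `//` line comments and blank lines are skipped
    before looking at the first statement.
    """
    for raw_line in source.splitlines():
        line = raw_line.split("//", 1)[0].strip()
        if not line:
            continue  # blank line or comment-only line
        # The first real statement decides: the marker, or anything else.
        if re.fullmatch(rf"{MARKER}\s*;", line):
            return "old"
        return "new"
    return "new"  # empty file: nothing marks it as the prior syntax
```

No heuristic is involved: the answer depends on a single token, never on
whether the rest of the file happens to parse under one syntax or the other.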

 Further, this token is added to all files in the prior syntax rather than in the
 new syntax (e.g. `new_syntax;`) in order to socialize the aspect of the
 upcoming migration - readers of FIDL files will get a sense that the syntax is
 about to change and can seek additional context through other channels (e.g.
 documentation, mailing lists).

 A new `fidlconv` host tool will be added that can take FIDL files in the old
 format and convert them to files in the new format, referred to as `.fidl_new`
 for the purposes of this section. Though this tool is separate from `fidlc`,
 it will need to leverage the compiler's internal representation to perform
 this conversion correctly. For example, a type `Foo` will need to be converted to
 `client_end:Foo` only if it is a protocol. To determine whether this is the
 case, `fidlconv` will leverage `fidlc` to compile the FIDL library first.
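
The dependency on compiled library information can be illustrated with a
minimal sketch. It is not the `fidlconv` implementation: the function
`convert_type` is hypothetical, and the `protocols` set stands in for what
compiling the library with `fidlc` would reveal.

```python
import re
from typing import AbstractSet

# Matches the prior-syntax server end form, e.g. `request<Foo>`.
_REQUEST = re.compile(r"request<\s*([\w.]+)\s*>")


def convert_type(old_type: str, protocols: AbstractSet[str]) -> str:
    """Rewrite a prior-syntax type name into the new syntax.

    `protocols` is assumed to hold the names that resolve to protocol
    declarations; a real tool would obtain this from the compiler.
    """
    match = _REQUEST.fullmatch(old_type)
    if match:
        return f"server_end:{match.group(1)}"
    if old_type in protocols:
        return f"client_end:{old_type}"
    return old_type  # any other type keeps its name
```

The `request<P>` form can be recognized from the text alone, whereas a bare
`Foo` cannot be rewritten without knowing what `Foo` resolves to.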

 The FIDL frontend compiler `fidlc` as well as accompanying tools like the
 formatter and linter will be extended to support either syntax based on the
 marker defined above.

 With this added functionality, the build pipeline will be extended as follows:

 ![Visualization: build pipeline strategy](resources/0050_syntax_revamp/strategy.png)

 That is:

 *   A `fidlconv` tool will convert FIDL files in the old syntax to the new
     syntax.
 *   The `fidlc` compiler will output the `.json` IR by compiling the old syntax.
 *   Separately, the `fidlc` compiler will output the `.json` IR by compiling the new
     syntax.
 *   The `fidlfmt` formatter will format the generated new library files
     `.fidl_new`.

 For testing and verification:

 *   The two JSON IRs will be compared and verified to match (except for span
     information).
 *   Idempotency of the formatting of new library files will be verified to
     check both the output of the `fidlc` compiler, and of the `fidlfmt`
     formatter with the new syntax.
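
The two checks above can be sketched as follows. This is a minimal illustration
rather than the actual test harness: the helper names are hypothetical, and the
key under which span information is stored (`location` here) is an assumption
about the shape of the IR.

```python
import json
from typing import Any, Callable

SPAN_KEYS = frozenset({"location"})  # assumed name of span-carrying keys


def strip_spans(node: Any) -> Any:
    """Return a copy of a decoded JSON value without span-carrying keys."""
    if isinstance(node, dict):
        return {k: strip_spans(v) for k, v in node.items()
                if k not in SPAN_KEYS}
    if isinstance(node, list):
        return [strip_spans(v) for v in node]
    return node


def irs_match(old_ir_json: str, new_ir_json: str) -> bool:
    """Compare two JSON IR documents, ignoring span information."""
    old_ir = strip_spans(json.loads(old_ir_json))
    new_ir = strip_spans(json.loads(new_ir_json))
    return old_ir == new_ir


def is_idempotent(fmt: Callable[[str], str], source: str) -> bool:
    """Check that formatting already formatted output changes nothing."""
    once = fmt(source)
    return fmt(once) == once
```

Here `fmt` is any callable that maps source text to formatted text, for
instance a thin wrapper around the formatter binary.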

 As part of this implementation, the FIDL team will also move the coding tables
 backend to be a standalone binary (in the same vein as other backends), and will
 obsolete and delete the C bindings backend by generating its last uses and
 checking them into the fuchsia.git repository.

 ## Ergonomics

 This RFC is all about ergonomics.

 We are willing to trade a short-term productivity loss for developers familiar
 with the current syntax, as they retrain to use the modified syntax, because we
 strongly believe that the many more developers who will use FIDL in the future
 will greatly benefit.

 ## Documentation and Examples

 This will require changing:

 *   [FIDL language specification][language-spec]
 *   [FIDL grammar][fidl-grammar]
 *   all FIDL code examples

 ## Backwards Compatibility

 This change is not backwards compatible. See the implementation section for the
 transition plan.

 ## Performance

 This change has no impact on performance.

 ## Security

 This change has no impact on security.

 ## Testing

 See the implementation section for the transition plan and for how its
 correctness is verified.

 ## Drawbacks, Alternatives, and Unknowns

 ### Using colon to separate name from type

 Since we're moving types to be second, we could also consider using the quite
 common `:` separator as is done in type theory, Rust, Kotlin, the ML languages
 (SML, Haskell, OCaml), Scala, Nim, Python, TypeScript, and many more:

 ```
     field: int32 rather than the proposed field int32
 ```

 This proposal rejects this approach.

 The `:` separator is primarily used to separate layouts from constraints. It
 is also used to indicate a "wrapped type" for `enum` and `bits` declarations.
 Finally it is used to denote envelopes in `table` and `union` declarations.
 Further overloading the `:` separator, especially in close grammatical proximity
 to its main use will lead to confusion (e.g. a table member `1: name:
 string:128;`).

 ### Omitting semicolons

 We have discussed omitting the semicolons that terminate declarations (be it
 member, const, or other).

 This proposal chooses not to explore this simplification.

 Removing semicolons makes little syntactic difference for FIDL authors. It's
 also not a key change to make, and should we want to explore this in the future
 it will be easy to modify (e.g. [Go's approach to remove semicolons](https://golang.org/doc/effective_go.html#semicolons)).

 However, presence of semicolons to terminate members and declarations makes it
 much easier to guarantee unambiguous grammar rules especially as we explore
 constraints (use-site and declaration-site). For instance, with a declaration
 site layout constraint (`C`) such as `struct Example { ... }:C;` we delineate a
 constraint nicely between the `:` separator and the `;` terminator.

 ### Unifying enums and unions

 From a type theoretic standpoint, an enumeration represents a sum of unit types,
 and a union represents a sum of any types. It is therefore tempting to seek to
 unify these two concepts into one. This is the approach taken by programming
 languages which support ADTs such as
 [ML](https://en.wikipedia.org/wiki/ML_(programming_language)) or Rust.

 However, from a layout standpoint, a sum type of only unit types (an
 enumeration) can be represented much more efficiently than the extensible
 counterpart (a union). While both offer extensibility in light of adding new
 members, only unions offer extensibility to go from unit types (e.g. `struct
 {}`) to any types. This extensibility comes at a cost of an inline envelope.

 We have chosen a pragmatic approach that balances the complexity of having two
 constructs, with the performance benefit of special casing enumerations.

 ## References

 On syntax

 *   [RFC-0038: Separating Layout from Constraints][rfc-0038]
 *   [RFC-0039: Types Come Second][rfc-0039]

 On extensible method arguments

 *   [RFC-0044: Extensible Method Arguments][rfc-0044]

 On type aliasing and named types

 *   [RFC-0052: Type Aliasing and New Types][rfc-0052]

 --------------------------------------------------------------------------------------------

 ##### Footnote1

 Or at least, not without a good understanding of the wire format and care, e.g.
 [fxb/360015](https://fuchsia-review.googlesource.com/c/fuchsia/+/360015)

 ##### Footnote2

 While it may seem odd to prefer syntactic conciseness over explicitly choosing a
 wrapped type, having a sensible default provides greater consistency for enums
 across FIDL libraries. This in turn provides a migration path to switch enums
 down the road, e.g. should a library define a general purpose `ErrorStatus`
 enum, it could be replaced later by another 'better' general purpose
 `ErrorStatusV2`.

 [envelopes]: /docs/contribute/governance/rfcs/0047_tables.md#envelopes
 [rfc-0028]: /docs/contribute/governance/rfcs/0028_handle_rights.md
 [rfc-0038]: /docs/contribute/governance/rfcs/0038_seperating_layout_from_constraints.md
 [rfc-0039]: /docs/contribute/governance/rfcs/0039_types_come_second.md
 [rfc-0044]: /docs/contribute/governance/rfcs/0044_extensible_method_arguments.md
 [rfc-0052]: /docs/contribute/governance/rfcs/0052_type_aliasing_named_types.md
 [bindings-spec]: /docs/reference/fidl/language/bindings-spec.md
 [language-spec]: /docs/reference/fidl/language/language.md
 [fidl-grammar]: /docs/reference/fidl/language/grammar.md
 [wire-format]: /docs/reference/fidl/language/wire-format
 [mem-buffer]: /sdk/fidl/fuchsia.mem/buffer.fidl