<!-- mdformat off(templates not supported) -->
{% set rfcid = "RFC-0131" %}
{% include "docs/contribute/governance/rfcs/_common/_rfc_header.md" %}
# {{ rfc.name }}: {{ rfc.title }}
<!-- SET the `rfcid` VAR ABOVE. DO NOT EDIT ANYTHING ELSE ABOVE THIS LINE. -->
<!-- mdformat on -->
## Summary
We describe the current (as of Sept 2021) design principles underpinning the
[FIDL wire format][wire-format].
## Motivation
The FIDL wire format specifies how messages are to be encoded (and decoded), as
well as the format for transport level metadata such as the transactional
message header. Implicit in the wire format specification are theoretical
bounds that an optimal implementation may attain. Just as data structures imply
certain big-O bounds on operations, so does the wire format.
In Fuchsia, interprocess communication (at least in the control plane) happens
over FIDL nearly everywhere, or is intended to. As a result, the wire format has
a significant impact on the overall target performance of the operating system.
Similarly, the wire format plays an important role in the multilayered defense
of privacy and security.
In March 2017, the design for "FIDL 2.0" was being completed. FIDL 2.0 is a more
static version of FIDL, compared with later developments. See also [RFC-0027:
You only pay for what you use][RFC-0027] for additional historical context.
The goals for the wire format specification were as follows[^1]:
[^1]: Authored by Jeff Brown <jeffbrown@google.com>.
* Efficiently transfer messages between processes.
* General purpose, for use with device drivers, high-level services, and
applications.
* Optimized for Zircon IPC only; portability is not a goal. (This goal was since
relaxed.)[^2]
* Optimized for direct memory access; inter-machine transport is not a goal.
* Optimized for 64-bit only; no accommodation for 32-bit environments.
* Uses uncompressed native datatypes _with host-endianness_, first-fit packing of
elements, and correct alignment to support in-place access of message
contents.
* Compatible with C structure in-memory layout (with suitable field ordering and
packing annotations).
* Structures are fixed size and inlined; variable-sized data is stored
out-of-line.
* Structures are not self-described; FIDL files describe their contents.
* No versioning of structures, but interfaces can be extended with new methods
for protocol evolution. (This goal was since relaxed.)[^2]
* No offset calculations required, very little arithmetic which may overflow.
* Support fast single-pass encoding and validation (as a combined operation).
* Support fast single-pass decoding and validation (as a combined operation).
[^2]: Some goals have since been relaxed (portability, no versioning of
structures), or tightened ([endianness][RFC-0030]).
While the ongoing evolution of the wire format has followed very specific
design principles, some outlined above, these were not necessarily written down
along with their rationale. This RFC attempts to write these design principles
down clearly.
## Design
We describe the various design principles underpinning the FIDL wire format.
### Low level first {#low-level-first}
> When faced with making a design tradeoff to support low level programming at
> the expense of high level programming (or the reverse), we typically opt for
> enabling low level programming.
FIDL must satisfy the requirements of low level protocols in Fuchsia, some of
which are used during the boot process, for instance before `malloc` is
available. Should FIDL not satisfy these requirements, the alternative is to
design protocols by hand. In high level programming, by contrast, if FIDL cannot
satisfy the requirements, there are many other options to choose from
(Protobuf, Cap'n Proto, JSON, YAML, and the like).
### Single pass, and no heap allocation {#single-pass-no-heap}
> It must be possible to encode and decode in a single pass, without allocation
> beyond stack space (i.e. no dynamic heap allocation).
This principle follows in part from specializing towards low level use cases,
and from ensuring that any software on the system can fully participate in the
FIDL ecosystem.
Because FIDL provides "decode + validate", the single pass requirement should be
compared to similar systems offering both deserialization and validation, which
is most often done in two passes (with validation occurring on the decoded
form).
A corollary of the no allocation requirement is that encoding and decoding are
done in place, i.e. by modifying the message buffer directly.
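
As a rough illustration of single-pass, in-place decoding, the C sketch below
walks a buffer holding one byte vector exactly once and uses only stack space.
It assumes a simplified layout (a 64-bit count, a 64-bit presence marker, then
the elements padded to 8 bytes). The names `sketch_decode_bytes`,
`SKETCH_PRESENT` and `SKETCH_HEADER_SIZE` are hypothetical, and this is not the
code of any actual FIDL binding.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed marker meaning "out-of-line data follows" in the encoded form. */
#define SKETCH_PRESENT UINT64_MAX
/* Assumed inline size: 8-byte count followed by 8-byte marker/pointer. */
#define SKETCH_HEADER_SIZE 16u

/* Decodes and validates in one pass, in place, without heap allocation.
 * On success the marker is overwritten with the address of the elements,
 * which live in the same buffer. Returns false on any malformed input. */
bool sketch_decode_bytes(uint8_t *buf, size_t len) {
  uint64_t count, marker;
  if (len < SKETCH_HEADER_SIZE) return false;
  memcpy(&count, buf, sizeof count);
  memcpy(&marker, buf + 8, sizeof marker);
  if (marker != SKETCH_PRESENT) return false;

  size_t avail = len - SKETCH_HEADER_SIZE;
  if (count > avail) return false; /* bounds check before touching data */

  /* Elements are padded to 8 bytes; the padded size must match exactly,
   * so there is neither missing data nor trailing junk. */
  uint64_t padded = (count + 7u) & ~(uint64_t)7u;
  if (padded != avail) return false;
  for (size_t i = SKETCH_HEADER_SIZE + (size_t)count; i < len; i++) {
    if (buf[i] != 0) return false; /* padding must be zero */
  }

  /* Patch the marker into a usable address inside the same buffer. */
  uint64_t addr = (uint64_t)(uintptr_t)(buf + SKETCH_HEADER_SIZE);
  memcpy(buf + 8, &addr, sizeof addr);
  return true;
}
```

After a successful call, the caller reads the elements through the patched
address. No copy of the payload was made, which is what "in place" means here.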
### As efficient as hand-rolled data structures
> It must be possible to write an implementation of the wire format which is as
> efficient as hand-rolled data structures.
This is a specialization of the "you pay for what you use" principle, whereby
the convenience and ergonomics that FIDL aims to provide must not be offered at
the expense of performance. In practice, many implementations choose to be less
efficient to provide additional ergonomics, but the wire format does not dictate
this choice.
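
To make the comparison with hand-rolled data structures concrete, here is a
small C sketch. It assumes a hypothetical message `sketch_event_t` whose fields
are laid out with natural alignment, as C compilers do on common 64-bit ABIs.
The type and accessor names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded message, declared as an ordinary C struct. */
typedef struct {
  uint8_t kind;       /* offset 0, followed by 3 bytes of padding */
  uint32_t id;        /* offset 4 */
  uint64_t timestamp; /* offset 8 */
} sketch_event_t;

/* The layout is fixed and known at compile time (on the assumed ABI). */
static_assert(offsetof(sketch_event_t, id) == 4, "id at offset 4");
static_assert(offsetof(sketch_event_t, timestamp) == 8, "timestamp at offset 8");
static_assert(sizeof(sketch_event_t) == 16, "total size is 16 bytes");

/* Reading a field of a decoded message is a plain struct access:
 * no parsing, no lookup table, no offset arithmetic at run time. */
uint64_t sketch_event_timestamp(const sketch_event_t *event) {
  return event->timestamp;
}
```

Under these assumptions, accessing a decoded message costs the same as
accessing a struct written by hand, which is the bar this principle sets.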
### Canonical representation
> There must be a single unambiguous representation of a FIDL value, i.e. there
> is one and only one encoded representation of a FIDL value, and one and only
> one decoded representation of a FIDL value.
By forcing a single representation, the wire format is naturally more strict,
which means that implementations have to expect less variance in inputs and
follow a more straight-line path. This helps ensure correctness, through
reduction of surprises coming from data divergences. A canonical form makes it
possible to check for equality of two values without the need to understand the
schema, i.e. a `memcmp` suffices for value types (things are a little [more
complicated for resource
types](/docs/reference/fidl/language/bindings-spec.md#equality-comparison)).
See also the [drawbacks of a canonical form](#drawback-canonical-representation).
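
The schema-free equality check mentioned above can be sketched in C as follows.
The function name `sketch_encoded_equal` is hypothetical, and the sketch only
applies to value types, under the assumption that both inputs are already
valid, canonical encodings.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Compares two encoded value-type messages without consulting any schema.
 * This is only sound because the encoding is assumed to be canonical:
 * equal values always produce identical bytes. */
bool sketch_encoded_equal(const uint8_t *a, size_t a_len,
                          const uint8_t *b, size_t b_len) {
  return a_len == b_len && memcmp(a, b, a_len) == 0;
}
```

With a non-canonical format (for example, one that tolerates arbitrary padding
bytes), two equal values could differ byte for byte, and a comparison like this
would need to understand the schema.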
### Specify every byte
> When encoding or decoding, it must be possible to traverse every single byte
> of a message in a [single pass and without any heap
> allocation](#single-pass-no-heap).
To ensure that no data leaks from one process to another unbeknownst to the
sender, we both ensure that all bytes can be efficiently traversed, and that all
bytes have a specified value (e.g. padding must be 0). As an example, this can
help to ensure that no personally identifiable information (PII) is
inadvertently shared across process boundaries, or help avoid leaking
uninitialized memory that could contain pointer values, which could be used to
defeat address space layout randomization (ASLR). Another example is considering
"trailing junk" invalid since all data and handles must be accounted for.
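
A minimal C sketch of what "every byte has a specified value" can look like on
both sides of a channel. The 8-byte layout and the names `sketch_encode_pair`
and `sketch_pair_padding_ok` are assumptions made for illustration, not part of
the FIDL specification or bindings.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte layout: tag at byte 0, padding at bytes 1..3,
 * value at bytes 4..7 (host byte order, for simplicity). */
#define SKETCH_PAIR_SIZE 8u

/* Sender side: zero the whole output first, so that stale memory
 * (possibly holding PII or pointer values) never leaks via padding. */
void sketch_encode_pair(uint8_t out[SKETCH_PAIR_SIZE], uint8_t tag,
                        uint32_t value) {
  memset(out, 0, SKETCH_PAIR_SIZE);
  out[0] = tag;
  memcpy(out + 4, &value, sizeof value);
}

/* Receiver side: every padding byte must hold its specified value, 0. */
bool sketch_pair_padding_ok(const uint8_t in[SKETCH_PAIR_SIZE]) {
  return in[1] == 0 && in[2] == 0 && in[3] == 0;
}
```

A receiver that rejects non-zero padding also gives senders a strong incentive
to get the zeroing right, since a leaky encoder fails immediately.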
### Validation everywhere
> As part of our [defense in depth](/docs/concepts/principles/secure.md), we
> want the FIDL wire format to enforce strict validation (e.g. bound checks,
> strings are well-formed UTF-8 code unit sequences, handles are of the correct
> type and rights) everywhere it is used.
Strict validation is considered worthwhile in ensuring the security of the
platform, and helps API authors state assumptions and invariants of a design in
the API schema. It is also our experience that, absent validation in lower
layers, applications tend to validate invariants themselves, leading to code
that is less clear, tends to be less efficient, and is more prone to bugs.
Since strict validation can carry high performance costs, and since FIDL is
geared towards use in [low-level](#low-level-first) layers, a corollary is that
such validation must be done efficiently, and designed to fit in a
[single pass](#single-pass-no-heap).
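
As one example of validation that fits in a single pass, the C sketch below
checks a length bound and UTF-8 well-formedness together while scanning the
bytes once. The function name `sketch_validate_string` and the convention that
`max_len` is a bound in bytes are assumptions of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if s[0..len) is within the bound and is well-formed UTF-8.
 * Rejects overlong encodings, surrogates, and code points above U+10FFFF.
 * One forward scan, constant stack space, no allocation. */
bool sketch_validate_string(const uint8_t *s, size_t len, size_t max_len) {
  if (len > max_len) return false; /* bound check */
  size_t i = 0;
  while (i < len) {
    uint8_t b0 = s[i];
    size_t n;          /* number of continuation bytes expected */
    uint32_t cp, min;  /* decoded code point and smallest legal value */
    if (b0 < 0x80) { i++; continue; }
    else if ((b0 & 0xE0) == 0xC0) { n = 1; cp = b0 & 0x1F; min = 0x80; }
    else if ((b0 & 0xF0) == 0xE0) { n = 2; cp = b0 & 0x0F; min = 0x800; }
    else if ((b0 & 0xF8) == 0xF0) { n = 3; cp = b0 & 0x07; min = 0x10000; }
    else return false; /* stray continuation byte or invalid lead byte */

    if (len - i <= n) return false; /* sequence truncated by end of data */
    for (size_t k = 1; k <= n; k++) {
      uint8_t b = s[i + k];
      if ((b & 0xC0) != 0x80) return false;
      cp = (cp << 6) | (b & 0x3F);
    }
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
      return false;
    }
    i += n + 1;
  }
  return true;
}
```

Because the check never looks back, it can run as part of the same traversal
that decodes the message rather than as a separate validation step.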
### No reflective functionality out of the box {#avoid-reflection}
> Without explicit opt-in, a peer must not be allowed to perform reflection on a
> protocol, be it exposed methods, or exposed types.
For instance, if a peer calls the wrong FIDL method, the connection is closed,
preventing any information about the peer from being extracted. It might seem
convenient to build such functionality, but doing so may compromise privacy and
be difficult to undo (users would start building load-bearing functionality on
top of this feature).
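
The "wrong method closes the connection" behavior can be sketched as a dispatch
function in C. The ordinals and names below are hypothetical and chosen only
for illustration; real method ordinals are not picked by hand like this.

```c
#include <stdint.h>

typedef enum {
  SKETCH_HANDLED,
  SKETCH_CLOSE_CONNECTION
} sketch_dispatch_result_t;

/* Hypothetical method ordinals for an imaginary protocol. */
#define SKETCH_ORDINAL_GET UINT64_C(0x1111)
#define SKETCH_ORDINAL_SET UINT64_C(0x2222)

/* Decides what to do with an incoming message based on its method ordinal. */
sketch_dispatch_result_t sketch_dispatch(uint64_t ordinal) {
  switch (ordinal) {
    case SKETCH_ORDINAL_GET:
    case SKETCH_ORDINAL_SET:
      return SKETCH_HANDLED; /* a real server would invoke the handler here */
    default:
      /* Unknown method: no error reply listing valid methods, just close.
       * The caller learns nothing about what this peer implements. */
      return SKETCH_CLOSE_CONNECTION;
  }
}
```

The important part is the `default` branch: it reveals nothing beyond the fact
that the call was not accepted.
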
Similarly, structures lacking a self-descriptive format are in line with this
principle, and meant to avoid disclosing more than necessary in an ecosystem
where interacting peers ought to distrust each other. (There are also
significant performance gains with avoiding a self-descriptive format, which
aligns with the low level first approach.)
As we have changed the FIDL wire format to allow evolution, e.g.
[tables][RFC-0047], we have had to navigate carefully the balance between
forbidding reflection, and adding just enough to allow handling without a
schema.
## Implementation
Keep calm, and follow the principles. As seen in [RFC-0017].
## Performance
Most guiding principles of the FIDL wire format are aimed at performance, and
specialize heavily towards low level use cases. Performance is a central concern.
## Ergonomics
No change to ergonomics.
## Backwards Compatibility
Some of the principles stated here are in conflict with the primary goal of
FIDL, which is to provide a foundation for a stable ABI; e.g. implementing
backwards compatible protocols is challenging in the absence of reflective
features. Among other things, the design of the FIDL wire format strikes a
balance between performance (often a result of rigidity) and evolvability
concerns (often a result of flexibility). Balancing these is where the fun lies.
## Security considerations
The role of FIDL in the multilayered approach to security on Fuchsia is
explained in this RFC.
## Privacy considerations
The role of FIDL in the multilayered approach to privacy on Fuchsia is
explained in this RFC.
## Testing
No change to testing.
## Documentation
Amend as needed:
* [FIDL Overview](/docs/concepts/fidl/overview.md)
* [FIDL design principles](/docs/contribute/contributing-to-fidl/design-principles.md)
* [FIDL wire format][wire-format]
## Drawbacks, alternatives, and unknowns
As described in the text.
### Drawbacks of a canonical form {#drawback-canonical-representation}
Requiring a canonicalized form can constrain the problem of finding a good
representation for data, to the point of discarding otherwise interesting or
pursuable forms.
When working on [sparser tables][RFC-0116], canonicalization was one of the
toughest constraints to satisfy, and directly conflicted with the need for the
format to be performant. For instance, we could have explored writing members in
the order provided by the users, without needing a second pass which reorders
those members to satisfy canonicalization requirements.
## Prior art and references
As described in the text.
<!-- link labels -->
[RFC-0017]: 0017_folding_ftp_into_rfc.md
[RFC-0027]: 0027_you_only_pay_what_you_use.md
[RFC-0030]: 0030_fidl_is_little_endian.md
[RFC-0047]: 0047_tables.md
[RFC-0116]: 0116_fidl_sparser_tables.md
[wire-format]: /docs/reference/fidl/language/wire-format/README.md