
 # Evolution

 FlatBuffers enables the [schema](schema.md) to evolve over time while still
 maintaining forwards and backwards compatibility with old flatbuffers.

 Some rules must be followed to ensure the evolution of a schema is valid.

 ## Rules

 Adding new tables, vectors, and structs to the schema is always allowed. It's only
 when you add a new field to a [`table`](schema.md#tables) that certain rules
 must be followed.

 ### Addition

 **New fields MUST be added to the end of the table definition.**

 This allows older data to still be read correctly (giving you the default value
 of the added field if accessed).

 Older code will simply ignore the new field in the flatbuffer.

 You can ignore this rule if you use the `id` attribute on all the fields of a
 table.
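
Why appending is safe can be illustrated with a toy model. This is a minimal sketch, not the FlatBuffers wire format: it assumes each field gets a slot number from its declaration order and that a reader returns the default when a slot is absent. The helper names `assign_slots` and `read_field` are hypothetical.

```python
def assign_slots(field_names):
    """Give each field a slot number based on declaration order (toy model)."""
    return {name: slot for slot, name in enumerate(field_names)}


def read_field(data, schema_slots, name, default=0):
    """Look up a field by the slot the reader's schema assigns to it.

    `data` maps slot number -> stored value; a missing slot yields the default.
    """
    return data.get(schema_slots[name], default)


v1 = assign_slots(["a", "b"])
v2_appended = assign_slots(["a", "b", "c"])    # new field at the end
v2_prepended = assign_slots(["c", "a", "b"])   # new field at the beginning

# Data written by code that only knows `a` and `b`.
v1_data = {v1["a"]: 10, v1["b"]: 20}

# Appending: old data still reads correctly, and `c` falls back to its default.
appended_view = {n: read_field(v1_data, v2_appended, n) for n in v2_appended}

# Prepending: every existing field shifts by one slot, so values are misread.
prepended_view = {n: read_field(v1_data, v2_prepended, n) for n in v2_prepended}

print(appended_view)   # {'a': 10, 'b': 20, 'c': 0}
print(prepended_view)  # {'c': 10, 'a': 20, 'b': 0}
```

In this model, explicit `id` attributes would simply replace the `enumerate` numbering, which is why declaration order stops mattering once every field carries an `id`.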

 ### Removal

 **You MUST NOT remove a field from the schema, even if you don't use it
 anymore.** You simply stop writing it to the buffer.

 It's encouraged to mark the field deprecated by adding the `deprecated`
 attribute. This skips the generation of accessors and setters in the code,
 which enforces that the field is not used any more.
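
The difference between deleting and deprecating a field can be sketched with a toy slot assignment. This is an illustrative model only: it assumes slots follow declaration order, and the helper `assign_slots` is hypothetical.

```python
def assign_slots(fields):
    """Map field name -> slot by declaration order (toy model).

    `fields` is a list of (name, deprecated) pairs. A deprecated field keeps
    its slot reserved but gets no accessor, so it is left out of the result.
    """
    return {
        name: slot
        for slot, (name, deprecated) in enumerate(fields)
        if not deprecated
    }


v1 = assign_slots([("a", False), ("b", False), ("c", False)])
removed = assign_slots([("b", False), ("c", False)])                  # `a` deleted
deprecated = assign_slots([("a", True), ("b", False), ("c", False)])  # `a` deprecated

print(v1)          # {'a': 0, 'b': 1, 'c': 2}
print(removed)     # {'b': 0, 'c': 1}
print(deprecated)  # {'b': 1, 'c': 2}
```

Deleting `a` moves `b` and `c` to different slots, whereas deprecating it leaves the remaining fields exactly where older data expects them.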

 ### Name Changes

 It's generally OK to change the names of tables and fields, as these are not
 serialized to the buffer. Doing so may break code, which would then have to be
 refactored to use the updated name.

 ## Examples

 The following examples use a base schema and attempt to evolve it a few times.
 The versions are tracked by `V1`, `V2`, etc., and `CodeV1` means code compiled
 against the `V1` schema.

 ### Table Evolution

 Let's start with a simple table `T` with two fields.

 ```c++ title="Schema V1"
 table T {
   a:int;
   b:int;
 }
 ```

 === "Well Evolved"

     First, let's extend the table with a new field.

     ```c++ title="Schema V2"
     table T {
       a:int;
       b:int;
       c:int;
     }
     ```

     This is OK. `CodeV1` reading `V2` data will simply ignore the presence of the
     new field `c`. `CodeV2` reading `V1` data will get a default value (0) when
     reading `c`.

     ```c++ title="Schema V3"
     table T {
       a:int (deprecated);
       b:int;
       c:int;
     }
     ```

     This is OK, removing field `a` via deprecation. `CodeV1` and `CodeV2`
     reading `V3` data will now always get the default value of `a`, since it is not
     present. `CodeV3` cannot write `a` anymore. `CodeV3` reading any data, including
     old data (`V1` or `V2`), will not be able to access the field anymore, since its
     generated accessors are omitted.

 === "Improper Addition"

     Add a new field, but this time at the beginning.

     ```c++ title="Schema V2"
     table T {
       c:int;
       a:int;
       b:int;
     }
     ```

     This is NOT OK, as it makes `V2` incompatible. `CodeV1` reading `V2` data
     will access `a` but will read `c` data.

     `CodeV2` reading `V1` data will access `c` but will read `a` data.

 === "Improper Deletion"

     Remove a field from the schema.

     ```c++ title="Schema V2"
     table T {
       b:int;
     }
     ```

     This is NOT OK. `CodeV1` reading `V2` data will access `a` but read `b` data.

     `CodeV2` reading `V1` data will access `b` but will read `a` data.

 === "Proper Reordering"

     Let's add a new field to the beginning, but use `id` attributes.

     ```c++ title="Schema V2"
     table T {
       c:int (id: 2);
       a:int (id: 0);
       b:int (id: 1);
     }
     ```

     This is OK. The new field is declared at the beginning, but because every
     field carries an `id` attribute, the existing fields keep their positions.

 === "Changing Types"

     Let's change the types of the fields.

     ```c++ title="Schema V2"
     table T {
       a:uint;
       b:uint;
     }
     ```

     This is MAYBE OK, and only when the old and new types have the same
     width. It is tricky if the `V1` data contained any negative numbers, since
     `CodeV2` will read them as large unsigned values. So this should be done
     with care.

 === "Changing Defaults"

     Let's change the default values of the existing fields.

     ```c++ title="Schema V2"
     table T {
       a:int = 1;
       b:int = 2;
     }
     ```

     This is NOT OK. Any `V1` data that did not have a value written to the
     buffer relied on generated code to provide the default value.

     There MAY be cases where this is OK, if you control all the producers and
     consumers, and you can update them in tandem.

 === "Renaming Fields"

     Let's change the names of the fields.

     ```c++ title="Schema V2"
     table T {
       aa:int;
       bb:int;
     }
     ```

     This is generally OK. Renaming fields will break all code and JSON
     files that use this schema, but you can refactor those without affecting the
     binary data, since the binary only addresses fields by id and offset, not by
     name.
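
Two of the table cases above, changing types and changing defaults, can be sketched in a few lines. This is a toy model under stated assumptions, not the generated FlatBuffers code: it assumes 32-bit little-endian scalars and a writer that omits any field equal to its schema default. The helpers `write_table` and `read_table` are hypothetical.

```python
import struct

# Changing Types: the same 4 bytes read as int32 versus uint32.
stored = struct.pack("<i", -1)            # old code writes a:int = -1
as_uint = struct.unpack("<I", stored)[0]  # new code reads a:uint
print(as_uint)  # 4294967295


def write_table(values, defaults):
    """Toy writer: skip any field whose value equals the schema default."""
    return {k: v for k, v in values.items() if v != defaults[k]}


def read_table(data, defaults):
    """Toy reader: fill absent fields from the reader's own defaults."""
    return {k: data.get(k, d) for k, d in defaults.items()}


v1_defaults = {"a": 0, "b": 0}
v2_defaults = {"a": 1, "b": 2}

# The writer intends a=0, b=5; a equals the V1 default, so it is not stored.
data = write_table({"a": 0, "b": 5}, v1_defaults)
seen_by_v1 = read_table(data, v1_defaults)
seen_by_v2 = read_table(data, v2_defaults)
print(seen_by_v1)  # {'a': 0, 'b': 5}
print(seen_by_v2)  # {'a': 1, 'b': 5}
```

The second half shows why a changed default silently alters old data: the value `a = 0` was never in the buffer, so the reader's new default takes its place.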

 ### Union Evolution

 Let's start with a simple union `U` with two members.

 ```c++ title="Schema V1"
 union U {
   A,
   B
 }
 ```

 === "Well Evolved"

     Let's add another variant to the end.

     ```c++ title="Schema V2"
     union U {
       A,
       B,
       another_a: A
     }
     ```

     This is OK. `CodeV1` will not recognize `another_a`.

 === "Improperly Evolved"

     Let's add another variant to the middle.

     ```c++ title="Schema V2"
     union U {
       A,
       another_a: A,
       B
     }
     ```

     This is NOT OK. `CodeV1` reading `V2` data will interpret a stored `another_a`
     as `B`. `CodeV2` reading `V1` data will interpret a stored `B` as `another_a`.

 === "Evolved With Discriminant"

     Let's add another variant to the middle, this time adding a union "discriminant".

     ```c++ title="Schema V2"
     union U {
       A = 1,
       another_a: A = 3,
       B = 2
     }
     ```

     This is OK. It's like you added it to the end, but used the discriminant
     values to place it elsewhere in the union declaration.
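
The union cases above can be sketched with a toy discriminant table. This is an illustrative model, not generated code: it assumes value 0 is reserved for "no member" and that members without an explicit value take the previous value plus one. The helpers `discriminants` and `decode` are hypothetical.

```python
def discriminants(members):
    """Map union member name -> discriminant value (toy model).

    `members` is a list of (name, explicit_value_or_None). Value 0 is assumed
    reserved for "no member"; members without an explicit value take the
    previous value plus one.
    """
    result, previous = {}, 0
    for name, explicit in members:
        previous = explicit if explicit is not None else previous + 1
        result[name] = previous
    return result


def decode(value, table):
    """Return the member name a reader associates with a stored discriminant."""
    by_value = {v: k for k, v in table.items()}
    return by_value.get(value, "<unknown>")


v1 = discriminants([("A", None), ("B", None)])
middle = discriminants([("A", None), ("another_a", None), ("B", None)])
pinned = discriminants([("A", 1), ("another_a", 3), ("B", 2)])

# Old code wrote a B; what does each evolved schema think it is?
print(decode(v1["B"], middle))  # another_a
print(decode(v1["B"], pinned))  # B
# Old code reading an `another_a` written with the pinned schema.
print(decode(pinned["another_a"], v1))  # <unknown>
```

With pinned values, `A` and `B` keep the numbers old code already uses, so only the genuinely new variant is unrecognized.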

 ## Version Control

 FlatBuffers relies on new field declarations being added at the end, and on
 earlier declarations not being removed but marked deprecated when needed. We
 think this is an improvement over the manual number assignment that happens in
 Protocol Buffers (and which is still an option using the `id` attribute
 mentioned above).

 One place where this is possibly problematic, however, is source control.
 Suppose user `A` adds a field and generates new binary data with this new
 schema, then tries to commit both to source control after user `B` has already
 committed a new field of their own. If user `A` just auto-merges the schema,
 the binary files are now invalid compared to the new schema.

 The solution of course is that you should not be generating binary data before
 your schema changes have been committed, ensuring consistency with the rest of
 the world. If this is not practical for you, use explicit field `id`s, which
 should always generate a merge conflict if two people try to allocate the same
 id.
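
Even if a merge tool lets two colliding `id`s through, the collision is easy to detect mechanically. The sketch below is a hypothetical checker, not part of `flatc`; it assumes ids must be unique and cover `0..N-1` without gaps.

```python
def check_ids(fields):
    """Return a list of problems for (name, id) pairs; empty means consistent.

    Assumes ids must be unique and cover 0..N-1 with no gaps.
    """
    problems = []
    seen = {}
    for name, field_id in fields:
        if field_id in seen:
            problems.append(f"id {field_id} used by both {seen[field_id]} and {name}")
        else:
            seen[field_id] = name
    missing = sorted(set(range(len(fields))) - set(seen))
    problems.extend(f"id {m} is missing" for m in missing)
    return problems


base = [("a", 0), ("b", 1)]
user_a = ("c", 2)   # added on one branch
user_b = ("d", 2)   # added independently on another branch

merged = base + [user_a, user_b]
print(check_ids(base + [user_a]))  # []
print(check_ids(merged))  # ['id 2 used by both c and d', 'id 3 is missing']
```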

 ## Checking Conformity

 To check that schemas are properly evolved, the [`flatc`](flatc.md) compiler has
 an [option](flatc.md#additional-options) to do just that:

 ```sh
 --conform FILE
 ```

 Here `FILE` is the base schema that the rest of the input schemas must evolve
 from. `flatc` returns `0` if they are properly evolved; otherwise it returns a
 non-zero value and reports why the schemas are not properly evolved.

 As an example, the following checks if `schema_v2.fbs` is properly evolved from
 `schema_v1.fbs`.

 ```sh
 flatc --conform schema_v1.fbs schema_v2.fbs
 ```
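
The same check can be wired into a test suite or pre-commit hook. The following is a minimal sketch that shells out to the command shown above and relies only on its exit status; the function names `conform_command` and `is_properly_evolved` are hypothetical, and it assumes `flatc` is available on `PATH`.

```python
import shutil
import subprocess


def conform_command(base_schema, evolved_schema):
    """Build the argument list for the conformity check shown above."""
    return ["flatc", "--conform", base_schema, evolved_schema]


def is_properly_evolved(base_schema, evolved_schema):
    """Run flatc and report whether it exited with status 0.

    Raises FileNotFoundError if `flatc` is not on PATH.
    """
    if shutil.which("flatc") is None:
        raise FileNotFoundError("flatc not found on PATH")
    result = subprocess.run(
        conform_command(base_schema, evolved_schema),
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # flatc reports why the schemas do not conform; surface it as-is.
        print(result.stdout + result.stderr)
    return result.returncode == 0


print(conform_command("schema_v1.fbs", "schema_v2.fbs"))
```

A caller would then use something like `is_properly_evolved("schema_v1.fbs", "schema_v2.fbs")` and fail the build when it returns `False`.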