This document describes recommended practices for structuring data that is stored in Ledger. We assume that the reader is already familiar with the Ledger API surface.
Data stored in Ledger is grouped into separate independent key-value stores called pages.
Deciding how to split data into pages has the following implications:
There are no restrictions on how many pages a client application can use.
Each page stores an ordered list of key-value pairs, also called entries.
Both keys and values are arbitrary byte arrays. The client application is free to construct keys and values as it needs, but see the guidance below.
Entries are sorted by key in the lexicographic order of the byte array. Entry retrieval methods always return entries in this order; it is also possible to use a range query to retrieve only a part of the ordered list.
The lexicographic byte array ordering of the keys implies the following caveat: the key [1, 0, 0] is ordered before the key [2]. If ordering between the items is important, it is essential to add padding to any numbers that are part of the key, so that their width is fixed.
The entries stored in a page are sorted by key, and can be retrieved using either exact matching or range queries.
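The padding caveat can be illustrated with a short Python sketch. It is not Ledger code: the helper names `unpadded_key` and `padded_key` and the choice of an 8-byte big-endian encoding are assumptions made for illustration only. The sketch compares the byte order of variable-width and fixed-width encodings of the same numbers.

```python
import struct


def unpadded_key(n: int) -> bytes:
    """Encode n using the minimal number of big-endian bytes (variable width)."""
    return n.to_bytes(max(1, (n.bit_length() + 7) // 8), "big")


def padded_key(n: int) -> bytes:
    """Encode n as a fixed-width, 8-byte big-endian unsigned integer."""
    return struct.pack(">Q", n)


numbers = [2, 65536, 255, 256]

# Variable-width keys: 65536 encodes as [1, 0, 0], which sorts before [2].
by_unpadded = sorted(numbers, key=unpadded_key)

# Fixed-width keys: lexicographic byte order matches numeric order.
by_padded = sorted(numbers, key=padded_key)
```

With fixed-width keys the byte order and the numeric order agree, so a range query over such keys behaves like a numeric range.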
Because of the range query support, it might be desirable to construct the keys in a way that matches the querying needs of the application. For example, a messaging application that needs to retrieve only the messages no older than a week might wish to structure the data as follows:
Then, a range query can be used to retrieve the messages after the given timestamp.
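As a minimal sketch of this idea, the Python snippet below models a page as an in-memory dictionary and a range query as a slice of its sorted keys. None of this is the Ledger API: the `messages/` prefix, the 8-byte big-endian timestamp, and the `message_key` and `range_query` helpers are hypothetical names chosen for the example.

```python
import bisect
import struct

PREFIX = b"messages/"  # hypothetical key prefix


def message_key(timestamp_s: int) -> bytes:
    """Prefix followed by a fixed-width, 8-byte big-endian timestamp."""
    return PREFIX + struct.pack(">Q", timestamp_s)


def range_query(entries: dict, start: bytes, end: bytes) -> list:
    """Return (key, value) pairs with start <= key < end, in key order.

    A stand-in for a page range query; `entries` maps bytes keys to values.
    """
    keys = sorted(entries)
    lo = bisect.bisect_left(keys, start)
    hi = bisect.bisect_left(keys, end)
    return [(k, entries[k]) for k in keys[lo:hi]]


DAY_S = 24 * 60 * 60
WEEK_S = 7 * DAY_S
now = 1_700_000_000  # fixed "current time" so the example is deterministic

page = {
    message_key(now - 10 * DAY_S): b"ten days old",
    message_key(now - 3 * DAY_S): b"three days old",
    message_key(now - 60): b"one minute old",
}

# Retrieve only the messages from the last week.
recent = range_query(page, message_key(now - WEEK_S), message_key(now + 1))
recent_values = [value for _, value in recent]
```

Because the timestamp has a fixed width, the start of the range can be computed directly from the cutoff time, and the results come back in chronological order.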
Where needed, UUIDs should be used as part of the entry key in order to avoid unintended collisions. For example, a messaging application might want to include the UUID of a message stored in a Ledger entry as part of the entry key, in order to avoid collisions between two messages with the same timestamp:
Deciding how to split page data into entries has implications for querying ability and conflict resolution.
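One possible key layout is sketched below in Python. The layout (a `messages/` prefix, an 8-byte big-endian timestamp, then the 16 raw bytes of a random UUID) and the helper name `message_key_with_id` are assumptions for illustration, not a format prescribed by Ledger.

```python
import struct
import uuid


def message_key_with_id(timestamp_s: int, message_id: uuid.UUID) -> bytes:
    """Build prefix + fixed-width timestamp + 16-byte UUID.

    The timestamp comes first so that keys still sort chronologically;
    the UUID only disambiguates messages that share a timestamp.
    """
    return b"messages/" + struct.pack(">Q", timestamp_s) + message_id.bytes


ts = 1_700_000_000  # two messages created in the same second

key_a = message_key_with_id(ts, uuid.uuid4())
key_b = message_key_with_id(ts, uuid.uuid4())
```

Both keys share the same prefix and timestamp bytes, so a range query by timestamp returns both messages, while the UUID suffix keeps them in distinct entries.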
If two pieces of data can change independently (for example, the name and the email address of a contact in a contacts app), it is beneficial to put these pieces of data in separate entries. Because conflicts are resolved entry-by-entry, concurrent modifications of different entries can be automatically merged by the default conflict resolution policy.
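The effect of entry-by-entry resolution can be illustrated with a simplified three-way merge in Python. This is a model written for this example, not Ledger's implementation or its default policy: the function `merge_entries`, the key names, and the `|`-separated combined value are all hypothetical.

```python
def merge_entries(base: dict, left: dict, right: dict) -> tuple:
    """Merge two concurrent versions of a page, entry by entry.

    An entry changed on only one side takes that side's value. An entry
    changed differently on both sides is reported as a conflict and keeps
    its base value here; a real policy would have to pick a winner.
    A missing key means the entry does not exist in that version.
    """
    merged = {}
    conflicts = []
    for key in sorted(set(base) | set(left) | set(right)):
        b, l, r = base.get(key), left.get(key), right.get(key)
        if l == r:
            value = l          # both sides agree
        elif l == b:
            value = r          # only the right side changed this entry
        elif r == b:
            value = l          # only the left side changed this entry
        else:
            conflicts.append(key)
            value = b          # changed on both sides: unresolved
        if value is not None:
            merged[key] = value
    return merged, conflicts


# Name and email in separate entries: concurrent edits merge cleanly.
base = {b"contact/1/name": b"Ann", b"contact/1/email": b"ann@old.example"}
left = dict(base, **{})
left[b"contact/1/name"] = b"Anna"
right = dict(base)
right[b"contact/1/email"] = b"ann@new.example"
split_merged, split_conflicts = merge_entries(base, left, right)

# Name and email packed into one entry: the same edits now conflict.
base_one = {b"contact/1": b"Ann|ann@old.example"}
left_one = {b"contact/1": b"Anna|ann@old.example"}
right_one = {b"contact/1": b"Ann|ann@new.example"}
_, packed_conflicts = merge_entries(base_one, left_one, right_one)
```

In the split layout each side touches a different entry, so nothing conflicts; in the packed layout both sides modify the same entry and the merge cannot be resolved automatically.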
On the other hand, if two pieces of data are related (when one changes, the other piece of data needs to change accordingly), we need to either: