fbl
intrusive containersIntrusive containers (or intrusive data structures) are the type of containers offered by the Fuchsia Base Library (fbl::
).
This document will:
fbl::
and std::
containers.An intrusive container is a data structure which can be used to hold a collection of objects where the bookkeeping used to track membership in the container is stored in the objects themselves instead of holding the object alongside of the bookkeeping in a structure. To highlight the distinction, consider a structure which defines a point in a 2-dimensional coordinate system.
struct Point { float x, y; };
In a traditional non-intrusive implementation of a doubly linked list, a node in the list might look something like this
struct ListNode { Point val; ListNode *next, *prev; };
While an intrusive version of the same list would be
struct Point { float x, y; Point *next, *prev; };
Initially, these appear to be very similar approaches. The distinction seems mostly like a question of which object is contained within the other. As time goes on, however, the differences become more apparent.
Adding a point to a non-intrusive list of points means:
ListNode
Point
you want to add to the list into the val
member of the ListNode
next
/prev
pointers of the ListNode
into the list.Removing the point is the same process in reverse.
Point
are copied out to local storage.ListNode
is de-allocated.When using an intrusive list, the add/remove operations skip the allocation/deallocation and copying of the Point
's members. The next
/prev
pointers, which exist in the point object already, are simply linked into the list during the add operation and unlinked during the remove. While the intrusive form does not have to perform any allocations or copy any structures to add elements to a list, every Point
in the system is must carry around the overhead of the next/prev bookkeeping, even while not in a list.
While a small structure or primitive data type might be held by value in a non-intrusive list, when objects become larger, it becomes more common to track the object using some appropriate pointer type instead of storing the object by value. For example
struct ListNode { std::shared_ptr<LargeObject> ptr; ListNode *next, *prev; };
The “value” being stored in this list is an instance of a shared pointer, so while adding or removing a pointer to a LargeObject
to a list still requires management of the allocation used to store the node state, the object itself no longer needs to be copied when being added to a list. Additionally, this approach allows a single instance of a LargeObject
to exist in multiple containers at the same time as the nodes simply track references to the objects and not copies of the objects. A simple implementation of an intrusive form of this would put the next
/prev
pointers into LargeObject
directly implying that an instance of a LargeObject
could be a member of exactly 0 or 1 intrusive lists, but not more.
The primary advantage of an intrusive container approach to tracking objects is the lack of allocations when adding/removing elements to/from a container. In some (usually specialized) environments, heap interactions can introduce a potential failure path which would ideally not exist. Additionally, interactions with heaps usually involve interactions with locks/mutexes which could result in involuntary blocking. This might introduce undesirable timing indeterminism, or in some cases may even not be an option if the code is running in a context in which blocking is not allowed.
For example, what happens if code needs to add a bookkeeping structure to a list during a hard interrupt handler in the kernel if the allocation for the bookkeeping structure cannot be allocated, or if allocation of the bookkeeping would require the IRQ handler to block due to lock contention (something that cannot happen in the exception handler)? By allocating the container bookkeeping as part of the object itself ahead of time, an intrusive container approach to this problem allows these issues to be avoided.
In general, intrusive containers can provide performance advantages when the design of the system allows the implementer to know at compile time all of the various containers that an object will ever exist in. This does not mean that they are always the best choice, however, as they come with limitations as well.
Finally, for many types of intrusive containers, holding a reference to an object implies that you have located that object in all of the container types it can possible exist in where the same is not true when an object is tracked by reference in multiple non-intrusive container instances. For example, an object which could exist in a balanced binary search tree as well as in a doubly linked list (both intrusive) could be located in O(log n) time in the tree, and then removed from the list in O(1) time, instead of needing to find the object in the list in O(n) time before removing it.
A short list of some pros and cons of the two approaches might look something like this:
Non-intrusive:
int
or float
O
is defined in Library A, but a user wishes to create a collection of A::O
s in their own program and cannot reasonably re-define O
.std::map
with an O(log n) operation does not help me to locate the same object which is simultaneously contained in a std::deque
. If I want to remove the object from the std::deque
, I must find it there with a separate O(n) operation first.Intrusive:
fbl::
vs. std::
The C++ Standard Library offers a large set of implementation agnostic non-intrusive containers. For example, std::map
is an ordered associative container, however there are no requirements that an implementation of the standard library use any specific data structure in order to implement std::map
, only that the implementation chosen provides certain guarantees for various container operations as specified by the standard, such as worst case algorithmic complexities, or guarantees around iterator invalidation.
If you need a non-intrusive container, this is where you should be looking.
While there have been draft proposals, the C++ standard library does not currently offer any intrusive container implementations. Similar to boost::
, fbl::
attempts to fill this gap, but in a much more limited/focused fashion. The algorithms backing the types of containers types offered by fbl::
are explicit and embedded in the names of the container types themselves. The APIs are inspired by and attempt to emulate those offered by std::
, however there are some differences as a result of the fundamental differences between intrusive and non-intrusive containers. There are currently 3 primary container types defined by fbl::
and one composed container type. They are:
fbl::SinglyLinkedList<>
fbl::DoublyLinkedList<>
fbl::WAVLTree<>
fbl::HashTable<>
fbl::HashTable<>
is considered to be a composed container type as it offers a fixed bucket count hash table which resolves collision using chaining where the bucket type may be either fbl::SinglyLinkedList<>
or fbl::DoublyLinkedList<>
. Please refer to Setting up build dependencies for details on which files need to be included in order to use the various container types.
Here are some terms and their definitions which will be used throughout this guide.
T*
-style pointer to an object.