A tutorial for enabling tracing in drivers.
Tracing is used for a number of reasons, including:
In this tutorial, we'll examine tracing in Fuchsia and address the following topics:
In the Fuchsia tracing system, three types of components cooperate in a distributed manner: trace providers, trace clients, and the trace manager.
Let's suppose your program wants to create trace data. Your program would play the role of a trace provider — it generates tracing data. There can, of course, be many trace providers in a system.
Let's further suppose that a different program wants to process the data that your trace provider is generating. This program takes on the role of a trace client, and like the trace provider, there can be many trace clients in a system.
Note that there's currently just one trace client, with no immediate plans to add more. We'll discuss this further below, in the Trace Client operation section.
What's interesting about Fuchsia's distributed implementation is that for efficiency, the trace provider writes the data directly into a shared memory segment (a Zircon VMO, or Virtual Memory Object). The data isn't copied anywhere; it's stored in memory as it's generated.
This means that the trace client must somehow find out where that data is stored. This discovery is mediated by the trace manager.
There's exactly one trace manager in the system, and it serves as a central rendezvous point: a place where trace providers and trace clients meet.
Let's look at this one step at a time.
Your program starts a background async loop (implemented as a thread in C, for example). This allows it to play the role of a trace provider by handling messages from the trace manager (like “start tracing” or “stop tracing”).
Then, your program registers with the trace manager, telling it that it is ready to take on the role of being a trace provider.
Finally, your program goes about its business. Note that no tracing is happening yet — we're waiting for the trace manager to start tracing in your program.
When tracing starts, the trace manager provides your program with a VMO into which it can write its data.
From the perspective of your program, there's nothing special you need to do to handle the interaction with the async loop (that is, to turn tracing on or off) — your program calls the various tracing API functions, and they themselves determine if data should be written to the VMO or not.
Note that tracing can be disabled entirely at compile-time. In this case, no tracing code is present in your program; and thus there's no way to turn it on. A program built with tracing enabled, however, can respond to commands from the trace manager to selectively enable or disable tracing.
When a program wishes to assume the role of a trace client (that is, to get trace data), it contacts the trace manager, requests tracing to start (and subsequently stop), and then finally saves collected trace data. The trace manager gathers the data and sends it over a socket to the trace client.
Because the trace provider writes to the VMO, the trace manager reads from the VMO, and the trace client reads data from a socket (provided by the trace manager), there's no way for the trace client to directly affect the operation of the trace provider.
Tracing is on-demand — that is, your program normally runs with tracing turned off. When some event occurs (e.g., a system problem, or the user initiates a debugging session), tracing can be turned on for an arbitrary period. Not only can tracing be turned on or off, but specific categories of tracing can be individually selected (we'll see this shortly).
Some time later, tracing can be turned off again.
## `trace` and `traceutil`

As mentioned above, there is currently just the one trace client. It consists of two utilities: `trace` and `traceutil`. The `trace` utility runs on the target, and `traceutil` runs on the development host.

`trace` is used to control tracing: it sends the commands to the trace manager to start and stop tracing, and it gathers the trace data.

`traceutil`, on the development host side, communicates with `trace`. Trace data can be streamed from `trace` through `traceutil` right to the developer's desktop.
Assuming that the background async loop is started and running (see Fuchsia Tracing System Design for details), this is the minimum code you need in order to write a simple string to the trace buffer:
```c
TRACE_INSTANT("category", "name", TRACE_SCOPE_PROCESS,
              "message", TA_STRING("Hello, World!"));
```
There are five arguments to the `TRACE_INSTANT()` macro. In order, they are:

- `"category"`: a nul-terminated string representing the category of the trace event.
- `"name"`: a nul-terminated string representing the name of the trace event.
- `TRACE_SCOPE_PROCESS`: for the `TRACE_INSTANT` tracing macro, this indicates the scope of the event.
- `"message"`: the "key" part of the data.
- `TA_STRING("Hello, World!")`: the "value" part of the data.

The result of executing this code is that if tracing is compiled in and enabled, a trace datum is logged to the VMO. If tracing is compiled in but not enabled, this code returns almost immediately: it checks whether tracing is enabled and, discovering that it's not, does nothing else. If tracing isn't compiled in, this code doesn't even make it past the C compiler; no code is generated (it's as if the entire `TRACE_INSTANT()` macro were a comment).

The key and value arguments are optional, and can be repeated. Whenever you specify a key you must specify a value (even if the value is null).
What is a category?
A category is something that you define; there's a convention for how categories should look:

```
provider:category[:sub-category[...]]
```

For example, `demo:flow:outline` has three colon-delimited elements:

- `demo` is the name of the trace provider; it identifies your program.
- `flow` is the name of the category; here, we're using a name that suggests that we are tracing the call-by-call flow of the program.
- `outline` is the name of the sub-category; here, we're using a name that suggests that we are tracing the high-level flow of the program, perhaps just a few top-level functions.

We might have another category name, `demo:flow:detailed`, for example, which we could use to trace the detailed flow of the program.

The names are at your discretion; whatever has meaning for you. Beware, though, that the category namespace is global to all running programs, so if another program also set its "provider" element to `demo`, you would most likely run into naming conflicts (and thus could end up with unrelated data from some other trace provider).
Categories should be grouped hierarchically. For example, with statistics collection, you might have some sub-categories:

- `demo:statistics:bandwidth` to collect bandwidth statistics,
- `demo:statistics:requests` to collect user request statistics.

Categories, therefore, give you control over what's collected. If a category is not requested by the trace client program, then the data is not collected by the trace provider.
The name argument (the 2nd argument in our `TRACE_INSTANT()`, and in fact in most other macros) is a string description of the event data.

So, if we had a category of `demo:statistics:bandwidth`, we might have two different trace points, one that counted received packets and another that counted transmitted packets. We'd use the name to distinguish them:

```c
// we just received another packet, log it
status.rx_count++;
TRACE_INSTANT("demo:statistics:bandwidth", "rxpackets", TRACE_SCOPE_PROCESS,
              "count", TA_UINT64(status.rx_count));
```

versus

```c
// we just transmitted another packet, log it
status.tx_count++;
TRACE_INSTANT("demo:statistics:bandwidth", "txpackets", TRACE_SCOPE_PROCESS,
              "count", TA_UINT64(status.tx_count));
```

Notice that, as mentioned above, we have both a key (the `"count"` argument) and a value (the `TA_UINT64()` encoded value).
In the examples above, we saw the `TA_STRING()` and `TA_UINT64()` macros. They're used to properly encode the value for C. For C++, you can still use the macros, but they're not required because the value type can be inferred (for all but three, shown as "req'd" in the "C++" column of the table below).

Macro | C++ | Description
---|---|---
`TA_NULL` | | a null value.
`TA_BOOL` | | a boolean value.
`TA_INT32` | | a signed 32-bit integer value.
`TA_UINT32` | | an unsigned 32-bit integer value.
`TA_INT64` | | a signed 64-bit integer value.
`TA_UINT64` | | an unsigned 64-bit integer value.
`TA_DOUBLE` | | a double-precision floating point value.
`TA_CHAR_ARRAY` | req'd | a character array with a length (copied rather than cached).
`TA_STRING` | | a null-terminated dynamic string (copied rather than cached).
`TA_STRING_LITERAL` | req'd | a null-terminated static string constant (cached).
`TA_POINTER` | | a pointer value (records the memory address, not the target).
`TA_KOID` | req'd | a kernel object id.
For the most part, the above operate as you'd expect. For example, the `TA_INT32()` macro takes a 32-bit signed integer as an argument.

The notable exceptions are:

- `TA_NULL()` does not take an argument; that is, it's written literally as `TA_NULL()` (in C++, you could use `nullptr` instead of the `TA_NULL()` macro).
- `TA_CHAR_ARRAY()` takes two arguments: the first is a pointer to the character array, and the second is its length.

Note that in C++, when using a literal constant, type inference needs a hint in order to get the size, signedness, and type right.
For example, is the value `77` a signed 32-bit integer? An unsigned 32-bit integer? Or maybe even a 64-bit integer of some kind?

Type inference in the tracing macros works according to the standard C++ rules:

- `77` is a signed 32-bit integer, `TA_INT32`
- `77U` is an unsigned 32-bit integer, `TA_UINT32`
- `77L` is a signed 64-bit integer, `TA_INT64`
- `77LU` is an unsigned 64-bit integer, `TA_UINT64`

This also means that floating point needs to be explicitly noted if it's an (otherwise) integer value: `77` is, as above, a `TA_INT32`, but `77.` (note the period) is a `TA_DOUBLE`.
For this reason, if you're using constants, you should consider retaining the encoding macros if you're expressing the values directly, or you should use an appropriately typed `const`:

```c
TRACE_INSTANT("category", "name", "int", 77); // discouraged
```

is the same as:

```c
const int32_t my_id = 77; // well defined type
TRACE_INSTANT("category", "name", "int", my_id);
```

and:

```c
#define MY_ID (TA_INT32(77)) // uses the typing macro
TRACE_INSTANT("category", "name", "int", MY_ID);
```
As mentioned above, you can have zero or more key/value pairs. For example:
```cpp
TRACE_INSTANT("category", "name", TRACE_SCOPE_PROCESS);
TRACE_INSTANT("category", "name", TRACE_SCOPE_PROCESS, "key1", nullptr);
TRACE_INSTANT("category", "name", TRACE_SCOPE_PROCESS, "key1", "string1");
TRACE_INSTANT("category", "name", TRACE_SCOPE_PROCESS, "key1", "string1", "key2", 77);
```
For the `TRACE_INSTANT()` macros, there are three values for the "scope" (the 3rd argument):
Scope | Meaning |
---|---|
TRACE_SCOPE_THREAD | The event is only relevant to the thread it occurred on |
TRACE_SCOPE_PROCESS | The event is only relevant to the process in which it occurred |
TRACE_SCOPE_GLOBAL | The event is globally relevant |
There are cases where you might wish to entirely disable tracing (like final release).
The `NTRACE` macro is used to make this happen.

This is similar to the `NDEBUG` macro used with `assert()`: if the macro is present, then the `assert()` calls don't generate any code.

In the case of tracing, if the `NTRACE` macro is present, then the tracing macros don't generate any code.
In particular, keep in mind the negative sense — if the macro is present, then tracing is disabled.
You can explicitly turn on the macro yourself to disable tracing:
```c
#define NTRACE  // disable tracing
#include <trace/event.h>
```
Here, the macros contained in the tracing file (like `TRACE_INSTANT()`) are made inactive; they're effectively converted to comments, which are eliminated by the compiler.
Notice that we defined the macro before the `#include`; this is required in order to select the inactive forms of the macro expansions in the `#include` file.
You can also test the `NTRACE` macro to see if you need to provide tracing data. In the example above, where we discussed the `rxpackets` and `txpackets` counters, you might have a general statistics structure:
```c
typedef struct {
#ifndef NTRACE  // reads as "if tracing is not disabled"
    uint64_t rx_count;
    uint64_t tx_count;
#endif
    uint64_t npackets;
} my_statistics_t;
```
The `rx_count` and `tx_count` fields are used only with tracing, so if `NTRACE` is asserted (meaning tracing is completely disabled), they don't take up any room in your `my_statistics_t` structure.
This does mean that you need to conditionally compile the code for managing the recording of those statistics:
```c
#ifndef NTRACE
status.tx_count++;
TRACE_INSTANT("demo:statistics:bandwidth", "txpackets", TRACE_SCOPE_PROCESS,
              "count", TA_UINT64(status.tx_count));
#endif  // NTRACE
```
There's one more case to consider.
Sometimes, you may wish to determine at runtime if tracing is on. There's a handy test macro, `TRACE_ENABLED()`. If tracing is compiled in (`NTRACE` is not defined), then the `TRACE_ENABLED()` macro looks to see whether tracing is currently turned on or off in your trace provider, and returns a true or false value at runtime. Note that if tracing is compiled out, then `TRACE_ENABLED()` always returns false (generally causing the compiler to optimize out the code entirely).
For example:
```c
#ifndef NTRACE
if (TRACE_ENABLED()) {
    int v = do_something_expensive();
    TRACE_INSTANT(...
}
#endif  // NTRACE
```
Here, if tracing is compiled in, and enabled, we call do_something_expensive(), perhaps to fetch some data for tracing.
Notice that we used both the `#ifndef` and the `TRACE_ENABLED()` macro together. That's because the function `do_something_expensive()` might not exist in the trace-disabled version, and thus you'd get compiler and linker diagnostics.
There's a similar macro, `TRACE_CATEGORY_ENABLED()`, that gives you more refinement: you can test whether a particular category is enabled. As with `TRACE_ENABLED()`, if tracing is compiled out, this macro reduces to an `if (0)`, which the compiler optimizes out.
Another common function during tracing is timing things. Often, you'll want to know “how long does this operation take?” or “how long does this procedure run for?”
There are three macros that can be used here:

- `TRACE_DURATION()` monitors the duration of the current scope,
- `TRACE_DURATION_BEGIN()` and `TRACE_DURATION_END()` monitor the duration of a specific, bounded section of code.

These timing macros are often used without any key/value data; we'll show them both ways below.
For example:
```c
int my_function(void) {
    TRACE_DURATION("demo:timing:functions", "my_function");
    // your function does stuff here...
}
```
This generates a trace event when my_function() ends, indicating the length of time spent in the function (and all called functions).
Yes, this works in C — a compiler extension is used to add a code hook at scope end.
A fairly common use case for the `TRACE_DURATION()` macro is in C++ constructors and other member functions, with additional data captured at the same time. This is from one of the `blobfs` vnode member functions (`zircon/system/ulib/blobfs/blobfs.cpp`):
```cpp
zx_status_t VnodeBlob::InitCompressed() {
    TRACE_DURATION("blobfs", "Blobfs::InitCompressed",
                   "size", inode_.blob_size,
                   "blocks", inode_.num_blocks);
    ...
```
Here, the length of time spent in the function, along with the size and number of blocks, is captured. By the way, notice how the macros for the types of `inode_.blob_size` and `inode_.num_blocks` are not used in the C++ version; their types are inferred by the compiler.
`TRACE_DURATION()` can be used for any scope, not just an entire function:

```c
int my_function(int cond) {
    if (cond) {  // "cond" selects which path is taken
        TRACE_DURATION("demo:timing:functions", "my_function",
                       "path", TA_STRING("this"));
        // do stuff that's timed
    } else {
        TRACE_DURATION("demo:timing:functions", "my_function",
                       "path", TA_STRING("that"));
        // do other stuff that's timed
    }
    return 0;
}
```
Here, two different timing durations are captured, depending on which path is taken in the code, with the key/value pair indicating the selected one.
For greater control over the region that's timed, you can use the `TRACE_DURATION_BEGIN()` and `TRACE_DURATION_END()` macros:

```c
int my_function(int cond) {
    if (cond) {
        // do something that you don't want to time here

        // start timing
        TRACE_DURATION_BEGIN("demo:timing:functions", "my_function:area1");
        // do something that you'd like to time here
        // end timing
        TRACE_DURATION_END("demo:timing:functions", "my_function:area1");

        // do something else that you don't want to time here
    }
    return 0;
}
```
In this sample, we added some path information to the name component.
The rule here is that the `TRACE_DURATION_BEGIN()` must have a matching `TRACE_DURATION_END()` macro in the same scope. Matching means that both the category and name must be the same.

The macros can be nested hierarchically.
For example:
```c
int my_function(int cond) {
    if (cond) {
        // do something that you don't want to time
        TRACE_DURATION_BEGIN("demo:timing:functions", "my_function:area1"); // NOTE 1
        // do something that you'd like to time
        TRACE_DURATION_BEGIN("demo:timing:functions", "my_function:inner"); // NOTE 2
        // do something that you'd like timed by "inner"
        // NOTE 3
        TRACE_DURATION_END("demo:timing:functions", "my_function:inner");
        // do something that's still timed by "area1" but not "inner"
        // NOTE 4
        TRACE_DURATION_END("demo:timing:functions", "my_function:area1");
        // do something else that you don't want to time
    }
    return 0;
}
```
To be clear about what the above is doing: there are two parts of the code being timed. An overall "my_function:area1" spans from "NOTE 1" through to and including "NOTE 4", and a separately timed area "my_function:inner" spans from "NOTE 2" through to and including "NOTE 3".
Tip: prefer the TRACE_DURATION() macro over the TRACE_DURATION_BEGIN() and TRACE_DURATION_END() macros. The simple TRACE_DURATION() macro automatically handles leaving scope, whereas the begin/end style macros don't — you need to manually ensure correct nesting and termination.
Also, note that TRACE_DURATION() takes roughly half the space in the output buffer that TRACE_DURATION_BEGIN() and TRACE_DURATION_END() together do. This can have a big impact on buffer usage.
All trace timing is expressed as an unsigned 64-bit count of 1 nanosecond ticks, giving a 584+ year range (which should be sufficient for all but the most patient of users).
All of the examples so far have been "synchronous" — that is, occurring in a linear fashion in one thread.
There's a set of “asynchronous” tracing functions that are used when the operation spans multiple threads.
For example, in a multi-threaded server, a request is handled by one thread, and then put back on a queue while the operation is in progress. Some time later, another thread receives notification that the operation has completed, and “picks up” the processing of that request. The goal of asynchronous tracing is to allow the correlation of these disjoint trace events.
Asynchronous tracing takes into consideration that the same code path is used for multiple different flows of processing. In the previous examples, we were interested in seeing how long a particular function ran, or what a certain value was at a given point in time. With asynchronous tracing, we're interested in tracking the same data, but for a logical processing flow, rather than a program location based flow.
In the queue processing example, the code that receives requests would tag each request with a "nonce": a unique value that follows the request around. This nonce can be generated via `TRACE_NONCE()`, which simply increments a global counter.
Let's see how this works. First, you declare a place to hold the nonce. This is usually in a context structure for the request itself:
```c
typedef struct {
    ...
    // add the nonce to your context structure
    trace_async_id_t async_id;
} my_request_context_t;
```
When the request arrives, you fetch a nonce and begin the asynchronous tracing flow:
```c
// a new request; start asynchronous tracing
ctx->async_id = TRACE_NONCE();
TRACE_ASYNC_BEGIN("category", "name", ctx->async_id, "key", TA_STRING("value"));
```
You can log trace events periodically using the `TRACE_ASYNC_INSTANT()` macro (similar to what we did with the `TRACE_INSTANT()` macro above):

```c
TRACE_ASYNC_INSTANT("category", "name", ctx->async_id, "state", TA_STRING("phase2"));
```
And clean up via `TRACE_ASYNC_END()`:

```c
TRACE_ASYNC_END("category", "name", ctx->async_id);
```
Don't confuse this use of "async" with the async loop that's running in your process; they aren't related.
Asynchronous tracing is intended for tracing within the same process, but perhaps by way of different threads.
There's a higher-level tracing mechanism, called "flow" tracing, that's intended for use between processes or abstraction layers.
You call `TRACE_FLOW_BEGIN()` to mark the start of a "flow". Just like `TRACE_ASYNC_BEGIN()`, you pass in a nonce to identify this particular flow. The flow ID is an unsigned 64-bit integer.

Then, you (optionally) call `TRACE_FLOW_STEP()` to indicate trace operations within that flow.

When you're done, you end the flow with `TRACE_FLOW_END()`.
A flow could be used, for example, between a client and server for tracking a request end-to-end from the client, through the server, and back to the client.
Here we'll highlight the differences and similarities among logging, tracing, inspection, and debugging.
This covers the `#include` files to add, and the Makefile additions required by the trace provider in order to add tracing, or disable it completely.