# Trace-based benchmarking
This document describes how to use trace-based benchmarking to measure and track
performance of Fuchsia apps.
[TOC]
## Overview
Trace-based benchmarks measure the performance of an application by running it
under [tracing](tracing_usage_guide.md) and analyzing the collected traces to
compute performance metrics.
For a typical **service** application (an application to which clients connect over
FIDL), the following components participate in a benchmarking run:
- **service binary** - the service being benchmarked.
- **benchmark app** - a client app that connects to the service and
exercises the usage patterns we are interested in benchmarking.
- **benchmark spec** - a JSON file specifying which trace events captured
during a run of the benchmark app should be measured, and how.
The same framework can also be used to benchmark single binaries (without the
client-server split).
## Mechanics
Trace-based benchmarks are run using the `trace` binary. The spec file needs to be
passed to the tool as follows:
```sh
trace record --spec-file=<path to the spec file>
```
### Specification file
The specification file configures tracing parameters and specifies measurements.
(see [examples/benchmark](../examples/benchmark/) if you'd like to see a full
example straight away)
The file supports the following top-level parameters:
- `app`: string, URL of the application to be run
- `args`: array of strings, startup arguments to be passed to the application
- `categories`: array of strings, tracing categories to be enabled
- `duration`: integer, maximum duration of tracing in seconds
- `measure`: array of measurement specifications, see below
Given the specification file, the `trace` tool runs the `app` with the given
`args` for at most `duration` seconds and gathers trace events from the selected
`categories`. Then, the tool computes the measurements specified in the
`measure` section on the recorded trace events.
Example:
```{json}
{
  "app": "benchmark_example",
  "args": [],
  "categories": ["benchmark"],
  "measure": [
    ...
  ]
}
```
For any tracing parameter that can be both passed as an argument to `trace record`
and set in the specification file, the command-line value overrides the one from
the file.
### Measurement types
The `trace` tool supports the following types of measurements:
- `duration`
- `time_between`
A `duration` measurement targets a single trace event and computes the
duration of its occurrences. The target trace event can be recorded as a
duration, an async, or a flow event.
**Example**:
```{json}
{
  "type": "duration",
  "event_name": "example",
  "event_category": "benchmark"
},
```
A `time_between` measurement targets two trace events with the specified
anchors (either the beginning or the end of each event) and computes the time
between consecutive occurrences of the two. The target events can be
"duration", "async", "flow" or "instant" events (in which case the anchor
doesn't matter). It takes the following arguments: `first_event_name`,
`first_event_category`, `first_event_anchor`, `second_event_name`,
`second_event_category`, `second_event_anchor`.
**Example**:
```{json}
{
  "type": "time_between",
  "first_event_name": "task_end",
  "first_event_category": "benchmark",
  "second_event_name": "task_start",
  "second_event_category": "benchmark"
}
```
In the example above, the two target events are instant events, so the
`time_between` measurement records the time between the end of one task and the
beginning of the next.
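When the target events are duration events, the anchors can be specified
explicitly. The following is only a sketch: the event names are illustrative,
and it assumes that the anchor values are `begin` and `end`.

```{json}
{
  "type": "time_between",
  "first_event_name": "setup",
  "first_event_category": "benchmark",
  "first_event_anchor": "end",
  "second_event_name": "render",
  "second_event_category": "benchmark",
  "second_event_anchor": "begin"
}
```

Under these assumptions, this would measure the gap between the end of each
`setup` event and the beginning of the following `render` event.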
### Samples
Both `duration` and `time_between` measurements can optionally group the
recorded samples into consecutive ranges, splitting the samples at the given
instances of the recorded events and reporting the results of each group
separately. To do so, set `split_samples_at` to a strictly increasing list of
zero-based indices denoting the occurrences at which the samples should be
split.
For example, if a measurement specifies `"split_samples_at": [1, 50]`, the
results are reported in three groups: sample 0, samples 1 to 49, and samples
50 to N, where N is the index of the last sample.
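A sketch of a `duration` measurement using this option (the event name and
category are illustrative):

```{json}
{
  "type": "duration",
  "event_name": "example",
  "event_category": "benchmark",
  "split_samples_at": [1, 50]
}
```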
### Full example
See [examples/benchmark](../examples/benchmark/) for a full example of a
trace-based benchmark.
This example can be run with the following command:
```{shell}
trace record --spec-file=/system/data/benchmark_example/benchmark_example.tspec
```
## Best practices
### Consider reusing benchmark binaries
The separation between specification files and benchmark binaries makes it
possible to define multiple benchmarks based on a single benchmark binary. Note
that you can parametrize the benchmark binary by having it take command-line
arguments, which can then be set to different values in each spec file.
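For example, one spec file could pass a flag to a parametrized binary (the
argument below is illustrative; the app name is taken from the example above):

```{json}
{
  "app": "benchmark_example",
  "args": ["--entries=10"],
  "categories": ["benchmark"],
  "measure": [
    ...
  ]
}
```

A second spec file could then reuse the same `app` and measurements while
passing, say, `"args": ["--entries=1000"]`.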
### Record "cold run" samples separately
For any duration measurement that happens more than once, chances are that the
first occurrence has different performance characteristics than the subsequent
ones. You can set `"split_samples_at": [1]` to report and track the first
sample in one group and all subsequent samples in another.
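As a sketch, such a measurement could look like this (the event name is
illustrative):

```{json}
{
  "type": "duration",
  "event_name": "task",
  "event_category": "benchmark",
  "split_samples_at": [1]
}
```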
## Results
By default, the results are printed on the command line in a human-friendly
format.
### Export
If you prefer a machine-friendly format, pass the path to the output file to
`trace record` as `--benchmark-results-file=<file>`.
The resulting file has the following format:
```{json}
[
  {
    "label": "put",
    "unit": "ms",
    "samples": [
      {
        "label": "samples 0 to 0",
        "values": [2.74]
      },
      {
        "label": "samples 1 to 9",
        "values": [1.01, 1.12, 0.91, 1, 1.03, 0.97, 1.03, 1.07, 1.15]
      }
    ]
  },
  <more results>
]
```
This format is formally defined by
[//zircon/system/ulib/perftest/performance-results-schema.json](
https://fuchsia.googlesource.com/zircon/+/master/system/ulib/perftest/performance-results-schema.json).
### Dashboard upload
Dashboard upload integration and infra support are a work in progress as of March 2018.