| # Test coverage |
| |
| ## Motivation |
| |
Software testing is a common practice that helps teams continuously deliver
quality code. Tests check invariants of the software's behavior, catch and
prevent regressions in functionality or other desired properties, and help scale
engineering processes.
| |
Measuring test coverage in terms of source code line coverage helps engineers
identify gaps in their tests. Using test coverage as a metric promotes
higher-quality software and safer development practices, and measuring it
continuously helps engineers maintain a high bar of quality.
| |
Test coverage does not guarantee bug-free code. Use testing alongside other
tools, such as [fuzz testing][fuzz-testing] and static and dynamic analysis.
| |
| ## Absolute test coverage |
| |
Absolute test coverage is a measure of all lines of source code that are covered
by the complete set of tests. Fuchsia's Continuous Integration (CI)
infrastructure produces absolute coverage reports and refreshes them
continuously; reports are typically no more than a couple of hours old.
| |
| ### Absolute coverage dashboard |
| |
The latest absolute coverage report is available on the
[coverage dashboard][coverage-dashboard]. The dashboard presents, as a subset of
the source tree, all code that was covered by any of the tests that ran. You
can navigate the tree by directory structure and view aggregate coverage metrics
for directories or line-by-line coverage information for individual files.
| |
Coverage information is also available as a layer in Google's internal code
search.
| |
|  |
| |
| ## Incremental test coverage |
| |
| Incremental test coverage is shown in the context of a change on the |
| [Gerrit code review web UI][gerrit]. Incremental coverage shows, specifically in |
| the context of a given change, which modified lines are covered by tests and |
| which modified lines are not. |
| |
Incremental test coverage is collected by Fuchsia's Commit Queue (CQ)
infrastructure. When sending a change to CQ (Commit-Queue+1), you can click
"show experimental tryjobs" to reveal a tryjob named `fuchsia-coverage`. When
this tryjob completes, your patch set shows absolute coverage (|Cov.|) and
incremental coverage (ΔCov.) information.
| |
Maintaining high incremental test coverage for changes to a project helps keep
overall test coverage high. In particular, it prevents the introduction of new
untested code. Change authors can review incremental coverage
| information on their changes to ensure that their test coverage is sufficient. |
| Code reviewers can review incremental test coverage information on changes and |
| ask authors to close any testing gaps that they identify as important. |
| |
|  |
| |
|  |
| |
| ## End-to-end (E2E) tests exclusion |
| |
| Only unit tests and hermetic integration tests are considered reliable sources |
| of test coverage. No test coverage is collected or presented for E2E tests. |
| |
| E2E tests are large system tests that exercise the product as a whole and don't |
| necessarily cover a well-defined portion of the source code. For example, it's |
| common for E2E tests on Fuchsia to boot the system in an emulator, interact with |
| it, and expect certain behaviors. |
| |
| ### Why |
| |
| Because E2E tests exercise the system as a whole: |
| |
* They have been observed to trigger different code paths between runs,
  making their coverage results flaky.
* They frequently time out on coverage builders, making the builders flaky.
  E2E tests run considerably slower than unit tests and small integration
  tests, usually taking minutes to finish, and they run even slower on
  coverage builders due to the runtime overhead of collecting coverage.
| |
| ### How |
| |
| For top-level buildbot bundles like [`core`][core_bundle]{:.external}, a |
| counterpart [`core_no_e2e`][core_no_e2e_bundle]{:.external} is provided, so bots |
| that collect coverage can use the `no_e2e` bundle to avoid building and running |
| any E2E tests. |
| |
Currently, there is no reliable way to identify all E2E tests in-tree. As a
proxy, `no_e2e` bundles maintain the invariant that they have no known E2E test
libraries in their transitive dependencies, enforced with GN's
[`assert_no_deps`][assert-no-deps]{:.external}. The list of E2E test libraries
is manually curated and maintained, with the assumption that it changes very
infrequently:
| |
| ```gn |
| {% includecode gerrit_repo="fuchsia/fuchsia" gerrit_path="build/config/BUILDCONFIG.gn" region_tag="e2e_test_libs" adjust_indentation="auto" %} |
| ``` |
| |
| ## Limitations |
| |
| Currently, test coverage is collected only if: |
| |
| * The code is written in C, C++, or Rust. |
| * The code either runs on Fuchsia devices in usermode, or is exercised by a |
| host test. |
* The test is exercised under the `core.x64` or `core.arm64` configuration on
  QEMU. Tests that only run in other configurations, such as on hardware
  targets, are not supported at this time.
* System tests, that is, end-to-end (E2E) tests, are _excluded_.
| |
On that last note, E2E tests exercise a lot of code throughout the system, but
they do so in a manner that's inconsistent between runs (that is, "flaky"). To
achieve higher test coverage for code, it is recommended to use unit tests and
hermetic integration tests instead.
| |
| Support for the following additional use cases is currently under development: |
| |
| * Kernel code coverage. |
| * Coverage on product configurations other than `core`, for instance `bringup` |
| or `workstation`. |
* Coverage on hardware targets, that is, collecting coverage from tests that
  don't run on QEMU.
| |
| ## Troubleshooting |
| |
| ### Unsupported configuration / language / runtime |
| |
| If you are not seeing absolute or incremental coverage information for your |
| code, first review the [limitations](#limitations) and ensure that your code is |
| expected to receive coverage support in the first place. |
| |
| ### Stale reports / latency |
| |
Absolute coverage reports are generated after the code is merged and may take a
few hours to appear. The dashboard shows the commit hash for the generated
report. If you're not seeing expected results on the dashboard, ensure that the
report was generated from a commit that includes any recent changes that would
have affected coverage. If the data appears stale, come back later and refresh
the page.
| |
| Incremental coverage reports are generated by CQ. Ensure that you are looking at |
| a patch set that was sent to CQ. You can click "show experimental tryjobs" to |
| reveal a tryjob named `fuchsia-coverage`. If the tryjob is still running, come |
| back later and refresh the page. |
| |
| ### Ensure that your test ran |
| |
| If your code is missing coverage that you expect to see, then pick a test that |
| should have covered your code and ensure that it ran on the coverage tryjob. |
| |
| 1. Find the tryjob in Gerrit, or find a recent `fuchsia-coverage` run on the |
| [CI dashboard][fuchsia-coverage-ci]. |
| 1. In the Overview tab, find the "collect builds" step and expand it to find |
| links to the pages that show different coverage build & test runs for |
| different configurations. |
| 1. Each of these pages should have a Test Results tab showing all tests that |
| ran. Ensure that your expected test ran, and preferably that it passed. |
| |
If your test didn't run on any coverage tryjob as expected, one reason might
simply be that it only runs in configurations not currently covered by CI/CQ.
Another is that the test is explicitly opted out of coverage variants. For
instance, a `BUILD.gn` file referencing your test might look as follows:
| |
| ```gn |
| group("tests") { |
| deps = [ |
| ":foo_test", |
| ":bar_test", |
| ] |
| # TODO(fxbug.dev/12345): This test is intentionally disabled on coverage. |
| if (!is_coverage) { |
| deps += [ ":qux_test" ] |
| } |
| } |
| ``` |
| |
| Look for context as to why your test is disabled on coverage and investigate. |
| |
| ### Test only flakes in coverage |
| |
Related to the above, tests are more likely to be flaky under coverage (see
this [example][fxr541525]). The extra overhead of collecting coverage at
runtime slows down execution, which in turn affects timing, which is often the
cause of additional flakiness.
| |
An immediate fix is to disable your test under coverage (see the GN snippet
above), at the cost of not collecting coverage information from your test. As a
best practice, treat flakes on coverage the same as you would treat flakes
elsewhere; namely, fix the flakiness.
| |
| See also: [flaky test policy][flaky-policy]. |
| |
| ## How test coverage works |
| |
Fuchsia's code coverage build, test runtime support, and processing tools use
[LLVM source-based code coverage][llvm-coverage]{:.external}. Runtime support
for the Fuchsia platform is provided by the compiler-rt profile runtime.
| |
Profile instrumentation is activated when the `"coverage"` build variant is
selected. The compiler then generates a counter for each branch in the code's
control flow and emits instructions on branch entry to increment the associated
counter. In addition, the profile instrumentation runtime library is linked
into the executable.
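
For local builds, the variant is selected through the `select_variant` argument
in `args.gn`. The following is a minimal sketch; it instruments everything the
variant applies to, whereas a real configuration may scope instrumentation more
narrowly:

```gn
# args.gn sketch: apply the "coverage" variant, which compiles code with
# LLVM profile instrumentation and links in the profile runtime.
select_variant = [ "coverage" ]
```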
| |
| For more implementation details see |
| [LLVM Code Coverage Mapping Format][llvm-coverage-mapping-format]{:.external}. |
| |
| Note that the instrumentation leads to increased binary size, increased memory |
| usage, and slower test execution time. Some steps were taken to offset this: |
| |
| * Tests in profile variants are afforded longer timeouts. |
| * Tests in profile variants are compiled with some optimizations. |
| * Coverage currently runs on emulators, where storage is less constrained. |
| * For incremental coverage, only binaries affected by the change are |
| instrumented. |
| |
The profile runtime library on Fuchsia stores the profile data in a [VMO][vmo]
and publishes a handle to the VMO using the
[`fuchsia.debugdata.DebugData`][debugdata] protocol. This protocol is made
available to tests at runtime using the [Component Framework][cfv2] and is
hosted by the [Test Runner Framework][trf]'s on-device controller, Test Manager.
| |
The profiles are collected after the test realm, and any components hosted in
it, terminate. The profiles are then processed into a single summary
| for the test. This is an important optimization that significantly reduces the |
| total profile size. The optimized profile is then sent to the host-side test |
| controller. |
| |
| The host uses the [`covargs`][covargs] tool, which itself uses the |
| [`llvm-profdata`][llvm-profdata]{:.external} and |
| [`llvm-cov`][llvm-cov]{:.external} tools, to convert raw profiles to a summary |
format and to generate test coverage reports. In addition, `covargs` converts
the data to protobuf, which serves as an interchange format with various
dashboards.
| |
| ## Roadmap |
| |
| Ongoing work: |
| |
| * Performance and reliability improvements to the coverage runtime. |
| * Performance and reliability improvements to the coverage processing |
| pipeline. |
| |
| Areas for future work: |
| |
| * Kernel support for source code coverage from ZBI tests. |
| |
| ## Further reading |
| |
| * [Code Coverage Best Practices][gtb-coverage-best]{:.external} |
| * [Measuring Coverage at Google][gtb-coverage-measure]{:.external} |
| * [Understand Your Coverage Data][gtb-coverage-understand]{:.external} |
| * [How Much Testing is Enough?][gtb-testing-enough]{:.external} |
| |
| [assert-no-deps]: https://gn.googlesource.com/gn/+/HEAD/docs/reference.md#var_assert_no_deps |
| [cfv2]: /docs/concepts/components/v2/ |
| [core_bundle]: https://cs.opensource.google/fuchsia/fuchsia/+/main:bundles/buildbot/BUILD.gn;l=58;drc=9e1506dfbe789637c709fcc4ad43896f5044f947 |
| [core_no_e2e_bundle]: https://cs.opensource.google/fuchsia/fuchsia/+/main:bundles/buildbot/BUILD.gn;l=53;drc=9e1506dfbe789637c709fcc4ad43896f5044f947 |
| [covargs]: /tools/debug/covargs/ |
| [coverage-dashboard]: https://analysis.chromium.org/p/fuchsia/coverage |
| [debugdata]: https://fuchsia.dev/reference/fidl/fuchsia.debugdata |
| [flaky-policy]: /docs/concepts/testing/test_flake_policy.md |
| [fuchsia-coverage-ci]: https://ci.chromium.org/p/fuchsia/builders/ci/fuchsia-coverage |
| [fuzz-testing]: /docs/concepts/testing/fuzz_testing.md |
| [fx-smoke-test]: https://fuchsia.dev/reference/tools/fx/cmd/smoke-test |
| [fxr541525]: https://fuchsia-review.googlesource.com/c/fuchsia/+/541525 |
| [gerrit]: https://fuchsia-review.googlesource.com/ |
| [gtb-coverage-best]: https://testing.googleblog.com/2020/08/code-coverage-best-practices.html |
| [gtb-coverage-measure]: https://testing.googleblog.com/2014/07/measuring-coverage-at-google.html |
| [gtb-coverage-understand]: https://testing.googleblog.com/2008/03/tott-understanding-your-coverage-data.html |
| [gtb-testing-enough]: https://testing.googleblog.com/2021/06/how-much-testing-is-enough.html |
| [llvm-cov]: https://llvm.org/docs/CommandGuide/llvm-cov.html |
| [llvm-coverage-mapping-format]: https://llvm.org/docs/CoverageMappingFormat.html |
| [llvm-coverage]: https://clang.llvm.org/docs/SourceBasedCodeCoverage.html |
| [llvm-profdata]: https://llvm.org/docs/CommandGuide/llvm-profdata.html |
| [trf]: /docs/concepts/testing/v2/test_runner_framework.md |
| [vmo]: /docs/reference/kernel_objects/vm_object.md |