# Testing best practices
As you write tests for Fuchsia, make sure that you are familiar with the
[testing principles][testing-principles] and the
[testing scope][test-scope].
## Desirable properties of tests
The properties below are generally good to have, provided that they don't
conflict with other goals of the test.
- **Isolated**: tests should be isolated from code, systems, and details that
are outside the scope of the tests. Different test cases should be isolated
from each other. On Fuchsia we use the isolation guarantees given by the
Component Framework to isolate tests. A useful outcome of isolated tests is
that tests can be run in parallel or in a different order and their result is
the same.
- **Hermetic**: the result of a test is defined by the contents of the test. On
Fuchsia, a unit test or an integration test is hermetic if its result is
defined by the contents of the test’s package. If a test’s package hasn’t
changed then it can be assumed that its behavior hasn’t changed, and it’s
still passing or still failing. This property can be used to select which
tests to run in order to validate a given change. In the absence of
hermeticity guarantees, the next best alternative is to run all the tests,
which is costly.
- **Reproducible**: re-running the same test should produce the same result.
Isolation and hermeticity improve reproducibility. The larger the scope of the
test, the more difficult it is for the test to be reproducible.
- **Proximity to the code under test**: tests should focus on a particular unit
of code and control for everything that’s not under test.
- **Resilient**: tests shouldn’t need to change when the code under test
changes, unless the change is important to the code’s purpose. Tests that
continue to work after benign changes to the code are more resilient. Tests
that exercise the code’s public APIs or other forms of contracts tend to be
more resilient. This also happens when you focus on testing behavior, not
implementation.
- **Easy to troubleshoot**: when tests fail, they should produce clear,
actionable errors that identify the defect. Tests that isolate errors, or
otherwise have a smaller scope, are usually easier to troubleshoot when they
fail.
- **Fast**: tests are often part of the developer feedback loop. A faster
feedback loop makes developers more productive.
- **Reliable**: test failure should indicate a real defect with the code under
test. Reliable tests give more confidence when they pass and produce fewer
false failures that are costly to maintain.
- **Flexible**: the fewer constraints there are on running tests, the easier it
is to run them. On Fuchsia, it is particularly valuable when tests can run on
emulators.
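
As one illustration of the "Easy to troubleshoot" property, the sketch below
attaches the input, the actual value, and the expected value to a failing
assertion. It is written in Python for brevity and is not Fuchsia code;
`normalize_path` and `check_normalized` are hypothetical names invented for
this example.

```python
def normalize_path(path):
    """Hypothetical code under test: collapses repeated slashes."""
    parts = [part for part in path.split("/") if part]
    return "/" + "/".join(parts)


def check_normalized(raw, expected):
    """Asserts on the result and reports everything needed to debug it."""
    actual = normalize_path(raw)
    assert actual == expected, (
        f"normalize_path({raw!r}) returned {actual!r}, expected {expected!r}"
    )


def test_collapses_repeated_slashes():
    check_normalized("/pkg//data///config", "/pkg/data/config")
```

If the assertion fails, the message names the call, the input, and both
values, so the developer does not have to rerun the test under a debugger or
search system logs to see what went wrong.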
## Undesirable properties of tests
The properties below are generally bad to have. Depending on circumstances they
may be the downsides of a tradeoff that was made for the purpose of the test,
which as a whole is a net positive.
- **Flaky**: tests that produce false failures, then pass when they’re retried,
are flaky. Flaky tests are costlier to maintain, slower to run due to retries,
and provide lower confidence results.
- **Slow**: tests that take longer to run create less efficient feedback loops
and make developers less productive. The bigger the scope of the test, the
slower it usually is to run.
- **Difficult to troubleshoot**: tests that fail with errors that are not
immediately actionable or otherwise don’t indicate the root cause of the
failure are more difficult to troubleshoot. Developers have to look beyond
the test failure itself, such as at system logs or internal system state, to
troubleshoot the test failure.
- **Change detectors**: tests that are coupled too closely with implementation
details that aren’t important for functionality will often fail when the code
under test changes in ways that are benign to the external observer. Change
detector tests are more costly to maintain.
## Test against interfaces and contracts
<span class="compare-better">Recommended</span>: Test using public APIs and
other interfaces and contracts offered to the client of the code under test.
These tests are more resilient to benign changes.
<span class="compare-worse">Not recommended</span>: Don’t test implementation
details that are not important to the client. Such tests often break when the
code under test changes in benign ways.
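
The contrast can be sketched as follows. This is a minimal Python
illustration, not Fuchsia code; the `Counter` class and both tests are
hypothetical and exist only to show the difference between the two styles.

```python
class Counter:
    """Hypothetical code under test: counts word occurrences."""

    def __init__(self):
        # Implementation detail: the storage could later become a list,
        # a trie, or anything else without changing observable behavior.
        self._counts = {}

    def add(self, word):
        self._counts[word] = self._counts.get(word, 0) + 1

    def count(self, word):
        return self._counts.get(word, 0)


def test_counts_repeated_words():
    # Resilient: uses only the public add() and count() methods.
    counter = Counter()
    counter.add("zircon")
    counter.add("zircon")
    assert counter.count("zircon") == 2
    assert counter.count("fidl") == 0


def test_internal_dict_layout():
    # Change detector: reaches into the private _counts attribute, so it
    # breaks if the storage changes even though the behavior is the same.
    counter = Counter()
    counter.add("zircon")
    assert counter._counts == {"zircon": 1}
```

Both tests pass today, but only the first one would keep passing if
`Counter` switched to a different internal data structure.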
Further reading:
- [Change-Detector Tests Considered Harmful](https://testing.googleblog.com/2015/01/testing-on-toilet-change-detector-tests.html){:.external}
- [Prefer Testing Public APIs Over Implementation-Detail Classes](https://testing.googleblog.com/2015/01/testing-on-toilet-prefer-testing-public.html){:.external}
- [Test Behavior, Not Implementation](https://testing.googleblog.com/2013/08/testing-on-toilet-test-behavior-not.html){:.external}
- [Testing State vs. Testing Interactions](https://testing.googleblog.com/2013/03/testing-on-toilet-testing-state-vs.html){:.external}
## Write readable test code
Consider readability as you write tests, the same as you do when you write the
code under test.
- A test is **complete** if the body of the test contains all the information
you need to know in order to understand it.
- A test is **concise** if the test doesn’t contain any other distracting
information.
<span class="compare-better">Recommended</span>: Write test cases that are
complete and concise. Prefer writing more individual test cases, each with
a narrow focus on specific circumstances and concerns.
<span class="compare-worse">Not recommended</span>: Don’t combine multiple
scenarios into fewer test cases in order to produce shorter tests with fewer
test cases.
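
A minimal sketch of narrowly focused test cases, written in Python purely for
illustration. The `parse_port` function is a hypothetical piece of code under
test. Each test body shows its input and expected outcome (complete) and
nothing else (concise).

```python
def parse_port(text):
    """Hypothetical code under test: parses a TCP port number."""
    port = int(text)  # Raises ValueError for non-numeric text.
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


def test_parses_valid_port():
    assert parse_port("8080") == 8080


def test_rejects_port_zero():
    try:
        parse_port("0")
    except ValueError as error:
        assert "out of range" in str(error)
    else:
        raise AssertionError("parse_port('0') should raise ValueError")


def test_rejects_non_numeric_text():
    try:
        parse_port("http")
    except ValueError:
        pass
    else:
        raise AssertionError("parse_port('http') should raise ValueError")
```

Combining the three scenarios into one test would be shorter, but a failure
would no longer say which scenario broke.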
Further reading:
- [What Makes a Good Test?](https://testing.googleblog.com/2014/03/testing-on-toilet-what-makes-good-test.html){:.external}
- [Keep Tests Focused](https://testing.googleblog.com/2018/06/testing-on-toilet-keep-tests-focused.html){:.external}
## Write reproducible, deterministic tests
Tests should be deterministic, meaning every run of the test against the same
revision of code produces the same result. If not, the test may become costly
to maintain.
Threaded or time-dependent code, random number generators (RNGs), and
cross-component communication are common sources of nondeterminism.
<span class="compare-better">Recommended</span>: Use these tips to write
deterministic tests:
- For time-dependent tests, use fake or mocked clocks to provide
determinism. See [`fuchsia_async::Executor::new_with_fake_time`] and
[fake-clock].
- Threaded code must always use the proper synchronization primitives to
avoid flakes. Whenever possible, prefer single-threaded tests.
- Always provide a mechanism to inject seeds for RNGs and use them in
tests.
- Use mocks in component integration tests. See [Realm Builder][realm-builder].
- When working with tests that are sensitive to flaky behavior,
consider running tests multiple times to ensure that they consistently pass.
You can use repeat flags, such as [`--gtest_repeat`][gtest_test_flags]{:.external}
in GoogleTest and [`--test.count`][go_test_flags]{:.external} in Go, to do this.
If your test is prone to flakes, aim for at least 100-1000 local runs before
merging.
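
The clock and RNG tips above can be sketched as follows. This is a
language-neutral illustration written in Python, not the Fuchsia `fake-clock`
library or the `fuchsia_async` API; `FakeClock`, `Session`, and `pick_backoff`
are hypothetical names. The idea is that the code under test receives its
clock and its random number generator as parameters, so a test can substitute
deterministic ones.

```python
import random
import time


class FakeClock:
    """Test double for a monotonic clock; time moves only when told to."""

    def __init__(self, start=0.0):
        self._now = start

    def now(self):
        return self._now

    def advance(self, seconds):
        self._now += seconds


class Session:
    """Hypothetical code under test: expires after ttl seconds."""

    def __init__(self, ttl, now=time.monotonic):
        self._now = now  # Injected clock; defaults to the real one.
        self._deadline = now() + ttl

    def expired(self):
        return self._now() >= self._deadline


def pick_backoff(rng):
    """Hypothetical code under test: jittered backoff in [1.0, 2.0)."""
    return 1.0 + rng.random()


def test_session_expires_after_ttl():
    clock = FakeClock()
    session = Session(ttl=30, now=clock.now)
    clock.advance(29)
    assert not session.expired()
    clock.advance(1)
    assert session.expired()


def test_backoff_is_repeatable_with_fixed_seed():
    # The same seed yields the same sequence, so the result is reproducible.
    first = pick_backoff(random.Random(1234))
    second = pick_backoff(random.Random(1234))
    assert first == second
    assert 1.0 <= first < 2.0
```

Neither test waits for real time to pass or depends on an unseeded random
value, so repeated runs produce the same result.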
<span class="compare-worse">Not recommended</span>: Never use `sleep` in tests
as a means of weak synchronization. You may use short sleeps when polling in a
loop, between loop iterations.
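
A minimal sketch of polling with short sleeps, in Python for illustration.
The `wait_until` helper is hypothetical and not part of any Fuchsia library.
The test waits for a condition with a bounded timeout instead of sleeping for
a fixed duration and hoping the work has finished.

```python
import threading
import time


def wait_until(condition, timeout=5.0, interval=0.01):
    """Polls condition() until it is true or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)  # Short sleep between loop iterations only.
    return condition()  # One final check at the deadline.


def test_worker_publishes_result():
    results = []
    worker = threading.Thread(target=lambda: results.append("ready"))
    worker.start()
    # Not recommended: time.sleep(1) here, then asserting on results.
    assert wait_until(lambda: results == ["ready"]), (
        "worker did not publish a result within 5 seconds"
    )
    worker.join()
```

When the code offers a real synchronization point, such as joining a thread
or waiting on an event, that is preferable to polling.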
## Test doubles: stubs, mocks, fakes
Test doubles stand in for a real dependency of the code under test during a
test.
- A **stub** is a test double that returns a given value and contains no logic.
- A **mock** is a test double that has expectations about how it should be
called. Mocks are useful for testing interactions.
- A **fake** is a lightweight implementation of the real object.
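
The three kinds of test double can be sketched side by side. The example is
in Python and uses the standard `unittest.mock` module for the mock; the
`greeting` function and the store classes are hypothetical names made up for
this illustration.

```python
from unittest import mock


class StubStore:
    """Stub: returns a canned value and contains no logic."""

    def get(self, key):
        return "Ada"


class FakeKeyValueStore:
    """Fake: a lightweight in-memory implementation of a real store."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


def greeting(store, user_id):
    """Hypothetical code under test: looks up a name and greets the user."""
    name = store.get(user_id)
    return f"Hello, {name}!" if name else "Hello, guest!"


def test_with_stub():
    assert greeting(StubStore(), "u1") == "Hello, Ada!"


def test_with_mock():
    # Mock: the test asserts on how the dependency was called.
    mock_store = mock.Mock()
    mock_store.get.return_value = None
    greeting(mock_store, "u2")
    mock_store.get.assert_called_once_with("u2")


def test_with_fake():
    # Fake: state is written and read through a working implementation.
    fake_store = FakeKeyValueStore()
    fake_store.put("u3", "Grace")
    assert greeting(fake_store, "u3") == "Hello, Grace!"
    assert greeting(fake_store, "missing") == "Hello, guest!"
```

The mock-based test is the only one coupled to the exact call that
`greeting` makes, which is why mocks are best reserved for cases where the
interaction itself is the behavior being tested.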
<span class="compare-better">Recommended</span>: Create fakes for code that you
own so that your clients can use that as a test double in their own tests. For
integration testing, consider making it possible to run an instance of your real
component in a test realm in isolation from the rest of the system, and document
this behavior.
<span class="compare-worse">Not recommended</span>: Don’t overuse mocks in your
tests, as you might create lower-quality tests that are less readable and more
costly to maintain while providing less confidence when they pass. Avoid mocking
dependencies that you don’t own.
Further reading:
- [Know Your Test Doubles](https://testing.googleblog.com/2013/07/testing-on-toilet-know-your-test-doubles.html){:.external}
- [Don’t Overuse Mocks](https://testing.googleblog.com/2013/05/testing-on-toilet-dont-overuse-mocks.html){:.external}
- [Don’t Mock Types You Don’t Own](https://testing.googleblog.com/2020/07/testing-on-toilet-dont-mock-types-you.html){:.external}
## Use end-to-end tests appropriately
<span class="compare-better">Recommended</span>: Use end-to-end tests to test
critical user journeys. Such tests should exercise the journey as a user, for
instance by automating user interactions and examining user interface state
changes.
<span class="compare-worse">Not recommended</span>: Don’t use end-to-end tests
to cover for missing tests at other layers or smaller scopes, since when those
end-to-end tests catch errors they will be very difficult to troubleshoot.
<span class="compare-better">Recommended</span>: Use end-to-end tests sparingly,
as part of a balanced testing strategy that leans more heavily on smaller-scoped
tests that run quickly and produce precise and actionable results.
<span class="compare-worse">Not recommended</span>: Don’t rely on end-to-end
tests in your development feedback cycle, because they typically take a long
time to run and often produce more flaky results than smaller-scoped tests.
Further reading:
- [Testing UI Logic? Follow the User!](https://testing.googleblog.com/2020/10/testing-on-toilet-testing-ui-logic.html){:.external}
- [Just Say No to More End-to-End Tests](https://testing.googleblog.com/2015/04/just-say-no-to-more-end-to-end-tests.html){:.external}
- [Test Flakiness - One of the main challenges of automated testing](https://testing.googleblog.com/2020/12/test-flakiness-one-of-main-challenges.html){:.external}
- [Test Flakiness - One of the main challenges of automated testing (Part II)](https://testing.googleblog.com/2021/03/test-flakiness-one-of-main-challenges.html){:.external}
- [Avoiding Flakey Tests](https://testing.googleblog.com/2008/04/tott-avoiding-flakey-tests.html){:.external}
- [Where do our flaky tests come from?](https://testing.googleblog.com/2017/04/where-do-our-flaky-tests-come-from.html){:.external}
[test-scope]: /docs/contribute/testing/scope.md
[testing-principles]: /docs/contribute/testing/principles.md
[audio-effects-example-tests]: /src/media/audio/examples/effects/test/audio_effects_example_tests.cc
[build-bringup]: /docs/development/build/build_system/bringup.md
[capabilities-protocol]: /docs/concepts/components/v2/capabilities/protocol.md
[cf]: /docs/concepts/components/v2/README.md
[cf-capabilities]: /docs/concepts/components/v2/capabilities/README.md
[cf-manifests]: /docs/concepts/components/v2/component_manifests.md
[channel]: /docs/reference/kernel_objects/channel.md
[continuous-integration]: https://martinfowler.com/articles/continuousIntegration.html
[contract-test]: https://martinfowler.com/bliki/ContractTest.html
[coverage-no-e2e]: /docs/contribute/testing/coverage.md#end-to-end_e2e_tests_exclusion
[cpuperf]: /garnet/bin/cpuperf/README.md
[create-e2e-test]: /docs/development/testing/create_a_new_end_to_end_test.md
[cts]: /sdk/cts/README.md
[dependency-injection]: https://en.m.wikipedia.org/wiki/Dependency_injection
[e2e-perf]: /src/tests/end_to_end/perf/README.md
[fidl]: /docs/concepts/fidl/overview.md
[fidl-benchmarks]: /src/tests/benchmarks/fidl/benchmark_suite/
[fidl-compatibility-tests]: /src/tests/fidl/compatibility/README.md
[fidl-wire-format]: /docs/reference/fidl/language/wire-format
[fonts-tests-integration]: /src/fonts/tests/integration/README.md
[fsi]: /docs/concepts/packages/system.md
[fuchsia.pkg.fontresolver]: https://fuchsia.dev/reference/fidl/fuchsia.pkg#FontResolver
[fuzzing]: /docs/development/testing/fuzzing/overview.md
[gidl]: /tools/fidl/gidl/README.md
[inspect]: /docs/development/diagnostics/inspect/README.md
[inspect-codelab]: /docs/development/diagnostics/inspect/codelab/codelab.md
[inspect-validator]: /docs/reference/diagnostics/inspect/validator/README.md
[inspect-vmo-format]: /docs/reference/diagnostics/inspect/vmo-format.md
[inspect-vmo-format-update]: /docs/reference/diagnostics/inspect/updating-vmo-format.md
[minfs]: /docs/concepts/filesystems/minfs.md
[minfs-stress]: /src/storage/stress-tests/minfs/
[multi-repo-dev]: https://testing.googleblog.com/2015/05/multi-repository-development.html
[netstack-benchmarks]: /src/connectivity/network/tests/benchmarks/README.md
[netstack3-roadmap]: /docs/contribute/roadmap/2021/netstack3.md
[practical-test-pyramid]: https://martinfowler.com/articles/practical-test-pyramid.html
[principles]: /docs/concepts/index.md
[principles-inclusive]: /docs/concepts/principles/inclusive.md
[principles-pragmatic]: /docs/concepts/principles/pragmatic.md
[principles-secure]: /docs/concepts/principles/secure.md
[principles-updatable]: /docs/concepts/principles/updatable.md
[reader-fuzzer]: /zircon/system/ulib/inspect/tests/reader_fuzzer.cc
[realm-builder]: /docs/development/testing/components/realm_builder.md
[run-e2e-test]: /docs/development/testing/run_an_end_to_end_test.md
[run-test-component]: /docs/development/run/run-test-component.md
[rust-stress-test-lib]: /docs/development/testing/rust_stress_test_library.md
[sanitizers]: /docs/contribute/testing/sanitizers.md
[sanitizers-supported-configs]: /docs/contribute/testing/sanitizers.md#supported_configurations
[screen-is-not-black]: /src/tests/end_to_end/screen_is_not_black/README.md
[stress-tests]: /docs/development/testing/stress_tests.md
[syscalls]: /docs/reference/syscalls/README.md
[test-coverage]: /docs/contribute/testing/coverage.md
[test-package-gn]: /docs/development/components/build.md#test-packages
[testing-integration]: /docs/development/testing/components/integration_testing.md
[testing-v2]: /docs/development/testing/components/README.md
[timer-slack]: /docs/concepts/kernel/timer_slack.md
[timer-tests]: /zircon/kernel/tests/timer_tests.cc
[timers-test]: https://fuchsia.googlesource.com/fuchsia/+/main/src/zircon/tests/timers/timers.cc
[userboot]: /docs/concepts/process/userboot.md
[utest-core]: /zircon/system/utest/core/README.md
[vdso]: /docs/concepts/kernel/vdso.md
[wikipedia-dependency-injection]: https://en.m.wikipedia.org/wiki/Dependency_injection
[`fuchsia_async::Executor::new_with_fake_time`]: https://fuchsia.googlesource.com/fuchsia/+/a874276/src/lib/fuchsia-async/src/executor.rs#345
[fake-clock]: https://fuchsia.googlesource.com/fuchsia/+/a874276/src/lib/fake-clock
[rust_65218]: https://github.com/rust-lang/rust/issues/65218
[go_test_flags]: https://golang.org/cmd/go/#hdr-Testing_flags
[gtest_test_flags]: https://github.com/google/googletest/blob/main/docs/advanced.md#repeating-the-tests