This document codifies the best practices for handling test flakes on Fuchsia.
A flaky test is a test that sometimes passes and sometimes fails when run against the exact same revision of the code.
Flaky tests are bad because they:
This document is specific to test flakes, not infrastructure flakes.
The following describes the expected and recommended lifetime of a flake:
A flake fetching tool is currently in use to identify the vast majority of flakes.
The tool looks for test failures in CQ where the same test succeeded when retried on the same patch set.
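The detection rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the actual tool: the `RunResult` record, its field names, and the `find_flakes` function are hypothetical, and the real tool's data model is not described in this document.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """One test execution in CQ. All fields are hypothetical."""
    test_name: str
    patch_set: str  # identifies the exact patch set that was tested
    passed: bool


def find_flakes(results):
    """Return sorted (test_name, patch_set) pairs that both failed and passed."""
    outcomes = defaultdict(set)
    for r in results:
        outcomes[(r.test_name, r.patch_set)].add(r.passed)
    # A pair counts as a flake when the same test on the same patch set
    # produced at least one failure and at least one pass.
    return sorted(key for key, seen in outcomes.items() if seen == {True, False})


runs = [
    RunResult("foo_test", "change-1/ps-1", False),  # failed first
    RunResult("foo_test", "change-1/ps-1", True),   # passed on retry: flake
    RunResult("bar_test", "change-1/ps-1", False),  # failed twice: a real
    RunResult("bar_test", "change-1/ps-1", False),  # failure, not a flake
    RunResult("baz_test", "change-2/ps-1", True),   # never failed
]
flakes = find_flakes(runs)
```

Grouping by test and patch set is what separates a flake from a genuine regression: a test that fails consistently on one patch set is reported as a failure of that change, while a test that flips outcome with no code change is flagged.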
Above all else, prioritize removing the flaky test from the commit queue (CQ). This can be achieved in two ways: reverting the code that introduced the flake, or disabling the test.
The above mechanisms are recommended because they remove the flaky test from CQ and so keep the commit queue reliable. The first option (reverting code) is preferred, but it is not always as easy as the second option (disabling the test), which reduces test coverage. Importantly, neither option prevents diagnosing and fixing the flake; both allow that work to happen offline.
Attempting to fix the test without first removing it from CQ is not recommended. While the flaky test remains in CQ, CQ is unreliable for all other contributors, which allows additional flakes to compound in the codebase.
At this point, one can take the filed issue, locally re-enable the test, and work on reproducing the failure. Reproducing it makes it possible to find the root cause and fix the issue. Once the issue has been fixed, the bug can be closed and the test re-enabled. Any patches that were reverted can then re-land safely.
When fixing a flake, verify the fix by testing for flakiness in CQ.
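As a rough guide to how many consecutive passing runs are needed before a fix can be trusted, the sketch below computes the smallest number of runs after which an unfixed flake would have failed at least once with a chosen confidence. It assumes that runs are independent and that the flake fails at a constant rate, which real flakes often violate. The function name and parameters are illustrative and not part of any Fuchsia tooling.

```python
import math


def runs_needed(failure_rate, confidence=0.95):
    """Smallest n such that 1 - (1 - failure_rate)**n >= confidence.

    Assumes independent runs and a constant per-run failure rate.
    """
    if not 0 < failure_rate < 1:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # The chance that an unfixed flake passes n times in a row is
    # (1 - failure_rate)**n; solve for the n that pushes it below
    # 1 - confidence.
    return math.ceil(math.log(1 - confidence) / math.log(1 - failure_rate))


# A flake that failed about 1 run in 20 before the fix:
n = runs_needed(0.05)  # 59
```

Under these assumptions, a test that previously failed 5% of the time has to pass 59 times in a row before one can be 95% confident that the fix worked; a single green run says very little.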