# Writing integration tests for graphics, input, and accessibility
## Checklist
1. Use [TestWithEnvironment](/sdk/lib/sys/cpp/testing/test_with_environment.h)
   to ensure that each test runs with a fresh component topology (see the
   fixture sketch after this checklist).
1. Do not use CMX `injected-services` to construct component topology. (It
shares the same component instances between different test runs.)
1. Set up `fuchsia.hardware.display.Provider` to be vended by
   `fuchsia-pkg://fuchsia.com/fake-hardware-display-controller-provider#meta/hdcp.cmx`,
and bundle `fake-hardware-display-controller` with your test package.
1. Bundle [Root Presenter](/src/ui/bin/root_presenter/BUILD.gn)'s
`component_v1_for_test` with the test's package. It prevents the actual
input driver from interacting with the test. Invoke Root Presenter from the
test package's URL. E.g.,
   `fuchsia-pkg://fuchsia.com/touch-input-test#meta/root_presenter.cmx`.
1. Bundle [Scenic](/src/ui/scenic/BUILD.gn)'s `component` with the test's
package. It ensures that the test uses the Scenic it was built with. Invoke
Scenic from the test package's URL. E.g.,
   `fuchsia-pkg://fuchsia.com/touch-input-test#meta/scenic.cmx`.
1. Don't invoke components from *another* test's package!
1. No sleeps or waits, unless the API is deficient. Every action by the test is
gated on a logical condition that the test can observe. E.g., inject touch
events only when the test observes the child view is actually connected to
the view tree and vending content to hit.
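
Taken together, the checklist translates into a fixture roughly like the
following. This is a minimal sketch, not a drop-in implementation: the
`touch-input-test` package name, the fixture name, and the exact set of
services registered are placeholders to adapt to your own test.

```cpp
#include <fuchsia/sys/cpp/fidl.h>
#include <gtest/gtest.h>
#include <lib/sys/cpp/testing/enclosing_environment.h>
#include <lib/sys/cpp/testing/test_with_environment.h>

#include <memory>

// Placeholder URLs, following the checklist above: Scenic and Root Presenter
// come from this test's own package; the display provider is the fake.
constexpr char kScenicUrl[] =
    "fuchsia-pkg://fuchsia.com/touch-input-test#meta/scenic.cmx";
constexpr char kRootPresenterUrl[] =
    "fuchsia-pkg://fuchsia.com/touch-input-test#meta/root_presenter.cmx";
constexpr char kFakeDisplayProviderUrl[] =
    "fuchsia-pkg://fuchsia.com/fake-hardware-display-controller-provider#meta/hdcp.cmx";

class TouchDispatchTest : public sys::testing::TestWithEnvironment {
 protected:
  void SetUp() override {
    // Each TEST_F gets a freshly constructed environment, so no component
    // state leaks from one test case into the next.
    auto services = CreateServices();

    fuchsia::sys::LaunchInfo display_provider;
    display_provider.url = kFakeDisplayProviderUrl;
    services->AddServiceWithLaunchInfo(std::move(display_provider),
                                       "fuchsia.hardware.display.Provider");

    fuchsia::sys::LaunchInfo scenic;
    scenic.url = kScenicUrl;
    services->AddServiceWithLaunchInfo(std::move(scenic), "fuchsia.ui.scenic.Scenic");

    fuchsia::sys::LaunchInfo root_presenter;
    root_presenter.url = kRootPresenterUrl;
    services->AddServiceWithLaunchInfo(std::move(root_presenter),
                                       "fuchsia.ui.policy.Presenter");

    environment_ =
        CreateNewEnclosingEnvironment("touch-dispatch-test-env", std::move(services));
    ASSERT_TRUE(WaitForEnclosingEnvToStart(environment_.get()));
  }

  std::unique_ptr<sys::testing::EnclosingEnvironment> environment_;
};
```

Each `TEST_F` then connects to services through `environment_` only, never
through the test's own (real) environment.
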
## Guidelines for writing integration tests
We have Fuchsia-based *products* built on the Fuchsia *platform*. As Fuchsia
platform developers, we want to ship a solid platform and validate that the
*platform* works correctly for all our supported *products*. Integration tests
ensure we uphold correctness and stability of platform functionality that spans
two or more components, via our prebuilt binaries (such as Scenic) and API
contracts (over FIDL). This is especially valuable in validating our ongoing
platform migrations. One example is the set of touch dispatch paths, such as
from Input Pipeline to Scenic to Flutter.
### Models of production
Integration tests model a specific product scenario by running multiple Fuchsia
components together. For example, to ensure that the "touch dispatch path from
device to client" continues to work as intended, we have a "touch-input-test"
that exercises the actual components involved in touch dispatch, over the actual
FIDLs used in production.
Because integration tests are a model, there can (and should) be some
simplification from actual production. Obviously, these tests won't run the
actual product binaries; instead, a reasonable stand-in is crafted for the
test. The idea is to use the simplest stand-in that can still catch serious
problems and regressions.
Sometimes, it's not straightforward for the test to use an actual platform path
used in production; we use a reasonable stand-in for these cases too. For
example, we can't actually inject into `/dev/class/input-report`, so we have a
dedicated
[API surface](/sdk/fidl/fuchsia.input.injection/input_device_registry.fidl) on
Input Pipeline to accept injections in a test scenario.
The important thing is that the test gives us *confidence* that evolution of
platform code and platform protocols will not break existing product scenarios.
### No flakes
When the scenario involves graphics, it's very easy to accidentally introduce
flakiness into the test, robbing us of confidence in our changes. Graphics APIs
operate across several dimensions of lifecycle, topology, and
synchronization/signaling schemes, in the domains of components, graphical
memory management, view system, and visual asset placement. Furthermore, these
APIs provide the basis for, and closely interact with, the Input APIs and the
Accessibility APIs.
The principal challenge is to write tests that set up a real graphics stack, in
a way that is robust against elasticity in execution time. We talk about those
challenges in "Synchronization challenges", below. There are other reasons for
tests going wrong, and most of them can be dealt with by enforcing hermeticity
at various levels. We talk about these first. A final challenge is to author
tests that model enough of the interesting complexity on just the platform side,
so that we know complex product scenarios don't break with platform evolution.
### Precarious stack of stuff
At the bottom we have graphics tests. Input tests build on top of graphics
tests. And accessibility tests build on top of input tests. Hence they have all
the same problems, just with more components. It is thus critical that a basic
graphics test is reasonable to write and understand, because it forms the basis
for "higher level" tests that inherently have more complexity.
### Questions and answers
#### Why not just rely on product-side e2e tests?
Product owners must write e2e tests to ensure their product is safe from
platform changes. E2e tests are big, heavy, and expensive to run; often, they
are flaky as well. They are authored in a different repository, and run in their
own test automation regime ("CQ"). And they care about the subset of OS
functionality that their product relies on.
Given these realities, platform developers cannot rely on these e2e tests to
catch problems in platform APIs and platform code.
By authoring platform-side integration tests, platform developers can get
breakage signals much faster with less code in the tests, and systematically
exercise all the functionality used across the full range of supported products.
Product owners benefit by increased confidence in the platform's reliability.
#### Why all this emphasis on hermeticity?
Deterministic, flake-free tests increase the signal-to-noise ratio from test
runs. They make life better.
When tests rain down flakes every day, we ignore them, and they become noise.
But when we try to fix the source of flakes, it often reveals a defect in our
practices, APIs, or documentation, which we can fix (think "impact"). Each of
these hermeticity goals addresses a real problem that someone
in Fuchsia encountered. When we have hermeticity, everyone benefits, and Fuchsia
becomes better.
#### Why all this emphasis on integration tests?
Fuchsia's platform teams have important migrations in progress that affect
products. Integration tests are a critical method of guaranteeing that our
platform changes are safe and stable with respect to our product partners.
Examples: Components Framework v2, Input API migration, Flatland API migration,
etc.
#### What about CTS tests?
The
[Fuchsia Compatibility Test Suite](/docs/contribute/governance/rfcs/0015_cts)
ensures that the implementations offered by the Fuchsia platform conform to the
specifications of the Fuchsia platform. An effective CTS will have UI
integration tests, and so this guidance doc applies to those UI integration
tests.
## Prefer hermeticity
Various types of
[hermeticity](/docs/concepts/testing/v2/test_runner_framework#hermeticity)
make our tests more reliable.
### Package hermeticity
All components used in the test should come from the same test package. This can
be verified by examining the fuchsia-pkg URLs launched in the test; they should
reference the test package.
If we don't have package hermeticity, and a component C is defined in the
universe U, then the C launched will come from U, instead of your locally
modified copy of C. This issue isn't so much a problem in CQ, because it
rebuilds everything from scratch. However, it is definitely an issue for local
development, where it causes surprises: a fix to C won't necessarily run in
your test, which hampers developer productivity. It is another sharp corner to
trap the unwary.
There is a further advantage to package hermeticity. For those components that
read from the `config-data` package, this practice allows a test package to
define its own config-data for the components it contains. In fact, this is
the only way to define a piece of custom config-data for a test. For example,
the display rotation in Root Presenter is conveyed with config-data.
### Environment hermeticity
All components in the test should be brought up and torn down in a custom
Fuchsia environment. For example, in Components Framework v1, the
"TestWithEnvironment" fixture allows the test to construct the precise
environment it needs to run properly.
This practice forces component state to be re-initialized on each run of the
test, thereby preventing inter-test state pollution.
The advantages of doing so are:
* The test is far more reproducible, as the initial component state is always
known to be good.
* It's trivial to run the test hundreds of times in a tight loop, thus
speeding up flake detection.
* The test author can adjust the environment more precisely, and more
flexibly, than otherwise possible.
#### No to `injected-services`
In Components Framework v1, it's possible to declare
[`injected-services`](/docs/concepts/testing/v1_test_component#integration_testing)
in a test's CMX manifest. Declaring `injected-services` is somewhat of an
anti-pattern. It also constructs a test environment, but *all the test
executions* run in the *same environment*. If a service component has dirtied
state, a subsequent `TEST_F` execution inadvertently runs against that dirtied
state.
### Capability hermeticity
None of the components in the test should be exposed to the actual root
environment.
For FIDL protocols, this is not so much an issue. However, there are other types
of capabilities where CF v1 has leaks. A good example is access to device
capabilities, such as `/dev/class/input-report` and
`/dev/class/display-controller`. Components that declare access to device
capabilities will actually access these capabilities, on the real device, in a
test environment.
We can gain capability hermeticity by relying on a reasonable fake. Two
examples:
* The display controller, with some configuration, can be faked out. A
subsequent advantage is that graphics tests can be run in parallel!
* One downside is that it's not easy to physically observe what the
graphical state is, because the test no longer drives the real display.
So development can be a little harder.
* The input devices are faked out with an injection FIDL, and that's how tests
can trigger custom input. However, the component that receives injections
still needs to avoid declaring access to `/dev/class/input-report`! The
recommendation here is to put a `/dev`-less copy of the component manifest
  into the test package. (See the registration sketch after this list.)
* Example:
[root_presenter.cmx](/src/ui/bin/root_presenter/meta/root_presenter.cmx)
(production) vs
[root_presenter_base.cmx](/src/ui/bin/root_presenter/meta/root_presenter_base.cmx)
(production, minus `/dev/class/input-report`).
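
To make the input bullet concrete, here is a rough sketch of registering a
fake device over the injection FIDL instead of touching
`/dev/class/input-report`. The `FakeTouchScreen` class is hypothetical, and
using the generated `testing::InputDevice_TestBase` scaffold is an assumption
about how little of `fuchsia.input.report.InputDevice` a given test needs to
implement.

```cpp
#include <fuchsia/input/injection/cpp/fidl.h>
#include <fuchsia/input/report/cpp/fidl.h>
#include <fuchsia/input/report/cpp/fidl_test_base.h>
#include <lib/fidl/cpp/binding.h>
#include <lib/sys/cpp/testing/enclosing_environment.h>

// Hypothetical fake device: the generated test scaffold lets the fake
// implement only the fuchsia.input.report.InputDevice methods the test
// actually exercises.
class FakeTouchScreen : public fuchsia::input::report::testing::InputDevice_TestBase {
 public:
  // Hands out the client end that gets registered with Input Pipeline.
  fidl::InterfaceHandle<fuchsia::input::report::InputDevice> Bind() {
    fidl::InterfaceHandle<fuchsia::input::report::InputDevice> handle;
    binding_.Bind(handle.NewRequest());
    return handle;
  }

 private:
  void NotImplemented_(const std::string& name) override {}
  fidl::Binding<fuchsia::input::report::InputDevice> binding_{this};
};

// Register the fake through the injection API exposed in the test
// environment. The fake must outlive the registration, so a real test keeps
// it as a fixture member.
void RegisterFakeTouchScreen(sys::testing::EnclosingEnvironment* environment,
                             FakeTouchScreen* fake) {
  auto registry =
      environment->ConnectToService<fuchsia::input::injection::InputDeviceRegistry>();
  registry->Register(fake->Bind());
}
```
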
## Synchronization challenges
Correct, flake-free inter-component graphics synchronization depends intimately
on the specific graphics API being used. The
[legacy Scenic API](/sdk/fidl/fuchsia.ui.scenic/session.fidl), sometimes called
"GFX", has sparse guarantees for when something is "on screen", so extra care
must be taken to ensure a flake free test. As a rule of thumb, if you imagine
the timeline of actions for every component stretching and shrinking by
arbitrary amounts, a robust test will complete for all CPU-schedulable
timelines. The challenge is to construct action gates where the test will hold
steady until a desired outcome happens. Sleeps and timeouts are notoriously
problematic for this reason. Repeated queries of global state (such as a pixel
color test) are another mechanism by which we could construct higher-level
gates, but they incur a heavy performance penalty and add complexity to
debugging.
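
For example, a fixture built on `TestWithEnvironment` (whose loop-fixture base
provides `RunLoopUntil`) can construct an action gate that holds until a
condition the test observes becomes true, rather than sleeping. In this sketch,
`child_view_connected_` and `InjectTap()` are hypothetical fixture members; the
sections below discuss where such a signal can actually come from.

```cpp
TEST_F(TouchDispatchTest, InjectsOnlyAfterChildIsVisible) {
  // ... launch the child view through environment_ ...

  // Hold here until a condition the test can observe becomes true; the gate
  // tolerates arbitrary stretching of every component's timeline.
  RunLoopUntil([this] { return child_view_connected_; });

  // Only now is the action performed. A sleep here would just be a guess
  // about how long scene setup takes, and guesses flake.
  InjectTap();  // hypothetical helper that drives a fake input device
}
```
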
Another dimension of complexity is that much client code does not interact
directly with Fuchsia graphics APIs; instead it runs in an abstracted runner
environment. Flutter and Web are good examples where the client code cannot
directly use Scenic APIs. Some facilities can be piped through the runner, but
tests generally cannot rely on full API access. Some runners even coarsen the
timestamps, which also complicates testing a bit.
One more subtlety. We're interested in the "state of the scene graph", which is
not precisely the same thing as "state of the rendering buffer". For most
purposes, they are loosely equivalent, because the entity taking a visual
screenshot is the same entity that holds the scene graph - Scenic. However,
specific actions, like accessibility color adjustments, will not be accurately
portrayed in a visual screenshot, because the color adjustment takes place in
hardware, below Scenic.
### GFX is hard
For [GFX](/sdk/fidl/fuchsia.ui.scenic/session.fidl) in particular, nested views
are particularly difficult to synchronize. The key difficulty is that a client
needs two discrete pieces of information, relative view size and pixel metrics,
to construct correctly scaled content on a particular physical screen, but GFX
conveys the view metrics only *after* the view is already "linked up" to the
global scene graph. So from the parent view's perspective, child view
connectivity cannot imply the child view's content has rendered to screen.
A workaround signal,
[`fuchsia.ui.gfx.ViewState.is_rendering`](/sdk/fidl/fuchsia.ui.gfx/types.fidl),
tells the parent view that something in the child view started rendering actual
content. This is actually sufficient for a single-depth child view, when the
child's content is simple. In fact, some input tests rely on this signal to
successfully gate input injection. But for a child view that actually consists
of a 2+ view hierarchy, the `is_rendering` signal does not say *which* views in
the child view hierarchy have rendered content, only that *some* descendant view
has rendered content to screen.
For client views that have direct access to the GFX API, it's possible to
construct a tower of signals along the child view hierarchy, but this is
fragile, complex, and subtle. It is also not feasible for clients that employ a
runner, like web clients. (The web runner internally constructs a parent view
and a child view for security.) From the test's perspective, such a client will
not generate a reliable signal in the GFX API.
### Flatland makes it easier
The upcoming Flatland API, in contrast, solves this problem with a sophisticated
linkage system, where scene graph connectivity is made independent of view
metrics. That is, a parent view can send a child view the correct view metrics
*prior* to the child view actually connecting to the global scene graph. Then,
when the child view has finished preparing the rendered content, it can connect
to the global scene graph in an atomic step, without revealing intermediate
content construction phases (for example, nested content in a view hierarchy).
### Test setup - one GFX child view
For a single GFX child view, the test can set up an "interceptor" or "proxy"
view, to harvest the
[`fuchsia.ui.gfx.ViewState.is_rendering`](/sdk/fidl/fuchsia.ui.gfx/types.fidl)
signal from Scenic. Then subsequent actions, such as injecting touch events, can
be reliably performed, with good assumptions about the child's graphical
content.
An example is how `touch-input-test.cc` sets up `touch-gfx-client` in
[`TouchInputTest.CppGfxClientTap`](/src/ui/tests/integration_input_tests/touch/touch-input-test.cc#395).
Here, the test sets up its own Scenic view, links the child's view underneath
its own, and waits for the child view to connect, using it as a gate for touch
injection.
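
A rough sketch of harvesting that signal, assuming the fixture owns a
`scenic::Session` for its interceptor view and knows the id of the `ViewHolder`
it created for the child; the resulting flag is exactly the kind of condition
that a `RunLoopUntil` gate can wait on.

```cpp
#include <fuchsia/ui/gfx/cpp/fidl.h>
#include <fuchsia/ui/scenic/cpp/fidl.h>
#include <lib/ui/scenic/cpp/session.h>

// Flip `child_view_connected` once the child view (or some descendant of it)
// reports that it is rendering content. `session` is the test's own Scenic
// session and `child_view_holder_id` identifies the ViewHolder the test
// created for the child; both are assumed to be set up by the fixture.
void WatchForChildRendering(scenic::Session* session, uint32_t child_view_holder_id,
                            bool* child_view_connected) {
  session->set_event_handler(
      [child_view_holder_id,
       child_view_connected](std::vector<fuchsia::ui::scenic::Event> events) {
        for (const auto& event : events) {
          if (event.is_gfx() && event.gfx().is_view_state_changed() &&
              event.gfx().view_state_changed().view_holder_id == child_view_holder_id &&
              event.gfx().view_state_changed().state.is_rendering) {
            *child_view_connected = true;
          }
        }
      });
}
```
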
Advantages:
* Programmatically correct use of APIs
* Reliable
* Fast, minimal computational disturbance; good for performance work
Disadvantages:
* Setup and control flow can be delicate to manage
### Test setup - child is a GFX view hierarchy
For a stacked view hierarchy, current best practice is still to set up an
interceptor view, and gate subsequent actions on the `ViewState.is_rendering`
signal. However, the signal merely indicates that at least one of the child
views in the hierarchy started rendering content; for some scenarios, like
Chromium, this is not a sufficiently robust gating mechanism. That is, the
signal is a little too early, since the test actually needs both the child
view and the grandchild view to be in a rendering state.
To work around this nondeterminism, the subsequent action (touch injection)
needs to run in a loop, until all descendant views have published their content
to the scene graph. For example, for `TouchInputTest.ChromiumTap`, the test
issues a repeated "tap-tap-tap" until it sees the client respond in an expected
way.
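
Concretely, the loop looks roughly like the following inside the test body,
where `client_responded` is flipped by whatever channel the test uses to hear
back from the client (for example, a test-only FIDL the client calls on tap),
and `InjectTap()` is again a hypothetical helper. `RunLoopWithTimeoutOrUntil`
bounds each attempt without turning the whole test into a blind wait.

```cpp
// The is_rendering signal only proves that *some* descendant view rendered,
// so early taps may land before the grandchild view is ready. Repeat until
// the client confirms it saw the touch.
bool client_responded = false;
// ... wire `client_responded` to the client's response callback ...
while (!client_responded) {
  InjectTap();  // hypothetical helper
  // Bound each attempt; the overall test timeout still catches a client
  // that never responds.
  RunLoopWithTimeoutOrUntil([&] { return client_responded; }, zx::sec(1));
}
```
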
Advantages: same as above
Disadvantages:
* Can't manage a child view hierarchy in a completely deterministic way.
### Test setup - gate on screenshot histogram
An alternate synchronization scheme is to set up the scene with various
predefined colors, which may get toggled in response to actions, such as touch
input. The test requests screenshots in succession, until the desired color
condition is reached, by counting the number of pixels in each color value (a
"histogram"). For example, if the child view is expected to present a red
rectangle, the test will loop until the histogram returns predominantly red, and
then perform a subsequent action.
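
A sketch of such a histogram gate, assuming the raw pixel buffer returned by
`fuchsia.ui.scenic.Scenic.TakeScreenshot`; `kRedBgra` and
`ExpectedPixelCount()` are hypothetical placeholders for the test's own color
scheme and scene size.

```cpp
#include <fuchsia/ui/scenic/cpp/fidl.h>
#include <lib/zx/vmo.h>
#include <zircon/assert.h>

#include <cstring>
#include <map>
#include <vector>

// Count how many pixels of each 32-bit value appear in a screenshot.
std::map<uint32_t, size_t> Histogram(const fuchsia::ui::scenic::ScreenshotData& screenshot) {
  std::vector<uint8_t> bytes(screenshot.data.size);
  ZX_ASSERT(screenshot.data.vmo.read(bytes.data(), 0, bytes.size()) == ZX_OK);
  std::map<uint32_t, size_t> histogram;
  for (size_t i = 0; i + 4 <= bytes.size(); i += 4) {
    uint32_t pixel;
    std::memcpy(&pixel, &bytes[i], sizeof(pixel));
    ++histogram[pixel];
  }
  return histogram;
}

// In the test body: screenshot in a loop until the expected color dominates,
// then perform the next action.
//   fuchsia::ui::scenic::ScenicSyncPtr scenic;
//   environment_->ConnectToService(scenic.NewRequest());
//   bool mostly_red = false;
//   while (!mostly_red) {
//     fuchsia::ui::scenic::ScreenshotData screenshot;
//     bool success = false;
//     scenic->TakeScreenshot(&screenshot, &success);
//     if (!success) continue;
//     mostly_red = Histogram(screenshot)[kRedBgra] > ExpectedPixelCount() / 2;
//   }
```
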
Advantages:
* The resulting test code is nicely linear in control flow, and easy to grasp
at a high level.
* Works with arbitrarily deep child hierarchies, assuming a good enough color
scheme.
Disadvantages:
* The color scheme has to be relatively simple, and the reader needs to
understand the scheme.
* Heavyweight and slow, due to all those screenshots taken.
* Scenic actually has to work to generate those screenshots, so it can skew
tests used for performance analysis.
* Not completely foolproof. Flakes can still happen.
* Screenshotting is a grandfathered global-scope capability (see
[`fuchsia.ui.scenic.Scenic`](/sdk/fidl/fuchsia.ui.scenic/scenic.fidl#89)),
and it will be more tightly controlled in the future.
Example: see
[`FlutterEmbedderTests.Embedding`](/src/ui/tests/integration_flutter_tests/embedder/flutter-embedder-test.cc).
### Test setup - one Flatland child view (TBD)
## Modeling complex scenarios
The graphics API allows each product to generate an arbitrarily complex scene
graph. However, the products we have today typically rely on a few "core
topologies" that are stable and suitable for the product to build on.
It's a valuable exercise to capture each of these core topologies in our
platform integration tests. Some examples:
* Touch dispatch to the Flutter runner.
* The intra-component connection between the Flutter runner (C++) and the
Flutter framework (Dart) is delicate, and a runner test will catch bad
rolls into fuchsia.git. Workstation was once broken for many months due
to a problem here.
* Touch dispatch to the Chromium runner.
* Chromium employs a two-view topology as part of its JS sandboxing
strategy. Having a test ensures that Chromium is correctly using our
APIs, and that our changes don't accidentally break Chromium.
* Touch dispatch from Flutter runner to Chromium runner.
* This scenario models a critical production path where the product
reinjects touch events into its Chromium child view.
* One parent view and two child views, using assorted runners, to ensure touch
dispatch is routed to the correct view.
Developing new models is also how we test new topologies and interaction
patterns, to make sure the APIs are sensible and usable and to serve as a
foundation for converting an entire product.
* For example, converting a product to
  [Session Framework](/docs/concepts/session/introduction) has many
moving parts, and validating specific graphics scenarios builds confidence
in APIs, implementations, and migration strategy.