
 # Stress tests

 ## The need for stress tests

 When writing a stateful system that interacts with other processes/components, it is possible to
 introduce bugs that may not manifest until the system is put under load. These bugs arise from
 incorrect assumptions, off-by-one errors or subtle timing issues. The errors are hard to detect in
 advance because they are the result of complex, stateful interactions between layers of the system.

 Stress tests reveal errors in these interactions by concurrently generating large workloads for
 the system. If a workload produces an unexpected error, developers can root-cause the issue,
 fix it, and write a test that covers that specific interaction.

 Our infrastructure has a dedicated builder, `core.x64-stress`, for running stress tests. This
 builder runs each stress test for a maximum of 22 hours.

 Note: Stress tests are currently restricted to the `QEMU` device type, since they run for long
 periods of time.

 ## Client Libraries

 ### Rust stress test library

 We offer a Rust client library for writing stress tests. See the
 [guide to writing tests using that library](rust_stress_test_library.md).

 #### Concepts

 Developers can write stress tests that create an environment and produce actors
 that indefinitely run randomized workloads on that environment.

 ##### Actor

 An actor's responsibility is to perform operations that stress an environment. It can do this
 in one of two ways:

 * Interacting cooperatively with the system-under-test (workload generation).

 * Intentionally breaking the system-under-test (failure simulation).
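
The two kinds of actor can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen for illustration: `FakeStore` stands in for a system-under-test, and the `Actor` trait, `ActorError`, `WriterActor` and `CrashActor` are simplified stand-ins, not the API of the Rust stress test library.

```rust
use std::collections::HashMap;

/// Hypothetical system-under-test: an in-memory store that can be crashed.
pub struct FakeStore {
    entries: HashMap<u64, u64>,
    alive: bool,
}

impl FakeStore {
    pub fn new() -> Self {
        FakeStore { entries: HashMap::new(), alive: true }
    }

    pub fn is_alive(&self) -> bool {
        self.alive
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}

/// Hypothetical error type: what an actor reports when an operation fails.
#[derive(Debug, PartialEq)]
pub enum ActorError {
    SystemDown,
}

/// Hypothetical, simplified actor interface: one call performs one operation.
pub trait Actor {
    fn perform(&mut self, store: &mut FakeStore) -> Result<(), ActorError>;
}

/// Workload generation: cooperates with the store by writing entries.
pub struct WriterActor {
    next_key: u64,
}

impl WriterActor {
    pub fn new() -> Self {
        WriterActor { next_key: 0 }
    }
}

impl Actor for WriterActor {
    fn perform(&mut self, store: &mut FakeStore) -> Result<(), ActorError> {
        if !store.is_alive() {
            return Err(ActorError::SystemDown);
        }
        store.entries.insert(self.next_key, self.next_key * 2);
        self.next_key += 1;
        Ok(())
    }
}

/// Failure simulation: intentionally breaks the store.
pub struct CrashActor;

impl Actor for CrashActor {
    fn perform(&mut self, store: &mut FakeStore) -> Result<(), ActorError> {
        store.alive = false;
        store.entries.clear();
        Ok(())
    }
}
```

In this sketch, a `WriterActor` that runs after a `CrashActor` receives `ActorError::SystemDown`, which is the kind of signal a test could use to decide that the system needs to be restarted.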

 ##### Actor Runner

 An actor runner is a thin wrapper around an actor and is responsible for running that actor
 indefinitely. The environment creates one runner per actor, and each runner runs on its own thread.
 A runner repeatedly instructs its actor to perform a single operation and return the result.
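
The runner loop described above can be sketched with standard-library threads. This is a simplified illustration, not the library's implementation: the `Actor` trait, `CountingActor`, `ActorRunner` and the shared operation counter used as a stop condition are all assumptions made for the example.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical, simplified actor interface: one call performs one operation.
pub trait Actor: Send + 'static {
    fn perform(&mut self) -> Result<(), String>;
}

/// Hypothetical actor that only counts how many operations it performed.
pub struct CountingActor {
    pub performed: u64,
}

impl Actor for CountingActor {
    fn perform(&mut self) -> Result<(), String> {
        self.performed += 1;
        Ok(())
    }
}

/// Thin wrapper that owns one actor and drives it on a dedicated thread.
pub struct ActorRunner<A: Actor> {
    name: String,
    actor: A,
}

impl<A: Actor> ActorRunner<A> {
    pub fn new(name: &str, actor: A) -> Self {
        ActorRunner { name: name.to_string(), actor }
    }

    /// Spawns a thread that asks the actor for one operation at a time until
    /// the shared counter reaches `target` (an assumed stop condition), then
    /// hands the actor back. An actor error ends the loop early.
    pub fn run(mut self, counter: Arc<AtomicU64>, target: u64) -> JoinHandle<Result<A, String>> {
        thread::spawn(move || {
            while counter.load(Ordering::SeqCst) < target {
                self.actor
                    .perform()
                    .map_err(|e| format!("{}: {}", self.name, e))?;
                counter.fetch_add(1, Ordering::SeqCst);
            }
            Ok(self.actor)
        })
    }
}
```

Because each runner checks the counter before performing an operation and increments it afterwards, several runners sharing one counter may overshoot `target` by a few operations. That is acceptable for a sketch whose only goal is to show one thread per runner and one operation per loop iteration.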

 ##### Environment

 An environment's responsibility is to:

 * Store global state needed for the test.

 * Provide the exit criteria for the test.

 * Provide actors that run for the entire duration of the test.

 * Reset global state when requested by an actor.

 A stress test creates exactly one environment that lives for the entire duration of the test. A test
 writer must define an environment and provide it to the stress test framework.

 An environment can store global objects that outlive every instance of the system-under-test.
 For example, if a filesystem test operates on a block device backed by a VMO, store that VMO in the
 environment, so that even if an actor crashes the block device, the VMO remains intact and can be
 used to reset the state.

 The environment provides multiple runners to the framework, each containing an actor
 that can perform operations.

 On reset, an environment must update global state and the connections of each actor.
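
The responsibilities above, including the VMO example, can be sketched as follows. All names are hypothetical: a `Vec<u8>` behind an `Arc<Mutex<...>>` stands in for the VMO, `BlockDevice` stands in for the system-under-test, and `FsEnvironment` is a simplified illustration rather than the library's environment interface.

```rust
use std::sync::{Arc, Mutex};

/// Stand-in for the VMO: memory owned by the environment that outlives
/// every instance of the system-under-test.
pub type Backing = Arc<Mutex<Vec<u8>>>;

/// Hypothetical system-under-test: a block device layered on the backing memory.
pub struct BlockDevice {
    backing: Backing,
    pub generation: u32,
}

impl BlockDevice {
    pub fn write(&self, offset: usize, byte: u8) {
        let mut data = self.backing.lock().unwrap();
        data[offset] = byte;
    }

    pub fn read(&self, offset: usize) -> u8 {
        let data = self.backing.lock().unwrap();
        data[offset]
    }
}

/// Hypothetical actor: holds a connection to the current device instance.
pub struct WriterActor {
    pub device: Arc<BlockDevice>,
}

/// Hypothetical environment: global state, exit criteria, actors and reset.
pub struct FsEnvironment {
    backing: Backing,
    generation: u32,
    target_operations: Option<u64>,
    pub actors: Vec<WriterActor>,
}

impl FsEnvironment {
    pub fn new(size: usize, num_actors: usize, target_operations: Option<u64>) -> Self {
        let backing: Backing = Arc::new(Mutex::new(vec![0u8; size]));
        let device = Arc::new(BlockDevice { backing: Arc::clone(&backing), generation: 0 });
        let actors = (0..num_actors)
            .map(|_| WriterActor { device: Arc::clone(&device) })
            .collect();
        FsEnvironment { backing, generation: 0, target_operations, actors }
    }

    /// Exit criteria: stop after this many operations, or never if `None`.
    pub fn target_operations(&self) -> Option<u64> {
        self.target_operations
    }

    /// Reset: start a new device instance over the same backing memory and
    /// update the connection held by every actor.
    pub fn reset(&mut self) {
        self.generation += 1;
        let device = Arc::new(BlockDevice {
            backing: Arc::clone(&self.backing),
            generation: self.generation,
        });
        for actor in self.actors.iter_mut() {
            actor.device = Arc::clone(&device);
        }
    }
}
```

The design point the sketch tries to capture is ownership: the backing memory belongs to the environment rather than to any device instance, so data written before a reset is still readable through the new device that every actor is connected to afterwards.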