Fuzzing is a testing technique that feeds auto-generated inputs to a piece of target code in an attempt to crash the code. This technique finds security vulnerabilities and stability bugs that other testing might miss. You can see Fuchsia fuzzing trophies in Monorail by using the component:Security>clusterfuzz reporter:clusterfuzz@chromium.org filter.
This guide focuses on LibFuzzer, an in-process fuzzing engine.
Add a fuzzer template to the appropriate BUILD.gn.
Add a fuzzers_package in an appropriate BUILD.gn.
Make sure there is a path from a top-level target, e.g. //bundles:tests, to your fuzzers package.
fx qemu -N
fx set core.x64 --fuzz-with asan --with //bundles:tests --with //garnet/packages/products:devtools
fx build
fx serve
$ fx fuzz list
$ fx fuzz <fuzzer>
$ fx fuzz check <fuzzer>
$ fx fuzz repro <fuzzer> [crash]
File bugs for any crashes using the labels found-by-fuzzing, Sec-TriageMe, and libfuzzer.
A: Fuzzing, or fuzz testing, is a style of testing that stochastically generates inputs to targeted interfaces in order to automatically find defects and/or vulnerabilities. This document distinguishes between two components of a fuzzer: the fuzzing engine, which produces context-free inputs, and the fuzz target function, which submits those inputs to a specific interface.
Among the various styles of fuzzing, coverage-based fuzzing has been shown to yield a particularly high number of bugs for the effort involved. In coverage-based fuzzing, the code under test is instrumented for coverage. The fuzzing engine can observe when inputs increase the overall code coverage and use those inputs as the basis for generating further inputs. This group of “seed” inputs is collectively referred to as a corpus.
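The feedback loop described above can be illustrated with a toy engine. This is only a sketch under stated assumptions: `RunTarget`, `ToyFuzz`, and the hand-numbered branch IDs are hypothetical stand-ins. A real engine such as libFuzzer gets its coverage data from compiler instrumentation and uses far richer mutations.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <set>
#include <vector>

// Hypothetical "instrumented" target: reports which branches an input reached.
// Real coverage comes from compiler instrumentation, not hand-written IDs.
std::set<int> RunTarget(const std::vector<uint8_t>& input) {
  std::set<int> hit = {0};  // Branch 0: function entry.
  if (input.size() > 0 && input[0] == 'F') {
    hit.insert(1);
    if (input.size() > 1 && input[1] == 'U') {
      hit.insert(2);
      if (input.size() > 2 && input[2] == 'Z') {
        hit.insert(3);
      }
    }
  }
  return hit;
}

struct FuzzResult {
  std::vector<std::vector<uint8_t>> corpus;  // Inputs that added coverage.
  std::set<int> coverage;                    // Every branch reached so far.
};

// Toy engine: mutate a corpus entry and keep the mutant only if it reaches a
// branch that no earlier input reached.
FuzzResult ToyFuzz(unsigned seed, int iterations) {
  std::mt19937 rng(seed);
  FuzzResult result;
  result.corpus.emplace_back();  // Start from a single empty input.
  result.coverage = RunTarget(result.corpus[0]);
  for (int i = 0; i < iterations; ++i) {
    std::vector<uint8_t> input = result.corpus[rng() % result.corpus.size()];
    if (input.empty() || rng() % 2 == 0) {
      input.push_back(static_cast<uint8_t>(rng() % 256));  // Append a byte.
    } else {
      input[rng() % input.size()] = static_cast<uint8_t>(rng() % 256);  // Overwrite a byte.
    }
    bool found_new = false;
    for (int branch : RunTarget(input)) {
      found_new |= result.coverage.insert(branch).second;
    }
    if (found_new) result.corpus.push_back(input);
  }
  return result;
}
```

Because each saved input becomes the starting point for later mutations, the engine reaches the nested branches one byte at a time instead of having to guess all three bytes at once.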
A: LibFuzzer is an in-process fuzzing engine integrated within LLVM as a compiler runtime. Compiler runtimes are libraries that are invoked by hooks that the compiler adds to the code it builds. Other examples include sanitizers such as ASan, which detects certain overflows and memory corruptions. LibFuzzer uses these sanitizers both for the coverage data provided by sanitizer-common and to detect when inputs trigger a defect.
A: LibFuzzer can be used to make a coverage-based fuzzer binary by combining it with a sanitized library and the implementation of the fuzz target function:
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  // Use the data to do something interesting with your API
  return 0;
}
Optionally, you can also add an initial corpus. Without it, libFuzzer will start from an empty corpus and will (eventually) learn how to make appropriate inputs on its own.
LibFuzzer will then be able to generate, submit, and monitor inputs to the library code. In the accompanying diagram, developer-provided components are in green.
A: Coverage-based fuzzing works best when fuzz targets resemble unit tests. If your code is already organized to make it easy to unit test, you can add targets for each of the interfaces being tested, e.g. something like:
// std::string json = ...;
Metadata metadata;
EXPECT_TRUE(metadata.Parse(json));
becomes:
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  // uint8_t and char are unrelated types, so reinterpret_cast is required.
  std::string json(reinterpret_cast<const char *>(Data), Size);
  Metadata metadata;
  metadata.Parse(json);
  return 0;
}
With a corpus of JSON inputs, Data may be close to what the Metadata object expects to parse. If not, the fuzzer will eventually discover what inputs are meaningful to it through random mutations, trial and error, and code coverage data.
A: The FuzzedDataProvider library helps you map portions of the provided Data to “plain old data” (POD) types. More complex objects can almost always be (eventually) built out of POD types and variable arrays.
#include <cstddef>
#include <cstdint>
#include <string>
#include <fuzzer/FuzzedDataProvider.h>

// Parser and MAX_NAME_LEN come from the library under test.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  FuzzedDataProvider fuzzed_data(Data, Size);
  auto flags = fuzzed_data.ConsumeIntegral<uint32_t>();
  auto name_len = fuzzed_data.ConsumeIntegralInRange<size_t>(0, MAX_NAME_LEN - 1);
  std::string name = fuzzed_data.ConsumeBytesAsString(name_len);
  Parser parser(name.c_str(), flags);
  auto remaining = fuzzed_data.ConsumeRemainingBytes<char>();
  parser.Parse(remaining.data(), remaining.size());
  return 0;
}
Note that using this library for splitting your data might make it harder for you to provide a corpus for your fuzzer, as the splitting happens dynamically. Other alternatives are explored in the split inputs documentation.
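One of the alternatives for splitting inputs is a fixed "magic separator", which keeps corpus files easy to write by hand because each part appears verbatim in the file. The following is a minimal sketch of that idea. The separator bytes, `SplitInput`, and `ParseNamed` are all hypothetical names chosen for illustration and are not part of any Fuchsia or libFuzzer API.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical separator; any byte sequence unlikely to occur in real inputs works.
static const uint8_t kSeparator[] = {0xDE, 0xAD, 0xBE, 0xEF};

// Splits [data, data + size) on kSeparator into consecutive parts.
std::vector<std::string> SplitInput(const uint8_t* data, size_t size) {
  std::vector<std::string> parts;
  const uint8_t* pos = data;
  const uint8_t* end = data + size;
  while (true) {
    const uint8_t* sep =
        std::search(pos, end, std::begin(kSeparator), std::end(kSeparator));
    parts.emplace_back(reinterpret_cast<const char*>(pos),
                       static_cast<size_t>(sep - pos));
    if (sep == end) break;  // No further separator: this was the last part.
    pos = sep + sizeof(kSeparator);
  }
  return parts;
}

// Hypothetical API under test that takes two independent inputs.
int ParseNamed(const std::string& name, const std::string& body) {
  return static_cast<int>(name.size() + body.size());
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* Data, size_t Size) {
  std::vector<std::string> parts = SplitInput(Data, Size);
  if (parts.size() != 2) return 0;  // Expect exactly a name and a body.
  ParseNamed(parts[0], parts[1]);
  return 0;
}
```

The trade-off is that the fuzzer has to discover the separator itself, so a seed corpus or a dictionary entry containing it is likely to help.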
In some cases, you may have expensive set-up operations that you would like to do once. The libFuzzer documentation has tips on how to do startup initialization. Be aware though that such state will be carried over from iteration to iteration. This can be useful as it may expose new bugs that depend on the library's persisted state, but it may also make bugs harder to reproduce when they depend on a sequence of inputs rather than a single one.
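A common way to do one-time setup is a function-local static inside the fuzz target. libFuzzer also recognizes an optional `LLVMFuzzerInitialize` hook. The sketch below uses the static approach. `Dictionary`, `LoadDictionary`, and `g_init_count` are hypothetical names that exist only to make the one-time behavior visible.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical expensive resource, e.g. a table loaded once per process.
struct Dictionary {
  std::vector<std::string> words;
};

static int g_init_count = 0;  // Only here to make the one-time setup observable.

static Dictionary* LoadDictionary() {
  ++g_init_count;
  return new Dictionary{{"moo", "baa", "oink"}};
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* Data, size_t Size) {
  // Function-local static: initialized on the first call, reused afterwards.
  static Dictionary* dictionary = LoadDictionary();
  std::string input(reinterpret_cast<const char*>(Data), Size);
  for (const std::string& word : dictionary->words) {
    if (input == word) break;  // Stand-in for calling the real API under test.
  }
  return 0;
}
```

Keeping the shared object read-only, as here, avoids the reproducibility problem mentioned above, since no iteration can change what the next one sees.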
What if you need more Data than what libfuzzer provides?
If Size isn't long enough for your needs, you can simply return 0. The fuzzer will quickly learn that inputs below that length aren't interesting and will stop generating them.
By default, libfuzzer generates inputs with a maximum size of 4096 bytes. If you need to generate larger inputs, you can provide the -max_len flag to fx fuzz start. If you provide a large enough corpus input, libfuzzer will increase the maximum size to the size of that input.
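The early return for short inputs might look like the following sketch. `Header`, `HandleMessage`, and `g_parsed` are hypothetical and stand in for whatever fixed-size prefix and API your code under test expects.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical fixed-size header that the code under test expects.
struct Header {
  uint32_t magic;
  uint16_t version;
  uint16_t flags;
};

static int g_parsed = 0;  // Counts inputs that were long enough to be used.

// Hypothetical API under test.
void HandleMessage(const Header& header, const uint8_t* body, size_t body_len) {
  (void)header;
  (void)body;
  (void)body_len;
  ++g_parsed;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* Data, size_t Size) {
  if (Size < sizeof(Header)) return 0;  // Too short to be interesting.
  Header header;
  std::memcpy(&header, Data, sizeof(header));  // Copy to avoid unaligned reads.
  HandleMessage(header, Data + sizeof(header), Size - sizeof(header));
  return 0;
}
```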
A: In general, for an in-process coverage-based fuzzer, iterations should be short and focused. The more focused a fuzz target is, the faster libFuzzer will be able to find “interesting” inputs that increase code coverage.
At the same time, becoming too focused can lead to a proliferation of fuzz targets. Consider the example of a routine that parses incoming requests. The parser may recognize dozens of different request types, so developing a separate fuzz target for each may be cumbersome. An alternative in this case may be to develop a single fuzzer, and include examples of the different requests in the initial corpus. In this way the single fuzz target can still bypass a large amount of shallow fuzzing by being guided towards the interesting inputs.
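As a rough sketch of the single-target approach, the fuzz target below hands the whole input to one parser entry point, and the seed corpus supplies one example per request type. `RequestType`, `ParseRequest`, and `kSeedCorpus` are hypothetical. On disk, each seed would be a separate file in the initial corpus directory.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical request parser that recognizes several request types.
enum class RequestType { kUnknown, kGet, kPut, kDelete };

RequestType ParseRequest(const std::string& request) {
  if (request.compare(0, 4, "GET ") == 0) return RequestType::kGet;
  if (request.compare(0, 4, "PUT ") == 0) return RequestType::kPut;
  if (request.compare(0, 7, "DELETE ") == 0) return RequestType::kDelete;
  return RequestType::kUnknown;
}

// One fuzz target covers every request type; the corpus provides the variety.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* Data, size_t Size) {
  std::string request(reinterpret_cast<const char*>(Data), Size);
  ParseRequest(request);
  return 0;
}

// Hypothetical seed inputs, one per request type.
const char* const kSeedCorpus[] = {"GET /moo", "PUT /moo loud", "DELETE /moo"};
```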
Note: Currently, libFuzzer can be used in Fuchsia to fuzz C/C++ code. Additional language support is planned.
A: There are many other fuzzing engines out there.
Note: AFL is not yet supported on Fuchsia.
If you want to fuzz a service through FIDL calls in the style of an integration test, see Fuzzing FIDL Servers with LibFuzzer on Fuchsia.
If none of these options fit your needs, you can still write a custom fuzzer and have it run continuously under ClusterFuzz.
A: First, create your fuzz target function. It's recommended that the fuzzer's target be clear from the file name. If the library code already has a directory for unit tests, you should use a similar directory for your fuzzing targets. If not, make sure the file's name clearly reflects that it is a fuzzer binary. In general, use naming and location to make the fuzzer easy to find and its purpose clear.
Example: A fuzzer for //src/lib/cmx might be located at //src/lib/cmx/cmx_fuzzer.cc, to match //src/lib/cmx/cmx_unittest.cc.
Libfuzzer already provides tips on writing the fuzz target function itself.
Next, add the build instructions to the library's BUILD.gn file. Adding an import to //build/fuzzing/fuzzer.gni will provide two templates:
The fuzzer template is used to build the fuzzer executable. Given a fuzz target function in a source file and the library under test as a dependency, it will provide the correct compiler flags to link against the fuzzing engine:
import("//build/fuzzing/fuzzer.gni")
fuzzer("cowsay_simple_fuzzer") {
  sources = [ "cowsay_fuzzer.cpp" ]
  deps = [ ":cowsay_sources" ]
}
It also enables the FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION build macro. If the software under test needs fuzzing-specific modifications, they can be wrapped in a preprocessor conditional on this macro, e.g.:
#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
  // Deterministic pseudo-random values make fuzzer runs reproducible.
  srand(++global_counter);
  rand_int = rand();
#else
  zx_cprng_draw(&rand_int, sizeof(rand_int));
#endif
This can be useful to allow more deterministic fuzzing, deeper coverage, or both.
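For the "deeper coverage" case, a typical use of the macro is to skip a check that random inputs almost never pass, such as a checksum. The following is a sketch under assumptions: `Checksum`, `AcceptPacket`, and the packet layout are hypothetical and only illustrate the pattern.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical additive checksum over a payload.
uint8_t Checksum(const uint8_t* data, size_t size) {
  uint8_t sum = 0;
  for (size_t i = 0; i < size; ++i) sum = static_cast<uint8_t>(sum + data[i]);
  return sum;
}

// Hypothetical packet layout: payload bytes followed by a one-byte checksum.
bool AcceptPacket(const uint8_t* packet, size_t size) {
  if (size < 1) return false;
#ifndef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
  // Production builds reject packets whose checksum does not match.
  if (Checksum(packet, size - 1) != packet[size - 1]) return false;
#endif
  // Fuzzing builds skip the check, so mutated inputs still reach the code
  // that processes the payload.
  return true;
}
```

As the macro name suggests, a build with this check disabled must never ship, so keep such modifications small and clearly scoped.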
The fuzzer template also allows you to include additional inputs to control the fuzzer:
import("//build/fuzzing/fuzzer.gni")
fuzzer("cowsay_simple_fuzzer") {
  sources = [ "cowsay_fuzztest.cpp" ]
  deps = [ ":cowsay_sources" ]
  dictionary = "test_data/various_moos.dict"
  options = "test_data/fuzzer.opts"
}
When you use the fx fuzz tool, libFuzzer's merge, jobs, dict, and artifact_prefix options are set automatically. You do not need to specify these options unless they differ from the default values.
The fuzzers_package template bundles fuzzers into a Fuchsia package in the same way that a normal package bundles binaries.
fuzzers_package("cowsay_fuzzers") {
  fuzzers = [ ":cowsay_simple_fuzzer" ]
}
By default, the package will support all sanitizers. This can be restricted by providing an optional “sanitizers” list, e.g. sanitizers = [ "asan", "ubsan" ]
Once defined, a package needs to be included in the build dependency graph like any other test package. This typically means adding it to a group of tests, e.g. a group("tests") target.
IMPORTANT: The Fuchsia build system will build the fuzzers only if it is explicitly told to instrument them for fuzzing with an appropriate sanitizer. The easiest way to achieve this is using the --fuzz-with <sanitizer> flag with fx set, e.g.:
$ fx set core.x64 --fuzz-with asan --with //bundles:tests --with //garnet/packages/products:devtools
$ fx build
Zircon has a different fuzzer.gni template from the rest of Fuchsia, but it is used similarly:
import("$zx/public/gn/fuzzer.gni")
fuzzer("zx-fuzzer") {
  sources = [ "zx_fuzzer.cpp" ]
  deps = [ ":zx_sources" ]
}
TODO(41279): Update these docs with zircon fuzzing instructions.
NOTE: Due to gn unification, you will also need to manually add the binary targets to build/unification/images/BUILD.gn:
aggregate_manifest("legacy-image") {
  deps = [
    (...)
    ":bin.zx-fuzzer.asan",
    ":bin.zx-fuzzer.asan-ubsan",
    ":bin.zx-fuzzer.ubsan",
Zircon fuzzers will be built with all supported sanitizers automatically. These fuzzers can be included in a Fuchsia instance by including the zircon_fuzzers package, e.g.:
$ fx set core.x64 --with //garnet/tests/zircon:zircon_fuzzers --with //garnet/packages/products:devtools
$ fx build
Note that Zircon fuzzers must have names that end in “-fuzzer”.
A: Use the fx fuzz tool.
The fuzzer binary can be started directly, using the normal libFuzzer options, if you prefer. However, it is easier to use the fx fuzz devshell tool, which knows where to look for fuzzing-related files and understands various common options. Try one or more of the following:
$ fx fuzz help
$ fx fuzz list
fx fuzz [package]/[fuzzer]
(Ignore errors of the form "Error: no such package." These come from CIPD and should not affect the fuzzer!)
Here, package and fuzzer match those reported by fx fuzz list, and may be abbreviated. For commands that accept a single fuzzer, e.g. check, the abbreviated name must uniquely identify exactly one fuzzer.
When starting a fuzzer, the tool will echo the command it is invoking, prefixed by +. This can be useful if you need to manually reproduce the bug with modified libFuzzer options.
A: Use the fx fuzz tool:
fx fuzz check [package]/[fuzzer]
fx fuzz repro [package]/[fuzzer]
The test artifacts are also copied to //test_data/fuzzing/<package>/<fuzzer>/<timestamp>. The most recent fuzzer run is symbolically linked to //test_data/fuzzing/<package>/<fuzzer>/latest.
As with fx fuzz start, the fuzzer will echo the command it is invoking, prefixed by +. This can be useful if you need to manually reproduce the bug with modified parameters.
A: File them, then fix them!
Note: The bug tracker is currently only open to Googlers.
When filing bugs, please use the following custom labels: found-by-fuzzing, libfuzzer and Sec-TriageMe. This will help the security team see where fuzzers are being used and stay aware of any critical issues they are finding.
As with other potential security issues, bugs should be filed in the component of the code under test (and not in the security component). Conversely, if you encounter problems or shortcomings in the fuzzing framework itself, please do open bugs or feature requests in the security component with the label libFuzzer.
As with all potential security issues, don't wait for triage to begin fixing the bug! Once fixed, don't forget to link to the bug in the commit message. This may also be a good time to consider minimizing and uploading your corpus (see the next section).
A: When you first begin fuzzing a new target, the fuzzer may crash very quickly. Typically, fuzzing has a large initial spike of defects found followed by a long tail. Fixing these initial, shallow defects will allow your fuzzer to reach deeper and deeper into the code. Eventually your fuzzer will run for several hours (e.g. overnight) without crashing. At this point, you will want to save the corpus.
To do this, use the fx fuzz tool: fx fuzz merge <package>/<fuzzer>
This will pull down the current corpus from CIPD, merge it with your corpus on the device, minimize it, and upload it to CIPD as the new latest corpus.
When uploaded, the corpus is tagged with the current revision of the integration branch. If needed, you can retrieve older versions of the corpus relating to a specific version of the code: fx fuzz fetch <package>/<fuzzer> <integration-revision>
A: Yes, by fetching the corpus and then performing a normal corpus update:
fx fuzz fetch --no-cipd --staging /path/to/third/party/corpus [package]/[fuzzer]
fx fuzz merge [package]/[fuzzer]
A: Yes, although the extra tooling of fx fuzz is not currently supported. This means you can build host fuzzers with the GN templates, but you'll need to manually run them, reproduce the bugs they find, and manage their corpus data.
If your fuzzers don't have Fuchsia dependencies, you can build host versions simply by setting fuzz_host = true in the fuzzers_package GN target:
fuzzers_package("overnet_fuzzers") {
  fuzzers = [ "packet_protocol:packet_protocol_fuzzer" ]
  fuzz_host = true
}
Upon building, the host fuzzers can be found in the host variant output directory, e.g. //out/default/host_x64-asan-fuzzer.
A: Once crashes begin to become infrequent, it may be because almost all the bugs have been fixed, but it may also be because the fuzzer isn't reaching new code that still has bugs. Code coverage information is needed to determine the quality of the fuzzer. Use source-based code coverage to see what your current corpus reaches.
Note: Source-based code coverage is under active development.
If coverage in a certain area is low, there are a few options, including guarding fuzzing-specific modifications with the FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION build macro described above.
The “run, merge, measure, improve” steps can be repeated for as many iterations as you feel are needed to create a quality fuzzer. Once ready, you'll need to upload your corpus and update the GN fuzzer in the appropriate project. At this point, others will be able to use your fuzzer. This includes ClusterFuzz, which will automatically find new fuzzers and continuously fuzz them, updating their corpora, filing bugs for crashes, and closing them when fixed.
Note: ClusterFuzz integration is in development.
A: As you can see from the various notes in this document, there's still plenty more to do!
We will continue to work on these features and others, and update this document accordingly as they become available.