src/performance/experimental/profiler/README.md

CPU Profiler

This is an experimental CPU profiler that samples stack traces and outputs them in pprof format.

Usage:

See Profiling Cpu Usage for how to profile CPU usage with this profiler.

Kernel assistance

Experimental kernel-assisted sampling via zx_sampler_create can be enabled with the GN flag experimental_thread_sampler_enabled=true, which reduces sampling time to single-digit us per sample.

If the flag is not enabled, the sampler falls back to userspace-based sampling: it uses the root resource to periodically suspend the target threads, reads stack traces from the target using zx_process_read_memory and the Fuchsia unwinder, then exfiltrates the stack data. In this fallback mode, taking a sample using frame pointers takes roughly 300us[^1], so a relatively low sample rate is recommended to avoid overly perturbing the profiled program. For reference, 50 samples per second corresponds to about 1.5% sampling overhead (50 × 300us = 15ms of every second).
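The overhead estimate can be reproduced for other sample rates with a small calculation. The following is a minimal sketch, not code from the profiler: the names SamplingOverhead and MaxSampleRate are hypothetical, and it assumes the per-sample cost is constant and that sampling cost is the only source of perturbation.

```cpp
#include <cassert>

// Hypothetical helpers for illustration only; they are not part of the
// profiler sources.
constexpr double kMicrosPerSecond = 1'000'000.0;

// Fraction of each second spent taking samples, assuming every sample
// costs a fixed `micros_per_sample`.
double SamplingOverhead(double samples_per_second, double micros_per_sample) {
  assert(samples_per_second >= 0 && micros_per_sample >= 0);
  return samples_per_second * micros_per_sample / kMicrosPerSecond;
}

// Highest sample rate (samples per second) that keeps the overhead at or
// below `overhead_budget`, expressed as a fraction (0.015 means 1.5%).
double MaxSampleRate(double overhead_budget, double micros_per_sample) {
  assert(overhead_budget >= 0 && micros_per_sample > 0);
  return overhead_budget * kMicrosPerSecond / micros_per_sample;
}
```

With the fallback figure of roughly 300us per sample, SamplingOverhead(50, 300) comes out at about 0.015, which matches the 1.5% quoted above, and MaxSampleRate(0.015, 300) gives about 50 samples per second.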

Note: Because experimental_thread_sampler_enabled=true isn't enabled in CI/CQ yet, integration tests need to be run locally with the build flag enabled. CI/CQ results reflect the state of the zx_process_read_memory-based implementation.

[^1]: Numbers measured on core.x64-qemu.