Authors: @lalitm
Status: Draft
Loading large traces (multi-GB) into Trace Processor is slow. Parsing protobuf trace data, sorting events, and populating tables can take minutes for large traces. This cost is paid on every load, even when the trace hasn't changed.
This is painful in several workflows, for example when re-creating a TraceProcessor instance on the same trace during development.

The parsed table representation is deterministic for a given trace + TP version + flags. Re-parsing is pure waste when the inputs haven't changed.
Add a parse cache that serializes parsed tables into an opaque binary format and transparently loads from cache on subsequent opens. The caching logic lives entirely in the shell layer — the TP core remains IO-free.
The parse cache format is an internal implementation detail of Trace Processor. It is not stable, not versioned for external consumers, and carries no forwards or backwards compatibility guarantees. The format may change between any two TP releases without notice.
Currently, the cache is a TAR archive of Arrow IPC files (one per intrinsic table), reusing the existing ExportToArrow() / Arrow TAR import infrastructure. This is a convenient implementation choice, not a public contract.
A separate, stable Arrow export feature (with explicit versioning and compatibility guarantees) will be provided independently for users who need a durable, interoperable representation. The parse cache is not that — it is purely an acceleration mechanism tied to a specific TP build.
The cache is identified by:
```
SHA256(tp_version + sorted_relevant_global_flags + trace_identity)
```
Where `trace_identity` is:
| Client | Identity |
|---|---|
| CLI shell | file path + size + mtime (from stat()) |
| Browser/UI | File.name + File.size + File.lastModified |
| Python | file path + size + mtime (from os.stat()) |
This is fast to compute (no file content hashing) and sufficient for practical invalidation. The chance of a false cache hit (different trace content with identical name + size + mtime) is negligible.
“Relevant global flags” are those that affect parsing behavior: --full-sort, --no-ftrace-raw, --crop-track-events, etc.
The inclusion of tp_version in the cache key means that upgrading TP automatically invalidates all existing caches. This is intentional — schema changes between versions make old caches unsafe to load.
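As a sketch of the key derivation above (the function name and flag representation are illustrative, not the actual implementation), the CLI shell's computation might look like:

```python
import hashlib
import os


def compute_cache_key(tp_version: str, flags: dict[str, str], trace_path: str) -> str:
    """Illustrative sketch of:
    SHA256(tp_version + sorted_relevant_global_flags + trace_identity)."""
    st = os.stat(trace_path)
    # CLI identity: path + size + mtime. No content hashing, so this is O(1).
    trace_identity = f"{trace_path}:{st.st_size}:{st.st_mtime_ns}"
    # Only flags that affect parsing behavior, in a deterministic order.
    sorted_flags = ",".join(f"{k}={v}" for k, v in sorted(flags.items()))
    h = hashlib.sha256()
    for part in (tp_version, sorted_flags, trace_identity):
        h.update(part.encode("utf-8"))
    return h.hexdigest()
```

Because `tp_version` is hashed in, any TP upgrade yields a different digest and therefore a clean cache miss, with no explicit invalidation logic needed.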
The cache lives in `~/.cache/perfetto/parse-cache/<hash>` by default, overridable via `--parse-cache-dir`. The cache directory is created on first write.
The `--parse-cache` global flag enables caching in the CLI shell:

```
trace_processor_shell --parse-cache query -c "SELECT ..." trace.pb
```
Behavior:
1. Compute the cache key for the trace.
2. On a cache hit, feed the cached data to `Parse()` instead of the original trace. Skip to step 5.
3. On a cache miss, `Parse()` as normal.
4. After parsing, write the cache on a background thread via `ExportToArrow()`.
5. On exit, wait for any in-flight cache write to complete.

The background write means the first load has no added latency for queries. The wait-on-exit means the cache is guaranteed to exist after the first invocation completes.
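The check-hit-else-parse-and-write flow can be sketched as follows. `tp`, `cache`, and the helper names (`parse`, `export`, `get`, `put`) are hypothetical stand-ins for the real TP and cache APIs, not actual interfaces:

```python
import threading


def open_trace(tp, trace_path: str, cache, key: str):
    """Sketch of the shell-side behavior. Returns the background writer
    thread on a cache miss (the shell joins it on exit), else None."""
    cached = cache.get(key)  # cached bytes on a hit, else None
    if cached is not None:
        # Cache hit: feed the cached representation to Parse() instead
        # of the original trace; no background write needed.
        tp.parse(cached)
        return None
    # Cache miss: parse the original trace as normal.
    with open(trace_path, "rb") as f:
        tp.parse(f.read())
    # Write the cache on a background thread so the first load adds no
    # query latency; the shell waits for this thread before exiting.
    writer = threading.Thread(target=lambda: cache.put(key, tp.export()))
    writer.start()
    return writer
```

Deferring the write to a background thread is what makes the first invocation "free": queries run against the freshly parsed tables while the cache is serialized concurrently.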
The `parse-cache` subcommand provides explicit cache management for human users:
```
# Create a parse cache for a trace
trace_processor_shell parse-cache create trace.pb

# Show cache status for a trace (exists, size, staleness)
trace_processor_shell parse-cache info trace.pb

# Delete cache for a specific trace
trace_processor_shell parse-cache clear trace.pb

# Delete all parse caches
trace_processor_shell parse-cache clear --all
```
To support caching for RPC clients (UI, Python), the first `/parse` call gains an optional `trace_identity` field:
```
message ParseRequest {
  optional bytes data = 1;

  // Optional. Sent only on the first /parse call. If the shell has a
  // valid parse cache for this identity, it loads from cache and ignores
  // subsequent /parse data.
  optional string trace_identity = 2;
}

message ParseResult {
  optional string error = 1;

  // If true, the shell loaded from cache. The client may skip sending
  // further /parse chunks. Old clients that ignore this field and
  // continue sending data are fine — the shell discards the bytes.
  optional bool skip_further_parse = 2;
}
```
Flow:
1. The client sends the first `/parse` chunk with `trace_identity` set.
2. On a cache hit, the shell responds with `skip_further_parse = true` and discards data from this and all subsequent `/parse` calls.
3. On a cache miss, the shell responds with `skip_further_parse = false` and continues accepting `/parse` chunks.
4. On `notify_eof`, if the cache was missed, the shell writes the cache in the background.

Backwards compatibility:

- Old clients that never set `trace_identity`: no caching, existing behavior unchanged.
- Clients that ignore `skip_further_parse` and keep sending: the shell discards the bytes, everything still works.

This is fully backwards compatible in both directions.
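The client side of this flow can be sketched as below. `post(chunk, identity)` is a hypothetical stand-in for one `/parse` RPC returning the `ParseResult` fields as a dict; the real transport (HTTP, WebSocket) is out of scope here:

```python
def send_trace(post, chunks, trace_identity: str) -> bool:
    """Client-side sketch of the /parse flow above. Returns True if the
    shell served the trace from its parse cache."""
    first = True
    for chunk in chunks:
        # Only the first /parse call carries the trace identity.
        resp = post(chunk, trace_identity if first else None)
        first = False
        if resp.get("skip_further_parse"):
            # Cache hit: the shell already has the parsed tables, so the
            # remaining chunks need not be sent at all.
            return True
    return False
```

A client that simply ignores the return value and keeps sending chunks still works, matching the compatibility guarantee above: the shell discards the extra bytes.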
Caching is configured via TraceProcessorConfig:
```python
from perfetto.trace_processor import TraceProcessor, TraceProcessorConfig

config = TraceProcessorConfig(parse_cache=True)
tp = TraceProcessor(trace="large.pb", config=config)
```
Implementation: parse_cache=True causes the Python wrapper to pass --parse-cache to the shell subprocess. The Python client also sends trace_identity (from os.stat()) on the first /parse call so the shell can check its cache. If the response has skip_further_parse = true, Python skips sending remaining chunks.
No explicit cache management API in Python. Users who want manual control can use the CLI parse-cache subcommand.
`__intrinsic_trace_file`, `__intrinsic_trace_import_logs`, and similar metadata tables are excluded from the cache.

The cache size is roughly comparable to the parsed table data, which can be a significant fraction of the original trace size. Mitigations:
- Opt-in only (via the `--parse-cache` flag or `parse-cache create`). Users make a conscious choice to spend disk space.
- `parse-cache info` shows cache sizes.
- `parse-cache clear --all` provides easy cleanup.
- The shell reports the size on write, e.g. `Parse cache written: 8.2 GB at ~/.cache/perfetto/parse-cache/a1b2c3`.

Architecture overview:

```
+------------------+     +-------------------+     +---------------+
| Clients          |     | Shell layer       |     | TP core       |
|                  |     | (IO lives here)   |     | (IO-free)     |
| CLI user         |---->|                   |     |               |
| Browser/UI       |---->| Cache check/write |---->| Parse()       |
| Python API       |---->| File I/O          |     | ExportToArrow |
|                  |     | Background thread |     |               |
+------------------+     +-------------------+     +---------------+
```
TP core provides the primitives (ExportToArrow, import via Parse). The shell layer orchestrates caching. RPC clients provide trace identity metadata. No caching logic in TP core or the RPC layer itself.
Add cache awareness to TraceProcessor directly (e.g., in NotifyEndOfFile() or Config).
Pro:
Con:
Use the existing ExportTraceToDatabase() as the cache format instead.
Pro:
Con:
Enable caching automatically without user opt-in.
Pro:
Con:
Use SHA256 of the full trace content as the cache key instead of path + size + mtime.
Pro:
Con:
- Is `parse-cache clear` sufficient?
- Should `parse-cache create` support creating caches for multiple traces at once (e.g., `parse-cache create *.pb`)?
- Is `~/.cache/perfetto/parse-cache/` always correct as the default location?