This guide is about extending the training tools to support new optimization problems. It is assumed the necessary LLVM changes have been made - i.e. instrumenting the optimization pass with a way to carry out decision making via a trained model, training log collection - see the lib/Analysis/MLInlineAdvisor.cpp and lib/CodeGen/MLRegallocEvictAdvisor.cpp for examples.
Refer to compiler_opt/rl/inlining
or compiler_opt/rl/regalloc
.
create a directory peer to inlining
and regalloc
. This placement is not necessary, but sufficient for illustration.
define the implementation of compiler_opt.rl.compilation_runner.CompilationRunner
that's specific to your problem. Refer to the examples. Note how we always start processes via the compiler_opt.rl.start_cancellable_process()
utility.
define the ML interface - see the config.py
file in each of the examples.
extend compiler_opt.rl.problem_configuration.ProblemConfiguration
. Make the new class gin-configurable. By convention, define this in the __init__.py
.
place specific gin configs in the subdirectory, as well as vocab (these are optional, but likely necessary). A convention here is to make sure your gin files make the configurable config_registry.get_configuration.implementation
point to your implementation of ProblemConfiguration
. See the common.gin
files in our examples. This allows any tool to just pick up your problem when pointing it (via --gin_files
) to your problem.
You can have multiple gin files for different algorithm configurations, and reuse common settings (like the above) via gin's import
mechanism. See our examples where we have different configs for PPO or behavioral cloning.
compiler_opt.rl.registry.py
, under the “Register implementations” comment.‘compilation problem’ is an optimization problem with a specific way of invoking clang and specific features and tensorflow topologies. The component model requires all these be exported in a class implementing ProblemConfiguration below, however, to avoid cycle dependencies in Bazel environments, do not explicitly inherit from it.
Internally, all the module's implementation parameters are expected to be gin-initialized.
Existing tools (e.g. train_locally.py
) will just transparently use your new component if you point the tool to one of your gin files. This assumes your gin file binds config_registry.get_configuration.implementation
as described:
--gin_bindings=config_registry.get_configuration.implementation=@configs.InliningConfig
To use in a new tool:
just get a ProblemConfiguration object in your python:
config = problem_configuration.get_configuration()
make sure your tool also exposes --gin_files
and --gin_bindings
and bootstraps gin.
to avoid long binding names, use the runners
module name for the CompilationRunner
implementation, and use the configs
module name for the implementation of ProblemConfiguration
.
the CompilationRunner
gin initialization should initialize to None, and use, the clang_path
and launcher_path
macros (https://github.com/google/gin-config#syntax-quick-reference):
clang_path = None launcher_path = None runners.MyCompilationRunner.clang_path = %clang_path runners.MyCompilationRunner.launcher_path = %launcher_path
Use a similar pattern for problem-specific additional flags (see inlining's llvm_size_path
for example). When running tools, this allows the user pass common flags transparently wrt the underlying runner - i.e. if swapping 2 runners, the clang flag stays the same: --gin_bindings=clang_path="'/foo/bar/clang'"