commit | fb4a09b2a556602acddc69454d1915d2409831eb | |
---|---|---|
author | Mircea Trofin <mtrofin@google.com> | Tue Oct 25 17:49:32 2022 -0700 |
committer | Mircea Trofin <mtrofin@google.com> | Tue Oct 25 17:49:32 2022 -0700 |
tree | 72e2a3c6c706d10d16ee452008a1c012b64df6de | |
parent | 91fbd44341435c9c1de1b9afb443d86ba2a8800d | |
Introduce a Nesting Worker

In some distributed object middleware, there's a limitation that one remote process hosts only one object instance. During training, we have 'lull' periods when we don't collect data for an experiment (because we run the training algo on the data). If we run a bunch of related experiments, we can reuse the remote resource during that time. This can be advantageous if the total resource utilization of 2N remote tasks used at (say) 50% is higher than that of N tasks used at 100% (which is usually the case, since each task carries some bookkeeping cost).

The design uses a special kind of worker which manages a bunch of underlying workers to which it delegates all calls.

The current patch just introduces the bare minimum support. Subsequent patches will add better error handling (e.g. worker id doesn't exist), and potentially some load balancing support, if needed.
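The delegation idea in the commit message can be sketched in a few lines of Python. This is a minimal illustration and not the code introduced by the patch: `EchoWorker`, `NestingWorker`, the `call` method, and the worker ids are all hypothetical names chosen for the example.

```python
from typing import Any, Callable, Dict


class EchoWorker:
    """Hypothetical underlying worker, used only for this illustration."""

    def __init__(self, name: str):
        self._name = name

    def greet(self, who: str) -> str:
        return f"{self._name} greets {who}"


class NestingWorker:
    """Hosts several underlying workers inside a single object instance."""

    def __init__(self, factories: Dict[str, Callable[[], Any]]):
        # Build one underlying worker per id; all of them share this process.
        self._workers = {wid: make() for wid, make in factories.items()}

    def call(self, worker_id: str, method: str, *args: Any, **kwargs: Any) -> Any:
        # Delegate the call to the addressed worker. As in the bare-minimum
        # design described above, an unknown id is not handled specially:
        # the dictionary lookup simply raises KeyError.
        return getattr(self._workers[worker_id], method)(*args, **kwargs)


nested = NestingWorker({
    "experiment_a": lambda: EchoWorker("a"),
    "experiment_b": lambda: EchoWorker("b"),
})
print(nested.call("experiment_a", "greet", "corpus"))  # a greets corpus
```

Because the middleware sees only the one `NestingWorker` instance, several related experiments can share a remote process and use it during each other's lull periods.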
MLGO is a framework for integrating ML techniques systematically in LLVM. It replaces human-crafted optimization heuristics in LLVM with machine learned models. The MLGO framework currently supports two optimizations:
The compiler components are both available in the main LLVM repository. This repository contains the training infrastructure and related tools for MLGO.
We currently use two different ML algorithms to train policies: Policy Gradient and Evolution Strategies. Currently, this repository only supports Policy Gradient training. The release of Evolution Strategies training is on our roadmap.
Check out this demo for an end-to-end demonstration of how to train your own inlining-for-size policy from scratch with Policy Gradient.
For more details about MLGO, please refer to our paper MLGO: a Machine Learning Guided Compiler Optimizations Framework.
For more details about how to contribute to the project, please refer to contributions.
We occasionally release pretrained models that may be used as-is with LLVM. Models are published as GitHub releases and are named [task]-[major-version].[minor-version]. The versions are semantic: the major version corresponds to breaking changes on the LLVM/compiler side, and the minor version corresponds to model updates that are independent of the compiler.
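To make the naming scheme concrete, here is a small Python sketch that splits a release name into its parts and applies the major-version rule. It follows the pattern exactly as written above; the helper names and the example name `some-task-1.2` are hypothetical, and actual release tags may not match this pattern character for character.

```python
import re
from typing import NamedTuple


class ModelRelease(NamedTuple):
    task: str
    major: int
    minor: int


# [task]-[major-version].[minor-version]; the task itself may contain hyphens.
_RELEASE_NAME = re.compile(r"^(?P<task>.+)-(?P<major>\d+)\.(?P<minor>\d+)$")


def parse_release_name(name: str) -> ModelRelease:
    """Split a release name into task, major version, and minor version."""
    match = _RELEASE_NAME.match(name)
    if match is None:
        raise ValueError(f"not of the form [task]-[major].[minor]: {name!r}")
    return ModelRelease(match["task"], int(match["major"]), int(match["minor"]))


def is_compatible(release: ModelRelease, compiler_major: int) -> bool:
    """A model fits a compiler only if the major versions agree; the minor
    version tracks model updates that are independent of the compiler."""
    return release.major == compiler_major


release = parse_release_name("some-task-1.2")  # hypothetical name
print(release, is_compatible(release, 1))
```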
When building LLVM, there is a flag `-DLLVM_INLINER_MODEL_PATH` which you may set to the path to your inlining model. If the path is set to `download`, then cmake will download the most recent (compatible) model from GitHub to use. Other values for the flag could be:
    # Model is in /tmp/model, i.e. there is a file /tmp/model/saved_model.pb along
    # with the rest of the tensorflow saved_model files produced from training.
    -DLLVM_INLINER_MODEL_PATH=/tmp/model

    # Download the most recent compatible model
    -DLLVM_INLINER_MODEL_PATH=download
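When the flag is assembled by a script, it can help to check the model directory before invoking cmake. The following Python sketch builds the flag value described above; `inliner_model_flag` is a hypothetical helper, not part of LLVM or this repository, and the only layout it checks is the `saved_model.pb` file mentioned in the comment above.

```python
from pathlib import Path


def inliner_model_flag(value: str) -> str:
    """Return the cmake argument for a model directory or for 'download'."""
    if value != "download":
        # A local model directory is expected to contain saved_model.pb
        # alongside the other files produced by training.
        if not (Path(value) / "saved_model.pb").is_file():
            raise FileNotFoundError(f"no saved_model.pb under {value}")
    return f"-DLLVM_INLINER_MODEL_PATH={value}"


print(inliner_model_flag("download"))
```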
Currently, the assumption for the is:
Training assumes a clang build with ML ‘development-mode’. Please refer to:
The prerequisites specific to model training are:
pip3 install --user -r requirements.txt
Where `requirements.txt` is provided in the root of the repository.
Optionally, to run tests (run_tests.sh), you also need:
sudo apt-get install virtualenv
Note that the same tensorflow package is also needed for building the ‘release’ mode for LLVM.
An end-to-end demo using Fuchsia as a codebase from which we extract a corpus and train a model.