## Benchmarking

This repository contains tooling to help automate benchmarking, currently focused
on the regalloc model. The benchmarking tooling works with the LLVM test suite
and the Chromium performance tests.

## Initial Setup

Make sure you have local checkouts of the repositories that are used:
```bash
cd ~/
git clone https://github.com/llvm/llvm-project
git clone https://github.com/google/ml-compiler-opt
```
And for benchmarking using the llvm-test-suite:
```bash
git clone https://github.com/llvm/llvm-test-suite
```

For acquiring the Chromium source code, please see [their](https://chromium.googlesource.com/chromium/src/+/main/docs/linux/build_instructions.md)
documentation and follow it up by running the hooks. You don't need to set up any
builds, as the `benchmark_chromium.py` script does that automatically.
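
As a rough sketch of what that flow usually looks like (the linked documentation
is the authoritative reference), it is roughly the following, assuming you keep
`depot_tools` at `~/depot_tools` and the Chromium checkout at `~/chromium` so
that the paths match the examples below:
```bash
# Sketch only; follow the Chromium build instructions for the full details
# (e.g. installing build dependencies on a fresh machine).
git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git ~/depot_tools
export PATH="$PATH:$HOME/depot_tools"
mkdir ~/chromium && cd ~/chromium
fetch --nohooks chromium
cd src
gclient runhooks
```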

Make sure that you have a local copy of libtensorflow:
```bash
mkdir ~/tensorflow
wget --quiet https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-cpu-linux-x86_64-1.15.0.tar.gz
tar xfz libtensorflow-cpu-linux-x86_64-1.15.0.tar.gz -C ~/tensorflow
```

And make sure you have installed all of the necessary python packages:
```bash
python3 -m pip install --user -r ml-compiler-opt/requirements.txt
```

## Benchmarking the LLVM test suite

You can use the `benchmark_llvm_test_suite.py` script to automatically configure
everything and run a benchmark using the latest released regalloc model:
```bash
cd ~/ml-compiler-opt
PYTHONPATH=$PYTHONPATH:. python3 ./compiler_opt/benchmark/benchmark_llvm_test_suite.py \
--advisor=release \
--compile_llvm \
--compile_testsuite \
--llvm_build_path=~/llvm-build \
--llvm_source_path=~/llvm-project/llvm \
--llvm_test_suite_path=~/llvm-test-suite \
--llvm_test_suite_build_path=~/llvm-test-suite/build \
--nollvm_use_incremental \
--model_path="download" \
--output_path=./results.json \
--perf_counter=INSTRUCTIONS \
--perf_counter=MEM_UOPS_RETIRED:ALL_LOADS \
--perf_counter=MEM_UOPS_RETIRED:ALL_STORES \
--tensorflow_c_lib_path=~/tensorflow
```

This will write the test information to `./results.json`, which can then be used
later for downstream processing and data analysis.
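
For a quick sanity check of the raw output, one option is to simply pretty-print
the JSON in a terminal, for example:
```bash
python3 -m json.tool results.json | less
```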

An explanation of the flags:
* `--advisor` - This flag specifies the register allocation eviction advisor that
LLVM uses when compiling the test suite. It can be set to either `release` or
`default`, depending on whether you want to test the model specified by the
`--model_path` flag or the default register allocation eviction behavior in order
to grab a baseline measurement (a baseline example is shown below).
* `--compile_llvm` - This is a boolean flag (it can also be set to
`--nocompile_llvm`) that specifies whether or not to compile LLVM.
* `--compile_testsuite` - Specifies whether or not to compile the test suite.
* `--llvm_build_path` - The path to place the LLVM build that will be used.
This directory will be deleted and remade if the `--nollvm_use_incremental` flag
is set.
* `--llvm_source_path` - The path to the LLVM source. This cannot be the root path
of the LLVM monorepo; it specifically needs to be the path to the `llvm`
subdirectory within that repository.
* `--llvm_test_suite_path` - The path to the llvm-test-suite.
* `--llvm_test_suite_build_path` - The path to place the build for the
llvm-test-suite. This behaves similarly to the LLVM build path.
* `--llvm_use_incremental` - Whether or not to do an incremental build of LLVM.
If you already have all the correct compilation flags set up for running MLGO
with LLVM, you can set this flag and you should get an extremely fast LLVM
build, as the only thing changing is the release mode regalloc model.
* `--model_path` - The path to the regalloc model. If this is set to "download",
it will automatically grab the latest model from the ml-compiler-opt GitHub.
If it is set to "" or "autogenerate", it will use the autogenerated model.
* `--output_path` - The path to the output file (in JSON format).
* `--perf_counter` - A flag that can be specified multiple times and takes
performance counters in the libpfm format. Only up to three performance
counters can be specified due to underlying limitations in Google Benchmark.
* `--tensorflow_c_lib_path` - The path to the TensorFlow C library, needed if you
aren't doing an incremental LLVM build.
* `--tests_to_run` - This specifies the LLVM microbenchmarks to run relative to
the microbenchmarks library in the LLVM test suite build directory. The default
values for this flag should be pretty safe and produce good results.

You can also get detailed information on each flag, along with its default value,
by passing only the `--help` flag to the script; a lot of the flags set in the
example above are simply set to their default values.
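
As a concrete example of the `--advisor=default` baseline run mentioned above, a
command along these lines should work for collecting a baseline that you can
later compare against a release-model run (see the comparison section at the end
of this document); the flags simply mirror the example above, so adjust the paths
to your setup:
```bash
cd ~/ml-compiler-opt
PYTHONPATH=$PYTHONPATH:. python3 ./compiler_opt/benchmark/benchmark_llvm_test_suite.py \
--advisor=default \
--compile_llvm \
--compile_testsuite \
--llvm_build_path=~/llvm-build \
--llvm_source_path=~/llvm-project/llvm \
--llvm_test_suite_path=~/llvm-test-suite \
--llvm_test_suite_build_path=~/llvm-test-suite/build \
--nollvm_use_incremental \
--model_path="download" \
--output_path=./baseline.json \
--perf_counter=INSTRUCTIONS \
--perf_counter=MEM_UOPS_RETIRED:ALL_LOADS \
--perf_counter=MEM_UOPS_RETIRED:ALL_STORES \
--tensorflow_c_lib_path=~/tensorflow
```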

## Benchmarking Chromium

You can use the `benchmark_chromium.py` script to run Chromium benchmarks based
on test description JSON files.

Example:
```bash
cd ~/ml-compiler-opt
PYTHONPATH=$PYTHONPATH:. python3 ./compiler_opt/benchmark/benchmark_chromium.py \
--advisor=release \
--chromium_build_path=./out/Release \
--chromium_src_path=~/chromium/src \
--compile_llvm \
--compile_tests \
--depot_tools_path=~/depot_tools \
--llvm_build_path=~/llvm-build \
--llvm_source_path=~/llvm-project/llvm \
--nollvm_use_incremental \
--model_path="download" \
--num_threads=32 \
--output_file=./output-chromium-testing.json \
--perf_counters=mem_uops_retired.all_loads \
--perf_counters=mem_uops_retired.all_stores \
--tensorflow_c_lib_path=~/tensorflow
```

Several of the flags here are the same as or very similar to the flags for the
LLVM test suite script, so only the flags unique to the Chromium script are
highlighted here.
* `--chromium_build_path` - The path to place the Chromium build in. This path
is relative to the Chromium source path.
* `--chromium_src_path` - The path to the root of the Chromium repository (i.e.,
the `./src/` directory created where you ran `fetch --nohooks`).
* `--depot_tools_path` - The path to your depot_tools checkout.
* `--num_threads` - Enables parallelism when running the tests. Use this with
caution, as it can add a lot of noise to your benchmarks depending upon what
specifically you are doing.
* `--perf_counters` - Similar to the LLVM test suite performance counters, but
instead of being in the libpfm format, they're perf counters as listed in
`perf list`.
* `--test_description` - Can be declared multiple times if you have custom test
descriptions that you want to run, but the default works well, covers a broad
portion of the codebase, and has been specifically designed to minimize
run-to-run variability.

### Generating Chromium Test Descriptions

To generate custom test descriptions for gtest executables (i.e., the test
executables that are used by Chromium), you can use the `list_gtests.py` script.
This script doesn't need to be used for running the Chromium performance tests
unless you are interested in adjusting the test descriptions currently available
in `/compiler_opt/benchmark/chromium_test_descriptions` or in using tests from a
different project that also uses gtest.

Example:
```bash
PYTHONPATH=$PYTHONPATH:. python3 ./compiler_opt/benchmark/list_gtests.py \
--gtest_executable=/path/to/executable \
--output_file=test.json \
--output_type=json
```

Flags:
* `--gtest_executable` - The path to the gtest executable from which to extract
a list of tests.
* `--output_file` - The path to the file to output all of the extracted test
names to.
* `--output_type` - Either JSON or default. JSON packages everything nicely into
a JSON format, and default just dumps the test names separated by line breaks.

There is also a `filter_tests.py` utility that allows for filtering the
individual tests available in a test executable, making sure that they exist
(sometimes tests that show up when listing all the gtests don't run when passed
through `--gtest_filter`) and that they don't fail (some tests require setups
such as a GUI or GPU).

Example:
```bash
PYTHONPATH=$PYTHONPATH:. python3 ./compiler_opt/benchmark/filter_tests.py \
--input_tests=./compiler_opt/benchmark/chromium_test_descriptions/browser_tests.json \
--output_tests=./browser_tests_filtered.json \
--num_threads=32 \
--executable_path=/chromium/src/out/Release/browser_tests
```

Flags:
* `--input_tests` - The path to the test description generated by
`list_gtests.py` to be filtered.
* `--output_tests` - The path to where the new filtered output test suite
description should be placed.
* `--num_threads` - The number of threads to use when running the tests to check
whether they exist and whether or not they pass.
* `--executable_path` - The path to the gtest executable that the test suite
description corresponds to.

TODO(boomanaiden154): investigate why some of the tests listed by the
executable later can't be found when using `--gtest_filter`.

## Comparing Benchmarks

To compare benchmark runs, you can use the `benchmark_report_converter.py` script.
Say you have two benchmark runs, `baseline.json` and `experimental.json`, from
the LLVM test suite benchmarking script with the `INSTRUCTIONS` performance
counter enabled (the two runs need to be done with the same set of tests). You
can get a summary comparison with the following command:
```bash
PYTHONPATH=$PYTHONPATH:. python3 ./compiler_opt/benchmark/benchmark_report_converter.py \
--base=baseline.json \
--exp=experimental.json \
--counters=INSTRUCTIONS \
--out=reports.csv
```
This will create `reports.csv` with a line for each test that contains information
about the differences in performance counters for that specific test.
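
If you just want to eyeball the comparison, one option is to pretty-print the CSV
in a terminal, for example:
```bash
column -s, -t < reports.csv | less -S
```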