A simple C++ binary to benchmark a compute graph and its individual operators, both on desktop machines and on Android.
On Android:
(0) Refer to https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android to edit the WORKSPACE file and configure the Android NDK/SDK.
(1) Build for your specific platform, e.g.:
bazel build -c opt \
  --crosstool_top=//external:android/crosstool \
  --cpu=armeabi-v7a \
  --host_crosstool_top=@bazel_tools//tools/cpp:toolchain \
  --config monolithic \
  tensorflow/tools/benchmark:benchmark_model
(2) Connect your phone. Push the binary to your phone with adb push (make the directory if required):
adb push bazel-bin/tensorflow/tools/benchmark/benchmark_model /data/local/tmp
(3) Push the compute graph that you need to test. For example:
adb push tensorflow_inception_graph.pb /data/local/tmp
(4) Run the benchmark. For example:
adb shell /data/local/tmp/benchmark_model \
  --graph=/data/local/tmp/tensorflow_inception_graph.pb \
  --input_layer="input:0" \
  --input_layer_shape="1,224,224,3" \
  --input_layer_type="float" \
  --output_layer="output:0"
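Android steps (2) to (4) can be chained in one small shell function. This is only a sketch: `push_and_benchmark`, the `ADB` override, and `DEVICE_DIR` are names invented here for illustration, and the layer names and shape are the Inception values from the example, so adjust them for other graphs.

```shell
# Hypothetical helper, not part of the TensorFlow tree.
# ADB can be overridden (e.g. ADB=echo) to print the commands instead of running them.
ADB="${ADB:-adb}"
DEVICE_DIR=/data/local/tmp

# Usage: push_and_benchmark <path-to-benchmark_model> <path-to-graph.pb>
push_and_benchmark() {
  # Push the binary and the graph, then run the benchmark on the device.
  # The && chain stops at the first command that fails.
  "$ADB" push "$1" "$DEVICE_DIR" &&
  "$ADB" push "$2" "$DEVICE_DIR" &&
  "$ADB" shell "$DEVICE_DIR/$(basename "$1")" \
    --graph="$DEVICE_DIR/$(basename "$2")" \
    --input_layer="input:0" \
    --input_layer_shape="1,224,224,3" \
    --input_layer_type="float" \
    --output_layer="output:0"
}
```

Calling `push_and_benchmark bazel-bin/tensorflow/tools/benchmark/benchmark_model tensorflow_inception_graph.pb` runs the three adb commands in order; setting `ADB=echo` beforehand gives a dry run.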
On desktop:
(1) Build the binary:
bazel build -c opt tensorflow/tools/benchmark:benchmark_model
(2) Run the benchmark on your compute graph. This is the same as the Android case, but without adb shell. For example:
bazel-bin/tensorflow/tools/benchmark/benchmark_model \
  --graph=tensorflow_inception_graph.pb \
  --input_layer="input:0" \
  --input_layer_shape="1,224,224,3" \
  --input_layer_type="float" \
  --output_layer="output:0"
The Inception graph used as an example here may be downloaded from https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip
To download TF .pb graphs of several popular models, run:
bash download_models.sh
We provide example scripts that compare the performance of TF-oneDNN with that of vanilla TF; users can modify them for their own benchmarks. The scripts assume that the models have already been downloaded by download_models.sh. To run an end-to-end model performance comparison between TF-oneDNN and vanilla TF, call:
bash download_models.sh  # Skip this step if models are already downloaded.
bash run_onednn_benchmarks.sh
The output is a summary table in a CSV file: results.csv. Example output:
Showing runtimes in microseconds. `?` means not available.
Model, Batch, Vanilla, oneDNN, Speedup
bert-large, 1, x, y, x/y
bert-large, 16, ..., ..., ...
bert-large, 64, ..., ..., ...
inception, 1, ..., ..., ...
inception, 16, ..., ..., ...
inception, 64, ..., ..., ...
⋮
ssd-resnet34, 1, ?, ..., ?
ssd-resnet34, 16, ?, ..., ?
ssd-resnet34, 64, ?, ..., ?
Vanilla TF can't run ssd-resnet34 on CPU because it doesn't support the NCHW format.
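If you want to post-process results.csv, for instance to recompute the Speedup column as Vanilla divided by oneDNN, a short awk filter is enough. The sketch below assumes rows with five comma-separated fields and `?` for missing values, as in the example output; `recompute_speedup` is a hypothetical helper name, not part of the benchmark scripts.

```shell
# Hypothetical helper: print "model,batch,speedup" for each data row of a
# results.csv-style file. Assumes "Model, Batch, Vanilla, oneDNN, Speedup" rows.
# Usage: recompute_speedup results.csv
recompute_speedup() {
  awk -F',' '
    NF != 5 { next }                 # skip the note line and malformed rows
    {
      # trim the padding around every field
      for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
    }
    $1 == "Model" { next }           # skip the header row
    $3 == "?" || $4 == "?" || $4 + 0 == 0 {
      printf "%s,%s,?\n", $1, $2     # speedup is not available for this row
      next
    }
    { printf "%s,%s,%.2f\n", $1, $2, $3 / $4 }
  ' "$1"
}
```

Rows where either runtime is unavailable keep `?` as their speedup, so the ssd-resnet34 rows pass through unchanged in meaning.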