| * * * * * * * * * * |
| * PolyBench/C 3.2 * |
| * * * * * * * * * * |
| |
| Copyright (c) 2011-2012 the Ohio State University. |
| Contact: Louis-Noel Pouchet <pouchet@cse.ohio-state.edu> |
| |
| |
| ------------- |
| * New in 3.2: |
| ------------- |
| |
| - Rename the package to PolyBench/C, to prepare for the upcoming |
| PolyBench/Fortran and PolyBench/GPU. |
| - Fixed a typo in polybench.h, causing compilation problems for 5D arrays. |
| - Fixed minor typos in correlation, atax, cholesky, fdtd-2d. |
| - Added an option to build the test suite with constant loop bounds |
| (default is parametric loop bounds) |
| |
| |
| ------------- |
| * New in 3.1: |
| ------------- |
| |
| - Fixed a typo in polybench.h, causing compilation problems for 3D arrays. |
| - Set by default heap arrays, stack arrays are now optional. |
| |
| |
| ------------- |
| * New in 3.0: |
| ------------- |
| |
| - Multiple dataset sizes are predefined. Each file comes now with a .h |
| header file defining the dataset. |
| - Support of heap-allocated arrays. It uses a single malloc for the |
| entire array region, the data allocated is cast into a C99 |
| multidimensional array. |
| - One benchmark is out: gauss_filter |
| - One benchmark is in: floyd-warshall |
| - PAPI support has been greatly improved; it also can report the |
| counters on a specific core to be set by the user. |
| |
| |
| |
| ---------------- |
| * Mailing lists: |
| ---------------- |
| |
| ** polybench-announces@lists.sourceforge.net: |
| --------------------------------------------- |
| |
| Announces about releases of PolyBench. |
| |
| ** polybench-discussion@lists.sourceforge.net: |
| ---------------------------------------------- |
| |
| General discussions reg. PolyBench. |
| |
| |
| |
| ----------------------- |
| * Available benchmarks: |
| ----------------------- |
| |
| ::linear-algebra:: |
| linear-algebra/kernels: |
| linear-algebra/kernels/2mm/2mm.c |
| linear-algebra/kernels/3mm/3mm.c |
| linear-algebra/kernels/atax/atax.c |
| linear-algebra/kernels/bicg/bicg.c |
| linear-algebra/kernels/cholesky/cholesky.c |
| linear-algebra/kernels/doitgen/doitgen.c |
| linear-algebra/kernels/gemm/gemm.c |
| linear-algebra/kernels/gemver/gemver.c |
| linear-algebra/kernels/gesummv/gesummv.c |
| linear-algebra/kernels/mvt/mvt.c |
| linear-algebra/kernels/symm/symm.c |
| linear-algebra/kernels/syr2k/syr2k.c |
| linear-algebra/kernels/syrk/syrk.c |
| linear-algebra/kernels/trisolv/trisolv.c |
| linear-algebra/kernels/trmm/trmm.c |
| |
| linear-algebra/solvers: |
| linear-algebra/solvers/durbin/durbin.c |
| linear-algebra/solvers/dynprog/dynprog.c |
| linear-algebra/solvers/gramschmidt/gramschmidt.c |
| linear-algebra/solvers/lu/lu.c |
| linear-algebra/solvers/ludcmp/ludcmp.c |
| |
| ::datamining:: |
| datamining/correlation/correlation.c |
| datamining/covariance/covariance.c |
| |
| ::medley:: |
| medley/floyd-warshall/floyd-warshall.c |
| medley/reg_detect/reg_detect.c |
| |
| ::stencils:: |
| stencils/adi/adi.c |
| stencils/fdtd-2d/fdtd-2d.c |
| stencils/fdtd-apml/fdtd-apml.c |
| stencils/jacobi-1d-imper/jacobi-1d-imper.c |
| stencils/jacobi-2d-imper/jacobi-2d-imper.c |
| stencils/seidel-2d/seidel-2d.c |
| |
| |
| |
| ------------------------------ |
| * Sample compilation commands: |
| ------------------------------ |
| |
| |
| ** To compile a benchmark without any monitoring: |
| ------------------------------------------------- |
| |
| $> gcc -I utilities -I linear-algebra/kernels/atax utilities/polybench.c linear-algebra/kernels/atax/atax.c -o atax_base |
| |
| |
| ** To compile a benchmark with execution time reporting: |
| -------------------------------------------------------- |
| |
| $> gcc -O3 -I utilities -I linear-algebra/kernels/atax utilities/polybench.c linear-algebra/kernels/atax/atax.c -DPOLYBENCH_TIME -o atax_time |
| |
| |
| ** To generate the reference output of a benchmark: |
| --------------------------------------------------- |
| |
| $> gcc -O0 -I utilities -I linear-algebra/kernels/atax utilities/polybench.c linear-algebra/kernels/atax/atax.c -DPOLYBENCH_DUMP_ARRAYS -o atax_ref |
| $> ./atax_ref 2>atax_ref.out |
| |
| |
| |
| |
| ------------------------- |
| * Some available options: |
| ------------------------- |
| |
| They are all passed as macro definitions during compilation time (e.g, |
| -Dname_of_the_option). |
| |
| - POLYBENCH_TIME: output execution time (gettimeofday) [default: off] |
| |
| - POLYBENCH_NO_FLUSH_CACHE: don't flush the cache before calling the |
| timer [default: flush the cache] |
| |
| - POLYBENCH_LINUX_FIFO_SCHEDULER: use FIFO real-time scheduler for the |
| kernel execution, the program must be run as root, under linux only, |
| and compiled with -lc [default: off] |
| |
| - POLYBENCH_CACHE_SIZE_KB: cache size to flush, in kB [default: 33MB] |
| |
| - POLYBENCH_STACK_ARRAYS: use stack allocation instead of malloc [default: off] |
| |
| - POLYBENCH_DUMP_ARRAYS: dump all live-out arrays on stderr [default: off] |
| |
| - POLYBENCH_CYCLE_ACCURATE_TIMER: Use Time Stamp Counter to monitor |
| the execution time of the kernel [default: off] |
| |
| - POLYBENCH_PAPI: turn on papi timing (see below). |
| |
| - MINI_DATASET, SMALL_DATASET, STANDARD_DATASET, LARGE_DATASET, |
| EXTRALARGE_DATASET: set the dataset size to be used |
| [default: STANDARD_DATASET] |
| |
| - POLYBENCH_USE_C99_PROTO: Use standard C99 prototype for the functions. |
| |
| - POLYBENCH_USE_SCALAR_LB: Use scalar loop bounds instead of parametric ones. |
| |
| |
| |
| --------------- |
| * PAPI support: |
| --------------- |
| |
| ** To compile a benchmark with PAPI support: |
| -------------------------------------------- |
| |
| $> gcc -O3 -I utilities -I linear-algebra/kernels/atax utilities/polybench.c linear-algebra/kernels/atax/atax.c -DPOLYBENCH_PAPI -lpapi -o atax_papi |
| |
| |
| ** To specify which counter(s) to monitor: |
| ------------------------------------------ |
| |
| Edit utilities/papi_counters.list, and add 1 line per event to |
| monitor. Each line (including the last one) must finish with a ',' and |
| both native and standard events are supported. |
| |
| The whole kernel is run one time per counter (no multiplexing) and |
| there is no sampling being used for the counter value. |
| |
| |
| |
| ------------------------------ |
| * Accurate performance timing: |
| ------------------------------ |
| |
| With kernels that have an execution time in the orders of a few tens |
| of milliseconds, it is critical to validate any performance number by |
| repeating several times the experiment. A companion script is |
| available to perform reasonable performance measurement of a PolyBench. |
| |
| $> gcc -O3 -I utilities -I linear-algebra/kernels/atax utilities/polybench.c linear-algebra/kernels/atax/atax.c -DPOLYBENCH_TIME -o atax_time |
| $> ./utilities/time_benchmark.sh ./atax_time |
| |
| This script will run five times the benchmark (that must be a |
| PolyBench compiled with -DPOLYBENCH_TIME), eliminate the two extremal |
| times, and check that the deviation of the three remaining does not |
| exceed a given threshold, set to 5%. |
| |
| It is also possible to use POLYBENCH_CYCLE_ACCURATE_TIMER to use the |
| Time Stamp Counter instead of gettimeofday() to monitor the number of |
| elapsed cycles. |
| |
| |
| |
| |
| ---------------------------------------- |
| * Generating macro-free benchmark suite: |
| ---------------------------------------- |
| |
| (from the root of the archive:) |
| $> PARGS="-I utilities -DPOLYBENCH_TIME"; |
| $> for i in `cat utilities/benchmark_list`; do create_cpped_version.sh $i "$PARGS"; done |
| |
| This create for each benchmark file 'xxx.c' a new file |
| 'xxx.preproc.c'. The PARGS variable in the above example can be set to |
| the desired configuration, for instance to create a full C99 version |
| (parametric arrays): |
| |
| $> PARGS="-I utilities -DPOLYBENCH_USE_C99_PROTO"; |
| $> for i in `cat utilities/benchmark_list`; do ./utilities/create_cpped_version.sh "$i" "$PARGS"; done |
| |
| |