blob: dc350515d7c089aae3b8309cf031b831d0e64bb2 [file] [log] [blame]
* * * * * * * * * *
* PolyBench/C 3.2 *
* * * * * * * * * *
Copyright (c) 2011-2012 the Ohio State University.
Contact: Louis-Noel Pouchet <pouchet@cse.ohio-state.edu>
-------------
* New in 3.2:
-------------
- Rename the package to PolyBench/C, to prepare for the upcoming
PolyBench/Fortran and PolyBench/GPU.
- Fixed a typo in polybench.h, causing compilation problems for 5D arrays.
- Fixed minor typos in correlation, atax, cholesky, fdtd-2d.
- Added an option to build the test suite with constant loop bounds
(default is parametric loop bounds)
-------------
* New in 3.1:
-------------
- Fixed a typo in polybench.h, causing compilation problems for 3D arrays.
- Set by default heap arrays, stack arrays are now optional.
-------------
* New in 3.0:
-------------
- Multiple dataset sizes are predefined. Each file comes now with a .h
header file defining the dataset.
- Support of heap-allocated arrays. It uses a single malloc for the
entire array region, the data allocated is cast into a C99
multidimensional array.
- One benchmark is out: gauss_filter
- One benchmark is in: floyd-warshall
- PAPI support has been greatly improved; it also can report the
counters on a specific core to be set by the user.
----------------
* Mailing lists:
----------------
** polybench-announces@lists.sourceforge.net:
---------------------------------------------
Announces about releases of PolyBench.
** polybench-discussion@lists.sourceforge.net:
----------------------------------------------
General discussions reg. PolyBench.
-----------------------
* Available benchmarks:
-----------------------
::linear-algebra::
linear-algebra/kernels:
linear-algebra/kernels/2mm/2mm.c
linear-algebra/kernels/3mm/3mm.c
linear-algebra/kernels/atax/atax.c
linear-algebra/kernels/bicg/bicg.c
linear-algebra/kernels/cholesky/cholesky.c
linear-algebra/kernels/doitgen/doitgen.c
linear-algebra/kernels/gemm/gemm.c
linear-algebra/kernels/gemver/gemver.c
linear-algebra/kernels/gesummv/gesummv.c
linear-algebra/kernels/mvt/mvt.c
linear-algebra/kernels/symm/symm.c
linear-algebra/kernels/syr2k/syr2k.c
linear-algebra/kernels/syrk/syrk.c
linear-algebra/kernels/trisolv/trisolv.c
linear-algebra/kernels/trmm/trmm.c
linear-algebra/solvers:
linear-algebra/solvers/durbin/durbin.c
linear-algebra/solvers/dynprog/dynprog.c
linear-algebra/solvers/gramschmidt/gramschmidt.c
linear-algebra/solvers/lu/lu.c
linear-algebra/solvers/ludcmp/ludcmp.c
::datamining::
datamining/correlation/correlation.c
datamining/covariance/covariance.c
::medley::
medley/floyd-warshall/floyd-warshall.c
medley/reg_detect/reg_detect.c
::stencils::
stencils/adi/adi.c
stencils/fdtd-2d/fdtd-2d.c
stencils/fdtd-apml/fdtd-apml.c
stencils/jacobi-1d-imper/jacobi-1d-imper.c
stencils/jacobi-2d-imper/jacobi-2d-imper.c
stencils/seidel-2d/seidel-2d.c
------------------------------
* Sample compilation commands:
------------------------------
** To compile a benchmark without any monitoring:
-------------------------------------------------
$> gcc -I utilities -I linear-algebra/kernels/atax utilities/polybench.c linear-algebra/kernels/atax/atax.c -o atax_base
** To compile a benchmark with execution time reporting:
--------------------------------------------------------
$> gcc -O3 -I utilities -I linear-algebra/kernels/atax utilities/polybench.c linear-algebra/kernels/atax/atax.c -DPOLYBENCH_TIME -o atax_time
** To generate the reference output of a benchmark:
---------------------------------------------------
$> gcc -O0 -I utilities -I linear-algebra/kernels/atax utilities/polybench.c linear-algebra/kernels/atax/atax.c -DPOLYBENCH_DUMP_ARRAYS -o atax_ref
$> ./atax_ref 2>atax_ref.out
-------------------------
* Some available options:
-------------------------
They are all passed as macro definitions during compilation time (e.g,
-Dname_of_the_option).
- POLYBENCH_TIME: output execution time (gettimeofday) [default: off]
- POLYBENCH_NO_FLUSH_CACHE: don't flush the cache before calling the
timer [default: flush the cache]
- POLYBENCH_LINUX_FIFO_SCHEDULER: use FIFO real-time scheduler for the
kernel execution, the program must be run as root, under linux only,
and compiled with -lc [default: off]
- POLYBENCH_CACHE_SIZE_KB: cache size to flush, in kB [default: 33MB]
- POLYBENCH_STACK_ARRAYS: use stack allocation instead of malloc [default: off]
- POLYBENCH_DUMP_ARRAYS: dump all live-out arrays on stderr [default: off]
- POLYBENCH_CYCLE_ACCURATE_TIMER: Use Time Stamp Counter to monitor
the execution time of the kernel [default: off]
- POLYBENCH_PAPI: turn on papi timing (see below).
- MINI_DATASET, SMALL_DATASET, STANDARD_DATASET, LARGE_DATASET,
EXTRALARGE_DATASET: set the dataset size to be used
[default: STANDARD_DATASET]
- POLYBENCH_USE_C99_PROTO: Use standard C99 prototype for the functions.
- POLYBENCH_USE_SCALAR_LB: Use scalar loop bounds instead of parametric ones.
---------------
* PAPI support:
---------------
** To compile a benchmark with PAPI support:
--------------------------------------------
$> gcc -O3 -I utilities -I linear-algebra/kernels/atax utilities/polybench.c linear-algebra/kernels/atax/atax.c -DPOLYBENCH_PAPI -lpapi -o atax_papi
** To specify which counter(s) to monitor:
------------------------------------------
Edit utilities/papi_counters.list, and add 1 line per event to
monitor. Each line (including the last one) must finish with a ',' and
both native and standard events are supported.
The whole kernel is run one time per counter (no multiplexing) and
there is no sampling being used for the counter value.
------------------------------
* Accurate performance timing:
------------------------------
With kernels that have an execution time in the orders of a few tens
of milliseconds, it is critical to validate any performance number by
repeating several times the experiment. A companion script is
available to perform reasonable performance measurement of a PolyBench.
$> gcc -O3 -I utilities -I linear-algebra/kernels/atax utilities/polybench.c linear-algebra/kernels/atax/atax.c -DPOLYBENCH_TIME -o atax_time
$> ./utilities/time_benchmark.sh ./atax_time
This script will run five times the benchmark (that must be a
PolyBench compiled with -DPOLYBENCH_TIME), eliminate the two extremal
times, and check that the deviation of the three remaining does not
exceed a given threshold, set to 5%.
It is also possible to use POLYBENCH_CYCLE_ACCURATE_TIMER to use the
Time Stamp Counter instead of gettimeofday() to monitor the number of
elapsed cycles.
----------------------------------------
* Generating macro-free benchmark suite:
----------------------------------------
(from the root of the archive:)
$> PARGS="-I utilities -DPOLYBENCH_TIME";
$> for i in `cat utilities/benchmark_list`; do create_cpped_version.sh $i "$PARGS"; done
This create for each benchmark file 'xxx.c' a new file
'xxx.preproc.c'. The PARGS variable in the above example can be set to
the desired configuration, for instance to create a full C99 version
(parametric arrays):
$> PARGS="-I utilities -DPOLYBENCH_USE_C99_PROTO";
$> for i in `cat utilities/benchmark_list`; do ./utilities/create_cpped_version.sh "$i" "$PARGS"; done