CPROVER Benchmarking ToolkitThe benchmarking toolkit consists of three main components:
Please report problems, bugs, questions, suggestions to Michael Tautschnig. DownloadDownload our benchmarking framework as tarball or as Debian/Ubuntu package. When using the .tar.gz archive, extract the archive and add cpbm-0.4 to your PATH enviroment variable. For the Debian/Ubuntu package: install it using dpkg -i cpbm_0.4-1_all.deb. Available Benchmarks
Benchmark Execution Example 1Download the benchmark package $ wget http://www.cprover.org/software/benchmarks/linux-2.6.19-ddverify.cprover-bm.tar.gz Unpack the kernel sources and the benchmark package $ cpbm unpack \ http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.19.tar.bz2 \ linux-2.6.19-ddverify.cprover-bm.tar.gz Enter the new directory $ cd linux-2.6.19-ddverify Run the default benchmark configuration and produce a LaTeX table $ cprover/rules -j4 table Produce an additional GNUplot script and display the graph $ cprover/rules graph | gnuplot Build a web page providing an overview table and all log files in cprover/results.satabs.dfl.web/: $ cprover/rules web Benchmark Execution Example 2Take a look at the complete workflow using examples from the SAMATE Reference Dataset. This ScreenFlow video requires a more recent version of the Adobe Flash Player to display. Please update your version of the Adobe Flash Player. Benchmark Archive ManagementBenchmarks consist of two files: (1) an original source archive as obtained from its original authors, without modification and (2) a corresponding <NAME>.cprover-bm.tar.gz file that contains all modifications, execution scripts, and other helpers. This design is inspired by Debian v3 source packages and may thus sound familiar to people who are used to working with Debian packages. (1) The original source archive must be a .zip, .tar.gz, .tgz, or .tar.bz2 archive. If the authors of the software to be benchmarked do not offer such an archive it should be built manually from whatever the authors provide as source. (2) The <NAME>.cprover-bm.tar.gz is created and managed by cpbm update as described below. The file ships a cprover/ directory that may contain arbitrary files, but at least a Makefile cprover/rules must be provided. The archive management scripts will populate a cprover/patches directory that precisely describes all modification to original sources. The basic design of this archive solution is completely independent of the software to be benchmarked. Its main purpose is the precise description of changes to the original source that were made in order to benchmark some tool. Archive management commands
Usage ExampleCreating a new benchmark suite, e.g., for the Linux kernel:Download the original sources from kernel.org: $ wget http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.19.tar.bz2 Manually unpack them: $ tar xjf linux-2.6.19.tar.bz2 We rename the directory to get proper benchmark name "linux-2.6.19-foo" $ mv linux-2.6.19 linux-2.6.19-foo $ cd linux-2.6.19-foo Create the basic benchmarking archive linux-2.6.19-foo.cprover-bm.tar.gz $ cpbm init ../linux-2.6.19.tar.bz2 Edit some source files ... and record patches $ cpbm update ../linux-2.6.19.tar.bz2 To make the benchmark source and all patches available for others to use publish linux-2.6.19-foo.cprover-bm.tar.gz and the original source archive or its URL. Using the benchmark suite:$ cpbm unpack http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.19.tar.bz2 \ linux-2.6.19-foo.cprover-bm.tar.gz $ cd linux-2.6.19-foo Benchmark Executioncpbm init, as described above, creates as template the cprover/rules file (a Makefile). For most basic use cases filling in the names of the benchmarks to be run (e.g., all C files without the .c suffix) in the BENCHMARKS = XXX line and choosing a suitable default configuration in CONFIG = XXX (see examples in there) will suffice to produce a working benchmark package. The file may, however, be fully customized as deemed necessary. The only assumption made by cpbm is that cprover/rules clean performs a cleanup. Remember to run cpbm update before distributing the .cprover-bm.tar.gz file. The actual benchmark execution is then triggered by $ cprover/rules -j4 verifyto perform verification of all benchmark instances in 4 parallel threads of execution with the default configuration. This step first induces a build of each benchmark from source, if necessary. To choose a different configuration for the verification run, override the CONFIG variable. For instance, to perform verification using CPAchecker's explicit analysis use $ cprover/rules -j4 verify CONFIG=cpachecker.explicit Benchmark Execution HelpersBenchmark execution makes use of a number of helpers:
Result EvaluationThe names of all log files produced by cpbm run will be listed in cprover/verified.$CONFIG. Performing $ cprover/rules csvyields a CSV (comma-separated value) formatted summary of all results in cprover/results.$CONFIG.csv. This file may be read using most spreadsheet programs for further manual inspection and evaluation. With cpbm, however, also LaTeX tables and GNUplot graphs may be produced: $ cprover/rules tableyields a LaTeX summary of execution times and memory usage of all benchmark instances. With $ cprover/rules graphfurthermore a GNUplot script is produced which may be copied or piped into the gnuplot command. $ cprover/rules webcollects all log files in a new directory cprover/results.$CONFIG.web plus an index.html file that contains an HTML formatted version of the CSV table. Each benchmark links to the respective log file. These steps permits a number of customizations:
|