benchcomp configuration file

benchcomp's operation is controlled through a YAML file: benchcomp.yaml by default, or a file passed to the -c/--config option. This page lists the different visualizations that are available.

Built-in visualizations

The following visualizations are available: dump_markdown_results_table, dump_yaml, error_on_regression, and run_command. Any of them can be added to the visualize list of benchcomp.yaml.

Detailed documentation for these visualizations follows.


Print Markdown-formatted tables displaying benchmark results

For each metric, this visualization prints out a table of benchmarks, showing the value of the metric for each variant.

The 'out_file' key is mandatory; specify '-' to print to stdout.

'extra_columns' can be an empty dict. The sample configuration below assumes that each benchmark result has a 'success' and a 'runtime' metric for both variants, 'variant_1' and 'variant_2'. It adds a 'ratio' column to the table for the 'runtime' metric, and a 'change' column to the table for the 'success' metric.

The 'text' lambda is called once for each benchmark. It accepts a single argument, a dict that maps each variant name to that variant's value for the metric, and returns a string that is rendered in the benchmark's row in the new column. This allows you to emit arbitrary text or markdown formatting in response to particular combinations of values for different variants, such as regressions or performance improvements.

Sample configuration:

- type: dump_markdown_results_table
  out_file: "-"
  extra_columns:
    runtime:
    - column_name: ratio
      text: >
        lambda b: str(b["variant_2"]/b["variant_1"])
        if b["variant_2"] < (1.5 * b["variant_1"])
        else "**" + str(b["variant_2"]/b["variant_1"]) + "**"
    success:
    - column_name: change
      text: >
        lambda b: "" if b["variant_2"] == b["variant_1"]
        else "newly passing" if b["variant_2"]
        else "regressed"

Example output:

## runtime

| Benchmark |  variant_1 | variant_2 | ratio |
| --- | --- | --- | --- |
| bench_1 | 5 | 10 | **2.0** |
| bench_2 | 10 | 5 | 0.5 |

## success

| Benchmark |  variant_1 | variant_2 | change |
| --- | --- | --- | --- |
| bench_1 | True | True |  |
| bench_2 | True | False | regressed |
| bench_3 | False | True | newly passing |
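
To check a 'text' lambda before putting it in the configuration, it can help to evaluate it by hand. The sketch below assumes the lambda string is evaluated as an ordinary Python expression and then called with a dict of variant values; the helper name render_cell is hypothetical and is not part of benchcomp. The input values are the ones from the example tables above.

```python
# Lambda strings copied from the sample configuration. YAML's '>' folds the
# three lines of each lambda into a single line, which is reproduced here.
RATIO = (
    'lambda b: str(b["variant_2"]/b["variant_1"]) '
    'if b["variant_2"] < (1.5 * b["variant_1"]) '
    'else "**" + str(b["variant_2"]/b["variant_1"]) + "**"'
)
CHANGE = (
    'lambda b: "" if b["variant_2"] == b["variant_1"] '
    'else "newly passing" if b["variant_2"] '
    'else "regressed"'
)


def render_cell(text_lambda, values):
    """Hypothetical helper: evaluate a 'text' lambda against a variant->value dict."""
    # Assumption: the string is a plain Python lambda expression.
    return eval(text_lambda)(values)


# Values taken from the 'runtime' example table.
ratio_cells = [
    render_cell(RATIO, {"variant_1": 5, "variant_2": 10}),
    render_cell(RATIO, {"variant_1": 10, "variant_2": 5}),
]

# Values taken from the 'success' example table.
change_cells = [
    render_cell(CHANGE, {"variant_1": True, "variant_2": True}),
    render_cell(CHANGE, {"variant_1": True, "variant_2": False}),
    render_cell(CHANGE, {"variant_1": False, "variant_2": True}),
]

print(ratio_cells)
print(change_cells)
```

The bold marker appears only when variant_2 is at least 1.5 times variant_1, which is why bench_1 is highlighted and bench_2 is not.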


Print the YAML-formatted results to a file.

The 'out_file' key is mandatory; specify '-' to print to stdout.

Sample configuration:

- type: dump_yaml
  out_file: '-'


Terminate benchcomp with a return code of 1 if any benchmark regressed.

This visualization checks whether any benchmark regressed from one variant to another. Sample configuration:

- type: error_on_regression
  variant_pairs:
  - [variant_1, variant_2]
  - [variant_1, variant_3]
  checks:
  - metric: runtime
    test: "lambda old, new: new / old > 1.1"
  - metric: passed
    test: "lambda old, new: False if not old else not new"

This says to check whether any benchmark regressed when run under variant_2 compared to variant_1. A benchmark is considered to have regressed if the value of its 'runtime' metric under variant_2 is more than 10% higher than the value under variant_1, or if it was previously passing but is now failing. The same checks are performed on all benchmarks run under variant_3 compared to variant_1. If any of the lambda functions returns True, benchcomp terminates with a return code of 1.
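
The way these checks combine can be sketched in a few lines of Python. This is an illustration only, not benchcomp's implementation: the function name find_regressions and the layout of the results dict (benchmark, then variant, then metric) are assumptions made for the example, and the benchmark values are made up.

```python
def find_regressions(results, variant_pairs, checks):
    """Return (benchmark, old_variant, new_variant, metric) for every failed check.

    Assumed layout (hypothetical): results[benchmark][variant][metric] -> value.
    """
    failures = []
    for old_variant, new_variant in variant_pairs:
        for bench, variants in results.items():
            # Skip benchmarks that were not run under both variants of the pair.
            if old_variant not in variants or new_variant not in variants:
                continue
            for check in checks:
                metric = check["metric"]
                old_metrics = variants[old_variant]
                new_metrics = variants[new_variant]
                if metric not in old_metrics or metric not in new_metrics:
                    continue
                # Assumption: the 'test' string is a Python lambda taking (old, new).
                test = eval(check["test"])
                if test(old_metrics[metric], new_metrics[metric]):
                    failures.append((bench, old_variant, new_variant, metric))
    return failures


# The two checks from the sample configuration.
checks = [
    {"metric": "runtime", "test": "lambda old, new: new / old > 1.1"},
    {"metric": "passed", "test": "lambda old, new: False if not old else not new"},
]

# Made-up results for two benchmarks under two variants.
results = {
    "bench_1": {
        "variant_1": {"runtime": 5, "passed": True},
        "variant_2": {"runtime": 10, "passed": True},
    },
    "bench_2": {
        "variant_1": {"runtime": 10, "passed": True},
        "variant_2": {"runtime": 5, "passed": False},
    },
}

failures = find_regressions(results, [["variant_1", "variant_2"]], checks)
exit_code = 1 if failures else 0
print(failures, exit_code)
```

Here bench_1 fails the 'runtime' check because its runtime doubled, and bench_2 fails the 'passed' check because it went from passing to failing, so the sketch would exit with 1.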


Run an executable command, passing the performance metrics as JSON on stdin.

This allows you to write your own visualization, which reads the results from stdin and does something with them, e.g. writing out a graph or other output file.

Sample configuration:

- type: run_command
  command: ./
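
A custom visualization of this kind could start from the following sketch. It only assumes that the input is valid JSON; the schema of the results is not described on this page, so the payload below and the function name summarize are made up for illustration. In a real script, the function would be passed sys.stdin instead of the in-memory stand-in.

```python
import io
import json


def summarize(stream):
    """Read JSON from a file-like object and describe its top-level entries."""
    data = json.load(stream)
    if isinstance(data, dict):
        # One line per top-level key, with the type of its value.
        return [f"{key}: {type(value).__name__}" for key, value in sorted(data.items())]
    return [f"(top level): {type(data).__name__}"]


# Stand-in for the JSON that would arrive on stdin; this payload is hypothetical.
fake_stdin = io.StringIO('{"benchmarks": {}, "metrics": {}}')
lines = summarize(fake_stdin)
print("\n".join(lines))
```

Listing the top-level keys first is a cheap way to discover the actual structure of the results before writing code that depends on it.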