# benchcomp configuration file

benchcomp's operation is controlled through a YAML file: `benchcomp.yaml` by default, or a file passed to the `-c`/`--config` option. This page describes the format of that file, including the visualizations that are available.

## Variants

A variant is a single invocation of a benchmark suite. Benchcomp runs several variants so that their performance can be compared later. A variant consists of a command line, a working directory, and an environment. Benchcomp invokes the command using the operating system environment, updated with the keys and values in `env`. If any values in `env` contain strings of the form `${var}`, benchcomp expands them to the value of the environment variable `$var`.

```yaml
variants:
    variant_1:
        config:
            command_line: echo "Hello, world"
            directory: /tmp
            env:
              PATH: /my/local/directory:${PATH}
```

## Filters

After benchcomp has finished parsing the results, it writes them to `results.yaml` by default. Before visualizing the results (see below), benchcomp can filter them by piping them into an external program.

To filter results before visualizing them, add a `filters` list to the configuration file:

```yaml
filters:
    - command_line: ./scripts/remove-redundant-results.py
    - command_line: cat
```

The value of `filters` is a list of dicts. Currently, the only legal key for each dict is `command_line`. Benchcomp invokes each `command_line` in order, passing the results as a JSON document on stdin, and interprets the command's stdout as a YAML-formatted modified set of results. Filter scripts can emit either YAML (which might be more readable while developing the script) or JSON (which benchcomp will parse as a subset of YAML).
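
As an illustration, here is a minimal sketch of a filter script in the spirit of the `remove-redundant-results.py` entry above. Only the stdin/stdout contract comes from benchcomp; the top-level `benchmarks` key and the `smoke_` name prefix are assumptions made for the sake of the example.

```python
#!/usr/bin/env python3
# Illustrative benchcomp filter: reads the results as JSON on stdin
# and writes a modified set of results on stdout. Emitting JSON is
# fine, since benchcomp parses the output as YAML and JSON is a
# subset of YAML.

import json
import sys


def main():
    results = json.load(sys.stdin)

    # Hypothetical transformation: drop benchmarks whose names mark
    # them as smoke tests. The top-level "benchmarks" key is an
    # assumption about the shape of the results, not part of the
    # filter contract.
    results["benchmarks"] = {
        name: data
        for name, data in results.get("benchmarks", {}).items()
        if not name.startswith("smoke_")
    }

    json.dump(results, sys.stdout)


if __name__ == "__main__":
    main()
```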

## Built-in visualizations

The following visualizations are available; these can be added to the `visualize` list of `benchcomp.yaml`. Detailed documentation for each of them follows.

### Plot

Scatterplot configuration options

### dump_markdown_results_table

Print Markdown-formatted tables displaying benchmark results.

For each metric, this visualization prints out a table of benchmarks, showing the value of the metric for each variant, combined with an optional scatterplot.

The 'out_file' key is mandatory; specify '-' to print to stdout.

'extra_columns' can be an empty dict. The sample configuration below assumes that each benchmark result has a 'success' and a 'runtime' metric for both variants, 'variant_1' and 'variant_2'. It adds a 'ratio' column to the table for the 'runtime' metric and a 'change' column to the table for the 'success' metric. The 'text' lambda is called once for each benchmark; it accepts a single argument, a dict that maps variant names to that variant's value for a particular metric, and returns a string that is rendered in the benchmark's row in the new column. This allows you to emit arbitrary text or Markdown formatting in response to particular combinations of values for different variants, such as regressions or performance improvements.

'scatterplot' takes the values 'off' (default), 'linear' (linearly scaled axes), or 'log' (logarithmically scaled axes).

Sample configuration:

```yaml
visualize:
- type: dump_markdown_results_table
  out_file: "-"
  scatterplot: linear
  extra_columns:
    runtime:
    - column_name: ratio
      text: >
        lambda b: str(b["variant_2"]/b["variant_1"])
        if b["variant_2"] < (1.5 * b["variant_1"])
        else "**" + str(b["variant_2"]/b["variant_1"]) + "**"
    success:
    - column_name: change
      text: >
        lambda b: "" if b["variant_2"] == b["variant_1"]
        else "newly passing" if b["variant_2"]
        else "regressed"
```

Example output:

```
## runtime

| Benchmark | variant_1 | variant_2 | ratio |
| --- | --- | --- | --- |
| bench_1 | 5 | 10 | **2.0** |
| bench_2 | 10 | 5 | 0.5 |

## success

| Benchmark | variant_1 | variant_2 | change |
| --- | --- | --- | --- |
| bench_1 | True | True | |
| bench_2 | True | False | regressed |
| bench_3 | False | True | newly passing |
```

### dump_yaml

Print the YAML-formatted results to a file.

The 'out_file' key is mandatory; specify '-' to print to stdout.

Sample configuration:

```yaml
visualize:
- type: dump_yaml
  out_file: '-'
```

### error_on_regression

Terminate benchcomp with a return code of 1 if any benchmark regressed.

This visualization checks whether any benchmark regressed from one variant to another. Sample configuration:

```yaml
visualize:
- type: error_on_regression
  variant_pairs:
  - [variant_1, variant_2]
  - [variant_1, variant_3]
  checks:
  - metric: runtime
    test: "lambda old, new: new / old > 1.1"
  - metric: passed
    test: "lambda old, new: False if not old else not new"
```

This configuration says to check whether any benchmark regressed when run under variant_2 compared to variant_1. A benchmark is considered to have regressed if the value of the 'runtime' metric under variant_2 is more than 10% higher than the value under variant_1. The benchmark is also considered to have regressed if it was previously passing but is now failing. The same checks are performed on all benchmarks run under variant_3 compared to variant_1. If any of these lambda functions returns True, benchcomp terminates with a return code of 1.

### run_command

Run an executable command, passing the performance metrics as JSON on stdin.

This allows you to write your own visualization, which reads a result file on stdin and does something with it, e.g. writing out a graph or other output file.

Sample configuration:

```yaml
visualize:
- type: run_command
  command: ./my_visualization.py
```
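
For instance, `my_visualization.py` might look something like the following sketch. Beyond receiving the metrics as JSON on stdin, everything here (the `benchmarks` key and the `report.txt` output file) is an illustrative assumption.

```python
#!/usr/bin/env python3
# Illustrative run_command visualization: benchcomp passes the
# performance metrics as JSON on stdin; what the script does with
# them is entirely up to you.

import json
import sys


def main():
    results = json.load(sys.stdin)

    # Hypothetical output: write one line per benchmark to a report
    # file. The top-level "benchmarks" key is an assumption about the
    # shape of the results.
    with open("report.txt", "w") as handle:
        for name in sorted(results.get("benchmarks", {})):
            handle.write(f"{name}\n")


if __name__ == "__main__":
    main()
```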