# `benchcomp` configuration file

`benchcomp`'s operation is controlled through a YAML file---`benchcomp.yaml` by default, or a file passed to the `-c/--config` option. This page lists the different visualizations that are available.
## Variants

A variant is a single invocation of a benchmark suite. Benchcomp runs several variants so that their performance can be compared later. A variant consists of a command line, a working directory, and an environment. Benchcomp invokes the command using the operating system environment, updated with the keys and values in `env`. If any values in `env` contain strings of the form `${var}`, Benchcomp expands them to the value of the environment variable `$var`.
```yaml
variants:
  variant_1:
    config:
      command_line: echo "Hello, world"
      directory: /tmp
      env:
        PATH: /my/local/directory:${PATH}
```
## Filters

After benchcomp has finished parsing the results, it writes them to `results.yaml` by default. Before visualizing the results (see below), benchcomp can filter them by piping them into an external program. To filter results before visualizing them, add a `filters` list to the configuration file:
```yaml
filters:
  - command_line: ./scripts/remove-redundant-results.py
  - command_line: cat
```
The value of `filters` is a list of dicts. Currently the only legal key for each of the dicts is `command_line`. Benchcomp invokes each `command_line` in order, passing the results as a JSON file on stdin, and interprets the stdout as a YAML-formatted modified set of results. Filter scripts can emit either YAML (which might be more readable while developing the script) or JSON (which benchcomp will parse as a subset of YAML).
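As an illustration, here is a minimal sketch of a filter script in the spirit of the sample configuration above. The only contract benchcomp imposes is reading JSON on stdin and writing YAML or JSON on stdout; the `benchmarks` key accessed below is an assumption about the shape of the results and may need adapting.

```python
#!/usr/bin/env python3
# Minimal benchcomp filter sketch: read the results as JSON on stdin,
# transform them, and write them back out as JSON (a subset of YAML).
# The "benchmarks" key is assumed for illustration only.

import json
import sys


def main():
    results = json.load(sys.stdin)

    # Example transformation: drop any benchmark whose name starts with "tmp_".
    benchmarks = results.get("benchmarks", {})
    results["benchmarks"] = {
        name: data
        for name, data in benchmarks.items()
        if not name.startswith("tmp_")
    }

    json.dump(results, sys.stdout, indent=2)


if __name__ == "__main__":
    main()
```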
## Built-in visualizations

The following visualizations are available; these can be added to the `visualize` list of `benchcomp.yaml`. Detailed documentation for these visualizations follows.
### Environment

The core component of Jinja is the `Environment`. It contains important shared variables like configuration, filters, tests, globals and others. Instances of this class may be modified if they are not shared and if no template has been loaded so far. Modifying an environment after the first template has been loaded will lead to surprising effects and undefined behavior.
Here are the possible initialization parameters:
`block_start_string`
The string marking the beginning of a block. Defaults to ``'{%'``.
`block_end_string`
The string marking the end of a block. Defaults to ``'%}'``.
`variable_start_string`
The string marking the beginning of a print statement.
Defaults to ``'{{'``.
`variable_end_string`
The string marking the end of a print statement. Defaults to
``'}}'``.
`comment_start_string`
The string marking the beginning of a comment. Defaults to ``'{#'``.
`comment_end_string`
The string marking the end of a comment. Defaults to ``'#}'``.
`line_statement_prefix`
If given and a string, this will be used as prefix for line based
statements. See also :ref:`line-statements`.
`line_comment_prefix`
If given and a string, this will be used as prefix for line based
comments. See also :ref:`line-statements`.
.. versionadded:: 2.2
`trim_blocks`
If this is set to ``True`` the first newline after a block is
removed (block, not variable tag!). Defaults to `False`.
`lstrip_blocks`
If this is set to ``True`` leading spaces and tabs are stripped
from the start of a line to a block. Defaults to `False`.
`newline_sequence`
The sequence that starts a newline. Must be one of ``'\r'``,
``'\n'`` or ``'\r\n'``. The default is ``'\n'`` which is a
useful default for Linux and OS X systems as well as web
applications.
`keep_trailing_newline`
Preserve the trailing newline when rendering templates.
The default is ``False``, which causes a single newline,
if present, to be stripped from the end of the template.
.. versionadded:: 2.7
`extensions`
List of Jinja extensions to use. This can either be import paths
as strings or extension classes. For more information have a
look at :ref:`the extensions documentation <jinja-extensions>`.
`optimized`
Should the optimizer be enabled? Defaults to ``True``.
`undefined`
:class:`Undefined` or a subclass of it that is used to represent
undefined values in the template.
`finalize`
A callable that can be used to process the result of a variable
expression before it is output. For example one can convert
``None`` implicitly into an empty string here.
`autoescape`
If set to ``True`` the XML/HTML autoescaping feature is enabled by
default. For more details about autoescaping see
:class:`~markupsafe.Markup`. As of Jinja 2.4 this can also
be a callable that is passed the template name and has to
return ``True`` or ``False`` depending on whether autoescape should
be enabled by default.
.. versionchanged:: 2.4
`autoescape` can now be a function
`loader`
The template loader for this environment.
`cache_size`
The size of the cache. By default this is ``400``, which means
that if more than 400 templates are loaded the loader will clean
out the least recently used template. If the cache size is set to
``0`` templates are recompiled all the time, if the cache size is
``-1`` the cache will not be cleaned.
.. versionchanged:: 2.8
The cache size was increased to 400 from a low 50.
`auto_reload`
Some loaders load templates from locations where the template
sources may change (i.e. the file system or a database). If
``auto_reload`` is set to ``True`` (default) every time a template is
requested the loader checks if the source changed and if yes, it
will reload the template. For higher performance it's possible to
disable that.
`bytecode_cache`
If set to a bytecode cache object, this object will provide a
cache for the internal Jinja bytecode so that templates don't
have to be parsed if they were not changed.
See :ref:`bytecode-cache` for more information.
`enable_async`
If set to ``True``, this enables async template execution, which
allows using async functions and generators.
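As a sketch of how a few of these parameters fit together, the following constructs an `Environment` with a file-system loader; the directory and template names are illustrative assumptions.

```python
from jinja2 import Environment, FileSystemLoader, StrictUndefined

# Configure the environment before loading any templates; modifying it after
# the first template has been loaded leads to undefined behavior (see above).
env = Environment(
    loader=FileSystemLoader("templates"),  # "templates" is a hypothetical directory
    autoescape=True,                       # enable XML/HTML autoescaping
    trim_blocks=True,                      # strip the first newline after a block tag
    lstrip_blocks=True,                    # strip leading whitespace before a block tag
    undefined=StrictUndefined,             # fail loudly on undefined variables
)

template = env.get_template("report.md.jinja")  # hypothetical template name
print(template.render(title="Results"))
```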
### Plot

Scatterplot configuration options.
### dump_markdown_results_table

Print Markdown-formatted tables displaying benchmark results.

For each metric, this visualization prints out a table of benchmarks, showing the value of the metric for each variant, combined with an optional scatterplot.

The `out_file` key is mandatory; specify `-` to print to stdout.

`extra_columns` can be an empty dict. The sample configuration below assumes that each benchmark result has a `success` and a `runtime` metric for both variants, `variant_1` and `variant_2`. It adds a `ratio` column to the table for the `runtime` metric, and a `change` column to the table for the `success` metric. The `text` lambda is called once for each benchmark. It accepts a single argument---a dict that maps variant names to the value of that variant for a particular metric---and returns a string that is rendered in the benchmark's row in the new column. This allows you to emit arbitrary text or Markdown formatting in response to particular combinations of values for different variants, such as regressions or performance improvements.

`scatterplot` takes the values `off` (default), `linear` (linearly scaled axes), or `log` (logarithmically scaled axes).
Sample configuration:
```yaml
visualize:
- type: dump_markdown_results_table
  out_file: "-"
  scatterplot: linear
  extra_columns:
    runtime:
    - column_name: ratio
      text: >
        lambda b: str(b["variant_2"]/b["variant_1"])
        if b["variant_2"] < (1.5 * b["variant_1"])
        else "**" + str(b["variant_2"]/b["variant_1"]) + "**"
    success:
    - column_name: change
      text: >
        lambda b: "" if b["variant_2"] == b["variant_1"]
        else "newly passing" if b["variant_2"]
        else "regressed"
```
Example output:

```markdown
## runtime

| Benchmark | variant_1 | variant_2 | ratio |
| --- | --- | --- | --- |
| bench_1 | 5 | 10 | **2.0** |
| bench_2 | 10 | 5 | 0.5 |

## success

| Benchmark | variant_1 | variant_2 | change |
| --- | --- | --- | --- |
| bench_1 | True | True | |
| bench_2 | True | False | regressed |
| bench_3 | False | True | newly passing |
```
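To make the lambda mechanics concrete, here is a small standalone sketch that evaluates the `ratio` column's `text` lambda from the sample configuration against the values bench_1 has in the example output. Benchcomp performs this evaluation internally; the snippet only illustrates the dict-in, string-out contract.

```python
# The `ratio` text lambda from the sample configuration above.
text = lambda b: (
    str(b["variant_2"] / b["variant_1"])
    if b["variant_2"] < (1.5 * b["variant_1"])
    else "**" + str(b["variant_2"] / b["variant_1"]) + "**"
)

# bench_1's runtime values under each variant, as in the example output.
cell = text({"variant_1": 5, "variant_2": 10})
print(cell)  # prints "**2.0**", the bolded ratio cell for bench_1
```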
### dump_yaml

Print the YAML-formatted results to a file.

The `out_file` key is mandatory; specify `-` to print to stdout.

Sample configuration:

```yaml
visualize:
- type: dump_yaml
  out_file: '-'
```
### error_on_regression

Terminate benchcomp with a return code of 1 if any benchmark regressed.

This visualization checks whether any benchmark regressed from one variant to another. Sample configuration:

```yaml
visualize:
- type: error_on_regression
  variant_pairs:
  - [variant_1, variant_2]
  - [variant_1, variant_3]
  checks:
  - metric: runtime
    test: "lambda old, new: new / old > 1.1"
  - metric: passed
    test: "lambda old, new: False if not old else not new"
```
This says to check whether any benchmark regressed when run under variant_2 compared to variant_1. A benchmark is considered to have regressed if the value of the `runtime` metric under variant_2 is more than 10% higher than the value under variant_1. The benchmark is also considered to have regressed if it was previously passing but is now failing. These same checks are performed on all benchmarks run under variant_3 compared to variant_1. If any of these lambda functions returns True, benchcomp terminates with a return code of 1.
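As an illustration of how such a check behaves, the following standalone snippet evaluates the `runtime` test from the configuration above against hypothetical old and new values; benchcomp performs the equivalent evaluation internally for every benchmark and variant pair.

```python
# The runtime check from the sample configuration above: a regression is a
# new value more than 10% higher than the old one.
test = lambda old, new: new / old > 1.1

print(test(10.0, 10.5))  # False: within the 10% tolerance, not a regression
print(test(10.0, 12.0))  # True: regressed, so benchcomp would exit with code 1
```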
### run_command

Run an executable command, passing the performance metrics as JSON on stdin.

This allows you to write your own visualization, which reads a result file on stdin and does something with it, e.g. writing out a graph or other output file.

Sample configuration:

```yaml
visualize:
- type: run_command
  command: ./my_visualization.py
```
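A custom visualization along the lines of `./my_visualization.py` might look like the sketch below. The only contract is that the script receives the metrics as JSON on stdin; the `benchmarks` key and the output file name are assumptions for illustration.

```python
#!/usr/bin/env python3
# Sketch of a custom visualization for run_command: benchcomp passes the
# performance metrics as JSON on stdin, and the script may do anything with
# them. The "benchmarks" key and "summary.txt" output are illustrative.

import json
import sys


def main():
    results = json.load(sys.stdin)

    # Example: write a one-line summary per benchmark to a text file.
    with open("summary.txt", "w") as handle:
        for name, data in results.get("benchmarks", {}).items():
            handle.write(f"{name}: {data}\n")


if __name__ == "__main__":
    main()
```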