- **Feature Name:** Source-based code coverage (`source-coverage`)
- **Feature Request Issue:** https://github.com/model-checking/kani/issues/2640
- **RFC PR:** https://github.com/model-checking/kani/pull/3143
- **Status:** Under Review
- **Version:** 1
- **Proof-of-concept:** https://github.com/model-checking/kani/pull/3119 (Kani) + https://github.com/model-checking/kani/pull/3121 (`kani-cov`)
## Summary
A source-based code coverage feature for Kani built on top of Rust's coverage instrumentation.
## User Impact
In our first attempt to add a coverage feature fully managed by Kani, we introduced and made available a line coverage option (see the RFC: Line coverage for more details). This option has since allowed us to gather more data on the expectations for a coverage feature in Kani.

For example, the line coverage output we produced was not easy to interpret without knowing some implementation details. Aside from that, the feature requested in #2795 alludes to the need to provide coverage-specific tooling in Kani. Moreover, as captured in #2640, source-based coverage results provide the clearest and most precise coverage information.
In this RFC, we propose an integration with Rust's source-based code coverage instrumentation. This integration would allow us to report source-based code coverage results from Kani. Also, we propose adding a new user-facing, coverage-focused tool called `kani-cov`. The tool would allow users to process coverage results generated by Kani and produce coverage artifacts such as summaries and reports according to their preferences. In the next section, we will explain in more detail how we expect `kani-cov` to assist with coverage-related tasks.

With these changes, we expect our coverage options to become more flexible, precise and efficient. These options are expected to replace the previous options available through the line coverage feature. In the last section of this RFC, we will also discuss the requirements for a potential integration of this coverage feature with the LLVM toolchain.
## User Experience
The proposed experience is partially inspired by that of the most popular coverage frameworks. First, let us delve into the LLVM coverage workflow, followed by an explanation of our proposal.
### The LLVM code coverage workflow
The LLVM project is home to one of the most popular code coverage frameworks. The workflow associated with the LLVM framework is described in the documentation for source-based code coverage[^1], but we briefly describe it here to better relate it to our proposal.
In short, the LLVM code coverage workflow follows three steps:
1. Compiling with coverage enabled. This causes the compiler to generate an instrumented program.
2. Running the instrumented program. This generates binary-encoded `.profraw` files.
3. Using tools to aggregate and export coverage information into other formats.
When working in a `cargo` project, step 1 can be done through this command:

```
RUSTFLAGS='-Cinstrument-coverage' cargo build
```
The same flag must be used for step 2:

```
RUSTFLAGS='-Cinstrument-coverage' cargo run
```
This should populate the directory with at least one `.profraw` file. Each `.profraw` file corresponds to a specific source code file in your project.
At this point, we will have produced the artifacts that we generally require for the LLVM tools:
- The instrumented binary, which, in addition to the instrumented program, contains additional information (e.g., the coverage mappings) required to interpret the profiling results.
- The `.profraw` files, which essentially include the profiling results (e.g., counter values) for each function of the corresponding source code file.
For step 3, the commands will depend on what kind of results we want. Most likely, we will have to merge the `.profraw` files and produce a `.profdata` file as follows:

```
llvm-profdata merge -sparse *.profraw -o output.profdata
```

The resulting `.profdata` file will contain the aggregated coverage results from the `.profraw` files passed to the `merge` command.
Then, we can use a command such as

```
llvm-cov show target/debug/binary --instr-profile=output.profdata --show-line-counts-or-regions
```

to visualize the code coverage results in the terminal, or the command

```
llvm-cov report target/debug/binary --instr-profile=output.profdata --show-region-summary
```

to produce coverage summaries like this:
```
Filename                                           Regions  Missed Regions  Cover  Functions  Missed Functions  Executed  Lines  Missed Lines  Cover  Branches  Missed Branches  Cover
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
/long/long/path/to/my/project/binary/src/main.rs         9               3  66.67%         3                 1    66.67%     14             4  71.43%        0                0      -
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                                                     9               3  66.67%         3                 1    66.67%     14             4  71.43%        0                0      -
```
[^1]: The LLVM project refers to their own coverage feature as source-based code coverage. It is not rare to see the term region coverage being used instead to refer to the same thing. That is because LLVM's source-based code coverage feature can report coverage for code regions, but other coverage frameworks do not support the concept of code regions.
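Region coverage is strictly more informative than line coverage when several regions share a line. A minimal example of our own (not taken from the LLVM documentation) that shows the difference:

```rust
fn classify(x: u32) -> &'static str {
    // The whole `if` fits on one line, so line coverage can only say the line
    // was executed. Region coverage tracks the `"big"` and `"small"` branches
    // as separate regions, so it can report that the `else` region was never taken.
    if x > 10 { "big" } else { "small" }
}

fn main() {
    // Only the `then` region is exercised by this call.
    assert_eq!(classify(42), "big");
}
```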
### The Kani coverage workflow
The two main components of the Kani coverage workflow that we propose are the following:
- The existing `--coverage` flag, which drives the coverage workflow in Kani, emits raw coverage data (as in the `.profraw` files), and produces basic coverage results by default.
- A new subcommand `cov` that allows users to further process the raw coverage information emitted by Kani to produce customized coverage results (i.e., different from the ones produced by default with the `--coverage` option). The `cov` subcommand is an alias for the `kani-cov` tool.
In contrast to the LLVM workflow, where human-readable coverage results can be produced only after a sequence of LLVM tool commands, we provide some coverage results by default. This aligns better with our UX philosophy, and removes the need for a wrapper around our coverage features like `cargo-llvm-cov`.
Alternatively, the `cov` subcommand offers the ability to produce more specific coverage results if needed. We anticipate the `cov` subcommand being particularly useful in less standard project setups, giving users the flexibility required to produce coverage results tailored to their specific needs.
In the following, we describe each one of these components in more detail.
#### The `--coverage` option
The default coverage workflow will be kicked off through the unstable `--coverage` option:

```
cargo kani --coverage -Zsource-coverage
```
The main difference with respect to the regular verification workflow is that, at the end of the verification-based coverage run, Kani will generate two coverage results:
- A coverage summary corresponding to the coverage achieved by the harnesses included in the verification run. This summary will be printed after the verification output.
- A coverage report corresponding to the coverage achieved by the harnesses included in the verification run. The report will be placed in the same target directory where the raw coverage files are put. The path to the report will also be printed after the verification output.
Therefore, a typical `--coverage` run could look like this:

```
VERIFICATION:- SUCCESSFUL

Coverage Results:

| Filename | Regions | Missed Regions | Cover | Functions | Missed Functions | Cover |
| -------- | ------- | -------------- | ----- | --------- | ---------------- | ----- |
| main.rs  |    9    |       3        | 66.67 |     3     |        1         | 33.33 |
| file.rs  |   11    |       5        | 45.45 |     2     |        1         | 50.00 |

Coverage report available in target/kani/x86_64-unknown-linux-gnu/cov/kani_2024-04-26_15-30-00/report/index.html
```
#### The `cov` subcommand
The cov
subcommand will be used to process raw coverage information generated
by Kani and produce coverage outputs as indicated by the user.
Hence, the cov
subcommand corresponds to the set of LLVM tools
(llvm-profdata
, llvm-cov
, etc.) that are used to produce coverage outputs
through the LLVM coverage workflow.
In contrast to LLVM, we will have a single subcommand for all Kani coverage-related needs. The `cov` subcommand will just call the `kani-cov` tool, which is expected to be shipped along with the rest of the Kani binaries.
We suggest that the subcommand initially offers two options:
- An option to merge the coverage results from one or more files and coverage mappings[^2] into a single file.
- An option to produce coverage outputs from coverage results, including summaries or coverage reports in human-readable formats (e.g., HTML).
Let's assume that we have run `cargo kani --coverage -Zsource-coverage` and generated coverage files in the `my-coverage` folder. Then, we would use `cargo kani cov` as follows to combine the coverage results[^3] for all harnesses:

```
cargo kani cov --merge my-coverage/*.kaniraw -o my-coverage.kanicov
```
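Conceptually, merging verification-based coverage results is simpler than merging runtime profiles: a region counts as covered if at least one harness reaches it. A minimal sketch of that idea in Rust (our own illustration; this is not how `kani-cov` is required to implement merging, and the key and value types are assumptions):

```rust
use std::collections::HashMap;

/// Illustrative key for a code region: (function name, counter id).
type RegionKey = (String, u32);

/// Combine per-harness reachability results: a region is covered overall
/// if any harness covered it.
fn merge(per_harness: &[HashMap<RegionKey, bool>]) -> HashMap<RegionKey, bool> {
    let mut merged: HashMap<RegionKey, bool> = HashMap::new();
    for results in per_harness {
        for (region, covered) in results {
            let entry = merged.entry(region.clone()).or_insert(false);
            *entry = *entry || *covered;
        }
    }
    merged
}

fn main() {
    let harness_a = HashMap::from([(("main".to_string(), 0), true)]);
    let harness_b = HashMap::from([(("main".to_string(), 0), false)]);
    let merged = merge(&[harness_a, harness_b]);
    assert!(merged[&("main".to_string(), 0)]);
}
```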
Let's say the user is first interested in reading a coverage summary through the terminal. They can use the `--summary` option for that:

```
cargo kani cov --summary my-coverage/default.kanimap -instr-profile=my-coverage.kanicov
```
The command could print a coverage summary like:
```
| Filename | Regions | Missed Regions | Cover | Functions | ...
| -------- | ------- | -------------- | ----- | --------- | ...
| main.rs  |    9    |       3        | 66.67 |     3     | ...
[...]
```
Now, let's say the user wants to produce an HTML report of the coverage results.
They will have to use the `--report` option for that:

```
cargo kani cov --report my-coverage/default.kanimap -format=html -instr-profile=my-coverage.kanicov -o coverage-report
```
This time, the command will generate a `coverage-report` folder including a browsable HTML webpage that highlights the regions covered in the source according to the coverage results in `my-coverage.kanicov`.
The `llvm-cov` tool includes the option `gcov` to export into GCC's coverage format Gcov, and the option `export` to export into the LCOV format. These may be good options to consider for `kani-cov` in the future, but we should focus on basic formats for now.
[^3]: Options to exclude certain coverage results (e.g., from the standard library) will likely be part of this option.
[^2]: Coverage mappings essentially provide a snapshot of the source code regions for items that otherwise are unreachable or have been sliced away during the compilation process.
### Integration with the Kani VS Code Extension
We will update the coverage feature of the Kani VS Code Extension to follow this new coverage workflow.

In other words, the extension will first run Kani with the `--coverage` option and use `kani cov` to produce a `.kanicov` file with the coverage results. The extension will consume the source-based code coverage results and highlight region coverage in the source code seen from VS Code.
We could also consider other coverage-related features in order to enhance the experience through the Kani VS Code Extension. For example, we could automatically show the percentage of covered regions in the status bar by additionally extracting a summary of the coverage results.
Finally, we could also consider an integration with other code coverage tools. For example, if we wanted to integrate with the VS Code extensions Code Coverage or Coverage Gutters, we would only need to extend `kani-cov` to export coverage results to the LCOV format, or integrate Kani with LLVM tools as discussed in Integration with LLVM.
## Detailed Design
In this section, we provide more details on:
- The Rust coverage instrumentation and how it can be integrated into Kani to produce source-based code coverage results.
- The proposed coverage workflow to be run by default in Kani when the `--coverage` option is used.
This information is mostly intended as a reference for Kani contributors. The Rust coverage instrumentation is still under active development; because of that, Rust toolchain upgrades may result in breaking changes to our own coverage feature. This section should help developers understand the general approach and resolve such issues by themselves.
### The Rust coverage instrumentation
The Rust compiler includes two code coverage implementations:
- A source-based coverage implementation, which uses LLVM's coverage instrumentation to generate precise coverage data. This implementation can be enabled with `-C instrument-coverage`.
- A Gcov-based coverage implementation, which derives coverage data based on DebugInfo. This implementation can be enabled with `-Z profile`.
The Instrumentation-based Code Coverage chapter from the `rustc` book describes in detail how to enable and use the LLVM instrumentation-based coverage feature. In contrast, the LLVM Source-Based Code Coverage chapter from the `rustc` development guide documents how the LLVM coverage instrumentation is performed in the Rust compiler.
In this section, we will first summarize some information from the LLVM Source-Based Code Coverage chapter, limited to details which are relevant to the development of the source-based coverage feature in Kani. Then, we will explain how Kani taps into the Rust coverage instrumentation to perform its own coverage instrumentation and be able to report source-based code coverage results. This will also include mentions of current issues with this implementation, which we plan to further discuss in Future possibilities.
#### Understanding the Rust coverage instrumentation
The LLVM coverage instrumentation is implemented in the Rust compiler as a MIR pass called `InstrumentCoverage`. The MIR pass first builds a coverage-specific version of the MIR Control Flow Graph (CFG) from the MIR. The initial version of this CFG is based on the MIR's `BasicBlock`s, which is then refined by combining blocks that can be chained from a coverage-relevant point of view. The final version of the coverage CFG is then used to determine where to inject the `StatementKind::Coverage` statements, each measuring coverage for a single coverage span.
The injection of `StatementKind::Coverage` statements is the main result we are interested in for the integration with Kani. Additionally, the instrumentation will also attach the `FunctionCoverageInfo` structure to each function's body.[^5] This result is also needed at the moment because coverage statements do not include information on the code region they are supposed to cover. However, `FunctionCoverageInfo` contains the coverage mappings, which represent the relation between coverage counters and code regions.
As explained in MIR Pass: `InstrumentCoverage`, many coverage statements will not be converted into a physical counter[^6]. Instead, they will be converted into a coverage-counter expression that can be calculated based on other coverage counters. We highly recommend looking at the example in MIR Pass: `InstrumentCoverage` to better understand how this works. This optimization is mainly done for performance reasons because incrementing a physical counter causes a non-negligible overhead, especially within loops.
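As an illustration of the idea (ours; the compiler decides the actual counter placement, which may differ), consider how counters could be assigned in a simple branching function:

```rust
fn abs_diff(a: i32, b: i32) -> i32 {
    // Hypothetical counter placement:
    //   c1 = physical counter incremented in the `then` branch
    //   c2 = physical counter incremented in the `else` branch
    // The function exit does not need its own physical counter: its count can
    // be reconstructed afterwards as the counter expression `c1 + c2`.
    if a > b {
        a - b // counted by c1
    } else {
        b - a // counted by c2
    }
}
```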
The `Coverage` statements (i.e., `StatementKind::Coverage`) injected by the Rust coverage instrumentation contain a `CoverageKind` field indicating the type of coverage counter. The variant `CounterIncrement` represents physical counters, while `ExpressionUsed` represents the counter expressions that we just discussed. Other variants such as `SpanMarker` or `BlockMarker` are not relevant to this work since they should have been erased after the `InstrumentCoverage` pass.
[^5]: It is important to note that the StableMIR interface does not include `FunctionCoverageInfo` in function bodies. Because of that, we need to pull it from the internal `rustc` function bodies.
[^6]: By physical counter, we refer to a global program variable that is initialized to zero and incremented by one each time the execution passes through it.
#### Integrating the instrumentation into Kani
Now that we have explained what the Rust coverage instrumentation does at a high level, we should be ready to discuss how it can be used from Kani. Here, we will follow an approach where, during the codegen stage, we generate a Kani reachability check for each code region and, after the verification stage, we postprocess the information in those checks to generate the coverage information. This section is essentially a retelling of the implementation in #3119; we will discuss variations and extensions of this approach in the appropriate sections.
Clearly, the first step is adding `-C instrument-coverage` to the `rustc` flags we use when calling the compiler to codegen. This flag enables the Rust coverage instrumentation that we discussed earlier, resulting in:

- the injection of `Coverage` statements in the MIR code, and
- the inclusion of `FunctionCoverageInfo` in function bodies.
The next step is handling the `Coverage` statements from `codegen_statement`. Each `Coverage` statement contains opaque coverage information[^7] of the `CoverageKind` type, which can be processed to determine the type of coverage counter (`CounterIncrement` for physical counters, `ExpressionUsed` for counter expressions) and the ID number of the counter. These two pieces of information allow us to uniquely identify the counter within a given function. For example, `CounterIncrement(0)` would generally refer to the first physical counter in the function.
Unfortunately, the `CoverageKind` information does not tell us anything about the code region that the counter covers. However, data about the code region can be pulled from the coverage mappings included in the `FunctionCoverageInfo` that is attached to the (internal) function body. Note that the coverage mappings include information about all the coverage counters in a function, even for counters which have been dropped. Matching the `CoverageKind` information with that of the counters in the coverage mappings allows us to retrieve the code region for any counter.
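A simplified sketch of that matching step, using stand-in types rather than the actual `rustc`/StableMIR ones (the real `CoverageKind` and mapping structures live in the compiler and change across toolchains, so every name below is illustrative):

```rust
/// Stand-ins for the compiler's coverage types; names and fields are ours.
#[derive(Clone, Copy, PartialEq, Eq)]
enum CounterKind {
    CounterIncrement(u32), // physical counter with this id
    ExpressionUsed(u32),   // counter expression with this id
}

#[derive(Clone, Copy)]
struct CodeRegion {
    start: (u32, u32), // (line, column)
    end: (u32, u32),
}

/// One entry of a function's coverage mappings: a counter and the region it covers.
struct Mapping {
    counter: CounterKind,
    region: CodeRegion,
}

/// Look up the code region for the counter referenced by a `Coverage` statement.
fn region_for(counter: CounterKind, mappings: &[Mapping]) -> Option<CodeRegion> {
    mappings.iter().find(|m| m.counter == counter).map(|m| m.region)
}
```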
Using all this data, for each coverage statement[^8] we generate a coverage check that maintains the essence of the coverage checks described in the RFC for line coverage:
> Coverage checks are a new class of checks similar to `cover` checks. The main difference is that users cannot directly interact with coverage checks (i.e., they cannot add or remove them manually). Coverage checks are encoded as an `assert(false)` statement (to test reachability) with a fixed description. In addition, coverage checks are:
>
> - Hidden from verification results.
> - Postprocessed to produce coverage results.
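As a rough user-level analogy (ours; the actual coverage checks are injected during codegen and hidden from verification results), the reachability question they answer is the same one `kani::cover!` asks:

```rust
#[kani::proof]
fn reachability_analogy() {
    let val: u32 = kani::any();
    if val == 10 {
        // Satisfiable iff this region is reachable, which is what a coverage
        // check determines automatically for every code region.
        kani::cover!(true, "then region reachable");
    } else {
        kani::cover!(true, "else region reachable");
    }
}
```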
Therefore, the last step is to postprocess the results from coverage checks to produce coverage results. This is not too complicated to do since the checks already include the counter information (type + ID) and the function name in the check's description. If the span of the code region is also included (this is what #3119 is currently doing), we can directly generate a primitive output like this:
```
<file_path> (<function_name>)
 * <region_start> - <region_end> <status>
 * ...
 * <region_start> - <region_end> <status>
```
For example, for the test case in #3119 we report this:
```
src/main.rs (main)
 * 14:1 - 19:2 COVERED
src/main.rs (test_cov)
 * 5:1 - 6:15 COVERED
 * 6:19 - 6:28 UNCOVERED
 * 7:9 - 7:13 COVERED
 * 9:9 - 9:14 UNCOVERED
 * 11:1 - 11:2 COVERED
```
NOTE: This section has been written according to the implementation in #3119, which currently produces a text-based output like the one shown above. There is ongoing work to store the coverage mappings in a separate file (as described in the next section), which would save us the need to attach code region data to the coverage checks.
[^7]: The Rust compiler uses the `Opaque` type to prevent others from interfacing with unstable types (e.g., the `Coverage` type here). Nonetheless, this can be worked around by serializing its contents and parsing it back into an internal data type.
[^8]: We could follow an alternative approach where we do not instrument each coverage statement, but only those that correspond to physical counters. Unfortunately, doing so would lead to incorrect coverage results due to the arithmetic nature of expression-based counters. We elaborate on this topic in the later parts of this document.
### The default coverage workflow in Kani
In this section, we describe the default `--coverage` workflow from a developer's point of view. This will hopefully help developers understand how the different coverage components in Kani are connected. For example, we'll describe the raw coverage information that gets produced throughout the default `--coverage` workflow and define the basic `cov` commands that it will execute.
The main difference with respect to the regular verification workflow is that, at the end of the verification-based coverage run, Kani will generate two types of files:
- One single `.kanimap` file for the project. This file will contain the coverage mappings for the project's source code.
- One `.kaniraw` file for each harness. This file will contain the verification-based results for the coverage-oriented properties corresponding to a given harness.
Note that `.kaniraw` files correspond to `.profraw` files in the LLVM coverage workflow. Similarly, the `.kanimap` file corresponds to the coverage-related information that's embedded into the project's binaries in the LLVM coverage workflow.[^9]
The files will be written into a new timestamped directory associated with the coverage run. The path to this directory will be printed to standard output by default. For example, the draft implementation writes the coverage files into the `target/kani/<target_triple>/cov/` directory.
Users aren't expected to read the information in any of these files. Therefore, there's no need to restrict their format. The draft implementation uses the JSON format but we might consider using a binary format if it doesn't scale.
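For illustration only, the kind of data these files need to carry could be modeled with Rust types such as the following (our sketch; the draft implementation's actual JSON schema may differ):

```rust
/// Sketch of the per-harness raw results (a `.kaniraw` file): for each counter
/// in each function, whether verification found it reachable.
struct RawCoverageResult {
    harness: String,
    /// (function name, counter id) -> reachable?
    counters: Vec<((String, u32), bool)>,
}

/// Sketch of the project-wide coverage mappings (the `.kanimap` file): how each
/// counter relates to a source region, so results can be rendered on the source.
struct CoverageMapEntry {
    function: String,
    counter_id: u32,
    file: String,
    start: (u32, u32), // (line, column)
    end: (u32, u32),
}
```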
In addition, Kani will produce two types of coverage results:
- A coverage summary with the default options.
- A terminal-based coverage report with the default options. However, we will only do this if the program is composed of a single source file.[^10]
[^9]: Note that the `.kanimap` generation isn't implemented in #3119. The draft implementation of `kani-cov` simply reads the source files referred to by the code coverage checks, but it doesn't get information about code trimmed out by the MIR linker.
[^10]: In other words, standalone `kani` would always emit these terminal-based reports, but `cargo kani` would not unless the project contains a single Rust file (for example, `src/main.rs`).
## Rationale and alternatives
### Other coverage implementations
In a previous version of this feature, we used an ad-hoc coverage implementation. In addition to being very inefficient[^11], the line-based coverage results were not trivial for users to interpret. At the moment, the only other option in the Rust compiler is an unstable, GCC-compatible code coverage implementation based on the Gcov format. The Gcov format is line-based, so it's not able to report region coverage results. In other words, it's neither as advanced nor as precise as the source-based implementation.
[^11]: Actual performance benchmarks to follow in #3119.
## Open questions
- Do we want to instrument dependencies by default? Preliminary benchmarking results show a slowdown of 100% or greater. More evaluations are required to determine how we handle instrumentation for dependencies, and what options we might want to provide to users.
- How do we handle features/options for `kani-cov`? In particular, do we need more details in this RFC?
## Future possibilities
### Integration with LLVM
As part of this work, we explored a potential integration with the LLVM framework. The idea behind such an integration would essentially involve producing coverage results in formats compatible with the LLVM framework (e.g., the `.profraw` format). The main advantage of integrating with the LLVM framework in this way is that we would not need a tool like `kani-cov` to aggregate coverage results; we could just use LLVM tools such as `llvm-profdata` and `llvm-cov` to consume them.
However, at this time we recommend against integrating with LLVM for the following reasons:

- Generating the instrumented binary used in the LLVM coverage workflow requires a standard `rustc` compilation with `--cfg kani` in addition to other flags, including `-C instrument-coverage`. This is likely to result in compilation errors since the standard `rustc` backend cannot produce code for Kani APIs, for example.
- Producing the `.profraw` files requires executing the instrumented binary at least once. This would be an issue for Rust projects which assume a particular environment for their execution.
- There are no stable interfaces to create or modify files in formats compatible with the LLVM framework. Even though the documentation for the LLVM Code Coverage Mapping Format is excellent, the easiest way to interact with files in these formats is through LLVM tools (e.g., `llvm-cov`), which bring in many other LLVM dependencies. During our exploration, we attempted to decode and re-encode files in the `.profraw` format to set the counter data to the values obtained during verification. To this end, we tried tools like `llvm-profparser`, which can be used as a replacement for `llvm-profdata` and `llvm-cov`, but it failed to parse coverage files emitted by the Rust compiler (this is also related to the next point). Another crate that we used is `coverage-dump`, a recent tool in the Rust compiler used for testing purposes. `coverage-dump` extracts coverage mappings from LLVM IR assembly files (i.e., human-readable `*.ll` files) but does not work with the binary-encoded formats. Finally, we also built some ad-hoc tooling to perform these modifications, but it soon became evident that we would need to develop it further in order to handle any program.
- LLVM releases a new version approximately every six months. This would likely result in another "toolchain update" problem for Kani in order to provide compatibility with newer LLVM versions. Moreover, the Rust compiler supplies its own version of the LLVM tools (`rust-profdata`, `rust-cov`, etc.), which are fully compatible with coverage-related artifacts produced by `rustc`.
### Optimization with coverage-counter expressions
In the subsection related to the integration, we noted that we could follow an alternative approach where we only instrument coverage statements that correspond to physical counters. In fact, this would be the logical choice since the replacement of physical counters by expression-based counters would also be a performance optimization for us.
However, the expressions used in expression-based counters are built with the arithmetic operators `Add` (`+`) and `Sub` (`-`). On the other hand, the coverage checks performed by Kani have a boolean meaning: you either cover a region or you do not. Thus, there are many cases where these two notions of coverage counters are incompatible. For example, let's say we have this function:
```rust
fn check_value(val: u32) {
    if val == VALUE {
        do_this();
    } else {
        do_that();
    }
    do_something_else();
}
```
One way to optimize the counters in this function is to have two physical counters for the branches of the `if` statement (`c1` and `c2`), and then an expression-based counter associated with the `do_something_else()` statement adding those (i.e., `c3 = c1 + c2`). If we have, for example, two executions of this program, with each one taking a different branch, then the results for the coverage counters will be `c1 = 1`, `c2 = 1` and `c3 = c1 + c2 = 2`.
But what does `c3 = 2` mean in the context of a verification-based coverage result? That is not clear. For instance, in a Kani harness, `val` could be a nondeterministic value that can satisfy both `val == VALUE` and its negation at the same time. This would result in the same counters (`c1 = 1`, `c2 = 1` and `c3 = 2`), but the program is being run only once!
Note that finding a verification-based replacement for the runtime operators in counter-based expressions is an interesting research topic. If we could establish a relation between the runtime and verification expressions, then we could avoid the instrumentation of coverage checks for expression-based counters. For example, could we replace the `Add` operator (`+`) with an `Or` operator (`||`)? Intuitively, this makes sense since verification-based coverage counters are binary. It also seems to work for our example, since covering any of the branches should result in the `do_something_else()` statement being covered as well, with the counter values now being `c1 = 1`, `c2 = 1` and `c3 = 1`.
However, it is not clear that this would work for all cases, nor is it clear that we can replace `Sub` with another verification-based operator.
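For concreteness, here is a small sketch (ours, not an implemented transformation) contrasting the two readings of the exit counter for `check_value` when verification finds both branches reachable:

```rust
fn main() {
    // Verification result: both branch regions are reachable.
    let c1 = true; // `do_this()` region reachable
    let c2 = true; // `do_that()` region reachable

    // Runtime reading: counts add up, which suggests "two executions" and has
    // no clear meaning for a single verification run.
    let runtime_c3 = (c1 as u32) + (c2 as u32);

    // Verification reading: the exit region is reachable if either branch is.
    let verification_c3 = c1 || c2;

    println!("runtime: c3 = {runtime_c3}, verification: c3 = {verification_c3}");
}
```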
with another verification-based operator.