bazel aquery is a new bazel command that queries the action graph, and thus allows you to gain insights about the actions executed in a build (inputs, outputs, command line, …).
aquery’s API is now stable and supported by the Bazel team.
When performing a
bazel build, you may find yourself wondering about the under-the-hood details of the build process, particularly about the actions executed:
"What was the exact command line that produced this file?"
"Did the new change in my rule implementation affect the actions previously generated by the rule?"
"Which actions have file X as an input?"
Those are some of the questions which can be answered with aquery.
The aquery command allows you to query for actions to be executed in your build. It operates on the post-analysis action graph and exposes information about actions, artifacts and their relationships.
An example usage of
aquery can be found in the Bazel issue #6861, where we are migrating legacy CROSSTOOL fields. In this case, Bazel users would run a migration tool, and then use
aquery to verify that the migration tool works properly, in particular:
- The actions generated while building the same target before & after running the migration tool are the same.
- The command lines run for each action are the same.
The specific usage is implemented in the aquery_differ tool. This also serves as an example of how tools can be built on top of aquery.
Background & Motivation
Apart from providing the ability to build & test your projects, Bazel also offers insights into how those processes happen with query and cquery. These existing tools have been very helpful in answering questions about the dependencies of targets in your Bazel project.
The Bazel build process consists of 3 phases [1]: loading, analysis and execution.
query operates on the post-loading phase target graph, which makes it unaware of the configurations of these targets.
cquery goes one step further down the build process and queries the post-analysis configured target graph, and thus includes the actual configurations.
The topology of the configured target graph closely resembles the dependency graph of targets established by the BUILD files. It offers information on the dependency between targets in a build, but not on the actual build actions that will be run to execute that build. To gain insights on the exact actions executed in a build, we have to go one level deeper, to the action graph.
aquery runs on the configured target graph and queries the action graph. The action graph [2] is the result of the analysis phase. It is a bipartite graph with the following types of nodes:
- Artifacts: either a source file or any output file produced by an action
- Actions: the functional step that takes a list of artifacts as input and outputs a list of artifacts. Note that any (output) artifact is produced by exactly one action.
The action graph conveys explicit step-by-step instructions on how the build would be executed. With aquery, it is now possible to tap into that knowledge.
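To make the bipartite structure concrete, here is a minimal Python sketch of such a graph. It is a toy model and not Bazel's implementation: the class names, the `producer` map and the file paths are all hypothetical, chosen only to illustrate the rule that every output artifact has exactly one producing action.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    path: str


@dataclass
class Action:
    mnemonic: str
    inputs: list
    outputs: list


class ActionGraph:
    """Toy bipartite graph: actions consume and produce artifacts."""

    def __init__(self):
        self.actions = []
        self.producer = {}  # output Artifact -> the single Action that creates it

    def add(self, action):
        # Enforce "any output artifact is produced by exactly one action".
        for out in action.outputs:
            if out in self.producer:
                raise ValueError(f"{out.path} already has a producing action")
        for out in action.outputs:
            self.producer[out] = action
        self.actions.append(action)

    def is_source(self, artifact):
        # In this model, a source file is an artifact that no action produces.
        return artifact not in self.producer


# Hypothetical paths, for illustration only.
src = Artifact("pkg/main.cc")
obj = Artifact("bazel-out/pkg/main.o")
binary = Artifact("bazel-out/pkg/main")

graph = ActionGraph()
graph.add(Action("CppCompile", [src], [obj]))
graph.add(Action("CppLink", [obj], [binary]))

print(graph.is_source(src))             # True
print(graph.producer[binary].mnemonic)  # CppLink
```

Walking `producer` backwards from a final output gives the chain of steps needed to build it, which is the kind of knowledge the action graph encodes.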
How To Use
aquery is useful when we are interested in the properties of the actions/artifacts in the action graph. It uses the same query language as
cquery, with some additional
aquery-specific functions. The basic structure of
aquery output is as follows:
$ bazel aquery '//some:label'
action 'Writing file some_file_name'
  Mnemonic: ...
  Target: ...
  Configuration: ...
  ActionKey: ...
  Inputs: [...]
  Outputs: [...]
  ...
Each action entry encapsulates all the information you need to know about how this action is to be executed: the actual commands run, the configuration in which the action is run, its input/output artifacts, and other attributes.
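If you want to inspect such entries programmatically, a rough sketch of a parser could look like the following. It assumes only the layout shown in the example (an `action '...'` line followed by `Key: value` lines); the function name and the sample values are hypothetical, and real output may contain fields this sketch does not handle.

```python
def parse_aquery_text(text):
    """Parse 'action ...' entries followed by 'Key: value' lines into dicts."""
    actions = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("action "):
            # Start a new entry; drop the surrounding single quotes.
            current = {"description": line[len("action "):].strip("'")}
            actions.append(current)
        elif current is not None and ": " in line:
            key, value = line.split(": ", 1)
            current[key] = value
    return actions


# Hypothetical sample, shaped like the entry above.
sample = """\
action 'Writing file some_file_name'
  Mnemonic: FileWrite
  Target: //some:label
  Inputs: []
  Outputs: [bazel-out/some_file_name]
"""

parsed = parse_aquery_text(sample)
print(parsed[0]["Mnemonic"])  # FileWrite
```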
Another nifty feature in
aquery is the ability to filter the actions based on their inputs, outputs and mnemonics. This is useful to answer questions like: “Which action, from which target, is responsible for creating file foo.out”.
# List all actions generated while building all dependencies of //src/target_a
$ bazel aquery 'deps(//src/target_a)'

# List all actions generated while building all dependencies of //src/target_a
# that have C++ files in their inputs.
$ bazel aquery 'inputs(".*cc", deps(//src/target_a))'

# Which action generated `foo.out` after building all dependencies of target //src/target_a
$ bazel aquery 'outputs(".*foo.out", deps(//src/target_a))'
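Conceptually, these filters keep only the actions whose input or output paths (or mnemonic) match a regular expression. The Python sketch below mimics that idea over a hand-written list of actions. The data and helper names are hypothetical, and it assumes the pattern must match the whole path, which would explain why the patterns above begin with `.*`.

```python
import re

# Hypothetical actions, each reduced to the fields the filters look at.
ACTIONS = [
    {"target": "//src/lib_b", "mnemonic": "CppCompile",
     "inputs": ["src/lib_b.cc", "src/lib_b.h"],
     "outputs": ["bazel-out/lib_b.o"]},
    {"target": "//src/target_a", "mnemonic": "Genrule",
     "inputs": ["src/gen.sh"],
     "outputs": ["bazel-out/foo.out"]},
]


def by_files(actions, field, pattern):
    """Keep actions where any path in `field` fully matches `pattern`."""
    regex = re.compile(pattern)
    return [a for a in actions if any(regex.fullmatch(p) for p in a[field])]


def by_mnemonic(actions, pattern):
    """Keep actions whose mnemonic fully matches `pattern`."""
    regex = re.compile(pattern)
    return [a for a in actions if regex.fullmatch(a["mnemonic"])]


cc_users = by_files(ACTIONS, "inputs", r".*cc")
foo_producers = by_files(ACTIONS, "outputs", r".*foo.out")

print([a["target"] for a in cc_users])
# ['//src/lib_b']
print(foo_producers[0]["target"], foo_producers[0]["mnemonic"])
# //src/target_a Genrule
```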
Apart from these basic features,
aquery offers customizations for your specific use cases with its various flags and tools.
aquery supports 3 different output formats:
text (default, human-readable with formatting),
proto (the protocol buffer output),
textproto (a human-readable representation of the proto output).
A common use case of
aquery is to find the action responsible for generating a particular file
foo.out. However, it is often the case that multiple build commands for different targets were run prior to the query. Imagine the following sequence:
bazel build //target_a
bazel build //target_b
One could run
bazel aquery 'outputs("foo.out", //target_a)' and
bazel aquery 'outputs("foo.out", //target_b)' to figure out the action responsible for creating
foo.out, and in turn the target. However, the number of different targets previously built can be larger than 2, which makes running multiple
aquery commands a hassle.
As an alternative, the
--skyframe_state flag can be used:
# Find all actions in Skyframe that have "foo.out" as an output
$ bazel aquery --skyframe_state --output=proto 'outputs(".*foo.out")'
In --skyframe_state mode, aquery takes the content of the action graph that Skyframe [3] keeps on the current instance of Bazel, (optionally) filters it, and outputs the result, without re-running the analysis phase.
Note that for
--skyframe_state, the target label is omitted from the query expression. More details on this flag can be found in the aquery documentation.
Comparing Aquery Outputs With aquery_differ
There are times when there’s a need to compare two different
aquery outputs (for instance, when you make some changes to your rule definition and want to verify that the command lines being run are still the same).
aquery_differ is the tool for that.
# The tool is available at https://github.com/bazelbuild/bazel/tree/master/tools/aquery_differ
$ bazel run //tools/aquery_differ -- \
    --before=/path/to/before.proto \
    --after=/path/to/after.proto \
    --input_type=proto \
    --attrs=cmdline \
    --attrs=inputs
The above command returns the difference between the
aquery outputs (e.g. which actions were present in one but not the other, which actions have different command line/inputs in each
aquery output, ...).
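The kind of comparison described above can be sketched in a few lines of Python. This is not the aquery_differ implementation: it assumes each action is a plain dict with hypothetical `outputs`, `cmdline` and `inputs` fields, and that two actions correspond to each other when they produce the same set of outputs.

```python
def index_by_outputs(actions):
    """Map each action's sorted output paths to the action itself."""
    return {tuple(sorted(a["outputs"])): a for a in actions}


def diff_actions(before, after, attrs=("cmdline", "inputs")):
    """Report actions unique to either side and attributes that changed."""
    b, a = index_by_outputs(before), index_by_outputs(after)
    changed = {}
    for key in sorted(set(a) & set(b)):
        diffs = [attr for attr in attrs if b[key][attr] != a[key][attr]]
        if diffs:
            changed[key] = diffs
    return {
        "only_before": sorted(set(b) - set(a)),
        "only_after": sorted(set(a) - set(b)),
        "changed": changed,
    }


# Hypothetical before/after action lists.
before = [
    {"outputs": ["a.o"], "cmdline": ["gcc", "-c", "a.cc"], "inputs": ["a.cc"]},
    {"outputs": ["old.o"], "cmdline": ["gcc", "-c", "old.cc"], "inputs": ["old.cc"]},
]
after = [
    {"outputs": ["a.o"], "cmdline": ["gcc", "-O2", "-c", "a.cc"], "inputs": ["a.cc"]},
    {"outputs": ["new.o"], "cmdline": ["gcc", "-c", "new.cc"], "inputs": ["new.cc"]},
]

result = diff_actions(before, after)
print(result["only_before"])  # [('old.o',)]
print(result["only_after"])   # [('new.o',)]
print(result["changed"])      # {('a.o',): ['cmdline']}
```

Keying on outputs works here because any output artifact is produced by exactly one action, so a set of outputs identifies an action within one graph.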
With this blog post, we declare
aquery stable and supported by the Bazel team. Please give it a try and let us know what you think! For more details on
aquery, check out the aquery documentation.
1: In the actual implementation of Bazel, we interleave the loading & analysis phases. The “Target Graph” at the end of the loading phase is only materialized with bazel query and not in actual builds.