In an earlier blog post, we described how a resolved file can be used to freeze external dependencies.
- Repository rules may indicate how their arguments have to be changed to produce identical output. This is the transition from following a branch to a fixed commit. The Starlark version of the git_repository rule already does that, and other rules will follow suit soon.
- With bazel sync there is a command to unconditionally fetch all external repositories, and with the --experimental_repository_resolved_file option all the reproducible descriptions can be collected in a Starlark value that is written to a file.
In this post, we describe some recently added features.
These changes have been committed to the HEAD revision of Bazel and will be part of the 0.19 release.
Directly reading a resolved file instead of the WORKSPACE file
The resolved file can be used as a proper substitute for the WORKSPACE file. The option to enable this is --experimental_resolved_file_instead_of_workspace. If specified, the WORKSPACE file will be completely ignored, and all information about external repositories will be taken from the specified file.
- In this way, the WORKSPACE file gets back its natural shape of describing which upstream repositories a build follows. It no longer needs to be aware of the use of a resolved file. Thus, this approach can be used for existing projects that follow floating branches, without changing the project in question. For example, if following protobuf, the WORKSPACE file simply reads as follows.
    load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")

    git_repository(
        name = "bazel_skylib",
        remote = "https://github.com/bazelbuild/bazel-skylib",
        branch = "master",
    )

    git_repository(
        name = "com_google_protobuf",
        remote = "https://github.com/google/protobuf",
        branch = "master",
    )
- Getting a snapshot of the upstream repositories followed is still a simple bazel sync --experimental_repository_resolved_file=resolved.bzl, optionally followed by committing the newly obtained resolved.bzl after testing.
- bazel build --experimental_resolved_file_instead_of_workspace=resolved.bzl ... will take all information about external repositories from the file resolved.bzl. Thus, the build is fixed to the snapshot taken by the bazel sync.
Verifying the output of a repository rule
The main purpose of freezing dependencies is to be able to replay a particular build later, and also on a different machine. While git is very good at producing the same directory when the same commit hash is specified, the programmatic transformations may cause observable differences between multiple invocations. For example, a build differing on two machines might be due to the tools (such as patch, sed, find) being only mostly the same on each machine.
To detect such problems, we've added a new entry output_tree_hash to the dict describing a repository. For example, the entry for com_google_protobuf in the resolved file now looks as follows.
    resolved = [
        ...
        {
            "original_rule_class": "@bazel_tools//tools/build_defs/repo:git.bzl%git_repository",
            "original_attributes": {
                "name": "com_google_protobuf",
                "remote": "https://github.com/google/protobuf",
                "branch": "master"
            },
            "repositories": [
                {
                    "rule_class": "@bazel_tools//tools/build_defs/repo:git.bzl%git_repository",
                    "output_tree_hash": "a776ce4f591327c6b23d88d367d6208a88af6ad889e08f7b86a0edfc76fcfd96",
                    "attributes": {
                        "remote": "https://github.com/google/protobuf",
                        "commit": "a6e1cc7e328c45a0cb9856c530c8f6cd23314163",
                        "shallow_since": "2018-09-17",
                        "init_submodules": False,
                        "verbose": False,
                        "strip_prefix": "",
                        "patches": [],
                        "patch_tool": "patch",
                        "patch_args": [
                            "-p0"
                        ],
                        "patch_cmds": [],
                        "name": "com_google_protobuf"
                    }
                }
            ]
        }
    ]
This new output_tree_hash entry is a hash of the directory generated by the repository rule. It includes the names, contents, and executable bit of all files. However, information that is likely to differ between users and won't affect most builds (like the owner of the files or the modification time) is ignored. Additionally, for symlinks to files outside the repository, the content of the file is hashed, rather than the link path itself; so the link generated by a build_file argument is not a problem.
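To make the idea concrete, here is a minimal Python sketch of a directory hash with the properties described above: it covers relative names, file contents, and the executable bit, ignores owner and modification time, and hashes symlinks by the content of their target. This is not Bazel's actual hashing scheme, so the digests it produces will not match any output_tree_hash value; the function name tree_hash and the byte layout fed into SHA-256 are assumptions made purely for illustration.

```python
import hashlib
import os
import stat
import tempfile


def tree_hash(root):
    """Hash relative names, contents, and executable bits of files under root.

    Illustrative only: the layout of the hashed bytes is an assumption and
    does not reproduce Bazel's output_tree_hash. Empty directories and
    symlinks to directories are ignored in this sketch.
    """
    files = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            files.append((rel, full))

    digest = hashlib.sha256()
    # Sort so that the result does not depend on directory listing order.
    for rel, full in sorted(files):
        # os.stat() and open() follow symlinks, so a link is hashed by the
        # content of its target rather than by the link path.
        executable = bool(os.stat(full).st_mode & stat.S_IXUSR)
        with open(full, "rb") as handle:
            content = handle.read()
        digest.update(rel.encode("utf-8") + b"\0")
        digest.update(b"x" if executable else b"-")
        digest.update(hashlib.sha256(content).digest())
        # Owner and modification time are deliberately never read.
    return digest.hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "hello.txt")
        with open(path, "w") as handle:
            handle.write("hello\n")
        before = tree_hash(workdir)
        os.utime(path, (0, 0))  # change only the modification time
        print("mtime ignored:", tree_hash(workdir) == before)
        os.chmod(path, 0o755)  # set the executable bit
        print("exec bit matters:", tree_hash(workdir) != before)
```

A hash of this kind changes as soon as a patch command writes something non-deterministic into the tree, which is the sort of difference the verification is meant to catch.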
The resolved file from which to take the hashes can be specified with the --experimental_repository_hash_file option. Of course, we do not expect reproducible content from every kind of "external repository". For example, the cc_autoconf rule is specifically designed to detect the local C++ toolchain, which might well differ from machine to machine. So, you can use the --experimental_verify_repository_rules option to specify which rule classes should be verified. For example,
    bazel build --experimental_repository_hash_file=resolved.bzl \
        --experimental_verify_repository_rules=@bazel_tools//tools/build_defs/repo:git.bzl%git_repository \
        ...
will verify the hashes of all git repositories, but not do any
verification for repositories generated by other rules.
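The selection by rule class can be pictured with the following Python sketch. It only models the behavior described above and is not Bazel's implementation: the function expected_hashes, the cc_autoconf rule-class string, and the all-zero hash in the sample data are hypothetical and exist just for the example.

```python
GIT_REPOSITORY = "@bazel_tools//tools/build_defs/repo:git.bzl%git_repository"
# Illustrative identifier for a rule whose output is not expected to be
# reproducible; the exact string is an assumption.
CC_AUTOCONF = "@bazel_tools//tools/cpp:cc_configure.bzl%cc_autoconf"


def expected_hashes(resolved, verified_rule_classes):
    """Map repository name to expected hash, for the selected rule classes only."""
    expected = {}
    for entry in resolved:
        for repo in entry.get("repositories", []):
            if repo.get("rule_class") not in verified_rule_classes:
                continue  # repositories of other rule classes are not verified
            tree_hash = repo.get("output_tree_hash")
            if tree_hash is not None:
                expected[repo["attributes"]["name"]] = tree_hash
    return expected


# Sample data shaped like the resolved file; the second entry is made up.
SAMPLE_RESOLVED = [
    {
        "repositories": [
            {
                "rule_class": GIT_REPOSITORY,
                "output_tree_hash": "a776ce4f591327c6b23d88d367d6208a88af6ad889e08f7b86a0edfc76fcfd96",
                "attributes": {"name": "com_google_protobuf"},
            }
        ]
    },
    {
        "repositories": [
            {
                "rule_class": CC_AUTOCONF,
                "output_tree_hash": "0" * 64,  # placeholder, not a real hash
                "attributes": {"name": "local_config_cc"},
            }
        ]
    },
]

if __name__ == "__main__":
    print(expected_hashes(SAMPLE_RESOLVED, {GIT_REPOSITORY}))
```

With only the git_repository rule class selected, the machine-dependent repository simply does not appear among the hashes to check.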
If a mismatch is found, for example, because you added the not-so-hermetic patch_cmds = ["date +%s > .timestamp"] to the rule for com_google_protobuf, you will get an error like the following.
$ bazel-exp build @com_google_protobuf//:protobuf
Starting local Bazel server and connecting to it...
INFO: Repository rule 'com_google_protobuf' returned: {"remote": "https://github.com/google/protobuf", "commit": "a6e1cc7e328c45a0cb9856c530c8f6cd23314163", "shallow_since": "2018-09-17", "init_submodules": False, "verbose": False, "strip_prefix": "", "patches": [], "patch_tool": "patch", "patch_args": ["-p0"], "patch_cmds": ["date +%s > .timestamp"], "name": "com_google_protobuf"}
ERROR: Skipping '@com_google_protobuf//:protobuf': no such package '@com_google_protobuf//': git_repository rule //external:com_google_protobuf failed to create a directory with expected hash 416a412dbbb1fa4f822374844dffedeb0b582fda6ffda95afb7936fb2f378ca0
WARNING: Target pattern parsing failed.
ERROR: no such package '@com_google_protobuf//': git_repository rule //external:com_google_protobuf failed to create a directory with expected hash 416a412dbbb1fa4f822374844dffedeb0b582fda6ffda95afb7936fb2f378ca0
INFO: Elapsed time: 4.606s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
$
This way, you can be sure to build against the same source tree of your external dependencies that you (or someone else) used when generating the resolved.bzl file.
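If you want to see what a given resolved.bzl pins, a few lines of Python are enough, assuming the file consists of a single top-level resolved = [...] assignment made only of literals, as in the excerpt shown earlier. That is an assumption: a real resolved file is Starlark and may not always be readable this way. The helper names load_resolved and pinned_repositories are hypothetical.

```python
import ast


def load_resolved(text):
    """Return the list bound to `resolved` in the given file content.

    Assumes the value is built from plain literals, which also happen to be
    valid Python; nothing in the file is executed.
    """
    tree = ast.parse(text)
    for node in tree.body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == "resolved":
                    return ast.literal_eval(node.value)
    raise ValueError("no top-level `resolved` assignment found")


def pinned_repositories(resolved):
    """Yield (name, commit, output_tree_hash) for each generated repository."""
    for entry in resolved:
        for repo in entry.get("repositories", []):
            attrs = repo.get("attributes", {})
            yield attrs.get("name"), attrs.get("commit"), repo.get("output_tree_hash")


# Shortened copy of the entry shown in this post.
SAMPLE = '''
resolved = [
    {
        "original_rule_class": "@bazel_tools//tools/build_defs/repo:git.bzl%git_repository",
        "original_attributes": {"name": "com_google_protobuf", "branch": "master"},
        "repositories": [
            {
                "rule_class": "@bazel_tools//tools/build_defs/repo:git.bzl%git_repository",
                "output_tree_hash": "a776ce4f591327c6b23d88d367d6208a88af6ad889e08f7b86a0edfc76fcfd96",
                "attributes": {
                    "name": "com_google_protobuf",
                    "commit": "a6e1cc7e328c45a0cb9856c530c8f6cd23314163",
                    "init_submodules": False,
                },
            }
        ],
    }
]
'''

if __name__ == "__main__":
    for name, commit, tree_hash in pinned_repositories(load_resolved(SAMPLE)):
        print(name, commit, tree_hash)
```

To read an actual file, pass its content to load_resolved, for example load_resolved(open("resolved.bzl").read()).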
Using .bazelrc to specify the use of resolved.bzl
As the whole workflow is controlled only by flags, this can all be set up once in your configuration file. Just add the following to your .bazelrc.
sync --experimental_repository_resolved_file=resolved.bzl
build --experimental_resolved_file_instead_of_workspace=resolved.bzl
build --experimental_repository_hash_file=resolved.bzl
build --experimental_verify_repository_rules=@bazel_tools//tools/build_defs/repo:git.bzl%git_repository
And then all steps are as simple as they could be.
- To update your snapshot of external dependencies, simply type bazel sync. You might want to commit the updated resolved.bzl once you have tested that the new snapshot works for your project.
- To build and test with the frozen dependencies, simply call bazel build ... or bazel test ... as usual. Note that you need a resolved.bzl first: either one committed to your repository, or one generated by an earlier bazel sync. You will stay at the fixed snapshot recorded in your resolved.bzl until you update it with another bazel sync.
And, whenever an external git repository is fetched, the hash of the resulting directory (with all the local transformations specified in the patches and patch_cmds arguments already applied) is verified automatically.
Your feedback needed
Resolved files can now be used to fix external dependencies. But does the way it is implemented now fit your needs? We don't know and that's why the feature is marked as experimental. Please help us make fixing dependencies suit your needs by sending feedback to our discussion mailing list.