Bazel Blog

A new logo and homepage for Bazel

We are glad to unveil a new logo for Bazel:

Bazel logo

This new logo, which feels more modern and balanced, was designed by Sunkwan, creative lead at Google. It takes inspiration from assembled blocks, adds an emotional touch, and keeps the green color that is now rooted in our identity. It was the top result of our internal and community votes.

Iteration on the Bazel logo

We also updated the homepage of the Bazel website: https://bazel.build

In addition to a visual refresh, the new website features clearer value statements, clearer calls to action and more relevant 'Get Started' links. We are also very proud to showcase a few famous users and their testimonials (feel free to add yours on the dedicated page of our wiki).

Screenshot of the Bazel homepage

Along the way, we moved the content of the main website and the blog into dedicated repositories (site repo, blog repo) and are using different subdomains. The documentation, whose source is still hosted with our code in our main bazelbuild/bazel repo, is now served at docs.bazel.build. The blog is served at blog.bazel.build, which should streamline the blog publication process (contributions welcome!).

The work does not stop here: our next steps are to improve our documentation structure and 'Getting Started' content. By the way, feel free to suggest edits to our documentation; it is as easy as clicking the 'Edit' button at the top of documentation pages.

Edit button on Bazel docs

We hope you like this new visual identity.

Strict Java Deps and `unused_deps`

This blog post describes how Bazel implements "strict deps" for Java compilations ("SJD"), and how it is leveraged in unused_deps, a tool to remove unused dependencies. It is my hope this knowledge will help write rules for similar JVM-based languages such as Scala and Kotlin.

What's "Strict Deps"?

By "strict deps", we loosely mean that all directly used classes are loaded from jars provided by a rule's direct dependencies. In other words, if a Java file mentions another class, then the dependency on that class must be declared in the BUILD file.
(The concept is similar to Buck's "first-order dependencies".)

class A {
  void foo(B b) { }  // <---- B is used, therefore we must depend on :B
}

java_library(
    name = "A",
    srcs = ["A.java"],
    deps = [":B"], # <--- this dependency is required
)

Note that any dependencies of B itself are not listed in the deps of A.

The initial motivation for SJD was the ability to remove unused dependencies.

Consider a dependency chain A -> B -> C without strict deps. It's impossible to know if C can be removed from B's deps just by looking at it - all transitive users of B must be considered to make this decision. Strict deps mandates that if A also uses C, it must depend on it directly, therefore making it safe to remove C from B's deps.
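
To make the argument concrete, here is a minimal Python sketch. It is a toy model rather than Bazel code, and the helper names (`transitive_deps`, `compiles`) and the dictionaries are hypothetical. It models which targets a rule can compile against with and without strict deps, and shows that removing C from B's deps can only break A when strict deps is not enforced.

```python
def transitive_deps(target, deps):
    """All targets reachable from `target` through the deps graph."""
    seen = set()
    stack = list(deps.get(target, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(deps.get(dep, []))
    return seen


def compiles(target, deps, uses, strict):
    """A target compiles if every class it uses is visible to it.

    Without strict deps the whole transitive closure is visible;
    with strict deps only the direct deps are.
    """
    if strict:
        visible = set(deps.get(target, []))
    else:
        visible = transitive_deps(target, deps)
    return uses.get(target, set()) <= visible


# A uses classes from both B and C; B uses nothing from C.
uses = {"A": {"B", "C"}, "B": set()}

# Without strict deps, A may reach C only through B.
loose = {"A": ["B"], "B": ["C"], "C": []}
loose_after = {"A": ["B"], "B": [], "C": []}      # C removed from B's deps
broken = not compiles("A", loose_after, uses, strict=False)

# With strict deps, A must already list C, so removing C from B is safe.
strict_after = {"A": ["B", "C"], "B": [], "C": []}
still_ok = compiles("A", strict_after, uses, strict=True)
```

In the loose graph the removal breaks A even though only B's BUILD entry changed, which is exactly why every transitive user of B would have to be inspected first.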

unused_deps

unused_deps is a tool to remove dependencies that aren't needed from a java_library (and other Java rules).

When Bazel builds a Java rule :Foo on the command line, it writes two files - Foo.jar-2.params and Foo.jdeps. The former contains the command-line arguments to the Java compiler and the latter contains a serialized src/main/protobuf/deps.proto which specifies which jars were loaded during compilation.

unused_deps loads the two files and figures out which rules aren't needed in a rule's deps attribute, and emits Buildozer commands to delete them.
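
The core of that comparison can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the tool's source: instead of parsing the real .params and .jdeps files, it takes a mapping from each direct dependency to its jar and the set of jars loaded during compilation. The function names, labels and jar paths are hypothetical; the output uses Buildozer's `remove deps` command form.

```python
def find_unused_deps(direct_dep_jars, used_jars):
    """Return labels of direct deps whose jar was never loaded by the compiler.

    direct_dep_jars: dict mapping a dep label to the jar it contributes
                     (the real tool derives this from the .params file).
    used_jars: set of jar paths loaded during compilation
               (the real tool reads this from the .jdeps file).
    """
    return sorted(label for label, jar in direct_dep_jars.items()
                  if jar not in used_jars)


def buildozer_commands(rule_label, unused):
    """Format one Buildozer 'remove deps' command per unused dependency."""
    return ["buildozer 'remove deps %s' %s" % (dep, rule_label) for dep in unused]


direct = {
    "//lib:b": "bazel-out/bin/lib/libb.jar",
    "//lib:c": "bazel-out/bin/lib/libc.jar",
}
used = {"bazel-out/bin/lib/libb.jar"}

unused = find_unused_deps(direct, used)
commands = buildozer_commands("//app:foo", unused)
for command in commands:
    print(command)
```

Emitting commands rather than editing BUILD files directly leaves the user free to review the suggested removals before applying them.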

Implementation

Bazel always passes the entire transitive classpath to javac, not only the direct classpath. This frees the user from having to aggregate their transitive dependencies manually. In other words, javac never fails because of a missing symbol, as long as every rule specifies its direct dependencies. This is done at JavaCompileAction.java#L729.

SJD is enforced by a compiler plugin implemented in StrictJavaDepsPlugin.java. When Bazel constructs the command-line to javac, it specifies which jars come from indirect dependencies using the --indirect_dependency flag. The plugin then walks the .java sources and reports any symbols that come from indirect jars.
(A sketch of how it works: The compiler stores the name of the jar from which a symbol was loaded. The plugin walks the AST after the type annotation phase, stops at each 'type expression', and checks whether the originating jar is an --indirect_dependency. If it is, the plugin generates an error message. The message includes the missing direct dependency to add.)

This approach has the advantage that violations are easy to fix - Bazel tells the user exactly what to do.
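
The check described above can be modeled in Python as follows. This is a simplified sketch, not the plugin's code: the real plugin works on javac's AST, whereas here the "type expressions" are just a list of class names. The function name, the `jar_to_label` mapping and the wording of the message are all hypothetical.

```python
def check_strict_deps(type_uses, symbol_to_jar, indirect_jars, jar_to_label):
    """Report every used type that was loaded from an indirect jar.

    type_uses: class names encountered while walking the sources.
    symbol_to_jar: which jar each class was loaded from.
    indirect_jars: jars that came from indirect dependencies.
    jar_to_label: the target that should be added to deps for each jar.
    """
    errors = []
    reported = set()
    for symbol in type_uses:
        jar = symbol_to_jar[symbol]
        if jar in indirect_jars and jar not in reported:
            reported.add(jar)          # one message per offending jar
            errors.append(
                "%s is loaded from %s, an indirect dependency; add %s to deps"
                % (symbol, jar, jar_to_label[jar]))
    return errors


errors = check_strict_deps(
    type_uses=["com.example.B", "com.example.C", "com.example.C"],
    symbol_to_jar={"com.example.B": "libb.jar", "com.example.C": "libc.jar"},
    indirect_jars={"libc.jar"},
    jar_to_label={"libc.jar": "//lib:c"},
)
for error in errors:
    print(error)
```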

Summary

  • Bazel passes all jars from the transitive dependencies of a rule.
  • Bazel notifies the SJD compiler plugin which jars are indirect.
  • During compilation, the compiler plugin reports any symbol mentioned in the Java file that is loaded from an indirect jar.

By Carmi Grushko

Google Summer of Code 2017

Thank you very much to everyone who applied for Google Summer of Code with Bazel. We received many interesting proposals, and we are excited to see that so many of you are enthusiastic about Bazel. Since this is the first Google Summer of Code with Bazel, we decided to mentor only one student. Of course, you are all welcome to contribute to our projects, even if it is outside of Google Summer of Code.

Harmandeep is going to work with us on Bazel this summer, and will develop a tool to provide editor services (e.g. code completion) for BUILD and .bzl files, using the Microsoft Language Server Protocol. For more information, you can check the proposal and follow Harmandeep's blog.

A big thank you to everyone who applied!

By Laurent Le Brun

Bazel 0.5.0 Released

We are delighted to announce the 0.5.0 release of Bazel (follow the link for the full release notes and list of changes).

This release simplifies Bazel installation on Windows and platforms where a JDK is not available. It solidifies the Build Event Protocol and Remote Execution APIs.

Note: Bazel release 0.5.0 contains a bug in the compiler detection on macOS which requires Xcode and the iOS tooling to be installed (corresponding issue #3063). If you had Command Line Tools installed, you also need to switch to Xcode using sudo xcode-select -s /Applications/Xcode.app/Contents/Developer.

Improvements from our roadmap

Bundled JDK

As announced earlier, when using an install script, Bazel now comes bundled with JDK 8 by default. This means fewer steps are required to install Bazel. Read more about JDK 7 deprecation in the related blog post.

Windows support: now in beta

Bazel on Windows is now easier to install: it is no longer linked with MSYS. An upcoming blog post will detail this further. Bazel is now able to build Java, C++ and Python on Windows.

Build Event Protocol

The Build Event Protocol is now available as an experimental option; it enables programmatic subscription to Bazel's events (build started, action status, target completed, test results…). Currently, the protocol can only be written to a file. A gRPC transport is already in the works and will be added in the next minor release. The API will be stabilized in 0.5.1.

Coverage support for pure Java targets

Use bazel coverage //my:target to generate coverage information from a java_test.

Other major changes since 0.4.0

New rules

New rules in Bazel: proto_library, java_lite_proto_library, java_proto_library and cc_proto_library.

New Apple rules

There is a new repository for building for Apple platforms: https://github.com/bazelbuild/rules_apple. These rules replace the deprecated iOS/watchOS rules built into Bazel. By rebuilding the rules from the ground up in Skylark and hosting them separately, we can more quickly fix bugs and implement new Apple features and platform versions as they become available.

Android Support Improvements

  • Integration with the Android Support Repository libraries in android_sdk_repository.
  • Support for Java 8 in Android builds with --experimental_desugar_for_android. See Android Studio's documentation for more details about Android's Java 8 language features.
  • Multidex is now fully supported via android_binary.multidex.
  • android_ndk_repository now supports Android NDK 13 and NDK 14.
  • APKs are now signed with both APK Signature V1 and V2. See Android documentation for more details about APK Signature Scheme v2.

Remote Execution API

We fixed a number of bugs in the Remote Execution implementation. The final RPC API design has been sent to bazel-discuss@ for discussion (see Design Document: Remote Execution API) and it should be finalized in the 0.6.0 release. The final API should only be a minor change compared to the implementation in this 0.5.0 release.

Skylark

  • Declared Providers are now implemented and documented. They enable more robust and clearly defined interfaces between different rules and aspects. We recommend using them for all rules and aspects.
  • The type formerly known as 'set' is now called 'depset'. Depsets make your rules perform much better, allowing rules' memory consumption to scale linearly instead of quadratically with build graph size - make sure you have read the documentation on depsets.
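
To see where the linear versus quadratic difference comes from, consider a chain of rules in which each rule adds one file and passes everything it knows to the next rule. The Python sketch below is a toy model: the `Depset` class and its constructor are hypothetical stand-ins, not Bazel's implementation or API. It counts stored slots when each rule copies a flat list and when each rule keeps only its own item plus a reference to the previous set.

```python
class Depset:
    """Toy nested set: stores its own items plus references to child sets."""

    def __init__(self, direct=(), transitive=()):
        self.direct = tuple(direct)
        self.transitive = tuple(transitive)

    def to_list(self):
        """Flatten on demand, visiting each shared child only once."""
        seen_sets, seen_items, out = set(), set(), []
        stack = [self]
        while stack:
            node = stack.pop()
            if id(node) in seen_sets:
                continue
            seen_sets.add(id(node))
            for item in node.direct:
                if item not in seen_items:
                    seen_items.add(item)
                    out.append(item)
            stack.extend(node.transitive)
        return out

    def stored_slots(self):
        """Slots held by this one node: items plus child references."""
        return len(self.direct) + len(self.transitive)


def chain_with_lists(n):
    """Each rule copies its dependency's flat list: quadratic total storage."""
    total, files = 0, []
    for i in range(n):
        files = files + ["file%d" % i]      # full copy per rule
        total += len(files)
    return total


def chain_with_depsets(n):
    """Each rule keeps one item and one reference: linear total storage."""
    total, previous = 0, None
    for i in range(n):
        children = [previous] if previous is not None else []
        previous = Depset(direct=["file%d" % i], transitive=children)
        total += previous.stored_slots()
    return total, previous


quadratic = chain_with_lists(100)
linear, last = chain_with_depsets(100)
print(quadratic, linear, len(last.to_list()))   # prints: 5050 199 100
```

Both approaches expose the same 100 files at the end of the chain; only the amount of memory held along the way differs.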

Finally...

A big thank you to our community for your continued support. Particular shout-outs to Peter Mounce for the Chocolatey Windows package and Yuki Yugui Sonoda for maintaining rules_go (they both received an open source peer bonus from Google).

Thank you all, keep the questions and bug reports coming!

See the full list of changes on GitHub.

JDK7 deprecation

The Bazel team has been maintaining a separate, stripped-down build of Bazel that runs with JDK 7. The 0.5.1 release will no longer provide this special version.

To address the problem of JDK 8 not being available on some machines, starting with version 0.5.0, our installer will embed a JDK by default.

If you have any concerns, please reach out to bazel-discuss@googlegroups.com.

Recap:

0.5.0:

  • bazel-0.5.0-installer.sh: default version, with embedded JDK.
  • bazel-0.5.0-without-jdk-installer.sh: version without embedded JDK.
  • bazel-0.5.0-jdk7-installer.sh: last release compatible with JDK 7.

0.5.1:

  • bazel-0.5.1-installer.sh: default version, with embedded JDK.
  • bazel-0.5.1-without-jdk-installer.sh: version without embedded JDK.

Migration path:

If you are currently using Bazel with JDK 7, then starting with version 0.5.0 you must start using the default installer.

If you are currently using the default installer and do not want to use a version with embedded JDK, then use the -without-jdk version.

Note:

Homebrew and Debian packages do not contain the embedded JDK. This change only affects the shell installers.

Thanks:

Thanks to everybody for bearing with all the JDK 7 related issues, including the Java team at Google, and in particular Liam Miller-Cushon.

Special thanks to Philipp Wollermann who made this new installer possible.

A glimpse of the design of Skylark

This blog post describes the design of Skylark, the language used to specify builds in Bazel.

A brief history

Many years ago, code at Google was built using Makefiles. As other people noticed, Makefiles don't scale well with a large code base. A temporary solution was to generate Makefiles using Python scripts, where the description of the build was stored in BUILD files containing calls to the Python functions. But this solution was way too slow, and the bottleneck was Make.

The project Blaze (later open-sourced as Bazel) was started in 2006. It used a simple parser to read the BUILD files (supporting only function calls, list comprehensions and variable assignments). When Blaze could not directly parse a BUILD file, it used a preprocessing step that ran the Python interpreter on the user BUILD file to generate a simplified BUILD file. The output was used by Blaze.

This approach was simple and allowed developers to create their own macros. But again, this led to lots of problems in terms of maintenance, performance, and safety. It also made any kind of tooling more complicated, as Blaze was not able to parse the BUILD files itself.

In the current iteration of Bazel, we've made the system saner by removing the Python preprocessing step. We kept the Python syntax, though, in order to migrate our codebase. This seems to be a good idea anyway: Many people like the syntax of our BUILD files and other build tools (e.g. Buck, Pants, and Please) have adopted it.

Design requirements

We decided to separate description of the build from the extensions (macros and rules). The description of the build resides in BUILD files and the extensions reside in .bzl files, although they are all evaluated with the same interpreter. We want the code to be easy to read and maintain. We designed Bazel to be used by thousands of engineers. Most of them are not familiar with build systems internals and most of them don't want to spend time learning a new language. BUILD files need to be simple and declarative, so that we can build tools to manipulate them.

The language also needed to:

  • Run on the JVM. Bazel is written in Java. The data structures should be shared between Bazel and the language (due to memory requirements in large builds).

  • Use a Python syntax, to preserve our codebase.

  • Be deterministic and hermetic. We have to guarantee that the execution of the code will always yield the same results. For example, we forbid access to I/O and date and time, and ensure deterministic iteration order of dictionaries.

  • Be thread-safe. We need to evaluate a lot of BUILD files in parallel. Execution of the code needs to be thread-safe in order to guarantee determinism.

Finally, we have performance concerns. A typical BUILD file is simple and can be executed quickly. In most cases, evaluating the code directly is faster than compiling it first.

Parallelism and imports

One special feature of Skylark is how it handles parallelism. In Bazel, a large build requires the evaluation of hundreds of BUILD files, so we have to load them in parallel. Each BUILD file may use any number of extensions, and those extensions might need other files as well. This means that we end up with a graph of dependencies.

Bazel first evaluates the leaves of this graph (i.e. the files that have no dependencies) in parallel. It will load the other files as soon as their dependencies have been loaded, which means the evaluation of BUILD and .bzl files is interleaved. This also means that the order of the load statements doesn't matter at all.

Each file is loaded at most once. Once it has been evaluated, its definitions (the global variables and functions) are cached. Any other file can access the symbols through the cache.
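
The loading scheme can be sketched in Python as a memoized, dependencies-first walk of the load graph. This is an illustrative model only: it runs sequentially, whereas Bazel evaluates independent files in parallel, and the function name and file names are hypothetical.

```python
def evaluate_all(loads):
    """Evaluate every file after its dependencies, each file exactly once.

    loads: dict mapping a file name to the list of files it loads.
    Returns the evaluation order and the number of evaluations per file.
    """
    cache = {}      # file -> its (pretend) cached definitions
    order = []
    counts = {}

    def evaluate(name, in_progress=()):
        if name in cache:                      # already loaded: reuse the cache
            return
        if name in in_progress:
            raise ValueError("cycle in load graph at " + name)
        for dep in loads.get(name, []):        # dependencies come first
            evaluate(dep, in_progress + (name,))
        counts[name] = counts.get(name, 0) + 1
        cache[name] = "definitions of " + name
        order.append(name)

    for name in sorted(loads):
        evaluate(name)
    return order, counts


loads = {
    "app/BUILD": ["rules.bzl", "macros.bzl"],
    "lib/BUILD": ["macros.bzl"],
    "macros.bzl": ["rules.bzl"],
    "rules.bzl": [],
}
order, counts = evaluate_all(loads)
print(order)
```

Because a file is only evaluated once all of its dependencies are in the cache, the order in which a file lists its load statements has no effect on the result.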

Since multiple threads can access a variable at the same time, we need a restriction on side-effects to guarantee thread-safety. The solution is simple: when we cache the definitions of a file, we "freeze" them. We make them read-only, i.e. you can iterate on an array, but not modify its elements. You may create a copy and modify it, though.
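
Freezing can be illustrated with a small Python model. The `FreezableList` class below is a hypothetical stand-in for how a frozen value behaves, not the interpreter's actual data structure: once frozen it can still be read and copied, but any attempt to modify it fails.

```python
class FrozenError(Exception):
    """Raised when code tries to mutate a frozen value."""


class FreezableList:
    """Toy list that becomes read-only once frozen."""

    def __init__(self, items=()):
        self._items = list(items)
        self._frozen = False

    def freeze(self):
        self._frozen = True

    def append(self, item):
        if self._frozen:
            raise FrozenError("cannot modify a frozen list")
        self._items.append(item)

    def __iter__(self):
        return iter(self._items)

    def copy(self):
        """Return a fresh, mutable copy."""
        return FreezableList(self._items)


exported = FreezableList(["a.java", "b.java"])
exported.freeze()          # done when the defining file's results are cached

try:
    exported.append("c.java")
    mutation_rejected = False
except FrozenError:
    mutation_rejected = True

mine = exported.copy()     # a private copy is mutable again
mine.append("c.java")
```

Since no thread can change a frozen value, any number of threads can read it concurrently without locks.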

In a future blog post, we'll take a look at the other features of the language.

By Laurent Le Brun

Skylark and Java rules interoperability

As of Bazel 0.4.4, Java compilation is possible from a Skylark rule. This facilitates interoperability between Skylark and Java rules and allows creating what we call Java sandwiches in Bazel.

What is a Bazel Java sandwich?

A Java sandwich refers to custom rules written in Skylark being able to depend on Bazel native rules (e.g. java_library) and the other way around. A typical Java sandwich in Bazel could be illustrated like this:

java_library(name = "top", ...)
java_skylark_library(name = "middle", deps = [":top", ...], ...)
java_library(name = "bottom", deps = [":middle", ...], ...)

Built-in support for Java

In Skylark, an interface to built-in Java functionality is available via the java_common module. The full API can be found in the documentation.

java_common.compile

Compiles Java source files/jars from the implementation of a Skylark rule and returns a java_common.provider that encapsulates the compilation details.

java_common.merge

Merges the given providers into a single java_common.provider.

Examples

To allow other Java rules (native or custom) to depend on a Skylark rule, the Skylark rule should return a java_common.provider. All native Java rules return java_common.provider by default, which makes it possible for any Java related Skylark rule to depend on them.

For now, there are 3 ways of creating a java_common.provider:

  1. The result of java_common.compile.
  2. Fetching it from a Java dependency.
  3. Merging multiple java_common.provider instances using java_common.merge.

Using the Java sandwich with compilation example

This example illustrates the typical Java sandwich described above, that will make use of Java compilation:

java_library(name = "top", ...)
java_skylark_library(name = "middle", deps = [":top", ...], ...)
java_library(name = "bottom", deps = [":middle", ...], ...)

In the BUILD file we load the Skylark rule and have the rules:

load(':java_skylark_library.bzl', 'java_skylark_library')

java_library(
  name = "top",
  srcs = ["A.java"],
  deps = [":middle"]
)

java_skylark_library(
  name = "middle",
  srcs = ["B.java"],
  deps = [":bottom"]
)

java_library(
  name = "bottom",
  srcs = ["C.java"]
)

The implementation of java_skylark_library rule does the following:

  1. Collects all the java_common.providers from its dependencies and merges them using java_common.merge.
  2. Creates an artifact that will be the output jar of the Java compilation.
  3. Compiles the specified Java source files using java_common.compile, passing as dependencies the collected java_common.providers.
  4. Returns the output jar and the java_common.provider resulting from the compilation.

def _impl(ctx):
  deps = []
  for dep in ctx.attr.deps:
    if java_common.provider in dep:
      deps.append(dep[java_common.provider])

  output_jar = ctx.new_file("lib" + ctx.label.name + ".jar")

  compilation_provider = java_common.compile(
    ctx,
    source_files = ctx.files.srcs,
    output = output_jar,
    javac_opts = [],
    deps = deps,
    strict_deps = "ERROR",
    java_toolchain = ctx.attr._java_toolchain,
    host_javabase = ctx.attr._host_javabase
  )
  return struct(
    files = set([output_jar]),
    providers = [compilation_provider]
  )

java_skylark_library = rule(
  implementation = _impl,
  attrs = {
    "srcs": attr.label_list(allow_files=True),
    "deps": attr.label_list(),
    "_java_toolchain": attr.label(default = Label("@bazel_tools//tools/jdk:toolchain")),
    "_host_javabase": attr.label(default = Label("//tools/defaults:jdk"))
  },
  fragments = ["java"]
)

Just passing around information about Java rules example

In some use cases there is no need for Java compilation, but rather just passing information about Java rules around. A Skylark rule can have some other (irrelevant here) purpose, but if it is placed somewhere between two Java rules it should not lose information from bottom to top.

In this example we have the same Bazel sandwich as above:

java_library(name = "top", ...)
java_skylark_library(name = "middle", deps = [":top", ...], ...)
java_library(name = "bottom", deps = [":middle", ...], ...)

only that java_skylark_library won't make use of Java compilation, but will make sure that all the Java information encapsulated by the Java library bottom will be passed on to the Java library top.

The BUILD file is identical to the one from the previous example.

The implementation of java_skylark_library rule does the following:

  1. Collects all the java_common.providers from its dependencies
  2. Returns the java_common.provider that resulted from merging the collected dependencies.

def _impl(ctx):
  deps = []
  for dep in ctx.attr.deps:
    if java_common.provider in dep:
      deps.append(dep[java_common.provider])
  deps_provider = java_common.merge(deps)
  return struct(
    providers = [deps_provider]
  )

java_skylark_library = rule(
  implementation = _impl,
  attrs = {
    "srcs": attr.label_list(allow_files=True),
    "deps": attr.label_list(),
    "_java_toolchain": attr.label(default = Label("@bazel_tools//tools/jdk:toolchain")),
    "_host_javabase": attr.label(default = Label("//tools/defaults:jdk"))
  },
  fragments = ["java"]
)

More to come

Right now there is no way of creating a java_common.provider that encapsulates compiled code (and its transitive dependencies), other than java_common.compile. For example one may want to create a provider from a .jar file produced by some other means.

Soon there will be support for use cases like this. Stay tuned!

If you are interested in tracking the progress on the Bazel Java sandwich, you can subscribe to this GitHub issue.

Irina Iancu, on behalf of the Bazel Java team

A Google Summer of Code with Bazel

I'm happy to announce that Bazel has been accepted as a mentor organization for the Google Summer of Code 2017. If you are a student and interested in working on Bazel this summer, please read on.

Take a look at our ideas page: it is not exhaustive and we may extend it over time, but it should give you a rough idea of what you could work on. Feel free to come up with your own ideas or suggest variations on our proposals. Not all projects on the page will be taken: we expect to accept up to three students. Students will not work with a single mentor; you can expect to interact with multiple people from the Bazel team (although there will be a main contact point). This will ensure you get timely responses and assistance, even if one of us goes on vacation.

This is the first time we are participating in Google Summer of Code, so please bear with us if some information is missing. We will update our ideas page to answer the most frequent questions.

If you have any questions, please contact us at bazel-core@googlegroups.com.

I'm looking forward to hearing from you,

Laurent Le Brun, on behalf of the Bazel mentors.

Protocol Buffers in Bazel

Bazel currently provides built-in rules for Java, JavaLite and C++.

proto_library is a language-agnostic rule that describes relations between .proto files.

java_proto_library, java_lite_proto_library and cc_proto_library are rules that "attach" to proto_library and generate language-specific bindings.

By making a java_library (resp. cc_library) depend on java_proto_library (resp. cc_proto_library) your code gains access to the generated code.

TL;DR - Usage example

TIP: https://github.com/cgrushko/proto_library contains a buildable example.

NOTE: Bazel 0.4.4 lacks some features the example uses - you'll need to build Bazel from head. The easiest way is to install Bazel, download Bazel's source code, build it (bazel build //src:bazel) and copy it somewhere (e.g., cp bazel-bin/src/bazel ~/bazel).

WORKSPACE file

Bazel's proto rules implicitly depend on the https://github.com/google/protobuf distribution (described below, in "Implicit Dependencies and Proto Toolchains"). The following satisfies these dependencies:

TIP: This is a shortened version of https://github.com/cgrushko/proto_library/blob/master/WORKSPACE

# proto_library rules implicitly depend on @com_google_protobuf//:protoc,
# which is the proto-compiler.
# This statement defines the @com_google_protobuf repo.
http_archive(
    name = "com_google_protobuf",
    urls = ["https://github.com/google/protobuf/archive/b4b0e304be5a68de3d0ee1af9b286f958750f5e4.zip"],
)

# cc_proto_library rules implicitly depend on @com_google_protobuf_cc//:cc_toolchain,
# which is the C++ proto runtime (base classes and common utilities).
http_archive(
    name = "com_google_protobuf_cc",
    urls = ["https://github.com/google/protobuf/archive/b4b0e304be5a68de3d0ee1af9b286f958750f5e4.zip"],
)

# java_proto_library rules implicitly depend on @com_google_protobuf_java//:java_toolchain,
# which is the Java proto runtime (base classes and common utilities).
http_archive(
    name = "com_google_protobuf_java",
    urls = ["https://github.com/google/protobuf/archive/b4b0e304be5a68de3d0ee1af9b286f958750f5e4.zip"],
)

BUILD files

TIP: This is a shortened version of https://github.com/cgrushko/proto_library/blob/master/src/BUILD

java_proto_library(
    name = "person_java_proto",
    deps = [":person_proto"],
)

cc_proto_library(
    name = "person_cc_proto",
    deps = [":person_proto"],
)

proto_library(
    name = "person_proto",
    srcs = ["person.proto"],
    deps = [":address_proto"],
)

proto_library(
    name = "address_proto",
    srcs = ["address.proto"],
    deps = [":zip_code_proto"],
)

proto_library(
    name = "zip_code_proto",
    srcs = ["zip_code.proto"],
)

This file yields the following dependency graph:

proto_library dependency graph

Notice how the proto_library rules provide structure for both the Java and C++ code generators, and how there's only one java_proto_library even though there are multiple .proto files.

Benefits

... in comparison with a macro that's responsible for compiling all .proto files in a project.

  1. Caching + incrementality: changing a single .proto only causes the rebuilding of dependent .proto files. This includes not only regenerating code, but also recompiling it. For large proto graphs this could be significant.
  2. Depend on pieces of a proto graph from multiple places: in the example above, one can add a cc_proto_library that depends on zip_code_proto and include it together with //src:person_cc_proto in the same project. Though they both transitively depend on zip_code_proto, there won't be a linking error.

Recommended Code Organization

  1. One proto_library rule per .proto file.
  2. A file named foo.proto will be in a rule named foo_proto, which is located in the same package.
  3. An X_proto_library that wraps a proto_library named foo_proto should be called foo_X_proto, and be located in the same package.
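
The naming conventions above are mechanical enough to express as code. The following Python helpers are a hypothetical illustration, not part of Bazel; they derive the recommended rule names from a .proto file name.

```python
import os


def proto_library_name(proto_file):
    """foo.proto -> foo_proto (convention 2)."""
    base, ext = os.path.splitext(os.path.basename(proto_file))
    if ext != ".proto":
        raise ValueError("not a .proto file: " + proto_file)
    return base + "_proto"


def language_rule_name(proto_rule_name, language):
    """foo_proto wrapped by X_proto_library -> foo_X_proto (convention 3)."""
    suffix = "_proto"
    if not proto_rule_name.endswith(suffix):
        raise ValueError("expected a name ending in _proto: " + proto_rule_name)
    return proto_rule_name[:-len(suffix)] + "_" + language + suffix


rule = proto_library_name("src/person.proto")
print(rule, language_rule_name(rule, "java"), language_rule_name(rule, "cc"))
```

For person.proto this yields the same names used in the BUILD file of the usage example: person_proto, person_java_proto and person_cc_proto.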

FAQ

Q: I already have rules named java_proto_library and cc_proto_library. Will there be a problem?
A: No. Since Skylark extensions imported through load statements take precedence over native rules with the same name, the new rule should not affect existing usage of the java_proto_library macro.

Q: How do I use gRPC with these rules?
A: The Bazel rules do not generate RPC code since protobuf is independent of any RPC system. We will work with the gRPC team to create Skylark extensions to do so. (C++ Issue, Java Issue)

Q: Do you plan to release additional languages?
A: We can relatively easily create py_proto_library. Our end goal is to improve Skylark to the point where these rules can be written in Skylark, making them independent of Bazel.

Q: How does one use well-known types? (e.g., any.proto, descriptor.proto)
A: Once https://github.com/google/protobuf/issues/2763 is resolved, add the line import "google/protobuf/any.proto"; to your .proto file, and add @com_google_protobuf//:well_known_types_protos to the deps of your proto_library rule.

Q: Any tips for writing my own such rules?
A: First, make sure you're able to register actions that compile your target language (as far as I know, Bazel Python actions are not exposed to Skylark, for example).
Second, take extra care to generate unique symbol names and unique filenames. There's an implicit assumption that different proto rules with different options generate different symbols. For example, if you write a new rule foo_java_proto_library, it must not generate symbols that java_proto_library might. The risk is that a binary will contain both, leading to a one-definition rule violation (e.g., linking errors). The downside is that the binary might bloat, as it must contain multiple copies of generated code for the same proto. We're working on a Skylark version of java_lite_proto_library which should provide a good example.

Implementation Details

Implicit Dependencies and Proto Toolchains

The proto_library rule implicitly depends on @com_google_protobuf//:protoc, which is the protocol buffer compiler. It must be a binary rule (in protobuf, it's a cc_binary). The rule can be overridden using the --proto_compiler command-line flag.

X_proto_library rules implicitly depend on @com_google_protobuf_X//:X_toolchain, which is a proto_lang_toolchain rule. These rules can be overridden using the --proto_toolchain_for_X command-line flags.

A proto_lang_toolchain rule describes how to call the protocol compiler, and which library (if any) the generated code needs to compile against. See an example in the protobuf repository.

Bazel Aspects

The X_proto_library rules are implemented using Bazel Aspects to have the best of two worlds -

  1. Only need a single X_proto_library rule for an arbitrarily-large proto graph.
  2. Incrementality, caching and no linking errors.

Conceptually, an X_proto_library rule creates a shadow graph of the proto_library it depends on, and each shadow node calls protocol-compiler and then compiles the generated code. This way, if there are multiple paths from a rule to a proto_library through X_proto_library, they all share the same node.
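
The sharing of shadow nodes can be pictured with a short Python model. It is a conceptual sketch only: aspects are not implemented this way, and the function name and data layout are hypothetical. Each proto_library gets exactly one shadow node, no matter how many paths lead to it.

```python
def build_shadow_graph(roots, proto_deps):
    """Create one shadow node per reachable proto_library, shared by all paths.

    roots: proto_library names that X_proto_library rules depend on directly.
    proto_deps: dict mapping a proto_library name to its deps.
    Returns the shadow nodes and how often code generation ran per proto.
    """
    shadow = {}
    generated = {}

    def visit(name):
        if name in shadow:                 # reached again via another path
            return shadow[name]
        children = [visit(dep) for dep in proto_deps.get(name, [])]
        generated[name] = generated.get(name, 0) + 1   # "protoc + compile"
        shadow[name] = {"proto": name, "deps": children}
        return shadow[name]

    for root in roots:
        visit(root)
    return shadow, generated


proto_deps = {
    "person_proto": ["address_proto"],
    "address_proto": ["zip_code_proto"],
    "zip_code_proto": [],
}
# Two X_proto_library rules: one on person_proto, one on zip_code_proto.
shadow, generated = build_shadow_graph(["person_proto", "zip_code_proto"], proto_deps)
print(generated)
```

Because zip_code_proto has a single shadow node, its generated code exists once, which is why depending on it from two places does not cause a linking error.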

Descriptor Sets

When compiled on the command-line, a proto_library creates a descriptor set for the messages in its srcs. The file is a serialized FileDescriptorSet, which is described in https://developers.google.com/protocol-buffers/docs/techniques#self-description.

One use case for the descriptor set is generating code without having to parse .proto files. (https://github.com/google/protobuf/issues/2725 tracks this ability in the protobuf compiler)

The aforementioned file only contains information about the .proto files directly mentioned by a proto_library rule; the collection of transitive descriptor sets is available through the 'proto.transitivedescriptorsets' Skylark provider. See documentation in ProtoSourcesProvider.

By Carmi Grushko

Invalidation of repository rules

Remote repositories are the way to use dependencies from "outside" of the Bazel world in Bazel. Using them, you can download binaries from the internet or use some from your own host. You can even use Skylark to define your own repository rules to depend on a custom package manager or to implement auto-configuration rules.

This post explains when Skylark repositories are invalidated and hence when they are executed.

Dependencies

The implementation attribute of the repository_rule defines a function (the fetch operation) that is executed inside a Skyframe function. This function is executed when one of its dependencies changes.

For repositories that are declared local (set local = True in the call to the repository_rule function), the fetch operation is performed on every call of the Skyframe function.

Since a lot of dependencies can trigger this execution (if any part of the WORKSPACE file changes, for instance), a supplemental mechanism ensures that we re-execute the fetch operation only when strictly needed for non-local repository rules (see the design doc for more details).

After cr.bazel.build/8218 is released, Bazel will re-perform the fetch operation if and only if any of the following dependencies change:

  • Skylark files needed to define the repository rule.
  • Declaration of the repository rule in the WORKSPACE file.
  • Value of any environment variable declared with the environ attribute of the repository_rule function. The values of those environment variables can be enforced from the command line with the --action_env flag (but this flag will invalidate every action of the build).
  • Content of any file used and referred using a label (e.g., //mypkg:label.txt not mypkg/label.txt).
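
One way to picture the "if and only if" condition is as a fingerprint over exactly these inputs. The Python sketch below is an illustration, not Bazel's mechanism: the function names, the use of SHA-256 over JSON, and the sample values are all assumptions. Note that only environment variables listed in environ take part in the fingerprint.

```python
import hashlib
import json


def fetch_fingerprint(bzl_contents, declaration, declared_env, environment, label_files):
    """Hash exactly the inputs listed above; anything else is ignored.

    bzl_contents: dict of Skylark file name -> content defining the rule.
    declaration: dict of attributes from the rule's WORKSPACE declaration.
    declared_env: names listed in the rule's environ attribute.
    environment: the full process environment (dict).
    label_files: dict of label -> content for files referenced by label.
    """
    relevant = {
        "bzl": bzl_contents,
        "declaration": declaration,
        # Only declared variables participate; undeclared ones never invalidate.
        "env": {name: environment.get(name) for name in sorted(declared_env)},
        "files": label_files,
    }
    encoded = json.dumps(relevant, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def needs_refetch(previous_fingerprint, current_fingerprint):
    return previous_fingerprint != current_fingerprint


base = dict(
    bzl_contents={"myrepo.bzl": "def _impl(repository_ctx): ..."},
    declaration={"name": "myrepo"},
    declared_env=["CC"],
    label_files={"//mypkg:label.txt": "v1"},
)
before = fetch_fingerprint(environment={"CC": "gcc", "EDITOR": "vi"}, **base)
undeclared_changed = fetch_fingerprint(environment={"CC": "gcc", "EDITOR": "emacs"}, **base)
declared_changed = fetch_fingerprint(environment={"CC": "clang", "EDITOR": "vi"}, **base)

print(needs_refetch(before, undeclared_changed), needs_refetch(before, declared_changed))
```

Changing the undeclared EDITOR variable leaves the fingerprint untouched, while changing the declared CC variable triggers a refetch.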

Good practices regarding refetching

Declare your repository as local very carefully

First and foremost, declare a repository local only for rules that need to be eagerly invalidated and are fast to update. Among native rules, this is used only for local_repository and new_local_repository.

Put all slow operations at the end, resolve dependencies first

Since a dependency might be unresolved when it is requested, the function is executed up to the point where the dependency is requested, and that whole part is replayed once the dependency has been resolved. Put file dependencies at the top; for instance, prefer

def _impl(repository_ctx):
   repository_ctx.file("BUILD", repository_ctx.attr.build_file)
   repository_ctx.download("BIGFILE", sha256 = "...")

myrepo = repository_rule(_impl, attrs = {"build_file": attr.label()})

over

def _impl(repository_ctx):
   repository_ctx.download("BIGFILE")
   repository_ctx.file("BUILD", repository_ctx.attr.build_file)

myrepo = repository_rule(_impl, attrs = {"build_file": attr.label()})

(In the latter example, the download operation will be re-executed if build_file is not resolved when executing the fetch operation.)
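
The replay behavior can be simulated in Python. This is a toy model under stated assumptions, not Skyframe: a missing dependency is represented by a hypothetical `MissingDependency` exception, and the driver restarts the function from the top after resolving it. Counting the logged downloads shows the cost of requesting the file dependency late.

```python
class MissingDependency(Exception):
    """Signals that a requested file dependency is not resolved yet."""


def run_with_restarts(impl, resolved):
    """Run impl from the top until it finishes, resolving one dependency per restart."""
    log = []
    while True:
        try:
            impl(resolved, log)
            return log
        except MissingDependency as missing:
            resolved.add(str(missing))     # resolve it, then replay from the top


def read_label(label, resolved):
    if label not in resolved:
        raise MissingDependency(label)


def deps_first(resolved, log):
    read_label("//:build_file", resolved)  # cheap, may force a restart
    log.append("download BIGFILE")         # slow, runs only once


def download_first(resolved, log):
    log.append("download BIGFILE")         # slow, repeated after the restart
    read_label("//:build_file", resolved)


good = run_with_restarts(deps_first, set())
bad = run_with_restarts(download_first, set())
print(len(good), len(bad))   # prints: 1 2
```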

Declare your environment variables

To avoid spurious refetches of repository rules (and because it is impossible to track every use of environment variables), only environment variables that have been declared through the environ attribute of the repository_rule function invalidate the repositories.

Therefore, if your rule should be re-run when an environment variable changes (as with auto-configuration rules), declare those dependencies; otherwise your users will have to run bazel clean --expunge each time they change their environment.