Bazel Blog

Preliminary sandboxfs support and performance results

Back in August of 2017, we introduced sandboxfs: a project to improve the performance and correctness of builds that have action sandboxing enabled. Today, after months of work to stabilize the codebase, we are happy to announce that preliminary support for sandboxfs is available in Bazel HEAD after April 13th!

This post presents the performance measurements we have gotten so far when using sandboxfs. As these metrics look promising, the true goal of this post is a call to action: we know that many of you have said that sandboxing was unusable due to its overhead, so we want to know if sandboxfs makes things better for you.

Sandboxing basics

Before delving into the details, let's quickly recap what action sandboxing is and how sandboxfs is supposed to benefit it. (Refer to the previous post for a much more detailed explanation.)

Sandboxing is Bazel's ability to isolate the execution of each build action (think "compiler invocation") from the rest of the system. This feature restricts what actions can do so that they only have access to the tools and inputs they declare and so that they are only able to write the outputs they promised. In this way, we ensure that the build graph doesn't have hidden dependencies that could poison the reproducibility of the build.

More specifically, Bazel constructs an execroot for each action, which acts as the action's work directory at execution time. The execroot contains all input files to the action and serves as the container for any generated outputs. Bazel then uses an operating system-provided technique—containers on Linux and sandbox-exec on macOS—to constrain the action within the execroot. It is worth noting that the preparation of the disk layout and the actual sandboxing are orthogonal.

Traditionally, Bazel has been creating the execroot using symlinks, thus creation scales linearly with the number of inputs. Creating all these symlinks is costly for actions with thousands of inputs and unfortunately these are not uncommon. This is where sandboxfs is supposed to help.
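
To make the linear cost concrete, here is a minimal sketch of a symlink-based execroot setup. It is not Bazel's actual implementation: the function names and the shape of the inputs mapping are assumptions made for illustration. The point is only that every input costs at least one system call.

```python
import os
import tempfile


def build_symlink_execroot(execroot, inputs):
    """Create one symlink per input under execroot; return the number created.

    `inputs` maps a path relative to the execroot to the absolute path of the
    real file. The loop issues at least one system call per input, which is
    why setup time grows linearly with the number of inputs.
    """
    created = 0
    for relative_path, real_path in inputs.items():
        link_path = os.path.join(execroot, relative_path)
        os.makedirs(os.path.dirname(link_path), exist_ok=True)
        os.symlink(real_path, link_path)
        created += 1
    return created


def demo(num_inputs):
    """Build a throwaway execroot with `num_inputs` symlinked inputs."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        os.makedirs(src)
        inputs = {}
        for i in range(num_inputs):
            real = os.path.join(src, f"file{i}.h")
            with open(real, "w") as f:
                f.write("// header\n")
            inputs[f"pkg/include/file{i}.h"] = real
        execroot = os.path.join(tmp, "execroot")
        os.makedirs(execroot)
        return build_symlink_execroot(execroot, inputs)
```

Calling demo(5000) gives a rough feel for what an action with thousands of inputs pays before it even starts running.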

sandboxfs is a FUSE file system that exposes an arbitrary view of the underlying file system and does so without time penalties. Bazel can then use sandboxfs to generate the execroot "instantaneously" for each action, avoiding the cost of issuing thousands of system calls. The downside is that all further I/O within the execroot is slower due to FUSE overhead. We hypothesized that this was a tradeoff with potential for time savings and we are at a point where we can prove it. Let's see how this has played out so far.

Performance results

The experiments below were run on the following build machines:

  • MacBook Pro 2016: Intel Core i7-6567U CPU @ 3.30GHz, 2 cores, 16GB RAM, SSD.
  • Mac Pro 2013: Intel Xeon CPU E5-1650 v2 @ 3.50GHz, 6 cores, 32GB RAM, SSD.
  • Linux workstation: Intel Xeon CPU E5-2699 v3 @ 2.30GHz, 6 cores, 32GB RAM, SSD.

And here are the specific build times obtained from a variety of different targets. All these builds were clean builds, and each was run 10 times and averaged to minimize noise:

ID  Target          Machine            No sandbox  Symlinked sandbox  sandboxfs sandbox
BL  Bazel           MacBook Pro 2016   581 sec     621 sec (+6%)      612 sec (+5%)
BW  Bazel           Mac Pro 2013       247 sec     265 sec (+7%)      250 sec (+1%)
IW  iOS app         Mac Pro 2013       1235 sec    4572 sec (+270%)   1922 sec (+55%)
CW  C++/Go library  Linux workstation  1175 sec    1318 sec (+12%)    828 sec (-30%)

Let's ignore the strange CW build results for a moment.

As you can see, in all of these builds sandboxfs-based builds are strictly better than symlink-based builds. The cost of sandboxing, however, varies widely depending on what's being built and on what machine. For BL and BW, the cost of sandboxing is small enough to think that using sandboxing unconditionally is possible. For IW, however, the cost of sandboxing is significant in either case. That said, for IW we see the massive time savings of the sandboxfs-based approach, and this (slow iOS builds) is the specific case we set out to fix at the beginning of the sandboxfs project.
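
For readers who want to check the percentages, the following sketch recomputes the sandboxing overhead from the build times in the table above. The helper names are ours. The computed values can differ from the table by about a point depending on how they are rounded.

```python
# Build times in seconds, copied from the table above:
# (no sandbox, symlinked sandbox, sandboxfs sandbox).
BUILD_TIMES = {
    "BL": (581, 621, 612),
    "BW": (247, 265, 250),
    "IW": (1235, 4572, 1922),
    "CW": (1175, 1318, 828),
}


def overhead_percent(baseline, measured):
    """Return how much slower (positive) or faster (negative) a build is."""
    return (measured - baseline) / baseline * 100.0


def summarize(times):
    """Map each build ID to its (symlinked, sandboxfs) overhead in percent."""
    return {
        build_id: (overhead_percent(base, sym), overhead_percent(base, sfs))
        for build_id, (base, sym, sfs) in times.items()
    }


if __name__ == "__main__":
    for build_id, (sym, sfs) in summarize(BUILD_TIMES).items():
        print(f"{build_id}: symlinked {sym:+.1f}%, sandboxfs {sfs:+.1f}%")
```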

These results are optimistic but we have also observed cases where sandboxfs builds are slower than symlinked builds. I wasn't able to reproduce those when preparing this blog post but be aware that it's entirely possible for you to observe slower builds when using sandboxfs. We have some work to do before we can gain more confidence in this.

Now, what's up with CW? Note that sandboxfs-based builds are faster than without sandboxing. This makes little sense: how can it possibly be that doing more work results in a faster build? We don't really know yet, but the measurements were pretty conclusive. One possible explanation is that using sandboxfs to expose the sources of the actions somehow reduces contention on srcfsd (the other FUSE file system we use in our builds, which exposes the monolithic Google repository) and makes its overall behavior faster.

Usage instructions

Convinced that you should give this a try? Excellent. Use the following steps to install sandboxfs and perform a Bazel build with it. Be aware that due to the current status of sandboxfs (no formal releases), these may change at any time.

  1. Ensure you are using a Bazel build newer than April 13th or wait for the future 0.13.x release series.

  2. Download and install sandboxfs so that the sandboxfs binary ends up in your PATH. There currently are no formal releases for this project so you will have to do a HEAD build from GitHub using Bazel.

  3. (macOS-only) Install OSXFUSE.

  4. (macOS-only) Run sudo sysctl -w vfs.generic.osxfuse.tunables.allow_other=1. You will need to do this after installation and after every reboot. This is unfortunately necessary to ensure core macOS system services work through sandboxfs.

  5. Run your favorite Bazel build with --experimental_use_sandboxfs.

That's it!

If you see local instead of darwin-sandbox or linux-sandbox as an annotation for the actions that are executed, this may mean that sandboxing is disabled. Pass --genrule_strategy=sandboxed --spawn_strategy=sandboxed to enable it.
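
If you drive Bazel from a script, the flags mentioned above can be assembled as in the sketch below. The helper names and the //foo:bar label are hypothetical. Only the three flags come from this post, and they may change while sandboxfs support is experimental.

```python
import shutil


def sandboxed_build_command(target, use_sandboxfs=True, force_sandboxing=True):
    """Return the argv for a Bazel build that uses the flags from this post."""
    cmd = ["bazel", "build"]
    if use_sandboxfs:
        cmd.append("--experimental_use_sandboxfs")
    if force_sandboxing:
        # Needed when actions are annotated with "local" instead of
        # darwin-sandbox or linux-sandbox.
        cmd += ["--genrule_strategy=sandboxed", "--spawn_strategy=sandboxed"]
    cmd.append(target)
    return cmd


def sandboxfs_on_path():
    """Check that a sandboxfs binary is reachable through PATH (step 2)."""
    return shutil.which("sandboxfs") is not None


if __name__ == "__main__":
    if not sandboxfs_on_path():
        print("sandboxfs not found in PATH; install it first")
    print(" ".join(sandboxed_build_command("//foo:bar")))
```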

Next steps

We cannot yet recommend using sandboxfs by default, nor can we convince you yet to enable sandboxing unconditionally, due to its non-zero cost. But the current status may be sufficient for you to enable sandboxing in some cases (especially during release builds if you are not doing so yet).

Here are some things we are planning to look into:

  • Further investigate what can be optimized within sandboxfs. Some preliminary profiling routinely points at the CPU being spent in the Go runtime so it's unclear if fixes will be easy/possible. (Due to personal curiosity, I've been prototyping a reimplementation in Rust and have a feeling that it can significantly cut down CPU usage in sandboxfs.)

  • We know that symlinked sandboxing is faster than sandboxfs in some cases. Investigate what the cutoff point is (as a number of action inputs, or something else) and implement a mode where we only use sandboxfs in the cases where we know it will help most.

  • Improve the protocol between sandboxfs and Bazel so that we are confident in making a first release of sandboxfs for easier distribution. If we had binary releases, we could even bundle OSXFUSE within the image we ship so that you didn't need to mess with sysctl, for example.

  • Pie-in-the-sky idea: reimplement sandboxfs as a kernel module. This is really the only way to make sandboxing overhead minimal, but is also the hardest to maintain. On the bright side, note that sandboxfs (excluding tests) is only about 1200 lines and that the tests and Bazel integration are fully reusable for any implementation—this rewrite may not be as daunting as it sounds.

We know that many of you have previously raised the bad performance of sandboxing as a blocker for enabling it. We are very interested in knowing what kind of impact this has on your builds so that we can assess how important it is to continue working on this. Please give the instructions above a try and let us know how it goes! And also, let us know if you want to contribute!

By Julio Merino

Bazel 0.12

We've just released Bazel 0.12!

If you use Bazel on Windows, please tell us what you think! We've set up a survey that will help us prioritize work.

Notable changes

  • Android NDK r15 and r16 support is available. This includes compatibility with Unified Headers.
    • In r15, the minimum API level target is 14. If the android_ndk_repository.api_level attribute is set less than 14, 14 will be used instead.
    • In r16, libc++ is out of beta and is the preferred STL. Pass the flag --android_crosstool_top=@androidndk//:toolchain-libcpp to use the libc++ STL.
  • Experimental android_instrumentation_test support has landed. Learn more about how you can run Android instrumentation tests in an hermetic and reproducible way on the documentation page.

Tools

Community update

Other changes

  • --config expansion order is changed
  • The new --direct_run flag on bazel run lets one run interactive binaries. Tests are then run in an approximation of the official test environment. The BUILD_{WORKSPACE,WORKING}_DIRECTORY environment variables are available to the binary to inform it about the location of the workspace and the working directory Bazel was run from. The old way bazel run worked is slated to be removed soon.
  • Add a --build_event_publish_all_actions flag to allow all actions to be published via the BEP. Note that this may increase the size of the BEP a lot.
  • flaky_test_attempts supports the regex@attempts syntax, like runs_per_test.
  • Query / Dump
  • BUILD / .bzl files
    • Removed flags --incompatible_checked_arithmetic, --incompatible_dict_literal_has_no_duplicates, --incompatible_disallow_keyword_only_args, --incompatible_load_argument_is_label, and --incompatible_comprehension_variables_do_not_leak (see the Backward Compatibility policy).
    • When calling a rule, dict-valued attributes are no longer lexicographically sorted. The rule now preserves the iteration order (from the BUILD or bzl file where it was created).
    • Calling the print function on a target now shows the provider keys of the target, as debug information.
  • Apple / iOS
  • Protocol buffers
  • Android
    • Updated default android_cpu value from armeabi to armeabi-v7a. This only affects Android builds that set --crosstool_top to the Android NDK crosstool and that do not use --fat_apk_cpu.
    • Corrected the include paths of llvm-libc++ headers in NDK r13+. This fixes missing link time files when compiling against libc++.
  • C++ Rules
    • CcToolchain: Introduced action_config for c++-link-nodeps-dynamic-library. Now we can specify different flags for shared libraries created by cc_binary and cc_library rules.
    • BAZEL_LINKOPTS is now consulted when autoconfiguring c++ toolchain. It can be used to switch from -lstdc++ to -lc++. Colon is the flag separator.
    • Introduced --experimental_drop_fully_static_linking_mode. With this flag, linkopts will no longer be scanned for "-static" to enable fully static linking mode. This will become the default in the future; please use features instead.
    • Removed cc_inc_library, please use cc_library instead. cc_library is strictly more powerful and can emulate all use cases of cc_inc_library. You might find includes, include_prefix, and strip_include_prefix attributes useful for migration.
    • cc_binary and cc_test now enable static_linking_mode or dynamic_linking_mode CROSSTOOL features depending on the linking mode. We will eventually remove the linking_mode_flags message from the CROSSTOOL and use these features only. Since these features are already enabled, you can migrate right now.
    • Added --ltobackendopt and --per_file_ltobackendopt for passing options to ThinLTO LTO backend compile actions only. See docs in the command line reference.
  • Repository Rules
    • repository_cache is no longer experimental and is enabled by default.
    • The native http_archive rule has been deprecated. Use the Skylark version available via load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive") instead.
    • The native git_repository rule has been deprecated. Use the Skylark version available via load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository") instead.
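
As an illustration of the --direct_run item above, a binary started through bazel run can read the two environment variables to find out where it was invoked from. This is a minimal sketch: the helper names are ours, and only the variable names come from the release notes.

```python
import os


def invocation_dirs(environ=None):
    """Return (workspace_dir, working_dir) as reported by Bazel, or None each."""
    env = os.environ if environ is None else environ
    return (env.get("BUILD_WORKSPACE_DIRECTORY"),
            env.get("BUILD_WORKING_DIRECTORY"))


def resolve_user_path(path, environ=None):
    """Resolve a relative path against the directory Bazel was run from."""
    _, working_dir = invocation_dirs(environ)
    if working_dir is None or os.path.isabs(path):
        # Not started via bazel run, or nothing to resolve.
        return path
    return os.path.normpath(os.path.join(working_dir, path))
```

A tool can use resolve_user_path for file arguments so that relative paths typed by the user keep their meaning.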

Did we miss anything? Fill the form to suggest content for a next blog post.

Discuss on Hacker News.

By Laurent Le Brun

Scalable Android Builds with Incremental Dexing

Bazel supports building Android apps with Java and C++ code out of the box through the android_binary rule and related rules. Android binary builds need a lot of machinery, more than we can cover in a blog post. However, one aspect that’s fairly important to Bazel’s Android support is scalability. That’s because we build most of Google’s own Android apps with Bazel and those apps are not only comparatively large but also come with hundreds of engineers that want to build and test their changes quickly.

For over a year now, Bazel has used a feature we call incremental dexing to speed up Android builds. As the name implies, incremental dexing is designed to minimize the work needed to rebuild an app after code changes, but it also parallelizes builds and lets them scale better to the needs of Google’s own apps. But how does it work and what is "dexing" anyway?

Dexing is what we call the build step that converts Java bytecode to Android's .dex file format. Traditionally, that’s been done for an entire app at once by a tool fittingly called "dx". Even if only a single class changed, dx would reprocess the entire app, which could take a while. But dx really has two jobs: compile bytecode to corresponding .dex code, and merge all the classes that are going into the app into as few .dex files as possible. The latter is needed because while Java bytecode uses a separate file per class, a single .dex file can contain thousands of classes. But because of the differences in instruction encoding, the compilation step is the more time-consuming one, while merging can be done separately and relatively quickly even for large apps.

Incremental dexing, as you might have guessed, separates bytecode compilation and dex merging. Specifically, it runs the compilation step separately in parallel for each .jar file that’s part of the app’s runtime classpath. To arrive at a final app, Bazel then merges the compilation results from each .jar.

How does that help? In a number of ways:

  1. We can take advantage of the parallelism inherent in the build and farm out compilation to as many processes as there are .jars in the app.
  2. When rebuilding after making code changes, we only have to recompile the .jars that changed, avoiding potentially a whole lot of work that doesn’t change the output.
  3. We can re-use compilation results in cases where the same .jar is part of multiple apps (for example, common libraries or just to build test inputs).
  4. We can also parallelize other build steps that are needed as inputs for the merge step.
  5. With the --experimental_spawn_scheduler flag, it simplifies caching past dexing results (for example, when one class in a large .jar changes).
  6. Most importantly, this strategy scales much better for large apps.

What we mean by scalability is that the total number of classes in the app matters much less for how long it takes to build the app. This is especially important when rebuilding the app after small changes: with incremental dexing, the time spent on dexing is proportional to the size of the change. Previously, dexing time was always proportional to the number of classes in the app, no matter the change. This scalability has been critical in keeping up with our ever-growing apps.
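
The idea can be sketched in a few lines. This is a toy model under stated assumptions, not Bazel's implementation: the class name is hypothetical, compile_jar stands in for the per-.jar compilation step, and byte concatenation stands in for the dex merge.

```python
import hashlib


def digest(jar_bytes):
    """Content hash used as the cache key for one .jar."""
    return hashlib.sha256(jar_bytes).hexdigest()


class IncrementalDexer:
    """Compile each .jar separately, cache the result, then merge."""

    def __init__(self, compile_jar):
        self._compile_jar = compile_jar  # stand-in for the per-.jar dx step
        self._cache = {}                 # content digest -> compiled result
        self.compilations = 0            # how many .jars were really compiled

    def build(self, jars):
        """`jars` maps a .jar name to its bytes; returns the merged output."""
        compiled = []
        for name in sorted(jars):
            key = digest(jars[name])
            if key not in self._cache:
                # Only new or changed .jars reach the expensive step.
                self._cache[key] = self._compile_jar(jars[name])
                self.compilations += 1
            compiled.append(self._cache[key])
        return self._merge(compiled)

    @staticmethod
    def _merge(compiled):
        # Stand-in for the comparatively cheap dex merge step.
        return b"".join(compiled)
```

After a first full build, changing one .jar triggers a single call to compile_jar, so the work tracks the size of the change rather than the size of the app.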

One prerequisite for taking full advantage of incremental dexing is to split up the app into multiple, ideally small, .jars. Bazel naturally encourages and enables this with the java_library and android_library rules, which build .jar files from a set of Java sources, conventionally for a single Java package at a time. Third-party libraries are also often distributed as .jar or .aar files that Bazel ingests with the java_import and aar_import rules.

Tooling-wise it was reasonably straightforward to separate the merging step because Android provides a tool called dexmerger for just this purpose and compiling separately more or less just means running dx on one class at a time. One wrinkle for now is that dexmerger creates final .dex files that are larger than necessary. That means you want to turn off incremental dexing when you’re building a binary that you want to give to users, but it can speed up your development and test builds every day with no known adverse effect. Plus, we expect this to get better since Android Studio has started to use a similar scheme to build Android apps in Gradle.

By Kevin Bierhoff

Bazel 0.11

The Bazel team is happy to announce the release of version 0.11.0.

Notable Changes

Community Updates

Here are some updates on what happened in the Bazel community over the past month.

Languages & Rules

Tools

Community

Did we miss anything? Fill the form to suggest content for a next blog post.

Discuss on Hacker News or Reddit.

By Jingwen Chen

How Android Builds Work in Bazel

Background: How Bazel Works

In Bazel, BUILD files in directories specify targets that can be built from the contents of those directories.

Bazel goes through three steps when building targets:

  1. In the loading phase, Bazel parses the BUILD file of the target being built and all BUILD files that file transitively depends on.
  2. In the analysis phase, Bazel builds a graph of actions needed to build the specified targets.
  3. In the execution phase, Bazel runs those actions.

Each Bazel target is defined by a rule, which specifies inputs, outputs, and how to get from one to the other. Rules can specify things like creating an executable binary or defining a library. In Bazel’s code, individual rules are represented by instances of implementations of RuleConfiguredTargetFactory. Users can also extend Bazel and create new rules with Skylark.

Rules create, in turn, any number of actions. Each action takes any number of artifacts as inputs and produces one or more artifacts as outputs. These artifacts represent files that may not yet be available. They can either be source artifacts, such as source code checked in to the repository, or generated artifacts, such as output of other actions. For example, an action to compile a piece of code might take in source artifacts representing the code to be compiled and generated artifacts representing compiled dependencies, even though those dependencies have not yet been compiled, and output a generated artifact representing the compiled result. Additionally, rules may expose any number of provider objects. These providers are the API rules provide to other rules. They provide read-only information about internal state.

During analysis, Bazel runs the rules for each target being built and their transitive dependencies. Each rule generates and records all the actions it depends on. Bazel won’t necessarily run all of those actions; if an action doesn’t end up being required, Bazel will just ignore it. Skyframe is used to evaluate and cache the results of rules.

Information from the rules, including artifacts representing the future output of actions, is made available to other rules through the rules’ providers. Each rule has access to its direct dependencies’ providers. Because most information passed between rules is actually transitive, providers make use of the NestedSet class, a DAG-like data structure (it’s not actually a set!) made up of items and pointers to other (nested) NestedSet objects. NestedSets are specially optimized to work efficiently for analysis. For part of a provider that represents some transitive state, for example, a trivial implementation might be to build a new list that contains the items for the current rule and each transitive dependency (for a chain of n transitive dependencies, that means we’d add n + (n - 1) + … + 1 = O(n^2) items to some list), but building a nested set containing the new item and a pointer to the previous nested set is much more efficient (we’d add 2 + 2 + … + 2 = O(n) items to some nested set). This introduces similar efficiency in memory usage as well.
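
The difference between the two approaches can be sketched in Python. This is a simplified model, not Bazel's NestedSet class: the class below only shares the name, and its flattening order is not claimed to match Bazel's.

```python
class NestedSet:
    """Items owned by one node plus pointers to the sets of its dependencies."""

    def __init__(self, direct=(), transitive=()):
        self.direct = tuple(direct)
        self.transitive = tuple(transitive)

    def to_list(self):
        """Flatten the DAG, visiting each nested set and each item once."""
        seen_sets, seen_items, result = set(), set(), []
        stack = [self]
        while stack:
            node = stack.pop()
            if id(node) in seen_sets:
                continue
            seen_sets.add(id(node))
            for item in node.direct:
                if item not in seen_items:
                    seen_items.add(item)
                    result.append(item)
            stack.extend(reversed(node.transitive))
        return result


def chain_costs(n):
    """Return (entries stored with flat lists, entries stored with nested sets)
    for a chain of n rules that each add one item."""
    flat_total = 0
    nested_total = 0
    flat = []
    previous = None
    for i in range(n):
        flat = flat + [f"item{i}"]  # copies every inherited item
        flat_total += len(flat)
        children = [previous] if previous is not None else []
        previous = NestedSet([f"item{i}"], children)
        nested_total += len(previous.direct) + len(previous.transitive)
    return flat_total, nested_total
```

For a chain of 100 rules, the flat lists store 5050 entries in total, while the nested sets store 199: one item plus one pointer per rule, and no pointer for the first.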

Artifacts can each be added to any number of output groups. Each output group represents a different group of outputs that a user might choose to build. For example, the source_jars output group specifies that Bazel should also produce the JARs of the source for a Java target and its transitive dependencies. The special default output group holds output that is specifically built for a target - for example, building a Java binary might produce a compiled .jar file in the default output group.

During execution, Bazel first looks at the artifacts in the requested output groups (plus, unless the user explicitly requested otherwise, the default output group). For each of those artifacts, it finds the actions that generate the artifact, then each of the artifacts each of those actions need, and so on until it finds all the actions and artifacts needed. If the action is not cached or the cache entry should be invalidated, Bazel follows this same process for the action’s dependencies, then runs the action. Once all of the actions in the requested output groups have been run or returned from cache, the build is complete.
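
A minimal sketch of that walk follows, assuming the action graph is given as plain dictionaries. The function and parameter names are ours, and the real execution phase (driven by Skyframe) is considerably more involved.

```python
def actions_to_run(requested_artifacts, generating_action, action_inputs, cached):
    """Return the actions to execute, dependencies first.

    generating_action: artifact -> action that produces it (source artifacts
                       have no entry).
    action_inputs:     action -> list of input artifacts.
    cached:            set of actions with a valid cache entry.
    """
    order = []
    visited = set()

    def visit(artifact):
        action = generating_action.get(artifact)  # None for source artifacts
        if action is None or action in visited:
            return
        visited.add(action)
        if action in cached:
            return  # cache hit: no need to look at its dependencies
        for dep in action_inputs[action]:
            visit(dep)
        order.append(action)  # runs only after all of its inputs are handled

    for artifact in requested_artifacts:
        visit(artifact)
    return order
```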

Android Builds

There are a few important kinds of rules when building code for Android:

  • android_binary rules build Android packages (.apk files).
  • android_library rules build individual libraries that binaries and other libraries can consume.
  • android_local_test rules run tests on Android code in a JVM.
  • aar_import rules import .aar libraries built outside of Bazel into a Bazel target.

Android Resources

One way in which the Android library build process differs from the normal Java build process is Android resources. Resources are anything that’s not code - strings, images, layouts, and so on.

Bazel generates R.java files (as well as related R.class and R.txt files) to contain references to available resources. These R files contain integer resource IDs that developers can use to refer to their resources. Within an app, each resource ID refers to one unique resource.

Developers can provide different versions of the same resource (to support, for example, different languages, regions, or screen sizes). Android makes references to the base resource available in the R files, and Android devices select the best available version of that resource at runtime.

Resource processing with aapt and aapt2

Diagram of android resource build process

Bazel supports processing resources using the original Android resource processor, aapt, or the new version, aapt2. Both methods are fundamentally similar but have a few important differences.

Bazel goes through three steps to build resources.

First, Bazel serializes the files that define the resources. In the aapt pipeline, the parse action serializes information about resources into symbols.bin files. In the aapt2 pipeline, an action calls into the aapt2 compile command which serializes the information into a format used by aapt2.

Next, the serialized resources are merged with similarly serialized resources inherited from dependencies. Conflicts between identically named resources are identified and, if possible, resolved during this merging. The contents of values resource files are generally explicitly merged. For other files, if resources from the target or its dependencies have the same name and qualifiers, the contents of the files are compared and, if they are different, a warning is produced and the resource that was provided last is chosen to be used.
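
The conflict rule for non-values files can be sketched as follows. This is an illustration only: the data layout, a mapping from (name, qualifiers) to file contents, and the function name are assumptions, and the explicit merging of values files is not modeled.

```python
def merge_resources(resource_sets):
    """Merge resources given in order; later providers win on conflict.

    Each element of `resource_sets` maps (name, qualifiers) to file contents.
    Returns (merged, warnings).
    """
    merged = {}
    warnings = []
    for resources in resource_sets:
        for key, contents in resources.items():
            if key in merged and merged[key] != contents:
                # Same name and qualifiers but different contents: warn,
                # then let the resource that was provided last win.
                name, qualifiers = key
                warnings.append(
                    f"conflicting definitions for {name} ({qualifiers or 'default'})"
                )
            merged[key] = contents
    return merged, warnings
```

Identical files under the same name and qualifiers merge silently. Only differing contents produce a warning.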

Finally, Bazel checks that the resources for the target are reasonable and packages them up. In the aapt pipeline, the validate action calls into the aapt package command, and in the aapt2 pipeline, the aapt2 link command is called. In both cases, any malformed resources or references to unavailable resources cause a failure, and, if no failures are encountered, R.java and R.txt files are produced with information about the validated resources, and a Resource APK containing those resources is produced.

Using aapt2 rather than aapt provides better and more efficient support for a variety of cases. Additionally, more of the resource processing steps are handled by aapt2 as opposed to Bazel's custom resource processing tools. Finally, since the serialized format can be understood as-is by future calls to aapt2, Bazel no longer has to deserialize information about resources to a form aapt2 can understand.

The resource ID values generated for android_library targets are only temporary, since higher-level targets might depend on multiple targets where different resources were assigned the same ID. To ensure that resource IDs aren’t persisted anywhere permanent, the R files record the IDs as nonfinal, ensuring that compilation doesn’t inline them into other Java code. Additionally, an android_library's R files should be discarded after building is complete.

(Even though android_library R files are eventually discarded, we still need to run resource processing to generate a temporary R.class to allow compilation, to merge resources so they can be inherited by consumers, and to validate that the resources can be compiled correctly - otherwise, if a developer introduces a bug in their resource definitions, it won’t be caught until they’re used in an android_binary, resulting in a lot of wasted work done by Bazel.)

Code in android libraries and binaries makes references to code in the R files, so the R.class file must be generated before regular compilation can start. For android_library targets, since all resource IDs are temporary anyway, we can speed things up by generating an R.class file at the end of resource merging. For android_binary targets, we need to wait for the output of validation to get correct resource IDs. Validation does produce an R.java file, but generating an R.class file directly from the contents of the R.txt file is much faster than compiling the R.java file into an R.class file.
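
Part of why going through R.txt is cheap is that the file is a flat list of symbols. The sketch below assumes the usual text-symbols layout of one `int <type> <name> <hex id>` entry per line. It only builds an in-memory table, whereas Bazel's tooling goes on to emit class files, which is not shown here.

```python
def parse_r_txt(text):
    """Parse scalar `int <type> <name> <id>` lines into {type: {name: id}}."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4 or parts[0] != "int":
            # Skip blank lines and array entries such as `int[] styleable ...`.
            continue
        _, res_type, name, value = parts
        table.setdefault(res_type, {})[name] = int(value, 16)
    return table
```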

Android Resource Optimizations

Resource Filtering

The android_binary rule includes optional resource_configuration_filters and densities fields. These fields limit the types of devices that will be built for. For example, if you only wanted to build for English-language devices with HDPI displays, you could specify:

android_binary(
  # ...
  densities = ["hdpi"],
  resource_configuration_filters = ["en"],
)

Bazel will now be able to skip unneeded resources. As a result, the build will be faster and the resulting APK will be smaller. It won't support all kinds of devices and user preferences, but this speed improvement means developers can build and iterate faster.

Android Libraries

Diagram of `android_library` build process

An android_library rule is a pretty simple rule that builds and organizes an android library for use in another Android target. In the analysis phase, there are basically three groups of actions generated:

First, Bazel processes the library's resources, as described above.

Next comes the actual compilation of the library. This mostly just uses the regular Bazel Java compilation path. The biggest difference is that the R.class file produced in resource processing is also included in the compilation path (but is not inherited by consumers, since the R files need to be regenerated for each target).

Finally, Bazel does some additional work on the compiled code:

  1. The compiled .class files are desugared to replace bytecode only supported on Java 8 with Java 7 equivalents. Bazel does this so that Java 8 language features can be used for developing the app, even though the next tool, dx, does not support Java 8 bytecode.
  2. The desugared .class files are converted to .dex files, executables for Android devices, by dx. These .dex files are then packed into the .jar file used at runtime. These incremental .dex files, produced for each library, mean that, when some libraries from an app are changed, only those libraries, and not the entire app, need to be re-dexed.
  3. The source .java files for this library are used by hjar to generate a jar of .class files. Method bodies and private fields are removed from this compile-time .jar, and targets that depend on this library are compiled against this smaller .jar. Since these jars contain just the interface of the library, when private fields or method implementations change, dependent libraries do not need to be recompiled (they need to be recompiled only when the interface of the library changes), which results in faster builds.

Android Binaries

Diagram of `android_binary` build process

An android_binary rule packages the entire target and its dependencies into an APK. On a high level, binaries are built similarly to libraries. However, there are a few key differences.

For binaries, the three main resource processing actions (parse, merge, and validate), are all combined into a single large action. In libraries, Java compilation can get started while validation is still ongoing, but in binaries, since we need the final resource IDs from validation, we can't take advantage of similar parallelization. Since creating more actions always introduces a small cost, and there's no parallelization available to make up for it, having a single resource processing action is actually more efficient.

In binaries, the Java code is compiled, desugared, and dexed, just like in libraries. However, afterwards, the .dex files from the binary are merged together with the .dex files from dependencies.

Bazel also links together compiled C and C++ native code from dependencies into a single .so file for each CPU architecture specified by the --fat_apk_cpu flag.

The merged .dex files, the .so files, and the resource APK are all combined to build an initial binary APK, which is then zipaligned to produce an unsigned APK. Finally, the unsigned APK is signed with the binary's debug key to produce a signed APK.

ProGuarded Android Binaries

Diagram of `android_binary` build process with ProGuard

Bazel supports running ProGuard against android_binary targets to optimize them and reduce their size. Using ProGuard substantially changes elements of the build process. In particular, the build process does not use incremental .dex files at all, as ProGuard can only run on .class files, not .dex files.

ProGuarding uses a deploy.jar file, a single .jar file with all of the binary's Java bytecode, created from the binary's desugared (but not dexed) .class files as well as the binary's transitive runtime .jar files. (This deploy.jar file is an output of all android_binary targets, but it doesn't play a substantial role in builds without ProGuarding.)

Based on information from a series of ProGuard specifications (from both the binary and its transitive dependencies), ProGuard makes several passes through the deploy.jar in order to optimize the code, remove unused methods and fields, and shorten and obfuscate the names of the methods and fields that remain. In addition to the resulting proguarded .jar file, ProGuard also outputs a mapping from old to new names of methods and fields.

ProGuard’s output is not dexed, so when building with ProGuard, the entire .jar must be re-dexed (even code from dependencies that were dexed incrementally). The dexed code is then built into the APK as usual.

ProGuard will also remove references to unused resources from the class files. If resource shrinking is enabled, the resource shrinker uses the ProGuard output to figure out what resources are no longer used, and then uses aapt or aapt2 to create a new, smaller resource APK with those resources removed. The shrunk resource APK and the dexed APK are then fed into the APK building process, which operates the same as it would without ProGuard.

Mobile-install

Mobile-install is a way of rapidly and iteratively building and deploying Android applications. It's based on android_binary, but has some additional functionality to make builds and deployments more incremental.

By Alex Steinberg

Bazel 0.10

We're proud to announce the release of Bazel 0.10. The 400+ commits since last release include performance optimizations, bug fixes, and various improvements.

There is a new Android test rule. android_local_test tests android_library code using Robolectric, a unit test framework designed for test-driven development without the need for an emulator or device. See the documentation for setup instructions and examples.

The depset type has evolved. To merge multiple depsets or add new elements, do not use the operators +, +=, |, or the .union method. They are deprecated and will be removed in the future. Instead, use the new depset constructor, which has better performance. For example, instead of d1 + d2 + d3, use depset(transitive = [d1, d2, d3]). See the documentation for more information and examples.
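To give a feel for what the constructor form does, here is a toy Python model. It is not Bazel's implementation: `ToyDepset` and its methods are hypothetical names, and the traversal order is an arbitrary choice for this sketch. The idea it illustrates is that merging with `transitive` keeps references to the child sets and only flattens them on demand.

```python
class ToyDepset:
    """Toy stand-in for a depset: direct items plus references to child sets."""

    def __init__(self, direct=(), transitive=()):
        self.direct = tuple(direct)
        # Children are stored by reference; nothing is copied at merge time.
        self.transitive = tuple(transitive)

    def to_list(self):
        """Flatten on demand, dropping duplicates (order is arbitrary here)."""
        seen = set()
        ordered = []

        def visit(node):
            for child in node.transitive:
                visit(child)
            for item in node.direct:
                if item not in seen:
                    seen.add(item)
                    ordered.append(item)

        visit(self)
        return ordered


d1 = ToyDepset(["a"])
d2 = ToyDepset(["b"])
d3 = ToyDepset(["c", "a"])

# Mirrors depset(transitive = [d1, d2, d3]) from the text above.
merged = ToyDepset(transitive=[d1, d2, d3])
print(sorted(merged.to_list()))
```

In real BUILD and .bzl files you would call depset itself; the toy class only exists to make the merge-by-reference idea concrete.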

In addition to this new release, the Bazel community has been very active. See below what happened recently.

Languages & Rules

Tools

  • Spotify released a collection of tools for working with Bazel workspaces, mostly tailored towards writing JVM backend services.
  • Ever wanted to save a file and have your tests automatically run? How about restarting your webserver when one of the source files changes? Take a look at the Bazel watcher.

Performance

Did you know?

  • The heart of Bazel is a parallel evaluation and incrementality model called Skyframe.
  • Bazel is more than 10 years old, even though it was just open-sourced 3 years ago. John Field goes into the prehistory of Bazel in the opening remarks of Bazel Conference 2017 here.

Did we miss anything? Fill out the form to suggest content for a future blog post.

Discuss on Hacker News or Reddit.

By Laurent Le Brun

Migration Help: --config parsing order

--config expansion order is changing, in order to make it better align with user expectations, and to make layering of configs work as intended. To prepare for the change, please test your build with startup option --expand_configs_in_place.

Please test this change with Bazel 0.10, triggered by the startup option --expand_configs_in_place. The change is mostly live with Bazel 0.9, but the newest release adds an additional warning if explicit flags are overridden, which should be helpful when debugging differences. The new expansion order will become the default behavior soon, probably in Bazel 0.11 or 0.12, and will no longer be configurable after another release.

Background: bazelrc & --config

The Bazel User Manual contains the official documentation for bazelrcs and will not be repeated here.
A Bazel build command line generally looks something like this:

bazel <startup options> build <command options> //some/targets

For the rest of the doc, command options are the focus. Startup options can affect which bazelrcs are loaded, and the new behavior is gated by a startup option, but the config mechanisms are only relevant to command options.

The bazelrcs allow users to set command options by default. These options can either be provided unconditionally or through a config expansion:

  • Unconditionally, they are defined for a command, and all commands that inherit from it,
    build --foo # applies "--foo" to build, test, etc.
  • As a config expansion,
    build:foobar --foo # applies "--foo" to build, test, etc. when --config=foobar is set.

What is changing

The current order: fixed-point --config expansion

The current semantics of --config expansions breaks last-flag-wins expectations. In broad strokes, the current option order is

  1. Unconditional rc options (options set by a command without a config, "build --opt")
  2. All --config expansions are expanded in a "fixed-point" expansion.
    This does not check where the --config option initially was (rc, command line, or another --config), and will parse a single --config value at most once. Use `--announce_rc` to see the order used!
  3. Command-line specified options

Bazel claims to have a last-flag-wins command line, and this is usually true, but the fixed-point expansion of configs makes it difficult to rely on ordering where --config options are concerned.

See the Boolean option example below.
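The old ordering can be sketched in a few lines of Python. This is a simplified model, not Bazel's code: the function name `expand_fixed_point`, the dict representation of the rc file, and the exact queue-based traversal are assumptions chosen to reproduce the examples in this post.

```python
from collections import deque

CONFIG_PREFIX = "--config="


def _split(options):
    """Separate explicit options from requested config names."""
    explicit = [o for o in options if not o.startswith(CONFIG_PREFIX)]
    configs = [o[len(CONFIG_PREFIX):] for o in options if o.startswith(CONFIG_PREFIX)]
    return explicit, configs


def expand_fixed_point(rc_unconditional, rc_configs, command_line):
    """Model of the old order: rc options, then all configs, then command line."""
    rc_explicit, rc_requested = _split(rc_unconditional)
    cl_explicit, cl_requested = _split(command_line)

    # It does not matter where a --config was mentioned; all of them are
    # pooled together and each name is expanded at most once.
    pending = deque(rc_requested + cl_requested)
    seen = set()
    expanded = []
    while pending:
        name = pending.popleft()
        if name in seen:
            continue
        seen.add(name)
        explicit, nested = _split(rc_configs.get(name, []))
        expanded.extend(explicit)
        pending.extend(nested)  # newly found configs are expanded afterwards

    return rc_explicit + expanded + cl_explicit


# "build --nofoo", "build --config=all", "build --nobar" in the rc file,
# with "build:all --foo" and "build:all --bar" as the config definition.
print(expand_fixed_point(
    ["--nofoo", "--config=all", "--nobar"],
    {"all": ["--foo", "--bar"]},
    [],
))
# ['--nofoo', '--nobar', '--foo', '--bar']
```

Although --nobar is written after --config=all in the rc file, the config's --bar ends up last in this model and therefore wins.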

The new order: Last-Flag-Wins

Everywhere else, the last mention of a single-valued option has "priority" and overrides a previous value. The same will now be true of --config expansion. Like other expansion options, --config will now expand to its rc-defined expansion "in-place," so that the options it expands to have the same precedence.

Since this is no longer a fixed-point expansion, there are a few other changes:

  • --config=foo --config=foo will be expanded twice. If this is undesirable, more care will need to be taken to avoid redundancy. Double occurrences will cause a warning.
  • cycles are no longer implicitly ignored, but will error out.
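A minimal Python sketch of in-place expansion, under the same caveats: `expand_in_place` and `ConfigCycleError` are hypothetical names, the rc file is modeled as a plain dict, and the warning for double occurrences is not modeled.

```python
CONFIG_PREFIX = "--config="


class ConfigCycleError(Exception):
    """Raised when a config (directly or indirectly) includes itself."""


def expand_in_place(options, rc_configs, _stack=()):
    """Replace every --config=NAME with its expansion at that exact position."""
    result = []
    for option in options:
        if not option.startswith(CONFIG_PREFIX):
            result.append(option)
            continue
        name = option[len(CONFIG_PREFIX):]
        if name in _stack:
            # Cycles are an error instead of being silently ignored.
            raise ConfigCycleError(" -> ".join(_stack + (name,)))
        nested = rc_configs.get(name, [])
        result.extend(expand_in_place(nested, rc_configs, _stack + (name,)))
    return result


rc_configs = {
    "base": ["--cpu=x86"],
    "combo": ["--config=base", "--cpu=arm64"],
}

# "base" is expanded twice: once inside "combo" and once on its own.
print(expand_in_place(["--config=combo", "--config=base"], rc_configs))
# ['--cpu=x86', '--cpu=arm64', '--cpu=x86']
```

Because each expansion lands where the --config was written, reading the resulting list left to right is enough to tell which value wins.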

Other rc ordering semantics remain. "common" options are expanded first, followed by the command hierarchy. This means that for an option added on the line "build:foo --buildopt", it will get added to --config=foo's expansion for bazel build, test, coverage, etc. "test:foo --testopt" will add --testopt after the (less specific and therefore lower priority) build expansion of --config=foo. If this is confusing, avoid alternating command types in the rc file, and group them in order, general options at the top. This way, the order of the file is close to the interpretation order.

How to start using new behavior

  1. Check your usual --config values' expansions by running your usual bazel command line with --announce_rc. The order that the configs are listed, with the options they expand to, is the order in which they are interpreted.

  2. Spend some time understanding the applicable configs, and check if any configs expand to the same option. If they do, you may need to move rc lines around to make sure the same value has priority with the new ordering. See "Suggestions for config definers."

  3. Flip on the startup option --expand_configs_in_place and debug any differences using --announce_rc

If you have a shared bazelrc for your project, note that changing it will propagate to other users who might be importing this bazelrc into a personal rc. Proceed with caution as needed.

  4. Add the startup option to your bazelrc to continue using this new expansion order.

Suggestions for config definers

You might be in a situation where you own some --config definitions that are shared between different people, even different teams, so it might be that the users of your config are using both --expand_configs_in_place behavior and the old, default behavior.

In order to minimize differences between old and new behavior, here are some tips.

  1. Avoid internal conflicts within a config expansion (redefinitions of the same option)
  2. Define recursive --config at the END of the config expansion
    • Only critical if #1 can't be followed.

Suggestion #1 is especially important if the config expands to another config. The behavior will be more predictable with --expand_configs_in_place, but without it, the expansion of a single --config depends on previous --configs.

Suggestion #2 helps mitigate differences if #1 is violated, since the fixed-point expansion will expand all explicit options, and then expand any newly-found config values that were mentioned in the original config expansions. This is equivalent to expanding it at the end of the list, so use this order if you wish to preserve old behavior.

Motivating examples

The following example violates both #1 and #2, to help motivate why #2 makes things slightly better when #1 is impossible.

bazelrc contents:
    build:foo --cpu=x86
    build:misalteredfoo --config=foo # Violation of #2!
    build:misalteredfoo --cpu=arm64 # Violation of #1!

  • bazel build --config=misalteredfoo

effectively x86 in fixed-point expansion, and arm64 with in-place expansion

The following example still violates #1, but follows suggestion #2.

bazelrc contents:
    build:foo --cpu=x86
    build:misalteredfoo --cpu=arm64 # Violation of #1!
    build:misalteredfoo --config=foo

  • bazel build --config=misalteredfoo

effectively x86 in both expansions, so this does not diverge and appears fine at first glance. (thanks, suggestion #2!)

  • bazel build --config=foo --config=misalteredfoo

effectively arm64 in fixed-point expansion, x86 with in-place, since misalteredfoo's expansion is independent of the previous config mention.

Suggestions for users of --config

Lay users of --config might also see some surprising changes depending on usage patterns. The following suggestions are to avoid those differences. Both of the following will cause warnings if missed.

A. Avoid including the same --config twice

B. Put --config options FIRST, so that explicit options continue to have precedence over the expansions of the configs.

Multiple mentions of a single --config, when combined with violations of #1, may cause surprising results, as shown in #1's motivating examples. In the new expansion, multiple expansions of the same config will warn. Multi-valued options will receive duplicate values, which may be surprising.

Motivating example for B

bazelrc contents:
    build:foo --cpu=x86
  • bazel build --config=foo --cpu=arm64 # Fine

effectively arm64 in both expansion cases

  • bazel build --cpu=arm64 --config=foo # Violates B

The explicit value arm64 has precedence with fixed-point expansion, but the config value x86 wins in in-place expansion. With in-place expansion, this will print a warning.

Additional Boolean Option Example

There are 2 boolean options, --foo and --bar. Each accepts only one value (as opposed to accumulating multiple values).
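As a reminder of what "last flag wins" means for such single-valued boolean options, here is a minimal Python sketch. The helper `resolve_booleans` is hypothetical, and it assumes that no option name itself begins with "no", which holds for --foo and --bar.

```python
def resolve_booleans(flags):
    """Return the final value of each boolean option; later flags override earlier ones."""
    values = {}
    for flag in flags:
        name = flag[2:]  # strip the leading "--"
        if name.startswith("no"):
            values[name[2:]] = False  # "--nofoo" turns "foo" off
        else:
            values[name] = True       # "--foo" turns "foo" on
    return values


print(resolve_booleans(["--nofoo", "--foo", "--bar", "--nobar"]))
# {'foo': True, 'bar': False}
```

The result corresponds to a final value of --foo --nobar.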

In the following examples, the two options --foo and --bar have the same apparent order (and will have the same behavior with the new expansion logic). What changes from one example to the next is where the options are specified.

| bazelrc | Command line | Current final value | New final value |
|---|---|---|---|
| (none) | --nofoo --foo --bar --nobar | --foo --nobar | --foo --nobar |
| Config definitions: build:all --foo; build:all --bar | --nofoo --config=all --nobar | --nofoo --nobar | --foo --nobar |
| Set for every build: build --nofoo; build --config=all; build --nobar. Config definitions: build:all --foo; build:all --bar | (none) | --foo --bar | --foo --nobar |

Now to make this more complicated, what if a config includes another config?

| bazelrc | Command line | Current final value | New final value |
|---|---|---|---|
| Config definitions: build:combo --nofoo; build:combo --config=all; build:combo --nobar; build:all --foo; build:all --bar | --config=combo | --foo --bar | --foo --nobar |
| (same as above) | --config=all --config=combo | --nofoo --nobar | --foo --nobar |

Here, counterintuitively, including --config=all explicitly makes its values effectively disappear. This is why it is basically impossible to create an automatic migration script to run on your rc - there's no real way to know what the intended behavior is.

Unfortunately, it gets worse, especially if you have the same config for different commands, such as build and test, or if you defined these in different files. It frankly isn't worth going into further detail about the ordering semantics as they have existed up until now; this should suffice to demonstrate why the ordering needs to change.

To understand the order of your configs specifically, run Bazel as you normally would (remove targets for speed) with the option --announce_rc. The order in which the config expansions are output to the terminal is the order in which they are currently interpreted (again, between rc and command line).

By Chloe Calvarin

Introducing Bazel Code Search

We are always looking for new ways to improve the experience of contributing to Bazel and helping users understand how Bazel works. Today, we’re excited to share a preview of Bazel Code Search, a major upgrade to Bazel’s code search site. This new site features a refreshed user interface for code browsing and cross-repository semantic search with regular expression support, and a navigable semantic index of all definitions and references for the Bazel codebase. We’ve also updated the “Contribute” page on the Bazel website with documentation for this tool.

Getting started with Bazel Code Search

You can try Bazel Code Search right now by visiting https://source.bazel.build.

Select the repository you want to browse from the list on the main screen, or search across all Bazel repositories on the site using the search box at the top of the page.

Main screen of Bazel Code Search

Searching the Bazel codebase

Bazel Code Search has a semantic understanding of the Bazel codebase and allows you to search for either files or code within files. This semantic understanding of the code means that the search index identifies which parts of your code are entities such as classes, functions, and fields. Since the search index has classified these entities, your queries can include filters to scope the search to classes or functions, and search relevance improves because important parts of code like classes, functions, and fields are ranked higher. By default, all searches use RE2 regular expressions, though you can escape individual special characters with a backslash, or an entire string by enclosing it in quotes.

To search, start typing in the search box at the top of the screen and you’ll see suggestions for matching results. For Java, JavaScript, and Proto, result suggestions indicate if the match is an entity such as a Class, Method, Enum or Field. Semantic understanding for more languages is on the way.

Bazel code search suggestions

If you don’t see the result you want in the suggestions, you can submit your search and find all matches on the search result page. From the results page, you can select a matching line or file to view.

Here’s a sampling of different search examples to try out on your own:

  • ccToolchain
    • search for the substring “ccToolchain”
  • class ccToolchain
    • search for files containing both “class” and “ccToolchain” substrings
  • “class ccToolchain”
    • search for files containing the phrase “class ccToolchain”
  • class:ccToolchain
    • search for classes where the name of a class contains the substring “ccToolchain”
  • file:cpp ccToolchain
    • search for files containing the substring “ccToolchain” where “cpp” is in the file path
  • file:cpp lang:java ccToolchain
    • search for Java files containing the substring “ccToolchain” where “cpp” is in the file path
  • aggre.*test
    • search for the regular expression “aggre.*test”
  • ccToolchain -test
    • search for the substring “ccToolchain” excluding any files containing the substring “test”
  • cTool case:yes
    • search for the substring “cTool” (case sensitive)

Note that all searches are case insensitive unless you specify “case:yes” in the query.
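The regular-expression behavior described above (case insensitive by default, special characters escaped with a backslash) can be tried out locally with a small Python sketch. Python's `re` module is used here only as a stand-in for RE2, the helper `search` is hypothetical, and site-specific operators such as file:, lang:, and class: are not modeled.

```python
import re


def search(pattern, text, case_sensitive=False):
    """Return True if the regular expression matches anywhere in the text."""
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.search(pattern, text, flags) is not None


# Sample strings for illustration only.
print(search("aggre.*test", "AggregatingTestListener"))       # True
print(search("cTool", "cctoolchain"))                         # True
print(search("cTool", "cctoolchain", case_sensitive=True))    # False

# An unescaped "." matches any character; a backslash makes it literal.
print(search("foo.bar", "fooxbar"))     # True
print(search(r"foo\.bar", "fooxbar"))   # False
```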

Understanding the Bazel codebase using cross references

Another way to understand the Bazel repository is through the use of cross references. If you’ve ever wondered how to properly use a method, cross references can help by displaying all references to that method so you can see how it is used in other parts of the codebase. Alternatively, if you see a method being used but don’t understand what that method actually does, cross references enable you to click the method to view the definition or see how it’s used elsewhere.

Cross references pane

Cross references aren’t only available for methods; they’re also generated for classes, fields, imports, and enums. Bazel Code Search uses the Kythe open source project to generate a semantic index of cross references for the Bazel codebase. These cross references appear automatically as hyperlinks within source files. To make cross references easier to identify, click the Cross References button to underline all cross references in a file.

Cross references underlined

Once you’ve clicked on a cross reference, the cross references pane will be displayed, where you can view all the definitions and references organized by file. Within the cross references pane, you can navigate into multiple levels of depth of cross references while continuing to view the original file in the File pane, which lets you maintain the context of your original task.

Navigating through levels of cross references

Browsing through Bazel repositories

Selecting a repository from the main screen will take you to a view of the chosen repository with search scoped to its contents. The breadcrumb toolbar at the top allows you to quickly navigate to other repositories, refs, or folders.

Repository view on Bazel Code Search

From the view of the repository, you can browse through folders and files in the repository while taking advantage of blame, change history, a diff view and many other features.

Bazel Code Search File View showing Blame and History

Give Feedback

We hope you’ll try Bazel Code Search and provide feedback through the “!” button in the top right of any page on the Bazel Code Search site. We would love to hear whether this tool helps you work with Bazel and what else you’d like to see Bazel Code Search offer.

Keep in mind that this project is still experimental and is subject to change.

By Russell Wolf

Thank you for BazelCon 2017

We are truly thankful to our community for making our first annual Bazel Conference a success! Check out all the videos of all the talks from BazelCon 2017 on YouTube.

Your feedback reflected a high level of satisfaction, and there was something of interest for everyone:

  • Bazel usage and migration stories from TensorFlow, Kubernetes, SpaceX, Pinterest, Wix, Stripe, DataBricks, and Dropbox.
  • Talks about upcoming features and tools such as remote execution & caching, buildozer, robolectric tests, PodToBUILD, bazeltfc, rules_typescript presented by Uber, Two Sigma, Google, Pinterest, and Asana.
  • Office hours where you got your Bazel questions answered, met engineers, and debugged on the fly. One attendee used their session to configure remote execution with buildfarm!

BazelCon2017 by the Numbers:

  • 200+ attendees
  • 60 organizations
  • 30 speakers
  • 12 informative talks
  • 14 hours of hands-on debugging and Q&A during Office Hours

What we heard from you:

  • Bazel Query is great!
  • Python is used more widely than we realized, and needs better Bazel support.
  • Reproducible builds are important especially when you're flying rockets, building autonomous vehicles, delivering media, and calculating financial transactions, leading to wider Bazel adoption by these communities.
  • Documentation is often hard to find, and is either too basic or too advanced.
  • Bazel's parallelism is impressive: much faster than Maven.
  • IDE integration is important, particularly with Visual Studio, CLion, and Xcode.
  • Build Event Protocol enables many options for internal visualization of events.
  • Aspects are a powerful tool.

What we can do next:

  • Work on better documentation and training, particularly for intermediate-level topics.
  • Prioritize IDE integration.
  • Engage the community in building better Python support in Bazel.
  • Implement improved support for third-party dependencies.
  • Continue work on cross-platform improvements.
  • Engage the community in wider adoption and contribution to Bazel.

What you can do next:

  • Contribute to Bazel, particularly to the new community-driven buildfarm effort.
  • Kick off local meet-ups (we will reach out to volunteers who responded to this in our survey).
  • Get more info on Bazel at bazel.build.
  • Join the discussion on bazel-discuss@googlegroups.com.

We look forward to working with you and growing our Bazel user community in 2018!

By Helen Altshuler and David Stanke

Bazel on Windows -- a year in retrospect

Bazel on Windows is no longer experimental. We think Bazel on Windows is now stable and usable enough to qualify for the Beta status we have given Bazel on other platforms.

Over the last year, we put a lot of work into improving Bazel's Windows support:

  • Bazel no longer depends on the MSYS runtime. This means Bazel is now a native Windows binary and we no longer link it to the MSYS DLLs. Bazel still needs Bash (MSYS or Git Bash) and the GNU binutils (binaries under /usr/bin) if your dependency graph includes genrule or sh_* rules (similarly to requiring python.exe to build py_* rules), but you can use any MSYS version and flavour you like, including Git Bash.
  • Bazel can now build Android applications. If you use native android_* rules, Bazel on Windows can now build and deploy Android applications.
  • Bazel is easier to set up. You now (typically) no longer need to set the following environment variables:
    • BAZEL_SH and BAZEL_PYTHON -- Bazel attempts to autodetect your Bash and Python installation paths.
    • JAVA_HOME -- we release Bazel with an embedded JDK. (We also release binaries without an embedded JDK if you want to use a different one.)
  • Visual C++ is the default C++ compiler. We no longer use GCC by default, though Bazel still supports it.
  • Bazel integrates better with the Visual C++ compiler. Bazel no longer dispatches to a helper script to run the compiler; instead Bazel now has a CROSSTOOL definition for Visual C++ and drives the compiler directly. This means Bazel creates fewer processes to run the compiler. By removing the script, we have eliminated one more point of failure.
  • Bazel creates native launchers. Bazel builds native Windows executables from java_binary, sh_binary, and py_binary rules. Unlike the .cmd files that Bazel used to build for these rules, the new .exe files no longer dispatch to a shell script to launch the xx_binaries, resulting in faster launch times. (If you see errors, you can use the --[no]windows_exe_launcher flag to fall back to the old behavior; if you do, please file a bug. We'd like to remove this flag and only support the new behavior.)

Coming soon

We are also working on bringing the following to Bazel on Windows:

  • Android Studio integration. We'll ensure Bazel works with Android Studio on Windows the same way it does on Linux and macOS. See issue #3888.
  • Dynamic C++ Linking. Bazel will support building and linking to DLLs on Windows. See issue #3311.
  • Skylark rule migration guide. We'll publish tutorials on writing Skylark rules that work not just on Linux and macOS, but also on Windows. See issue #3889.

Looking ahead, we aim to maintain feature parity between Windows and other platforms. We aim to maximize portability between host systems, so you get the same fast, correct builds on your developer OS of choice. If you run into any problems, please don't hesitate to file a bug.

By László Csomor