Bazel Blog

Bazel 0.2.0 Released

We are delighted to announce the 0.2.0 release of Bazel. This release marks major improvements in support for external repositories, Skylark and testing, in particular how external repositories and Skylark can work together.

Improvements from our roadmap

Skylark rules can now be loaded from remote repositories

Skylark rules can now be loaded from a remote repository. For example, to use the Scala rules, add the following to your WORKSPACE file:

git_repository(
    name = "io_bazel_rules_scala",
    remote = "https://github.com/bazelbuild/rules_scala.git",
    tag = "0.0.1",
)
load("@io_bazel_rules_scala//scala:scala.bzl", "scala_repositories")
scala_repositories()

This will download all of the tools the rules need to build Scala programs.

Then load and use normally from your BUILD files:

load("@io_bazel_rules_scala//scala:scala.bzl", "scala_library")
scala_library(...)

We will gradually move the existing rules to their own repositories, announcing changes on the mailing list.

Go build and test support

There is now Go language support; see the documentation for details.

Open sourcing tests

We also open sourced over a hundred tests and laid the foundation for open sourcing more. We will continue to open source more tests (both to increase Bazel's stability and to make contributing easier), but this marks a dramatic increase in test coverage.

Other major changes since 0.1.0

  • The --package_path definition in .bazelrc is no longer required, nor is the base_workspace/ directory.
  • JUnit test runner for Java tests - Use the --nolegacy_bazel_java_test flag (soon to be the default) to get XML output for easy integration into CI systems and easier debugging with --java_debug.
  • Skylark macros can now be loaded and used in the WORKSPACE file.
  • Remote repository filesystem changes are tracked.
  • Debian packages and a Homebrew recipe.

For changes since 0.1.5 (the minor version before 0.2.0), see the release notes.

Future plans

Looking ahead to 0.3.0:

  • Windows support is coming! (See the Windows label to follow the progress there).
  • Remote caching and execution are in progress (see Alpha Lam's work).
  • Xcode integration and generic IDE support.
  • Ulf has been working on sandboxing for OS X, which will hopefully be available soon.
  • More work on parallelization. We currently have experimental support (which can be enabled with the --experimental_interleave_loading_and_analysis flag) which improves clean build time (~30% faster loading and analysis), especially for builds using a lot of select() expressions.

Finally...

A big thank you to our community for your continued support. Particular shout-outs to the following contributors:

  • Brian Silverman - for tons of important bug fixes and answering lots of user questions.
  • Alpha Lam - for writing up design docs and implementing remote caching/execution.
  • P. Oscar Boykin - for putting tons of time and effort into the Scala rules, as well as being a tireless supporter on Twitter.

Thank you all, keep the discussion and bug reports coming!

Using Bazel in a continuous integration system

When doing continuous integration, you do not want your build to fail because a tool invoked during the build has been updated or some environmental conditions have changed. Because Bazel is designed for reproducible builds and keeps track of almost every dependency of your project, Bazel is a great tool for use inside a CI system. Bazel also caches the results of previous builds, including test results, and will not re-run unchanged tests, which speeds up each build.

Running Bazel on virtual or physical machines

For ci.bazel.build, we use a Google Compute Engine virtual machine for our Linux build and a physical Mac mini for our Mac build. Apart from the Bazel tests that are run using the ./compile.sh script, we also build some projects to validate Bazel binaries against: the Bazel Tutorial, re2, protobuf, and TensorFlow.

Bazel is reinstalled each time we run the tutorial or TensorFlow, but the Bazel cache is maintained across installs. The setup for those jobs is the following:

set -e

# Fetch the Bazel installer
URL="https://github.com/bazelbuild/bazel/releases/download/${BAZEL_VERSION}/bazel-${BAZEL_VERSION}-installer-${INSTALLER_PLATFORM}.sh"
export BAZEL_INSTALLER="${PWD}/bazel-installer/install.sh"
# Create the download directory first: curl -o does not create missing directories
mkdir -p "$(dirname "${BAZEL_INSTALLER}")"
curl -L -o "${BAZEL_INSTALLER}" "${URL}"
BASE="${PWD}/bazel-install"

# Install bazel inside ${BASE}
bash "${BAZEL_INSTALLER}" \
  --base="${BASE}" \
  --bazelrc="${BASE}/bin/bazel.bazelrc" \
  --bin="${BASE}/binary"

# Run the build
BAZEL="${BASE}/binary/bazel --bazelrc=${BASE}/bin/bazel.bazelrc"
${BAZEL} test //...

This tests installing a specific version of Bazel each time. Of course, if Bazel is installed on the path, one can simply run bazel test //... instead. However, even when reinstalling every time, Bazel caching simply works.

Running Bazel inside a Docker container

Several people want to use Bazel in a Docker container. First of all, Bazel has some features that are incompatible with Docker:

  • Bazel runs by default in client/server mode using UNIX domain sockets, so if you cannot mount the socket inside the Docker container, then you must disable client-server communication by running Bazel in batch mode with the --batch flag.
  • Bazel sandboxes all actions on Linux by default, and this needs special privileges in the Docker container (enabled by --privileged=true). If you cannot enable the namespace sandbox, you can deactivate it in Bazel with the --genrule_strategy=standalone --spawn_strategy=standalone flags.

So the last step of the previous script would look like:

# Run the build
BAZEL="${BASE}/binary/bazel --bazelrc=${BASE}/bin/bazel.bazelrc --batch"
${BAZEL} test --genrule_strategy=standalone --spawn_strategy=standalone \
    //...

This build will, however, be slower because the server has to restart for every build, and the cache will be lost when the Docker container is destroyed.

To prevent the loss of the cache, it is better to mount a persistent volume for ~/.cache/bazel (where the Bazel cache is stored).
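
The choice of flags for the two Docker limitations above can be sketched as a small helper. This is only an illustration in Python: the function name and its parameters are hypothetical, and the paths mirror the install layout used by the script earlier in this post.

```python
def bazel_test_command(base, server_mode=True, sandbox=True, targets=("//...",)):
    """Build the argument list for `bazel test` (hypothetical helper)."""
    # Startup options such as --bazelrc and --batch go before the command name.
    argv = [f"{base}/binary/bazel", f"--bazelrc={base}/bin/bazel.bazelrc"]
    if not server_mode:
        # The socket cannot be mounted in the container: use batch mode.
        argv.append("--batch")
    argv.append("test")
    if not sandbox:
        # The namespace sandbox cannot be enabled: run actions standalone.
        argv += ["--genrule_strategy=standalone", "--spawn_strategy=standalone"]
    argv += list(targets)
    return argv


# Most restrictive case: no server mode and no sandbox.
print(" ".join(bazel_test_command("/opt/bazel-install", server_mode=False, sandbox=False)))
```

The resulting list can be passed to subprocess.run, which avoids shell quoting issues when the install path contains spaces.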

Return code and XML output

A final consideration when setting up a continuous integration system is getting the result from the build. Bazel has the following interesting exit codes when using test and build commands:

Exit Code  Description
0          Success.
1          Build failed.
2          Command Line Problem, Bad or Illegal flags or command combination, or Bad Environment Variables. Your command line must be modified.
3          Build OK, but some tests failed or timed out.
4          Build successful but no tests were found even though testing was requested.
8          Build interrupted (by a Ctrl+C from the user for instance) but we terminated with an orderly shutdown.

These return codes can be used to determine the reason for a failure (in ci.bazel.build, we mark builds that have exited with exit code 3 as unstable, and builds with any other non-zero code as failed).
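
As a minimal sketch, the policy described above could be expressed as follows. The function name and the status strings are made up for illustration; the codes and descriptions come from the table.

```python
# Exit codes and descriptions taken from the table above.
BAZEL_EXIT_CODES = {
    0: "success",
    1: "build failed",
    2: "command line problem",
    3: "build OK, but some tests failed or timed out",
    4: "build successful, but no tests were found",
    8: "build interrupted",
}


def ci_status(exit_code):
    """Map a Bazel exit code to a CI status (hypothetical status names)."""
    if exit_code == 0:
        return "success"
    if exit_code == 3:
        # The build itself worked; only tests failed or timed out.
        return "unstable"
    return "failed"


for code in (0, 3, 1):
    print(code, ci_status(code), "-", BAZEL_EXIT_CODES.get(code, "unknown"))
```

In a CI job, the value passed to ci_status would typically be the returncode of the bazel test process.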

You can also control how much information about test results Bazel prints out with the --test_output flag. Generally, printing the output of failing tests with --test_output=errors is a good setting for a CI system.

Finally, Bazel's built-in JUnit test runner generates an Ant-style XML output file (in bazel-testlogs/pkg/target/test.xml) that summarizes the results of your tests. This test runner can be activated with the --nolegacy_bazel_java_test flag (this will soon be the default). Other tests also get a basic XML output file that contains only the result of the test (success or failure).
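
A CI system can read such a file with any XML parser. The sketch below assumes the common Ant/JUnit layout (testsuite elements containing testcase elements, with failure or error children for broken tests). The sample document is invented for illustration and the exact fields Bazel writes may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real file would be read from
# bazel-testlogs/pkg/target/test.xml.
SAMPLE_XML = """<testsuites>
  <testsuite name="com.example.FooTest" tests="2" failures="1" errors="0">
    <testcase name="testAdd" classname="com.example.FooTest" time="0.01"/>
    <testcase name="testDivide" classname="com.example.FooTest" time="0.02">
      <failure message="expected 2 but was 3"/>
    </testcase>
  </testsuite>
</testsuites>
"""


def summarize(xml_text):
    """Return (passed, failed) lists of fully qualified test case names."""
    root = ET.fromstring(xml_text)
    passed, failed = [], []
    for case in root.iter("testcase"):
        name = f"{case.get('classname')}.{case.get('name')}"
        # A test case is broken if it has a <failure> or <error> child.
        broken = case.find("failure") is not None or case.find("error") is not None
        (failed if broken else passed).append(name)
    return passed, failed


passed, failed = summarize(SAMPLE_XML)
print("passed:", passed)
print("failed:", failed)
```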

To get your test results, you can also use the Bazel dashboard, an optional system that automatically uploads Bazel test results to a shared server.

Persistent Worker Processes for Bazel

Bazel runs most build actions as a separate process. Many build actions invoke a compiler. However, starting a compiler is often slow: it has to perform some initialization when it starts up, read the standard library, header files, low-level libraries, and so on. That’s why some compilers and tools have a persistent mode, e.g. sjavac, Nailgun and gcc server. Keeping a single process for longer and passing multiple individual requests to the same server can significantly reduce the amount of duplicate work and cut down on compile times.

In Bazel, we have recently added experimental support for delegating work to persistent worker processes that run as child processes of and are managed by Bazel. Our Javac wrapper (called JavaBuilder) is the first compiler that supports running as a worker.

We’ve tried the persistent JavaBuilder for a variety of builds and are seeing a ~4x improvement in Java build times, as Javac can now benefit from JIT optimizations over multiple runs and we no longer have to start a new JVM for every compile action. For Bazel itself, we saw a reduction in build time for a clean build from ~58s to ~16s (on repeated builds).

[Charts: Full build; Incremental build]

If you often build Java code, we’d like you to give it a try. Just pass --strategy=Javac=worker to enable it or add build --strategy=Javac=worker to the .bazelrc in your home directory or in your workspace. Check the WorkerOptions class for flags to further tune the workers’ behavior or run “bazel help” and look for the “Strategy options” category. Let us know how it works for you.

We’re currently using a simple protobuf-based protocol to communicate with the worker process. Let us know if you want to add support for more compilers; in many cases, you can do that without any Bazel changes. However, the protocol is still subject to change based on your feedback.
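
To illustrate the general idea of a persistent worker (and not Bazel's actual protocol), here is a toy loop in Python. It deliberately swaps the protobuf-based protocol for newline-delimited JSON, and ToyCompiler, serve and the request fields are all invented for this sketch. The point is only that the expensive start-up happens once while many requests are served.

```python
import io
import json


class ToyCompiler:
    """Stand-in for a compiler with an expensive start-up phase."""

    startups = 0  # counts how often the "expensive" initialization ran

    def __init__(self):
        # Pretend this is slow: starting a JVM, loading libraries, and so on.
        ToyCompiler.startups += 1

    def compile(self, source):
        return f"compiled({source})"


def serve(requests, responses):
    """Start the compiler once, then answer one request per line until EOF."""
    compiler = ToyCompiler()
    for line in requests:
        if not line.strip():
            continue
        request = json.loads(line)
        result = {"id": request["id"], "output": compiler.compile(request["source"])}
        responses.write(json.dumps(result) + "\n")


# Two compile requests are handled by a single compiler instance.
requests = io.StringIO('{"id": 1, "source": "A.java"}\n{"id": 2, "source": "B.java"}\n')
responses = io.StringIO()
serve(requests, responses)
print(responses.getvalue(), end="")
print("compiler start-ups:", ToyCompiler.startups)
```

A real worker would read from standard input and write to standard output instead of in-memory buffers, and would speak whatever protocol the build tool defines.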

About Sandboxing

We added sandboxing to Bazel only two weeks ago, and we've already seen a flurry of fixes to almost all of the rules to conform with the additional restrictions imposed by it.

What is sandboxing?

Sandboxing is the technique of restricting the access rights of a process. In the context of Bazel, we're mostly concerned with restricting file system access. More specifically, Bazel's file system sandbox contains only known inputs, such that compilers and other tools can't even see files they should not access.

(We currently also mount a number of system directories into the sandbox to allow running locally installed tools and make it easier to write shell scripts. See below.)

Why are we sandboxing in Bazel?

We believe that developers should never have to worry about correctness, and that every build should result in the same output, regardless of the current state of the output tree. If a compiler or tool reads a file without Bazel knowing it, then Bazel won't rerun the action if that file has changed, leading to incorrect incremental builds.

We would also like to support remote caching in Bazel, where incorrect reuse of cache entries is even more of a problem than on the local machine. A bad cache entry in a shared cache affects every developer on the project, and the equivalent of 'bazel clean', namely wiping the entire remote cache, rather defeats the purpose.

In addition, sandboxing is closely related to remote execution. If the build works well with sandboxing, then it will likely work well with remote execution - if we know all the inputs, we can just as well upload them to a remote machine. Uploading all files (including local tools) can significantly reduce maintenance costs for compile clusters compared to having to install the tools on every machine in the cluster every time you want to try out a new compiler or make a change to an existing tool.

How does it work?

On Linux, we're using user namespaces, which are available in Linux 3.8 and later. Specifically, we create a new mount namespace. We create a temporary directory into which we mount all the files that the subprocess is allowed to see. We then use pivot_root to make the temporary directory appear as the root directory for all subprocesses.

We also mount /proc, /dev/null, /dev/zero, and a temporary filesystem (tmpfs) on /tmp. We mount /dev/random and /dev/urandom, but recommend against their usage, as it can lead to non-reproducible builds.

We currently also mount /bin, /etc, /usr (except /usr/local), and every directory starting with /lib, to allow running local tools. In the future, we are planning to provide a shell with a set of Linux utilities, and to require that all other tools are specified as inputs.

What about Mac and Windows?

We are planning to implement sandboxing for OS X (using OS X sandboxing, see our roadmap) and eventually Windows as well.

What about networking?

At some point, we'd like to also reduce network access, probably also using namespaces, with a separate opt-out mechanism.

How do I opt out of sandboxing?

Preferably, you should make all your rules and scripts work properly with sandboxing. If you need to opt out, you should talk to us first - at Google, the vast majority of actions are fully sandboxed, so we have some experience with how to make it work. For example, Bazel has a special mechanism to add information about the current user, date, time, or the current source control revision to generated binaries.

If you still need to opt out for individual rules, you can add the local = 1 attribute to genrule or *_test calls.

If you're writing a custom rule in Skylark, then you cannot currently opt out. Instead, please file a bug and we'll help you make it work.

Bazel Builder Blasts Beyond Beta Barrier

Reposted from Google's Open Source blog.

We're excited to announce the Beta release of Bazel, an open source build system designed to support a wide variety of different programming languages and platforms.

There are lots of other build systems out there -- Maven, Gradle, Ant, Make, and CMake just to name a few. So what's special about Bazel? Bazel is what we use to build the large majority of software within Google. As such, it has been designed to handle build problems specific to Google's development environment, including a massive, shared code repository in which all software is built from source, a heavy emphasis on automated testing and release processes, and language and platform diversity. Bazel isn't right for every use case, but we believe that we're not the only ones facing these kinds of problems and we want to contribute what we've learned so far to the larger developer community.

Our beta release provides:

Check out the tutorials for working examples using several languages:

  • Build a Java Project
  • Build a C++ Project
  • Build an Android App
  • Build an iOS App

We still have a long way to go. Looking ahead towards our 1.0.0 release, we plan to provide Windows support, distributed caching, and Go support among other features. See our roadmap for more details and follow our blog or Twitter account for regular updates. Feel free to contact us with questions or feedback on the mailing list or IRC (#bazel on freenode).

By Jeff Cox, Bazel team

Build dashboard dogfood

WARNING: This feature has been removed (2017-04-19).

We've added a basic dashboard where you can see and share build and test results. It's not ready for an official release yet, but if any adventurous people would like to try it out (and please report any issues you find!), feel free to give it a try.

First, you'll need to download or clone the dashboard project.

Run bazel build :dash && bazel-bin/dash and add this line to your ~/.bazelrc:

build --use_dash --dash_url=http://localhost:8080

Note that the first bazel build will take a long time (the dashboard uses the AppEngine SDK, which is ~160MB and has to be downloaded). The "dash" binary starts up a local server that listens on port 8080.

With --use_dash specified, every build or test will publish info and logs to http://localhost:8080/ (each build will print a unique URL to visit).

See the README for documentation.

This is very much a work in progress. Please let us know if you have any questions, comments, or feedback.

Building deterministic Docker images with Bazel

Docker images are great for automating your deployment environment. By composing base images, you can create an (almost) reproducible environment and, using an appropriate cloud service, easily deploy those images. However, the V1 Docker build suffers from several issues:

  1. Docker images are non-hermetic as they can run any command,
  2. Docker images are non-reproducible: each "layer" identifier is a random hex string (and not a cryptographic hash of the layer content), and
  3. Docker image builds are not incremental since Docker assumes that RUN foo always does the same thing.

Googlers working on Google Container Registry developed support for building reproducible Docker images using Skylark / Bazel that addresses these problems. We recently shipped it.

Of course, it does not support the RUN command, but the rule also strips the timestamps from the tar file and uses a SHA sum that is a function of the layer data as the layer identifier. This ensures reproducibility and correct incrementality.
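
The idea of a content-derived layer identifier can be sketched in a few lines of Python. This is not the implementation of the docker_build rule: build_layer is a hypothetical helper, and SHA-256 is an assumption, since the text above only says "a SHA sum".

```python
import hashlib
import io
import tarfile


def build_layer(files):
    """Return (tar_bytes, layer_id) for a dict mapping path -> bytes."""
    buf = io.BytesIO()
    # A fixed archive format keeps the header layout stable across runs.
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        for path in sorted(files):        # stable entry order
            data = files[path]
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            info.mtime = 0                # strip the timestamp
            info.uid = info.gid = 0       # do not record the building user
            info.uname = info.gname = ""
            info.mode = 0o644
            tar.addfile(info, io.BytesIO(data))
    tar_bytes = buf.getvalue()
    # The identifier depends only on the bytes of the layer.
    return tar_bytes, hashlib.sha256(tar_bytes).hexdigest()


_, first = build_layer({"etc/motd": b"hello\n", "bin/tool": b"\x7fELF"})
_, second = build_layer({"bin/tool": b"\x7fELF", "etc/motd": b"hello\n"})
print(first)
print("same id for same content:", first == second)
```

Because the identifier is a pure function of the content, an unchanged layer keeps its identifier and does not need to be rebuilt or uploaded again.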

To use the rule, simply create your images using the BUILD language:

load("/tools/build_defs/docker/docker", "docker_build")

docker_build(
   name = "foo",
   tars = [ "base.tar" ],
)

docker_build(
   name = "bar",
   base = ":foo",
   debs = [ "blah.deb" ],
   files = [ ":bazinga" ],
   volumes = [ "/asdf" ],
)

This will generate two docker images loadable with bazel run :foo and bazel run :bar. The foo target is roughly equivalent to the following Dockerfile:

FROM bazel/base

And the bar target is roughly equivalent to the following Dockerfile:

FROM bazel/foo
RUN dpkg -i blah.deb
ADD bazinga /
VOLUME /asdf

Using remote repositories, it is possible to fetch the various base images from the web, and we are working on providing a docker_pull rule to interact more fluently with existing images.

You can learn more about this docker support here.

Trimming your (build) tree

Reposted from @kchodorow's blog.

Jonathan Lange wrote a great blog post about how Bazel caches tests. Basically: if you run a test, change your code, then run a test again, the test will only be rerun if you changed something that could actually change the outcome of the test. Bazel takes this concept pretty far to minimize the work your build needs to do, in some ways that aren't immediately obvious.

Let's take an example. Say you're using Bazel to "build" rigatoni arrabiata, which could be represented as having the following dependencies:

Each food is a library which depends on the libraries below it. Suppose you change a dependency, like the garlic:

Bazel will stat the files of the "garlic" library and notice this change, and then make a note that the things that depend on "garlic" may have also changed:

The fancy term for this is "invalidating the upward transitive closure" of the build graph, aka "everything that depends on a thing might be dirty." Note that Bazel already knows that this change doesn't affect several of the libraries (rigatoni, tomato-puree, and red-pepper), so they definitely don't have to be rebuilt.

Bazel will then evaluate the "sauce" node and figure out if its output has changed. This is where the secret sauce (ha!) happens: if the output of the "sauce" node hasn't changed, Bazel knows that it doesn't have to recompile rigatoni-arrabiata (the top node), because none of its direct dependencies changed!

The sauce node is no longer “maybe dirty” and so its reverse dependencies (rigatoni-arrabiata) can also be marked as clean.

In general, of course, changing the code for a library will change its compiled form, so the "maybe dirty" node will end up being marked as "yes, dirty" and re-evaluated (and so on up the tree). However, Bazel's build graph lets you compile the bare minimum for a well-structured library, and in some cases avoid compilations altogether.
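
The two phases described above (mark everything upstream as "maybe dirty", then re-evaluate bottom-up and stop where an output is unchanged) can be sketched in Python. This is a simplified model rather than Bazel's implementation. The shape of the graph is an assumption reconstructed from the node names mentioned in the text, and the function names are invented.

```python
# Assumed dependency graph: each node maps to its direct dependencies.
DEPS = {
    "rigatoni-arrabiata": ["rigatoni", "sauce"],
    "sauce": ["tomato-puree", "red-pepper", "garlic"],
    "rigatoni": [],
    "tomato-puree": [],
    "red-pepper": [],
    "garlic": [],
}


def maybe_dirty(deps, changed):
    """Upward transitive closure of `changed`: everything that might be dirty."""
    rdeps = {node: [] for node in deps}
    for node, children in deps.items():
        for child in children:
            rdeps[child].append(node)
    seen, stack = set(), [changed]
    while stack:
        for parent in rdeps[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def topological_order(deps):
    """Dependencies first, so a node is visited after everything it needs."""
    order, seen = [], set()

    def visit(node):
        if node not in seen:
            seen.add(node)
            for dep in deps[node]:
                visit(dep)
            order.append(node)

    for node in deps:
        visit(node)
    return order


def rebuild(deps, changed, output_changed):
    """Return the nodes that are actually re-evaluated after `changed` is edited."""
    differs = {changed}                  # nodes whose output is known to differ
    candidates = maybe_dirty(deps, changed)
    evaluated = []
    for node in topological_order(deps):
        if node in candidates and any(dep in differs for dep in deps[node]):
            evaluated.append(node)
            if output_changed(node):
                differs.add(node)
        # Otherwise every direct dependency is clean and the node is skipped.
    return evaluated


print(sorted(maybe_dirty(DEPS, "garlic")))
print(rebuild(DEPS, "garlic", output_changed=lambda node: False))
print(rebuild(DEPS, "garlic", output_changed=lambda node: True))
```

The output_changed callback stands in for comparing a node's new output with its previous one. When it reports no change for "sauce", the top node is never re-evaluated.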

Configuring your Java builds

Let's say that you want to build for Java 8 with Error Prone checks turned off, while keeping the tools directory provided with Bazel in the package path. You could do that with the following rc file:

build --javacopt="-extra_checks:off"
build --javacopt="-source 8"
build --javacopt="-target 8"

However, the file would quickly become overloaded, especially if you take all languages and options into account. Instead, you can tweak the java_toolchain rule that specifies the various options for the Java compiler. So in a BUILD file:

java_toolchain(
    name = "my_toolchain",
    encoding = "UTF-8",
    source_version = "8",
    target_version = "8",
    misc = [
        "-extra_checks:off",
    ],
)

To keep it out of the tools directory (otherwise you would need to copy the rest of the package), you can redirect the default toolchain in a bazelrc:

build --java_toolchain=//package:my_toolchain

In the future, toolchain rules should be the configuration points for all the languages but it is a long road. We also want to make it easier to rebind the toolchain using the bind rule in the WORKSPACE file.

Sharing your rc files

You can customize the options Bazel runs with in your ~/.bazelrc, but that doesn't scale when you share your workspace with others.

For instance, you could deactivate Error Prone's DepAnn checks by adding the --javacopt="-Xep:DepAnn:OFF" flag in your ~/.bazelrc. However, ~/.bazelrc is not really convenient as it is a user file, not shared with your team. You could instead add an rc file at tools/bazel.rc in your workspace with the content of the bazelrc file you want to share with your team:

build --javacopt="-Xep:DepAnn:OFF"

This file, called a master rc file, is parsed before the user rc file. There are three paths to master rc files, which are read in the following order:

  1. tools/bazel.rc (depot master rc file),
  2. /path/to/bazel.bazelrc (rc file alongside the bazel binary), and
  3. /etc/bazel.bazelrc (system-wide bazel rc file).
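
The lookup order above can be written down as a short Python sketch. The helper name and its parameters are hypothetical, and the second entry assumes that "alongside" means the path of the bazel binary with .bazelrc appended.

```python
import posixpath


def rc_file_candidates(workspace, bazel_binary, home):
    """Return rc file paths in read order: master rc files, then the user file."""
    master = [
        posixpath.join(workspace, "tools", "bazel.rc"),  # depot master rc file
        bazel_binary + ".bazelrc",                       # next to the bazel binary (assumed)
        "/etc/bazel.bazelrc",                            # system-wide rc file
    ]
    user = posixpath.join(home, ".bazelrc")              # parsed after the master files
    return master + [user]


for path in rc_file_candidates("/src/project", "/usr/local/bin/bazel", "/home/alice"):
    print(path)
```

Only the files that actually exist would be read, so a real tool would filter this list with a check such as os.path.exists.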

The complete documentation on rc files is here.