<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://fzakaria.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://fzakaria.com/" rel="alternate" type="text/html" /><updated>2026-04-13T13:30:46-07:00</updated><id>https://fzakaria.com/feed.xml</id><title type="html">Farid Zakaria’s Blog</title><subtitle>I&apos;m a software engineer, father and wishful amateur surfer. If you&apos;ve come seeking my political views, you&apos;ve found the wrong &lt;a href=&quot;https://fareedzakaria.com/&quot;&gt;Fareed&lt;/a&gt;.</subtitle><entry><title type="html">Does anyone actually use the large code-model?</title><link href="https://fzakaria.com/2026/03/27/does-anyone-actually-use-the-large-code-model" rel="alternate" type="text/html" title="Does anyone actually use the large code-model?" /><published>2026-03-27T09:37:00-07:00</published><updated>2026-03-27T09:37:00-07:00</updated><id>https://fzakaria.com/2026/03/27/does-anyone-actually-use-the-large-code-model</id><content type="html" xml:base="https://fzakaria.com/2026/03/27/does-anyone-actually-use-the-large-code-model"><![CDATA[<p>I have been focused lately on trying to resolve relocation overflows when compiling large binaries in the small &amp; medium code-models.
Often when talking to others about the problem, they are quick to offer the idea of using the large code-model.</p>

<dl>
  <dt><strong>small code-model</strong></dt>
  <dd>Assumes all code and data comfortably fit within a single 2GiB window. The compiler relies on fast, compact 32-bit PC-relative offsets for all function calls and data accesses.</dd>
  <dt><strong>medium code-model</strong></dt>
  <dd>Assumes code stays under 2GiB, but data might exceed it. It splits data into “small” and “large” sections, using 32-bit offsets for code and small data and 64-bit absolute addresses only for the large data.</dd>
  <dt><strong>large code-model</strong></dt>
  <dd>Makes zero assumptions about size or placement, lifting the 2GiB limit entirely. The compiler is forced to use 64-bit absolute addressing for every external reference.</dd>
</dl>

<p>Despite the performance downsides of the instructions the large code-model generates, it’s true that its intent was to support arbitrarily large binaries.
However, does anyone actually use it?</p>

<p>Turns out that large binaries affect not only the instructions generated in the <code class="language-plaintext highlighter-rouge">.text</code> section but also other sections within the ELF file, such as
<code class="language-plaintext highlighter-rouge">.eh_frame</code> (exception handling information), <code class="language-plaintext highlighter-rouge">.eh_frame_hdr</code> (optimized binary search table for <code class="language-plaintext highlighter-rouge">.eh_frame</code>), and even <code class="language-plaintext highlighter-rouge">.gcc_except_table</code>.</p>

<p>Let’s take <code class="language-plaintext highlighter-rouge">.eh_frame</code> and <code class="language-plaintext highlighter-rouge">.eh_frame_hdr</code> as an example. They specifically allow various encodings for the data within them (<code class="language-plaintext highlighter-rouge">sdata4</code> or <code class="language-plaintext highlighter-rouge">sdata8</code> for 4 bytes and 8 bytes respectively) irrespective of the code-model used. However, it turns out that userland support for these encodings is terrible!</p>

<p>If we look at the <code class="language-plaintext highlighter-rouge">.eh_frame_hdr</code> format, we can see how these encodings are applied in practice. The entries marked <code class="language-plaintext highlighter-rouge">encoded</code> in the first column are the ones that resolve to specific DWARF exception header encoding formats (like <code class="language-plaintext highlighter-rouge">sdata4</code>, <code class="language-plaintext highlighter-rouge">sdata8</code>, <code class="language-plaintext highlighter-rouge">udata4</code>, etc.) depending on the values provided in the preceding <code class="language-plaintext highlighter-rouge">*_enc</code> fields.</p>

<p><code class="language-plaintext highlighter-rouge">.eh_frame_hdr</code> format [<a href="https://refspecs.linuxfoundation.org/LSB_1.3.0/gLSB/gLSB/ehframehdr.html">ref</a>]:</p>

<table>
  <thead>
    <tr>
      <th style="text-align: left">Encoding</th>
      <th style="text-align: left">Field</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: left">unsigned byte</td>
      <td style="text-align: left">version</td>
    </tr>
    <tr>
      <td style="text-align: left">unsigned byte</td>
      <td style="text-align: left">eh_frame_ptr_enc</td>
    </tr>
    <tr>
      <td style="text-align: left">unsigned byte</td>
      <td style="text-align: left">fde_count_enc</td>
    </tr>
    <tr>
      <td style="text-align: left">unsigned byte</td>
      <td style="text-align: left">table_enc</td>
    </tr>
    <tr>
      <td style="text-align: left">encoded</td>
      <td style="text-align: left">eh_frame_ptr</td>
    </tr>
    <tr>
      <td style="text-align: left">encoded</td>
      <td style="text-align: left">fde_count</td>
    </tr>
    <tr>
      <td style="text-align: left"><em>(encoded based on table_enc)</em></td>
      <td style="text-align: left">binary search table</td>
    </tr>
  </tbody>
</table>

<p><em>Note: The <code class="language-plaintext highlighter-rouge">encoded</code> values for <code class="language-plaintext highlighter-rouge">eh_frame_ptr</code> and <code class="language-plaintext highlighter-rouge">fde_count</code> dictate their byte size and format. For example, if <code class="language-plaintext highlighter-rouge">fde_count_enc</code> is set to <code class="language-plaintext highlighter-rouge">DW_EH_PE_sdata4</code>, the <code class="language-plaintext highlighter-rouge">fde_count</code> field will be processed as an <code class="language-plaintext highlighter-rouge">sdata4</code> (signed 4-byte) value.</em></p>
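<p>To make the scheme concrete, here is a minimal decoder sketch (in Python, not from any real unwinder; the <code class="language-plaintext highlighter-rouge">DW_EH_PE_*</code> constants are the standard DWARF exception-header encodings, while the sample header bytes are invented for illustration):</p>

```python
import struct

# Standard DWARF exception-header encoding constants (low nibble = format).
DW_EH_PE_udata4 = 0x03
DW_EH_PE_sdata4 = 0x0B
DW_EH_PE_sdata8 = 0x0C

def read_encoded(buf, off, enc):
    """Decode one value according to its DW_EH_PE_* format nibble."""
    fmt = enc & 0x0F
    if fmt == DW_EH_PE_udata4:
        return struct.unpack_from('<I', buf, off)[0], off + 4
    if fmt == DW_EH_PE_sdata4:
        return struct.unpack_from('<i', buf, off)[0], off + 4
    if fmt == DW_EH_PE_sdata8:
        return struct.unpack_from('<q', buf, off)[0], off + 8
    raise NotImplementedError(hex(enc))

# Synthetic .eh_frame_hdr prefix: version, then the three *_enc bytes,
# then eh_frame_ptr and fde_count in whatever formats those bytes declared.
hdr = bytes([1, DW_EH_PE_sdata8, DW_EH_PE_udata4, DW_EH_PE_sdata4])
hdr += struct.pack('<q', -0x90000000)   # eh_frame_ptr as sdata8: beyond ±2 GiB
hdr += struct.pack('<I', 42)            # fde_count as udata4

off = 4
eh_frame_ptr, off = read_encoded(hdr, off, hdr[1])
fde_count, off = read_encoded(hdr, off, hdr[2])
print(eh_frame_ptr, fde_count)  # -2415919104 42
```

<p>Note how an <code class="language-plaintext highlighter-rouge">sdata8</code> field happily carries a value far outside the ±2 GiB range that <code class="language-plaintext highlighter-rouge">sdata4</code> can represent; the format supports it even when the tooling doesn’t.</p>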

<p>Up until very recently (<a href="https://github.com/llvm/llvm-project/pull/179089">pull#179089</a>), LLVM’s linker <code class="language-plaintext highlighter-rouge">lld</code> would crash if it tried to link exception data (<code class="language-plaintext highlighter-rouge">.eh_frame_hdr</code>) beyond 2GiB.
This section is generated so that stack unwinders can avoid a linear search over the frame entries.</p>

<p>Once we fix that though, it looks like <code class="language-plaintext highlighter-rouge">libgcc</code> (<a href="https://gcc.gnu.org/pipermail/gcc-patches/2026-March/711435.html">gcc-patch@</a>) and <code class="language-plaintext highlighter-rouge">libunwind</code> (<a href="https://github.com/libunwind/libunwind/pull/964">pull#964</a>) either crash outright on <code class="language-plaintext highlighter-rouge">sdata8</code> or skip the binary search table entirely, falling back to linear search.</p>

<p>How devastating is linear search here?</p>

<p>If you have a lot of exceptions, which you theoretically might in the large code-model, it’s brutal: I had benchmarks improve from <strong>~13s</strong> to <strong>~18ms</strong>, a <strong>~700x speedup</strong>.</p>
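<p>The binary search table is what makes that kind of speedup possible: a sorted array of (initial PC, FDE pointer) pairs that an unwinder can bisect. A rough sketch of the two lookup strategies (the table values are hypothetical, not real unwinder data):</p>

```python
import bisect

# Hypothetical sorted binary search table, mirroring what .eh_frame_hdr
# stores: (function start PC, FDE offset) pairs in ascending PC order.
table = [(0x1000 * i, 0x40 * i) for i in range(200_000)]
starts = [pc for pc, _ in table]

def find_fde(pc):
    """O(log n): rightmost entry whose start PC <= pc."""
    i = bisect.bisect_right(starts, pc) - 1
    return table[i][1] if i >= 0 else None

def find_fde_linear(pc):
    """O(n): what libgcc/libunwind degrade to without the table."""
    best = None
    for start, fde in table:
        if start > pc:
            break
        best = fde
    return best

pc = 0x1000 * 123_456 + 0x2A
assert find_fde(pc) == find_fde_linear(pc) == 0x40 * 123_456
```

<p>For 200,000 frame entries that’s ~18 comparisons per unwind step instead of ~100,000 on average, which is where the orders-of-magnitude gap comes from.</p>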

<p>Other fun failure modes that exist:</p>

<dl>
  <dt><strong>Thread Local Storage (.tdata and .tbss)</strong></dt>
  <dd>Highly optimized TLS access models often rely on 32-bit offsets from the thread pointer to fetch thread-local variables. Massive binaries can push these variables too far away, breaking the fast-path TLS instructions and forcing you into slower, more general TLS models.</dd>
  <dt><strong>The String Table (.strtab)</strong></dt>
  <dd>Even in a 64-bit ELF (<code class="language-plaintext highlighter-rouge">Elf64_Sym</code>), the <code class="language-plaintext highlighter-rouge">st_name</code> field, which holds the offset to the symbol’s name in the string table, is only a 32-bit integer. If you have enough heavily mangled C++ templates, your string table can theoretically hit the 4GiB limit, at which point the ELF format itself fundamentally caps out. 🫠</dd>
</dl>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  typedef struct {
	Elf64_Word	st_name;
	unsigned char	st_info;
	unsigned char	st_other;
	Elf64_Half	st_shndx;
	Elf64_Addr	st_value;
	Elf64_Xword	st_size;
  } Elf64_Sym;
</code></pre></div></div>

<p><em>Note: Don’t let <code class="language-plaintext highlighter-rouge">Elf64_Word</code> confuse you; it’s actually 32-bit: <code class="language-plaintext highlighter-rouge">typedef uint32_t	Elf64_Word;</code></em></p>
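<p>A quick way to convince yourself is to pack the struct by hand (a Python sketch, assuming the layout above and the standard ELF type sizes):</p>

```python
import struct

# Layout of Elf64_Sym from the struct above: Elf64_Word (uint32) st_name,
# two unsigned chars, Elf64_Half (uint16), then Elf64_Addr and Elf64_Xword
# (both uint64), little-endian.
ELF64_SYM = struct.Struct('<IBBHQQ')
assert ELF64_SYM.size == 24  # fixed 24-byte symbol table entry

# st_name saturates at 2**32 - 1: a name offset past 4 GiB into .strtab
# simply cannot be represented in the format.
st_name_max = 2**32 - 1
raw = ELF64_SYM.pack(st_name_max, 0, 0, 0, 0, 0)
st_name, *_ = ELF64_SYM.unpack(raw)
print(hex(st_name))  # 0xffffffff
```
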

<p>It seems like the large code-model “exists”, but no one is using it for its intended purpose, which was to build large binaries.
I am working to make massive binaries possible without the large code-model while retaining much of the performance characteristics of the small code-model.</p>

<p>You can read more about it in <a href="https://groups.google.com/g/x86-64-abi/c/hz28LNnlBEc/m/J211uZASAgAJ">x86-64-abi</a> google-group where I have also posted an RFC.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[I have been focused lately on trying to resolve relocation overflows when compiling large binaries in the small &amp; medium code-models. Often when talking to others about the problem, they are quick to offer the idea of using the large code-model.]]></summary></entry><entry><title type="html">Nix is a lie, and that’s ok</title><link href="https://fzakaria.com/2026/03/07/nix-is-a-lie-and-that-s-ok" rel="alternate" type="text/html" title="Nix is a lie, and that’s ok" /><published>2026-03-07T09:21:00-08:00</published><updated>2026-03-07T09:21:00-08:00</updated><id>https://fzakaria.com/2026/03/07/nix-is-a-lie-and-that-s-ok</id><content type="html" xml:base="https://fzakaria.com/2026/03/07/nix-is-a-lie-and-that-s-ok"><![CDATA[<p>When <a href="https://edolstra.github.io/">Eelco Dolstra</a>, father of Nix, descended from the mountain tops and enlightened us all, one of the main <em>commandments</em> for Nix was to eschew all uses of the <a href="https://www.pathname.com/fhs/">Filesystem Hierarchy Standard (FHS)</a>.</p>

<blockquote>
  <p>The FHS is the “find libraries and files by convention” dogma Nix abandons in the pursuit of purity.</p>
</blockquote>

<p><a href="/assets/images/nix_commandments_large.png"><img src="/assets/images/nix_commandments_50p.png" alt="nix commandments" /></a></p>

<p>What if I told you that was a <em>lie</em> ? 😑</p>

<p>Nix was explicitly designed to eliminate standard FHS paths (like <code class="language-plaintext highlighter-rouge">/usr/lib</code> or <code class="language-plaintext highlighter-rouge">/lib64</code>) to guarantee reproducibility. However, graphics drivers represent a hard boundary between user-space and kernel-space.</p>

<p>The user-space library (<code class="language-plaintext highlighter-rouge">libGL.so</code>) must match the host OS’s kernel module and the physical GPU.</p>

<p>Nearly all derivations avoid bundling <code class="language-plaintext highlighter-rouge">libGL.so</code> because they have no way of predicting the hardware or host kernel the binary will run on.</p>

<p>What about NixOS? Surely, we know what kernel and drivers we have there!? 🤔</p>

<p>Well, if we modified every derivation to include the correct <code class="language-plaintext highlighter-rouge">libGL.so</code> it would cause massive rebuilds for every user and make the NixOS cache effectively useless.</p>

<p>To solve this, NixOS &amp; Home Manager introduce an intentional impurity, a global path at <code class="language-plaintext highlighter-rouge">/run/opengl-driver/lib</code> where derivations expect to find <code class="language-plaintext highlighter-rouge">libGL.so</code>.</p>

<p>We’ve just re-introduced a convention path à la FHS. 🫠</p>

<p>Unfortunately, that leaves users who run Nix on other Linux distributions in a bad state, documented in <a href="https://github.com/NixOS/nixpkgs/issues/9415">issue#9415</a>, which has been open since 2015. If you try to install and run any Nix application that requires graphics, you’ll be hit with the exact error message Nix was designed to thwart:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>error while loading shared libraries: libGL.so.1: 
cannot open shared object file: No such file or directory
</code></pre></div></div>

<p>There are a couple of workarounds for those of us who use Nix on alternate distributions:</p>
<ul>
  <li><a href="https://github.com/nix-community/nixGL">nixGL</a>, a runtime script that injects the library via <code class="language-plaintext highlighter-rouge">$LD_LIBRARY_PATH</code></li>
  <li>manually hacking <code class="language-plaintext highlighter-rouge">$LD_LIBRARY_PATH</code></li>
  <li>creating your own <code class="language-plaintext highlighter-rouge">/run/opengl-driver</code> and symlinking it with the drivers from <code class="language-plaintext highlighter-rouge">/usr/lib/x86_64-linux-gnu</code></li>
</ul>

<p>For those of us who cling to the beautiful purity of Nix, however, it feels like a sad but ultimately necessary trade-off.</p>

<p><em>Thou shall not use FHS, unless you really need to.</em></p>]]></content><author><name></name></author><summary type="html"><![CDATA[When Eelco Dolstra, father of Nix, descended from the mountain tops and enlightened us all, one of the main commandments for Nix was to eschew all uses of the Filesystem Hierarchy Standard (FHS).]]></summary></entry><entry><title type="html">Linker Pessimization</title><link href="https://fzakaria.com/2026/02/18/linker-pessimization" rel="alternate" type="text/html" title="Linker Pessimization" /><published>2026-02-18T07:54:00-08:00</published><updated>2026-02-18T07:54:00-08:00</updated><id>https://fzakaria.com/2026/02/18/linker-pessimization</id><content type="html" xml:base="https://fzakaria.com/2026/02/18/linker-pessimization"><![CDATA[<p>In a <a href="/2026/01/30/crazy-shit-linkers-do-relaxation">previous post</a>, I wrote about <em>linker relaxation</em>: the linker’s ability to replace a slower, larger instruction with a faster, smaller one when it has enough information at link time. For instance, an indirect <code class="language-plaintext highlighter-rouge">call</code> through the GOT can be relaxed into a direct <code class="language-plaintext highlighter-rouge">call</code> plus a <code class="language-plaintext highlighter-rouge">nop</code>. This is a well-known technique to optimize the instructions for performance.</p>

<p>Does it ever make sense to go the <em>other direction</em>? 🤔</p>

<p>We’ve been working on linking some massive binaries that include Intel’s <a href="https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl.html">Math Kernel Library (MKL)</a>, a prebuilt static archive. MKL ships as object files compiled with the <em>small</em> code-model (<code class="language-plaintext highlighter-rouge">mcmodel=small</code>), meaning its instructions assume everything is reachable within ±2 GiB. The included object files also have some odd relocations where the addend is a very large negative number (magnitude &gt;1GiB).</p>

<p>The calculation for the relocation value is <strong>S + A - P</strong>: the symbol address plus the addend minus the instruction address. With a sufficiently large negative addend, the relocation value can easily exceed the 2 GiB limit and the linker fails with relocation overflows.</p>
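<p>The failure is easy to reproduce with plain arithmetic. A small sketch (the addresses are hypothetical, in the same ballpark as our case):</p>

```python
# Hypothetical addresses plugged into the R_X86_64_PC32 formula S + A - P.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

S = 200 * 2**20        # symbol ~200 MiB into the image
A = -0x44000000        # large negative addend baked into the object file
P = 1200 * 2**20       # instruction ~1200 MiB into .text

value = S + A - P
fits = INT32_MIN <= value <= INT32_MAX
print(value, fits)     # well below INT32_MIN: relocation overflow
```
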

<p>We can’t recompile MKL (it’s a prebuilt proprietary archive), and we can’t simply switch everything to the large code model. What can we do? 🤔</p>

<p>I am calling this technique <strong>linker pessimization</strong>: the reverse of relaxation. Instead of shrinking an instruction, we <em>expand</em> one to tolerate a larger address space. 😈</p>

<h3 id="the-problematic-lea">The Problematic LEA</h3>

<p>The specific instructions that overflow in our case are <code class="language-plaintext highlighter-rouge">LEA</code> (Load Effective Address) instructions.</p>

<p>In x86_64, <code class="language-plaintext highlighter-rouge">lea r9, [rip + disp32]</code> performs pure arithmetic: it computes <code class="language-plaintext highlighter-rouge">RIP + disp32</code> and stores the result in <code class="language-plaintext highlighter-rouge">r9</code> without accessing memory. The <code class="language-plaintext highlighter-rouge">disp32</code> is a <strong>32-bit signed integer</strong> embedded directly into the instruction encoding, and the linker fills it in via an <code class="language-plaintext highlighter-rouge">R_X86_64_PC32</code> relocation.</p>

<p>The relocation formula is <strong>S + A - P</strong>. Let’s look at an example with a large addend.</p>

<table>
  <thead>
    <tr>
      <th>Term</th>
      <th>Meaning</th>
      <th>Value (approximate)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><strong>S</strong> (Symbol)</td>
      <td>Address of the symbol</td>
      <td>~200 MB into <code class="language-plaintext highlighter-rouge">.rodata</code></td>
    </tr>
    <tr>
      <td><strong>A</strong> (Addend)</td>
      <td>Constant baked into the object file</td>
      <td><code class="language-plaintext highlighter-rouge">-0x44000000</code> (−1,062 MB)</td>
    </tr>
    <tr>
      <td><strong>P</strong> (Position)</td>
      <td>Address of the instruction being patched</td>
      <td>~1,200 MB into <code class="language-plaintext highlighter-rouge">.text</code></td>
    </tr>
  </tbody>
</table>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>S + A - P  =  200 + (−1062) − 1200
           =  −2062 MB
</code></pre></div></div>

<p>A 32-bit signed integer can only represent ±2,048 MB (±2 GiB). Our value of <strong>−2,062 MB</strong> exceeds that range and the linker rightfully complains 💥:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ld.lld: error: libfoo.a(...):(function ...: .text+0x...):
  relocation R_X86_64_PC32 out of range:
  -2160984064 is not in [-2147483648, 2147483647]
</code></pre></div></div>

<blockquote class="alert alert-note">
  <p><strong>Note</strong>
These <code class="language-plaintext highlighter-rouge">LEA</code> instructions appear in MKL because the library uses them as a way to compute an address of a data table relative to the instruction pointer. The large negative addend (<code class="language-plaintext highlighter-rouge">-0x44000000</code>) is <em>intentional</em>; it’s an offset within a large lookup table.</p>
</blockquote>

<h3 id="the-idea-replace-lea-with-mov">The Idea: Replace LEA with MOV</h3>

<p>The core idea is delightful because as engineers we are trained to optimize systems, but in this case we want the opposite. We swap the <code class="language-plaintext highlighter-rouge">LEA</code> for a <code class="language-plaintext highlighter-rouge">MOV</code> that reads through a nearby pointer.</p>

<p>Recall from the <a href="/2026/01/30/crazy-shit-linkers-do-relaxation">relaxation post</a>: relaxation <em>shrinks</em> instructions (e.g. indirect <code class="language-plaintext highlighter-rouge">call</code> -&gt; direct <code class="language-plaintext highlighter-rouge">call</code>). Here we do the opposite: we make the instruction <em>do more work</em> (pure arithmetic -&gt; memory load) in exchange for a reachable displacement. That’s why I consider it a <em>pessimization</em> or <em>reverse-relaxation</em>.</p>

<p>Both instructions use the same encoding length (7 bytes with a REX prefix), so the patch is a <strong>single byte change</strong> in the opcode. 🤓</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>LEA:  4C 8D 0D xx xx xx xx    lea r9, [rip + disp32]   (opcode 0x8D)
MOV:  4C 8B 0D xx xx xx xx    mov r9, [rip + disp32]   (opcode 0x8B)
         ^^
 only this byte changes!
</code></pre></div></div>

<p>The difference in behavior is critical:</p>
<ul>
  <li><strong>LEA</strong>: <code class="language-plaintext highlighter-rouge">r9 = RIP + disp32</code> (arithmetic, no memory access). <code class="language-plaintext highlighter-rouge">disp32</code> must encode the entire distance to the far-away data. This overflows.</li>
  <li><strong>MOV</strong>: <code class="language-plaintext highlighter-rouge">r9 = *(RIP + disp32)</code> (memory load). <code class="language-plaintext highlighter-rouge">disp32</code> points to a <em>nearby</em> 8-byte pointer slot. The pointer slot holds the full 64-bit address. This never overflows.</li>
</ul>

<h3 id="visualizing-the-change">Visualizing the Change</h3>

<p><strong>Original</strong> — the <code class="language-plaintext highlighter-rouge">LEA</code> must reach across the entire binary:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                    disp32 must encode this entire distance
                 ╭──────────────────────────────────────────╮
                 │           ~2+ GiB  (OVERFLOW!)           │
                 │                                          │
  .text          ▼                                          │
  ┌──────────────────────────┐                              │
  │ lea r9, [rip + disp32]   │─────────── X ────────────────┤
  │        (0x8D)            │  can't fit in 32 bits!       │
  └──────────────────────────┘                              │
                                                            │
  .rodata (far away)                                        │
  ┌──────────────────────────┐                              │
  │ symbol + offset          │◄─────────────────────────────╯
  └──────────────────────────┘
</code></pre></div></div>

<p><strong>Pessimized</strong> — the <code class="language-plaintext highlighter-rouge">MOV</code> reads a nearby pointer that holds the full address:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  .text                          .data.fixup (nearby)
  ┌────────────────────────┐    ┌──────────────────────────┐
  │ mov r9, [rip + disp32] │──▶ │ .quad &lt;64-bit address&gt;   │
  │        (0x8B)          │    │  (R_X86_64_64 reloc)     │
  └────────────────────────┘    └──────────┬───────────────┘
         small offset ✓                    │
         always fits in 32 bits            │  full 64-bit pointer
                                           │  NEVER overflows
  .rodata (far away)                       │
  ┌──────────────────────────┐             │
  │ symbol + offset          │◄────────────╯
  └──────────────────────────┘
</code></pre></div></div>

<p>We’ve traded one direct <code class="language-plaintext highlighter-rouge">LEA</code> computation for an indirect <code class="language-plaintext highlighter-rouge">MOV</code> through a pointer, and we make sure the displacement is now tiny. The 64-bit pointer slot can reach <em>any</em> address in the virtual address space. 👌</p>

<h3 id="implementation-details">Implementation Details</h3>

<p>For each problematic relocation, three changes are needed in the object file:</p>

<p><strong>1. Opcode Patch</strong>: In <code class="language-plaintext highlighter-rouge">.text</code>, change byte <code class="language-plaintext highlighter-rouge">0x8D</code> to <code class="language-plaintext highlighter-rouge">0x8B</code> (1 byte).</p>

<p>This converts the <code class="language-plaintext highlighter-rouge">LEA</code> (compute address) into a <code class="language-plaintext highlighter-rouge">MOV</code> (load from address). The rest of the instruction encoding (ModR/M byte, REX prefix) stays identical because both instructions use the same operand format.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> Before:  4C 8D 0D xx xx xx xx    lea  r9, [rip + disp32]
 After:   4C 8B 0D xx xx xx xx    mov  r9, QWORD PTR [rip + disp32]
             ^^
</code></pre></div></div>
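<p>As a sketch of the patch itself (the <code class="language-plaintext highlighter-rouge">disp32</code> bytes are invented; the opcodes are the real x86-64 encodings for <code class="language-plaintext highlighter-rouge">lea</code>/<code class="language-plaintext highlighter-rouge">mov</code> with a <code class="language-plaintext highlighter-rouge">[rip + disp32]</code> operand):</p>

```python
# One-byte LEA -> MOV patch on a raw 7-byte instruction:
# 4C 8D 0D xx xx xx xx  =  lea r9, [rip + disp32]
insn = bytearray([0x4C, 0x8D, 0x0D, 0x00, 0x00, 0x00, 0x44])

assert insn[1] == 0x8D          # LEA opcode
insn[1] = 0x8B                  # becomes MOV r64, r/m64

# REX prefix, ModR/M byte, and the disp32 bytes are all untouched.
assert insn[0] == 0x4C and insn[2] == 0x0D
print(insn.hex())  # 4c8b0d00000044
```
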

<p><strong>2. New Pointer Slot</strong> — Create a new section (<code class="language-plaintext highlighter-rouge">.data.fixup</code>) containing 8 zero bytes per patch site, plus a new <code class="language-plaintext highlighter-rouge">R_X86_64_64</code> relocation pointing to the original symbol with the original addend.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> .data.fixup:
   .quad 0x0000000000000000      # linker fills via R_X86_64_64
         ▲
         └── relocation: R_X86_64_64  sym=symbol  addend=-0x44000000
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">R_X86_64_64</code> is a <strong>64-bit absolute</strong> relocation. Its formula is simply <code class="language-plaintext highlighter-rouge">S + A</code>, no subtraction of <code class="language-plaintext highlighter-rouge">P</code>. There is no 32-bit range limitation; it can address the entire 64-bit address space. This is the key insight that makes the fix work.</p>

<p><strong>3. Retarget the Original Relocation</strong> — In the <code class="language-plaintext highlighter-rouge">.rela.text</code> entry for the patched instruction, change the symbol to point at the new pointer slot in <code class="language-plaintext highlighter-rouge">.data.fixup</code> and update the type to <code class="language-plaintext highlighter-rouge">R_X86_64_PC32</code>. The addend becomes a small offset (the distance from the instruction to the fixup slot), which is guaranteed to fit.</p>
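<p>In terms of raw relocation records, steps 2 and 3 boil down to writing two <code class="language-plaintext highlighter-rouge">Elf64_Rela</code> entries. A sketch (the symbol indices and offsets are invented; the type constants and the 24-byte <code class="language-plaintext highlighter-rouge">Elf64_Rela</code> layout come from the ELF specification):</p>

```python
import struct

# Elf64_Rela: r_offset (u64), r_info (u64), r_addend (s64), little-endian.
# r_info packs (symbol index << 32) | relocation type.
R_X86_64_64, R_X86_64_PC32 = 1, 2
ELF64_RELA = struct.Struct('<QQq')

def rela(offset, sym_index, rtype, addend):
    return ELF64_RELA.pack(offset, (sym_index << 32) | rtype, addend)

# Step 2: the new slot in .data.fixup keeps the original symbol and addend,
# but as a 64-bit absolute relocation (S + A, no range limit).
slot = rela(0x0, sym_index=7, rtype=R_X86_64_64, addend=-0x44000000)

# Step 3: the retargeted .rela.text entry points at the fixup slot's symbol
# with a small PC-relative addend (-4 accounts for the disp32 position).
patched = rela(0x1234, sym_index=9, rtype=R_X86_64_PC32, addend=-4)

assert len(slot) == len(patched) == 24
```
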

<blockquote class="alert alert-note">
  <p><strong>Note</strong>
Because both <code class="language-plaintext highlighter-rouge">LEA</code> and <code class="language-plaintext highlighter-rouge">MOV</code> with a <code class="language-plaintext highlighter-rouge">[rip + disp32]</code> operand are exactly the same length (7 bytes with a REX prefix), we don’t shift any code, don’t invalidate any other relocations, and don’t need to rewrite any other parts of the object file. It’s truly a surgical patch.</p>
</blockquote>

<p>The pessimized <code class="language-plaintext highlighter-rouge">MOV</code> now performs a <strong>memory load</strong> where the original <code class="language-plaintext highlighter-rouge">LEA</code> did pure register arithmetic. That’s an extra cache line fetch and a data dependency. If this instruction is in a tight loop, it could be a performance hit.</p>

<p>Optimization is the root of all evil, what does that make pessimization? 🧌</p>]]></content><author><name></name></author><summary type="html"><![CDATA[In a previous post, I wrote about linker relaxation: the linker’s ability to replace a slower, larger instruction with a faster, smaller one when it has enough information at link time. For instance, an indirect call through the GOT can be relaxed into a direct call plus a nop. This is a well-known technique to optimize the instructions for performance.]]></summary></entry><entry><title type="html">Creating massively huge fake files and binaries</title><link href="https://fzakaria.com/2026/02/11/creating-massively-huge-fake-files-and-binaries" rel="alternate" type="text/html" title="Creating massively huge fake files and binaries" /><published>2026-02-11T16:34:00-08:00</published><updated>2026-02-11T16:34:00-08:00</updated><id>https://fzakaria.com/2026/02/11/creating-massively-huge-fake-files-and-binaries</id><content type="html" xml:base="https://fzakaria.com/2026/02/11/creating-massively-huge-fake-files-and-binaries"><![CDATA[<p>I was writing a test case for <code class="language-plaintext highlighter-rouge">lld</code> to support “thunks” [<a href="https://github.com/llvm/llvm-project/pull/180266">llvm#180266</a>] which uses a linker script to place two sections very far apart (8GiB) in the virtual address space.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SECTIONS {
    .text_low 0x10000: { *(.text_low) }
    .text_high 0x200000000: { *(.text_high) }
}
</code></pre></div></div>

<p>After linking a trivially small assembly file, I ran <code class="language-plaintext highlighter-rouge">ls -l</code> on the resulting binary and was confused:</p>

<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span><span class="nb">ls</span> <span class="nt">-lh</span> output
<span class="go">-rwxr-xr-x 1 fzakaria fzakaria 8.0G Feb 11 16:00 output
</span></code></pre></div></div>

<p><strong>8 GiB</strong>. For what amounts to a handful of instructions. 😲</p>

<p>What’s going on? And where did all that space come from?</p>

<h3 id="apparent-size-vs-on-disk-size">Apparent size vs. on-disk size</h3>

<p>Turns out <code class="language-plaintext highlighter-rouge">ls -l</code> reports the <em>logical</em> (apparent) size of the file, which is simply an integer stored in the inode metadata. It represents the offset of the last byte written. Since <code class="language-plaintext highlighter-rouge">.text_high</code> lives at <code class="language-plaintext highlighter-rouge">0x200000000</code> (~8 GiB), the file’s logical size extends out that far even though the actual code is tiny.</p>

<p>The <em>real</em> story is told by <code class="language-plaintext highlighter-rouge">du</code>:</p>

<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span><span class="nb">du</span> <span class="nt">-h</span> output
<span class="go">12K     output
</span></code></pre></div></div>

<p>12 KiB on disk. The file is <strong>sparse</strong>. 🤓</p>

<h3 id="what-is-a-sparse-file">What is a sparse file?</h3>

<p>A sparse file is one where the filesystem doesn’t bother allocating blocks for regions that are all zeros. The filesystem (ext4, btrfs, etc.) stores a mapping of logical file offsets to physical disk blocks in the inode’s <em>extent tree</em>. For a sparse file, there are simply no extents for the hole regions.</p>

<p>For our 8 GiB binary, the extent tree looks something like:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Inode extent tree:
  [offset 0,       12 blocks]  → disk blocks 48392-48403   (.text_low code)
  [offset 0x1FFFF1, 4 blocks]  → disk blocks 48404-48407   (.text_high code)

  (nothing for the ~8 GiB in between — no extents exist)
</code></pre></div></div>

<p>We can use <code class="language-plaintext highlighter-rouge">filefrag</code> to see the same information, albeit a little more condensed.</p>

<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>filefrag <span class="nt">-v</span> output
<span class="go">Filesystem type is: 9123683e
File size of output is 8589873896 (2097138 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..       1:  461921719.. 461921720:      2:             encoded
   1:  2097137.. 2097137:  461921740.. 461921740:      1:  464018856: last,eof
output: 2 extents found
</span></code></pre></div></div>

<p>When something reads the file:</p>
<ol>
  <li>The virtual filesystem (VFS) receives <code class="language-plaintext highlighter-rouge">read(fd, buf, size)</code> at some offset</li>
  <li>The filesystem looks up the extent tree for that offset</li>
  <li>If <strong>extent found</strong> then read from the physical disk block</li>
  <li>If <strong>no extent (hole)</strong> then the kernel fills the buffer with zeros, no disk I/O</li>
</ol>

<h3 id="creating-sparse-files-yourself">Creating sparse files yourself</h3>

<p>You don’t need a linker to create sparse files. <code class="language-plaintext highlighter-rouge">truncate</code> will do it:</p>

<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span><span class="nb">truncate</span> <span class="nt">-s</span> 1P bigfile
<span class="gp">$</span><span class="w"> </span><span class="nb">ls</span> <span class="nt">-lh</span> bigfile
<span class="go">-rw-r--r-- 1 fzakaria fzakaria 1.0P Feb 11 16:00 bigfile

</span><span class="gp">$</span><span class="w"> </span><span class="nb">du</span> <span class="nt">-h</span> bigfile
<span class="go">0       bigfile
</span></code></pre></div></div>

<p>A 1 PiB file that takes zero bytes on disk. <code class="language-plaintext highlighter-rouge">dd</code> with <code class="language-plaintext highlighter-rouge">seek</code> works too:</p>

<div class="language-console highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gp">$</span><span class="w"> </span><span class="nb">dd </span><span class="k">if</span><span class="o">=</span>/dev/null <span class="nv">of</span><span class="o">=</span>bigfile <span class="nv">bs</span><span class="o">=</span>1 <span class="nv">seek</span><span class="o">=</span>1P
</code></pre></div></div>

<p>Both produce the same result: a file whose logical size is 1 PiB but whose on-disk footprint is effectively nothing.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[I was writing a test case for lld to support “thunks” [llvm#180266] which uses a linker script to place two sections very far apart (8GiB) in the virtual address space.]]></summary></entry><entry><title type="html">Crazy shit linkers do: Common Data (COMDAT) sections</title><link href="https://fzakaria.com/2026/02/03/crazy-shit-linkers-do-common-data-comdat-sections" rel="alternate" type="text/html" title="Crazy shit linkers do: Common Data (COMDAT) sections" /><published>2026-02-03T09:49:00-08:00</published><updated>2026-02-03T09:49:00-08:00</updated><id>https://fzakaria.com/2026/02/03/crazy-shit-linkers-do-common-data-comdat-sections</id><content type="html" xml:base="https://fzakaria.com/2026/02/03/crazy-shit-linkers-do-common-data-comdat-sections"><![CDATA[<p>Managing code at scale is hard and comes with a lot of weird quirks in your toolchain. I wrote <a href="/2026/01/30/crazy-shit-linkers-do-relaxation">previously</a> about some of the <em>crazy shit</em> linkers can do and that is really the tip of the iceberg.</p>

<p>Let’s take a peek at <code class="language-plaintext highlighter-rouge">COMDAT</code> (Common Data) sections and some of the weird hiccups you can run into.</p>

<p>What even is <code class="language-plaintext highlighter-rouge">COMDAT</code>?</p>

<p>Well, to understand what a <code class="language-plaintext highlighter-rouge">COMDAT</code> section is, let’s create a simple example to demonstrate.</p>

<p>Consider this example where we will create a <code class="language-plaintext highlighter-rouge">Cache&lt;T&gt;</code> helper class and leverage it across two different translation units: <code class="language-plaintext highlighter-rouge">library.o</code> and <code class="language-plaintext highlighter-rouge">main.o</code></p>

<blockquote class="alert alert-note">
  <p><strong>Note</strong>
This example was inspired by <a href="https://github.com/grigorypas">@grigorypas</a> from the discussion on the <a href="https://discourse.llvm.org/t/rfc-lld-preferring-small-code-model-comdat-sections-over-large-ones-when-mixing-code-models/89550">LLVM discourse</a>.</p>
</blockquote>

<p>We can compile each individually such as <code class="language-plaintext highlighter-rouge">gcc -std=c++17 -g -O0 -c library.cpp -o library.o</code>. The <code class="language-plaintext highlighter-rouge">-O0</code> is important here otherwise this simple code will be inlined, and <code class="language-plaintext highlighter-rouge">-std=c++17</code> allows us to use inline static variables.</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// cache.h</span>
<span class="cp">#pragma once
</span>
<span class="k">template</span><span class="o">&lt;</span><span class="k">typename</span> <span class="nc">T</span><span class="p">&gt;</span>
<span class="k">struct</span> <span class="nc">Cache</span> <span class="p">{</span>
    <span class="kr">inline</span> <span class="k">static</span> <span class="n">T</span> <span class="n">data</span><span class="p">;</span>
    <span class="k">static</span> <span class="kt">void</span> <span class="n">set</span><span class="p">(</span><span class="n">T</span> <span class="n">val</span><span class="p">)</span> <span class="p">{</span> <span class="n">data</span> <span class="o">=</span> <span class="n">val</span><span class="p">;</span> <span class="p">}</span>
<span class="p">};</span>

<span class="c1">// library.cpp</span>
<span class="cp">#include</span> <span class="cpf">"cache.h"</span><span class="cp">
</span>
<span class="kt">void</span> <span class="nf">foo</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">Cache</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;::</span><span class="n">set</span><span class="p">(</span><span class="mi">42</span><span class="p">);</span>
<span class="p">}</span>

<span class="c1">// main.cpp</span>
<span class="cp">#include</span> <span class="cpf">"cache.h"</span><span class="cp">
</span>
<span class="kt">void</span> <span class="nf">bar</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">Cache</span><span class="o">&lt;</span><span class="kt">int</span><span class="o">&gt;::</span><span class="n">set</span><span class="p">(</span><span class="mi">31</span><span class="p">);</span>
<span class="p">}</span>

<span class="k">extern</span> <span class="kt">void</span> <span class="nf">foo</span><span class="p">();</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">foo</span><span class="p">();</span>
    <span class="n">bar</span><span class="p">();</span>
    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Because <code class="language-plaintext highlighter-rouge">Cache&lt;T&gt;</code> is a template, the compiler must generate the machine code for <code class="language-plaintext highlighter-rouge">Cache&lt;int&gt;::set</code> in every object file (<code class="language-plaintext highlighter-rouge">.o</code>) that uses it. If you compile <code class="language-plaintext highlighter-rouge">main.cpp</code> and <code class="language-plaintext highlighter-rouge">library.cpp</code> and they both use <code class="language-plaintext highlighter-rouge">Cache&lt;int&gt;</code>, both object files will contain this code.</p>

<p>We can double-check this with <code class="language-plaintext highlighter-rouge">objdump</code>, and sure enough, both <code class="language-plaintext highlighter-rouge">main.o</code> and <code class="language-plaintext highlighter-rouge">library.o</code> contain a duplicate section (i.e. the same instructions) for <code class="language-plaintext highlighter-rouge">_ZN5CacheIiE3setEi</code>, the mangled name of <code class="language-plaintext highlighter-rouge">Cache&lt;int&gt;::set</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt; objdump -d -j .text._ZN5CacheIiE3setEi main.o

Disassembly of section .text._ZN5CacheIiE3setEi:

0000000000000000 &lt;_ZN5CacheIiE3setEi&gt;:
   0:	55                   	push   %rbp
   1:	48 89 e5             	mov    %rsp,%rbp
   4:	89 7d fc             	mov    %edi,-0x4(%rbp)
   7:	8b 45 fc             	mov    -0x4(%rbp),%eax
   a:	89 05 00 00 00 00    	mov    %eax,0x0(%rip)
  10:	90                   	nop
  11:	5d                   	pop    %rbp
  12:	c3                   	ret


&gt; objdump -d -j .text._ZN5CacheIiE3setEi library.o

Disassembly of section .text._ZN5CacheIiE3setEi:

0000000000000000 &lt;_ZN5CacheIiE3setEi&gt;:
   0:	55                   	push   %rbp
   1:	48 89 e5             	mov    %rsp,%rbp
   4:	89 7d fc             	mov    %edi,-0x4(%rbp)
   7:	8b 45 fc             	mov    -0x4(%rbp),%eax
   a:	89 05 00 00 00 00    	mov    %eax,0x0(%rip)
  10:	90                   	nop
  11:	5d                   	pop    %rbp
  12:	c3                   	ret
</code></pre></div></div>

<p>Wow! Given the prevalence of templates in C++, this already seems incredibly wasteful, since every <code class="language-plaintext highlighter-rouge">.o</code> has to include the instructions for the same templates. 😲</p>

<p>At link time, the linker has to resolve the function to <strong>use only one</strong> of these implementations.</p>

<p>What do we do with all the other duplicate implementations?</p>

<p>That’s where <code class="language-plaintext highlighter-rouge">COMDAT</code> comes in! 🤓</p>

<p>To prevent your final binary from being 10x larger than necessary, the compiler marks these duplicate sections as <code class="language-plaintext highlighter-rouge">COMDAT</code> (Common Data). The linker’s job is simple: pick one, discard the rest.</p>

<p>We can inspect these groupings using <code class="language-plaintext highlighter-rouge">readelf -g</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt; readelf -g main.o -W

COMDAT group section [    1] `.group' [_ZN5CacheIiE3setEi] contains 2 sections:
   [Index]    Name
   [    6]   .text._ZN5CacheIiE3setEi
   [    7]   .rela.text._ZN5CacheIiE3setEi
</code></pre></div></div>

<p>Here is the pickle. How does the linker pick which section to use?</p>

<p>Traditionally (not specified by any ABI), the linker selects the first <code class="language-plaintext highlighter-rouge">.o</code> provided to it on the command-line.</p>

<p>Is this problematic?</p>

<p>Well, what if the two object files were built with different code-models (i.e. <code class="language-plaintext highlighter-rouge">mcmodel</code>)? Let’s build <code class="language-plaintext highlighter-rouge">main.cpp</code> with the large code-model: <code class="language-plaintext highlighter-rouge">mcmodel=large</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt; gcc -g -O0 -mcmodel=large -c main.cpp -o main.o

&gt; objdump -d -j .text._ZN5CacheIiE3setEi main.o

Disassembly of section .text._ZN5CacheIiE3setEi:

0000000000000000 &lt;_ZN5CacheIiE3setEi&gt;:
   0:	55                   	push   %rbp
   1:	48 89 e5             	mov    %rsp,%rbp
   4:	89 7d fc             	mov    %edi,-0x4(%rbp)
   7:	48 ba 00 00 00 00 00 	movabs $0x0,%rdx
   e:	00 00 00 
  11:	8b 45 fc             	mov    -0x4(%rbp),%eax
  14:	89 02                	mov    %eax,(%rdx)
  16:	90                   	nop
  17:	5d                   	pop    %rbp
  18:	c3                   	ret

&gt; objdump -d -j .text._ZN5CacheIiE3setEi library.o

Disassembly of section .text._ZN5CacheIiE3setEi:

0000000000000000 &lt;_ZN5CacheIiE3setEi&gt;:
   0:	55                   	push   %rbp
   1:	48 89 e5             	mov    %rsp,%rbp
   4:	89 7d fc             	mov    %edi,-0x4(%rbp)
   7:	8b 45 fc             	mov    -0x4(%rbp),%eax
   a:	89 05 00 00 00 00    	mov    %eax,0x0(%rip)
  10:	90                   	nop
  11:	5d                   	pop    %rbp
  12:	c3                   	ret
</code></pre></div></div>

<p>Although the section names are the same, the instructions generated are now different. The large code-model uses <code class="language-plaintext highlighter-rouge">movabs</code> which has worse performance characteristics.</p>

<p>Let’s verify what the linker (here <code class="language-plaintext highlighter-rouge">lld</code>) does by linking them in both orders.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Link library.o first
&gt; gcc library.o main.o -o a.out
&gt; objdump -d a.out
0000000000401117 &lt;_ZN5CacheIiE3setEi&gt;:
  401117:	55                   	push   %rbp
  401118:	48 89 e5             	mov    %rsp,%rbp
  40111b:	89 7d fc             	mov    %edi,-0x4(%rbp)
  40111e:	8b 45 fc             	mov    -0x4(%rbp),%eax
  401121:	89 05 ed 2e 00 00    	mov    %eax,0x2eed(%rip)
  401127:	90                   	nop
  401128:	5d                   	pop    %rbp
  401129:	c3                   	ret

# Link main.o first
&gt; gcc main.o library.o -o a.out
&gt; objdump -d a.out
0000000000401141 &lt;_ZN5CacheIiE3setEi&gt;:
  401141:	55                   	push   %rbp
  401142:	48 89 e5             	mov    %rsp,%rbp
  401145:	89 7d fc             	mov    %edi,-0x4(%rbp)
  401148:	48 ba 14 40 40 00 00 	movabs $0x404014,%rdx
  40114f:	00 00 00 
  401152:	8b 45 fc             	mov    -0x4(%rbp),%eax
  401155:	89 02                	mov    %eax,(%rdx)
  401157:	90                   	nop
  401158:	5d                   	pop    %rbp
  401159:	c3                   	ret
</code></pre></div></div>

<p>We see that the section selected does depend on the <code class="language-plaintext highlighter-rouge">.o</code> order provided. 😬</p>

<p>Why does all this matter?</p>

<p>We are moving some code to the medium code-model to overcome relocation overflows; however, we have some prebuilt code built in the small code-model. We noticed that although our goal was to leverage the medium code-model, the linker might choose the small code-model variant of a section if it happened to be found first.</p>

<p>If the linker blindly picks the “small model” version (which uses 32-bit relative offsets) but places the data more than 2GiB away, we might still end up with the relocation overflow errors we sought to resolve.</p>

<p>But wait, it gets worse.</p>

<p>The rule that lets us instantiate multiple incarnations of a particular symbol but select only one is the <strong>One Definition Rule</strong> (ODR). The ODR requires that the definition of a symbol be identical across all translation units. But the linker generally doesn’t check this (unless you use LTO, and even then, it’s fuzzy). It just checks the symbol name.</p>

<p>Imagine if <code class="language-plaintext highlighter-rouge">library.cpp</code> was compiled with <code class="language-plaintext highlighter-rouge">-DLOGGING_ENABLED</code> which injected <code class="language-plaintext highlighter-rouge">printf</code> calls into <code class="language-plaintext highlighter-rouge">Cache::set</code>, while <code class="language-plaintext highlighter-rouge">main.cpp</code> was compiled in release mode without it.</p>

<p>If the linker picks the <code class="language-plaintext highlighter-rouge">main.o</code> (release) version of the <code class="language-plaintext highlighter-rouge">COMDAT</code> group, your “Debug” library implementation loses its logging features effectively muting your debug logic. Conversely, if it picks the <code class="language-plaintext highlighter-rouge">library.o</code> version, your high-performance release binary suddenly has debug logging in critical hot paths.</p>

<p>You aren’t just gambling with instruction selection that may affect performance, as in the case of code-models; you are gambling with program logic. Given that the section name is purely based on the name of the symbol, it’s easy to see how you can get yourself into oddities if you accidentally link implementations that wildly differ.</p>

<p>I can now see why many languages force symbols to only ever be defined in a single translation unit, as it avoids this whole conundrum. 🙃</p>]]></content><author><name></name></author><summary type="html"><![CDATA[Managing code at scale is hard and comes with a lot of weird quirks in your toolchain. I wrote previously about some of the crazy shit linkers can do and that is really the tip of the iceberg.]]></summary></entry><entry><title type="html">Crazy shit linkers do: Relaxation</title><link href="https://fzakaria.com/2026/01/30/crazy-shit-linkers-do-relaxation" rel="alternate" type="text/html" title="Crazy shit linkers do: Relaxation" /><published>2026-01-30T20:55:00-08:00</published><updated>2026-01-30T20:55:00-08:00</updated><id>https://fzakaria.com/2026/01/30/crazy-shit-linkers-do-relaxation</id><content type="html" xml:base="https://fzakaria.com/2026/01/30/crazy-shit-linkers-do-relaxation"><![CDATA[<p>I have been looking into linkers recently and I’ve been amazed at all the crazy options and optimizations that a linker may perform. Compilers are a well-understood domain, taught in schools with a plethora of books, but few resources exist for linkers aside from what you may find on some excellent technical blogs such as Ian Lance Taylor’s series on <a href="https://www.airs.com/blog/archives/38">writing the gold linker</a> and Fangrui Song’s, also known as MaskRay, <a href="https://maskray.me/">very in-depth blog</a>.</p>

<p>I wanted to write down in my own style, concepts I’m learning from <em>first principles</em>.</p>

<p>Recently, I came across the term “relaxation” as I was fiddling around with LLVM’s <code class="language-plaintext highlighter-rouge">lld</code>.</p>

<p>What is it? 🤔</p>

<blockquote class="alert alert-note">
  <p><strong>Note</strong>
Relaxation looks to be <em>relatively new</em>, and the original RFC to the <a href="https://groups.google.com/g/x86-64-abi/c/n9AWHogmVY0">x86-64-abi google group</a> was proposed in 2015.</p>
</blockquote>

<p>Well, let’s look at a super simple example to understand what it is and why we want it.</p>

<p>If you want to follow along take a look at this <a href="https://godbolt.org/z/oePn7c86n">godbolt</a> example.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Declare it, but don't define it.</span>
<span class="c1">// The compiler assumes it might be in a shared library.</span>
<span class="k">extern</span> <span class="kt">void</span> <span class="nf">external_function</span><span class="p">();</span>

<span class="kt">void</span> <span class="nf">example</span><span class="p">()</span> <span class="p">{</span>
<span class="n">external_function</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

<p>If we compile this with <code class="language-plaintext highlighter-rouge">-O0 -fno-plt -fpic -mcmodel=medium -Wa,-mrelax-relocations=no</code> we see the following disassembly in the object file using <code class="language-plaintext highlighter-rouge">objdump</code>.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">example</span><span class="p">()</span><span class="o">:</span>
 <span class="n">push</span>   <span class="n">rbp</span>
 <span class="n">mov</span>    <span class="n">rbp</span><span class="p">,</span><span class="n">rsp</span>
 <span class="n">call</span>   <span class="n">QWORD</span> <span class="n">PTR</span> <span class="p">[</span><span class="n">rip</span><span class="o">+</span><span class="mh">0x0</span><span class="p">]</span>        <span class="err">#</span> <span class="n">a</span> <span class="o">&lt;</span><span class="n">example</span><span class="p">()</span><span class="o">+</span><span class="mh">0xa</span><span class="o">&gt;</span>
    <span class="n">R_X86_64_GOTPCREL</span> <span class="n">external_function</span><span class="p">()</span><span class="o">-</span><span class="mh">0x4</span>
 <span class="n">pop</span>    <span class="n">rbp</span>
 <span class="n">ret</span>
</code></pre></div></div>

<p>Specifically, the compiler has left a “note” for the linker in the form of a <em>relocation</em>: <code class="language-plaintext highlighter-rouge">R_X86_64_GOTPCREL</code>.</p>

<p>You can see that the address in the emitted code is <code class="language-plaintext highlighter-rouge">0x0</code> after compilation. The linker needs to replace that value with the offset to the function’s GOT entry, relative to the <code class="language-plaintext highlighter-rouge">rip</code> register (instruction pointer).</p>

<p>This works great and is necessary for shared libraries, but what if we are building a final static binary? 🤓</p>

<p>Turns out that in some cases this instruction can be further simplified by the linker, since when producing the final executable binary it has <em>all</em> the information.</p>

<p>We will have to look at the actual instruction encoding to understand this further.</p>

<p>If we look at the hex encoding of that assembly, we see the following:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ff 15 00 00 00 00 call *0x0(%rip)
</code></pre></div></div>

<p>This indirect <code class="language-plaintext highlighter-rouge">call</code> (opcode <code class="language-plaintext highlighter-rouge">ff 15</code>) via the GOT address is <strong>6 bytes long</strong>: 2 bytes for the opcode &amp; 4 bytes for the offset to the GOT entry.</p>

<blockquote class="alert alert-note">
  <p><strong>Note</strong>
Understanding x86-64 is its own whole can of worms. The ISA is incredibly dense and complex, but if you want you can reference <a href="https://www.felixcloutier.com/x86/call">it here</a>.</p>
</blockquote>

<p>x86-64, though, has another <code class="language-plaintext highlighter-rouge">call</code> type (opcode <code class="language-plaintext highlighter-rouge">e8</code>) that operates in a direct mode: it calls an address relative to the instruction pointer.</p>

<p>This direct-mode <code class="language-plaintext highlighter-rouge">call</code> type is only <strong>5 bytes</strong> long with 1 byte for the opcode and 4 bytes for the offset to the function.</p>

<p>If we knew the location of the function ahead of time, it would be nice if we could skip checking the GOT completely and just go to where we want to be.</p>

<p>Why would we want to do this?</p>

<p>Well, it’s more efficient to jump straight to the address we want to end up at. The CPU doesn’t have to load the address stored in the GOT before jumping to it.</p>

<p>When building a static binary the linker should know all the final relative addresses of all the functions, so going through the GOT is no longer necessary.</p>

<p>Since the number of bytes is nearly equal, the linker can effectively patch the binary without disrupting other relative calculations, provided it can fill the small gap.</p>

<p>We only need to find a <em>single byte</em> to pad our more-efficient <code class="language-plaintext highlighter-rouge">call</code>! 🕵️</p>

<p>Turns out, the <code class="language-plaintext highlighter-rouge">nop</code> operation is only <em>a single byte</em>. 👌</p>

<p>We then get the equality:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>call *foo@GOTPCREL(%rip) =&gt; [nop call foo] or [call foo nop]
</code></pre></div></div>

<p>This is what the <code class="language-plaintext highlighter-rouge">R_X86_64_GOTPCRELX</code> relocation indicates. It tells the linker it is safe to “relax” and modify the instructions to the more performant variation.</p>

<p>When we enable relaxation, we now generate the same code as above but with this new relocation type instructing the linker to perform the optimization if possible.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">call</span>   <span class="n">QWORD</span> <span class="n">PTR</span> <span class="p">[</span><span class="n">rip</span><span class="o">+</span><span class="mh">0x0</span><span class="p">]</span>        <span class="err">#</span> <span class="n">a</span> <span class="o">&lt;</span><span class="n">example</span><span class="p">()</span><span class="o">+</span><span class="mh">0xa</span><span class="o">&gt;</span>
    <span class="n">R_X86_64_GOTPCRELX</span> <span class="n">external_function</span><span class="p">()</span><span class="o">-</span><span class="mh">0x4</span>
</code></pre></div></div>
<blockquote class="alert alert-note">
  <p><strong>Note</strong>
Why not just always optimize <code class="language-plaintext highlighter-rouge">R_X86_64_GOTPCREL</code> when possible and forgo introducing a new relocation? My own guess is that it’s important to be backwards compatible and you wouldn’t want the emitted code to vary depending on the linker version but I would be interested to hear something more concrete if you know!</p>
</blockquote>

<p>Interestingly, many linkers optimize this even further!</p>

<p>Rather than generating a <code class="language-plaintext highlighter-rouge">nop</code> instruction, the linker instead prefixes the <code class="language-plaintext highlighter-rouge">call</code> with <code class="language-plaintext highlighter-rouge">0x67</code> (<code class="language-plaintext highlighter-rouge">addr32</code>).</p>

<p>On x86-64, <code class="language-plaintext highlighter-rouge">0x67</code> (<code class="language-plaintext highlighter-rouge">addr32</code>) overrides the address size to 32 bits. For a relative <code class="language-plaintext highlighter-rouge">call</code> instruction, however, the override has no effect, making it a benign prefix that consumes exactly 1 byte.</p>

<p>If we go back to our example and enable relaxation, and produce a final binary, we can disassemble it to see whether it was relaxed.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;</span> objdump <span class="nt">-SD</span> main

0000000000401133 &lt;example&gt;:
  401133:	55                   	push   %rbp
  401134:	48 89 e5             	mov    %rsp,%rbp
  401137:	48 8d 05 9a 2e 00 00 	lea    0x2e9a<span class="o">(</span>%rip<span class="o">)</span>,%rax        <span class="c"># 403fd8 &lt;_GLOBAL_OFFSET_TABLE_&gt;</span>
  40113e:	b8 00 00 00 00       	mov    <span class="nv">$0x0</span>,%eax
  401143:	67 e8 bd ff ff ff    	addr32 call 401106 &lt;external_function&gt;
  401149:	90                   	nop
  40114a:	5d                   	pop    %rbp
  40114b:	31 c0                	xor    %eax,%eax
  40114d:	c3                   	ret
</code></pre></div></div>

<p>Here we can see that in fact our <code class="language-plaintext highlighter-rouge">call</code> was relaxed since we can see <code class="language-plaintext highlighter-rouge">addr32 call 401106</code> 🥳.</p>

<p>As it happens, you can do this same “relaxation” optimization for a few other instructions such as <code class="language-plaintext highlighter-rouge">test</code>, <code class="language-plaintext highlighter-rouge">jmp</code> and <code class="language-plaintext highlighter-rouge">mov</code> but the basic premise is the same.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[I have been looking into linkers recently and I’ve been amazed at all the crazy options and optimizations that a linker may perform. Compilers are a well understood domain, taught in schools with a plethora of books but few resources exist for linkers aside from what you may find on some excellent technical blogs such as Lance Taylor’s series on writing the gold linker and Fangrui Song’s, also known as MaskRay, very in-depth blog.]]></summary></entry><entry><title type="html">Bespoke software is the future</title><link href="https://fzakaria.com/2026/01/01/bespoke-software-is-the-future" rel="alternate" type="text/html" title="Bespoke software is the future" /><published>2026-01-01T12:00:00-08:00</published><updated>2026-01-01T12:00:00-08:00</updated><id>https://fzakaria.com/2026/01/01/bespoke-software-is-the-future</id><content type="html" xml:base="https://fzakaria.com/2026/01/01/bespoke-software-is-the-future"><![CDATA[<p>At Google, some of the engineers would joke, <em>self-deprecatingly</em>,  that the software internally was not particularly exceptional but rather Google’s dominance was an example of the power of network effects: when software is custom tailored to work well with each other.</p>

<p>Outside of Google, or similar FAANG companies, this is often cited as indulgent “NIH” (Not Invented Here) syndrome, where the prevailing practice is instead to pick generalized software solutions, preferably open-source, off-the-shelf.</p>

<p>The problem with these generalized solutions is that, well, they are generalized and rarely fit well together. 🙄  Engineers are trained to be DRY (Don’t Repeat Yourself), and love abstractions. As a tool tries to solve more problems, the abstraction becomes leakier and ill-fitting. It becomes a general-purpose tax.</p>

<p>If you only need 10% of a software solution, you pay for the remaining 90% via the abstractions they impose. 🫠</p>

<p>Internally to a company, however, we are taught that unused code is a liability. We often celebrate negative pull-requests as valuable clean-up work with the understanding that smaller code-bases are simpler to understand, operate and optimize.</p>

<p>Yet for most of our infrastructure tooling, we continue to bloat solutions and tout support despite minuscule user bases.</p>

<p>This is probably one of the areas I am most excited about: the ability to leverage LLMs for software creation.</p>

<p>I recently spent time investigating linkers in <a href="/2025/12/28/huge-binaries">previous</a> <a href="/2025/12/29/huge-binaries-i-thunk-therefore-i-am">posts</a> such as LLVM’s <a href="http://lld.llvm.org/">lld</a>.</p>

<p>I found LLVM to be a pretty polished codebase with lots of documentation. Despite the high quality, navigating the codebase is challenging, as it’s a mass of interfaces and abstractions needed to support multiple object file formats, 13+ ISAs, a slew of features (e.g. linker scripts), and multiple operating systems.</p>

<p>Instead, I leveraged LLMs to help me design and write <a href="https://github.com/fzakaria/uld">µld</a>, a tiny opinionated linker in Rust that targets only ELF, x86_64, static linking, and a barebones feature set.</p>

<p>It shouldn’t be a surprise to anyone that the end result is a codebase that I can audit, learn from, and easily grow to support additional improvements and optimizations.</p>

<p>The surprising bit, especially to me, was how easy it was to author within a very short period of time (1-2 days).</p>

<p>That means smaller companies, without the coffers of FAANG companies, can also pursue bespoke, custom-tailored software for their needs.</p>

<p>This future is well-suited for tooling such as <a href="https://nixos.org">Nix</a>. Nix is the perfect vehicle for building custom tooling, as it gives you a playground designed to build the world, similar to a monorepo.</p>

<p>We need to begin to cut away legacy in our tooling and build software that solves specific problems. The end-result will be smaller, easier to manage and better integrated. Where this might have seemed unattainable for most, LLMs will democratize this possibility.</p>

<p>I’m excited for the bespoke future.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[At Google, some of the engineers would joke, self-deprecatingly, that the software internally was not particularly exceptional but rather Google’s dominance was an example of the power of network effects: when software is custom tailored to work well with each other.]]></summary></entry><entry><title type="html">Huge binaries: papercuts and limits</title><link href="https://fzakaria.com/2025/12/30/huge-binaries-papercuts-and-limits" rel="alternate" type="text/html" title="Huge binaries: papercuts and limits" /><published>2025-12-30T08:34:00-08:00</published><updated>2025-12-30T08:34:00-08:00</updated><id>https://fzakaria.com/2025/12/30/huge-binaries-papercuts-and-limits</id><content type="html" xml:base="https://fzakaria.com/2025/12/30/huge-binaries-papercuts-and-limits"><![CDATA[<p>In a <a href="/2025/12/28/huge-binaries">previous post</a>, I synthetically built a program that demonstrated a relocation overflow for a <code class="language-plaintext highlighter-rouge">CALL</code> instruction.</p>

<p>However, the demo required adding <code class="language-plaintext highlighter-rouge">-fno-asynchronous-unwind-tables</code> to disable some additional data that might cause <strong>other overflows</strong> beyond the one being demonstrated.</p>

<p>What’s going on? 🤔</p>

<p>This is a good example of how only a select few face the size-pressure of massive binaries.</p>

<p>Even with <code class="language-plaintext highlighter-rouge">-mcmodel=medium</code>, which already tells the compiler &amp; linker “Hey, I expect my binary to be pretty big”, there are surprising gaps where the linker overflows.</p>

<p>On Linux, an ELF binary includes many other sections beyond text and data necessary for code execution. Notably there are sections included for debugging (DWARF) and language-specific sections such as <code class="language-plaintext highlighter-rouge">.eh_frame</code> which is used by C++ to help unwind the stack on exceptions.</p>

<p>Turns out that even with <code class="language-plaintext highlighter-rouge">mcmodel=large</code> you might still run into overflow errors! 🤦🏻‍♂️</p>

<blockquote class="alert alert-note">
  <p><strong>Note</strong>
Funny enough, there is a very recent opened issue for this with <a href="https://github.com/llvm/llvm-project/issues/172777">LLVM #172777</a>; perfect timing!</p>
</blockquote>

<p>For instance, <code class="language-plaintext highlighter-rouge">lld</code> assumes 32-bit <code class="language-plaintext highlighter-rouge">eh_frame_hdr</code> values regardless of the code model. There are similar 32-bit assumptions in the data structures of <code class="language-plaintext highlighter-rouge">eh_frame</code> as well.</p>

<p>I also mentioned earlier a pattern of using multiple GOTs (Global Offset Tables) to avoid the ±2GiB relative offset limitation of a signed 32-bit displacement.</p>

<p>Is there even a need for the large code-model?</p>

<p>How far can that take us before we are forced to use the large code-model?</p>

<p>Let’s think about it:</p>

<p>First, let’s think about the limit due to overflow when accessing the multiple GOTs. Let’s say we decide to space out our duplicate GOTs every 1.5GiB.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>|&lt;---- 1.5GiB code -----&gt;|&lt;----- GOT -----&gt;|&lt;----- 1.5GiB code -----&gt;|&lt;----- GOT -----&gt;|
</code></pre></div></div>

<p>That means each GOT can grow at most ~500MiB before there could exist a GOT-relative access from the code section that would result in an overflow.</p>

<p>Each GOT entry is 8 bytes, a 64-bit pointer. That means we have roughly 65 million possible entries.</p>

<p>A typical GOT access sequence looks like the following and requires 9 bytes: 7 bytes for the <code class="language-plaintext highlighter-rouge">movq</code> and 2 bytes for the <code class="language-plaintext highlighter-rouge">movl</code>.</p>

<pre><code class="language-assembly">movq    var@GOTPCREL(%rip), %rax  # R_X86_64_REX_GOTPCRELX
movl    (%rax), %eax
</code></pre>

<p>That means we have 1.5GiB / 9 = ~178 million possible <em>unique</em> relocations.</p>

<p>So theoretically, we can require more <strong>unique</strong> symbols in our code section than we can fit in the nearest GOT, and therefore cause a relocation overflow. 💥</p>

<p>The same problem exists for thunks, since a thunk is larger, in bytes, than the relative call it replaces.</p>

<p>At some point, there is no avoiding the large code-model; however, with multiple GOTs, thunks, and other linker optimizations (e.g. LTO, relaxation), we have a lot of headroom before it’s necessary. 🕺🏻</p>]]></content><author><name></name></author><summary type="html"><![CDATA[In a previous post, I synthetically built a program that demonstrated a relocation overflow for a CALL instruction.]]></summary></entry><entry><title type="html">Huge binaries: I thunk therefore I am</title><link href="https://fzakaria.com/2025/12/29/huge-binaries-i-thunk-therefore-i-am" rel="alternate" type="text/html" title="Huge binaries: I thunk therefore I am" /><published>2025-12-29T08:15:00-08:00</published><updated>2025-12-29T08:15:00-08:00</updated><id>https://fzakaria.com/2025/12/29/huge-binaries-i-thunk-therefore-i-am</id><content type="html" xml:base="https://fzakaria.com/2025/12/29/huge-binaries-i-thunk-therefore-i-am"><![CDATA[<p>In my <a href="/2025/12/28/huge-binaries">previous post</a>, we looked at the “sound barrier” of x86_64 linking: the 32-bit relative <code class="language-plaintext highlighter-rouge">CALL</code> instruction and how it can result in relocation overflows. Changing the code-model to <code class="language-plaintext highlighter-rouge">-mcmodel=large</code> fixes the issue, but at the cost of “instruction bloat” and likely a performance penalty, although I failed to demonstrate the latter via a benchmark 🥲.</p>

<p>Surely there are other interesting solutions? 🤓</p>

<p>First off, probably the simplest solution is to not statically build your code and instead rely on dynamic libraries 🙃. This is what most “normal” software shops (and the rest of the world) do; as a result, this hasn’t been much of an issue elsewhere.</p>

<p>This of course has its own downsides and performance implications, which I’ve written about and produced solutions for (e.g., <a href="/2022/03/14/shrinkwrap-taming-dynamic-shared-objects">Shrinkwrap</a> &amp; <a href="/2024/05/03/speeding-up-elf-relocations-for-store-based-systems">MATR</a>) during my doctoral research. Beyond the performance penalty induced by having thousands of shared libraries, you lose the simplicity of single-file deployments.</p>

<p>A more advanced set of optimizations falls under the umbrella of “LTO”: Link-Time Optimization. The linker at the final stage has all the information necessary to perform a variety of optimizations such as code inlining and tree-shaking. That would seem like a good fit, except these huge binaries would need an enormous amount of RAM to perform LTO, slowing builds to a crawl.</p>

<blockquote class="alert alert-tip">
  <p><strong>Tip</strong>
This is still an active area of research and Google has authored <a href="https://research.google/pubs/thinlto-scalable-and-incremental-lto/">ThinLTO</a>. Facebook has its own set of profile guided LTO optimizations as well via <a href="https://research.facebook.com/publications/bolt-a-practical-binary-optimizer-for-data-centers-and-beyond/">Bolt</a>.</p>
</blockquote>

<p>What if I told you that you could keep most callsites in the fast, 5-byte small code-model, even if your binary is 25GiB? 🧐</p>

<p>Turns out there is prior art for “Linker Thunks” [<a href="https://github.com/llvm/llvm-project/blob/main/lld/ELF/Thunks.cpp">ref</a>] within LLVM for various architectures – notably missing for <code class="language-plaintext highlighter-rouge">x86_64</code>, with the comment:</p>

<blockquote>
  <p>“i386 and x86-64 don’t need thunks” [<a href="https://github.com/llvm/llvm-project/blob/144dc7464fcfde796401acf7784e084d0e66d15c/lld/ELF/Thunks.cpp#L19C4-L19C38">ref</a>]</p>
</blockquote>

<p>What is a “thunk”?</p>

<p>You might know it by a different name; in fact, we use them all the time for <em>dynamic-linking</em>: the trampoline through the procedure linkage table (PLT).</p>

<p>A thunk (or trampoline) is a linker-inserted shim that lives within the immediate reach of the caller. The caller branches to the thunk using a standard relative jump, and the thunk then performs an absolute indirect jump to the final destination.</p>

<!-- 

\documentclass[tikz, border=10pt]{standalone}
\usetikzlibrary{positioning, arrows.meta, calc, shapes.multipart, bending}

\begin{document}
\begin{tikzpicture}[
    font=\sffamily,
    % Styles for the labels on the left column
    addr/.style={font=\ttfamily\small, text=gray, anchor=east},
    symb/.style={font=\ttfamily\bfseries\small, anchor=east, xshift=-0.0cm, yshift=0.4cm},
    % Styles for the instruction boxes
    % Use 'style 2 args' to avoid parameter errors
    memory block/.style 2 args={
        draw=#1,
        fill=#1!5,
        line width=1pt,
        rectangle split,
        rectangle split parts=#2,
        text width=4.5cm,
        align=left,
        inner sep=6pt,
        font=\ttfamily\small,
        anchor=north west
    },
    jump path/.style={
        -{Stealth[bend]},
        line width=1.2pt,
        rounded corners=8pt
    }
]

    % --- Low Memory (Main) ---
    \node[memory block={blue}{2}] (main) {
        ...
        \nodepart{second} bl \_\_far\_thunk
    };
    
    % Labels for main (Left side)
    \node[symb, blue] at (main.one west) {main:};
    \node[addr] at (main.one west) {0x400000};
    \node[addr] at (main.two west) {0x400008};

    % --- Thunk (Directly below main) ---
    \node[memory block={orange}{4}, below=0mm of main] (thunk) {
        ldr x16, [pc, \#8]
        \nodepart{second} br x16
        \nodepart{third} .word 0x20000000
        \nodepart{fourth} .word 0x00000001
    };
    
    % Labels for thunk (Left side)
    \node[symb, orange!80!black] at (thunk.one west) {\_\_far\_thunk:};
    \node[addr] at (thunk.one west) {0x400018};
    \node[addr] at (thunk.two west) {0x40001c};
    \node[addr] at (thunk.three west) {0x400020};
    \node[addr] at (thunk.four west) {0x400024};

    % --- The Gap (Centered under the 4.5cm width box) ---
    \coordinate (center_column) at ($(thunk.south west)!0.5!(thunk.south east)$);
    \node[below=1mm of center_column, text=gray, font=\itshape\small] (gap) {
        [ ... $\approx$ 5 GiB Address Space Gap ... ]
    };
    
    % --- High Memory (Target) ---
    % Positioned below the gap
    \node[memory block={green!60!black}{3}, below=7mm of center_column] (far) {
        push x29 \par mov x29, sp
        \nodepart{second} ...
        \nodepart{third} ret
    };
    
    % Labels for far function (Left side)
    \node[symb, green!40!black] at (far.one west) {far\_function:};
    \node[addr] at (far.one west) {0x120000000};

    % --- Control Flow Paths ---
    
    % Jump 1: main to thunk (Right side)
    \draw[jump path, blue] (main.second east) -- ++(0.6,0) |- (thunk.one east)
        node[pos=0.25, right, font=\sffamily\scriptsize, align=left] {1. Relative Jump};

    % Jump 2: thunk to far (Left side)
    % This edge takes the "long way" around the labels on the left
    \draw[jump path, green!60!black] (thunk.second east) -- ++(0.5,0) |- (far.one east)
        node[pos=0.25, right, font=\sffamily\scriptsize, align=left] {2. Absolute Jump\\(via x16)};

\end{tikzpicture}
\end{document}

-->
<p><a href="/assets/images/thunk.png"><img src="/assets/images/thunk_50p.png" alt="thunk image" /></a></p>

<p>LLVM includes support for inserting thunks for certain architectures such as AArch64. Because AArch64 is a fixed-size (32-bit) instruction set, its relative branch instruction is restricted to ±128MiB. As this limit is so low, <code class="language-plaintext highlighter-rouge">lld</code> has support for thunks out of the box.</p>

<p>If we cross-compile our “far function” example for AArch64 using the same linker script to synthetically place it far away to trigger the need for a thunk, the linker magic becomes visible immediately.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;</span> aarch64-linux-gnu-gcc <span class="nt">-c</span> main.c <span class="nt">-o</span> main.o <span class="se">\</span>
<span class="nt">-fno-exceptions</span> <span class="nt">-fno-unwind-tables</span> <span class="se">\</span>
<span class="nt">-fno-asynchronous-unwind-tables</span>

<span class="o">&gt;</span> aarch64-linux-gnu-gcc <span class="nt">-c</span> far.c <span class="nt">-o</span> far.o <span class="se">\</span>
<span class="nt">-fno-exceptions</span> <span class="nt">-fno-unwind-tables</span> <span class="se">\</span>
<span class="nt">-fno-asynchronous-unwind-tables</span>

<span class="o">&gt;</span> ld.lld main.o far.o <span class="nt">-T</span> overflow.lds <span class="nt">-o</span> thunk-aarch64
</code></pre></div></div>

<p>We can now see the generated code with <code class="language-plaintext highlighter-rouge">objdump</code>.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;</span> aarch64-unknown-linux-gnu-objdump <span class="nt">-dr</span> thunk-example 

Disassembly of section .text:

0000000000400000 &lt;main&gt;:
  400000:	a9bf7bfd 	stp	x29, x30, <span class="o">[</span>sp, <span class="c">#-16]!</span>
  400004:	910003fd 	mov	x29, sp
  400008:	94000004 	bl	400018 &lt;__AArch64AbsLongThunk_far_function&gt;
  40000c:	52800000 	mov	w0, <span class="c">#0x0                   	// #0</span>
  400010:	a8c17bfd 	ldp	x29, x30, <span class="o">[</span>sp], <span class="c">#16</span>
  400014:	d65f03c0 	ret

0000000000400018 &lt;__AArch64AbsLongThunk_far_function&gt;:
  400018:	58000050 	ldr	x16, 400020 &lt;__AArch64AbsLongThunk_far_function+0x8&gt;
  40001c:	d61f0200 	br	x16
  400020:	20000000 	.word	0x20000000
  400024:	00000001 	.word	0x00000001

Disassembly of section .text.far:

0000000120000000 &lt;far_function&gt;:
   120000000:	d503201f 	nop
   120000004:	d65f03c0 	ret
</code></pre></div></div>

<p>Instead of branching to <code class="language-plaintext highlighter-rouge">far_function</code> at <code class="language-plaintext highlighter-rouge">0x120000000</code>, it branches to a generated thunk at <code class="language-plaintext highlighter-rouge">0x400018</code> (only 16 bytes away). The thunk, much like the large code-model, loads <code class="language-plaintext highlighter-rouge">x16</code> with the absolute address stored in the two <code class="language-plaintext highlighter-rouge">.word</code> literals, and then performs an absolute jump (<code class="language-plaintext highlighter-rouge">br</code>).</p>

<p>What if <code class="language-plaintext highlighter-rouge">x86_64</code> supported this? Can we now go beyond 2GiB? 🤯</p>
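<p>As a hypothetical sketch (lld generates no such thunk for x86_64 today, and the label name here is invented for illustration), an equivalent thunk could pair a 6-byte RIP-relative indirect jump with an 8-byte address literal:</p>

```
__x86_64_abs_thunk_far_function:    # reachable via a normal rel32 CALL
    jmp  *0f(%rip)                  # ff 25 <disp32>: RIP-relative indirect jump
0:  .quad far_function              # full 64-bit absolute address (8 bytes)
```

<p>Only callsites that actually overflow would pay the extra indirection; everything else keeps its 5-byte relative <code class="language-plaintext highlighter-rouge">CALL</code>.</p>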

<p>There are more relocations beyond <code class="language-plaintext highlighter-rouge">CALL</code> instructions that similar thunks would need to fix. Although we are mostly using static binaries, some libraries such as <code class="language-plaintext highlighter-rouge">glibc</code> may be dynamically loaded. Access to the functions in these shared libraries goes through the PLT (which is itself a thunk 🤯), which in turn jumps through an address stored in the GOT, the Global Offset Table.</p>

<p>The GOT addresses are also loaded via a relative offset, so they will need to be changed to either use thunks or perhaps multiple GOT sections; the latter also has prior art for other architectures such as MIPS [<a href="https://github.com/llvm/llvm-project/blob/5c19f77a7e0c4b35c0efb511a7d9e2e436335e61/lld/ELF/SyntheticSections.h#L315">ref</a>].</p>

<p>With this in mind, the large code-model feels unnecessary. Why pay the cost at every callsite when we can do so piecemeal, as necessary, with the opportunity to use profiles to guide which callsites migrate to thunks?</p>

<p>Furthermore, if our binaries are already tens of gigabytes, size is clearly not a constraint for us. We can duplicate GOT entries, at the cost of even larger binaries, to reduce the need for even more thunks for the PLT <code class="language-plaintext highlighter-rouge">jmp</code>.</p>

<p>What do you think? Let’s collaborate.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[In my previous post, we looked at the “sound barrier” of x86_64 linking: the 32-bit relative CALL instruction and how it can result in relocation overflows. Changing the code-model to -mcmodel=large fixes the issue but at the cost of “instruction bloat” and likely a performance penalty although I had failed to demonstrate it via a benchmark 🥲.]]></summary></entry><entry><title type="html">Huge binaries</title><link href="https://fzakaria.com/2025/12/28/huge-binaries" rel="alternate" type="text/html" title="Huge binaries" /><published>2025-12-28T14:13:00-08:00</published><updated>2025-12-28T14:13:00-08:00</updated><id>https://fzakaria.com/2025/12/28/huge-binaries</id><content type="html" xml:base="https://fzakaria.com/2025/12/28/huge-binaries"><![CDATA[<p>A problem I experienced when pursuing my PhD and submitting academic articles was that I had built solutions to problems that required dramatic scale to be effective and worthwhile. Responses to my publication submissions often claimed such problems did not exist; however, I had observed them during my time within industry, such as at Google, but I couldn’t cite it!</p>

<p>One problem that is only present at these mega-codebases is <em>massive binaries</em>. What’s the largest binary (ELF file) you’ve ever seen? I had observed binaries beyond 25GiB, including debug symbols. How is this possible? These companies prefer to statically build their services to speed up startup and simplify deployment. Statically including all code in some of the world’s largest codebases is a recipe for massive binaries.</p>

<p>Similar to the sound barrier, there is a point at which code size becomes problematic and we must re-think how we link and build code. For x86_64, that is the 2GiB “Relocation Barrier.”</p>

<p>Why 2GiB? 🤔</p>

<p>Well, let’s take a look at how position-independent code is put together.</p>

<p>Let’s look at a simple example.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">extern</span> <span class="kt">void</span> <span class="nf">far_function</span><span class="p">();</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="n">far_function</span><span class="p">();</span>
    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>If we compile this <code class="language-plaintext highlighter-rouge">gcc -c simple-relocation.c -o simple-relocation.o</code> we can inspect it with <code class="language-plaintext highlighter-rouge">objdump</code>.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;</span> objdump <span class="nt">-dr</span> simple-relocation.o

0000000000000000 &lt;main&gt;:
   0:	55                   	push   %rbp
   1:	48 89 e5             	mov    %rsp,%rbp
   4:	b8 00 00 00 00       	mov    <span class="nv">$0x0</span>,%eax
   9:	e8 00 00 00 00       	call   e &lt;main+0xe&gt;
			a: R_X86_64_PLT32	far_function-0x4
   e:	b8 00 00 00 00       	mov    <span class="nv">$0x0</span>,%eax
  13:	5d                   	pop    %rbp
  14:	c3                   	ret
</code></pre></div></div>

<p>There’s a lot going on here, but one important part is <code class="language-plaintext highlighter-rouge">e8 00 00 00 00</code>. <code class="language-plaintext highlighter-rouge">e8</code> is the <code class="language-plaintext highlighter-rouge">CALL</code> opcode [<a href="https://c9x.me/x86/html/file_module_x86_id_26.html">ref</a>] and it takes a <strong>32-bit signed relative offset</strong>, which happens to be 0 (four bytes of 0) right now. <code class="language-plaintext highlighter-rouge">objdump</code> also lets us know there is a “relocation” necessary to fix up this code when we finalize it. We can view this relocation with <code class="language-plaintext highlighter-rouge">readelf</code> as well.</p>

<blockquote class="alert alert-note">
  <p><strong>Note</strong>
If you are wondering why we need <code class="language-plaintext highlighter-rouge">-0x4</code>, it’s because the offset is relative to the instruction pointer, which has already moved to the next instruction. The 4 bytes are the operand it has skipped over.</p>
</blockquote>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;</span> readelf <span class="nt">-r</span> simple-relocation.o <span class="nt">-d</span>

Relocation section <span class="s1">'.rela.text'</span> at offset 0x170 contains 1 entry:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
00000000000a  000400000004 R_X86_64_PLT32    0000000000000000 far_function - 4
</code></pre></div></div>

<p>This is additional information embedded in the binary which tells the linker in subsequent stages that it has code that needs to be fixed. Here we see the offset <code class="language-plaintext highlighter-rouge">00000000000a</code>; <code class="language-plaintext highlighter-rouge">a</code> is 9 + 1, the offset of the start of the operand for our <code class="language-plaintext highlighter-rouge">CALL</code> instruction.</p>

<p>Let’s now create the C file for our missing function.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">far_function</span><span class="p">()</span> <span class="p">{</span>
<span class="p">}</span>
</code></pre></div></div>

<p>We will now compile it and link the two object files together using our linker.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;</span> gcc simple-relocation.o far-function.o <span class="nt">-o</span> simple-relocation
</code></pre></div></div>

<p>Let’s now inspect that same callsite and see what it has.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;</span> objdump <span class="nt">-dr</span> simple-relocation

0000000000401106 &lt;main&gt;:
  401106:	55                   	push   %rbp
  401107:	48 89 e5             	mov    %rsp,%rbp
  40110a:	b8 00 00 00 00       	mov    <span class="nv">$0x0</span>,%eax
  40110f:	e8 07 00 00 00       	call   40111b &lt;far_function&gt;
  401114:	b8 00 00 00 00       	mov    <span class="nv">$0x0</span>,%eax
  401119:	5d                   	pop    %rbp
  40111a:	c3                   	ret

000000000040111b &lt;far_function&gt;:
  40111b:	55                   	push   %rbp
  40111c:	48 89 e5             	mov    %rsp,%rbp
  40111f:	90                   	nop
  401120:	5d                   	pop    %rbp
  401121:	c3                   	ret
</code></pre></div></div>

<p>We can see that the linker did the right thing with the relocation and calculated the relative offset of our symbol <code class="language-plaintext highlighter-rouge">far_function</code> and fixed the <code class="language-plaintext highlighter-rouge">CALL</code> instruction.</p>

<p>Okay cool…🤷 What does this have to do with huge binaries?</p>

<p>Notice that this call instruction, <code class="language-plaintext highlighter-rouge">e8</code>, only takes a 32-bit <strong>signed</strong> offset, which means it’s limited to ±2^31 bytes. A callsite can only jump roughly 2GiB forward or 2GiB backward. The “2GiB Barrier” is the reach of a single relative jump in either direction.</p>

<p>What happens if our callsite is over 2GiB away?</p>

<p>Let’s build a synthetic example by asking our linker to place <code class="language-plaintext highlighter-rouge">far_function</code> <em>really, really far away</em>. We can do this using a “linker script”, a mechanism for instructing the linker how we would like our sections laid out in the final binary.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SECTIONS
{
    /* 1. Start with standard low-address sections */
    . = 0x400000;
    
    /* Catch everything except our specific 'far' object */
    .text : { 
        simple-relocation.o(.text.*) 
    }
    .rodata : { *(.rodata .rodata.*) }
    .data   : { *(.data .data.*) }
    .bss    : { *(.bss .bss.*) }

    /* 2. Move the cursor for the 'far' island */
    . = 0x120000000; 
    
    .text.far : { 
        far-function.o(.text*) 
    }
}
</code></pre></div></div>

<p>If we now try to link our code we will see a “relocation overflow”.</p>

<blockquote class="alert alert-tip">
  <p><strong>TIP</strong>
I used <code class="language-plaintext highlighter-rouge">lld</code> from <a href="https://lld.llvm.org/">LLVM</a> because the error messages are a bit prettier.</p>
</blockquote>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;</span> gcc simple-relocation.o far-function.o <span class="nt">-T</span> overflow.lds <span class="nt">-o</span> simple-relocation-overflow <span class="nt">-fuse-ld</span><span class="o">=</span>lld

ld.lld: error: &lt;internal&gt;:<span class="o">(</span>.eh_frame+0x6c<span class="o">)</span>:
relocation R_X86_64_PC32 out of range:
5364513724 is not <span class="k">in</span> <span class="o">[</span><span class="nt">-2147483648</span>, 2147483647]<span class="p">;</span> references section <span class="s1">'.text'</span>
ld.lld: error: simple-relocation.o:<span class="o">(</span><span class="k">function </span>main: .text+0xa<span class="o">)</span>:
relocation R_X86_64_PLT32 out of range:
5364514572 is not <span class="k">in</span> <span class="o">[</span><span class="nt">-2147483648</span>, 2147483647]<span class="p">;</span> references <span class="s1">'far_function'</span>
<span class="o">&gt;&gt;&gt;</span> referenced by simple-relocation.c
<span class="o">&gt;&gt;&gt;</span> defined <span class="k">in </span>far-function.o
</code></pre></div></div>

<p>When we hit this problem, what solutions do we have?
Well, that is a whole other subject, “code models”, and it’s a little more nuanced depending on whether we are accessing data (i.e. static variables) or code that is far away. A great blog post that goes into this is <a href="https://maskray.me/blog/2023-05-14-relocation-overflow-and-code-models">the following</a> by <a href="https://github.com/maskray">@maskray</a>, a maintainer of <code class="language-plaintext highlighter-rouge">lld</code>.</p>

<p>The simplest solution, however, is to use <code class="language-plaintext highlighter-rouge">-mcmodel=large</code>, which changes all the relative <code class="language-plaintext highlighter-rouge">CALL</code> instructions to absolute 64-bit ones: the address is materialized into a register and called indirectly.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;</span> gcc simple-relocation.o far-function.o <span class="nt">-T</span> overflow.lds <span class="nt">-o</span> simple-relocation-overflow

<span class="o">&gt;</span> gcc <span class="nt">-c</span> simple-relocation.c <span class="nt">-o</span> simple-relocation.o <span class="nt">-mcmodel</span><span class="o">=</span>large <span class="nt">-fno-asynchronous-unwind-tables</span>

<span class="o">&gt;</span> gcc simple-relocation.o far-function.o <span class="nt">-T</span> overflow.lds <span class="nt">-o</span> simple-relocation-overflow

./simple-relocation-overflow
</code></pre></div></div>

<blockquote class="alert alert-note">
  <p><strong>Note</strong>
I needed to add <code class="language-plaintext highlighter-rouge">-fno-asynchronous-unwind-tables</code> to disable some additional data that might cause overflow for the purpose of this demonstration.</p>
</blockquote>

<p>What does the disassembly look like now?</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;</span> objdump <span class="nt">-dr</span> simple-relocation-overflow 

0000000120000000 &lt;far_function&gt;:
   120000000:	55                   	push   %rbp
   120000001:	48 89 e5             	mov    %rsp,%rbp
   120000004:	90                   	nop
   120000005:	5d                   	pop    %rbp
   120000006:	c3                   	ret

00000000004000e6 &lt;main&gt;:
  4000e6:	55                   	push   %rbp
  4000e7:	48 89 e5             	mov    %rsp,%rbp
  4000ea:	b8 00 00 00 00       	mov    <span class="nv">$0x0</span>,%eax
  4000ef:	48 ba 00 00 00 20 01 	movabs <span class="nv">$0x120000000</span>,%rdx
  4000f6:	00 00 00 
  4000f9:	ff d2                	call   <span class="k">*</span>%rdx
  4000fb:	b8 00 00 00 00       	mov    <span class="nv">$0x0</span>,%eax
  400100:	5d                   	pop    %rbp
  400101:	c3                   	ret
</code></pre></div></div>

<p>There is no longer a lone <code class="language-plaintext highlighter-rouge">CALL</code> instruction; it has become <code class="language-plaintext highlighter-rouge">MOVABS</code> &amp; <code class="language-plaintext highlighter-rouge">CALL</code> 😲. This grew the callsite from 5 bytes (1 opcode byte + 4 bytes for the 32-bit relative offset) to a whopping 12 bytes (2 bytes for the <code class="language-plaintext highlighter-rouge">MOVABS</code> opcode with its REX prefix + 8 bytes for the absolute 64-bit address + 2 bytes for the indirect <code class="language-plaintext highlighter-rouge">CALL</code>).</p>

<p>This has notable downsides among others:</p>
<ul>
  <li><em>Instruction Bloat</em>: We’ve gone from 5 bytes per call to 12. In a binary with millions of callsites, this can add up.</li>
  <li><em>Register Pressure</em>: We’ve burned a general-purpose register, <code class="language-plaintext highlighter-rouge">%rdx</code>, to perform the jump.</li>
</ul>

<blockquote class="alert alert-caution">
  <p><strong>Caution</strong>
I had a lot of trouble building a benchmark that demonstrated a lower IPC (instructions per cycle) for the large <code class="language-plaintext highlighter-rouge">mcmodel</code>, so let’s just take my word for it. 🤷</p>
</blockquote>

<p>Changing to a larger code-model is possible, but it comes with these downsides. Ideally, we would keep the small code-model wherever we can. What other strategies can we pursue?</p>

<p>More to come in subsequent writings.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[A problem I experienced when pursuing my PhD and submitting academic articles was that I had built solutions to problems that required dramatic scale to be effective and worthwhile. Responses to my publication submissions often claimed such problems did not exist; however, I had observed them during my time within industry, such as at Google, but I couldn’t cite it!]]></summary></entry></feed>