Bazel Knowledge: What's an Interface JAR?

Published 2024-10-29 on Farid Zakaria's Blog

I spent the day working through an upgrade of our codebase at $DAYJOB$ to Java 21 and hit Bazel issue #24138 as a result of an incorrectly produced hjar.

🤨 WTF is an hjar ?

☝️ It is the newer version of ijar !

😠 WTF is an ijar ?

Let’s discover what an ijar (Interface JAR) is and how it’s the magic sauce that makes Bazel so fast for Java.

Let’s consider a simple Makefile:

program: main.o utils.o
	$(CC) -o program main.o utils.o

main.o: main.c utils.h
	$(CC) -c main.c

utils.o: utils.c utils.h
	$(CC) -c utils.c

We’ve been taught to make use of header files, especially in C/C++, so that we can avoid recompilation as a form of early cutoff optimization.

☝️ If we change only utils.c, we do not have to recompile main.o.

We can visualize this Makefile in the following graph.

Makefile as a graph

Ok, great! What does this have to do with Java & Bazel ?

Well, let’s remember back to my previous post on reproducible outputs.

Bazel constructs a similar graph and uses the “Action Key” to decide when it can apply the early cutoff optimization. Bazel computes a hash for each action from its inputs (its dependencies, for instance), and if the hash hasn’t changed it can reuse the memoized result instead of redoing the work.

Bazel Action Graph
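To make the memoization idea concrete, here is a minimal sketch in Java. It is a toy model, not Bazel's real implementation: the class name, the `run` method and the way the key is assembled are all assumptions made for illustration. It only shows the principle that an action is re-run when the hash of its command and input digests changes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of action memoization. All names here are hypothetical;
// this is not how Bazel itself is implemented.
final class ActionCacheSketch {
    private final Map<String, String> cache = new HashMap<>();
    int executions = 0; // counts how often the work was actually done

    // Hash the command line together with the digest of every input.
    static String actionKey(String command, List<String> inputDigests) {
        try {
            MessageDigest sha = MessageDigest.getInstance("SHA-256");
            sha.update(command.getBytes(StandardCharsets.UTF_8));
            for (String digest : inputDigests) {
                sha.update((byte) 0); // separator between fields
                sha.update(digest.getBytes(StandardCharsets.UTF_8));
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : sha.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 ships with every JDK
        }
    }

    // Run the action only if no result is cached under its key.
    String run(String command, List<String> inputDigests) {
        String key = actionKey(command, inputDigests);
        String cached = cache.get(key);
        if (cached != null) {
            return cached; // early cutoff: same key, reuse the old output
        }
        executions++;
        String output = "output of: " + command;
        cache.put(key, output);
        return output;
    }

    public static void main(String[] args) {
        ActionCacheSketch cache = new ActionCacheSketch();
        cache.run("javac Main.java", List.of("digest-a"));
        cache.run("javac Main.java", List.of("digest-a")); // cache hit
        cache.run("javac Main.java", List.of("digest-b")); // input changed
        System.out.println("executions = " + cache.executions);
    }
}
```

The second call is served from the cache because nothing feeding the key changed; the third call has a different input digest, so the work is done again.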

In Java-world, dependencies are expressed as JARs.

Wouldn’t private-only changes to a dependency (e.g. renaming a private variable) cause the Action Key hash to change (since it produced a different JAR) ?

🤓 YES! That is why we need an ijar !

ijar is a tool found within the Bazel repository bazel/third_party/ijar.

You can build and run it fairly simply with Bazel:

$ bazel run //third_party/ijar
Usage: ijar [-v] [--[no]strip_jar] [--target label label] [--injecting_rule_kind kind] x.jar [x_interface.jar>]
Creates an interface jar from the specified jar file.

Its purpose is straightforward. The tool strips everything from the JAR that is not part of its interface. For example, it throws away:

  • Files whose name does not end in “.class”.
  • All executable method code.
  • All private methods and fields.
  • All constants and attributes except the minimal set necessary to describe the class interface.
  • All debugging information (LineNumberTable, SourceFile, LocalVariableTables attributes).

The end result is something similar in spirit to a C/C++ header file.
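As a rough analogy for what "interface only" means, the sketch below uses Java reflection to list the non-private method signatures of a class. This is not how ijar works (ijar rewrites the class files inside the JAR, and it also deals with fields, constructors and attributes); the `interfaceOf` helper and the two nested banana classes are hypothetical names used purely for illustration.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Set;
import java.util.StringJoiner;
import java.util.TreeSet;

// Reflection-based analogy for an "interface view" of a class.
// Hypothetical sketch; ijar itself operates on class file bytes.
final class InterfaceViewSketch {
    static class Banana {
        public void peel() { squish(); }
        private void squish() { }
    }

    // Same public surface, but the private helper has a different name.
    static class RenamedBanana {
        public void peel() { squash(); }
        private void squash() { }
    }

    // Collect the signatures of all non-private, non-synthetic methods.
    static Set<String> interfaceOf(Class<?> type) {
        Set<String> signatures = new TreeSet<>();
        for (Method m : type.getDeclaredMethods()) {
            if (m.isSynthetic() || Modifier.isPrivate(m.getModifiers())) {
                continue; // private details are not part of the interface
            }
            StringJoiner params = new StringJoiner(", ", "(", ")");
            for (Class<?> p : m.getParameterTypes()) {
                params.add(p.getSimpleName());
            }
            signatures.add(Modifier.toString(m.getModifiers()) + " "
                    + m.getReturnType().getSimpleName() + " "
                    + m.getName() + params);
        }
        return signatures;
    }

    public static void main(String[] args) {
        System.out.println(interfaceOf(Banana.class));
        System.out.println(interfaceOf(RenamedBanana.class));
    }
}
```

Both classes yield the same set of signatures, because the only difference between them is the name of a private method.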

Let’s see it in practice. 🕵️

Let’s now create an incredibly simple JAR. It will have a single class file within it.

public class Banana {
    public void peel() {
        System.out.println("Peeling the banana...");
        squish();
    }
    private void squish() {
        System.out.println("Squish! The banana got squashed.");
    }
}

We compile it as usual.

$ javac Banana.java
$ jar cf banana.jar Banana.class

When we run ijar on it, the resulting interface JAR has the hash e18e0ae82bdc4deb04f04aa.

⚠️ I shortened the hashes to make them more legible.

$ bazel-bin/third_party/ijar/ijar banana.jar

$ sha256sum banana.jar
f813749013ea6aba2e00876  banana.jar

$ sha256sum banana-interface.jar
e18e0ae82bdc4deb04f04aa  banana-interface.jar

Let’s now change the internals of the Banana class by renaming the private method squish() to squash().

After recompiling and re-running ijar, let’s recompute the sha256 hashes.

$ sha256sum banana.jar
9278282827ddb55c68eb370  banana.jar

$ sha256sum banana-interface.jar
e18e0ae82bdc4deb04f04aa  banana-interface.jar

🤯 Although the hash of banana.jar has changed, we still get e18e0ae82bdc4deb04f04aa for the ijar.

We now have the equivalent of a header file for Java code. 🙌

Bazel will use the ijar in lieu of the full JAR for your dependencies when computing the Action Key hash, thus avoiding costly rebuilds when only private information changes within a dependency.
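Here is a small sketch of that effect, reusing the shortened hashes from the sha256sum runs in this post. The `key` function is a deliberately simplified stand-in for an Action Key, and the downstream compile command (`javac Smoothie.java`) is a hypothetical consumer of banana.jar invented for this example.

```java
import java.util.List;

// Simplified stand-in for keying a downstream action on a dependency.
// The key format and the Smoothie consumer are assumptions for illustration.
final class IjarKeySketch {
    // Shortened digests copied from the sha256sum output earlier in this post.
    static final String JAR_BEFORE = "f813749013ea6aba2e00876";
    static final String JAR_AFTER = "9278282827ddb55c68eb370";
    static final String IJAR_BEFORE = "e18e0ae82bdc4deb04f04aa";
    static final String IJAR_AFTER = "e18e0ae82bdc4deb04f04aa";

    // A key is just the command plus the digests of its dependencies.
    static String key(String command, List<String> dependencyDigests) {
        return command + "|" + String.join(",", dependencyDigests);
    }

    // The downstream action must be re-run only when its key changes.
    static boolean mustRebuild(String oldKey, String newKey) {
        return !oldKey.equals(newKey);
    }

    public static void main(String[] args) {
        String command = "javac Smoothie.java"; // hypothetical dependent

        boolean withFullJar = mustRebuild(
                key(command, List.of(JAR_BEFORE)),
                key(command, List.of(JAR_AFTER)));
        boolean withIjar = mustRebuild(
                key(command, List.of(IJAR_BEFORE)),
                key(command, List.of(IJAR_AFTER)));

        System.out.println("keyed on banana.jar:           rebuild = " + withFullJar);
        System.out.println("keyed on banana-interface.jar: rebuild = " + withIjar);
    }
}
```

Keyed on the full JAR, the private rename forces the dependent to recompile; keyed on the interface JAR, the key is unchanged and the dependent can be skipped.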

This is the amazing, lesser-known tool that makes Bazel super-powered 🦸 for JVM languages.