The yak shave for reproducibility

Published 2021-09-02 on Farid Zakaria's Blog

I have been on a mission to bring reproducibility through the use of Nix into my workplace as we envision the next version of our development environment.

Similar to the movies I watch that take place in space, it only takes a small hole to destroy your hermetic environment. 🧑‍🚀

I’ve written previously about my encounters trying to remove all impurities within a JVM environment.

I’ve even upstreamed a fix for the default java.library.path in the OpenJDK distributed by Nixpkgs. 🙌

Awesome! That should have solved my problem, right?…

Unfortunately, impurities are tough to stamp out. NixOS is attempting a paradigm shift by dismissing the Filesystem Hierarchy Standard (FHS), but the FHS is deeply rooted in the assumptions people make when they build for Linux.

In my specific case, the search paths for the JNR libraries were hardcoded.

// paths should have no duplicate entries and have insertion order
LinkedHashSet<String> paths = new LinkedHashSet<String>();
try {
    paths.addAll(getPropertyPaths("jnr.ffi.library.path"));
    paths.addAll(getPropertyPaths("jaffl.library.path"));
    // Add JNA paths for compatibility
    paths.addAll(getPropertyPaths("jna.library.path"));
    // java.library.path should take care of Windows defaults
    paths.addAll(getPropertyPaths("java.library.path"));
} catch (Exception ignored) {
}
if (Platform.getNativePlatform().isUnix()) {
    // order is intentional!
    paths.add("/usr/local/lib");
    paths.add("/usr/lib");
    paths.add("/lib");
}

I’ve sent a fix upstream; please feel free to comment.

This meant that I had to make sure java.library.path resolved to the glibc I am using from Nixpkgs first, so that it is searched before the hardcoded /usr/local/lib, /usr/lib and /lib.
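To see why the ordering matters, here is a minimal sketch of the same insertion-ordered set logic. The class name SearchPathOrder, the method searchPaths, and the placeholder store path are hypothetical and only for illustration; the three hardcoded directories are the ones from the snippet above. It is not the jnr-ffi implementation.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical illustration of the search order; not the jnr-ffi code.
class SearchPathOrder {

    // Entries from the given java.library.path value come first,
    // followed by the hardcoded Unix defaults. A LinkedHashSet keeps
    // insertion order and silently drops duplicates.
    static List<String> searchPaths(String javaLibraryPath) {
        LinkedHashSet<String> paths = new LinkedHashSet<>();
        if (javaLibraryPath != null) {
            for (String p : javaLibraryPath.split(File.pathSeparator)) {
                if (!p.isEmpty()) {
                    paths.add(p);
                }
            }
        }
        paths.add("/usr/local/lib");
        paths.add("/usr/lib");
        paths.add("/lib");
        return new ArrayList<>(paths);
    }

    public static void main(String[] args) {
        // Placeholder store path (assumption), not a real hash.
        String glibcLib = "/nix/store/<hash>-glibc/lib";
        System.out.println(searchPaths(glibcLib));
    }
}
```

As long as the Nixpkgs glibc directory is on java.library.path, it sits ahead of the FHS directories in the resulting list, so a lookup that walks the list in order would reach it first.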

I couldn’t get rid of the JAVA_TOOL_OPTIONS just yet. 😤

What’s the cost of all this?

First off, the JDK announces on stderr that it picked up JAVA_TOOL_OPTIONS, and that message cannot be turned off.

if (os::getenv(name, buffer, sizeof(buffer)) &&
    !os::have_special_privileges()) {
  JavaVMOption options[N_MAX_OPTIONS];      // Construct option array
  jio_fprintf(defaultStream::error_stream(),
            "Picked up %s: %s\n", name, buffer);

This is frustrating because plenty of tools (e.g. IntelliJ) assume failure if anything is emitted to stderr.
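For tooling you control, one possible workaround is to ignore that one notice when deciding whether stderr signals a failure. The sketch below is only an assumption about how such a check could look; the class StderrFilter and its methods are hypothetical and are not how IntelliJ works. The prefix is derived from the "Picked up %s: %s" format string in the JDK snippet above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of the JDK or IntelliJ.
class StderrFilter {

    // Matches the notice printed by the JVM for JAVA_TOOL_OPTIONS.
    static final String NOTICE_PREFIX = "Picked up JAVA_TOOL_OPTIONS: ";

    static boolean isJvmNotice(String line) {
        return line.startsWith(NOTICE_PREFIX);
    }

    // Returns true only if stderr contains something other than
    // blank lines and the JAVA_TOOL_OPTIONS notice.
    static boolean hasUnexpectedOutput(List<String> stderrLines) {
        for (String line : stderrLines) {
            if (!line.trim().isEmpty() && !isJvmNotice(line)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Example stderr content (made up for illustration).
        List<String> stderr = Arrays.asList(
            "Picked up JAVA_TOOL_OPTIONS: -Dexample=true");
        System.out.println(hasUnexpectedOutput(stderr));
    }
}
```

This does nothing for third-party tools that treat any stderr output as fatal, which is the frustrating part.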

Secondly, we have many developers who are being hit by this particular workflow:

  1. Developer sets up their IntelliJ JRuby SDK to point to JRuby, which points to /nix/store path A linked with glibc B.
  2. Developer changes their Git branch to a different point in time where JRuby points to a different /nix/store path X linked with glibc Z.
  3. The developer restarts IntelliJ and picks up the new environment variable JAVA_TOOL_OPTIONS 💥

If you read my earlier post, you’ll understand why JAVA_TOOL_OPTIONS includes glibc.

The new JAVA_TOOL_OPTIONS references glibc Z, but their JRuby SDK still points to the JRuby at path A, which was linked with glibc B.

They are then graced with this wonderful message.

java.lang.UnsatisfiedLinkError: /nix/store/cvr0kjg2q7z2wwhjblx6c73rv422k8cm-glibc-2.33-47/lib/libc.so.6: undefined symbol: _dl_catch_error_ptr, version GLIBC_PRIVATE
	at jnr.ffi.provider.jffi.NativeLibrary.loadNativeLibraries(NativeLibrary.java:93)

Ugh! Ultimately, the fix here is straightforward: the developer needs to be mindful of the JRuby SDK set in IntelliJ and keep it in sync with their local checkout.
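A check for that mismatch could look something like the sketch below. It rests on assumptions: that JAVA_TOOL_OPTIONS carries the glibc directory via a -Djava.library.path=... option, that options are separated by plain whitespace with no quoting, and that the glibc lib directory the SDK was linked with is known from somewhere. The class ToolOptionsCheck and its methods are hypothetical.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sanity check; names and input format are assumptions.
class ToolOptionsCheck {

    static final String FLAG = "-Djava.library.path=";

    // Extracts the java.library.path entries from a JAVA_TOOL_OPTIONS
    // value. Assumes whitespace-separated options without quoting.
    static List<String> libraryPathEntries(String javaToolOptions) {
        List<String> entries = new ArrayList<>();
        if (javaToolOptions == null) {
            return entries;
        }
        for (String opt : javaToolOptions.trim().split("\\s+")) {
            if (opt.startsWith(FLAG)) {
                String value = opt.substring(FLAG.length());
                for (String p : value.split(File.pathSeparator)) {
                    if (!p.isEmpty()) {
                        entries.add(p);
                    }
                }
            }
        }
        return entries;
    }

    // True when the first library path entry is the glibc lib
    // directory that the configured SDK was linked with.
    static boolean inSync(String javaToolOptions, String sdkGlibcLibDir) {
        List<String> entries = libraryPathEntries(javaToolOptions);
        return !entries.isEmpty() && entries.get(0).equals(sdkGlibcLibDir);
    }

    public static void main(String[] args) {
        // Second argument is a placeholder path (assumption).
        String opts = System.getenv("JAVA_TOOL_OPTIONS");
        System.out.println(inSync(opts, "/nix/store/<hash>-glibc/lib"));
    }
}
```

Something like this could at least warn before the UnsatisfiedLinkError shows up, though it does not keep the two in sync by itself.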

Unfortunately, it’s a sharp edge many are running into, and I don’t blame them!

We will be thinking of some sane ways to keep these two tools in sync so that we can remain on upstream for now… 🤔