Farid Zakaria's Blog

Scaling past 1 million ELF symbol relocations

Note
This is a follow-up to my previous post on speeding up ELF relocations for store-based systems.

I wrote earlier about some impressive speedups that can be achieved by forgoing typical dynamic linking. The approach applies to systems such as Nix, where the dependencies of a given application are static.


Fish the bash way

I have been a big fan of the fish shell lately, mostly because it delivers what it promises: it works out of the box™️.

The obvious downside to fish is that it is not POSIX sh-compatible, meaning some (not all!) of the one-line scripts you find on the Internet may not work.

I used to be a pretty big zsh fan, but I hit enough oddities with my setup that I gave up in anger one day. 😤


Nix secrets for dummies

Despite having used Nix for several years already, I feel secure in admitting that new concepts in Nix continue to be confusing at first glance.

I had avoided doing any secret management for Nix for a long time, largely because I ran Nix atop another Linux distribution, but now that I’m on NixOS my time hath come.

I wanted to write out my learnings here so that others may benefit from the ELI5 style of learning that we are missing in Nix.


Nix remote building with Yubikey

There isn’t that much Nix documentation for remote building with Nix.

I’m leaving here the tiny module I use to let my Framework Laptop running NixOS perform remote builds on the Nix Community builders. The specific twist is that I use a Yubikey as my SSH private key.

Actually some of the best documentation is courtesy of nixbuild.


Learn Nix the Fun Way

This is a post inspired by many talks I’ve given to engineering groups about Nix. You can see an example of one such talk: Why I love Nix, and you should too.

I’ve given a lot of Nix talks. I’ve given Nix talks internally at companies where I’ve introduced it, at local meetups and even at NixCon.

Giving a talk about Nix is hard. As engineers, I find we often try to explain why or how Nix works but never show the end result.

Many of the talks I’ve given start explaining “Nix developed as part of Eelco’s PhD thesis in 2003” and immediately eyes roll.

A meme photo of Picard hearing Nix terminology

Let’s do it different this time. Let’s learn Nix the fun way.


Reproducibility in Disguise: Bazel, Dependencies, and the Versioning Lie

Reproducibility has become a big deal. Whether it’s having higher confidence in your build or trying to better understand your supply chain for provenance, having an accurate view of your build graph is a must.

Tools such as Bazel have picked up mainstream usage thanks to advocacy by the large companies that use them, or through similar derivatives such as Buck. These companies write about & proclaim how the tools have solved many of their internal software development lifecycle problems. They’ve graciously open-sourced these tools for us to use so that we may also reap similar benefits. Sounds great, right?


Speeding up ELF relocations for store-based systems

Since the introduction of Nix and similar store-based systems such as Guix or Spack, I have been fascinated by finding improvements that take advantage of the new paradigms they introduce. Linux distributions are traditionally dynamic in nature, with shared libraries and executables being linked at runtime. Store-based systems, however, are static in nature, with all dependencies being resolved at build time. This determinism allows for not only reproducibility but also the ability to optimize various aspects of our toolchain.

Work that I have written about previously shows that there are worthwhile speedups to be gained. While I previously focused on improving the stat storm that occurs when resolving dependencies, I have recently been looking at speeding up the ELF relocations that occur when executing a program.

You can check out my publication Mapping Out the HPC Dependency Chaos about the development of shrinkwrap if you are interested in the topic.

Extending the idea further, I have been looking at how we can optimize the ELF relocations that occur when executing a program. In this post, I will discuss the basics of ELF relocations and symbol resolution and how we can optimize these processes for store-based systems.
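
To make the idea concrete, here is a toy model in Python, not the implementation discussed in the post. It assumes a simplified view of symbol resolution in which the loader walks the libraries in load order and the first one that defines a symbol wins. The library names, symbols, and the `resolve` and `precompute` helpers are all hypothetical. When the load order and the libraries are fixed, as in a store-based system, the answers never change, so they can be computed once ahead of time.

```python
# Toy model only: real ELF symbol resolution (hash tables, symbol
# versioning, weak symbols, visibility) is far more involved.

def resolve(symbol, load_order, exports):
    """Return the first library in load order that defines `symbol`."""
    for lib in load_order:
        if symbol in exports[lib]:
            return lib
    raise LookupError(f"undefined symbol: {symbol}")


def precompute(needed_symbols, load_order, exports):
    """Resolve every needed symbol once and keep the answers in a table."""
    return {sym: resolve(sym, load_order, exports) for sym in needed_symbols}


# Hypothetical libraries and symbols, for illustration only.
load_order = ["libfoo.so", "libbar.so", "libc.so.6"]
exports = {
    "libfoo.so": {"foo_init"},
    "libbar.so": {"bar_run", "foo_init"},  # foo_init is shadowed by libfoo.so
    "libc.so.6": {"malloc", "printf"},
}
needed = ["foo_init", "bar_run", "printf"]

# With static dependencies this table is the same on every launch.
table = precompute(needed, load_order, exports)
print(table)
```

The search in `resolve` is the part a dynamic system repeats on every launch; the table from `precompute` is what a static dependency set makes safe to reuse.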


Visualizing GitHub workflow run length time

GitHub Actions have now become an integral part of many open source projects, providing a free & powerful CI system. I am surprised, however, that there is no built-in way to visualize the run length time (or other meaningful metrics) of your actions.

🕵️ I did find a few other third-party solutions that either extract the data or themselves can be added as a step to your workflow to get similar visualizations. I wanted something simpler.

I previously wrote a post about the cost of runfiles which had become evident when we noticed our GitHub Bazel build workflow had slowed down by 50x.

After landing my fix, I wanted to visualize the run length time of the action and objectively see if my fix had worked. Trust but verify.
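
As a rough sketch of the "verify" step (not the tooling this post ends up using), the Python below computes run durations from a list of workflow run records. It assumes each record carries `run_started_at` and `updated_at` ISO 8601 timestamps, as workflow run objects from the GitHub REST API do, and it treats `updated_at` as an approximation of the finish time. The sample data and the `run_minutes` helper are made up for illustration.

```python
from datetime import datetime


def run_minutes(run):
    """Approximate duration of one workflow run, in minutes."""
    # Replace the trailing "Z" so older Python versions can parse the timestamp.
    started = datetime.fromisoformat(run["run_started_at"].replace("Z", "+00:00"))
    # Assumption: updated_at is close enough to the completion time.
    finished = datetime.fromisoformat(run["updated_at"].replace("Z", "+00:00"))
    return (finished - started).total_seconds() / 60


# Hypothetical sample data: one slow run before a fix, one fast run after.
runs = [
    {"run_number": 101,
     "run_started_at": "2024-01-01T10:00:00Z",
     "updated_at": "2024-01-01T11:02:00Z"},
    {"run_number": 102,
     "run_started_at": "2024-01-02T10:00:00Z",
     "updated_at": "2024-01-02T10:01:30Z"},
]

durations = {run["run_number"]: run_minutes(run) for run in runs}

# A crude text chart: one '#' per minute, at least one per run.
for number, minutes in durations.items():
    bar = "#" * max(1, round(minutes))
    print(f"run {number}: {minutes:5.1f} min {bar}")
```

Even a text chart like this is enough to see whether a regression of this size has gone away.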


Hermetic, but at what cost?

tl;dr: This is a little story about how making Bazel hermetic can lead to some unexpected consequences. In this particular case, it caused our GitHub action to slow down by 50x, from 1 minute to over 60 minutes.

The recommended fix was to apply the following to your .bazelrc. I needed to understand why, however.

# Disabling runfiles links drastically increases performance in slow disk IO situations
# Do not build runfile trees by default. If an execution strategy relies on runfile
# symlink tree, the tree is created on-demand. See: https://github.com/bazelbuild/bazel/issues/6627
# and https://github.com/bazelbuild/bazel/commit/03246077f948f2790a83520e7dccc2625650e6df
build --nobuild_runfile_links
test --nobuild_runfile_links

# https://bazel.build/reference/command-line-reference#flag--legacy_external_runfiles
build --nolegacy_external_runfiles
test --nolegacy_external_runfiles

Abusing GitHub as a PyPI server

I did not discover or invent this trick.

I wanted to make available a Python wheel to some developers but I did not want to publish it on PyPI for a variety of reasons.

  1. I am not the original author of the code and I did not want to take credit for it.
  2. I wanted to include the git commit hash in the version number which PyPI does not allow.
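
On the second point: PEP 440 calls a `+suffix` on a version a local version identifier, and PyPI rejects uploads that carry one. A minimal sketch of building such a version string in Python follows. The `local_version` and `has_local_segment` helpers, the base version, and the `g` prefix (borrowed from the `git describe` convention) are assumptions for illustration, not something taken from this post.

```python
import re


def local_version(base, commit):
    """Build a PEP 440 version with the commit hash as the local segment."""
    short = commit[:7].lower()
    if not re.fullmatch(r"[0-9a-f]{7}", short):
        raise ValueError(f"not a usable commit hash: {commit!r}")
    # The local segment follows '+' and may contain ASCII letters,
    # digits, and periods.
    return f"{base}+g{short}"


def has_local_segment(version):
    """True if the version carries a local identifier, which PyPI rejects."""
    return "+" in version


# Hypothetical base version and commit hash.
version = local_version("1.2.0", "3F9A1C2D5E")
print(version, has_local_segment(version))
```

A wheel built with a version like this can still be installed by pip from a direct URL or an alternative index, which is what makes hosting it somewhere other than PyPI attractive.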