dev-resources.site

Creating Repeatable Builds

Published at: 1/12/2024
Categories: softwareengineering, nix
Author: olddutchcap

Repeatability

There's a very important concept in building reliable software: repeatability. If you build software for any appreciable amount of time, you will run across a situation where a client reports a bug that you cannot recreate. That is a failure of repeatability. You want the same behavior regardless of whether the software runs on your machine or the client's.

Another difficult situation is the intermittent bug. The client repeatedly sees an issue, but there doesn't seem to be any consistency to when or how it happens. This, too, is the opposite of repeatability.

Both of these situations are bad. For one thing, they eat up time hunting down bugs that are especially hard to diagnose. They also make the client less likely to trust you in the future. After all, if you can't make your software work, whatever the actual cause, why should they trust you?

A Funny Machine

Let's imagine for a second that you've got a machine that takes four inputs (e.g. threads) and transforms them into something else (e.g. cloth). Further, let's imagine that you make some sort of mechanical change to the machine and suddenly it stops working as expected. What's the cause of the problem? While it might be any number of things, the most likely cause is the last change you made. The machine was working before the change and now it isn't, so logically the change is the most likely cause of the problem.

Now let's imagine that we change two of the inputs and we change the machine at the same time, and the machine stops working. Which alteration caused the failure: the change to the machine, or the change to one or the other input? Having three candidates instead of one makes diagnosing the cause of the failure a lot more time-consuming and complex.

So there are two lessons we might draw from this imaginary scenario:

  1. Try not to change more than one thing at a time.
  2. If you must change more than one thing, try to test after each change.
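
The second lesson can be sketched in code. This is a minimal illustration, not a real tool: `first_breaking_change`, the `changes` list, and the `works` check are hypothetical names invented for the example. It applies changes one at a time and tests after each, so a failure points at exactly one change.

```python
from typing import Callable, Optional

# Hypothetical model: the "machine" is a dict of settings, and each change
# is a (name, function) pair where the function returns a modified copy.
Machine = dict
Change = tuple[str, Callable[[Machine], Machine]]


def first_breaking_change(
    machine: Machine,
    changes: list[Change],
    works: Callable[[Machine], bool],
) -> Optional[str]:
    """Apply changes one at a time, testing after each.

    Returns the name of the first change after which `works` fails,
    or None if the machine still works after every change.
    """
    for name, apply_change in changes:
        machine = apply_change(machine)
        if not works(machine):
            return name
    return None


# Illustrative data: two input changes and one machine change.
changes: list[Change] = [
    ("thicker thread", lambda m: {**m, "thread_mm": 0.4}),
    ("faster motor", lambda m: {**m, "rpm": 1500}),
    ("new dye", lambda m: {**m, "dye": "blue"}),
]


# Assumed rule for the example: the machine jams above 1200 rpm.
def works(m: Machine) -> bool:
    return m["rpm"] <= 1200


start = {"thread_mm": 0.3, "rpm": 1000, "dye": "red"}
culprit = first_breaking_change(start, changes, works)
print(culprit)  # faster motor
```

Had all three changes been applied before the first test, the only available information would be "something broke".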

All Other Things Being Equal

So how does our imaginary machine relate to the idea of repeatability? In this way: if we create a process that works correctly on our machine and then deploy it to a client's machine, anything that differs on the client's machine may cause a failure. Hence, if we can, we want to deploy to a client machine that matches as closely as possible the machine on which the software was built. In other words, we want to keep everything "equal" other than the actual software we're deploying.

How can we achieve this state of "all other things being equal"? Docker is one great way. By using a Docker image and being very careful about the software included in the image, we can remove differences between our machine and a client's machine. This is one of the strong appeals of using Docker in the first place: it helps us eliminate the "works on my machine" syndrome. However, there are steps beyond Docker that we can take to reduce variability even further.

When we deploy software we're dealing with at least two things: the code we've written ourselves and its dependencies. We need these dependencies, or else we'll end up having to build everything from scratch. What developer would bother to rewrite routines to fetch a file from the hard drive? We all use the OS-provided routines to do this. But using the OS routines does introduce a dependency. Usually these dependencies don't concern us, but any time software we depend upon changes, it's a potential source of a bug.

Versioning

If you've ever heard the term DLL Hell, it was in essence a problem with dependency management. When Microsoft's team first designed dynamic link libraries, they relied on the file name of the library alone to decide whether they had the right one. Hence if I had a library called mylibrary.dll and another application had a library called mylibrary.dll, then if I were not careful my application might use the wrong version of the library. Nothing in the linker/loader checked the library being loaded to ensure it was the correct one.

Now, one answer to this problem is strong versioning. In other words, I make sure I identify the version of mylibrary.dll I need, and I make sure I update the version any time a change is made. Semantic versioning is a step in this direction. In my code I can add a check that I'm getting a particular version of a library, which helps me avoid DLL Hell. Of course, if I have version 1.0 of mylibrary.dll and another software package has its own, different version 1.0 of mylibrary.dll, then I still have a problem.
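
A version check of this kind might look like the following sketch. The helper names `parse_version` and `require_version` are made up for illustration, and the version strings are passed in directly. In a real Python program the installed version could come from the standard library's `importlib.metadata.version`.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as '1.4.2' into (1, 4, 2)."""
    return tuple(int(part) for part in text.split("."))


def require_version(name: str, found: str, required: str) -> None:
    """Raise if `found` is not exactly the `required` version."""
    if parse_version(found) != parse_version(required):
        raise RuntimeError(f"{name}: need version {required}, found {found}")


require_version("mylibrary", found="1.0", required="1.0")  # passes silently

try:
    require_version("mylibrary", found="1.1", required="1.0")
except RuntimeError as err:
    print(err)  # mylibrary: need version 1.0, found 1.1
```

Note what the check cannot see: two different files that both report version 1.0 pass it equally well.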

There's also the issue of a developer who isn't careful with his or her versioning. What then? He or she makes a change to a library (or a JavaScript module, etc.) that seems so minor they don't bother to change the version number. How do we account for that? The answer is a hash of the binary artifact.

What is a hash? Imagine you had a process that takes every byte in a file and reduces them all to a single value. Maybe you'd take the first byte, add it to the second, add that sum to the third, and so forth. (Note: this example is extremely simplified. A real hash function is considerably more complex.) This is what we call hashing a file. In hashing we're trying to create a value from the file that is:

  1. Determined entirely by the contents of the file, so that changing the contents changes the value.
  2. Not likely to accidentally be the value of a different file (such an accident is called a hash collision).
  3. Not too large.

The first property is the one we're most interested in. If I have a file that starts with the bytes 0x10, 0x15, 0xFF, I want the hash derived from the file to be different from the hash derived from a file starting 0x10, 0x16, 0xFF. In this way I can easily detect programmatically that the two files are not the same.
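
To make this concrete, here is a small sketch comparing the simplified add-the-bytes scheme with a real hash function, SHA-256 from Python's standard `hashlib` module. The function name `additive_checksum` and the choice to keep only the low 8 bits of the sum (so the value stays small) are assumptions made for the example.

```python
import hashlib


def additive_checksum(data: bytes) -> int:
    """Toy hash: add up every byte, keeping only the low 8 bits."""
    return sum(data) % 256


a = bytes([0x10, 0x15, 0xFF])
b = bytes([0x10, 0x16, 0xFF])
c = bytes([0x15, 0x10, 0xFF])  # same bytes as `a`, reordered

print(additive_checksum(a))  # 36
print(additive_checksum(b))  # 37
print(additive_checksum(c))  # 36, a collision with `a`

# A cryptographic hash such as SHA-256 distinguishes all three inputs.
digests = {hashlib.sha256(x).hexdigest() for x in (a, b, c)}
print(len(digests))  # 3
```

The toy checksum tells the first two files apart but cannot see that bytes were reordered, which is why real hash functions are considerably more complex.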

By hashing the binary artifact I no longer need to rely on the version information provided by the developer. If even one byte in the file has been altered, the hash will almost certainly be different, and I'll know.
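
A minimal sketch of such a check, assuming SHA-256 as the hash function: `sha256_of_file` and `verify_artifact` are hypothetical helper names, and the temporary file only stands in for a real mylibrary.dll.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path) -> str:
    """Hash a file in chunks so large artifacts need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file's contents match the pinned hash."""
    return sha256_of_file(path) == expected_sha256


# Demonstration with a temporary stand-in for mylibrary.dll.
with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "mylibrary.dll"
    artifact.write_bytes(b"\x10\x15\xff")
    pinned = sha256_of_file(artifact)  # recorded while the build is known good

    unchanged_ok = verify_artifact(artifact, pinned)

    artifact.write_bytes(b"\x10\x16\xff")  # one byte altered, same file name
    altered_ok = verify_artifact(artifact, pinned)

print(unchanged_ok, altered_ok)  # True False
```

The pinned hash is recorded once and stored alongside the code that depends on the artifact, so the check does not depend on anyone remembering to bump a version number.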

Great, But How Does This Help Me?

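This is what the Nix package manager, on which NixOS is built, accomplishes for us. It identifies the things a build depends on by hash rather than by name and version alone, which gives us the infrastructure to build binary artifacts with confidence that all the dependencies (or inputs, if you will) are invariant.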
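
One way to picture "invariant inputs" is to name each build after a hash of everything that goes into it. The sketch below illustrates that idea only. It is not Nix's actual algorithm or path format, and `build_id`, its parameters, and the placeholder source hashes ("aaaa", "bbbb", "cccc") are all invented for the example.

```python
import hashlib
import json


def build_id(name: str, source_sha256: str, build_script: str,
             dependency_ids: list[str]) -> str:
    """Derive an identifier from every input to a build (illustrative only)."""
    inputs = {
        "name": name,
        "source": source_sha256,
        "script": build_script,
        "deps": sorted(dependency_ids),
    }
    # Serialize the inputs in a stable order so equal inputs give equal bytes.
    canonical = json.dumps(inputs, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()[:32] + "-" + name


# Two builds of a library that differ only in their source hash.
lib_a = build_id("mylibrary", "aaaa", "make install", [])
lib_b = build_id("mylibrary", "bbbb", "make install", [])

# The same application built against each of them.
app_a = build_id("myapp", "cccc", "make install", [lib_a])
app_b = build_id("myapp", "cccc", "make install", [lib_b])

print(lib_a != lib_b)  # True
print(app_a != app_b)  # True: a changed dependency changes the app's id too
```

Because each identifier includes the identifiers of its dependencies, a change anywhere below a package shows up in that package's own identifier.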

This is along the same lines as mocking when one is creating unit tests. If unit tests depend on potentially variable inputs, then when we have a failure we need to check whether it was caused by changed code or by changed dependencies. Mocking allows us to keep our dependencies unchanged between runs of our tests, so we can be confident that if we see a failure it's our code change that caused it.
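
As a small illustration of that idea, the sketch below uses `Mock` from Python's standard `unittest.mock` module. The function `price_with_tax` and the `rate_service` object with its `current_rate` method are hypothetical, invented for the example.

```python
from unittest.mock import Mock


def price_with_tax(net: float, rate_service) -> float:
    """Code under test: depends on an external service for the tax rate."""
    return round(net * (1 + rate_service.current_rate()), 2)


# Replace the variable dependency with a mock that always returns the same rate.
fixed_rates = Mock()
fixed_rates.current_rate.return_value = 0.25

result = price_with_tax(100.0, fixed_rates)
print(result)  # 125.0

# The mock also records how it was used.
fixed_rates.current_rate.assert_called_once()
```

With the rate held fixed, a failing assertion on `result` can only come from a change to `price_with_tax` itself.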

Nix allows us to take that idea of holding external elements constant and apply it much more rigorously to the entire build process. It allows us to build Docker images that use precisely the same shared libraries, which ensures that if there is some sort of failure we don't have to spend time examining external dependencies. With so many of us deploying Docker images to AWS and other cloud services, knowing that the image on our machine and the image in the cloud are identical is a huge win.
