functor.tokyo -- binplz: easily curl static binaries

binplz: easily curl static binaries

2023-02-26

Nixery is an interesting service. It dynamically serves Docker images created from arbitrary packages from Nixpkgs.

I thought that a similar type of service could be created that serves statically-linked binaries from Nixpkgs. This would be another way for users on any Linux distro to benefit from all the software packaged in Nixpkgs. I explained this to my friends Jonas Carpay (@jonascarpay) and Hideaki Kawai (@kayhide), and we came up with the name binplz.dev. I was thinking that the service would work like the following.

Imagine you're on your Linux machine, and you realize you need the tree command. Normally you could install it with your package manager, but maybe your package manager is broken. (Or maybe you don't want tree, but you want some package that isn't available on your distro.)

You should be able to download a statically-linked tree binary from binplz.dev:

$ curl binplz.dev/tree > tree
$ chmod +x ./tree

You can then use this tree binary like you'd expect:

$ ./tree --version
tree v1.8.0 (c) 1996 - 2018 by Steve Baker, ...

You can confirm that this tree binary is statically linked:

$ file ./tree
./tree: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped

Behind the scenes, when you access a URL like binplz.dev/tree, the web server running on binplz.dev builds the derivation pkgsStatic.tree from Nixpkgs, pulls out the statically-linked bin/tree binary from the output, and returns it.

binplz.dev should be able to dynamically build and serve any package from Nixpkgs.¹

Writing binplz

Jonas and Hideaki joined me to build an initial version of binplz. We created a binplz repo to hold our code, and Jonas quickly put together an initial implementation.

The initial implementation did everything we required above. We then put together some Terraform and Nix code to get it deployed and running on AWS. We had our proof-of-concept ready to go, and our next task was working it into something we could launch into production.

Problems

When trying to productionize this application, we ran into a few problems.

When a user requests a binary like tree, binplz is going to have to build it at least once, since the Nixpkgs Hydra doesn't build and cache most things from pkgsStatic. Our initial problems revolved around the following:

Where do we do the actual building?

A single build machine on AWS or Hetzner might work, but it would be slow if a bunch of users request big derivations like Chromium and Firefox. A nice solution might be auto-scaling Nix builders. nixbuild.net would be an obvious choice for this.

We had a really interesting conversion with Rickard Nilsson (@rickynils), the maintainer of nixbuild.net, about how we could use nixbuild.net in binplz. Rickard is a very smart guy, and it was obvious he has spent lots and lots of time thinking about the best ways to auto-scale Nix builds.²

The big problem with this is that we would need to come up with some way to limit having to pay too much money. There are a bunch of trade-offs we could consider here, like limiting the number parallel builds, making sure the same derivation isn't built multiple times in parallel, limiting multiple builds from individual users, limiting our total build time each month, etc.
Where do we store the build outputs?

An S3 cache is an obvious choice, but the store could get really expensive in certain cases.

For instance, if we allow users to build binaries from any arbitrary Nixpkgs commit, our stores could blow up to tens or hundreds of terrabytes. Or, even if we only allow users to build binaries from a single Nixpkgs commit, what if there happens to be some sort of dynamic attr path in Nixpkgs a user could use to keep building different derivations?

Another problem is that if we're using a builder we don't control (like nixbuild.net), we need to get the binary that has just been built from the builder machine to one of our own servers (in order to send it back to the user). But this brings up questions like how to cache binaries, whether or not to use the normal Nix protocols for fetching derivation outputs, when and where to run garbage collection so our web servers don't fill up, etc.
When, where, and what do we cache?

We probably need to keep track of when builds fail, so that we don't retry them. For instance, we don't want to repeatedly try to rebuild a huge piece of software like PyTorch if we know it will always fail halfway through.

This leads to a few interesting follow-up problems, like how do you tell an actual build failure from a transient network problem or hardware problem? Where and how do we cache build logs? How do you present an interface to an operator that would allow them to see why a build failed, and possibly retry it?
Funding?

We need to figure out how to fund the builders and S3 cache, since it could be somewhat expensive. One option might be to ask for funding from the community. Another option might be to find a company who wants to sponsor the project.

None of these problems are insurmountable, but when we got to the point where we were trying to productionize our proof-of-concept, we kept running into more and more of these types of problems. It felt like the longer we worked on it, the larger the gap became between our proof-of-concept and something production-ready.

If you're interested in more discussions on each of these problems, check out the issue tracker.

Other stuff in our lives came up, and we eventually lost steam on binplz. We were never able to make a fully production-ready service.

Conclusion

binplz turned out to be one of those projects where coming up with a proof-of-concept is relatively easy, but making it into a production-ready service was quite a challenge. None of the problems were that difficult on their own, but it was quite difficult to come up with an overall solution that didn't cost too much, and would be quick to implement.

I still think a service like binplz is a neat idea. It would really show off the flexibility and power of Nix and Nixpkgs. It could be a useful service if you need a certain binary in a pinch. There are a bunch of different directions you could take it in to do other cool things.

I hope someone decides to pick up this idea and drive it to completion! We would be very interested to see how you decide to tackle the above problems.

Footnotes

Well, at least any package that makes sense to build and link statically.↩︎
If you're looking for some way to auto-scale Nix builds, I'd highly recommend you take a look at nixbuild.net.

Rickard was very helpful answering all the questions we had, and even gave us a time-line for implementing additional features that we thought we might need from nixbuild.net. Overall, it was just a great experience.↩︎

tags: nixos