functor.tokyo -- Why PureNix?

Why PureNix?

2022-01-10

This is the third post in a series about PureNix. The previous post covers who would find PureNix easy to use.

PureNix started out half as a joke. This post explains why we started working on PureNix, and how it moved from a joke to something we are excited about.

The Idea

This all started during @adisbladis's Summer of Nix 2021 presentation about poetry2nix.

Briefly, Poetry is a Python build system, akin to Haskell's Cabal or Rust's Cargo. Poetry is fairly unremarkable; you define your project in a pyproject.toml file (like Cabal's .cabal files, or Cargo's Cargo.toml), and when you first build your project, it freezes all (transitive) dependencies in a poetry.lock file (like Cargo's Cargo.lock).

It seems that would make poetry2nix to Poetry what cabal2nix is to Cabal (or cargo2nix is to Cargo), but poetry2nix is special in an important way. poetry2nix works by using Nix's builtins.readFile to read the raw pyproject.toml and poetry.lock files from disk, and then uses builtins.fromTOML to parse the TOML into plain Nix values. It then uses this data to create Nix derivations for building all the dependencies, as well as the package itself.
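As a rough illustration of that mechanism (a sketch, not poetry2nix's actual code), the following Nix expression parses TOML with `builtins.fromTOML` and pulls out a few fields. The TOML text is inlined so the expression evaluates on its own; the package name, version, and dependency are invented for the example, and a real implementation would get the text from `builtins.readFile`.

```nix
# Sketch only: not poetry2nix's real code. The TOML text is inlined so the
# expression evaluates on its own; in practice it would come from
# builtins.readFile ./pyproject.toml (a hypothetical path).
let
  pyprojectText = ''
    [tool.poetry]
    name = "demo"
    version = "0.1.0"

    [tool.poetry.dependencies]
    requests = "^2.26"
  '';

  # builtins.fromTOML turns TOML text into an ordinary Nix attribute set.
  pyproject = builtins.fromTOML pyprojectText;
in
{
  name = pyproject.tool.poetry.name;
  version = pyproject.tool.poetry.version;
  # The dependency names are just the attribute names of the table.
  dependencies = builtins.attrNames pyproject.tool.poetry.dependencies;
}
```

Saved to a file, this can be evaluated with `nix-instantiate --eval --strict`. Everything happens at evaluation time, so nothing has to be built before the data is available.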

What's special about this is not so much what it does, but what it doesn't do. Because poetry2nix is implemented entirely in Nix and only uses Nix builtins, it never uses Import From Derivation (IFD). Up until this presentation, I thought that all translation layers like cabal2nix used IFD to take a native language lock file and transform it into a Nix derivation. I hadn't even considered that you could get away with not doing this.

Import From Derivation (IFD) and Haskell

A quick explanation of IFD:

You create a Nix derivation that outputs a .nix file.
You build this derivation, and the .nix file that is created ends up in the Nix store (since it is a build output).
Within the same run of Nix, you import the .nix file from its path in the Nix store.
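The three steps above can be sketched as a deliberately tiny example. It assumes `<nixpkgs>` is available on `NIX_PATH`, and the derivation name `generated.nix` and its contents are made up for illustration.

```nix
# Sketch only: a deliberately tiny Import From Derivation example.
# Assumes <nixpkgs> is on NIX_PATH; "generated.nix" is a made-up name.
let
  pkgs = import <nixpkgs> { };

  # Step 1: a derivation whose build output is itself a .nix file.
  generated = pkgs.runCommand "generated.nix" { } ''
    echo '{ greeting = "hello from the store"; }' > $out
  '';

  # Steps 2 and 3: importing the output path forces Nix to build
  # `generated` during evaluation, then evaluate the resulting file.
  result = import generated;
in
result.greeting
```

The key point is that evaluation has to pause for a build before it can continue, which is exactly what a pure-Nix approach like poetry2nix avoids.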

Check out the following two links for a more detailed introduction to IFD:

There is a widely-used function in Nixpkgs that uses IFD to build a Haskell package: haskellPackages.callCabal2nix. Roughly, callCabal2nix works by running a Haskell program to parse an input .cabal file and pull out all necessary information for building the Haskell package with Nix. IFD is necessary in this process because the only library for parsing a .cabal file is written in Haskell. Parsing a raw .cabal file directly within Nix would be quite difficult.

After hearing that poetry2nix doesn't require IFD, I started thinking about what would be necessary to write a callCabal2nix function that doesn't rely on IFD.

`callCabal2nix` Without IFD

In order to write a callCabal2nix function without relying on IFD, you would first need to read in a .cabal file with the Nix builtins.readFile function, then parse the raw .cabal file and pull out all the important information from within Nix.
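To give a feel for what this involves, here is a minimal sketch that handles only flat `field: value` lines using Nix builtins. It is nowhere near real .cabal syntax (no sections, conditionals, or multi-line fields), and the field values are invented. The text is inlined so the expression evaluates on its own; in practice it would come from `builtins.readFile`.

```nix
# Sketch only: handles flat "field: value" lines, nothing like real .cabal
# syntax (no sections, conditionals, or multi-line fields).
let
  # Inlined so the expression evaluates on its own; in practice this would be
  # builtins.readFile ./example.cabal (a hypothetical path).
  cabalText = ''
    name: example
    version: 0.1.0.0
    license: BSD-3-Clause
  '';

  # builtins.split interleaves matched groups (lists) with the text between
  # matches, so keep only the string pieces.
  lines = builtins.filter builtins.isString (builtins.split "\n" cabalText);

  # builtins.match returns null when the whole line does not match, or a list
  # of the capture groups when it does.
  parseLine = line: builtins.match "([A-Za-z-]+):[[:space:]]*(.*)" line;

  fields = builtins.filter (m: m != null) (map parseLine lines);
in
builtins.listToAttrs (map
  (m: {
    name = builtins.elemAt m 0;
    value = builtins.elemAt m 1;
  })
  fields)
```

Even this toy version leans on regular expressions and manual list handling, with no types to catch mistakes. Scaling it up to the full .cabal grammar by hand is where things get painful.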

The big difficulty in this process is parsing the .cabal file. If you wanted to parse a .cabal file with Nix, you would need to write a .cabal parser in Nix itself. This would be quite a challenge¹.

I was thinking that if I were going to write a callCabal2nix function that doesn't use IFD, I would want to write it in a Haskell-like language that provides features like type-checking, algebraic data types, pattern-matching, type classes, etc. This language would need to be compiled to Nix code, so that Nix can execute it.

My first thought was to write a Nix backend for PureScript. Users would be able to write PureScript code and compile it to Nix. This seemed like somewhat of a silly idea, so I decided to share it with my friend, Jonas Carpay.

Enter Jonas

Jonas is a Haskeller who is interested in compilers and programming languages, and he's a heavy Nix user. I thought that if there were anyone I could convince to work on this with me, it would be Jonas.

When I told Jonas about this, he surprisingly didn't think it was a crazy idea. After a little discussion, we came up with three potential approaches for making a Haskell-like language that compiles to Nix:

The alternative PureScript backend, as suggested above.
Use GHC's Core language as an intermediate representation, and translate that to Nix.

This approach would mean that the user would directly write a program in Haskell. Our compiler would use GHC to compile the Haskell program to GHC Core. Our compiler would then transpile this GHC Core to Nix.

The advantage of this approach is that the user would be able to use all of Haskell's features, even things like GADTs, type families, etc.

The disadvantage of this approach is that neither of us had ever really worked with GHC Core before. We weren't sure how hard it would be to translate Core into Nix, or what the consequences would be for the GHC Boot Libraries that are shipped with the compiler. We weren't sure of all the primitives exposed by GHC, or how these would translate to Nix.

I know there are compilers like GHCJS and Eta that hook into some step of GHC's compilation pipeline and target a different platform (JavaScript in the case of GHCJS, and the JVM in the case of Eta). But my impression of these projects is that they are quite complicated.

Write a DSL in Haskell that outputs Nix code when run.

Prior art here might be a project like hnix.

The disadvantage of this approach is that it would be somewhat difficult to bootstrap our new ecosystem. We would have to write our own standard library. If we went with an alternative PureScript backend, we could just rely on the PureScript standard library.

We decided to go with writing an alternative PureScript backend, hoping it would be the quickest choice for actually writing a callCabal2nix function that doesn't use IFD.

Starting on PureNix

Writing an alternative PureScript backend is surprisingly easy². This section gives a short introduction to what is necessary for writing an alternative backend.

The PureScript compiler defines a functional Core language³. An alternative PureScript backend is responsible for taking a functional Core Module and converting it into a module in your target language. PureNix has three main parts that do this conversion into Nix code:

A definition of an AST for Nix
A function for converting a PureScript Module into our Nix AST
A function for taking our Nix AST and converting to raw Nix code

That's all there is to it. PureScript's functional Core language translates quite nicely to Nix, so we didn't have too much trouble here. The only difficulty was how to encode PureScript's data types and pattern-matching in Nix. Jonas came up with a good solution for this.

Writing PureNix only took about a month. It is currently under 1,000 lines of code.⁴ The end result was much better than either of us had anticipated. PureNix ends up working out really well in practice. The Nix code it outputs is very similar to what you'd write by hand⁵.

With the PureNix compiler mostly finished, the next step was to port some PureScript standard libraries over to be used with PureNix.

Porting PureScript Libraries

Unlike a language like Haskell or Python, PureScript doesn't have a big "standard library"⁶. However, there is a set of about 60 PureScript libraries maintained under the purescript organization on GitHub (all the repositories with the purescript- prefix). This set of libraries is often thought of as the "standard library" for PureScript. When writing an alternative backend, the first step is porting some of these libraries to your new backend.

After getting the PureNix compiler mostly working, we started on the process of porting some of the above libraries to PureNix. This process consists of forking the repository and rewriting all the JavaScript FFI files to Nix (the PureScript source files can mostly be used as-is). This is somewhat annoying and time-consuming, but it is not particularly difficult. The libraries that have been ported work well when called from Nix. It almost feels like magic being able to call functions written in PureScript from Nix.

We have ported about 25 libraries so far. We worked on this on and off, and it took about 2 months. See this issue for the status of the remaining libraries. Feel free to jump in and help!

With some libraries ported, the next step was to actually start writing a version of callCabal2nix that doesn't need IFD!

No IFD

With a portion of the PureScript standard library available, writing a proof-of-concept .cabal parser was straightforward. The resulting project, cabal2nixWithoutIFD, currently only parses a small subset of the full .cabal file syntax, but in theory this approach should be extendable to work with a full .cabal file. It accomplishes the goal of parsing a .cabal file within Nix, without using IFD. The whole process ends up being quite similar to poetry2nix.

I plan to write a blog post about cabal2nixWithoutIFD in the future, but if you're interested, check out the README.md in the repo.

Conclusion

While PureNix started out half as a joke, it ended up working out much better than anticipated. This is mostly due to the similarity between PureScript's functional Core language and Nix. PureScript's standard libraries also work well when compiled to Nix. It is quite nice to effectively be writing Nix while still getting type-checking, algebraic data types, pattern-matching, and type classes. Being able to rely on PureScript's standard libraries is also convenient, since they provide a lot of functionality.

The whole PureNix project ended up working out really well, and we hope that other people will also be able to find a use for PureNix.