2023-05-11

Recently my friend Jonas Carpay has been teaching me about the Haskell package `bound`, and the fundamental problem it is trying to solve. `bound` gives a good solution for defining a lambda calculus in Haskell, specifically with how to differentiate between bound and free variables.

While playing around with the ideas from `bound`, I found a neat way of defining a datatype for a lambda calculus expression using something like a free monad.

This post explains a little background for `bound`, and then shows my own definition using the free-monad-like structure.

The Haskell code in this post is based on GHC-9.2, Stackage LTS-20.14.

## Lambda Calculus

Most Haskellers will be familiar with lambda calculus. Even if you've never explicitly studied it, it will seem familiar to you just from having used Haskell.

An example lambda calculus expression might look like the following:

``\x -> \y -> a (x y)``

Lambda calculus has variables, function application, and lambda functions: all things you are familiar with from Haskell.

If you wanted to write a data type in Haskell to represent a lambda calculus expression, what might it look like? You might start with something like this:

``````data ExpTooSimp
= ExpTooSimpVar String
-- ^ This refers to a variable, like @x@.
| ExpTooSimpApp ExpTooSimp ExpTooSimp
-- ^ This refers to a function application, like @f x@.
| ExpTooSimpLam String ExpTooSimp
-- ^ This refers to a lambda function, like @\x -> _@.``````

Our above example lambda calculus expression might look like the following in Haskell:

``````ExpTooSimpLam "x"
( ExpTooSimpLam "y"
( ExpTooSimpApp
(ExpTooSimpVar "a")
(ExpTooSimpApp (ExpTooSimpVar "x") (ExpTooSimpVar "y"))
)
)``````

One generalization we can make is turning the `String` into a parameter:

``````data ExpSimp a
= ExpSimpVar a
-- ^ This refers to a variable, like @x@.
| ExpSimpApp (ExpSimp a) (ExpSimp a)
-- ^ This refers to a function application, like @f x@.
| ExpSimpLam a (ExpSimp a)
-- ^ This refers to a lambda function, like @\x -> _@.``````

Then, instead of having `ExpTooSimp`, we have `ExpSimp String`.

This datatype is certainly easy to understand, but it leaves a little to be desired.

## Bound vs Free Variables

One thing that the above Haskell datatype doesn't address is our desire to differentiate between bound and free variables.

For instance, in the expression:

``\x -> \y -> a (x y)``

the `x` and `y` are bound variables, while the `a` is a free variable.

Could we change our datatype to take this bound-vs-free information into account?

The article Bound by Edward Kmett shows a bunch of different ways to express this in Haskell. The article works towards explaining a specific technique, and why it is better than most other techniques.

The general technique the article works towards is referred to as Bird and Paterson. This technique was originally introduced in the first half of the De Bruijn notation as a nested datatype paper. The `bound` Haskell package then improves on this technique in a few ways, and wraps it up in a library.

The general idea from the Bird and Paterson technique is to take our simple expression datatype:

``````data ExpSimp a
= ExpSimpVar a
-- ^ This refers to a variable, like @x@.
| ExpSimpApp (ExpSimp a) (ExpSimp a)
-- ^ This refers to a function application, like @f x@.
| ExpSimpLam a (ExpSimp a)
-- ^ This refers to a lambda function, like @\x -> _@.``````

and rewrite it like this:

``````data Exp a
= ExpVar a
-- ^ This refers to a variable, like @x@.
| ExpApp (Exp a) (Exp a)
-- ^ This refers to a function application, like @f x@.
| ExpLam (Exp (Maybe a))
-- ^ This refers to a lambda function, like @\x -> _@.``````

The `ExpVar` and `ExpApp` cases haven't changed, but the `ExpLam` case needs some explaining.

First off, in the new `ExpLam`, where has the variable name gone? Why isn't this something like `ExpLam a (Exp a)`?

The insight is that in `ExpLam (Exp (Maybe a))`, we're recursing on `Exp`, but on every recursion, we add a new layer of `Maybe` to `a`. Each layer of `Maybe` refers to a variable being bound by a given lambda. This is a little tricky to understand, and it will help to look at a few examples.

Using `Exp String`, here's a top-level free variable without any lambdas involved.

top-level free variable `a` `ExpVar "a"`

This should be straightforward. There are no `Maybe`s involved here (since there are no uses `ExpLam`).

How about the two cases of a simple, single lambda expression: one with a free variable, and one with a bound variable?

bound variable `\x -> x` `ExpLam (ExpVar Nothing)`
free variable `\x -> a` `ExpLam (ExpVar (Just "a"))`

You can see that within one layer of `ExpLam`, `ExpVar` now takes a type of `Maybe String`. `Just` refers to a free variable, while `Nothing` refers to the closest bound variable.

This is pretty neat. We now have a way to differentiate free and bound variables. Bound variables don't even need to carry around a variable name (once a variable is bound, it doesn't really matter what its name is)!

What does this look like with more layers of lambdas? Here are a few more examples:

var bound to inner lambda `\x -> \y -> y` `ExpLam (ExpLam (ExpVar Nothing))`
var bound to outer lambda `\x -> \y -> x` `ExpLam (ExpLam (ExpVar (Just Nothing)))`
free var `\x -> \y -> a` `ExpLam (ExpLam (ExpVar (Just (Just "a"))))`

You can see here that within two layers of lambdas, `ExpVar` now takes a type of `Maybe (Maybe String)`. When adding each layer of lambda, we have to add a new layer of `Maybe`.

With two layers of lambda (and two corresponding layers of `Maybe`), you can see that `Nothing` refers to the closest binding lambda, `Just Nothing` refers to the next closest binding lambda, and `Just (Just v)` refers to a free variable.

While this is certainly clever, why do we prefer `Exp` over `ExpSimp`?

## Monad and Foldable for Exp

`Exp` is nice because it has a straightforward `Monad`, `Foldable`, and `Traversable` instance that allows us to manipulate free (non-bound) variables.

Here's what these instances look like:

``````data Exp a
= ExpVar a
| ExpApp (Exp a) (Exp a)
| ExpLam (Exp (Maybe a))
deriving stock (Functor, Foldable, Traversable)

-- A trivial Applicative instance
instance Applicative Exp where pure = ExpVar ; (<*>) = ap

(>>=) :: Exp a -> (a -> Exp b) -> Exp b
e >>= k =
case e of
ExpVar a -> k a
ExpApp f a -> ExpApp (f >>= k) (a >>= k)
ExpLam l -> ExpLam (l >>= traverse k)``````

Let's look at how we can use each of these instances in turn. First, here's an example expression we can play around with:

``````exampleExp1 :: Exp String
exampleExp1 =
ExpLam \$
ExpApp
(ExpVar (Just "a"))
(ExpLam \$
ExpApp
(ExpVar (Just Nothing))
(ExpVar (Just (Just "b"))
)``````

This corresponds to the following Lambda calculus expression:

``\x -> a (\y -> x b)``

First, let's look at the `Functor` instance for `Exp`. The `Functor` instance gives us the ability to modify free variables. Here's an example of replacing free variables with their ASCII codes:

``````> fmap (ord . head :: String -> Int) exampleExp1 :: Exp Int
ExpLam \$
ExpApp
(ExpVar (Just 97))
(ExpLam \$
ExpApp
(ExpVar (Just Nothing))
(ExpVar (Just (Just 98))
)``````

This result corresponds to a lambda expression that would look something like the following1:

``\x -> 97 (\y -> x 98)``

Note that the bound `x` variable is not affected in any way, only the free `a` and `b` variables.

The `Monad` instance gives us a way to substitute free variables. For instance, in our running example, let's say we want to substitute in the identity function in place of the free variable `a`.

In our example `\x -> a (\y -> x b)`, we want to substitute `\z -> z` in place of `a`. The result should look like:

``\x -> (\z -> z) (\y -> x b)``

Let's start by writing a function to do the substitution:

``````substituteIdForA :: String -> Exp String
substituteIdForA v
| v == "a"  = ExpLam (ExpVar Nothing)
| otherwise = ExpVar v``````

Let's use this with `bind`:

``````> exampleExp1 >>= substituteIdForA :: Exp String
ExpLam \$
ExpApp
(ExpLam (ExpVar Nothing))
(ExpLam \$
ExpApp
(ExpVar (Just Nothing))
(ExpVar (Just (Just "b"))
)``````

You can see how the `ExpVar (Just "a")` case has been replaced by the new lambda expression.

Finally, let's look at `Foldable`2. This gives us a way to pull out all the free variables3:

``````> foldMap (:[]) exampleExp1 :: [String]
["a", "b"]``````

This is a nice way to operate over all the free variables in an expression at once.

## Takeaway for the Bird and Paterson Technique

The takeaway here is that once you come up with your expression data type:

``````data Exp a
= ExpVar a
| ExpApp (Exp a) (Exp a)
| ExpLam (Exp (Maybe a))``````

you can automatically derive `Functor`, `Foldable`, and `Traversable`. The `Applicative` instance is trivial. All you really need to write is the `Monad` instance, but it is straightforward.

Despite being easy to write, these instances give you quite a lot of power for manipulating free variables in your expressions. This is pretty nice!4

## Push it Further

While this is certainly nice, I started wondering if there was any way to push it further. Is there some formulation where you don't even have to write the `Monad` instance?

The first thing that came to my mind was a free-monad-like construction. With the normal approach to free monads for eDSLs, all you have to do is create a datatype and derive `Functor` for it. You then wrap it up in `Free` and you get the Monad instance... well, for free.

Could it work to just pass `Exp` to `Free`? Maybe something like the following?

``````data Free f a
= Pure a
| Free (f (Free f a))

This almost looks like it would work, but you'll notice that the `ExpFBadLam` data constructor is incorrect. Compare to `Exp`:

``````data Exp a
= ExpVar a
| ExpApp (Exp a) (Exp a)
| ExpLam (Exp (Maybe a))``````

In `ExpLam`, we're not recursing on `Maybe (Exp a)`, we're recursing on `Exp (Maybe a)`. As far as I can tell, it is not possible to express this type of recursive datatype with `Free`5.

Let's rewrite `Exp` to make this a little more obvious:

``````data ExpTran a
= ExpTranVar a
| ExpTranApp (IdentityT ExpTran a) (IdentityT ExpTran a)
| ExpTranLam (MaybeT ExpTran a)``````

Here, instead of recursing directly in `Exp`, we've defined all recursion to happen through the monad transformers `IdentityT` and `MaybeT`. This `ExpTran` datatype is isomorphic to `Exp`, but it helps to show why it is not possible to define `Exp` using `Free` plus some functor.

Is there some sort of `Free`-like structure that we can use to define `Exp`?

## `FFree` and `ExpF` for `Exp`

After playing around with this a bunch, I came up with the following construction:

``````data ExpF f a
= ExpFApp (IdentityT f a) (IdentityT f a)
| ExpFLam (MaybeT f a)

data FFree f a
= FPure a
| FFree (f (FFree f) a)``````

Here's `ExpTran` and `Free` for an easy comparison:

``````data ExpTran a
= ExpTranVar a
| ExpTranApp (IdentityT ExpTran a) (IdentityT ExpTran a)
| ExpTranLam (MaybeT ExpTran a)

data Free f a
= Pure a
| Free (f (Free f a))``````

Let's see if we can write `Functor`, `Applicative`, and `Monad` for `FFree`. We may need the assistance of an instance like `Functor` for `ExpF`, but let's try without and see how far we get.

First, `Functor` for `FFree f`:

``````instance Functor (FFree f) where
fmap :: (a -> b) -> FFree f a -> FFree f b
fmap g = \case
FPure a -> FPure (g a)
FFree ex -> FFree _``````

Hmm, we're stuck on the `FFree` case. What is the type of `_`?

``_ :: f (FFree f) b``

What do we have in the environment that is relevant?

``````ex :: f (FFree f) a

g :: a -> b``````

It seems like we need `f` to be a `Functor`:

``theFunctionWeNeedForFmap :: (a -> b) -> f (FFree f) a -> f (FFree f) b``

Let's try writing the `Monad` instance for `FFree f`:

``````instance Monad (FFree f) where
(>>=) :: FFree f a -> (a -> FFree f b) -> FFree f b
m >>= k =
case m of
FPure a -> k a
FFree ex -> FFree _``````

We're stuck on the `FFree` case again. What is the type of `_` this time?

``_ :: f (FFree f) b``

What do we have in the environment that is relevant?

``````ex :: f (FFree f) a

k :: a -> FFree f b``````

This is a little different than the `Functor` case, but it looks like we need a function like the following:

``theFunctionWeNeedForBind :: f (FFree f) a -> (a -> FFree f b) -> f (FFree f) b``

We might even be able to generalize this a to:

``theFunctionWeNeed :: Monad h => f h a -> (a -> h b) -> f h b``

In our case, `f` will be `ExpF`, and `h` will be `FFree ExpF`. Fully specialized, this would look like:

``````theFunctionWeNeedSpecial
:: ExpF (FFree ExpF) a -> (a -> FFree ExpF b) -> ExpF (FFree ExpF) b``````

## The `Bound` Typeclass

Let's wrap this function up in a typeclass6:

``````class Bound f where
(>>>=) :: Monad h => f h a -> (a -> h b) -> f h b``````

We'll need an instance for `ExpF` (since `f` will eventually become `ExpF` and `h` will be `FFree ExpF`).

But first, lets confirm we can use this typeclass to write `Functor` and `Monad` for `FFree`:

``````instance Bound f => Functor (FFree f) where
fmap :: (a -> b) -> FFree f a -> FFree f b
fmap f = \case
FPure a -> FPure (f a)
FFree ex -> FFree \$ ex >>>= (pure . f)

instance Bound f => Applicative (FFree f) where pure = FPure ; (<*>) = ap

instance Bound f => Monad (FFree f) where
(>>=) :: FFree f a -> (a -> FFree f b) -> FFree f b
m >>= k =
case m of
FPure a -> k a
FFree ex -> FFree \$ ex >>>= k``````

Looks like it works!

Now for our `ExpF` instance of `Bound`:

``````instance Bound ExpF where
(>>>=) :: Monad h => ExpF h a -> (a -> h b) -> ExpF h b
ex >>>= f = case ex of
ExpFApp ida idb -> ExpFApp (ida >>= lift . f) (idb >>= lift . f)
ExpFLam may -> ExpFLam \$ may >>= lift . f``````

## What have we gained?

So what have we gained here? Let's review the relevant datatypes and instances. First, `Exp`:

``````data Exp a
= ExpVar a
| ExpApp (Exp a) (Exp a)
| ExpLam (Exp (Maybe a))

instance Functor Exp
instance Applicative Exp

We have `Functor`, `Applicative`, and `Monad` for `Exp`.

Now `Bound`, `ExpF`, and `FFree`:

``````class Bound f where
(>>>=) :: Monad h => f h a -> (a -> h b) -> f h b

data ExpF f a
= ExpFApp (IdentityT f a) (IdentityT f a)
| ExpFLam (MaybeT f a)

instance Bound ExpF

data FFree f a
= FPure a
| FFree (f (FFree f) a)

instance Bound f => Functor (FFree f) where
instance Bound f => Applicative (FFree f) where
instance Bound f => Monad (FFree f) where``````

Here are the big ideas:

1. `FFree ExpF` is isomorphic to `Exp`.
2. `FFree` and `Bound` could be wrapped up in a library, similar to free.
3. All an end-user would have to provide is the datatype `ExpF` and the `Bound ExpF` instance. The end-user would not have to write the `Functor`, `Applicative`, or `Monad` instances.
4. `FFree` and `Bound` are sort of like `Free` and `Functor`, except that GHC can automatically derive `Functor` in most cases. The end-user would have to manually write the `Bound ExpF` instance.

## More Generalization

It is possible to generalize this a little more. You can pull out the `MaybeT` and `IdentityT` transformers used in `ExpF`, and define `Bound` for any `MonadTrans`. Here's what we originally had:

``````data ExpF f a
= ExpFApp (IdentityT f a) (IdentityT f a)
| ExpFLam (MaybeT f a)``````

This can be generalized to:

``````data ExpTransF i m f a
= ExpTransFApp (i f a) (i f a)
| ExpTransFLam (m f a)``````

The `Bound` instance for `ExpTransF` then becomes7:

``````instance
) => Bound (ExpTransF i m) where
(>>>=) :: Monad h => ExpTransF i m h a -> (a -> h b) -> ExpTransF i m h b
ex >>>= f = case ex of
ExpTransFApp ida idb -> ExpTransFApp (ida >>= lift . f) (idb >>= lift . f)
ExpTransFLam l -> ExpTransFLam \$ l >>= lift . f``````

Now, we have the following isomorphisms:

• `ExpTransF IdentityT MaybeT` is isomorphic to `ExpF`.
• `FreeF (ExpTransF IdentityT MaybeT)` is isomorphic to both `FreeF ExpF` and `Exp`.

Alternatively, you can actually define `Bound` instances for `MaybeT` and `IdentityT`, and call out to them in the implementation of the `Bound` instance for `ExpTransF`:

``````instance Bound IdentityT where
(>>>=) :: Monad h => IdentityT h a -> (a -> h b) -> IdentityT h b
(IdentityT m) >>>= f = IdentityT \$ m >>= f

instance Bound MaybeT where
(>>>=) :: Monad h => MaybeT h a -> (a -> h b) -> MaybeT h b
(MaybeT m) >>>= f = MaybeT \$ m >>= traverse f

instance (Bound i, Bound m) => Bound (ExpF i m) where
(>>>=) :: Monad h => ExpF i m h a -> (a -> h b) -> ExpF i m h b
ex >>>= f = case ex of
ExpFApp ida idb -> ExpFApp (ida >>>= f) (idb >>>= f)
ExpFLam l -> ExpFLam \$ l >>>= f``````

## Conclusion

In practice, is this `FFree`, `Bound`, and `ExpF` approach good?

In my opinion, using `FFree`, `Bound`, and `ExpF` likely doesn't save you that much work over just writing a `Monad` instance for `Exp`. And `show`ing a value of `FFree ExpF` is harder to understand than `Exp`, since there are many "extra" constructors.

If there was a `FFree` library, it might contain a few nice helper functions, but my hunch is that it wouldn't save you too much work over just using `Exp` and writing them yourself.

However, if you want to use `ExpTransF` and frequently change the monad transformer you are working with, maybe `FFree` and `Bound` would save you a little work?

I'd be interested in hearing from anyone who has used a similar approach, or possibly wants to collaborate on a library in this space8.

## Footnotes

1. It's another question entirely whether an operation like this would mean anything semantically for the underlying lambda calculus. This is mostly just an example of using `fmap`!↩︎

2. I'm not going to discuss `Traversable`, but it should be straightforward to understand if you understand the `Foldable` instance.↩︎

3. Note that `(:[])` is the `a -> [a]` singleton list construction function. Otherwise known as `pure` for the list monad.↩︎

4. There are a few other advantages for the Bird and Paterson technique, You can find them in the Bound article.↩︎

5. I haven't formally proven this, but I played around with it a bunch and wasn't able to get it to work. I plan on brushing off my dusty Coq skills and trying coming up with an actual proof of this.↩︎

6. This is actually the `Bound` typeclass from `bound`. I stumbled on this without realizing it was already defined in `bound`, but it was very nice to know I wasn't just pulling some random function out of the air.↩︎

7. The quantified constraints make this look a little complicated, but I expect they will go away starting with transformers-0.6.0.0, available with GHC-9.6, since `MonadTrans` has gained these exact quantified constraints.↩︎

8. The issue tracker for the `bound` library does have an open issue about people trying things in this same space.↩︎