2023-05-11
Recently my friend Jonas Carpay has been teaching me about the Haskell package bound
, and the fundamental problem it is trying to solve. bound
gives a good solution for defining a lambda calculus in Haskell, specifically with how to differentiate between bound and free variables.
While playing around with the ideas from bound
, I found a neat way of defining a datatype for a lambda calculus expression using something like a free monad.
This post explains a little background for bound
, and then shows my own definition using the free-monad-like structure.
The Haskell code in this post is based on GHC-9.2, Stackage LTS-20.14.
Lambda Calculus
Most Haskellers will be familiar with lambda calculus. Even if you've never explicitly studied it, it will seem familiar to you just from having used Haskell.
An example lambda calculus expression might look like the following:
\x -> \y -> a (x y)
Lambda calculus has variables, function application, and lambda functions: all things you are familiar with from Haskell.
If you wanted to write a data type in Haskell to represent a lambda calculus expression, what might it look like? You might start with something like this:
data ExpTooSimp
= ExpTooSimpVar String
-- ^ This refers to a variable, like @x@.
| ExpTooSimpApp ExpTooSimp ExpTooSimp
-- ^ This refers to a function application, like @f x@.
| ExpTooSimpLam String ExpTooSimp
-- ^ This refers to a lambda function, like @\x -> _@.
Our above example lambda calculus expression might look like the following in Haskell:
ExpTooSimpLam "x"
( ExpTooSimpLam "y"
( ExpTooSimpApp
(ExpTooSimpVar "a")
(ExpTooSimpApp (ExpTooSimpVar "x") (ExpTooSimpVar "y"))
)
)
One generalization we can make is turning the String
into a parameter:
data ExpSimp a
= ExpSimpVar a
-- ^ This refers to a variable, like @x@.
| ExpSimpApp (ExpSimp a) (ExpSimp a)
-- ^ This refers to a function application, like @f x@.
| ExpSimpLam a (ExpSimp a)
-- ^ This refers to a lambda function, like @\x -> _@.
Then, instead of having ExpTooSimp
, we have ExpSimp String
.
This datatype is certainly easy to understand, but it leaves a little to be desired.
Bound vs Free Variables
One thing that the above Haskell datatype doesn't address is our desire to differentiate between bound and free variables.
For instance, in the expression:
\x -> \y -> a (x y)
the x
and y
are bound variables, while the a
is a free variable.
Could we change our datatype to take this bound-vs-free information into account?
The article Bound by Edward Kmett shows a bunch of different ways to express this in Haskell. The article works towards explaining a specific technique, and why it is better than most other techniques.
The general technique the article works towards is referred to as Bird and Paterson. This technique was originally introduced in the first half of the De Bruijn notation as a nested datatype paper. The bound
Haskell package then improves on this technique in a few ways, and wraps it up in a library.
The general idea from the Bird and Paterson technique is to take our simple expression datatype:
data ExpSimp a
= ExpSimpVar a
-- ^ This refers to a variable, like @x@.
| ExpSimpApp (ExpSimp a) (ExpSimp a)
-- ^ This refers to a function application, like @f x@.
| ExpSimpLam a (ExpSimp a)
-- ^ This refers to a lambda function, like @\x -> _@.
and rewrite it like this:
data Exp a
= ExpVar a
-- ^ This refers to a variable, like @x@.
| ExpApp (Exp a) (Exp a)
-- ^ This refers to a function application, like @f x@.
| ExpLam (Exp (Maybe a))
-- ^ This refers to a lambda function, like @\x -> _@.
The ExpVar
and ExpApp
cases haven't changed, but the ExpLam
case needs some explaining.
First off, in the new ExpLam
, where has the variable name gone? Why isn't this something like ExpLam a (Exp a)
?
The insight is that in ExpLam (Exp (Maybe a))
, we're recursing on Exp
, but on every recursion, we add a new layer of Maybe
to a
. Each layer of Maybe
refers to a variable being bound by a given lambda. This is a little tricky to understand, and it will help to look at a few examples.
Using Exp String
, here's a top-level free variable without any lambdas involved.
lambda calculus | Haskell | |
---|---|---|
top-level free variable | a |
ExpVar "a" |
This should be straightforward. There are no Maybe
s involved here (since there are no uses ExpLam
).
How about the two cases of a simple, single lambda expression: one with a free variable, and one with a bound variable?
lambda calculus | Haskell | |
---|---|---|
bound variable | \x -> x |
ExpLam (ExpVar Nothing) |
free variable | \x -> a |
ExpLam (ExpVar (Just "a")) |
You can see that within one layer of ExpLam
, ExpVar
now takes a type of Maybe String
. Just
refers to a free variable, while Nothing
refers to the closest bound variable.
This is pretty neat. We now have a way to differentiate free and bound variables. Bound variables don't even need to carry around a variable name (once a variable is bound, it doesn't really matter what its name is)!
What does this look like with more layers of lambdas? Here are a few more examples:
lambda calculus | Haskell | |
---|---|---|
var bound to inner lambda | \x -> \y -> y |
ExpLam (ExpLam (ExpVar Nothing)) |
var bound to outer lambda | \x -> \y -> x |
ExpLam (ExpLam (ExpVar (Just Nothing))) |
free var | \x -> \y -> a |
ExpLam (ExpLam (ExpVar (Just (Just "a")))) |
You can see here that within two layers of lambdas, ExpVar
now takes a type of Maybe (Maybe String)
. When adding each layer of lambda, we have to add a new layer of Maybe
.
With two layers of lambda (and two corresponding layers of Maybe
), you can see that Nothing
refers to the closest binding lambda, Just Nothing
refers to the next closest binding lambda, and Just (Just v)
refers to a free variable.
While this is certainly clever, why do we prefer Exp
over ExpSimp
?
Monad and Foldable for Exp
Exp
is nice because it has a straightforward Monad
, Foldable
, and Traversable
instance that allows us to manipulate free (non-bound) variables.
Here's what these instances look like:
data Exp a
= ExpVar a
| ExpApp (Exp a) (Exp a)
| ExpLam (Exp (Maybe a))
deriving stock (Functor, Foldable, Traversable)
-- A trivial Applicative instance
instance Applicative Exp where pure = ExpVar ; (<*>) = ap
instance Monad Exp where
(>>=) :: Exp a -> (a -> Exp b) -> Exp b
e >>= k =
case e of
ExpVar a -> k a
ExpApp f a -> ExpApp (f >>= k) (a >>= k)
ExpLam l -> ExpLam (l >>= traverse k)
Let's look at how we can use each of these instances in turn. First, here's an example expression we can play around with:
exampleExp1 :: Exp String
exampleExp1 =
ExpLam $
ExpApp
(ExpVar (Just "a"))
(ExpLam $
ExpApp
(ExpVar (Just Nothing))
(ExpVar (Just (Just "b"))
)
This corresponds to the following Lambda calculus expression:
\x -> a (\y -> x b)
First, let's look at the Functor
instance for Exp
. The Functor
instance gives us the ability to modify free variables. Here's an example of replacing free variables with their ASCII codes:
> fmap (ord . head :: String -> Int) exampleExp1 :: Exp Int
ExpLam $
ExpApp
(ExpVar (Just 97))
(ExpLam $
ExpApp
(ExpVar (Just Nothing))
(ExpVar (Just (Just 98))
)
This result corresponds to a lambda expression that would look something like the following1:
\x -> 97 (\y -> x 98)
Note that the bound x
variable is not affected in any way, only the free a
and b
variables.
The Monad
instance gives us a way to substitute free variables. For instance, in our running example, let's say we want to substitute in the identity function in place of the free variable a
.
In our example \x -> a (\y -> x b)
, we want to substitute \z -> z
in place of a
. The result should look like:
\x -> (\z -> z) (\y -> x b)
Let's start by writing a function to do the substitution:
substituteIdForA :: String -> Exp String
substituteIdForA v
| v == "a" = ExpLam (ExpVar Nothing)
| otherwise = ExpVar v
Let's use this with bind
:
> exampleExp1 >>= substituteIdForA :: Exp String
ExpLam $
ExpApp
(ExpLam (ExpVar Nothing))
(ExpLam $
ExpApp
(ExpVar (Just Nothing))
(ExpVar (Just (Just "b"))
)
You can see how the ExpVar (Just "a")
case has been replaced by the new lambda expression.
Finally, let's look at Foldable
2. This gives us a way to pull out all the free variables3:
> foldMap (:[]) exampleExp1 :: [String]
["a", "b"]
This is a nice way to operate over all the free variables in an expression at once.
Takeaway for the Bird and Paterson Technique
The takeaway here is that once you come up with your expression data type:
you can automatically derive Functor
, Foldable
, and Traversable
. The Applicative
instance is trivial. All you really need to write is the Monad
instance, but it is straightforward.
Despite being easy to write, these instances give you quite a lot of power for manipulating free variables in your expressions. This is pretty nice!4
Push it Further
While this is certainly nice, I started wondering if there was any way to push it further. Is there some formulation where you don't even have to write the Monad
instance?
The first thing that came to my mind was a free-monad-like construction. With the normal approach to free monads for eDSLs, all you have to do is create a datatype and derive Functor
for it. You then wrap it up in Free
and you get the Monad instance... well, for free.
Could it work to just pass Exp
to Free
? Maybe something like the following?
data Free f a
= Pure a
| Free (f (Free f a))
data ExpFBad r
= ExpFBadApp r r
| ExpFBadLam (Maybe r)
type ExpBad = Free ExpFBad
This almost looks like it would work, but you'll notice that the ExpFBadLam
data constructor is incorrect. Compare to Exp
:
In ExpLam
, we're not recursing on Maybe (Exp a)
, we're recursing on Exp (Maybe a)
. As far as I can tell, it is not possible to express this type of recursive datatype with Free
5.
Let's rewrite Exp
to make this a little more obvious:
data ExpTran a
= ExpTranVar a
| ExpTranApp (IdentityT ExpTran a) (IdentityT ExpTran a)
| ExpTranLam (MaybeT ExpTran a)
Here, instead of recursing directly in Exp
, we've defined all recursion to happen through the monad transformers IdentityT
and MaybeT
. This ExpTran
datatype is isomorphic to Exp
, but it helps to show why it is not possible to define Exp
using Free
plus some functor.
Is there some sort of Free
-like structure that we can use to define Exp
?
FFree
and ExpF
for Exp
After playing around with this a bunch, I came up with the following construction:
data ExpF f a
= ExpFApp (IdentityT f a) (IdentityT f a)
| ExpFLam (MaybeT f a)
data FFree f a
= FPure a
| FFree (f (FFree f) a)
Here's ExpTran
and Free
for an easy comparison:
data ExpTran a
= ExpTranVar a
| ExpTranApp (IdentityT ExpTran a) (IdentityT ExpTran a)
| ExpTranLam (MaybeT ExpTran a)
data Free f a
= Pure a
| Free (f (Free f a))
Let's see if we can write Functor
, Applicative
, and Monad
for FFree
. We may need the assistance of an instance like Functor
for ExpF
, but let's try without and see how far we get.
First, Functor
for FFree f
:
instance Functor (FFree f) where
fmap :: (a -> b) -> FFree f a -> FFree f b
fmap g = \case
FPure a -> FPure (g a)
FFree ex -> FFree _
Hmm, we're stuck on the FFree
case. What is the type of _
?
What do we have in the environment that is relevant?
It seems like we need f
to be a Functor
:
Let's try writing the Monad
instance for FFree f
:
instance Monad (FFree f) where
(>>=) :: FFree f a -> (a -> FFree f b) -> FFree f b
m >>= k =
case m of
FPure a -> k a
FFree ex -> FFree _
We're stuck on the FFree
case again. What is the type of _
this time?
What do we have in the environment that is relevant?
This is a little different than the Functor
case, but it looks like we need a function like the following:
We might even be able to generalize this a to:
In our case, f
will be ExpF
, and h
will be FFree ExpF
. Fully specialized, this would look like:
The Bound
Typeclass
Let's wrap this function up in a typeclass6:
We'll need an instance for ExpF
(since f
will eventually become ExpF
and h
will be FFree ExpF
).
But first, lets confirm we can use this typeclass to write Functor
and Monad
for FFree
:
instance Bound f => Functor (FFree f) where
fmap :: (a -> b) -> FFree f a -> FFree f b
fmap f = \case
FPure a -> FPure (f a)
FFree ex -> FFree $ ex >>>= (pure . f)
instance Bound f => Applicative (FFree f) where pure = FPure ; (<*>) = ap
instance Bound f => Monad (FFree f) where
(>>=) :: FFree f a -> (a -> FFree f b) -> FFree f b
m >>= k =
case m of
FPure a -> k a
FFree ex -> FFree $ ex >>>= k
Looks like it works!
Now for our ExpF
instance of Bound
:
instance Bound ExpF where
(>>>=) :: Monad h => ExpF h a -> (a -> h b) -> ExpF h b
ex >>>= f = case ex of
ExpFApp ida idb -> ExpFApp (ida >>= lift . f) (idb >>= lift . f)
ExpFLam may -> ExpFLam $ may >>= lift . f
What have we gained?
So what have we gained here? Let's review the relevant datatypes and instances. First, Exp
:
data Exp a
= ExpVar a
| ExpApp (Exp a) (Exp a)
| ExpLam (Exp (Maybe a))
instance Functor Exp
instance Applicative Exp
instance Monad Exp
We have Functor
, Applicative
, and Monad
for Exp
.
Now Bound
, ExpF
, and FFree
:
class Bound f where
(>>>=) :: Monad h => f h a -> (a -> h b) -> f h b
data ExpF f a
= ExpFApp (IdentityT f a) (IdentityT f a)
| ExpFLam (MaybeT f a)
instance Bound ExpF
data FFree f a
= FPure a
| FFree (f (FFree f) a)
instance Bound f => Functor (FFree f) where
instance Bound f => Applicative (FFree f) where
instance Bound f => Monad (FFree f) where
Here are the big ideas:
FFree ExpF
is isomorphic toExp
.FFree
andBound
could be wrapped up in a library, similar to free.- All an end-user would have to provide is the datatype
ExpF
and theBound ExpF
instance. The end-user would not have to write theFunctor
,Applicative
, orMonad
instances. FFree
andBound
are sort of likeFree
andFunctor
, except that GHC can automatically deriveFunctor
in most cases. The end-user would have to manually write theBound ExpF
instance.
More Generalization
It is possible to generalize this a little more. You can pull out the MaybeT
and IdentityT
transformers used in ExpF
, and define Bound
for any MonadTrans
. Here's what we originally had:
This can be generalized to:
The Bound
instance for ExpTransF
then becomes7:
instance
( forall x. Monad x => Monad (i x)
, MonadTrans i
, forall x. Monad x => Monad (m x)
, MonadTrans m
) => Bound (ExpTransF i m) where
(>>>=) :: Monad h => ExpTransF i m h a -> (a -> h b) -> ExpTransF i m h b
ex >>>= f = case ex of
ExpTransFApp ida idb -> ExpTransFApp (ida >>= lift . f) (idb >>= lift . f)
ExpTransFLam l -> ExpTransFLam $ l >>= lift . f
Now, we have the following isomorphisms:
ExpTransF IdentityT MaybeT
is isomorphic toExpF
.FreeF (ExpTransF IdentityT MaybeT)
is isomorphic to bothFreeF ExpF
andExp
.
Alternatively, you can actually define Bound
instances for MaybeT
and IdentityT
, and call out to them in the implementation of the Bound
instance for ExpTransF
:
instance Bound IdentityT where
(>>>=) :: Monad h => IdentityT h a -> (a -> h b) -> IdentityT h b
(IdentityT m) >>>= f = IdentityT $ m >>= f
instance Bound MaybeT where
(>>>=) :: Monad h => MaybeT h a -> (a -> h b) -> MaybeT h b
(MaybeT m) >>>= f = MaybeT $ m >>= traverse f
instance (Bound i, Bound m) => Bound (ExpF i m) where
(>>>=) :: Monad h => ExpF i m h a -> (a -> h b) -> ExpF i m h b
ex >>>= f = case ex of
ExpFApp ida idb -> ExpFApp (ida >>>= f) (idb >>>= f)
ExpFLam l -> ExpFLam $ l >>>= f
Conclusion
In practice, is this FFree
, Bound
, and ExpF
approach good?
In my opinion, using FFree
, Bound
, and ExpF
likely doesn't save you that much work over just writing a Monad
instance for Exp
. And show
ing a value of FFree ExpF
is harder to understand than Exp
, since there are many "extra" constructors.
If there was a FFree
library, it might contain a few nice helper functions, but my hunch is that it wouldn't save you too much work over just using Exp
and writing them yourself.
However, if you want to use ExpTransF
and frequently change the monad transformer you are working with, maybe FFree
and Bound
would save you a little work?
I'd be interested in hearing from anyone who has used a similar approach, or possibly wants to collaborate on a library in this space8.
Footnotes
It's another question entirely whether an operation like this would mean anything semantically for the underlying lambda calculus. This is mostly just an example of using
fmap
!↩︎I'm not going to discuss
Traversable
, but it should be straightforward to understand if you understand theFoldable
instance.↩︎Note that
(:[])
is thea -> [a]
singleton list construction function. Otherwise known aspure
for the list monad.↩︎There are a few other advantages for the Bird and Paterson technique, You can find them in the Bound article.↩︎
I haven't formally proven this, but I played around with it a bunch and wasn't able to get it to work. I plan on brushing off my dusty Coq skills and trying coming up with an actual proof of this.↩︎
This is actually the
Bound
typeclass frombound
. I stumbled on this without realizing it was already defined inbound
, but it was very nice to know I wasn't just pulling some random function out of the air.↩︎The quantified constraints make this look a little complicated, but I expect they will go away starting with transformers-0.6.0.0, available with GHC-9.6, since
MonadTrans
has gained these exact quantified constraints.↩︎The issue tracker for the
bound
library does have an open issue about people trying things in this same space.↩︎
tags: haskell