So far we have been discussing the mechanics of computing with lambda
expressions.
Our next topic is how to express "normal" program objects and
constructs using lambda expressions.
In particular, we'll consider how to encode:
First, let's consider one detail:
According to the definition given earlier, an abstraction (a term
of the form λx.M) defines a function of one argument (x).
What if we want functions of more than one argument?
To answer this question, let's think about how to define a
(non-pure) lambda expression to represent the "plus" function.
Normally, we think of plus as a function of two arguments;
i.e., its type is:
The type of this version of plus is:
First, let's consider how to encode a conditional expression
of the form: if P then A else B (i.e., the value of the
whole expression is either A or B, depending on the value of P).
We will represent this conditional expression using a lambda expression
of the form: COND P A B, where COND, P, A and B
are all lambda expressions.
In particular, COND is a function of 3 arguments that works by applying
P to A and B (i.e., P itself chooses A or B):
To make this definition work correctly, we must define the
representations of true
and false carefully (since the lambda expression P
that COND applies to its arguments A and B will reduce to either TRUE
or FALSE).
In particular, when TRUE is applied to a and b we want it to return a,
and when FALSE is applied to a and b we want it to return b.
Therefore, we will let TRUE be a function of two arguments that
ignores the second argument and returns the first argument, and we'll
let FALSE be a function of two arguments that
ignores the first argument and returns the second argument:
Now let's consider the Boolean operators and and or.
We'll use prefix notation (as we did for COND); e.g., A and B
will be represented as AND A B, where A and B are lambda expressions
that represent boolean expressions, and AND is the lambda expression
that represents the and operator.
Note that AND and OR need to be functions that take two (boolean)
arguments.
Note also that
given the expression A and B, if A is false, then
the value of the whole expression is false, while if A is
true, then the value of the whole expression is B
(and similarly for or).
In other words, AND A B must either reduce to FALSE (if A is FALSE)
or to B (if A is TRUE).
Recall that FALSE is a function of two arguments that returns the second,
while TRUE is a function of two arguments that returns the first.
So we can define AND and OR to apply their first argument A (which will
reduce to either TRUE or FALSE) to their second argument B and to the
appropriate boolean literal:
Now we'll consider how to encode LISP-style lists.
In LISP, a list is either (a) empty (nil), or (b) a pair: (item list).
Lists are built using the cons operator.
Think of cons as a function of two arguments (an item and a list) that
gives you back a list (the given one with the given item added to
the front).
Non-empty lists are "taken apart" using selector functions head and
tail:
Finally, the isempty function can be used to determine
whether a list is nil or non-nil.
Below are two examples of LISP-like functions that manipulate lists
(to show you how the list operators work).
The first computes the sum of the numbers in a given list L, and the
second creates a new list like the given one except that each item
is "doubled" (appears twice in a row).
Before we discuss implementing head and tail, let's
think about how to implement cons.
Recall that cons is a function that, when applied to two arguments
(an item and a list) returns a list.
Given our idea of having a list store its head and tail in its body,
it's easy to define cons:
Note that we're using TRUE and FALSE as shorthand for
λx.λy.x, and
λx.λy.y;
boolean values have nothing to do with our list implementation.
We still need a representation for nil, and a definition of the
isempty function.
Let's think about isempty first.
For ISEMPTY we want:
Here are some examples of lists:
Now let's try an example of some list operations:
We will consider two different ways to encode natural numbers.
In both cases we'll consider the following operations:
The first approach involves representing the number n using a list
of n x's:
Given this representation, here's how we implement the operations:
Write the lambda expression that represents: succ(0)
and do the beta reductions (make sure that your result is the
lambda expression that represents 1)
Our second technique for representing natural numbers is to
represent the number n as a lambda expression of the form:
Write the lambda expressions that represent: iszero(0),
iszero(1), and iszero(2),
and do the beta reductions (make sure that the results are
TRUE, FALSE, and FALSE).
Now let's consider the succ function.
For all numbers n, we want succ applied to n to
produce n+1.
That is, we want:
But f is not quite what we need for SUCC, because number n is not
xny, it's λx.λy.xny.
So we need to stick that "λx.λy" on to the front
of the result produced by f, defining SUCC as follows:
To complete our discussion of the second way to represent the natural
numbers, we need to define PRED.
Unfortunately, that definition is quite complicated, so we'll just
give the definition here (and the "interested reader" is welcome to
try it out and/or to think about why it makes sense).
The definition is from Denotational Semantics: The Scott-Strachey
Approach to Programming Language Theory by Joseph Stoy, and is
found on pageg 72:
In defining our encodings above, we have relied on intuition and
a few examples to convince ourselves that the definitions we gave
were correct.
A more rigorous approach would be to prove the correctness
of each encoding.
Doing so for all encodings would be a bit much, but we'll do it for
one example: the definition of ISZERO for the second way of representing
natural numbers.
Theorem: Our encoding of iszero is correct;
i.e., the representation of iszero(0) beta-reduces to TRUE, and for
all n > 0, the representation of iszero(n) beta-reduces to FALSE.
Proof by cases on n:
Case 1, n = 0: Proved in self-study problem 2.
Case 2, n = 1: Proved in self-study problem 2.
Case 3, n > 1:
We would like to be able to define recursive functions like factorial
in the lambda-calculus:
But we're not allowed to name functions in lambda calculus, so we can't
do this directly. In the previous notes on encoding natural numbers,
etc., we used names that represented
lambda-calculus expressions, such as "ISZERO", but these are better thought
of as shorthand for the corresponding lambda expressions (or macros that
are expanded into the corresponding lambda expressions).
We'd get an infinite "unfolding" if we treated "fact" as such a macro and
tried to expand it.
To understand recursion in lambda expressions, it helps to think about
equations like X = 4/X where we are to solve for X. In such cases, we can
test a potential solution by replacing X with some value;
for example:
It is also true that if the value x' is a solution, then:
Here we used a lambda expression in which the X on the right-hand side
was abstracted out, and then applied the expression
to the value x'.
We can use the same trick to define a recursive function.
For example, think of the definition of the factorial function given
above as an equation instead of a definition and abstract on "fact":
Note that the lambda expression takes an argument f, and gives you
a function that either returns 1 or returns the result of an
application of the given function.
Finding a value (a lambda expressions)
for fact' that solves this equation involves finding
a fixed-point of the function we created by abstracting.
Definition: A fixed point of a function F is a value
x such that x = F(x).
Here are some example functions and some of their fixed points:
An amazing fact is that in lambda-calculus, every
function has a fixed point, though it may not correspond to anything
"useful". (For example, the fixed point of λx.x+1 is a
lambda-expression that doesn't correspond to an integer.) The fixed
point may not have a normal form either (for recursive definitions),
but that's OK since normal forms are the lambda equivalent of
"answers" to computations and we don't expect a recursive definition
to be an answer.
Recall that our goal is to define recursive functions using lambda
expressions.
Our technique is to define an equation of the form
Solving an equation like that requires that we find a fixed point of
the function defined on the right-hand side by the outermost lambda
expression.
In general, given a function F, we need a way to produce a
lambda expression X such that X = (F)X.
The insight is as follows:
We need X to be an expression that can make a copy of itself that's ready
to be applied to the "n-1 case". This should keep happening
until the "n=0 case" is reached.
The form for X will be DD.
If we plug in (DD) for X we get:
What does it mean for lambda expression DD to be equal to lambda expression
(F)(DD)?
We haven't actually this discussed this question
(it is discussed below), but intuitively it seems
reasonable that they should be considered equal if one beta-reduces to
to the other.
We don't get to choose what F is (that's the abstraction of the recursive
function we're trying to define), so we can't define DD so that (F)(DD)
reduces to (DD).
However, we do get to define (DD), so let's try to do that so that
(DD) beta-reduces to (F)(DD).
To do that, we need to make F part of D; in particular we can define
D as:
Let's check this definition of D by actually trying it out (i.e., by
using it to replace each instance of D in (DD) -- the left-hand side
of the equation (DD) = (F)(DD) -- and verifying that the result
does beta-reduce to (F)(DD)):
It works!
Now we know how, given any function F, we can define a term X (of the form DD)
such that X is a fixed point of F.
But we can do more -- we can "automate" that process by defining
a lambda expression that, when applied to any function F, produces
a fixed point for F.
This fixed-point creator is called the Y combinator, and is defined
as follows:
Let's try it out to get a (non-recursive) definition of factorial;
i.e., we will apply Y to:
Here's what we get:
We can test this definition by applying it to the integer 3.
If it's correct, the whole thing should reduce to 3! = 6.
Here's the sequence of beta-reductions (for clarity, we start by using F
to represent the function to which we applied the Y combinator):
Summary: To define a recursive function "f = λx. ...f...f...", find a
closed-form solution (a lambda expression) as follows:
Overview
(int X int) → int
However, we can use currying (a technique defined by
Haskell Curry, having nothing to do with spicy food or dirty
horses) to define plus using only functions of one
argument.
The trick is to define plus as a function of one argument that,
when applied to that argument (the first number), produces a function
that, when applied to the second number produces the sum.
Note that when plus is applied to two numbers,
it produces their sum as expected.
int → (int → int)
Here's how we define it in lambda calculus:
λx.λy.x+y
And here's an example of an application:
(λx.λy.x+y) 3 2
→β
((λy.3+y) 2)
→β
(3+2) = 5
Encoding Conditionals and Boolean Values
COND == λp.λa.λb.p a b
(where == means "is defined to be").
TRUE == λx.λy.x
Now let's consider an example: COND TRUE M N.
Note that this expression should evaluate to M.
Let's see if it does (by substituting our definitions for COND and
TRUE, and evaluating the resulting expression).
The sequence of beta-reductions is shown below;
in each case, the redex about to be reduced is indicated by underlining
the formal parameter and the argument that will be substituted in
for that parameter.
FALSE == λx.λy.y
(λp.λa.λb.pab)(λx.λy.x)M N →β
(λa.λb.(λx.λy.x)ab)M N →β
(λb.(λx.λy.x)Mb)N →β
(λx.λy.x)MN →β
(λy.M)N →β
M
Encoding Boolean Operations
AND == λa.λb.a b FALSE
To see how these boolean operators work, consider the example: FALSE OR TRUE
(which should reduce to TRUE).
Here are the reductions:
OR == λa.λb.a TRUE b
(λa.λb.a TRUE b)
(λx.λy.y)
(λx.λy.x)
OR
FALSE
TRUE
→β (λb.(λx.λy.y) TRUE b)(λx.λy.x)
→β (λx.λy.y) TRUE (λx.λy.x)
= (λx.λy.y) (λx.λy.x) (λx.λy.x)
→β (λy.y)(λx.λy.x)
→β (λx.λy.x)
= TRUE
Encoding Lists
head: list → item (returns first item in given list)
tail: list → list (returns everything except first item)
function sum(L: list) {
if (isempty L) 0
else + (head L)(sum (tail L))
}
function double(L: list) {
if (isempty L) nil
else cons (head L)(cons (head L) (double (tail L)))
}
The way we'll define a non-nil list is as a function that "stores"
the head and tail of the list in its body.
Its argument will be a selector function (head or tail).
(We'll see in a minute how that permits us to implement the
selector functions themselves in a clever manner.)
So every non-nil list is of the form:
λs.s h t
where s is the selector and h and t are the head and tail of the list.
CONS == λh.λt.(λs.s h t)
Now let's think about the selectors.
We want head applied to list L to give us back the
head of the list (which is stored in L's body).
We can get that value if we apply the list itself (remember, it's
a function of the form λs.s h t) to a function s that
takes two arguments (h and t), and returns h.
Similarly, for tail we want to apply the list to a function s
that takes two arguments (h and t), and returns t.
Here are the appropriate definitions:
HEAD == λL.L TRUE
TAIL == λL.L FALSE
(ISEMPTY NIL) →β TRUE
and
(ISEMPTY L) →β FALSE
for a non-nil list L.
All non-nil lists are of the form λs.s h t so we really want:
(ISEMPTY (λs.s h t)) →β FALSE
So ISEMPTY needs to "feed" its list argument an appropriate
value for s that ignores its arguments h and t, and
just returns FALSE.
That's easy to define:
ISEMPTY == λL.L(λh.λt.FALSE)
Once we have this definition of ISEMPTY, we can decide how
to define NIL so that ISEMPTY works correctly when applied
to NIL, too (i.e., returns TRUE in that case).
In particular, we want:
((λL.L(λh.λt.FALSE))NIL) →β TRUE
So we need to define NIL as follows:
NIL == λx.TRUE
List
Encoding
nil
λx.TRUE
(a)
λs.s a NIL
(a,b)
(λs.s a (λs.s b NIL))
ISEMPTY (TAIL (CONS a NIL))
(the answer should be TRUE):
(λL.L(λh.λt.FALSE))
( (λL.L FALSE)
((λh.λt.λs.s h t)
a NIL) )
ISEMPTY
TAIL
CONS
→β (λL.L(λh.λt.FALSE)) ( (λL.L FALSE) ((λt.λs.s a t) NIL) )
→β (λL.L(λh.λt.FALSE)) ( (λL.L FALSE) (λs.s a NIL) )
→β (λL.L(λh.λt.FALSE)) ( (λs.s a NIL) FALSE )
→β (λL.L(λh.λt.FALSE)) (FALSE a NIL)
→β* (λL.L(λh.λt.FALSE)) NIL (since FALSE returns its 2nd arg)
→β NIL(λh.λt.FALSE)
= (λa.TRUE)(λh.λt.FALSE)
→β TRUE
Encoding Natural Numbers
Method #1
0
==
empty list
=
NIL
=
λx.TRUE
1
==
one element list
=
λs.s x NIL
2
==
two element list
=
λs.s x (λs.s x NIL)
etc.
Method #2
λx.λy.xn y
where xn y means x(x(x ... (xy))...), with n x's.
For example:
Now let's think about how to define the operations.
We want iszero applied to (the representation of) zero to
evaluate to true, and applied to anything other than zero to
evaluate to false.
Note that ZERO is a function of two arguments;
ISZERO needs to "get rid" of that function by applying it to
the appropriate two values so that the result is TRUE;
i.e., we want ISZERO to be of the form:
0
==
λx.λy.y
1
==
λx.λy.xy
2
==
λx.λy.x(xy)
3
==
λx.λy.x(x(xy))
etc.
λf.f _ _
What should the missing values be?
Well, ZERO is the function that returns its second argument,
and we want ISZERO applied to ZERO to evaluate to TRUE,
so the second argument better be TRUE.
The numbers other than ZERO are functions that apply their first
argument to the second argument some number of times, and for all of
those numbers we want the final result to be FALSE.
So the first argument needs to be a lambda term g such that
g applied to TRUE is FALSE; g applied to (g applied to TRUE) is
FALSE, etc.
The answer is actually very simple: make g be the function
that ignores its argument and returns FALSE:
g == λx.FALSE
And now we know that ISZERO should be defined like this:
λf.f(λx.FALSE)TRUE
(SUCC(λx.λy.xn y)) →β (λx.λy.xn+1 y)
First, think about what function f applied to (λx.λy.xn y)
yields (xn+1 y)?
To define that function we'll use what is becoming our standard trick:
define f so that it applies its argument (λx.λy.xn y)
to something.
Since (λx.λy.xn y) is a function of two arguments,
f needs to apply it to two arguments:
f == λa.a arg1 arg2
When f is applied to (λx.λy.xn y), arg1
will be substituted in for x, and arg2 for y;
i.e., we'll have:
arg1n arg2
and we want that to be
xn+1 y
So arg1 needs to be x, and arg2 needs to be xy (because
xn(xy) = xn+1y).
SUCC == λa.(λx.λy.a x (xy))
PRED == λk.(k(λp.λu.u(SUCC(p TRUE))(p TRUE))(λu.u ZERO ZERO))FALSE
Proof of correctness of ISZERO
In this case, the representation of iszero(n) is:
This term beta-reduces as follows (with the formal and the argument underlined
each time):
(λf.f(λa.FALSE)TRUE)
(λx.λy.xn y)
ISZERO
n
(λf.f(λa.FALSE)TRUE)(λx.λy.xn y) →β
Since n > 1 this is equal to:
(λx.λy.xn y)(λa.FALSE)TRUE →β
(λy.(λa.FALSE)n y)TRUE →β
(λa.FALSE)n (TRUE)
(λa.FALSE)((λa.FALSE)n-1 (TRUE))
But λa.FALSE applied to anything beta-reduces to FALSE,so
this whole expression beta-reduces to FALSE, and the proof is complete.
Defining Recursive Functions
fact = λx. if x==0 then 1 else x*fact(x-1)
2 = 4/2
x' = (λy.4/y)(x')
fact' = (λf.λx. if x==0 then 1 else x*f(x-1))(fact')
Function
Fixed Point
f(x) = 5
5
f(x) = x
all integers
f(x) = 4/x
2, -2
f(x) = x+1
none in the domain of integers
Finding fixed points
f' = (λf.λx. function body including an application of f)(f')
(DD) = (F)(DD)
λy.F(y y)
(λy.F(y y))
(λy.F(y y))
→β
F(
(λy.F(y y))
(λy.F(y y))
D
D
F
D
D
λf.(λy.f(y y))(λy.f(y y))
(λf.λx. if x==0 then 1 else x*f(x-1))
(λf.(λy.f(y y))(λy.f(y y)))(λf.λx. if x==0 then 1 else x*f(x-1))
((λf.(λy.f(y y))(λy.f(y y)))F)(3)
Note that when the base case was reached (the 3rd-to-last line), the
"else" part of the "if" was thrown away and the recursive "calls"
finished.
(λy.F(y y))(λy.F(y y))(3)
F((λy.F(y y))(λy.F(y y)))(3)
((λf.λx. if x==0 then 1 else x*f(x-1))((λy.F(y y))(λy.F(y y))))(3)
(λx. if x==0 then 1 else x*((λy.F(y y))(λy.F(y y)))(x-1))(3)
if 3==0 then 1 else 3*((λy.F(y y))(λy.F(y y)))(3-1)
3*(((λy.F(y y))(λy.F(y y)))(2))
3*(F((λy.F(y y))(λy.F(y y)))(2))
3*(((λf.λx. if x==0 then 1 else x*f(x-1))((λy.F(y y))(λy.F(y y))))(2))
3*((λx. if x==0 then 1 else x*((λy.F(y y))(λy.F(y y)))(x-1))(2))
3*(if 2==0 then 1 else 2*((λy.F(y y))(λy.F(y y)))(2-1))
3*(2*(((λy.F(y y))(λy.F(y y)))(1)))
3*(2*(F((λy.F(y y))(λy.F(y y)))(1)))
3*(2*(((λf.λx. if x==0 then 1 else x*f(x-1))((λy.F(y y))(λy.F(y y))))(1)))
3*(2*((λx. if x==0 then 1 else x*((λy.F(y y))(λy.F(y y)))(x-1))(1)))
3*(2*(if 1==0 then 1 else 1*((λy.F(y y))(λy.F(y y)))(1-1)))
3*(2*(1*(((λy.F(y y))(λy.F(y y)))(0))))
3*(2*(1*(F((λy.F(y y))(λy.F(y y)))(0))))
3*(2*(1*(((λf.λx. if x==0 then 1 else x*f(x-1))((λy.F(y y))(λy.F(y y))))(0))))
3*(2*(1*((λx. if x==0 then 1 else x*((λy.F(y y))(λy.F(y y)))(x-1))(0))))
3*(2*(1*(if 0==0 then 1 else 0*((λy.F(y y))(λy.F(y y)))(0-1))))
3*(2*(1*1))
6
Now you have a lambda expression that you can apply to any argument.
The result (after beta-reducing) will be the result you want: the
result of applying the recursive function to that argument.