So far we have been discussing the mechanics of computing with lambda expressions. Our next topic is how to express "normal" program objects and constructs using lambda expressions. In particular, we'll consider how to encode: booleans and conditional expressions, LISP-style lists, natural numbers, and recursive functions.
First, let's consider one detail: According to the definition given earlier, an abstraction (a term of the form λx.M) defines a function of one argument (x). What if we want functions of more than one argument?
To answer this question, let's think about how to define a (non-pure) lambda expression to represent the "plus" function. Normally, we think of plus as a function of two arguments; i.e., its type is: int × int → int. Since an abstraction defines a function of exactly one argument, we instead "curry" plus: plus = λa.λb. a+b, a function that takes the first operand a and returns a function waiting for the second operand b. The type of this version of plus is: int → (int → int).
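As a minimal sketch (not from the notes), here is the same currying idea in Python: `plus2` is the two-argument version, and `plus` is the curried one-argument-at-a-time version.

```python
# Two-argument plus: type (int, int) -> int
def plus2(a, b):
    return a + b

# Curried plus: type int -> (int -> int), i.e., lambda a. lambda b. a+b
plus = lambda a: lambda b: a + b

add3 = plus(3)        # partial application yields a one-argument function
print(plus(3)(4))     # 7
print(add3(4))        # 7
```

A practical payoff of currying is partial application: `plus(3)` is itself a useful function.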
First, let's consider how to encode a conditional expression of the form: if P then A else B (i.e., the value of the whole expression is either A or B, depending on the value of P). We will represent this conditional expression using a lambda expression of the form: COND P A B, where COND, P, A and B are all lambda expressions. In particular, COND is a function of 3 arguments that works by applying P to A and B (i.e., P itself chooses A or B): COND = λp.λa.λb. p a b
To make this definition work correctly, we must define the representations of true and false carefully (since the lambda expression P that COND applies to its arguments A and B will reduce to either TRUE or FALSE). In particular, when TRUE is applied to a and b we want it to return a, and when FALSE is applied to a and b we want it to return b. Therefore, we will let TRUE be a function of two arguments that ignores the second argument and returns the first argument, and we'll let FALSE be a function of two arguments that ignores the first argument and returns the second argument: TRUE = λx.λy.x and FALSE = λx.λy.y
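The definitions above can be sketched directly with Python lambdas (the string arguments are just illustrative placeholders):

```python
# Church booleans: TRUE returns its first argument, FALSE its second
TRUE  = lambda x: lambda y: x
FALSE = lambda x: lambda y: y

# COND applies the predicate p to a and b; p itself chooses
COND = lambda p: lambda a: lambda b: p(a)(b)

print(COND(TRUE)("then-branch")("else-branch"))   # then-branch
print(COND(FALSE)("then-branch")("else-branch"))  # else-branch
```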
Now let's consider the Boolean operators and and or. We'll use prefix notation (as we did for COND); e.g., A and B will be represented as AND A B, where A and B are lambda expressions that represent boolean expressions, and AND is the lambda expression that represents the and operator. Note that AND and OR need to be functions that take two (boolean) arguments. Note also that given the expression A and B, if A is false, then the value of the whole expression is false, while if A is true, then the value of the whole expression is B (and similarly for or). In other words, AND A B must either reduce to FALSE (if A is FALSE) or to B (if A is TRUE). Recall that FALSE is a function of two arguments that returns the second, while TRUE is a function of two arguments that returns the first. So we can define AND and OR to apply their first argument A (which will reduce to either TRUE or FALSE) to their second argument B and to the appropriate boolean literal: AND = λa.λb. a b FALSE and OR = λa.λb. a TRUE b
For example, here is the reduction of OR FALSE TRUE:

      OR FALSE TRUE
    = (λa.λb.a TRUE b) (λx.λy.y) (λx.λy.x)
    →_{β} (λb.(λx.λy.y) TRUE b) (λx.λy.x)
    →_{β} (λx.λy.y) TRUE (λx.λy.x)
    = (λx.λy.y) (λx.λy.x) (λx.λy.x)
    →_{β} (λy.y) (λx.λy.x)
    →_{β} λx.λy.x
    = TRUE
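A quick Python sketch of AND and OR (with a small `to_bool` decoder, my own addition, so the results print as ordinary booleans):

```python
TRUE  = lambda x: lambda y: x
FALSE = lambda x: lambda y: y

# AND a b: if a is TRUE the result is b, otherwise FALSE
AND = lambda a: lambda b: a(b)(FALSE)
# OR a b: if a is TRUE the result is TRUE, otherwise b
OR  = lambda a: lambda b: a(TRUE)(b)

# Decode a Church boolean into a Python bool for inspection
to_bool = lambda p: p(True)(False)

print(to_bool(OR(FALSE)(TRUE)))   # True, matching the reduction above
print(to_bool(AND(TRUE)(FALSE)))  # False
```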
Now we'll consider how to encode LISP-style lists. In LISP, a list is either (a) empty (nil), or (b) a pair: (item list). Lists are built using the cons operator. Think of cons as a function of two arguments (an item and a list) that gives you back a list (the given one with the given item added to the front).
Non-empty lists are "taken apart" using selector functions head and tail: head (cons x L) = x, and tail (cons x L) = L.
Finally, the isempty function can be used to determine whether a list is nil or non-nil.
Below are two examples of LISP-like functions that manipulate lists (to show you how the list operators work). The first computes the sum of the numbers in a given list L, and the second creates a new list like the given one except that each item is "doubled" (appears twice in a row).
    function sum(L: list) {
      if (isempty L) 0
      else + (head L) (sum (tail L))
    }

    function double(L: list) {
      if (isempty L) nil
      else cons (head L) (cons (head L) (double (tail L)))
    }

The way we'll define a non-nil list is as a function that "stores" the head and tail of the list in its body. Its argument will be a selector function (head or tail). (We'll see in a minute how that permits us to implement the selector functions themselves in a clever manner.) So every non-nil list is of the form: λs. s h t, where h is the head of the list and t is its tail.
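To show what the two LISP-like functions compute, here is a hedged transcription into Python using ordinary Python lists (the Church encoding of lists comes next in the notes):

```python
# sum: 0 for the empty list, else head plus the sum of the tail
def sum_list(L):
    if not L:                          # isempty L
        return 0
    return L[0] + sum_list(L[1:])      # + (head L) (sum (tail L))

# double: each item appears twice in a row in the result
def double(L):
    if not L:
        return []                      # nil
    return [L[0], L[0]] + double(L[1:])  # cons (head L)(cons (head L) ...)

print(sum_list([1, 2, 3]))   # 6
print(double([1, 2]))        # [1, 1, 2, 2]
```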
Before we discuss implementing head and tail, let's think about how to implement cons. Recall that cons is a function that, when applied to two arguments (an item and a list) returns a list. Given our idea of having a list store its head and tail in its body, it's easy to define cons: CONS = λh.λt.λs. s h t
Given a list of that form, the selectors simply hand the list the right selector function: HEAD = λL. L TRUE and TAIL = λL. L FALSE (applying a non-nil list λs.s h t to TRUE returns h; applying it to FALSE returns t). Note that we're using TRUE and FALSE as shorthand for λx.λy.x and λx.λy.y; boolean values have nothing to do with our list implementation.
We still need a representation for nil, and a definition of the isempty function. Let's think about isempty first. For ISEMPTY we want: ISEMPTY NIL to reduce to TRUE, and ISEMPTY (CONS h t) to reduce to FALSE for every h and t. Defining ISEMPTY = λL. L (λh.λt.FALSE) handles the non-nil case: applying a non-nil list λs.s h t to the selector λh.λt.FALSE yields FALSE. To make the nil case work, we let NIL = λx.TRUE, a function that ignores the selector it is given and returns TRUE.
Here are some examples of lists:
| List | Encoding |
|---|---|
| nil | λx.TRUE |
| (a) | λs.s a NIL |
| (a,b) | λs.s a (λs.s b NIL) |
Now let's try an example of some list operations: ISEMPTY (TAIL (CONS a NIL)), which should reduce to TRUE, since the tail of the one-element list (a) is empty.

      ISEMPTY (TAIL (CONS a NIL))
    = (λL.L(λh.λt.FALSE)) ( (λL.L FALSE) ((λh.λt.λs.s h t) a NIL) )
    →_{β} (λL.L(λh.λt.FALSE)) ( (λL.L FALSE) ((λt.λs.s a t) NIL) )
    →_{β} (λL.L(λh.λt.FALSE)) ( (λL.L FALSE) (λs.s a NIL) )
    →_{β} (λL.L(λh.λt.FALSE)) ( (λs.s a NIL) FALSE )
    →_{β} (λL.L(λh.λt.FALSE)) (FALSE a NIL)
    →_{β}* (λL.L(λh.λt.FALSE)) NIL          (since FALSE returns its 2nd arg)
    →_{β} NIL (λh.λt.FALSE)
    = (λx.TRUE) (λh.λt.FALSE)
    →_{β} TRUE
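The whole list encoding can be sketched in Python and the example above checked mechanically (`to_bool` is my own decoding helper, not part of the encoding):

```python
TRUE  = lambda x: lambda y: x
FALSE = lambda x: lambda y: y

CONS    = lambda h: lambda t: lambda s: s(h)(t)  # a list stores h and t in its body
HEAD    = lambda L: L(TRUE)                      # hand the list the selector TRUE
TAIL    = lambda L: L(FALSE)                     # hand the list the selector FALSE
NIL     = lambda s: TRUE                         # ignores the selector, returns TRUE
ISEMPTY = lambda L: L(lambda h: lambda t: FALSE)

to_bool = lambda p: p(True)(False)

lst = CONS('a')(NIL)                      # the list (a)
print(HEAD(lst))                          # a
print(to_bool(ISEMPTY(TAIL(lst))))        # True, matching the reduction above
print(to_bool(ISEMPTY(lst)))              # False
```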
We will consider two different ways to encode natural numbers. In both cases we'll consider the following operations: iszero, succ, and pred.
The first approach involves representing the number n using a list of n x's:
0 | == | empty list | = | NIL | = | λx.TRUE |
1 | == | one element list | = | λs.s x NIL | ||
2 | == | two element list | = | λs.s x (λs.s x NIL) | ||
etc. |
Given this representation, here's how we implement the operations: ISZERO = ISEMPTY, SUCC = λn. CONS x n (add one more x to the front of the list), and PRED = λn. TAIL n (remove one x).
Write the lambda expression that represents: succ(0) and do the beta reductions (make sure that your result is the lambda expression that represents 1)
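A hedged Python sketch of the list-of-x's encoding, which can be used to check the self-study problem (here X is an arbitrary item, and `to_bool` is a decoding helper of my own):

```python
TRUE  = lambda a: lambda b: a
FALSE = lambda a: lambda b: b
CONS    = lambda h: lambda t: lambda s: s(h)(t)
TAIL    = lambda L: L(FALSE)
NIL     = lambda s: TRUE
ISEMPTY = lambda L: L(lambda h: lambda t: FALSE)

X = object()                 # the (irrelevant) list item
ZERO   = NIL                 # 0 is the empty list
SUCC   = lambda n: CONS(X)(n)  # add one more X to the front
PRED   = TAIL                  # remove one X
ISZERO = ISEMPTY

to_bool = lambda p: p(True)(False)

one = SUCC(ZERO)                       # succ(0): the one-element list
print(to_bool(ISZERO(one)))            # False
print(to_bool(ISZERO(PRED(one))))      # True: pred(succ(0)) = 0
```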
Our second technique for representing natural numbers is to represent the number n as a lambda expression of the form: λx.λy.x^{n}y (i.e., n applications of x to y):
| Number | Encoding |
|---|---|
| 0 | λx.λy.y |
| 1 | λx.λy.x y |
| 2 | λx.λy.x(x y) |
| 3 | λx.λy.x(x(x y)) |

etc.
Given this representation, we can define ISZERO = λf. f (λa.FALSE) TRUE. Write the lambda expressions that represent: iszero(0), iszero(1), and iszero(2), and do the beta reductions (make sure that the results are TRUE, FALSE, and FALSE).
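The three cases of the self-study problem can be checked with a Python sketch of this encoding (again with a `to_bool` decoding helper of my own):

```python
TRUE  = lambda x: lambda y: x
FALSE = lambda x: lambda y: y

ZERO = lambda x: lambda y: y           # lambda x. lambda y. y
ONE  = lambda x: lambda y: x(y)        # lambda x. lambda y. x y
TWO  = lambda x: lambda y: x(x(y))     # lambda x. lambda y. x(x y)

# ISZERO = lambda f. f (lambda a. FALSE) TRUE
ISZERO = lambda f: f(lambda a: FALSE)(TRUE)

to_bool = lambda p: p(True)(False)
print(to_bool(ISZERO(ZERO)))  # True
print(to_bool(ISZERO(ONE)))   # False
print(to_bool(ISZERO(TWO)))   # False
```

For n = 0 the number ignores its first argument and returns TRUE; for n ≥ 1 the constant function λa.FALSE gets applied at least once and swallows everything.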
Now let's consider the succ function. For all numbers n, we want succ applied to n to produce n+1. That is, we want: SUCC (λx.λy.x^{n}y) to beta-reduce to λx.λy.x^{n+1}y. Since x^{n+1}y is just one more application of x, consider the function f = λw. x w, which applied to x^{n}y produces x^{n+1}y. But f is not quite what we need for SUCC, because number n is not x^{n}y, it's λx.λy.x^{n}y. We can recover x^{n}y by applying n to x and y (n x y →_{β}* x^{n}y), and we need to stick that "λx.λy" on to the front of the result produced by f, defining SUCC as follows: SUCC = λn.λx.λy. x (n x y)
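A Python sketch of SUCC, with a `to_int` decoder (my own helper, which counts the applications of x) to confirm the encoding:

```python
ZERO = lambda x: lambda y: y
# SUCC = lambda n. lambda x. lambda y. x (n x y)
SUCC = lambda n: lambda x: lambda y: x(n(x)(y))

# Decode a Church numeral by counting applications of x
to_int = lambda n: n(lambda k: k + 1)(0)

THREE = SUCC(SUCC(SUCC(ZERO)))
print(to_int(ZERO))    # 0
print(to_int(THREE))   # 3
```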
To complete our discussion of the second way to represent the natural numbers, we need to define PRED. Unfortunately, that definition is quite complicated, so we'll just give the definition here (and the "interested reader" is welcome to try it out and/or to think about why it makes sense). The definition is from Denotational Semantics: The Scott-Strachey Approach to Programming Language Theory by Joseph Stoy, and is found on page 72: PRED = λn.λx.λy. n (λg.λh. h (g x)) (λu.y) (λu.u)
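Rather than tracing the reductions by hand, the "interested reader" can try the standard predecessor out in Python (using the SUCC and `to_int` helper sketched above; note that in this encoding pred(0) comes out as 0):

```python
ZERO = lambda x: lambda y: y
SUCC = lambda n: lambda x: lambda y: x(n(x)(y))
to_int = lambda n: n(lambda k: k + 1)(0)   # decoder: count applications of x

# PRED = lambda n. lambda x. lambda y. n (lambda g. lambda h. h (g x)) (lambda u. y) (lambda u. u)
PRED = (lambda n: lambda x: lambda y:
        n(lambda g: lambda h: h(g(x)))(lambda u: y)(lambda u: u))

THREE = SUCC(SUCC(SUCC(ZERO)))
print(to_int(PRED(THREE)))   # 2
print(to_int(PRED(ZERO)))    # 0 (pred of zero is zero in this encoding)
```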
In defining our encodings above, we have relied on intuition and a few examples to convince ourselves that the definitions we gave were correct. A more rigorous approach would be to prove the correctness of each encoding. Doing so for all encodings would be a bit much, but we'll do it for one example: the definition of ISZERO for the second way of representing natural numbers.
Theorem: Our encoding of iszero is correct; i.e., the representation of iszero(0) beta-reduces to TRUE, and for all n > 0, the representation of iszero(n) beta-reduces to FALSE.
Proof by cases on n:
Case 1, n = 0: Proved in self-study problem 2.
Case 2, n = 1: Proved in self-study problem 2.
Case 3, n > 1:
      ISZERO n
    = (λf.f(λa.FALSE)TRUE) (λx.λy.x^{n}y)
    →_{β} (λx.λy.x^{n}y) (λa.FALSE) TRUE
    →_{β}* (λa.FALSE)^{n} TRUE
    = (λa.FALSE)^{n-1} ((λa.FALSE) TRUE)
    →_{β} (λa.FALSE)^{n-1} FALSE
    →_{β}* FALSE

Since n > 1, at least one application of (λa.FALSE) remains after the first; each such application ignores its argument and returns FALSE, so the whole expression reduces to FALSE.
We would like to be able to define recursive functions like factorial in the lambda-calculus: fact = λn. COND (ISZERO n) 1 (* n (fact (PRED n)))
But we're not allowed to name functions in lambda calculus, so we can't do this directly. In the previous notes on encoding natural numbers, etc., we used names that represented lambda-calculus expressions, such as "ISZERO", but these are better thought of as shorthand for the corresponding lambda expressions (or macros that are expanded into the corresponding lambda expressions). We'd get an infinite "unfolding" if we treated "fact" as such a macro and tried to expand it.
To understand recursion in lambda expressions, it helps to think about equations like X = 4/X where we are to solve for X. In such cases, we can test a potential solution by replacing X with some value; for example: replacing X with 2 gives 2 = 4/2, which is true, so 2 is a solution (as is -2); replacing X with 1 gives 1 = 4/1, which is false, so 1 is not a solution.
It is also true that if the value x' is a solution, then: x' = (λX. 4/X) x'.
Here we used a lambda expression in which the X on the right-hand side was abstracted out, and then applied the expression to the value x'.
We can use the same trick to define a recursive function. For example, think of the definition of the factorial function given above as an equation instead of a definition and abstract on "fact": fact' = (λf.λn. COND (ISZERO n) 1 (* n (f (PRED n)))) fact'
Note that the lambda expression takes an argument f, and gives you a function that either returns 1 or returns the result of an application of the given function.
Finding a value (a lambda expression) for fact' that solves this equation involves finding a fixed point of the function we created by abstracting.
Definition: A fixed point of a function F is a value x such that x = F(x).
Here are some example functions and some of their fixed points:
| Function | Fixed Point |
|---|---|
| f(x) = 5 | 5 |
| f(x) = x | all integers |
| f(x) = 4/x | 2, -2 |
| f(x) = x+1 | none in the domain of integers |
An amazing fact is that in lambda-calculus, every function has a fixed point, though it may not correspond to anything "useful". (For example, the fixed point of λx.x+1 is a lambda-expression that doesn't correspond to an integer.) The fixed point may not have a normal form either (for recursive definitions), but that's OK since normal forms are the lambda equivalent of "answers" to computations and we don't expect a recursive definition to be an answer.
Recall that our goal is to define recursive functions using lambda expressions. Our technique is to define an equation of the form X = (F) X, where F is the lambda expression obtained by abstracting on the recursive name (as we did for fact above).
Solving an equation like that requires that we find a fixed point of the function defined on the right-hand side by the outermost lambda expression. In general, given a function F, we need a way to produce a lambda expression X such that X = (F)X.
The insight is as follows: We need X to be an expression that can make a copy of itself that's ready to be applied to the "n-1 case". This should keep happening until the "n=0 case" is reached. The form for X will be DD.
If we plug in (DD) for X we get: (DD) = (F)(DD).
What does it mean for lambda expression DD to be equal to lambda expression (F)(DD)? We haven't actually discussed this question (it is discussed below), but intuitively it seems reasonable that they should be considered equal if one beta-reduces to the other. We don't get to choose what F is (that's the abstraction of the recursive function we're trying to define), so we can't define DD so that (F)(DD) reduces to (DD). However, we do get to define (DD), so let's try to do that so that (DD) beta-reduces to (F)(DD). To do that, we need to make F part of D; in particular we can define D as: D = λy. F (y y)
Let's check this definition of D by actually trying it out (i.e., by using it to replace each instance of D in (DD) -- the left-hand side of the equation (DD) = (F)(DD) -- and verifying that the result does beta-reduce to (F)(DD)):
    (λy.F(y y)) (λy.F(y y))   →_{β}   F( (λy.F(y y)) (λy.F(y y)) )
         D           D                        D           D

That is, DD beta-reduces in one step to F(DD).
It works! Now we know how, given any function F, we can define a term X (of the form DD) such that X is a fixed point of F. But we can do more -- we can "automate" that process by defining a lambda expression that, when applied to any function F, produces a fixed point for F. This fixed-point creator is called the Y combinator, and is defined as follows: Y = λf. (λy. f (y y)) (λy. f (y y))
Let's try it out to get a (non-recursive) definition of factorial; i.e., we will apply Y to: λf.λn. COND (ISZERO n) 1 (* n (f (PRED n)))
Here's what we get (writing F for the abstracted factorial function): FACT = Y F →_{β} (λy.F(y y)) (λy.F(y y)).
We can test this definition by applying it to the integer 3. If it's correct, the whole thing should reduce to 3! = 6. Here's the sequence of beta-reductions (for clarity, we start by using F to represent the function to which we applied the Y combinator):
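The same test can be run in Python, with one caveat worth hedging: Python evaluates arguments eagerly, so Curry's Y = λf.(λy.f(y y))(λy.f(y y)) would loop forever here. The sketch below uses the eta-expanded variant (often called the Z combinator), which delays the self-application and behaves like Y under applicative order:

```python
# Z combinator: lambda f. (lambda y. f (lambda v. y y v)) (lambda y. f (lambda v. y y v))
Z = lambda f: (lambda y: f(lambda v: y(y)(v)))(lambda y: f(lambda v: y(y)(v)))

# F: the abstracted (non-recursive) factorial from the notes,
# written with Python's built-in arithmetic and conditional
F = lambda fact: lambda n: 1 if n == 0 else n * fact(n - 1)

FACT = Z(F)       # a fixed point of F: a factorial function with no named recursion
print(FACT(3))    # 6
```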
Summary: To define a recursive function "f = λx. ...f...f...", find a closed-form solution (a lambda expression) as follows: first abstract on the name f to get F = λf.λx. ...f...f..., then apply the Y combinator; the closed-form solution is Y F.
Below are the rules of equality for lambda expressions. Note that rules 1 - 3 are required for any equivalence relation (it must be reflexive, symmetric, and transitive). Rule 4 says that if one lambda-expression alpha- or beta-reduces to another, then the two are equivalent. Rule 5 says that equal terms can be substituted on either side of an application, or as the body of a function, taking care not to mess up free variables.

1. M = M
2. If M = N, then N = M
3. If M = N and N = P, then M = P
4. If M →_{α} N or M →_{β} N, then M = N
5. If M = N, then M Z = N Z, Z M = Z N, and λx.M = λx.N
Theorem: For all M, Y M = M (Y M).

Proof: First reduce Y M:
Y M = (λf.(λy.f(y y))(λy.f(y y))) M →_{β} (λy.M(y y))(λy.M(y y)) →_{β} M((λy.M(y y))(λy.M(y y))); call this last term T.
Next reduce M(Y M): reducing the inner Y M one step gives M((λy.M(y y))(λy.M(y y))) = T.
Now we've shown that both YM and M(YM) reduce to the same term T (using a sequence of beta-reductions). By rule 4, YM=T and M(YM)=T. By rule 2, T=M(YM), and by rule 3, YM=T=M(YM). Thus YM = M(YM).
Y is Curry's fixed-point combinator, and though we've just shown that YM=M(YM), it is not the case that YM reduces to M(YM) using a sequence of beta-reductions. Turing used a different combinator T, which has the property that for all M, TM →_{β}* M(TM). What is T?