In this assignment, you'll manipulate abstract syntax trees of lambda-calculus terms in two different forms. You should implement the following functions:
Lambda.to_debruijn, which converts traditional lambda-calculus terms to De Bruijn-indexed terms.
DeBruijn.beta_lor, which performs the leftmost, outermost beta reduction in a De Bruijn-indexed term.
In traditional lambda notation, two expressions may have the same essential structure but be technically different because of the names of their bound variables. That is, two expressions may be equal up to alpha conversion but not syntactically identical. De Bruijn indices eliminate this problem: if two traditional lambda-calculus terms can be alpha-converted to syntactically identical terms, then their corresponding De Bruijn-indexed terms are syntactically identical.
The binding depth of a variable is the number of lambdas
between that variable and the lambda to which it is bound, including
the binding lambda.
So, in \x.\y.\z.xyzy
, the binding depths of the variables
are 3, 2, and 1 for x
, y
, and z
,
respectively. (We're using \
to represent lambdas so that
we don't have to worry about handling Unicode in everyone's browsers,
editors, and terminals.)
A De Bruijn-indexed term replaces the names of variables with their
binding depth, and thus removes the need to name variables at
all. Thus, \x.\y.\z.xyzy
is equivalent to the De
Bruijn-indexed term \\\3 2 1 2
. For a more complicated
example, \x.\y.\f.f(\x.x)(\z.y)
is \\\1(\1)(\3)
with De Bruijn indices. Notice that the
occurrences of f
and x
have the same De Bruijn
index, but they are in different contexts, and so they are bound to
different lambdas.
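To make the earlier point about alpha-equivalence concrete, here is a minimal sketch in OCaml. The two types are local stand-ins for the named and De Bruijn-indexed term types (the constructor names NVar, NLambda, and NApply are assumptions made for this example only), and the De Bruijn encodings are written by hand rather than computed.

```ocaml
(* Local copies of the two term shapes, for illustration only. *)
type named = NVar of char | NLambda of char * named | NApply of named * named
type indexed = Var of int | Lambda of indexed | Apply of indexed * indexed

(* \x.x and \y.y differ only in the name of the bound variable. *)
let id_x = NLambda ('x', NVar 'x')
let id_y = NLambda ('y', NVar 'y')

(* Hand-written De Bruijn encodings of the same two terms: both are \1. *)
let id_x_db = Lambda (Var 1)
let id_y_db = Lambda (Var 1)

let () =
  (* Structural equality distinguishes the named terms ... *)
  Printf.printf "named equal:   %b\n" (id_x = id_y);
  (* ... but not the De Bruijn-indexed ones. *)
  Printf.printf "indexed equal: %b\n" (id_x_db = id_y_db)
```

OCaml's structural equality (=) compares the trees node by node, so it plays the role of "syntactically identical" here.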
De Bruijn indices simplify reductions by eliminating the renaming usually required to avoid variable capture. However, beta-reduction is still more complicated than simply replacing some variables in the left-hand side of an application with the term in the right-hand side, as you'll shortly learn.
If you're working from a CSL machine, go to some suitable working directory and run
/u/a/w/aws/public/html/courses/cs704-code/asn1/grab
This will create the Prog1
directory, and populate it with the code you'll need - symbolic links to the files you should not change, and local copies of the files that you should change.
If you've never used OCaml, you'll need to learn some of the language. Chapters 1 and 2 of the OCaml Manual form a reasonable introduction to the language. Neither the skeleton code nor the code you write needs to handle OCaml's object-oriented syntax, labeled parameters, or polymorphic variants. You won't need to write functor modules, either, but you may want to use them - OCaml's standard maps and sets are implemented as functors.
I strongly recommend playing with the ocaml
interactive toplevel to help learn the language (ocamlc is the batch compiler, not an interactive shell). Invoke it as rlwrap ocaml
- this interposes the input functions from the Readline library, which gives you a much nicer interactive shell.
To build the code, run ./build
in the Prog1
directory. This will run ocamlbuild
, producing four executables: ToDeBruijnCheck.byte
, ToDeBruijnRepl.byte
, BetaLorCheck.byte
, and BetaLorRepl.byte
; it will put a bunch of OCaml object files in the _build
directory.
These programs each drive your code in various ways. The .byte
extension on these programs indicates that they're compiled OCaml bytecode. The ToDeBruijn
programs read traditional lambda-calculus terms and write equivalent De Bruijn-indexed terms; they're thin wrappers around the Lambda.to_debruijn
function. The BetaLor
programs read De Bruijn-indexed terms and write those terms after a single beta reduction in normal order; they're thin wrappers around the DeBruijn.beta_lor
function. The Check
programs read one term per file; these are intended for automated testing. The Repl
programs read one term per line, and respond immediately; these are intended for interactive use. As with the ocaml toplevel
, these programs are more pleasant if you run them with rlwrap
.
To clean away the built files, run ocamlbuild -clean
.
The check
program is what we'll use to grade your work. It builds your code, and then runs ToDeBruijnCheck.byte
and BetaLorCheck.byte
against every test case in the Tests
directory. You can (and should!) run this yourself to see if your code handles everything we expect it to handle. Run this before you turn in your code! The Tests
directory contains the test files that check
uses. You may, of course, define your own test cases.
Input that you give to ToDeBruijnRepl.byte
on a single line,
input that you feed to ToDeBruijnCheck.byte
in an entire file,
and the contents of .lam
files should be in the following
(BNF)
syntax, where var
is any single, lowercase letter:
<term> ::= <term> <term>
         | \ <var> . <term>
         | ( <term> )
         | <var>
Whitespace is optional between any two elements of the grammar. As
in the usual notation, application is left-associative, and
application has higher precedence than abstraction.
Thus, abcd
means the same thing
as ((ab)c)d
, and \x.\y.y\z.yz
means the same
thing as \x.(\y.(y (\z.(yz))))
.
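The left-associativity rule can be sketched in OCaml as a left fold over the arguments. The type below is a local copy of the named-term shape, and apply_all is a hypothetical helper introduced only for this example; it is not part of the skeleton code.

```ocaml
(* Local copy of the named-term shape, for illustration only. *)
type expr = Var of char | Lambda of char * expr | Apply of expr * expr

(* Hypothetical helper: apply [f] to each argument in turn,
   nesting the applications to the left. *)
let apply_all (f : expr) (args : expr list) : expr =
  List.fold_left (fun acc arg -> Apply (acc, arg)) f args

(* abcd, which the grammar reads as ((ab)c)d *)
let abcd = apply_all (Var 'a') [Var 'b'; Var 'c'; Var 'd']

let () =
  assert (abcd = Apply (Apply (Apply (Var 'a', Var 'b'), Var 'c'), Var 'd'))
```

A left fold is a natural fit here because each new argument wraps everything built so far.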
Input that you give to BetaLorRepl.byte
on a single line,
input that you feed to BetaLorCheck.byte
in an entire file,
and the contents of .db
files should be in
the following
(BNF) syntax, where var
is any number:
<term> ::= <term> <term>
         | \ <term>
         | ( <term> )
         | <var>
Whitespace is optional between most elements of the grammar,
but must occur between two
adjacent var
s. Otherwise, the parser would be unable to
distinguish between, e.g., 1 1
, a one applied to another
one, and 11
, an eleven. Again, application is left-associative,
and application has higher precedence than abstraction.
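The whitespace rule follows from reading numbers as maximal runs of digits. The sketch below is a hypothetical stand-alone helper (numbers_of_string is an assumed name, and this is not the lexer shipped with the assignment) that extracts only the numeric tokens from a string, ignoring everything else.

```ocaml
(* Hypothetical helper, not the assignment's lexer: collect the numeric
   tokens of a string by reading maximal runs of decimal digits. *)
let numbers_of_string (s : string) : int list =
  let n = String.length s in
  let is_digit c = c >= '0' && c <= '9' in
  let rec go i acc =
    if i >= n then List.rev acc
    else if is_digit s.[i] then begin
      (* Advance j to the first position past this run of digits. *)
      let j = ref i in
      while !j < n && is_digit s.[!j] do incr j done;
      go !j (int_of_string (String.sub s i (!j - i)) :: acc)
    end
    else go (i + 1) acc
  in
  go 0 []

let () =
  assert (numbers_of_string "1 1" = [1; 1]);  (* two variables *)
  assert (numbers_of_string "11" = [11])      (* one variable  *)
```

Without the space, nothing in the input separates the two digits, so a maximal-run reader has no choice but to produce a single token.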
When you're writing your code, you should only have to worry about a few files. As is usual in OCaml, .mli
files describe the interface to a module (like a header in C), and .ml
files implement that interface.
The only modules you should need to think much about are the Lambda
, DeBruijn
, and IO
modules.
Lambda Module
The Lambda module has the following signature:
type expr =
    Var of char
  | Lambda of char * expr
  | Apply of expr * expr
exception Empty
val to_debruijn: expr -> DeBruijn.expr
The Lambda.expr
type is the type of abstract syntax
trees for traditional lambda-calculus terms. Var
is an
expression that consists of a single variable, Apply(a,b)
is the
application of a
to b
,
and Lambda(c,e)
is a lambda term in which the variable
c
is bound. For instance,
\x.\y.xyx
is encoded as:
Lambda('x',
  Lambda('y',
    Apply(
      Apply( Var 'x', Var 'y'),
      Var 'x')))
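As a small illustration of how this encoding is consumed, the sketch below renders a named term as a fully parenthesized string by pattern matching on its constructors. It is a standalone example: the type is a local copy of the one in the signature above, and to_string is a hypothetical helper, not the IO.print_lam function the skeleton provides.

```ocaml
(* Local copy of the named-term type, for illustration only. *)
type expr = Var of char | Lambda of char * expr | Apply of expr * expr

(* Hypothetical helper: print every lambda and application with
   explicit parentheses, so the tree structure is unambiguous. *)
let rec to_string (e : expr) : string =
  match e with
  | Var c -> String.make 1 c
  | Lambda (c, body) -> Printf.sprintf "(\\%c.%s)" c (to_string body)
  | Apply (a, b) -> Printf.sprintf "(%s %s)" (to_string a) (to_string b)

(* The encoding of \x.\y.xyx from the text. *)
let example =
  Lambda ('x', Lambda ('y', Apply (Apply (Var 'x', Var 'y'), Var 'x')))

let () = print_endline (to_string example)
(* prints (\x.(\y.((x y) x))) *)
```

Most functions over these trees have this shape: one match arm per constructor, with a recursive call for each subterm.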
Lambda.Empty
is an exception that the lambda-term parser throws when it receives empty input; you don't need to worry about it.
Lambda.to_debruijn: Lambda.expr -> DeBruijn.expr
is
the first function that you should implement in this assignment. In
the file Lambda.ml
, it is currently implemented trivially
and incorrectly - there is just enough there that it type-checks when
you try to compile it. Per its signature, to_debruijn
takes a lambda term t
as input, and returns an
equivalent De Bruijn-indexed term. When you implement this, be careful to
correctly implement lexical scopes: \x.(\x.x)x
is \(\1)1
, and \x.\y.\x.xy
is \\\1 2
.
In to_debruijn
, you should treat free variables as if
they were all bound just one level outside the entire expression. So,
the term "a" translates to "1",
and "\x.xy\z.w" translates to "\1 2 \3".2
When you're done with Lambda.ml
, it should contain the
type of expr
, the declaration of the
exception Empty
, and your implementation
of to_debruijn
. If you remove any of these pieces, the
project won't compile. On the other hand, it is entirely fine to define
more values and types in Lambda.ml
- in particular, you
may find it useful to define auxiliary functions.
DeBruijn Module
The DeBruijn module has the following signature:
type expr =
    Var of int
  | Lambda of expr
  | Apply of expr * expr
exception Empty
val beta_lor: expr -> expr option
DeBruijn.expr
is the type of abstract syntax trees for
De Bruijn-indexed terms. Again, Var
is an expression
that consists of a single variable, Apply(a,b)
is the
application of a
to b
, and Lambda
e
is a lambda term that introduces a new binding into the
context of expression e
. For example, the
term \\2 1 2
is encoded as:
Lambda(
  Lambda(
    Apply(
      Apply( Var 2, Var 1),
      Var 2)))
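One way to get comfortable with this encoding is to write a small traversal that tracks how many lambdas enclose each variable. The sketch below checks whether a term is closed, that is, whether every index refers to an enclosing Lambda. The type is a local copy of the one in the signature above; closed_at and is_closed are hypothetical helper names, and the check assumes indices start at 1, as in the examples in this document.

```ocaml
(* Local copy of the De Bruijn term type, for illustration only. *)
type expr = Var of int | Lambda of expr | Apply of expr * expr

(* [closed_at depth e] holds when every variable in [e] refers to one of
   the [depth] lambdas that enclose [e], or to a lambda inside [e]. *)
let rec closed_at (depth : int) (e : expr) : bool =
  match e with
  | Var n -> n >= 1 && n <= depth
  | Lambda body -> closed_at (depth + 1) body
  | Apply (a, b) -> closed_at depth a && closed_at depth b

let is_closed (e : expr) : bool = closed_at 0 e

let () =
  (* \\2 1 2 is closed: both indices stay within the two lambdas. *)
  assert (is_closed (Lambda (Lambda (Apply (Apply (Var 2, Var 1), Var 2)))));
  (* \1 2 is not closed: index 2 reaches past the only lambda. *)
  assert (not (is_closed (Lambda (Apply (Var 1, Var 2)))))
```

The depth parameter grows by one each time the traversal passes under a Lambda, which is the same bookkeeping the notion of binding depth relies on.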
DeBruijn.Empty
is an exception that the De Bruijn parser throws when it receives empty input.
Again, you shouldn't need to worry about it.
DeBruijn.beta_lor: DeBruijn.expr -> DeBruijn.expr option
is the second function you should implement in this assignment. It takes, as input, a De Bruijn-indexed expression: let's call it e
.
If e
contains a beta-redex, then beta_lor
finds the leftmost-outermost redex3 and performs that beta reduction. If we call the reduced expression out
, then beta_lor
returns Some(out)
. On the other hand, if e
does not contain a beta-redex, then beta_lor
has no expression to return, so it returns None
.4
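A function with an option-returning shape like this is convenient to drive in a loop: keep stepping until the function returns None. The sketch below shows that pattern generically. The names iterate and countdown are assumptions for this example, and countdown is a toy step function on integers standing in for a real reducer. The fuel bound is there because some lambda terms, such as (\1 1)(\1 1), never stop reducing.

```ocaml
(* Apply [step] repeatedly until it returns None or [fuel] steps have
   been taken, and return the last value reached. *)
let rec iterate (fuel : int) (step : 'a -> 'a option) (x : 'a) : 'a =
  if fuel <= 0 then x
  else
    match step x with
    | None -> x                               (* nothing left to do *)
    | Some x' -> iterate (fuel - 1) step x'   (* take one more step *)

(* Toy step function standing in for a reducer: counts down to zero. *)
let countdown (n : int) : int option =
  if n > 0 then Some (n - 1) else None

let () =
  assert (iterate 100 countdown 5 = 0);  (* runs until countdown gives None *)
  assert (iterate 2 countdown 5 = 3)     (* stops early when fuel runs out *)
```

Because iterate is polymorphic in 'a, the same loop would work unchanged with any step function of type expr -> expr option.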
Note that when you perform a beta-reduction, you will need to alter
the binding depths of the free variables of the left-hand and
right-hand sides of the Apply
. In particular, you need to ensure that
after all changes have been made to create the contractum,
each variable still present in the result is bound to the same Lambda
node that it was bound to before the beta-reduction -- including variables that are bound
in the surrounding context. Consider carefully what this requires, remembering
that the Apply
and the Lambda
nodes that
make up the top two nodes of the redex are removed in a beta-reduction
transition.
Again, it is OK to define auxiliary values and types in DeBruijn.ml
.
IO Module
The interface for the IO
module is:
val print_lam: Lambda.expr -> unit
val print_db: DeBruijn.expr -> unit
val lam_of_string: string -> Lambda.expr
val db_of_string: string -> DeBruijn.expr
val lam_of_channel: in_channel -> Lambda.expr
val db_of_channel: in_channel -> DeBruijn.expr
IO.print_lam
and IO.print_db
print Lambda and DeBruijn expressions, respectively. You might find these useful for debugging. The of_string
and of_channel
functions produce Lambda or DeBruijn expressions from strings or file streams - they get called by the various Check and Repl programs, but you probably won't need to use them.
The files ending in Check.ml
and Repl.ml
are the top-level implementations of the corresponding executables. The files ending in Lex.mll
or Parse.mly
are the sources for ocamllex
and ocamlyacc
, respectively, that get called by the of_string
and of_channel
functions in IO
.
Before you submit your work, you should really run check
. Once you're satisfied with your program's correctness, copy your versions of Lambda.ml
and DeBruijn.ml
to a folder titled lastname.firstname.asn1
, zip the folder, and submit it via Canvas.
Do not send edited versions of any of the other files.
In English, "De Bruijn" is pronounced more like "de brown" than "de broyn" (Benjamin A. Pierce. Types and Programming Languages, page 76. MIT Press, 2002).
Yes, this means that differently-named variables might resolve to the same virtual binding, if this expression was part of a larger expression. I think that's inelegant, but it makes about as much sense as any other compromise. Free variables are only meaningful in some sort of context, even if it is only an implicit context. In the traditional context for lambda-calculus terms, it is assumed that you'll have an environment of bindings of variables to values, so a free variable has a meaning in whatever context you put it in. With De Bruijn indexing, that doesn't mean anything; rather, we assume that the context is an environment of bindings of various depths. Without some translation from names to bindings, any reinterpretation of free variables is going to lose important information.
The "leftmost-outermost" redex of a lambda term is the redex whose Apply
is first encountered during a left-to-right preorder traversal of the term.
Some
and None
are constructors for OCaml's polymorphic option
type. So, Some(3)
, Some(8)
, and None
are all values of type int option
. An OCaml programmer uses this when it is uncertain whether a function will actually have a value to return. Think of it as checking for Null
or None
, but in a type-safe way.
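A minimal sketch of the option type in use, unrelated to lambda terms: safe_div and describe are hypothetical names chosen for this example. The function returns None when there is no sensible result, and the caller must handle both cases through pattern matching.

```ocaml
(* Integer division that reports division by zero as None
   instead of raising an exception. *)
let safe_div (a : int) (b : int) : int option =
  if b = 0 then None else Some (a / b)

(* The caller pattern-matches on the result; the compiler warns
   if either case is forgotten. *)
let describe (r : int option) : string =
  match r with
  | Some q -> Printf.sprintf "quotient %d" q
  | None -> "no result"

let () =
  print_endline (describe (safe_div 7 2));  (* prints quotient 3 *)
  print_endline (describe (safe_div 1 0))   (* prints no result  *)
```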