Overview
In this set of notes we will define a formal system for type inference:
a set of axioms and inference rules.
By definition, an ML expression e has type t
iff there exists a proof in this system that e has type t.
To handle recursive function definitions more cleanly, we will make
a slight change to the syntax.
Instead of:
    let rec id = exp1 in exp2
we'll use:
    let id = fix id . exp1 in exp2
where the two ids are the same;
for example:
    let length = fix length . λL . if null(L) then ... else ... in ...
Using this new syntax, a recursive function definition is a sub-case of
a general let expression;
e.g., the definition of length is of the form
    let id = exp1 in ...
where exp1 is
    fix length . λL . if null(L) then ... else ...
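The syntax change can be pictured as a small rewrite on expression trees. The following is only a sketch: the class names `LetRec`, `Let`, and `Fix` and the function `desugar` are hypothetical names chosen for illustration, and sub-expressions are treated as opaque values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LetRec:
    """let rec name = bound in body"""
    name: str
    bound: object
    body: object


@dataclass(frozen=True)
class Fix:
    """fix name . body"""
    name: str
    body: object


@dataclass(frozen=True)
class Let:
    """let name = bound in body"""
    name: str
    bound: object
    body: object


def desugar(expr):
    """Rewrite `let rec id = e1 in e2` as `let id = fix id . e1 in e2`.

    Only LetRec nodes (and LetRec nodes found in their bound/body
    positions) are rewritten; every other value is returned unchanged.
    """
    if isinstance(expr, LetRec):
        return Let(expr.name,
                   Fix(expr.name, desugar(expr.bound)),
                   desugar(expr.body))
    return expr
```

After the rewrite, the recursive definition is an ordinary `Let` whose bound expression happens to be a `Fix`, so no separate rule for `let rec` is needed.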
[ABS] (fn abstraction)
    A.x:σ |- e:τ
    ------------------
    A |- (λx.e):σ → τ

[APP] (fn application)
    A |- e1:σ → τ     A |- e2:σ
    ----------------------------
    A |- e1 (e2):τ

[LET] (let exp)
    A |- e1:σ     A.x:σ |- e2:τ
    ----------------------------
    A |- (let x = e1 in e2):τ

[FIX] (recursive fn)
    A.x:τ |- e:τ
    ------------------
    A |- (fix x.e):τ

[SPEC] (specialization)
    A |- e:∀ α.τ
    ------------------
    A |- e:τ[σ/α]

[GEN] (generalization)
    A |- e:τ
    ------------------   (where α is not free in A)
    A |- e:∀ α.τ
The last rule, for generalization, may seem counterintuitive. It says that if, given A, we can infer that expression e has type τ, then we can infer that it has type ∀α.τ (for any type variable α that is not free in A).
To understand why this makes sense, note that:
And why do we have the restriction that α cannot be free in A? That is because if α is free in A it means there is a "link" between some assumption in A and the fact that e has type τ. For example:
means that e has the same type as y. Allowing e:∀α.α would break that link (and allow us to reach an invalid conclusion about the type of e).
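The side condition on [GEN] and the substitution in [SPEC] can be made concrete with a short sketch. The representation is an assumption made for illustration only: types are tuples such as `("var", "a")`, `("int",)`, `("->", t1, t2)`, and `("forall", "a", t)`, and the environment A is a dict from identifiers to types. The function names are hypothetical. The substitution does not rename bound variables, so it assumes the type being substituted in has no free variable that is bound inside the target.

```python
def free_vars(t):
    """Return the set of type variables that occur free in type t."""
    tag = t[0]
    if tag == "var":
        return {t[1]}
    if tag == "forall":                  # ("forall", name, body)
        return free_vars(t[2]) - {t[1]}
    result = set()                       # primitive type or type operator
    for child in t[1:]:
        result |= free_vars(child)
    return result


def generalize(env, t, alpha):
    """[GEN]: from A |- e:t conclude A |- e:forall alpha.t.

    Refuses when alpha is free in some assumption of env.
    """
    for assumed in env.values():
        if alpha in free_vars(assumed):
            raise ValueError(alpha + " is free in the assumptions")
    return ("forall", alpha, t)


def substitute(t, alpha, sigma):
    """Compute t[sigma/alpha] (no renaming of bound variables)."""
    tag = t[0]
    if tag == "var":
        return sigma if t[1] == alpha else t
    if tag == "forall":
        if t[1] == alpha:                # alpha is rebound here; stop
            return t
        return ("forall", t[1], substitute(t[2], alpha, sigma))
    return (tag,) + tuple(substitute(c, alpha, sigma) for c in t[1:])


def specialize(scheme, sigma):
    """[SPEC]: from A |- e:forall alpha.t conclude A |- e:t[sigma/alpha]."""
    if scheme[0] != "forall":
        raise ValueError("expected a quantified type")
    return substitute(scheme[2], scheme[1], sigma)
```

With an empty environment, `generalize({}, ("->", a, a), "a")` succeeds for `a = ("var", "a")`, and specializing the result at `("int",)` yields `("->", ("int",), ("int",))`. With the environment `{"y": a}`, generalizing over `"a"` is rejected, which is the "link" described above.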
Now we give some examples of proofs that use the system defined above.
Example Proofs
            3:int |- (λx.x)(3): int
                /            \
               /    [APP]     \
              v                v
3:int |- (λx.x): σ→int    3:int |- 3: σ
Note that there is no sigma in the sequent in the bottom of the [APP] rule; i.e., to show that a function application has type τ, we must show that the function has type σ→τ, and that the argument has type σ for some σ. We'll complete our proof by showing that it holds when σ is int. So our proof tree is:
            3:int |- (λx.x)(3): int
                /            \
               /    [APP]     \
              v                v
3:int |- (λx.x): int→int    3:int |- 3: int
The right leaf is an axiom, so that branch of the proof is complete. To complete the left branch we use the [ABS] rule (since the left leaf involves lambda abstraction):
            3:int |- (λx.x)(3): int
                /            \
               /    [APP]     \
              v                v
3:int |- (λx.x): int→int    3:int |- 3: int
          |
          | [ABS]
          v
3:int . x: int |- x: int
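The proof tree above can be replayed mechanically. The sketch below is not the inference algorithm: it assumes each λ carries the argument type σ explicitly, so nothing has to be guessed. The tuple encoding of expressions and types and the name `type_of` are assumptions for illustration. Looking a name up in A plays the role of the axiom leaves.

```python
def type_of(env, expr):
    """Return the type of expr under assumptions env, or raise TypeError.

    Expressions: ("id", name), ("lam", x, sigma, body), ("app", e1, e2).
    Types: ("int",), ("bool",), ("->", sigma, tau).
    """
    kind = expr[0]
    if kind == "id":
        # Axiom leaf: the assumption name:type is already in A.
        return env[expr[1]]
    if kind == "lam":
        # [ABS]: prove A.x:sigma |- body:tau, conclude sigma -> tau.
        _, x, sigma, body = expr
        tau = type_of({**env, x: sigma}, body)
        return ("->", sigma, tau)
    if kind == "app":
        # [APP]: e1 must have type sigma -> tau and e2 type sigma.
        _, e1, e2 = expr
        fn_type = type_of(env, e1)
        arg_type = type_of(env, e2)
        if fn_type[0] != "->" or fn_type[1] != arg_type:
            raise TypeError("argument type does not match function type")
        return fn_type[2]
    raise ValueError("unknown expression form")


INT = ("int",)
A = {"3": INT}                                   # the assumption 3:int
identity = ("lam", "x", INT, ("id", "x"))        # λx.x with sigma = int
result = type_of(A, ("app", identity, ("id", "3")))
```

Here `result` is `("int",)`, matching the root of the proof tree.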
Show that:
has type:
Assume the initial type environment A:
(Note that this is a much longer proof than the examples given above!)
Algorithm W
Finally we're ready to present Algorithm W, our sound and
complete-up-to-shallow-types type-inference algorithm.
The input to Algorithm W is a type environment A and an ML
expression e.
An expression has a well-typing T iff there is a proof
in the system defined above that e has type T.
Algorithm W computes the most general type of e if it (and all
its subexpressions) have
shallow well-typings.
A shallow type is a type in which all quantifiers
occur at the beginning.
For example,
    ∀ α. ∀ β. α→β
is shallow, while
    ∀ α.(α→(∀ β.(α x β)))
is not shallow.
The fact that Algorithm W can only handle expressions with
shallow well-typings is a limitation of the algorithm
compared to the formal method.
For example, the expression
    λ f. pair(f(3))(f(true))
has only a non-shallow type:
    (∀α.∀β.α→β) → (∀γ.∀δ.γ x δ),
and so Algorithm W fails on that expression itself, and also on:
    (λ f. pair(f(3))(f(true)))(λx.x)
because it includes a sub-expression with a non-shallow type
(even though the whole expression has a shallow type,
namely, int x bool).
Algorithm W has been shown to be sound and complete up to shallow types,
where soundness and completeness are defined as follows:
To understand Algorithm W, we must first understand
substitution and unification.
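The definition of a shallow type given above (all quantifiers occur at the beginning) can be checked mechanically. This is a minimal sketch under an assumed representation: types are tuples such as `("var", "a")`, `("->", t1, t2)`, `("x", t1, t2)`, and `("forall", "a", t)`. The function names are hypothetical.

```python
def has_quantifier(t):
    """True if a forall occurs anywhere inside type t."""
    if t[0] == "forall":
        return True
    if t[0] == "var":
        return False
    return any(has_quantifier(child) for child in t[1:])


def is_shallow(t):
    """True if every quantifier of t occurs in a prefix at the front."""
    while t[0] == "forall":      # strip the leading quantifiers
        t = t[2]
    return not has_quantifier(t)


a, b = ("var", "a"), ("var", "b")
g, d = ("var", "g"), ("var", "d")

# ∀α.∀β.α→β
shallow = ("forall", "a", ("forall", "b", ("->", a, b)))
# ∀α.(α→(∀β.(α x β)))
nested = ("forall", "a", ("->", a, ("forall", "b", ("x", a, b))))
# (∀α.∀β.α→β) → (∀γ.∀δ.γ x δ)
pair_fn = ("->", shallow, ("forall", "g", ("forall", "d", ("x", g, d))))
```

`is_shallow(shallow)` holds, while `is_shallow(nested)` and `is_shallow(pair_fn)` do not, in agreement with the examples in the text.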
Unify (S, exp1, exp2) {
    if (S == FAIL) return FAIL
    if (exp1 is TYPEVAR(t))
        if (exp2 is also TYPEVAR(t)) return S
        else if (t occurs in exp2) return FAIL   // this prevents returning a cyclic type
        else if (S maps t to some type expression e)
            return Unify(S, e, exp2)
        else let exp2' = S(exp2) in
            if (exp2' is TYPEVAR(t)) return S
            else if (t occurs in exp2') return FAIL
            else return t:exp2' o S
    if (exp2 is TYPEVAR(t)) return Unify(S, exp2, exp1)
    // here if neither exp1 nor exp2 is a type variable
    if (root(exp1) ≠ root(exp2)) return FAIL
    if (root(exp1) and root(exp2) are primitive types)
        return S
    // the roots of exp1 and exp2 are type operators
    // (→ or x or list)
    for (each corresponding pair of subtrees T1 and T2 of the two roots) {
        unify T1 and T2;
        use S for the first call to Unify;
        use the result of the previous call for each subsequent call;
    }
    return the final resulting substitution
}
Note that when we unify a type variable t with a type expression e, we need to be very careful to keep the returned substitution idempotent. We can't simply add t:e to S. For example, if we have:
then adding (t3: t2→bool) to S would create a non-idempotent substitution. We also can't simply compose t:e with S, because S may include a mapping for t, and e may include a mapping for some type variable in an expression in S (i.e., there are times when we would need to return S o t:e, and times when we would need to return t:e o S). Therefore, we apply S to both exp1 and exp2. If the result of the first application is still exp1, then we can return the result of composing the new mapping with S. If not, we call Unify again with S and the results of the two applications.
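The idempotence concern can be illustrated with a small sketch. The starting substitution `S = {t2: int}` used here is a hypothetical choice made for this illustration, not the example from the notes. Substitutions are assumed to be dicts from variable names to types, and types are tuples such as `("var", "t2")`, `("int",)`, and `("->", t1, t2)`. The helper names are also assumptions.

```python
def apply(subst, t):
    """Apply substitution subst once to type t."""
    if t[0] == "var":
        return subst.get(t[1], t)
    return (t[0],) + tuple(apply(subst, child) for child in t[1:])


def is_idempotent(subst):
    """S is idempotent when applying S to any of its images changes nothing."""
    return all(apply(subst, image) == image for image in subst.values())


def compose_binding(name, t, subst):
    """Build (name:t o S).

    Assumes S has already been applied to t and that name is not bound in S.
    """
    result = {v: apply({name: t}, image) for v, image in subst.items()}
    result[name] = t
    return result


INT, BOOL = ("int",), ("bool",)
t2 = ("var", "t2")
S = {"t2": INT}                         # hypothetical starting substitution

# Naively adding t3: t2→bool leaves t2 inside an image although S maps t2.
naive = dict(S)
naive["t3"] = ("->", t2, BOOL)

# Applying S to the new image first keeps the result idempotent.
careful = compose_binding("t3", apply(S, ("->", t2, BOOL)), S)
```

Under these assumptions `is_idempotent(naive)` is false and `is_idempotent(careful)` is true, with `careful` mapping t3 to int→bool.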
Here are some examples of unification, always assuming an empty initial substitution:
exp1                  exp2                  result

    X                     X
   / \                   / \
 int  bool             t1   bool            U = { t1:int }

    →                     →
   / \                   / \
 t1   X              int    X
     / \                   / \
   t2   int            bool   t1            U = { t1:int; t2:bool }

    →                     →
   / \                   / \
 t1   X              int    t3
     / \
   t2   bool                                U = { t1:int; t3: t2 X bool }

    →                     →
   / \                   / \
 t1   X              int    t1
     / \
   t2   bool                                FAIL

    →                     →
   / \                   / \
 t1   X               t2    t1
     / \
   t2   bool                                FAIL

Let's consider the last two examples in more detail. When we attempt to unify (t1→(t2 x bool)) with (int→t1), the roots match, so we unify the left subtrees. That produces the substitution (t1: int). Now we use that substitution as S when we unify the right subtrees. The root of the subtree for the second expression is a type variable (t1), and there is a mapping for t1 in S (namely (t1: int)), so we attempt to unify int with (t2 x bool). That fails, since the root of one expression tree is a primitive type, while the root of the other is a type operator.
The last example starts similarly: we first unify the two left subtrees, producing the substitution (t1: t2), which is used as S when we unify the right subtrees. Again, the root of the subtree for the second expression is a type variable (t1), and again there is a mapping for t1 in S (namely (t1: t2)), so we attempt to unify t2 with (t2 x bool). This fails because t2 occurs in (t2 x bool).
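The examples can be checked against a direct transcription of the Unify pseudocode. The following is a sketch under stated assumptions rather than a definitive implementation: types are tuples such as `("var", "t1")`, `("int",)`, `("->", a, b)`, and `("x", a, b)`; a substitution is a dict from variable names to types; FAIL is represented by `None`. These choices and the helper names are assumptions made for illustration.

```python
FAIL = None


def occurs(name, t):
    """True if type variable name occurs in type t."""
    if t[0] == "var":
        return t[1] == name
    return any(occurs(name, child) for child in t[1:])


def apply(subst, t):
    """Apply substitution subst once to type t."""
    if t[0] == "var":
        return subst.get(t[1], t)
    return (t[0],) + tuple(apply(subst, child) for child in t[1:])


def unify(subst, e1, e2):
    if subst is FAIL:
        return FAIL
    if e1[0] == "var":
        name = e1[1]
        if e2 == e1:
            return subst
        if occurs(name, e2):             # would produce a cyclic type
            return FAIL
        if name in subst:                # S maps t to some type expression
            return unify(subst, subst[name], e2)
        e2s = apply(subst, e2)
        if e2s == e1:
            return subst
        if occurs(name, e2s):
            return FAIL
        # name:e2s o S, i.e. substitute into the images of S, then add name
        result = {v: apply({name: e2s}, img) for v, img in subst.items()}
        result[name] = e2s
        return result
    if e2[0] == "var":
        return unify(subst, e2, e1)
    # neither side is a type variable: compare roots and arity
    if e1[0] != e2[0] or len(e1) != len(e2):
        return FAIL
    # primitive types have no subtrees, so the loop body never runs for them
    for c1, c2 in zip(e1[1:], e2[1:]):
        subst = unify(subst, c1, c2)
    return subst


INT, BOOL = ("int",), ("bool",)
t1, t2, t3 = ("var", "t1"), ("var", "t2"), ("var", "t3")
lhs = ("->", t1, ("x", t2, BOOL))        # t1→(t2 x bool)

ex1 = unify({}, ("x", INT, BOOL), ("x", t1, BOOL))
ex2 = unify({}, ("->", t1, ("x", t2, INT)), ("->", INT, ("x", BOOL, t1)))
ex3 = unify({}, lhs, ("->", INT, t3))
ex4 = unify({}, lhs, ("->", INT, t1))
ex5 = unify({}, lhs, ("->", t2, t1))
```

With an empty initial substitution, `ex1` through `ex3` come out as the substitutions listed in the examples, and `ex4` and `ex5` are FAIL for the reasons given in the two paragraphs above: a root mismatch in the fourth example and the occurs check in the fifth.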