University of Wisconsin - Madison | CS 540 Lecture Notes | C. R. Dyer |
First-Order Logic (Chapters 8 - 9)
First-Order Logic (FOL or FOPC) Syntax
- User defines these primitives:
- Constant symbols (i.e., the "individuals" in the world)
E.g., Mary, 3
- Function symbols (mapping individuals to individuals)
E.g., father-of(Mary) = John, color-of(Sky) = Blue
- Predicate symbols (mapping from individuals to truth values)
E.g., greater(5,3), green(Grass), color(Grass, Green)
- FOL supplies these primitives:
- Variable symbols. E.g., x, y
- Connectives. Same as in PL: not (~), and (^), or (v), implies (=>),
if and only if (<=>)
- Quantifiers: Universal (A) and Existential (E)
- Universal quantification corresponds to conjunction ("and")
in that (Ax)P(x) means that P holds for all values of x in the
domain associated with that variable. E.g., (Ax) dolphin(x) => mammal(x)
- Existential quantification corresponds to disjunction ("or")
in that (Ex)P(x) means that P holds for some value of x in the domain
associated with that variable. E.g., (Ex) mammal(x) ^ lays-eggs(x)
- Universal quantifiers usually used with "implies" to form
"if-then rules." E.g., (Ax) cs540-student(x) => smart(x) means
"All cs540 students are smart." You rarely use universal quantification
to make blanket statements about every individual in the world:
(Ax)cs540-student(x) ^ smart(x) meaning that everyone in the world
is a cs540 student and is smart.
- Existential quantifiers usually used with "and" to specify a
list of properties or facts about an individual. E.g.,
(Ex) cs540-student(x) ^ smart(x) means "there is a cs540 student who
is smart." A common mistake is to represent this English sentence
as the FOL sentence: (Ex) cs540-student(x) => smart(x)
But consider what happens when there is a person who is
NOT a cs540-student.
- Switching the order of universal quantifiers does not change
the meaning: (Ax)(Ay)P(x,y) is logically equivalent to (Ay)(Ax)P(x,y).
Similarly, you can switch the order of existential quantifiers.
- Switching the order of universals and existentials does
change meaning:
- Everyone likes someone: (Ax)(Ey)likes(x,y)
- Someone is liked by everyone: (Ey)(Ax)likes(x,y)
- Sentences are built up from terms and atoms:
- A term (denoting a real-world individual) is
a constant symbol, a variable symbol, or an n-place function of
n terms. For example, x and f(x1, ..., xn)
are terms, where each xi is a term.
- An atom (which has value true or false) is either
an n-place predicate of n terms, or, if P and Q are atoms, then
~P, P V Q, P ^ Q, P => Q, P <=> Q are atoms
- A sentence is an atom, or, if P is a sentence and x is
a variable, then (Ax)P and (Ex)P are sentences
- A well-formed formula (wff) is a sentence containing
no "free" variables. I.e., all variables are "bound" by universal
or existential quantifiers. E.g., (Ax)P(x,y) has x bound as a
universally quantified variable, but y is free.
Translating English to FOL
- Every gardener likes the sun.
(Ax) gardener(x) => likes(x,Sun)
- You can fool some of the people all of the time.
(Ex) (person(x) ^ (At)(time(t) => can-fool(x,t)))
- You can fool all of the people some of the time.
(Ax) (person(x) => (Et) (time(t) ^ can-fool(x,t)))
- All purple mushrooms are poisonous.
(Ax) (mushroom(x) ^ purple(x)) => poisonous(x)
- No purple mushroom is poisonous.
~(Ex) purple(x) ^ mushroom(x) ^ poisonous(x)
or, equivalently,
(Ax) (mushroom(x) ^ purple(x)) => ~poisonous(x)
- There are exactly two purple mushrooms.
(Ex)(Ey) mushroom(x) ^ purple(x) ^ mushroom(y) ^ purple(y) ^
~(x=y) ^ (Az) (mushroom(z) ^ purple(z)) => ((x=z) v (y=z))
- Deb is not tall.
~tall(Deb)
- X is above Y if X is on directly on top of Y or else there is
a pile of one or more other objects directly on top of one another
starting with X and ending with Y.
(Ax)(Ay) above(x,y) <=> (on(x,y) v (Ez) (on(x,z) ^ above(z,y)))
Inference Rules for FOL
- Inference rules for PL apply to FOL as well. For example,
Modus Ponens, And-Introduction, And-Elimination, etc.
- New (sound) inference rules for use with quantifiers:
- Universal Elimination
If (Ax)P(x) is true, then P(c) is true, where c is a constant in the
domain of x. For example, from (Ax)eats(Ziggy, x)
we can infer eats(Ziggy, IceCream). The variable symbol can be
replaced by any ground term, i.e., any constant symbol or function
symbol applied to ground terms only.
- Existential Introduction
If P(c) is true, then (Ex)P(x) is inferred. For example,
from eats(Ziggy, IceCream) we can infer (Ex)eats(Ziggy, x).
All instances of the given constant symbol are replaced by the new
variable symbol. Note that the variable symbol cannot already
exist anywhere in the expression.
- Existential Elimination
From (Ex)P(x) infer P(c). For example,
from (Ex)eats(Ziggy, x) infer eats(Ziggy, Cheese).
Note that the variable is replaced by a brand new constant
that does not occur in this or any other sentence in
the Knowledge Base. In other words, we don't want to accidentally
draw other inferences about it by introducing the constant. All
we know is there must be some constant that makes this true,
so we can introduce a brand new one to stand in for that (unknown)
constant.
- Paramodulation
- Given two sentences (P1 v ... v PN) and (t=s v Q1 v ... v QM)
where each Pi and Qi is a literal (see definition below) and
Pj contains a term t,
derive new sentence
(P1 v ... v Pj-1 v Pj[s] v Pj+1 v ... v PN v Q1 v ... v QM)
where Pj[s] means a single occurrence of the term t
is replaced by the term s in Pj
- Example: From P(a) and a=b derive P(b)
- Generalized Modus Ponens (GMP)
- Combines And-Introduction, Universal-Elimination, and Modus Ponens
- Example: from P(c), Q(c), and (Ax)(P(x) ^ Q(x)) => R(x),
derive R(c)
- In general, given atomic sentences P1, P2, ..., PN, and
implication sentence (Q1 ^ Q2 ^ ... ^ QN) => R, where Q1, ..., QN and
R are atomic sentences, and subst(Theta, Pi) = subst(Theta, Qi)
for i=1,...,N, derive new sentence: subst(Theta, R)
- subst(Theta, alpha) denotes the result of applying a set of
substitutions defined by Theta to the sentence alpha
- A substitution list Theta = {v1/t1, v2/t2, ..., vn/tn} means
to replace all occurrences of variable symbol vi by term ti.
Substitutions are made in left-to-right order in the list.
Example: subst({x/IceCream, y/Ziggy}, eats(y,x)) = eats(Ziggy, IceCream)
Automated Inference for FOL
- Automated inference using FOL is harder than using PL because
variables can take on potentially an infinite number of possible
values from their domain. Hence there are potentially an
infinite number of ways to apply Universal-Elimination rule of
inference
- Godel's Completeness Theorem says that FOL entailment is only
semidecidable. That is, if a sentence is true given a set of
axioms, there is a procedure that will determine this. However,
if the sentence is false, then there is no guarantee that a
procedure will ever determine this. In other words, the procedure
may never halt in this case.
- The Truth Table method of inference is not complete for FOL
because the truth table size may be infinite
- Natural Deduction is complete for FOL but is
not practical for automated inference because the "branching
factor" in a search is too large, caused by the fact that
we would have to potentially try every inference rule in every
possible way using the set of known sentences
- Generalized Modus Ponens is not complete for FOL
- Generalized Modus Ponens is complete for
KBs containing only Horn clauses
- A Horn clause is a sentence of the form:
(Ax) (P1(x) ^ P2(x) ^ ... ^ Pn(x)) => Q(x)
where there are 0 or more Pi's, and the Pi's and Q are positive
(i.e., un-negated) literals
- Horn clauses represent a subset of the set of sentences
representable in FOL. For example, P(a) v Q(a) is a sentence in FOL
but is not a Horn clause.
- Natural deduction using GMP is complete for KBs containing only
Horn clauses. Proofs start with the given axioms/premises in KB,
deriving new sentences using GMP until the goal/query sentence is
derived. This defines a forward chaining inference procedure
because it moves "forward" from the KB to the goal.
- Backward-chaining deduction using GMP is complete for KBs containing
only Horn clauses. Proofs start with the goal query, find implications
that would allow you to prove it, and then prove each of the antecedents
in the implication, continuing to work "backwards" until we get to the
axioms, which we know are true.
- Example: Does Ziggy eat fish?
To prove eats(Ziggy, Fish), first see if this is known from one of
the axioms directly. Here it is not known, so see if there is a
Horn clause that has the consequent (i.e., right-hand side) of the
implication matching the goal. Here,
Proof:
- Goal matches RHS of Horn clause (2), so try and prove new sub-goals
cat(Ziggy) and likes(Ziggy, Fish) that correspond to the LHS of (2)
- cat(Ziggy) matches axiom (3), so we've "solved" that sub-goal
- likes(Ziggy, Fish) matches the RHS of (1), so try and prove cat(Ziggy)
- cat(Ziggy) matches (as it did earlier) axiom (3), so
we've solved this sub-goal
- There are no unsolved sub-goals, so we're done. Yes, Ziggy eats fish.
Resolution Refutation Procedure (aka Resolution Procedure)
- Resolution procedure is a sound and complete inference procedure for FOL
- Resolution procedure uses a single rule of inference: the Resolution Rule (RR),
which is a generalization of the same rule used in PL.
- Resolution Rule for PL:
From sentence P1 v P2 v ... v Pn and sentence ~P1 v Q2 v ... v Qm
derive resolvent sentence: P2 v ... v Pn v Q2 v ... v Qm
- Examples
- From P and ~P v Q, derive Q (Modus Ponens)
- From (~P v Q) and (~Q v R), derive ~P v R
- From P and ~P, derive False
- From (P v Q) and (~P v ~Q), derive True
- Resolution Rule for FOL:
Given sentence P1 v ... v Pn and sentence Q1 v ... v Qm
where each Pi and Qi is a literal, i.e., a positive or
negated predicate symbol with its terms, if Pj and ~Qk unify
with substitution list Theta, then derive the resolvent sentence:
subst(Theta, P1 v ... v Pj-1 v Pj+1 v ... v Pn v Q1 v ... Qk-1 v Qk+1 v ... v Qm)
- Example
- From clause P(x, f(a)) v P(x, f(y)) v Q(y) and
clause ~P(z, f(a)) v ~Q(z), derive resolvent clause
P(z, f(y)) v Q(y) v ~Q(z) using Theta = {x/z}
Unification
- Unification is a "pattern matching" procedure that takes two
atomic sentences, called literals, as input, and returns "failure" if they do not match
and a substitution list, Theta, if they do match. That is,
unify(p,q) = Theta means subst(Theta, p) = subst(Theta, q) for
two atomic sentences p and q.
- Theta is called the most general unifier (mgu)
- All variables in the given two literals are implicitly universally
quantified
- To make literals match, replace (universally-quantified) variables
by terms
- Unification algorithm
procedure unify(p, q, theta)
Scan p and q left-to-right and find the first corresponding
terms where p and q "disagree" ; where p and q not equal
If there is no disagreement, return theta ; success
Let r and s be the terms in p and q, respectively,
where disagreement first occurs
If variable(r) then
theta = union(theta, {r/s})
unify(subst(theta, p), subst(theta, q), theta)
else if variable(s) then
theta = union(theta, {s/r})
unify(subst(theta, p), subst(theta, q), theta)
else return "failure"
end
- Examples
Literal 1 | Literal 2 | Result of Unify |
parents(x, father(x), mother(Bill)) | parents(Bill, father(Bill), y) | {x/Bill, y/mother(Bill)} |
parents(x, father(x), mother(Bill)) | parents(Bill, father(y), z) | {x/Bill, y/Bill, z/mother(Bill)} |
parents(x, father(x), mother(Jane)) | parents(Bill, father(y), mother(y)) | Failure |
- Unify is a linear time algorithm that returns the most general unifier (mgu),
i.e., a shortest length substitution list that makes the two literals
match. (In general, there is not a unique minimum length substitution
list, but unify returns one of those of minimum length.)
- A variable can never be replaced by a term containing that variable.
For example, x/f(x) is illegal. This "occurs check" should be
done in the above pseudo-code before making the recursive calls.
Resolution Refutation Procedure (aka Resolution Procedure)
- Proof by contradiction method
- Given a consistent set of axioms KB and goal sentence Q,
we want to show that KB |= Q. This means that every interpretation I
that satisfies KB, satisfies Q. But we know that any interpretation
I satisfies either Q or ~Q, but not both. Therefore if in fact
KB |= Q, an interpretation that
satisfies KB, satisfies Q and does not satisfy ~Q.
Hence KB union {~Q} is unsatisfiable, i.e., that it's false
under all interpretations.
- In other words, (KB |- Q) <=> (KB ^ ~Q |- False)
- What's the gain? If KB union ~Q is unsatisfiable, then
some finite subset is unsatisfiable
- Resolution procedure can be used to establish that a given sentence Q
is entailed by KB; however, it cannot, in general, be used to generate
all logical consequences of a set sentences. Also, the resolution procedure cannot
be used to prove that Q is not entailed by KB.
- Resolution procedure won't always give an answer since entailment
is only semidecidable. And you can't just run two proofs in parallel,
one trying to prove Q and the other trying to prove ~Q, since
KB might not entail either one.
- Algorithm
procedure resolution-refutation(KB, Q)
;; KB is a set of consistent, true FOL sentences
;; Q is a goal sentence that we want to derive
;; return success if KB |- Q, and failure otherwise
KB = union(KB, ~Q)
while false not in KB do
pick 2 sentences, S1 and S2, in KB that contain
literals that unify
if none, return "failure"
resolvent = resolution-rule(S1, S2)
KB = union(KB, resolvent)
return "success"
end
Example using PL Sentences
Problems Yet to Be Addressed
- Resolution rule of inference is only applicable with sentences that are in
the form P1 v P2 v ... v Pn, where each Pi is a negated or un-negated
predicate and contains functions, constants, and universally quantified
variables, so can we convert every FOL sentence into this form?
- How to pick which pair of sentences to resolve?
- How to pick which pair of literals, one from each sentence,
to unify?
Converting FOL Sentences to Clause Form
Revised Resolution Refutation Procedure
procedure resolution-refutation(KB, Q)
;; KB is a set of consistent, true FOL sentences
;; Q is a goal sentence that we want to derive
;; return success if KB |- Q, and failure otherwise
KB = union(KB, ~Q)
KB = clause-form(KB) ; convert sentences to clause form
while false not in KB do
pick 2 clauses, C1 and C2, that contain literals that unify
if none, return "failure"
resolvent = resolution-rule(C1, C2)
KB = union(KB, resolvent)
return "success"
end
Example: Hoofers Club
- Problem Statement
Tony, Shi-Kuo and Ellen belong to the Hoofers Club.
Every member of the Hoofers Club is either a skier
or a mountain climber or both.
No mountain climber likes rain, and
all skiers like snow.
Ellen dislikes whatever Tony likes and likes
whatever Tony dislikes.
Tony likes rain and snow.
- Query
Is there a member of the Hoofers Club
who is a mountain climber but not a skier?
- Translation into FOL Sentences
Let S(x) mean x is a skier,
M(x) mean x is a mountain climber,
and L(x,y) mean x likes y,
where the domain of the first variable is Hoofers Club members, and
the domain of the second variable is snow and rain.
We can now translate the above English sentences into the following
FOL wffs:
- (Ax) S(x) v M(x)
- ~(Ex) M(x) ^ L(x, Rain)
- (Ax) S(x) => L(x, Snow)
- (Ay) L(Ellen, y) <=> ~L(Tony, y)
- L(Tony, Rain)
- L(Tony, Snow)
- Query: (Ex) M(x) ^ ~S(x)
- Negation of the Query: ~(Ex) M(x) ^ ~S(x)
- Conversion to Clause Form
- S(x1) v M(x1)
- ~M(x2) v ~L(x2, Rain)
- ~S(x3) v L(x3, Snow)
- ~L(Tony, x4) v ~L(Ellen, x4)
- L(Tony, x5) v L(Ellen, x5)
- L(Tony, Rain)
- L(Tony, Snow)
- Negation of the Query: ~M(x7) v S(x7)
- Resolution Refutation Proof
Clause 1 | Clause 2 | Resolvent | MGU (i.e., Theta) |
8 | 1 | 9. S(x1) | {x7/x1} |
9 | 3 | 10. L(x1, Snow) | {x3/x1} |
10 | 4 | 11. ~L(Tony, Snow) | {x4/Snow, x1/Ellen} |
11 | 7 | 12. False | {} |
- Answer Extraction
Clause 1 | Clause 2 | Resolvent | MGU (i.e., Theta) |
~M(x7) v S(x7) v (M(x7) ^ ~S(x7)) | 1 | 9. S(x1) v (M(x1) ^ ~S(x1)) | {x7/x1} |
9 | 3 | 10. L(x1, Snow) v (M(x1) ^ ~S(x1)) | {x3/x1} |
10 | 4 | 11. ~L(Tony, Snow) v (M(Ellen) ^ ~S(Ellen)) | {x4/Snow, x1/Ellen} |
11 | 7 | 12. M(Ellen) ^ ~S(Ellen) | {} |
- Answer to the Query
Ellen!
Resolution Procedure as Search through a Search Space
- Resolution procedure can be thought of as the bottom-up construction of a
search tree, where the leaves are the clauses produced by KB and
the negation of the goal. When a pair of clauses generates a
new resolvent clause, add a new node to the tree with arcs directed
from the resolvent to the two parent clauses. The resolution procedure succeeds
when a node
containing the False clause is produced, becoming the root node
of the tree.
- A strategy is complete if its use guarantees that the empty
clause (i.e., false) can be derived whenever it is entailed
- Some Strategies for Controlling Resolution's Search
- Breadth-First
- Level 0 clauses are those from the original axioms and the
negation of the goal. Level k clauses are the resolvents computed
from two clauses, one of which must be from level k-1 and the other
from any earlier level.
- Compute all level 1 clauses possible, then all possible level 2
clauses, etc.
- Complete, but very inefficient.
- Set-of-Support
- At least one parent clause must be from the negation of the goal
or one of the "descendents" of such a goal clause (i.e., derived from
a goal clause)
- Complete (assuming all possible set-of-support clauses are derived)
- Unit Resolution
- At least one parent clause must be a "unit clause," i.e.,
a clause containing a single literal
- Not complete in general, but complete for Horn clause KBs
- Input Resolution
- At least one parent from the set of original clauses (from the
axioms and the negation of the goal)
- Not complete in general, but complete for Horn clause KBs
Last modified October 14, 1998
Copyright © 1996 by Charles R. Dyer. All rights reserved.