CS 540 Lecture Notes, Fall 1996
More generally, propositions can include the equality predicate with random variables and the possible values they can have. For example, we might have a random variable Color with possible values red, green, blue, and other. Then P(Color=red) indicates the likelihood that the color of a given object is red. Similarly, for Boolean random variables we can ask P(A=True), which is abbreviated to P(A), and P(A=False), which is abbreviated to P(~A).
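As a concrete illustration, a distribution for the Color variable can be represented as a table mapping each possible value to its probability. A minimal Python sketch (the particular probability values below are invented for illustration, not taken from the notes):

    # A discrete distribution for the random variable Color.
    # The probability values here are made up for illustration.
    P_Color = {"red": 0.25, "green": 0.40, "blue": 0.30, "other": 0.05}

    assert abs(sum(P_Color.values()) - 1.0) < 1e-9  # probabilities sum to 1

    print(P_Color["red"])   # P(Color=red) = 0.25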
In particular, conditional probabilities are important for reasoning because they formalize the process of accumulating evidence and updating probabilities based on new evidence. Some of the most important rules related to conditional probability are:
P(A | B ^ C) = (P(A)P(B ^ C | A))/P(B ^ C) = P(A) * [P(B|A)/P(B)] * [P(C | A ^ B)/P(C|B)]

Again, this shows how the conditional probability of A is updated given B and C. The problem is that it may be hard in general to obtain or compute P(C | A ^ B). This difficulty is circumvented if we know that the evidence B and C are conditionally independent given A, since then P(C | A ^ B) = P(C | A) and we get:

P(A | B ^ C) = (P(A)P(B ^ C | A))/P(B ^ C) = (P(A)P(B|A)P(C|A))/(P(B)P(C|B)) = P(A) * [P(B|A)/P(B)] * [P(C|A)/P(C|B)]
Furthermore, if B and C are also independent of each other, then P(C|B) = P(C) and:
P(A | B ^ C) = P(A) * [P(B|A)/P(B)] * [P(C|A)/P(C)]
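To make the updating rule concrete, here is a minimal Python sketch of evidence accumulation under these independence assumptions; the function name and the probability values are invented for illustration, not taken from the notes.

    def update(prior, likelihood, evidence_prob):
        # One step of evidence accumulation: P(A|e) = P(A) * P(e|A) / P(e).
        # Chaining these steps is valid only if each new piece of evidence
        # is independent of the earlier ones, both on its own and given A.
        return prior * likelihood / evidence_prob

    p = 0.1                    # prior P(A)
    p = update(p, 0.8, 0.4)    # evidence B: P(B|A) = 0.8, P(B) = 0.4
    p = update(p, 0.6, 0.3)    # evidence C: P(C|A) = 0.6, P(C) = 0.3
    print(p)                   # P(A | B ^ C) = 0.1 * 2 * 2 = 0.4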
The doctor wants to determine the likelihood that the patient has a PickledLiver. Based on no other information, she knows that the prior probability P(PickledLiver) = 2^-17. This represents the doctor's initial belief in the diagnosis. However, after examination she determines that the patient has jaundice. She knows that P(Jaundice) = 2^-10 and P(Jaundice | PickledLiver) = 2^-3, so she computes the updated probability that the patient has a PickledLiver as:
P(PickledLiver | Jaundice) = P(P)P(J|P)/P(J) = (2^-17 * 2^-3)/2^-10 = 2^-10
So, based on this new evidence, the doctor increases her belief in this diagnosis from 2^-17 to 2^-10. Next, she determines that the patient's eyes are bloodshot, so now we need to add this new piece of evidence and update the probability of PickledLiver given Jaundice and Bloodshot. Say P(Bloodshot) = 2^-6 and P(Bloodshot | PickledLiver) = 2^-1. Then, treating the two symptoms as independent pieces of evidence, she computes the new conditional probability:
P(PickledLiver | Jaundice ^ Bloodshot) = (P(P)P(J|P)P(B|P))/(P(J)P(B)) = 2^-10 * [2^-1 / 2^-6] = 2^-5

So, after taking both symptoms into account, the doctor's belief that the patient has a PickledLiver is 2^-5.
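The two update steps are easy to check numerically; a quick Python sketch using the numbers above:

    # The doctor's incremental diagnosis, with the probabilities from the text.
    p = 2**-17                  # prior P(PickledLiver)
    p = p * 2**-3 / 2**-10      # evidence Jaundice: P(J|P) = 2^-3, P(J) = 2^-10
    print(p == 2**-10)          # True
    p = p * 2**-1 / 2**-6       # evidence Bloodshot: P(B|P) = 2^-1, P(B) = 2^-6
    print(p == 2**-5)           # True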
For example, in the burglar alarm example in the textbook, Alarm causes both John to call and Mary to call, but these two events are conditionally independent given Alarm. So the net will contain an arc from Alarm to JohnCalls and an arc from Alarm to MaryCalls, but no arc between JohnCalls and MaryCalls.
P(X1=x1, ..., Xn=xn) = P(x1 | Parents(X1)) * ... * P(xn | Parents(Xn))
Hence, we don't need the full joint probability distribution, only conditionals relative to the parent variables.
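As an illustration of how this factorization might be evaluated mechanically, here is a small Python sketch; the net representation, the function, and the Rain/WetGrass example values are all invented for illustration, not part of the notes.

    # A Belief Net represented as: variable -> (parent list, CPT), where the
    # CPT maps each tuple of parent values to P(variable = True | parents).
    def joint_probability(net, assignment):
        # P(X1=x1, ..., Xn=xn) = product over i of P(xi | Parents(Xi))
        prob = 1.0
        for var, (parents, cpt) in net.items():
            p_true = cpt[tuple(assignment[p] for p in parents)]
            prob *= p_true if assignment[var] else 1.0 - p_true
        return prob

    # Hypothetical two-node net: Rain -> WetGrass.
    net = {
        "Rain":     ([], {(): 0.2}),                            # P(Rain) = 0.2
        "WetGrass": (["Rain"], {(True,): 0.9, (False,): 0.1}),  # P(WetGrass | Rain)
    }
    # P(Rain ^ ~WetGrass) = 0.2 * (1 - 0.9) = 0.02
    print(joint_probability(net, {"Rain": True, "WetGrass": False}))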
Consider a domain consisting of 5 Boolean random variables:
T: The lecture started at 11:05
L: The lecturer arrived late
V: The lecture is on computer vision
C: The lecturer is Chuck
S: It is sunny
In this domain it makes sense to make the following additional assumptions: whether the lecturer arrived late depends only on whether it is sunny and on whether the lecturer is Chuck; whether the lecture started at 11:05 depends only on whether the lecturer arrived late; whether the lecture is on computer vision depends only on whether the lecturer is Chuck; and S and C are independent of each other.
Based on the above, the Belief Net that represents all of these relationships has arcs from S to L, from C to L, from L to T, and from C to V, with S and C as root nodes.
Now add the following quantitative information to the net: attached to each node is a conditional probability table (CPT) giving the probability of each of its values for every combination of values of its parents; for a root node, this is just the prior probability of the variable. Doing this for the above example, we get a Belief Net whose entries include P(S) = .3, P(C) = .6, P(L | ~C ^ S) = .1, P(T | L) = .3, and P(V | ~C) = .6, the values used in the calculation below.
Notice that in this example a total of 10 probabilities are specified and stored in the net: 1 each for the root nodes S and C, 4 for L (one for each combination of values of its two parents), and 2 each for T and V. The full joint probability distribution would instead require a table containing 2^5 = 32 probabilities. The reduction is due to the conditional independence of many variables.
Two variables that are not directly connected by an arc can still affect each other. For example, S and T are not independent, even though T does not directly depend on S: S influences L, which in turn influences T.
Given a belief net, we can easily read off the conditional independence relations that are represented. Specifically, each node is conditionally independent of all of its nonsuccessors given its parents. E.g., in the above example T is conditionally independent of S, C, and V given L. So, P(T | S,L,C,V) = P(T | L).
Using this factorization, any entry in the full joint probability distribution can be computed from the information stored in the net by repeatedly applying the Product Rule and then simplifying using the conditional independence relations encoded in the net. For example:
Goal: Compute P(S ^ ~C ^ L ^ ~V ^ T)
P(T ^ ~V ^ L ^ ~C ^ S)
  = P(T | ~V ^ L ^ ~C ^ S) * P(~V ^ L ^ ~C ^ S)            by the Product Rule
  = P(T|L) * P(~V ^ L ^ ~C ^ S)                             by cond. indep.
  = P(T|L) * P(~V | L ^ ~C ^ S) * P(L ^ ~C ^ S)             by the Product Rule
  = P(T|L) * P(~V|~C) * P(L ^ ~C ^ S)                       by cond. indep.
  = P(T|L) * P(~V|~C) * P(L | ~C ^ S) * P(~C ^ S)           by the Product Rule
  = P(T|L) * P(~V|~C) * P(L | ~C ^ S) * P(~C | S) * P(S)    by the Product Rule
  = P(T|L) * P(~V|~C) * P(L | ~C ^ S) * P(~C) * P(S)        by the independence of S and C
  = (.3)(1 - .6)(.1)(1 - .6)(.3)
  = .00144
where all of the numeric values are available directly in the Belief Net (since P(~A|B) = 1 - P(A|B)).
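The same evaluation can be written out directly in Python; only the five CPT entries that the calculation actually uses appear in the notes, so only those are encoded here.

    # CPT entries from the example net (the entries for other parent-value
    # combinations are not needed for this particular query).
    p_S = 0.3             # P(S)
    p_C = 0.6             # P(C)
    p_L_notC_S = 0.1      # P(L | ~C ^ S)
    p_T_L = 0.3           # P(T | L)
    p_V_notC = 0.6        # P(V | ~C)

    # P(T ^ ~V ^ L ^ ~C ^ S) = P(T|L) P(~V|~C) P(L|~C ^ S) P(~C) P(S)
    joint = p_T_L * (1 - p_V_notC) * p_L_notC_S * (1 - p_C) * p_S
    print(joint)          # .00144 (up to floating-point rounding)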
Last modified December 11, 1996
Copyright © 1996 by Charles R. Dyer. All rights reserved.