Assigned Sept 22, Due Oct 6: Bayesian Network Inference.
Implement Gibbs Sampling for Bayesian network inference. The format for your input file, as well as detailed grading criteria,
can be found here. Your program should accept the name of the input file as a command-line argument. Your implementation may
assume that all variables are Boolean. Your grade includes a
one-page write-up describing experimentation with your implementation.
Create a graph to show how your accuracy varies with chain length.
Your results should be averaged over at least 5 Bayes nets of 8-10 nodes each and at least 3 queries per Bayes net, with two evidence variables per query.
Use a burn-in of 200.
You will want to create additional test data (Bayes nets and queries) for
your experimentation. If you want to also experiment with larger Bayes nets, longer chain lengths, longer burn-in, or different numbers of evidence variables, that is fine. Submit your code to the
handin directory at ~cs731-1/handin/userid/prog1, and your one-page write-up in class; both are due at the
start of class on October 6. Please deposit both source code and executable, and call your executable "gibbs" (you may also add a file name extension if your language requires one).
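As a rough illustration of the technique named in this assignment, the following Python sketch runs Gibbs sampling on a small Boolean network and compares the estimate with an exact answer computed by enumeration. It is only a sketch under stated assumptions: the in-memory dictionary format, the three-node example network and its CPT values, and all function names are hypothetical and are not the course's input file format. Burn-in is counted here in full sweeps over the non-evidence variables, which is one possible reading of "a burn-in of 200". All CPT entries are strictly between 0 and 1, so the normalizing sum below is never zero; deterministic entries would need extra care.

```python
import random
from itertools import product

# Hypothetical in-memory format (NOT the course file format):
# name -> (parents, {tuple of parent values: P(name = True)}).
NETWORK = {
    "Rain": ((), {(): 0.2}),
    "Sprinkler": (("Rain",), {(True,): 0.01, (False,): 0.4}),
    "WetGrass": (("Sprinkler", "Rain"), {
        (True, True): 0.99, (True, False): 0.9,
        (False, True): 0.8, (False, False): 0.05,
    }),
}

def prob(net, var, value, state):
    """P(var = value | parents of var), with parent values read from state."""
    parents, cpt = net[var]
    p_true = cpt[tuple(state[p] for p in parents)]
    return p_true if value else 1.0 - p_true

def markov_blanket_weight(net, var, value, state):
    """Unnormalized P(var = value | Markov blanket of var)."""
    trial = dict(state)
    trial[var] = value
    weight = prob(net, var, value, trial)
    for child, (parents, _) in net.items():
        if var in parents:
            weight *= prob(net, child, trial[child], trial)
    return weight

def gibbs_query(net, query, evidence, chain_length, burn_in=200, seed=None):
    """Estimate P(query = True | evidence) from chain_length sweeps."""
    rng = random.Random(seed)
    # Evidence variables stay fixed; the others start at random values.
    state = {v: evidence.get(v, rng.random() < 0.5) for v in net}
    hidden = [v for v in net if v not in evidence]
    hits = 0
    for sweep in range(burn_in + chain_length):
        for var in hidden:
            w_true = markov_blanket_weight(net, var, True, state)
            w_false = markov_blanket_weight(net, var, False, state)
            state[var] = rng.random() < w_true / (w_true + w_false)
        if sweep >= burn_in and state[query]:
            hits += 1
    return hits / chain_length

def exact_query(net, query, evidence):
    """Exact P(query = True | evidence) by enumerating all assignments."""
    names = list(net)
    numerator = denominator = 0.0
    for values in product([True, False], repeat=len(names)):
        state = dict(zip(names, values))
        if any(state[v] != val for v, val in evidence.items()):
            continue
        joint = 1.0
        for v in names:
            joint *= prob(net, v, state[v], state)
        denominator += joint
        if state[query]:
            numerator += joint
    return numerator / denominator

if __name__ == "__main__":
    evidence = {"WetGrass": True}
    exact = exact_query(NETWORK, "Rain", evidence)
    for n in (100, 1000, 10000):
        estimate = gibbs_query(NETWORK, "Rain", evidence, n, seed=0)
        print(n, abs(estimate - exact))
```

Enumeration is feasible for the 8-10 node networks mentioned above (at most 1024 joint assignments), so the absolute error against `exact_query` is one possible accuracy measure to plot against chain length.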
Project: To Be Revised
Projects should be proposed by November 7 (verbal or email communication is
acceptable). Projects must be done individually.
The basis for the project grade will be your written report, which must
be turned in no later than the last day of final exams. The report
should be in the style of a conference paper, providing an
introduction/motivation, discussion of related work, a description of
your work that is detailed enough that the work could be replicated,
and a conclusion.
The format of the description of your work will depend on the
nature of your project. If it is an implementation, then the description
should make clear the algorithm(s) implemented and provide experimental
results.
If it is an application project, the description should say which system
was used, how the data (or any other materials used) were collected,
what experimental methodology was employed, and some estimate of the
quality of the experimental results (e.g. a 10-fold cross-validation
accuracy estimate).
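For readers unfamiliar with the estimate mentioned above, here is a minimal Python sketch of k-fold cross-validation accuracy. The function name, the `train` and `predict` callbacks, and the majority-class learner in the demo are hypothetical placeholders for whatever system a project actually uses; the sketch also assumes there are at least k examples so that no fold is empty.

```python
import random

def cross_validation_accuracy(examples, labels, train, predict, k=10, seed=0):
    """Mean held-out accuracy over k folds (assumes len(examples) >= k)."""
    indices = list(range(len(examples)))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in indices if i not in held]
        model = train([examples[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        correct = sum(predict(model, examples[i]) == labels[i]
                      for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k

# Hypothetical stand-in learner: always predict the majority training label.
def train_majority(xs, ys):
    return max(set(ys), key=ys.count)

def predict_majority(model, x):
    return model

if __name__ == "__main__":
    xs = list(range(100))
    ys = [True] * 80 + [False] * 20
    print(cross_validation_accuracy(xs, ys, train_majority, predict_majority))
```

Each example is held out exactly once, so the reported figure reflects performance on data the model did not see during training.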
If it is a theoretical project, then the project description should
consist of detailed definitions, theorems, and proofs.
An example of an outstanding project report is
here (Word File).
You may choose projects from any area of AI (even those not covered in
the course), but the following are some suggestions related to topics in the
course.
- Build a large Bayes Net application and develop the
CPTs based in part on real-world data.
- Build a Bayes net learning algorithm. To be non-trivial
this should include either (1) learning in the
case of missing data using EM, Gibbs-sampling,
or gradient-ascent (not discussed in class) or
(2) learning/modification of the BN structure
as well.
- Implement an influence diagram -- a Bayes Net that
also suggests actions to take (I can point you
to readings).
- Write a first-principles proof of one of the properties
discussed in class that we did not take time to
prove. Examples include (1) one node is conditionally
independent of a second given evidence if the two nodes
are d-separated given that evidence, (2) every clique graph has
a junction tree, (3) the Metropolis-Hastings
algorithm converges to a stationary distribution.
For example, for (1) you'd need to work back
from Pearl's book through a proof based on
graphoids, and then rewrite the proof instead
to use only the concepts and terminology we
have used in class (e.g., so your classmates
could understand it).
- Apply Graphplan or SATplan to a real-world planning
task.
- Implement a Dynamic Bayes Net or Dynamic Influence
Diagram.
- Apply a novel ILP system to a toy domain or a real-world data set (for example, the
Predictive Toxicology Evaluation set).
Some Freely-Available Software Related to Course Topics
INDUCTIVE LOGIC PROGRAMMING SYSTEMS
BAYESIAN NETWORK SYSTEMS