active_molecule(A) if:

hydrogen_acceptor(A,C) &
hydrogen_acceptor(A,D) &
distance(C,D,4.17) &
hydrophobe(A,E) &
distance(C,E,3.38) &
distance(D,E,7.20) &
hydrophobe(A,F) &
distance(C,F,3.68) &
distance(D,F,5.41) &
distance(E,F,5.0).

C. David Page Jr.

active_molecule(A) if:

zinc_binding_site(A,B) &
hydrogen_acceptor(A,C) &
distance(B,C,6.1) &
hydrogen_acceptor(A,D) &
distance(B,D,7.3) &
distance(C,D,2.2) &
hydrogen_acceptor(A,E) &
distance(B,E,3.9) &
distance(C,E,3.2) &
distance(D,E,4.0).

Research Group

Current

Alumni


David Page's Most Recent Work

Inductive Logic Programming (ILP) and Statistical Relational Learning (SRL), including applications to clinical data

Inductive Logic Programming (ILP) is the study of automated inductive learning in which the knowledge representation is first-order definite clause logic (as embodied in the language Prolog). The richness of the representation makes ILP particularly well suited to domains such as organic chemistry, molecular biology, natural language processing, and telecommunications, where examples are easily described as sets of objects (e.g., atoms in a molecule) together with relations that hold among those objects (e.g., bonds or distance relations). Because of the close correspondence between logic and databases (relational algebra has the same expressive power as nonrecursive Datalog with negation, and Datalog is a function-free restriction of Prolog), ILP is a leading approach to directly mining databases with multiple relational tables. Statistical Relational Learning (SRL) combines explicit representation of uncertainty, as in Bayesian networks and related graphical models, with the ability to analyze relational data as just described. The following are some of David Page's ILP and SRL activities.

Skewing: Learning Correlation Immune Targets

It has long been recognized that most machine learning systems have difficulty learning parity functions and related functions where the relevant features individually are not correlated with the class value. It turns out there are many such functions beyond parity; these functions are known in cryptography as the correlation immune functions, and they arise within nature in areas such as genetics and protein-protein interactions. The traditional ML approach to learning such functions is lookahead, but lookahead has high costs both in runtime and susceptibility to overfitting. Skewing is an alternative approach to learning correlation immune functions.
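The effect described above can be illustrated with a small Python sketch. It is a simplified illustration of the reweighting idea only, not the published skewing algorithm: the product-form weighting, the "favored setting" per feature, and the value s = 0.75 are assumptions made for this example, as are all function names. The target is the XOR of two features, with a third irrelevant feature. Under uniform weights no feature shows any information gain; after reweighting, the two relevant features do, while the irrelevant one still does not.

```python
from itertools import product
from math import log2


def entropy(pairs):
    """Shannon entropy (bits) of binary labels given (weight, label) pairs."""
    total = sum(w for w, _ in pairs)
    if total == 0:
        return 0.0
    p1 = sum(w for w, y in pairs if y == 1) / total
    return -sum(p * log2(p) for p in (p1, 1 - p1) if p > 0)


def info_gain(examples, weights, feature):
    """Weighted information gain from splitting on one binary feature."""
    total = sum(weights)
    gain = entropy([(w, y) for (_, y), w in zip(examples, weights)])
    for value in (0, 1):
        part = [(w, y) for (x, y), w in zip(examples, weights)
                if x[feature] == value]
        gain -= sum(w for w, _ in part) / total * entropy(part)
    return gain


def skew_weights(examples, favored, s=0.75):
    """Reweight each example by a product over features: factor s when the
    feature takes its favored value, 1 - s otherwise (assumed scheme)."""
    weights = []
    for x, _ in examples:
        w = 1.0
        for xi, fi in zip(x, favored):
            w *= s if xi == fi else 1 - s
        weights.append(w)
    return weights


# Full truth table over three binary features.
# Target: XOR of features 0 and 1; feature 2 is irrelevant.
examples = [(x, x[0] ^ x[1]) for x in product((0, 1), repeat=3)]

uniform = [1.0] * len(examples)
skewed = skew_weights(examples, favored=(1, 1, 1))

uniform_gains = [info_gain(examples, uniform, f) for f in range(3)]
skewed_gains = [info_gain(examples, skewed, f) for f in range(3)]

print("uniform:", uniform_gains)
print("skewed: ", skewed_gains)
```

Unlike lookahead, this kind of reweighting still scores one feature at a time, which is why it avoids the runtime cost of evaluating feature combinations.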

Pharmacophore Discovery for Drug Design

One important application of ILP and SRL is the discovery of pharmacophores: the three-dimensional substructures of molecules responsible for their biological activity. Here are some papers on this topic.
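To make the notion of a pharmacophore rule concrete, the following Python sketch checks whether a molecule satisfies a rule of the form `active_molecule(A) if hydrogen_acceptor(A,C) & ... & distance(C,D,4.17) & ...`, using two hydrogen acceptors and two hydrophobes with six pairwise distances. The molecule representation (a list of typed 3D points), the function name `find_match`, the brute-force search, and the 1.0 distance tolerance are all assumptions made for illustration; they are not taken from an ILP system.

```python
from itertools import permutations
from math import dist

# Rule variables and the type each must have.
POINT_TYPES = {
    "C": "hydrogen_acceptor",
    "D": "hydrogen_acceptor",
    "E": "hydrophobe",
    "F": "hydrophobe",
}

# Required pairwise distances between rule variables.
DISTANCES = {
    ("C", "D"): 4.17,
    ("C", "E"): 3.38,
    ("D", "E"): 7.20,
    ("C", "F"): 3.68,
    ("D", "F"): 5.41,
    ("E", "F"): 5.0,
}

TOLERANCE = 1.0  # assumed matching tolerance, same units as the distances


def find_match(molecule, point_types=POINT_TYPES, distances=DISTANCES,
               tol=TOLERANCE):
    """Return a binding {variable: point index} satisfying the rule, or None.

    `molecule` is a list of (type, (x, y, z)) tuples. The search simply tries
    every assignment of distinct points to the rule variables.
    """
    variables = list(point_types)
    for combo in permutations(range(len(molecule)), len(variables)):
        binding = dict(zip(variables, combo))
        if any(molecule[binding[v]][0] != point_types[v] for v in variables):
            continue
        if all(abs(dist(molecule[binding[a]][1], molecule[binding[b]][1]) - d)
               <= tol for (a, b), d in distances.items()):
            return binding
    return None


# Hypothetical molecules: coordinates were chosen by hand for this example.
matching_molecule = [
    ("hydrogen_acceptor", (0.0, 0.0, 0.0)),
    ("hydrogen_acceptor", (4.17, 0.0, 0.0)),
    ("hydrophobe", (-2.76, 1.95, 0.0)),
    ("hydrophobe", (0.20, 0.27, 3.66)),
]
# Same molecule with the second hydrophobe moved far away.
non_matching_molecule = matching_molecule[:3] + [
    ("hydrophobe", (0.20, 0.27, 10.0)),
]

print(find_match(matching_molecule))
print(find_match(non_matching_molecule))
```

An exhaustive search like this is adequate only for a handful of points per molecule; it is meant to show what the rule asserts, not how such rules are learned or evaluated at scale.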

Analysis of High-Throughput Genetic Data

The University of Wisconsin is using single-nucleotide polymorphism (SNP) genotype data and gene expression data from microarrays to study disease susceptibility and regulatory mechanisms in organisms such as E. coli, as well as to identify new targets for anti-cancer drugs and other types of drugs. Here are some results.

Other Activities in Bioinformatics and Artificial Intelligence

Background