Mining for Low-abundance Transcripts in Microarray Data

by

Yi Lin (Statistics), Samuel T. Nadler (Biochemistry), Alan D. Attie (Biochemistry) and Brian S. Yandell (Statistics & Horticulture)
Technical Report #1031, January 2001, U WI Madison Statistics

Plant and animal studies of quantitative trait loci provide data that arise from mixtures of distributions with known mixing proportions. Previous approaches to estimation model the component distributions parametrically. We propose a semiparametric alternative which assumes that the log ratio of the component densities satisfies a linear model, with the baseline density left unspecified. We show that a constrained empirical likelihood has an irregularity under the null hypothesis that the two densities are equal. A factorization of the likelihood suggests a partial empirical likelihood which permits unconstrained estimation of the parameters. The partial likelihood is shown to give consistent and asymptotically normal estimators, whether or not the null hypothesis holds. The asymptotic null distribution of the log-partial likelihood ratio is chi-square. Theoretical calculations show that the procedure may be as efficient as the full empirical likelihood in the regular set-up. The usefulness of the robust methodology is illustrated with a rat study of breast cancer resistance genes.
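
For readers who want the model in symbols, the following is a minimal sketch of the set-up the abstract describes. The notation is an assumption made here for illustration and is not taken from the manuscript: f and g are the two component densities, lambda_i is the known mixing proportion for observation i, and (alpha, beta) are the parameters of the linear model for the log density ratio. The abstract does not give the degrees of freedom of the chi-square limit, so none is stated.

```latex
% Two-component mixture with known mixing proportion \lambda_i (assumed notation)
\[
  h_i(x) = \lambda_i\, f(x) + (1 - \lambda_i)\, g(x), \qquad 0 \le \lambda_i \le 1 .
\]
% Semiparametric assumption: the log density ratio is linear in x,
% while the baseline density f is left unspecified
\[
  \log \frac{g(x)}{f(x)} = \alpha + \beta x
  \quad\Longleftrightarrow\quad
  g(x) = \exp(\alpha + \beta x)\, f(x) .
\]
% Because g must integrate to one, \alpha is tied to \beta
\[
  \int \exp(\alpha + \beta x)\, f(x)\, dx = 1 .
\]
% Null hypothesis of equal densities: \beta = 0 forces \alpha = 0, so g = f
\[
  H_0 : \beta = 0 \quad (\text{hence } \alpha = 0 \text{ and } g = f).
\]
```

Under this reading, the mixture collapses to h_i = f for every lambda_i when beta = 0, which is the case in which the abstract reports that the constrained empirical likelihood is irregular.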

Click to get manuscript

Click to get microarray.zip software