R/qtl-related package demo

This is a quick demo of R/qtl and related packages R/qtlhot, R/qtlnet and R/qtlyeast for workshops on causal networks. For information and tutorials on R and R/qtl, visit http://www.rqtl.org . Each package his its own vignettes.

R/qtl

We first show some features of R/qtl. For more info, see http://rqtl.org/tutorials/rqtltour2.pdf .

library(qtl)
library(qtlyeast)
summary(yeast.orf)
##     Backcross
## 
##     No. individuals:    112 
## 
##     No. phenotypes:     5740 
##     Percent phenotyped: 99.2 
## 
##     No. chromosomes:    16 
##         Autosomes:      1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 
## 
##     Total markers:      2956 
##     No. markers:        159 188 49 224 153 113 291 149 132 209 285 273 207 
##                         105 309 110 
##     Percent genotyped:  98.1 
##     Genotypes (%):      AA:50.1  AB:49.9
plot.map(yeast.orf)

plot of chunk unnamed-chunk-1

This shows a summary of the Brem and Kruglyak (2001) yeast dataset, with 2956 markers and 5740 phenotypes.

tf.orf <- "YBR158W"
tf.gene <- yeast.annot[yeast.annot$orf == tf.orf, "gene"]

## Invoke QTL hotspot and causal pair library.
library(qtlhot)
## Loading required package: lattice
## Loading required package: corpcor
## Loading required package: mnormt
cand <- cis.cand.reg$cis.reg[cis.cand.reg$cis.reg$gene == tf.orf, ]
cand.peak <- cand$peak.pos
cand.pos <- cis.cand.reg$cis.reg$phys.pos[cis.cand.reg$cis.reg$gene == tf.orf]

We examine one particular open reading frame, YBR158W, corresponding to transcription factor, AMN1. Here is an image of the genotypes for chromosome 2 reordered by this ORF.

cand.mar <- which.min(abs(cand.pos - pull.map(yeast.orf)[[2]]))
geno.image(yeast.orf, 2, reorder = find.pheno(yeast.orf, tf.orf))
abline(v = cand.mar, lwd = 3, col = "green")

plot of chunk unnamed-chunk-3

This ORF is located at 225.663cM; the nearest marker (number 132) is identified with a green line on the genotype image. Now we do a simple interval map scan, which finds the strongest evidence (LOD) for a QTL right at the gene location. The red line is the 1% permutation threshold.

yeast.orf <- calc.genoprob(yeast.orf, step = 2)
scan.orf <- scanone(yeast.orf, pheno.col = find.pheno(yeast.orf, tf.orf), method = "hk")
plot(scan.orf)

plot of chunk unnamed-chunk-4

plot(scan.orf, chr = 2)
abline(v = cand.pos, lwd = 3, col = "green")
lod.thr <- c(summary(perm.orf, 0.01))
abline(h = lod.thr, lwd = 3, col = "red", lty = 2)

plot of chunk unnamed-chunk-4

R/qtlhot: hotspots

Now let's look at hotspots. The vignette("qtlhot", "qtlhot") walks through a simulated set of hotspots. For the yeast data, see vignette("yeast_cmst", "qtlyeast"). Basically, we use the saved high LOD scores (highlod.orf) and LOD permutation threshold (lod.thr) to find hotspot size across the genome (hotsize.orf). The summary shows several significant hotspots, with one of the largest on chr 2, right at YBR158W. The red 5% threshold of 96 is for the number of traits with significant LOD at a position (Chaibub Neto et al. 2012).

hotsize.orf <- hotsize(highlod.orf, lod.thr = lod.thr)
hot.thr <- 96
(hot.sum <- summary(hotsize.orf, threshold = hot.thr))
## hotsize elements:  chr pos max.N 
## LOD threshold: 4.378
##                      chr   pos max.N
## YBR154C_chr2@548401    2 224.1   243
## c3.loc60               3  60.0   108
## YLR257W_chr12@659357  12 323.6   123
## c14.loc242            14 242.0   360
## gOL02_chr15@170945    15  61.1   214
plot(hotsize.orf, chr = hot.sum$chr, bandcol = "gray70")
abline(h = hot.thr, lwd = 2, col = "red", lty = 2)

plot of chunk unnamed-chunk-5

plot(hotsize.orf, chr = 2)
abline(v = cand.pos, lwd = 3, col = "green")
abline(h = hot.thr, lwd = 2, col = "red", lty = 2)

plot of chunk unnamed-chunk-5

R/qtlhot: causal pairs

Find co-mapped traits and get joint-CMST p-values with Benjaminie-Hochberg adjustment (Chaibub Neto et al. 2013).

cand.tf <- cand.reg[cand.reg$gene == tf.orf, , drop = FALSE]
comap.tf <- GetCoMappingTraits(highlod.orf, cand.tf)[[1]]

The TransFac database provides transcription factor targets, which are stored in ko.list in the qtlyeast package. Let's find which of the co-mapping traits are targets of the transcription factor AMN1, and do causal mapping tests for them.

match.orf <- !is.na(match(comap.tf, ko.list[[tf.orf]]))
out.peak <- CMSTtests(yeast.orf, tf.orf, comap.tf[match.orf], Q.chr = 2, Q.pos = cand.peak, 
    method = "joint", penalty = "bic")
causal.call <- function(x) {
    ordered(c("causal", "react", "indep", "corr")[apply(x, 1, which.min)], c("causal", 
        "react", "indep", "corr"))
}
target.peak <- data.frame(pval = apply(out.peak$pvals.j.BIC, 1, min), call = causal.call(out.peak$pvals.j.BIC))
row.names(target.peak) <- factor(comap.tf[match.orf])
table(target.peak$pval <= 0.1, target.peak$call)
##        
##         causal react indep corr
##   FALSE      3     0    18    1
##   TRUE       0     0     6    0
target.peak$gene <- yeast.annot[match(row.names(target.peak), yeast.annot$orf), 
    "gene"]
target.peak[order(target.peak$pval), ]
##                pval   call    gene
## YJL078C   1.137e-06  indep    PRY3
## YHR143W   9.246e-03  indep    DSE2
## YER124C   7.223e-02  indep    DSE1
## YNL066W   8.403e-02  indep    SUN4
## YLR042C   8.722e-02  indep YLR042C
## YLR286C   9.070e-02  indep    CTS1
## YLR051C   1.186e-01  indep    FCF2
## YHR081W   1.545e-01  indep    LRP1
## YGL029W   1.733e-01  indep    CGR1
## YNL163C   2.408e-01  indep    RIA1
## YER044C.A 2.555e-01  indep    <NA>
## YGR272C   2.667e-01  indep YGR272C
## YMR132C   2.980e-01  indep    JLP2
## YEL003W   3.151e-01  indep    GIM4
## YNL327W   3.194e-01  indep    EGT2
## YOR021C   3.229e-01  indep YOR021C
## YGL028C   3.919e-01  indep   SCW11
## YDL048C   3.957e-01  indep    STP4
## YMR030W   4.833e-01  indep    RSF1
## YDR171W   5.148e-01 causal   HSP42
## YGR041W   5.183e-01  indep    BUD9
## YOR264W   5.772e-01  indep    DSE3
## YLR285W   6.116e-01  indep    NNT1
## YNR067C   6.232e-01  indep    DSE4
## YMR173W   6.254e-01 causal   DDR48
## YMR169C   6.522e-01 causal    ALD3
## YKR083C   6.579e-01   corr    DAD2
## YKL014C   6.803e-01  indep    URB1

Only 6 targets had a significant causal call, and all of those were for the independent model (pleiotropy). Let's look only at the alleged targets of AMN1. We use color to identify the calls for these 28 ORFs.

plot(scan.orf, chr = 2)
for (target in row.names(target.peak)) {
    tmp <- scanone(yeast.orf, pheno.col = find.pheno(yeast.orf, target), method = "hk")
    if (max(tmp)$lod > lod.thr) {
        col <- "gray"
        if (target.peak[target, "pval"] < 0.1) 
            col <- c("red", "blue", "green", "gray")[unclass(target.peak[target, 
                "call"])]
        plot(tmp, chr = 2, add = TRUE, col = col)
    }
}
## Warning: Dropping 6 individuals with missing phenotypes.
## Warning: Dropping 2 individuals with missing phenotypes.
## Warning: Dropping 1 individuals with missing phenotypes.
## Warning: Dropping 7 individuals with missing phenotypes.
## Warning: Dropping 2 individuals with missing phenotypes.
## Warning: Dropping 1 individuals with missing phenotypes.
plot(scan.orf, chr = 2, add = TRUE)
abline(v = cand.pos, lwd = 3, col = "green")
abline(h = lod.thr, lwd = 3, col = "red", lty = 2)

plot of chunk unnamed-chunk-8

So, it appears that the AMN1 is pleiotropic to targets with strong QTL signals.

R/qtlnet: causal network

Let's take the significant targets of AMN1 and see if we can develop a causal network.

net.set <- c(tf.orf, row.names(target.peak)[target.peak$pval <= 0.1])
library(qtlnet)
## Loading required package: igraph
## Loading required package: sem
## Loading required package: MASS
## Loading required package: matrixcalc
## Attaching package: 'matrixcalc'
## The following object(s) are masked from 'package:corpcor':
## 
## is.positive.definite
## Loading required package: graph
## Attaching package: 'graph'
## The following object(s) are masked from 'package:igraph':
## 
## degree, edges
## Loading required package: pcalg
## Loading required package: abind
## Loading required package: sfsmisc

The following will create MCMC samples for the QTLnet, but it takes awhile. Instead we read already created data.

amn1.qtlnet <- mcmc.qtlnet(yeast.orf, pheno.col = find.pheno(yeast.orf, net.set),
                           threshold = lod.thr, nSamples = 1000, thinning = 20,
                           random.seed = 92387475)
save(amn1.qtlnet, file = "amn1.qtlnet.RData", compress = TRUE)
load("amn1.qtlnet.RData")
amn1.graph <- igraph.qtlnet(amn1.qtlnet)
plot(amn1.graph)

plot of chunk unnamed-chunk-10

This apparently did not work well, as the YBR158W is shown as directly causal to one target and reactive to two others. More work would be needed to study this more carefully.