This is a quick demo of R/qtl and related packages R/qtlhot, R/qtlnet and R/qtlyeast for workshops on causal networks. For information and tutorials on R and R/qtl, visit http://www.rqtl.org . Each package his its own vignettes.
We first show some features of R/qtl. For more info, see http://rqtl.org/tutorials/rqtltour2.pdf .
library(qtl)
library(qtlyeast)
summary(yeast.orf)
## Backcross
##
## No. individuals: 112
##
## No. phenotypes: 5740
## Percent phenotyped: 99.2
##
## No. chromosomes: 16
## Autosomes: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
##
## Total markers: 2956
## No. markers: 159 188 49 224 153 113 291 149 132 209 285 273 207
## 105 309 110
## Percent genotyped: 98.1
## Genotypes (%): AA:50.1 AB:49.9
plot.map(yeast.orf)
This shows a summary of the Brem and Kruglyak (2001) yeast dataset, with 2956 markers and 5740 phenotypes.
tf.orf <- "YBR158W"
tf.gene <- yeast.annot[yeast.annot$orf == tf.orf, "gene"]
## Invoke QTL hotspot and causal pair library.
library(qtlhot)
## Loading required package: lattice
## Loading required package: corpcor
## Loading required package: mnormt
cand <- cis.cand.reg$cis.reg[cis.cand.reg$cis.reg$gene == tf.orf, ]
cand.peak <- cand$peak.pos
cand.pos <- cis.cand.reg$cis.reg$phys.pos[cis.cand.reg$cis.reg$gene == tf.orf]
We examine one particular open reading frame, YBR158W, corresponding to transcription factor, AMN1. Here is an image of the genotypes for chromosome 2 reordered by this ORF.
cand.mar <- which.min(abs(cand.pos - pull.map(yeast.orf)[[2]]))
geno.image(yeast.orf, 2, reorder = find.pheno(yeast.orf, tf.orf))
abline(v = cand.mar, lwd = 3, col = "green")
This ORF is located at 225.663cM; the nearest marker (number 132) is identified with a green line on the genotype image. Now we do a simple interval map scan, which finds the strongest evidence (LOD) for a QTL right at the gene location. The red line is the 1% permutation threshold.
yeast.orf <- calc.genoprob(yeast.orf, step = 2)
scan.orf <- scanone(yeast.orf, pheno.col = find.pheno(yeast.orf, tf.orf), method = "hk")
plot(scan.orf)
plot(scan.orf, chr = 2)
abline(v = cand.pos, lwd = 3, col = "green")
lod.thr <- c(summary(perm.orf, 0.01))
abline(h = lod.thr, lwd = 3, col = "red", lty = 2)
Now let's look at hotspots. The vignette("qtlhot", "qtlhot")
walks through a simulated set of hotspots. For the yeast data, see vignette("yeast_cmst", "qtlyeast")
. Basically, we use the saved high LOD scores (highlod.orf
) and LOD permutation threshold (lod.thr
) to find hotspot size across the genome (hotsize.orf
). The summary shows several significant hotspots, with one of the largest on chr 2, right at YBR158W. The red 5% threshold of 96 is for the number of traits with significant LOD at a position (Chaibub Neto et al. 2012).
hotsize.orf <- hotsize(highlod.orf, lod.thr = lod.thr)
hot.thr <- 96
(hot.sum <- summary(hotsize.orf, threshold = hot.thr))
## hotsize elements: chr pos max.N
## LOD threshold: 4.378
## chr pos max.N
## YBR154C_chr2@548401 2 224.1 243
## c3.loc60 3 60.0 108
## YLR257W_chr12@659357 12 323.6 123
## c14.loc242 14 242.0 360
## gOL02_chr15@170945 15 61.1 214
plot(hotsize.orf, chr = hot.sum$chr, bandcol = "gray70")
abline(h = hot.thr, lwd = 2, col = "red", lty = 2)
plot(hotsize.orf, chr = 2)
abline(v = cand.pos, lwd = 3, col = "green")
abline(h = hot.thr, lwd = 2, col = "red", lty = 2)
Find co-mapped traits and get joint-CMST p-values with Benjaminie-Hochberg adjustment (Chaibub Neto et al. 2013).
cand.tf <- cand.reg[cand.reg$gene == tf.orf, , drop = FALSE]
comap.tf <- GetCoMappingTraits(highlod.orf, cand.tf)[[1]]
The TransFac database provides transcription factor targets, which are stored in ko.list
in the qtlyeast
package. Let's find which of the co-mapping traits are targets of the transcription factor AMN1, and do causal mapping tests for them.
match.orf <- !is.na(match(comap.tf, ko.list[[tf.orf]]))
out.peak <- CMSTtests(yeast.orf, tf.orf, comap.tf[match.orf], Q.chr = 2, Q.pos = cand.peak,
method = "joint", penalty = "bic")
causal.call <- function(x) {
ordered(c("causal", "react", "indep", "corr")[apply(x, 1, which.min)], c("causal",
"react", "indep", "corr"))
}
target.peak <- data.frame(pval = apply(out.peak$pvals.j.BIC, 1, min), call = causal.call(out.peak$pvals.j.BIC))
row.names(target.peak) <- factor(comap.tf[match.orf])
table(target.peak$pval <= 0.1, target.peak$call)
##
## causal react indep corr
## FALSE 3 0 18 1
## TRUE 0 0 6 0
target.peak$gene <- yeast.annot[match(row.names(target.peak), yeast.annot$orf),
"gene"]
target.peak[order(target.peak$pval), ]
## pval call gene
## YJL078C 1.137e-06 indep PRY3
## YHR143W 9.246e-03 indep DSE2
## YER124C 7.223e-02 indep DSE1
## YNL066W 8.403e-02 indep SUN4
## YLR042C 8.722e-02 indep YLR042C
## YLR286C 9.070e-02 indep CTS1
## YLR051C 1.186e-01 indep FCF2
## YHR081W 1.545e-01 indep LRP1
## YGL029W 1.733e-01 indep CGR1
## YNL163C 2.408e-01 indep RIA1
## YER044C.A 2.555e-01 indep <NA>
## YGR272C 2.667e-01 indep YGR272C
## YMR132C 2.980e-01 indep JLP2
## YEL003W 3.151e-01 indep GIM4
## YNL327W 3.194e-01 indep EGT2
## YOR021C 3.229e-01 indep YOR021C
## YGL028C 3.919e-01 indep SCW11
## YDL048C 3.957e-01 indep STP4
## YMR030W 4.833e-01 indep RSF1
## YDR171W 5.148e-01 causal HSP42
## YGR041W 5.183e-01 indep BUD9
## YOR264W 5.772e-01 indep DSE3
## YLR285W 6.116e-01 indep NNT1
## YNR067C 6.232e-01 indep DSE4
## YMR173W 6.254e-01 causal DDR48
## YMR169C 6.522e-01 causal ALD3
## YKR083C 6.579e-01 corr DAD2
## YKL014C 6.803e-01 indep URB1
Only 6 targets had a significant causal call, and all of those were for the independent model (pleiotropy). Let's look only at the alleged targets of AMN1. We use color to identify the calls for these 28 ORFs.
plot(scan.orf, chr = 2)
for (target in row.names(target.peak)) {
tmp <- scanone(yeast.orf, pheno.col = find.pheno(yeast.orf, target), method = "hk")
if (max(tmp)$lod > lod.thr) {
col <- "gray"
if (target.peak[target, "pval"] < 0.1)
col <- c("red", "blue", "green", "gray")[unclass(target.peak[target,
"call"])]
plot(tmp, chr = 2, add = TRUE, col = col)
}
}
## Warning: Dropping 6 individuals with missing phenotypes.
## Warning: Dropping 2 individuals with missing phenotypes.
## Warning: Dropping 1 individuals with missing phenotypes.
## Warning: Dropping 7 individuals with missing phenotypes.
## Warning: Dropping 2 individuals with missing phenotypes.
## Warning: Dropping 1 individuals with missing phenotypes.
plot(scan.orf, chr = 2, add = TRUE)
abline(v = cand.pos, lwd = 3, col = "green")
abline(h = lod.thr, lwd = 3, col = "red", lty = 2)
So, it appears that the AMN1 is pleiotropic to targets with strong QTL signals.
Let's take the significant targets of AMN1 and see if we can develop a causal network.
net.set <- c(tf.orf, row.names(target.peak)[target.peak$pval <= 0.1])
library(qtlnet)
## Loading required package: igraph
## Loading required package: sem
## Loading required package: MASS
## Loading required package: matrixcalc
## Attaching package: 'matrixcalc'
## The following object(s) are masked from 'package:corpcor':
##
## is.positive.definite
## Loading required package: graph
## Attaching package: 'graph'
## The following object(s) are masked from 'package:igraph':
##
## degree, edges
## Loading required package: pcalg
## Loading required package: abind
## Loading required package: sfsmisc
The following will create MCMC samples for the QTLnet, but it takes awhile. Instead we read already created data.
amn1.qtlnet <- mcmc.qtlnet(yeast.orf, pheno.col = find.pheno(yeast.orf, net.set),
threshold = lod.thr, nSamples = 1000, thinning = 20,
random.seed = 92387475)
save(amn1.qtlnet, file = "amn1.qtlnet.RData", compress = TRUE)
load("amn1.qtlnet.RData")
amn1.graph <- igraph.qtlnet(amn1.qtlnet)
plot(amn1.graph)
This apparently did not work well, as the YBR158W is shown as directly causal to one target and reactive to two others. More work would be needed to study this more carefully.