Three-dimensional genome architecture and disease variants

I feel lucky to have run into "Three-dimensional genome architecture and emerging technologies: looping in disease" on Twitter. This review discusses the hierarchical genome architecture comprehensively and logically, and summarizes the existing techniques for probing three-dimensional genome structure: “C”-based methods, Capture Hi-C (CHi-C), DNase Hi-C, ChIA-PET/HiChIP/PLAC-seq, and CRISPR-based methods. A gigantic table reviews the advantages, limitations, relevant examples, and computational pipelines for each. More importantly, it is a resourceful paper that summarizes how researchers have associated all kinds of 3D techniques with disease-related SNPs. This gives a clear view of which 3D datasets are publicly available and how far researchers have come in building statistical models that link SNPs in non-coding regions to distal genes. I list a few of them below:

  1. Capture-based techniques (CHi-C, p-CHi-C, HiCap) provide high-resolution cis-interactome data at clinically relevant loci such as regulatory elements, single nucleotide polymorphisms (SNPs) from GWASs, TAD boundaries, or promoters. They are important tools for connecting GWAS outcomes to target genes (many loci to all loci). In other words, they can extract “promoter - other regions” interactions.

  2. Protein-centric techniques (ChIA-PET, HiChIP, PLAC-seq) study the protein-specific chromatin interactome. They are important for identifying the architectural roles of proteins in chromatin (many loci to all loci). In other words, they study “protein binding sites - other regions” interactions; the protein binding sites themselves are studied in a similar way by ChIP-seq.

  3. Hi-C maps for cortical and germinal brain regions identified increased promoter-enhancer interactions compared with other tissues. The authors found that novel human-gained enhancers showed significant overlap with lineage-specific lncRNAs and with 108 significant schizophrenia-associated variants. (I feel this paper offers excellent data for a more advanced statistical model linking Hi-C with schizophrenia GWAS results!) Paper: Chromosome conformation elucidates regulatory relationships in developing human brain.

  4. An integrative study of Hi-C maps from 21 different cell and tissue types identified frequently interacting regions (FIREs), which are usually associated with super-enhancers, typical enhancers, and similar elements (basically, biologically active regions). FIREs are tissue-specific, and distinct disease-associated FIREs further strengthen the association: Alzheimer’s SNPs were found in brain-specific FIREs, and SNPs for acute lymphoblastic leukemia were found in GM12878-specific super-FIREs. Paper: A Compendium of Chromatin Contact Maps Reveals Spatially Active Regions in the Human Genome.

  5. One model leveraged existing Hi-C data to show that variants at regulatory elements outside of LD blocks physically interact with genes, or with their enhancers, that harbor linked SNPs, and that these interactions affect gene expression and disease risk. Paper: Modeling disease risk through analysis of physical interactions between genetic variants within chromatin regulatory circuitry.

  6. Another study generated extensive catalogues of promoter-interacting regions (PIRs), the distal genomic regions that interact with promoters, in 17 primary hematopoietic cell types, linking 2500 novel SNPs to putative disease-associated genes related to blood and autoimmune disorders. Paper: Lineage-Specific Genome Architecture Links Enhancers and Non-coding Disease Variants to Target Gene Promoters.

  7. This one is not in the review paper, but it certainly offers good Hi-C and other sequencing datasets for cardiac myocytes: High Resolution Mapping of Chromatin Conformation in Cardiac Myocytes Reveals Structural Remodeling of the Epigenome in Heart Failure.
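
The basic idea behind items 5 and 6, assigning a non-coding SNP to the gene whose promoter contacts the region containing it, can be illustrated with a small interval-overlap sketch. This is only a toy illustration under my own assumptions, not the method of any of the papers above: the `Interaction` and `Snp` records, the function `link_snps_to_genes`, and all gene names, SNP IDs, and coordinates are hypothetical, and real analyses add statistical significance calls for the interactions, LD information, and expression data.

```python
from collections import defaultdict
from typing import NamedTuple


class Interaction(NamedTuple):
    """One promoter-capture contact (hypothetical record layout)."""
    gene: str    # gene whose promoter was the captured bait
    chrom: str   # chromosome of the promoter-interacting region (PIR)
    start: int   # PIR start, 0-based inclusive
    end: int     # PIR end, exclusive


class Snp(NamedTuple):
    """One GWAS variant (hypothetical record layout)."""
    snp_id: str
    chrom: str
    pos: int     # 0-based position


def link_snps_to_genes(snps, interactions):
    """Map each SNP ID to the sorted genes whose PIR contains the SNP."""
    # Group PIRs by chromosome so each SNP is only compared to nearby candidates.
    pirs_by_chrom = defaultdict(list)
    for interaction in interactions:
        pirs_by_chrom[interaction.chrom].append(interaction)

    links = {}
    for snp in snps:
        genes = {
            pir.gene
            for pir in pirs_by_chrom.get(snp.chrom, [])
            if pir.start <= snp.pos < pir.end
        }
        if genes:  # SNPs outside every PIR are simply left unlinked
            links[snp.snp_id] = sorted(genes)
    return links


# Toy data: all names and coordinates are made up for illustration.
interactions = [
    Interaction("GENE_A", "chr1", 1000, 2000),
    Interaction("GENE_B", "chr1", 1500, 2500),
    Interaction("GENE_C", "chr2", 100, 200),
]
snps = [
    Snp("snp1", "chr1", 1600),  # falls in the PIRs of GENE_A and GENE_B
    Snp("snp2", "chr1", 2200),  # falls in the PIR of GENE_B only
    Snp("snp3", "chr2", 500),   # falls in no PIR
]

links = link_snps_to_genes(snps, interactions)
print(links)
```

The linear scan per chromosome keeps the sketch short; for genome-scale catalogues an interval tree or a sorted sweep would be the more realistic choice.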

Overall, this review is a really good starting point for thinking about how to model the impact of distal non-coding regulatory variants on genes using 3D genome technologies.

License

Copyright 2017-present Ye Zheng.

Released under the MIT license.

Ye Zheng
Ph.D. Candidate in Statistics