MAY 10, 2018 10:30 AM PDT

Excavating the Deep Genome: Deciphering Structural Variation in Complex and Repetitive Regions

C.E. CREDITS: CEU | P.A.C.E. CE | Florida CE
  • Assistant Professor, Department of Computational Medicine & Bioinformatics, Assistant Professor, Department of Human Genetics, University of Michigan
      Ryan Mills is an Assistant Professor in Computational Medicine & Bioinformatics and Human Genetics at the University of Michigan. After receiving his PhD in 2006 at Georgia Tech, he worked as an NRSA Postdoctoral Fellow at Emory University where he helped produce some of the first published genome-wide maps of insertion/deletion (indel) variation in human populations and develop technologies to derive their genotypes using microarrays. As a Research Associate at Harvard Medical School, he expanded the scope of his work into the mapping of larger structural and copy number variation as part of the 1000 Genomes and other projects. His current research is focused on developing methods for the identification, resolution and analysis of complex genomic rearrangements consisting of multiple breakpoints that are the result of overlapping or co-occurring structural changes to the genome.

    Structural variants (SVs), defined as rearrangements of genomic sequences, are both a major source of genetic diversity in human populations and directly responsible for the pathogenesis of numerous diseases. Many studies have been conducted in the past decade to discover and analyze SVs; however, these have predominantly relied on short-read sequencing data to infer the presence and structure of these variants. A primary limitation shared by all these approaches is their inability to access more complex regions of the genome, particularly repetitive sequences, which preclude the accurate alignment of shorter reads. Recent studies using long-read sequencing technologies have suggested that the genomics community is thus routinely missing tens of thousands of SVs. Here, I outline several efforts to leverage these longer sequences to identify and assess structural variants in previously interrogated, well-characterized genomes. I will first describe a new method, PALMER, that uses a pre-masking strategy to identify nested retrotransposition events that have inserted into existing repetitive sequences. Our early results suggest that as many as 40% of mobile element insertions fall in such regions, and thus their discovery will impact ongoing estimates of mobile element insertion rates. Next, I will present a strategy we recently developed for the high-throughput assessment and validation of SVs using recurrence analysis of long reads. This has enabled the proper interpretation of many false-positive or mis-annotated variants called by SV detection algorithms, particularly around complex and repetitive sequences, and has provided additional support for previously tenuous predictions. We believe these tools will aid the genomics community in better deciphering chromosomal structural rearrangements and in furthering our understanding of their mechanistic origins and functional impact.
