MAY 13, 2015 10:30 AM PDT
Population Scale Human Genome Analysis on the Cloud
Presented at the Genetics and Genomics Virtual Event

Speakers:
  • Peter White, Co-Founder, Chief Scientific Advisor, GenomeNext LLC; Assistant Professor of Pediatrics, Nationwide Children's Hospital
  • James Hirmas, Co-Founder, CEO, GenomeNext LLC
    Biography
      DR. PETER WHITE
      CHIEF SCIENTIFIC ADVISOR

      Dr. Peter White is the developer and inventor of the "Churchill" platform, and serves as GenomeNext's principal genomic scientist and technical advisor.

      He is a principal investigator in the Center for Microbial Pathogenesis at The Research Institute at Nationwide Children's Hospital and an Assistant Professor of Pediatrics at The Ohio State University. He is also Director of the Biomedical Genomics Core, a nationally recognized microarray and next-generation sequencing facility assisting numerous investigators in the design, production and analysis of genomics data. He is also Director of Molecular Bioinformatics, serving on the Research Computing Executive Governance Committee and the Data and Analytics Strategy Executive Committee. Dr. White has established multiple genomics initiatives as part of Nationwide Children's strategic goal to develop a cutting edge genomic medicine program. As developer and inventor of the balanced parallelization strategy for human genome analysis (named "Churchill"), Dr. White is a co-founder of GenomeNext LLC.

      Dr. White's research program at Nationwide Children's Hospital focuses on developing high performance computing solutions for "big data", utilizing disruptive technologies to rapidly analyze and interpret genomic data sets. Through genomic analysis of individuals, families and populations, his team is discovering genetic variation associated with diseases such as congenital heart defects, autism spectrum disorders and rare genetic diseases.

      Dr. White received his PhD in Molecular Biology from the University of Cambridge, England and completed his postdoctoral training in the Department of Genetics at The University of Pennsylvania, Philadelphia. He has over 15 years of experience in the field of genomics and computational biology, is the recipient of multiple awards from the National Institutes of Health and has authored over 50 peer reviewed publications.

      JAMES HIRMAS
      CHIEF EXECUTIVE OFFICER

      Serving as the Chief Executive Officer of GenomeNext, James Hirmas has a passion for building companies to solve some of the world's greatest challenges through the use of disruptive technology. Throughout his career, James has facilitated technological advances in commercial enterprises and has worked extensively with the US Federal Government to solve information technology and business challenges with the use of cutting edge technologies, such as cloud computing.

      In 2009, James led the redesign and implementation of www.recovery.gov, the first Federal public-facing website built on Amazon Web Services. In 2010, he co-founded and became the CEO of JHC Technology, a company dedicated to helping large commercial enterprises, non-profits and public sector organizations transition datacenter operations, IT strategy, and software development to the Cloud. While James remains a member of the JHC Technology Board of Directors, he stepped down as CEO of JHC Technology in March 2014 in order to launch GenomeNext.

      James hopes that, by bringing together cloud computing advancements, cutting edge genomic algorithms, and an expert team of passionate software developers, cloud computing engineers, and bioinformatics researchers, GenomeNext will propel genomics into an era of unparalleled discovery and innovation.

      James is a retired, service-disabled veteran of the US Armed Forces (US Army, 2000 - 2004) and currently serves as a member of the DC Chapter of the Service Disabled, Veteran-Owned Small Business Council.

    Abstract:
    Advanced sequencing technologies have made population scale whole genome sequencing a possibility. However, current strategies for analyzing these data rely upon parallelization approaches that have limited scalability, lack reproducibility and are complex to implement, requiring substantial investment in specialized IT solutions. To overcome these challenges, our goal was to develop a platform that fully automates all the components needed to perform both single sample and large-scale genomic data analysis. We developed a highly accurate and deterministic analysis solution, named Churchill, which fully automates the complex and computationally intensive steps of alignment, post-alignment processing and genotyping. Our parallelization strategy divides each analysis step across multiple compute instances, allowing whole genome analysis to be completed in under 90 minutes. In addition to rapid single sample analysis, Churchill optimizes utilization of available compute resources and scales in a near-linear fashion. Utilizing Amazon Web Services (AWS) cloud computing resources, we developed a platform that enables population scale genome analysis to be performed. To demonstrate this, we analyzed the 1000 Genomes Project dataset of 2,504 whole genome and exome sequenced individuals. Starting from FASTQ raw input data, we were able to fully automate the analysis process, ultimately performing multi-sample variant calling and generating population allele frequencies in seven days. Our approach demonstrates the feasibility of generating population allele frequencies specific to a given unified analysis approach, which is critical for accurately filtering datasets for discovery of rare pathogenic variants. Moreover, through use of on-demand cloud computing resources, our method represents a solution for the genomics computational bottleneck and will keep pace with the magnitude of data generated by population scale sequencing.

    Learning Objectives:
      1. Understanding the steps required to analyze human genome sequencing data, for both single sample analysis and large scale genomic studies
      2. Optimizing compute resources and leveraging cloud computing to resolve the bioinformatics bottleneck
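
The abstract's final step, generating population allele frequencies and using them to filter for rare variants, can be illustrated with a minimal Python sketch. This is not Churchill's implementation, which the abstract does not describe at this level. The function names, the genotype encoding (VCF-style strings such as "0/1", with "." for a missing allele), the variant identifiers and the 1% threshold are all assumptions made for illustration.

```python
def allele_frequencies(genotypes):
    """Return the alternate allele frequency for each variant.

    `genotypes` maps a variant identifier to a list of per-sample
    genotype strings such as "0/0", "0/1", "1|1" or "./." (hypothetical
    input format, modeled on the VCF GT field). Any non-reference allele
    is counted as alternate; missing alleles are ignored.
    """
    freqs = {}
    for variant, calls in genotypes.items():
        alt_count = 0
        called = 0
        for call in calls:
            # Treat phased ("|") and unphased ("/") genotypes the same way.
            for allele in call.replace("|", "/").split("/"):
                if allele == ".":
                    continue  # missing call: excluded from the denominator
                called += 1
                if allele != "0":
                    alt_count += 1
        # None marks a variant with no called alleles at all.
        freqs[variant] = alt_count / called if called else None
    return freqs


def rare_variants(freqs, threshold=0.01):
    """Return variant identifiers whose frequency is below `threshold`."""
    return sorted(v for v, f in freqs.items() if f is not None and f < threshold)


if __name__ == "__main__":
    # Toy cohort of four samples; identifiers and calls are made up.
    cohort = {
        "chr1:1000:A>G": ["0/0", "0/1", "1/1", "./."],
        "chr2:2000:C>T": ["0/0", "0/0", "0/0", "0/1"],
    }
    freqs = allele_frequencies(cohort)
    print(freqs)
    print(rare_variants(freqs, threshold=0.2))
```

Because the frequencies come from the same cohort and the same calling procedure as the sample being filtered, systematic artifacts of that procedure tend to appear as common rather than rare, which is the motivation the abstract gives for computing them under a unified analysis approach.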
