MAY 12, 2016 09:00 AM PDT

Keynote: Planet Scale Analysis with Millions of Genomes

  • Director, Informatics, Harvard Personal Genome Project, Chief Scientist and Co-founder, Curoverse
      Alexander (Sasha) Wait Zaranek, PhD is co-founder and Chief Scientist at Curoverse, a venture-backed company focused on building a free and open-source platform for storing, analyzing and sharing biomedical data. Sasha works on open technologies that are part of the revolution that has reduced human DNA sequencing costs a million-fold since the completion of the Human Genome Project. A current research focus is the development of clinical-quality applications for processing massive data sets spanning millions of individuals across collaborating organizations, eventually encompassing exabytes of data.
      His contributions have led to highly cited publications in Science, Nature, the Lancet and other leading scientific journals. Sasha is also a co-founder and Director of Informatics at the Harvard Personal Genome Project.


    As millions of people all over the world get their genomes sequenced, physicians and researchers, as well as the individuals themselves, will want to ask questions of these data. To ask questions at planet scale, however, we must re-imagine the computational and storage infrastructure required. We also need a consistent naming scheme for parts of the genome. To address this, we invented tiling, a technique that divides the genome into about 10 million overlapping, variable-length sequences, or “tiles”, each with a unique 24-base tag at each end. We use examples from public data to show that tiling supports simple and consistent names, annotation, queries, machine learning, and clinical screening. Someday soon the general public may get to know the tiles in their own genomes, while researchers and doctors may use the information to realize precision medicine.
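
    To make the idea of overlapping, tag-delimited tiles concrete, here is a minimal Python sketch. It is an illustration only and not the speaker's implementation. The function names (`split_into_tiles`, `tile_name`), the assumption that tag positions are already known, and the "position plus MD5 digest" naming scheme are all hypothetical choices made for this example. Only the 24-base tag length comes from the abstract.

```python
import hashlib

TAG_LEN = 24  # tag length stated in the abstract


def split_into_tiles(sequence, tag_starts, tag_len=TAG_LEN):
    """Cut `sequence` into overlapping, variable-length tiles.

    `tag_starts` lists the start offset of each tag, in order (assumed to be
    known in advance). Each tile runs from the start of one tag to the end of
    the next tag, so neighbouring tiles overlap by exactly one tag.
    """
    tiles = []
    for left, right in zip(tag_starts, tag_starts[1:]):
        tiles.append(sequence[left:right + tag_len])
    return tiles


def tile_name(position, tile_sequence):
    """Hypothetical name: hex tile position plus a short content digest."""
    digest = hashlib.md5(tile_sequence.encode("ascii")).hexdigest()[:8]
    return f"{position:07x}.{digest}"


if __name__ == "__main__":
    # Toy data with 4-base tags so the overlap is easy to see by eye.
    toy = "AAAA" + "CCCCCC" + "GGGG" + "TTT" + "ACGT"
    for i, tile in enumerate(split_into_tiles(toy, [0, 10, 17], tag_len=4)):
        print(tile_name(i, tile), tile)
```

    In this sketch, two people who carry the same sequence at a given tile position would get the same name for that tile, which is the kind of simple, consistent naming the abstract describes. How tags are chosen and how variants are handled in practice is outside what this example shows.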

    • Learn about free and open-source software (FOSS), open standards, and open data that can help you build the necessary informatics infrastructure
    • Use tiling to facilitate planet-scale analyses of genomic data
