APR 20, 2021 7:30 AM PDT

Keynote Presentation: A structural variation map of 499 Han Chinese individuals using long-read sequencing data

Structural variants (SVs) are essential in human evolution and genetic disease but remain understudied, especially in non-Caucasian ethnicities. We report, for the first time, a comprehensive study of SVs across the genomes of 499 Han Chinese individuals, representing the largest ethnic group, using ~15x Oxford Nanopore sequencing. We identify a total of 81,752 SVs, of which 46.86% are novel when compared to the SV calls in dbVar, and many can be successfully genotyped in short-read data. Using these calls, we identified and compared Han Chinese-specific SV hotspots across the genome and identified novel sequences that can be placed in the human genome. We found that SV hotspots affected not only the well-known amylase locus but also seven carbohydrate metabolic processes in the Han Chinese genomes. Further, we uncovered 355 previously unreported natural knockout genes in the Han Chinese genomes. These data provide the first sequence map of structural variation in Han Chinese and are a valuable resource for both the population genetics and medical communities.

Learning Objectives:

1. Attendees will learn about the variability and complexity of SVs across the Han Chinese population and their implications for phenotypes

2. Attendees will learn what is missing from GRCh38/GRCh37 and why ethnicity-specific reference genomes and pangenomes are necessary

3. We will review approaches and sequencing technologies for scaling studies up to hundreds or thousands of long-read genomes

