The Genome in a Bottle Consortium has published benchmarks for variant calling, but some challenging medically relevant genes have been partially or fully excluded because of mapping difficulties, structural variation, and problems in the reference genomes. Here, we use a trio-based, long-read, phased diploid assembly to form phased small variant and structural variant benchmarks for 273 of the 396 autosomal genes that are less than 90% covered by mapping-based benchmarks. We curated more than 1000 variants to exclude errors in the assembly, mostly in homopolymers and highly homozygous regions, and to ensure that the benchmark accurately identifies false positives and false negatives across call sets. The new benchmark for challenging medically relevant genes should improve the characterization of difficult variants and, in turn, support more reliable variant detection in clinical genomics.
1. Learn about applications of long-read sequencing technologies for variant detection in medically relevant genes
2. Understand how Genome in a Bottle benchmarks are used and how a benchmark focused on medically relevant genes improves clinical application