OCT 01, 2021 12:00 AM PDT

Making the Most of Your NGS Data: Understanding Metrics for Target-enriched NGS

SPONSORED BY: Roche Sequencing

Introduction

Targeted next-generation sequencing (NGS) is often performed using hybridization-based target enrichment, which deploys oligonucleotide probes to capture regions of interest for downstream sequencing. Although targeted sequencing reduces sequencing costs, each run is still time-consuming and expensive, so understanding key sequencing metrics can help you maximize the value of each run.

Beyond common metrics (e.g., base quality, cluster density, number of reads passing filter), several additional metrics provide more in-depth insight into the success of a sequencing run. Depth of coverage (the number of times that a particular base within the target region is represented in the sequence data) and on-target rate (the percentage of sequenced bases that map to the target region) are fairly intuitive concepts. Also intuitive is the duplication rate for a sequencing run, which reflects the percentage of duplicate reads (reads that map to exactly the same location, including the coordinates of both the 5’ and 3’ ends) out of the total mapped reads. This article focuses on two less intuitive metrics, GC bias and Fold-80 base penalty, and offers some tips on how to improve them.
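As a rough illustration of the three intuitive metrics, the Python sketch below computes them from toy inputs. The function names and input shapes (a list of per-base depths, base counts, and read coordinate tuples) are assumptions made for this example; real pipelines derive these values from aligned reads with dedicated tools.

```python
def mean_depth(per_base_depth):
    """Mean depth of coverage across the target bases."""
    return sum(per_base_depth) / len(per_base_depth)


def on_target_rate(on_target_bases, total_mapped_bases):
    """Percentage of mapped bases that fall within the target region."""
    return 100.0 * on_target_bases / total_mapped_bases


def duplication_rate(read_positions):
    """Percentage of mapped reads that duplicate an earlier read.

    read_positions is a list of (chrom, start, end) tuples, one per
    mapped read (a hypothetical input format). Reads with identical
    coordinates at both ends are counted as duplicates of the first.
    """
    seen = set()
    duplicates = 0
    for position in read_positions:
        if position in seen:
            duplicates += 1
        else:
            seen.add(position)
    return 100.0 * duplicates / len(read_positions)


# Toy data, invented for illustration only.
depths = [30, 40, 50, 40]
reads = [
    ("chr1", 100, 250),
    ("chr1", 100, 250),  # same start and end as the read above
    ("chr1", 120, 270),
    ("chr2", 5, 155),
]

print(mean_depth(depths))         # 40.0
print(on_target_rate(750, 1000))  # 75.0
print(duplication_rate(reads))    # 25.0
```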

GC Bias

GC content, the proportion of bases in a region that are G or C, is uneven across genomes, which contain both AT-rich and GC-rich regions. During sequencing, regions of high or low GC content are often sequenced unevenly, leaving them over- or under-represented in the data; this is known as GC bias. GC bias in sequencing data across regions of variable GC content can be visualized in GC-bias distribution plots (Figure 1).
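One way to picture the data behind a GC-bias plot is to group genomic windows by GC content and compare the mean coverage in each group with the overall mean. The sketch below does this for toy data; the window format, the ten-bin scheme, and the function names are assumptions for illustration, not the method used to produce Figure 1.

```python
from collections import defaultdict


def gc_count(seq):
    """Number of G or C bases in a sequence (case-insensitive)."""
    return sum(1 for base in seq.upper() if base in "GC")


def normalized_coverage_by_gc(windows, n_bins=10):
    """Mean coverage per GC bin, divided by the overall mean coverage.

    windows is a list of (sequence, mean_coverage) pairs with non-empty
    sequences (a hypothetical input format). Keys of the result are the
    lower bound of each GC bin in percent. Values near 1.0 suggest
    little GC bias; values far from 1.0 suggest over- or
    under-representation of that GC range.
    """
    overall_mean = sum(cov for _, cov in windows) / len(windows)
    bins = defaultdict(list)
    for seq, cov in windows:
        # Integer arithmetic avoids floating-point edge cases at bin borders.
        index = min(gc_count(seq) * n_bins // len(seq), n_bins - 1)
        bins[index].append(cov)
    return {
        index * 100 // n_bins: (sum(covs) / len(covs)) / overall_mean
        for index, covs in sorted(bins.items())
    }


# Toy windows, invented for illustration only.
windows = [
    ("ATATATATAT", 20),  # 0% GC, low coverage
    ("ATGCATGCAT", 50),  # 40% GC, high coverage
    ("GCGCGCGCGC", 20),  # 100% GC, low coverage
]

for gc_percent, ratio in normalized_coverage_by_gc(windows).items():
    print(f"GC >= {gc_percent}%: normalized coverage {ratio:.2f}")
```

In this toy example the extreme-GC windows fall well below 1.0 while the mid-GC window sits above it, which is the shape a biased run would tend to show.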

High levels of GC bias can be introduced during library preparation (especially in workflows dependent on PCR), during hybrid capture, or during the sequencing run itself.  This bias increases the amount of sequencing that must be performed, driving up expense; thus, it is important to choose a library preparation kit that minimizes GC bias.

Fold-80 Base Penalty

Analysis of sequencing data typically reveals that some target regions have achieved higher coverage than others. The Fold-80 base penalty metric is one way to assess coverage uniformity. Once the mean target coverage is determined for an experiment, the Fold-80 base penalty describes how much more sequencing is required to bring 80% of the target bases to that mean coverage. Thus, a run with perfect coverage uniformity would have a Fold-80 base penalty of 1 (see Figure 2). Values greater than 1 indicate less uniform coverage. For instance, a Fold-80 base penalty of 2 means that twice as much (2-fold) sequencing is required for 80% of the target bases to reach the mean coverage.
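The definition above can be turned into a small calculation. A widely used formulation, assumed here rather than stated in this article, divides the mean target coverage by the coverage at the 20th percentile, that is, the depth that 80% of target bases meet or exceed. The Python sketch below applies that formulation to toy per-base depths; the function name and input format are hypothetical.

```python
def fold_80_base_penalty(per_base_depth):
    """Mean target coverage divided by the 20th-percentile coverage.

    per_base_depth is a list with one depth value per target base
    (a hypothetical input format). The 20th-percentile depth is the
    value that at least 80% of target bases meet or exceed.
    """
    depths = sorted(per_base_depth)
    mean = sum(depths) / len(depths)
    # Index len // 5 leaves at least 80% of bases at or above this depth.
    p20 = depths[len(depths) // 5]
    if p20 == 0:
        # More than 20% of bases have no coverage; no finite amount of
        # extra sequencing scales them up to the mean.
        return float("inf")
    return mean / p20


# Toy data, invented for illustration only.
uniform = [100] * 10
uneven = [20, 40, 50, 50, 100, 100, 100, 140, 200, 200]

print(fold_80_base_penalty(uniform))  # 1.0
print(fold_80_base_penalty(uneven))   # 2.0
```

In the uneven example the mean depth is 100 but the 20th-percentile depth is only 50, so roughly twice as much sequencing would be needed to lift 80% of the target bases to the current mean.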

The Fold-80 base penalty provides information about the capture efficiency of the probes in the panel, which is impacted by both probe design and probe quality. To decrease the Fold-80 base penalty and reduce the need for additional, costly sequencing runs, use high-quality, well-designed probes.

Understanding sequencing metrics can help you to get the most out of valuable sequencing resources, including time, money, and precious samples. To watch short videos about the five metrics mentioned here, and to learn about other aspects of NGS, visit: https://go.roche.com/Targeted-NGS-Metrics

About the Sponsor
  • At Roche Sequencing, we are building on Roche's legacy of innovation to transform NGS and its application. By simplifying workflows & expanding assay menus, we are broadening access to genomic data & lowering barriers to routine use. Our growing suite of products spans the genomics workflow, from sample acquisition & preparation through data analysis and final result, helping you answer important questions in genetics, cancer & beyond.