It’s easy to be overwhelmed by the number of workflows that are available for NGS. How do you choose? While cost and turnaround time are important considerations, the most important question is: What are you trying to learn from your samples? Here, we provide an overview of three general categories of NGS workflows (whole-genome sequencing (WGS), whole-exome sequencing (WES), and target enrichment using custom panels) and the types of information that each provides.
WGS, as its name implies, yields sequence data across the entire genome, which, for humans, contains ~3.1 billion base pairs. This includes (a) protein-coding regions, comprising ~1–3% of the genome (the exome), and (b) the rest of the genome, including intergenic and regulatory regions, intron sequences, and regions corresponding to noncoding RNAs.
WGS can identify genetic variants in coding and noncoding regions, including insertions and deletions (indels), chromosomal rearrangements, copy number variations (CNVs), and single-nucleotide polymorphisms (SNPs). WGS also provides the basis for genome-wide association studies (GWAS), used for finding associations between genetic variants and particular traits or diseases. Unlike many targeted approaches, WGS can also identify variants that change mRNA splicing patterns or that impact the expression of noncoding RNAs, many of which have critical cellular functions.
While WGS is the most comprehensive and least biased of the workflows discussed here, it requires the greatest amount of sequencing (which is expensive) and the most intensive analysis (which is time-consuming and requires specialized expertise). In addition, because the reads from WGS are spread across the entire genome, a typical WGS run does not achieve the sequencing depth at any given position that is required to answer questions about somatic mutations or rare variants.
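To illustrate why depth matters for low-frequency variants, the sketch below estimates the chance of observing a variant in at least a minimum number of reads at a given depth. It is a simplified model, not a variant caller: it assumes reads are sampled independently, ignores sequencing errors, and uses a hypothetical function name, allele fraction (5%), and read threshold (3) chosen only for illustration.

```python
from math import comb


def prob_at_least_k_alt_reads(depth: int, allele_fraction: float, k: int) -> float:
    """Return P(X >= k) for X ~ Binomial(depth, allele_fraction).

    depth: number of reads covering the position
    allele_fraction: fraction of molecules carrying the variant (assumed value)
    k: minimum number of variant-supporting reads required (assumed threshold)
    """
    # Sum the probabilities of seeing fewer than k variant reads...
    p_fewer = sum(
        comb(depth, i) * allele_fraction**i * (1 - allele_fraction) ** (depth - i)
        for i in range(k)
    )
    # ...and take the complement.
    return 1.0 - p_fewer


if __name__ == "__main__":
    # Illustrative depths only; real projects set these from their own requirements.
    for depth in (30, 100, 500):
        p = prob_at_least_k_alt_reads(depth, allele_fraction=0.05, k=3)
        print(f"{depth}x depth: P(>= 3 variant reads) = {p:.3f}")
```

Under these assumptions, the probability of seeing the variant rises steeply with depth, which is why questions about somatic or low-frequency variants often favor deeper, more focused sequencing.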
WES is a more focused method that primarily captures regions that encode proteins, though some WES panels also include regulatory regions. This selection process is called target enrichment, and is typically carried out during sample preparation using biotinylated probes complementary to the desired sequences (an exome panel, for WES).
Because the majority of known disease-associated variants occur within coding regions, WES is often used to detect these mutations. Compared to WGS, WES greatly reduces sequencing costs by focusing on only ~1–5% of the genome, depending on which regions the exome panel includes. Thus, for researchers whose focus is protein-coding regions, WES offers a cost-effective alternative to WGS by enabling deeper sequence coverage of target regions and simplifying data analysis.
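The trade-off between target size and depth can be made concrete with a back-of-the-envelope calculation: mean depth is roughly the number of usable sequenced bases divided by the size of the region they are spread over. The sketch below assumes a human genome of about 3.1 billion base pairs; the exome fraction (2%), sequencing yield (100 Gb), and on-target fraction (70%) are hypothetical values chosen for illustration, not figures for any particular panel or instrument.

```python
def mean_depth(total_bases: float, target_size_bp: float,
               on_target_fraction: float = 1.0) -> float:
    """Approximate mean depth: usable sequenced bases divided by target size."""
    return total_bases * on_target_fraction / target_size_bp


# All values below are illustrative assumptions.
GENOME_BP = 3.1e9        # approximate human genome size
EXOME_FRACTION = 0.02    # assumed share of the genome covered by an exome panel
YIELD_BASES = 100e9      # assumed sequencing output for one sample
ON_TARGET = 0.7          # assumed fraction of reads that land on the panel

wgs_depth = mean_depth(YIELD_BASES, GENOME_BP)
wes_depth = mean_depth(YIELD_BASES, GENOME_BP * EXOME_FRACTION, ON_TARGET)

if __name__ == "__main__":
    print(f"WGS mean depth: {wgs_depth:.0f}x")
    print(f"WES mean depth: {wes_depth:.0f}x")
```

This estimate describes the average only; actual depth varies from region to region, so per-target coverage should still be checked in real data.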
When choosing between WGS and WES, it is important to remember that (1) WES data includes only genome regions covered by the exome panel, thus limiting the discovery of variants in other regions and the detection of structural variants, and (2) even the best target enrichment workflows are prone to some degree of target dropout and coverage bias, especially in GC- or AT-rich regions—although well-designed, high-quality probe panels can greatly reduce these issues.
Targeted and custom-designed probe panels offer an efficient way to gain genetic insights into specific biological processes or disorders. These panels cover highly curated sets of genomic regions, and can be designed to include both coding and noncoding regions and/or genes associated with particular conditions, such as specific cancers, metabolic disorders, and neurological or cardiac disorders. Custom panels can also target the genomes of other organisms; for example, targeted NGS of bacteria, fungi, and viruses enables the detection of these pathogens in complex samples where most of the nucleic acid is from the host.
Although these smaller panels ensure that most of the sequencing resources are devoted to the specific research question, their success depends on many factors, including expert probe design and high-quality probes, as noted above for WES.
In the end, the research question itself is the best guide for choosing the NGS workflow.