Even the most straightforward next-generation sequencing (NGS) experiments can be expensive and time-consuming. Thus, it’s a good idea to test your workflow on a few samples to make sure you’ll get the kind of results you need to answer your questions with confidence—before diving into a large-scale experiment that consumes precious samples, time, and expensive reagents. These initial experiments are often called proof-of-principle (PoP), proof-of-concept, or feasibility studies. Here, we offer some tips for whole-genome sequencing and target-enriched DNA sequencing; for RNA-seq, see Simplifying the options for RNA-seq.
1- Know your definition of success.
What do you need to learn from your samples? If you're looking for germline (inherited) variants in human genomic DNA, which are typically present at 50% (heterozygous) or 100% (homozygous) variant allele frequency (VAF), relatively shallow sequencing (20-40X coverage depth) is usually sufficient. However, for rare (low-VAF) variants, like somatic mutations in tumor tissues, deeper sequencing (100X to 1000X or more) is needed. This usually involves target enrichment using probe-based or amplicon-based methods to focus sequencing reads on specific genomic regions, providing more sensitive results for rare variant detection. Your expected VAF helps determine the required sequencing depth and whether to use target enrichment. See also: Choosing an NGS workflow.
For non-human genomes, genome size and complexity (such as GC content, repetitive regions, and chromosome structure) will affect the coverage required, so it's important to consult similar studies.
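To see why expected VAF drives the choice of depth, it can help to run a rough calculation. The sketch below treats each read covering a site as an independent draw that carries the variant with probability equal to the VAF (a simple binomial model). It ignores sequencing errors, uneven coverage, and duplicate reads, and both the function name and the default minimum of three supporting reads are illustrative assumptions, not the rule of any particular variant caller.

```python
from math import comb


def detection_probability(depth: int, vaf: float, min_alt_reads: int = 3) -> float:
    """Probability of observing at least `min_alt_reads` variant-supporting reads.

    Assumes a binomial model: `depth` independent reads, each carrying the
    variant with probability `vaf`. `min_alt_reads` is a hypothetical
    calling threshold chosen for illustration only.
    """
    # Probability of seeing fewer than the required number of variant reads
    p_too_few = sum(
        comb(depth, k) * vaf**k * (1 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_too_few


if __name__ == "__main__":
    # Compare a heterozygous germline variant with a 1% somatic variant
    for depth, vaf in [(30, 0.5), (30, 0.01), (1000, 0.01)]:
        p = detection_probability(depth, vaf)
        print(f"depth={depth:>4}X  VAF={vaf:.0%}  P(detect)={p:.4f}")
```

Under these assumptions, a 50% VAF variant is almost certain to be seen at 30X, while a 1% VAF variant is very unlikely to be seen at 30X and only becomes reliably detectable at much greater depth. Real detection limits also depend on error rates and the variant caller, so treat this as a planning aid only.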
2- Conduct your PoP study on the same type of samples you will use later.
The DNA extracted from different sample types, or with different extraction methods, can vary greatly in terms of concentration, quality, and contaminant levels—and this can impact the success of NGS library preparation. For example, DNA from fresh-frozen tissue is generally higher quality than from formalin-fixed paraffin-embedded tissue (FFPET)—especially older samples, which can be degraded or crosslinked. Cell-free DNA (cfDNA) from blood or serum is often fragmented, allowing you to skip the initial "DNA fragmentation" step in many library prep workflows. Conducting a PoP study enables you to troubleshoot your workflows to minimize the waste of actual samples.
3- Choose the right library preparation workflow for your samples and goals.
DNA quantity and quality are crucial for selecting a library prep workflow, so be sure to perform appropriate quality checks to ensure you're using the best method for your samples. Also, verify workflow compatibility with available equipment, as some require specialized systems (e.g., Covaris sonication).
The choice of library prep workflow, including whether it uses PCR amplification, significantly impacts final sequencing data quality, rare variant detection, and uniformity and depth of coverage. Workflows vary widely.
Review each protocol for areas requiring optimization, such as fragmentation and amplification cycles. Also ensure that the predicted yield meets your needs; for example, much greater library yield is required for probe-based target enrichment vs immediate sequencing, and sequencing requirements may vary based on fragment size and expected allele frequency. In addition, reference standards can be useful for testing custom panels and as controls in actual experiments to detect artifacts arising from library prep, target enrichment, or sequencing.
4- Determine how many samples, replicates, and sequencing reads you need.
Involving a bioinformatics team with a firm grasp of statistics during the planning stages will help ensure that your results have sufficient statistical power. This keeps you from discovering, at the end of a long experiment, that you lacked enough replicates (or the right kinds of replicates) to draw confident conclusions. It’s also important to know what your sequencing parameters will be: what sequencing platform are you using, how much final library do you need per sample, will your samples be multiplexed, and what read length do you need? For more about bioinformatics, see Bioinformatics in NGS (webinar) and An Overview of Bioinformatics.
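A back-of-the-envelope estimate of read requirements can be a useful starting point for that planning conversation. The sketch below uses the standard relationship mean coverage = (number of reads × read length) / target size, and then estimates how many samples could share one run. The function names, the `usable_fraction` parameter (a stand-in for losses to duplicates, off-target reads, and filtering), and the example run output are all hypothetical; substitute the figures for your own platform and panel.

```python
import math


def reads_needed(
    target_size_bp: int,
    mean_depth: float,
    read_length_bp: int,
    paired_end: bool = True,
    usable_fraction: float = 1.0,
) -> int:
    """Reads (or read pairs, if paired_end) needed to reach a mean depth.

    `usable_fraction` is an assumed share of sequenced bases that end up
    as useful coverage; 1.0 means no losses are modeled.
    """
    bases_needed = target_size_bp * mean_depth / usable_fraction
    bases_per_unit = read_length_bp * (2 if paired_end else 1)
    return math.ceil(bases_needed / bases_per_unit)


def samples_per_run(run_output_units: int, units_per_sample: int) -> int:
    """Whole number of samples that fit in one run at the required depth."""
    return run_output_units // units_per_sample


if __name__ == "__main__":
    # Roughly human-sized genome (~3.1 Gb), 30X, 2 x 150 bp reads
    pairs = reads_needed(3_100_000_000, 30, 150)
    print(f"Read pairs per sample: {pairs:,}")
    # Hypothetical run producing 1 billion read pairs
    print(f"Samples per run: {samples_per_run(1_000_000_000, pairs)}")
```

This gives only the mean depth. Coverage uniformity and replicate numbers still need input from a statistician or bioinformatician.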
5- Know what milestones you need to meet along the path from sample to data.
First and foremost, checking the quantity and quality of the DNA prior to library prep can make the difference between achieving high-quality, sequencing-ready libraries or needing to start over. It’s also important to check the yield of your library (using a qPCR-based or dPCR-based method for accuracy) and the size distribution (using an automated electrophoresis instrument). This is critical whether (1) the library is headed directly for the sequencer (to avoid over- or under-loading, and to enable accurate multiplexing) or (2) the libraries will be used for target enrichment, as these workflows can require as much as a microgram of high-quality library.
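Library yield and size distribution matter together because sequencer loading is usually specified in molar terms, while many quantification results are reported as mass concentrations. The sketch below applies the commonly used approximation of 660 g/mol per base pair of double-stranded DNA to convert ng/µL and average fragment length into nM, and also computes total mass for comparison against an enrichment input requirement. The function names and example values are illustrative assumptions; follow your kit's and instrument's own guidance for actual loading.

```python
AVG_MASS_PER_BP = 660.0  # approximate g/mol per base pair of dsDNA


def library_molarity_nm(conc_ng_per_ul: float, avg_fragment_bp: float) -> float:
    """Convert a dsDNA library concentration from ng/uL to nM."""
    return conc_ng_per_ul * 1e6 / (AVG_MASS_PER_BP * avg_fragment_bp)


def total_yield_ng(conc_ng_per_ul: float, volume_ul: float) -> float:
    """Total library mass in ng for a given concentration and volume."""
    return conc_ng_per_ul * volume_ul


def volume_for_dilution_ul(stock_nm: float, target_nm: float, final_volume_ul: float) -> float:
    """Stock volume needed to reach target_nm in final_volume_ul (C1*V1 = C2*V2)."""
    if target_nm > stock_nm:
        raise ValueError("Target concentration exceeds stock concentration")
    return target_nm * final_volume_ul / stock_nm


if __name__ == "__main__":
    # Example values only: 10 ng/uL library, 400 bp average size, 25 uL volume
    molarity = library_molarity_nm(10.0, 400.0)
    print(f"Molarity: {molarity:.1f} nM")
    print(f"Total yield: {total_yield_ng(10.0, 25.0):.0f} ng")
    print(f"Stock for 20 uL at 4 nM: {volume_for_dilution_ul(molarity, 4.0, 20.0):.2f} uL")
```

Because the conversion depends on average fragment length, an inaccurate size estimate carries straight through to the loading concentration, which is one reason to measure the size distribution rather than assume it.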
Conclusion:
A well-planned proof-of-principle study can increase your confidence in new NGS workflows, and can make a huge difference in your ultimate NGS success. When in doubt, don’t hesitate to contact the vendors of the library prep kits, target enrichment probes, or sequencers; they can help with planning, technical questions, and troubleshooting to help you get the most out of every sample.