Single cell gene expression: new insights through the lens of full-length mRNA isoform resolution



Single cell RNA sequencing (scRNA-seq) emerged to characterize gene expression differences between individual cells, offering a higher-resolution view of mRNA abundance than bulk RNA-seq. However, most scRNA-seq methodologies are coupled to short read sequencing platforms that report either sparse information spread across the entire mRNA or 3’ end tags that serve to quantify gene counts. Neither reveals contiguous mature mRNA sequences and the associated isoform-specific open reading frames, which often encode proteins with different functional properties. To address this, we and other groups have modified the workflows of existing single cell transcriptomic technologies to generate long read RNA-seq libraries instead. These libraries are then profiled using PacBio’s SMRT sequencing technology on the Sequel II instrument. The steps that follow the single cell technology are similar to PacBio’s RNA-seq offering, Iso-Seq. Importantly, the single molecule HiFi reads are highly accurate, so the required single cell information (barcode and unique molecular index) can be directly assessed in the sequencing reads. In this Labroots session, I will discuss publications and results from this approach using three different single cell technologies, with a special focus on quality control metrics that we have applied to the single cell long reads using orthogonal data types. I will show that existing single cell protocols from these providers can be subtly modified to generate matched short read and long read sequencing libraries harboring the same single cell information. I will also touch on the bioinformatic methods we are using to process and analyze the single cell data. Finally, visualizing examples from this single cell data type will highlight new insights uncovered when single cells are profiled for their isoform expression.

Learning Objectives:

1. How to modify laboratory workflows from existing single cell transcriptomic platforms to obtain full-length isoform information

2. Bioinformatic methods and how these can be applied as a quality control step for long read isoform sequencing from single cells

3. How isoform-level information can uncover biological phenomena that remain invisible with short reads alone