MAY 19, 2020 7:55 PM PDT

A Faster Way to Identify New Cell Types

WRITTEN BY: Carmen Leitch

Many different kinds of cells are needed to carry out the functions of complex organisms. Different cell types express different genes; the genes that are active in a muscle cell in the heart, for example, are different from the genes that are expressed in a neuron in the brain. The identity of a cell can thus be ascertained by analyzing which genes are active at the single-cell level. Identifying the types of cells present in a tissue, and how they behave, is important for better understanding health and disease. Methods for classifying cells work well for cell types that are already well characterized, but are less effective at finding new kinds of cells.

A human brain tissue biopsy specimen with different cell types / Credit: CDC / R.D. Kimbrough

Reporting in Nature Methods, researchers have now created a tool called the Single Cell Clustering Assessment Framework (SCCAF) to overcome this problem. Cells with similar gene expression patterns can be clustered together as one type. SCCAF is a computational technique that automates the time-consuming manual work typically used to characterize cell types and identify new ones.

The method first groups cells into clusters, and each cluster is then divided into a training set and a testing set. A model learns from the training set how to classify cells by cluster, then predicts which cluster each cell in the testing set belongs to.

"The model repeats the training and testing steps for each cell cluster, gradually merging indistinguishable clusters, until its accuracy reaches a good enough level. Finally, our Single Cell Clustering Assessment Framework lists a set of feature genes to characterize each annotated cluster," explained the first author of the study Dr. Zhichao Miao of the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI) and the Wellcome Sanger Institute.
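The train, test, and merge loop described above can be illustrated with a minimal sketch. This is not the SCCAF implementation: the nearest-centroid classifier, the 0.95 accuracy target, the 50/50 split, the toy two-gene data, and every function name here are assumptions chosen to keep the example self-contained. Feature-gene reporting is left out.

```python
import random
from collections import defaultdict

def centroid(rows):
    """Mean expression vector of a list of cells."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def nearest_label(centroids, cell):
    """Label of the centroid closest to the cell (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(centroids[lab], cell)),
    )

def assess(cells, labels, rng, test_fraction=0.5):
    """Split every cluster into train/test, fit on train, score on test.

    Returns the test accuracy and a count of confusions per cluster pair.
    The classifier is a stand-in (nearest centroid), not the one SCCAF uses.
    """
    by_label = defaultdict(list)
    for cell, lab in zip(cells, labels):
        by_label[lab].append(cell)
    centroids, test = {}, []
    for lab, rows in by_label.items():
        rows = rows[:]
        rng.shuffle(rows)
        cut = max(1, int(len(rows) * (1 - test_fraction)))
        centroids[lab] = centroid(rows[:cut])          # "training" step
        test.extend((cell, lab) for cell in rows[cut:])
    confusion, correct = defaultdict(int), 0
    for cell, lab in test:
        pred = nearest_label(centroids, cell)
        if pred == lab:
            correct += 1
        else:
            confusion[frozenset((lab, pred))] += 1
    accuracy = correct / len(test) if test else 1.0
    return accuracy, confusion

def merge_until_accurate(cells, labels, target=0.95, seed=0):
    """Repeat train/test, merging the most-confused pair of clusters each
    round, until test accuracy reaches the (assumed) target."""
    rng = random.Random(seed)
    labels, accuracy = list(labels), 1.0
    while len(set(labels)) > 1:
        accuracy, confusion = assess(cells, labels, rng)
        if accuracy >= target or not confusion:
            break
        keep, drop = sorted(max(confusion, key=confusion.get))
        labels = [keep if lab == drop else lab for lab in labels]
    return labels, accuracy

def make_toy_cells(seed=1, n_a=40, n_b=20):
    """Hypothetical data: one cell type over-split into A1/A2, plus type B."""
    rng = random.Random(seed)
    cells, labels = [], []
    for i in range(n_a):
        cells.append([rng.gauss(0, 1), rng.gauss(0, 1)])
        labels.append("A1" if i % 2 == 0 else "A2")
    for _ in range(n_b):
        cells.append([rng.gauss(10, 1), rng.gauss(10, 1)])
        labels.append("B")
    return cells, labels

cells, labels = make_toy_cells()
merged, accuracy = merge_until_accurate(cells, labels)
print(sorted(set(merged)), round(accuracy, 2))
```

In this toy setup, A1 and A2 are drawn from the same distribution, so the classifier cannot tell them apart and the loop is expected to merge them while leaving B as its own cluster. Merging only the single most-confused pair per round is one simple design choice; other merging rules are possible.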

The researchers validated the method on existing datasets and report that it runs quickly.

"We've tested the method on many existing large-scale datasets of human and mouse gene expression, treating human annotation as a gold standard. Our method can reproduce human annotation in an automated manner. By minimizing human involvement in data processing, we solve the most important bottleneck in high-throughput projects, such as the Human Cell Atlas," said the senior study author Dr. Alvis Brazma, a Functional Genomics Senior Team Leader at EMBL-EBI.

"The Human Cell Atlas initiative is a global consortium to map every cell type in the human body, to understand health and disease. The new automated cell-clustering method will enable us to identify cell types much more easily than before, helping us expand our understanding of cellular function and diversity," added Dr. Sarah Teichmann, a senior author from the Wellcome Sanger Institute, and co-chair of the Human Cell Atlas Organizing Committee.

Sources: Phys.org via Wellcome Trust Sanger Institute, Nature Methods

About the Author
  • Experienced research scientist and technical expert with authorships on 28 peer-reviewed publications, traveler to over 60 countries, published photographer and internationally-exhibited painter, volunteer trained in disaster-response, CPR and DV counseling.