APR 03, 2022 8:08 AM PDT

The Human Genome Sequence is Finally, Truly Complete

WRITTEN BY: Carmen Leitch

The Human Genome Project was declared complete in 2003. But it wasn't exactly finished. Most of the sequence, about 92 percent, had been deciphered, particularly the sections that contain protein-coding genes. But the genome also holds long stretches of repetitive sequences that are very difficult to unravel with traditional DNA sequencing techniques, and even with more advanced ones. Now the gaps in the sequence have finally been filled in. The work has been reported in Science.

Those repetitive sequences were once dismissed as "junk DNA," but researchers keep finding sections of that junk that have important biological functions. Since they do not code for protein, studying them can be extremely challenging. But not only are they thought to be connected to some diseases, they may be essential to certain biological functions, making them important to understand.

The group that set out to map the elusive portions of the genome is the Telomere-to-Telomere (T2T) Consortium, named for telomeres, the caps that sit on the ends of chromosomes and protect them. Like the dense middles of chromosomes, called centromeres, telomeres are full of repetitive sequences that are hard to sequence. Centromeres are also a critical part of DNA replication and cell division.

In the early days of sequencing, specific sections of the genome could be amplified; a selected sequence was targeted with short pieces of DNA called primers, which match short sections on the ends of that sequence. Once an enzyme has amplified the sequence into many copies, each base can be tagged with a fluorescent molecule, and a machine reads the order of fluorescent colors as bases of DNA.

More advanced sequencing methods took a different approach. In next-gen sequencing, portions of the genome are chopped into tiny parts that are then sequenced and finally assembled like puzzle pieces to create a long sequence. Repetition in the genome is difficult for both methods to deal with, so third-generation sequencing techniques were engineered. In third-generation approaches such as nanopore sequencing, much longer reads are possible. A single molecule of DNA is passed through a nanopore, and each base is read electronically.
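
To see why repeats trip up short reads, consider a minimal Python sketch. It is an illustration only, not the consortium's method: the toy sequences, the `reads` helper, and the read lengths are all made up for the example, and real reads contain errors and come from vastly larger genomes. Two toy genomes differ only in how many copies of a four-base repeat they carry. The short reads from each are identical as a set, so the puzzle pieces alone cannot reveal the copy number, while reads long enough to span the repeat and its flanks do differ.

```python
def reads(genome, read_len):
    """Return every substring of length read_len (idealized, error-free reads)."""
    return {genome[i:i + read_len] for i in range(len(genome) - read_len + 1)}


# Hypothetical flanking sequences and repeat unit, chosen only for illustration.
LEFT = "TTGCAGGCTA"
RIGHT = "GGATCCTTAG"
UNIT = "ACGT"

genome_a = LEFT + UNIT * 3 + RIGHT  # 3 copies of the repeat
genome_b = LEFT + UNIT * 5 + RIGHT  # 5 copies of the repeat

SHORT_LEN = 8   # shorter than the repeat region
LONG_LEN = 30   # long enough to span the repeat plus some flank on each side

# Short reads: both genomes yield exactly the same set of pieces.
print(reads(genome_a, SHORT_LEN) == reads(genome_b, SHORT_LEN))  # True

# Long reads: the pieces differ, so the copy number can be recovered.
print(reads(genome_a, LONG_LEN) == reads(genome_b, LONG_LEN))    # False
```

Real centromeric repeats run far longer than this toy example, which is why reads spanning many thousands of bases matter for those regions.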

Merfin is a tool that researchers created for this work. It automatically detects and corrects mistakes made in the sequencing process.

Image credit: Modified from Pixabay

"Stretches of identical base pairs, such as AAA," can be difficult for current technologies to read, explained postdoctoral researcher Giulio Formenti, PhD, who developed Merfin. "There are often errors in those sequences, even now. Merfin corrects them."

The researchers are hoping that the techniques used to finish the human genome sequence, which were presented in a Nature Methods paper, will help scientists understand diseases that are associated with structural repeats in the centromere. "We are finally digging into what we once called junk DNA, because we could not understand it or look at it accurately," Formenti said. "Now that these sequences are no longer missing from the human reference genome, we can begin to map the origins of these diseases."

Cancer has been linked to centromere defects, for example. When some heterochromatic centromere genes are overactive, cancer cells divide wildly. Now that we have the sequence of the complete human genome, scientists can learn more about these mysterious regions.

Sources: Rockefeller University, Nature Methods, Science

About the Author
BS
Experienced research scientist and technical expert with authorships on over 30 peer-reviewed publications, traveler to over 70 countries, published photographer and internationally-exhibited painter, volunteer trained in disaster-response, CPR and DV counseling.