In a major milestone, scientists at the National Human Genome Research Institute (NHGRI) have completed the first sequence of a human chromosome that reaches from one end to the other without any gaps. Although the human genome was considered sequenced around 2003, some portions, including highly repetitive regions, could not be determined precisely, and that left gaps. This work shows that it is possible to produce a highly accurate, fully complete sequence of a human chromosome, and by extension of the human genome. The study has been reported in Nature.
"This accomplishment begins a new era in genomics research," said Eric Green, M.D., Ph.D., NHGRI director. "The ability to generate truly complete sequences of chromosomes and genomes is a technical feat that will help us gain a comprehensive understanding of genome function and inform the use of genomic information in medical care."
The human genome is around 6 billion nucleotide bases long, counting both copies of each chromosome, and each base is represented by a letter: A, G, C, or T. Instead of reading one base at a time, so-called next-generation sequencing techniques chop the genome into tiny sections that are sequenced at the same time. Powerful computational tools then piece these parts together in the right order. This works for many areas of the genome, but trouble arises in long stretches of highly repetitive sequence.
"Imagine having to reconstruct a jigsaw puzzle. If you are working with smaller pieces, each contains less context for figuring out where it came from, especially in parts of the puzzle without any unique clues, like a blue sky," explained the senior study author Adam Phillippy, Ph.D., of NHGRI. "The same is true for sequencing the human genome. Until now, the pieces were too small, and there was no way to put the hardest parts of the genome puzzle together."
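The puzzle-piece problem can be illustrated with a small Python sketch. It is a toy example, not the consortium's method: the function name `ambiguous_read_fraction` and the 64-base "genome" are made up for illustration. The sketch slides a window of a given read length along the sequence and counts how many of the resulting reads appear more than once, meaning they could have come from several places.

```python
from collections import Counter


def ambiguous_read_fraction(genome: str, read_len: int) -> float:
    """Return the fraction of reads that match more than one position.

    Hypothetical helper for illustration only: every possible read of
    length read_len is taken from the genome, and a read is "ambiguous"
    if the same string also occurs at another position.
    """
    reads = [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]
    counts = Counter(reads)
    ambiguous = sum(1 for read in reads if counts[read] > 1)
    return ambiguous / len(reads)


# Made-up toy genome: two unique 12-base flanks around a 40-base repeat.
toy_genome = "GATTACAGGCTC" + "AT" * 20 + "CCGTAGTTCAGA"

# Short reads (8 bases) often fall entirely inside the repeat.
short_fraction = ambiguous_read_fraction(toy_genome, 8)

# Long reads (50 bases) always span the repeat plus some unique flank.
long_fraction = ambiguous_read_fraction(toy_genome, 50)

print(f"ambiguous short reads: {short_fraction:.2f}")
print(f"ambiguous long reads:  {long_fraction:.2f}")
```

In this toy case more than half of the short reads cannot be placed uniquely, while every long read contains enough unique sequence to be placed. That is the "blue sky" problem in miniature.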
While the repetitive parts of the human genome often do not code for protein, they do contain sequences that help regulate gene expression and may serve other functions that are not yet known. Obtaining the entire sequence of the genome is necessary to understand the role of all of its parts and how they are relevant to health and disease.
In this work, the researchers focused on the X chromosome because it is connected to many different diseases, such as Duchenne muscular dystrophy and hemophilia. They did not use a sample from a person but instead used a special kind of cell that carries two identical X chromosomes. This avoids the trouble that can arise when two copies of a chromosome are present, as in a typical human sample, and each differs slightly from the other. It also avoids the limitation of a male cell, which carries only one copy of the chromosome. The team also chose to keep the DNA mostly intact instead of chopping it to bits for sequencing, using an advanced machine that can read long stretches of DNA along with cutting-edge computational tools. These efforts closed the remaining gaps and created a complete sequence of the X chromosome that includes about 3 million bases of repetitive sequence.
"We have never actually seen these sequences before in our genome, and do not have many tools to test if the predictions we are making are correct. This is why it is important to have specialists in the genomics community weigh in and ensure the final product is high-quality," said first study author Karen Miga, Ph.D., of the University of California, Santa Cruz.
This project is part of the work of the Telomere-to-Telomere (T2T) consortium, which aims to complete the sequencing of the human genome in its entirety.
"We don't yet know what we'll find in the newly uncovered sequences. It is the exciting unknown of discovery. This is the era of complete genome sequences, and we are embracing it wholeheartedly," Phillippy said.
Scientists have their work cut out for them; chromosomes 1 and 9 contain larger repetitive sequences than the ones on the X chromosome.
"We know these previously uncharted sites in our genome are very different among individuals, but it is important to start figuring out how these differences contribute to human biology and disease," Miga said.