Every person carries a sequence of genetic data that is housed within the nucleus of their cells and encoded as deoxyribonucleic acid (DNA), which is folded into 23 pairs of chromosomes. Extranuclear DNA is also found in the mitochondria, the powerhouse of the cell. This genetic data codes for proteins, and the arrangement of the triplet codons determines which protein is produced. The human genome is made up of LINEs, SINEs, genes serving regulatory functions, introns, and sequences to which no function has yet been ascribed.
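As a rough illustration of how triplet codons determine the protein produced, the sketch below reads a DNA string three bases at a time and looks each codon up in a table. The function name `translate` and the example sequences are assumptions made for illustration, and the table is only a small subset of the standard genetic code.

```python
# Small, illustrative subset of the standard genetic code (DNA codons).
CODON_TABLE = {
    "ATG": "Met", "TTT": "Phe", "TTC": "Phe", "GGA": "Gly", "GGC": "Gly",
    "GCT": "Ala", "AAA": "Lys", "TGG": "Trp",
    "TAA": "Stop", "TAG": "Stop", "TGA": "Stop",
}


def translate(dna):
    """Translate a coding-strand DNA string codon by codon.

    Hypothetical helper: stops at the first stop codon and ignores
    any trailing bases that do not form a complete triplet.
    """
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3].upper()
        # Codons missing from the partial table are marked as unknown.
        amino_acid = CODON_TABLE.get(codon, "???")
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein


print(translate("ATGTTTGGATAA"))  # ['Met', 'Phe', 'Gly']
```

Changing a single base can change the codon and therefore the amino acid, which is why the order of the bases matters.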
DNA sequencing technologies
DNA sequencing encompasses all the processes involved in determining the order of the bases that make up the nucleic acids in the genome. In the past, sequencing an entire genome required splitting the DNA into smaller segments that could be read conveniently and then reassembled. Methods that rely on this technique are referred to as “short-read”.
DNA sequencing technologies are broadly divided into short-read technologies (such as the chain-termination, or Sanger, method) and the newer long-read technologies, sometimes called third-generation sequencing. Although the cost of short-read sequencing has fallen considerably since the time of the Human Genome Project, it has the major drawback of leaving gaps in the assembled maps. Long-read tools, in contrast, produce much longer DNA reads; examples include the Oxford Nanopore and PacBio HiFi sequencing methods. They are low-cost, microscale, highly parallel, and fast.
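The idea of gaps in an assembled map can be sketched with a toy model that only tracks which positions of a genome are covered by reads. The function `find_gaps` and the coordinates are hypothetical. Real gaps arise mostly in repetitive regions where short reads cannot be placed uniquely, which this simplified sketch does not model.

```python
def find_gaps(genome_length, reads):
    """Return the uncovered intervals of a genome as (start, end) pairs.

    Toy model: each read is a half-open interval (start, end) of
    positions it covers. This is an illustrative assumption, not how
    a real assembler represents reads.
    """
    gaps = []
    covered_up_to = 0
    for start, end in sorted(reads):
        # A read starting beyond the covered region leaves a gap.
        if start > covered_up_to:
            gaps.append((covered_up_to, start))
        covered_up_to = max(covered_up_to, end)
    # Anything after the last read is also uncovered.
    if covered_up_to < genome_length:
        gaps.append((covered_up_to, genome_length))
    return gaps


short_reads = [(0, 30), (25, 50), (70, 100)]
print(find_gaps(100, short_reads))               # [(50, 70)]
print(find_gaps(100, short_reads + [(40, 80)]))  # []
```

In this toy example, a single longer read spanning positions 40 to 80 closes the gap that the shorter reads leave behind.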
The Human Genome Project (HGP)
This was an international research project whose goal was to completely map human DNA from both a physical and a functional perspective. The idea was conceived by the US government in 1984; the project was launched in 1990 and completed in 2003. Although considered a success, the project was unable to sequence all human DNA: it sequenced only the euchromatic areas, which account for about 92% of the genome. The remaining 8%, made up of heterochromatic regions found around the centromeres and telomeres of chromosomes, was not sequenced. The project used the Sanger method to sequence the genome.
A major drawback of this project was the gaps it left: the genome released by the Genome Reference Consortium (GRC) still contained over 300 gaps in 2009 and 160 in 2015. Despite these gaps, the HGP greatly benefited the advancement of genetic medicine.
A complete gapless genome sequence
Roughly 20 years after the first draft of the human genome was published, scientists at the Telomere-to-Telomere (T2T) Consortium have issued, for the first time, a sequence of the human genome without gaps. The research builds on the HGP and on work done since then. The sequencing used sophisticated long-read technologies, and six papers on the gapless genome have been published in the journal Science. The work completed the remaining 8% of the DNA left unsequenced by the HGP, which includes numerous genes and repetitive DNA sequences. These genes include ones involved in immune response and drug response.
The T2T Consortium has used the complete genome as a reference to identify more than 2 million additional variants in the human genome. In the words of Evan Eichler, Ph.D., a T2T Consortium co-author, “the blueprint will revolutionize the way we view genomic dissimilarity, disease, and evolution”. The T2T Consortium is made up of scientists at the National Institutes of Health, the National Human Genome Research Institute (NHGRI), the University of Washington, and the University of California, with NHGRI as its major funder.
The genome project was the first step toward understanding the genetic basis of human variation and disease, and the completion of the genome is an outstanding scientific milestone that provides a comprehensive view of our genetic blueprint. This should create the platform needed for routine genetic screening in clinical care in the foreseeable future.