Way back in 2003, you may have heard that the international effort to map the human genome was complete. Turns out, that headline wasn’t 100% accurate.
The idea for the Human Genome Project (HGP) caught the attention of the US government in 1984. It was subsequently funded through the National Institutes of Health (NIH) and the Department of Energy, and by other nations and organizations around the world. After several years of initial planning, the project officially began in 1990 with a budget of $3 billion and a make-or-break 15-year schedule. Much progress was made, and a “rough draft” of the human genome was announced in 2000. Although HGP was a huge and complex project, for once international cooperation actually involved genuine cooperation, and the project was declared complete in 2003, two years earlier than planned.
Except, the human genome mapped by HGP was only about 95% complete. The DNA base pair sequencing was incomplete, with around 150,000 “gaps” within the genome map. The gaps are exceedingly long repeating series of base pairs that were too difficult for the technology of the late 1990s and early 2000s to determine without errors. When the project was announced as complete, what that meant was that it was as complete as the technology of the day could make it.
[Sidebar: For reference, a DNA base pair is two nucleobases (biochemical compounds) bound together, either guanine-cytosine (GC, CG) or adenine-thymine (AT, TA). These pairs make up the famous DNA double helix.]
Further, the human genome mapped by HGP and “completed” in 2003 is just one reference genome. It is a mosaic of DNA samples obtained from a small number of donors, who did not represent all ancestral groups. Representing all ancestral groups will require at least 350 reference genomes, possibly more. Basically, the one reference genome “completed” in 2003 is woefully inadequate as a reference that’s supposed to reflect all humans. Interestingly, all humans share 99.9% of the same DNA. Only 0.1% of human DNA makes us different from one another, yet we can be categorized into at least 350 ancestral groups.
To address these issues, the Telomere-to-Telomere (T2T) consortium is closing the gaps in the genome map with advanced technology. Only around 100 gaps remain, down from 150,000 in 2003, but those 100 are the most difficult to map. The 350 (or more) reference genome sequences that will represent all ancestral groups (and all human genetic diversity) are being assembled by the Human Genome Reference Program (HGRP) under the auspices of the NIH and the National Human Genome Research Institute.
When completed, the database of 350 reference genomes will be known as the human pangenome, meaning the complete genome reference for all humans. This video is designed specifically to educate the average citizen on the project: “The Human Pangenome” (5:31):
Question of the Night: Have you had your DNA tested? If so, was it a good or bad experience? Would you recommend it? If you haven’t had a DNA test, do you plan on getting it done, or what concerns you about it?