The Y chromosome was the last small piece missing from the most complete, gapless sequence of the human genome ever generated, which scientists revealed last year.
After thirty years of research, the smallest member of the human chromosome family has finally been fully sequenced.
The result is a complete reference genome for humans, one that may now yield insights into male fertility.
“Now that we have this 100 percent complete sequence of the Y chromosome, we can identify and explore numerous genetic variations that could be impacting human traits and disease in a way that we weren’t able to do before,” says Dylan Taylor, a geneticist at Johns Hopkins University and one of the study’s authors.
The Y chromosome had previously been essentially “unreadable” because of its vast number of repeated sequences, some of which are long palindromes.
The Telomere-to-Telomere group, led by genomicist Arang Rhie of the US National Human Genome Research Institute, stitched together long stretches of DNA using cutting-edge sequencing methods and recently developed bioinformatic algorithms, finally mapping the entire Y chromosome.
Rajiv McCoy, a computational biologist at Johns Hopkins University, says, “We knew we had an incomplete picture up until now,” which is something of an understatement given that the previous version of the Y chromosome lacked more than half of its bases.
These gaps, many of which covered genes involved in sperm production, led subsequent studies to draw all kinds of false conclusions. For instance, some previously unidentified human Y sequences were wrongly taken as evidence of bacterial DNA contaminating samples.
But for the first time, McCoy notes, “we can now see the entire genome from beginning to end.”
The scientists assembled the Y chromosome in its entirety, consisting of all 62,460,029 base pairs, by filling in more than 30 million “letters” in the DNA sequence. They also found 41 novel protein-coding genes and fixed numerous mistakes in previously sequenced portions.
According to Adam Phillippy, a computer scientist at the US National Human Genome Research Institute, “the biggest surprise was how organized the repeats are.”
“Satellite DNA, which is composed of alternating blocks of two distinct repeating sequences, makes up over half of the chromosome. It creates a lovely design that resembles a quilt.”
In a follow-up study directed by geneticist Pille Hallast of the Jackson Laboratory, the Y chromosomes of 43 male participants, about half of whom were of African ancestry, were assembled using the reference sequence.
The assemblies, which together spanned 183,000 years of human evolution, turned up some unexpected findings.
One was that the Y chromosomes varied greatly in size, from 45.2 million to 84.9 million base pairs.
There were also notable structural variations: while the precise gene sequences were preserved (and so continued to encode the correct proteins), larger DNA segments were occasionally inverted, oriented in the opposite direction along the Y chromosome.
The hope is always that previously unseen variation in the genome will prove significant for understanding human health, according to Phillippy.
Y chromosome genes have recently been linked to aggressive types of common cancers in men, and Y chromosome deletion has been linked to the development of bladder cancer. However, it remains unclear what else may have been missed.
If sequencing technology develops further and makes it feasible to sequence entire genomes rather than just particular regions, a new era of personalized medicine could be on the horizon.
But if historical injustices and the lack of diversity in research studies aren’t addressed, genome sequencing could make healthcare disparities worse.
In the end, the researchers predict that “reference genomes” will simply be referred to as “genomes” once the complete, accurate, and gapless assembly of diploid human genomes becomes routine.
The studies have been published in Nature.