The Y-linked gene content of
D. willistoni shares several important features with the two other species with well-known Y chromosomes,
D. melanogaster and
D. virilis. First, in all three species nearly all Y-linked genes originated by duplication of genes already strongly or exclusively expressed in testis. The only known exception is
FDY, a recent Y-linked gene from
D. melanogaster whose ancestral gene (
vig2) is expressed in many tissues and organs (including testis) and is strongly expressed in ovaries
[34]. Second, the gene duplications to the
D. willistoni Y were mediated by a DNA-based mechanism in all seven cases that we could ascertain. This probably holds true for other
Drosophila species, although they have not yet been systematically examined in this respect. If confirmed, this would be an interesting difference between genes acquired by the Y and those acquired by the other chromosomes, which involve an RNA intermediate in 25% of the cases
[28]. A possible explanation for the absence of RNA-based duplications to the
Drosophila Y is that this chromosome (like other heterochromatic regions) is a harsh environment for arriving genes: it has long been known that euchromatic genes that move to heterochromatic regions are silenced
[35], and it is likely that a gene duplication carrying flanking euchromatic sequence has a higher chance of survival than a naked, promoter-less retrocopy. The third commonality among the three species is the preponderance of gene gains over gene losses, as suggested by inspection of
Figure 1 and confirmed by the statistical analysis. As detailed in the
Supplementary Material, using the data of the three species we found a gain–loss ratio of 25 (
p = 0.002; 95% confidence interval: 3.4–184.5), and the same qualitative result is obtained when removing the four genes whose original autosomal copies are functional (
Figure S3 and
Table S8). These four genes (
GK20591,
YOgnWI018045, and the multicopy
GK18510 and
GK20618/
GK20619) could arguably become pseudogenes on the Y; if we remove them, we obtain a gain–loss ratio of 21 (
p = 0.003). It will be interesting to examine other species, particularly outgroups, to determine how general this excess of gains is. However, it is already clear that
Drosophila Y chromosomes are not evolving according to the canonical theory of Y chromosome evolution (see
Appendix A for an alternative view).
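
As a rough consistency check on the reported statistics, the sketch below asks whether the ratio of 25, the 95% confidence interval of 3.4–184.5, and p = 0.002 fit together under a simple model. It assumes a symmetric Wald-type interval on the log scale and a null hypothesis of a gain–loss ratio of 1. Both are assumptions made here for illustration; the actual method is the one described in the Supplementary Material and may differ.

```python
import math

# Values reported in the text.
ratio = 25.0
ci_low, ci_high = 3.4, 184.5

# Assumption: the 95% CI is symmetric on the log scale (Wald-type interval).
z_crit = 1.959964  # 97.5th percentile of the standard normal

# If the interval is symmetric in log space, the point estimate should be
# close to the geometric mean of the two bounds.
geo_mid = math.sqrt(ci_low * ci_high)

# Implied standard error of log(ratio) under the same assumption.
se_log = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)

# Wald z statistic against the assumed null of ratio = 1 (log ratio = 0),
# and the corresponding two-sided normal p-value.
z = math.log(ratio) / se_log
p_two_sided = math.erfc(z / math.sqrt(2))

print(f"geometric midpoint of CI: {geo_mid:.2f}")
print(f"implied SE of log ratio:  {se_log:.3f}")
print(f"z = {z:.2f}, two-sided p = {p_two_sided:.4f}")
```

Under these assumptions the geometric midpoint falls very close to 25 and the implied p-value rounds to 0.002, so the three reported numbers are mutually consistent. The very wide interval reflects the small number of loss events on which the ratio is based.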
On the other hand, the
D. willistoni Y chromosome has some features not seen in other
Drosophila species. First, a recent burst of gene acquisitions appears to have occurred after the split between
D. willistoni and its cryptic species
D. paulistorum/
D. equinoxialis, which created a set of ~10 private Y-linked genes in
D. willistoni. This burst of gene gains is even more remarkable given the short time elapsed since that split (4.8 Mya; Ref.
[38]) and the lack of recent bursts in the well-studied
D. melanogaster and
D. virilis (the former has one or two private Y-linked genes
[15][39];
D. virilis has none). Second, it seems that the gene gain rate is higher in
D. willistoni. The previous estimates of the gain–loss ratio were 10.7 and 4.9 (using
D. melanogaster and
D. virilis as the focal species, respectively
[13][14]), and when we included
D. willistoni we obtained 25. Furthermore, the heterogeneity in the gain–loss ratio among branches becomes statistically significant, suggesting that
D. willistoni is an outlier. Third, we found four large segmental duplications that copied ~700 kb of autosomal sequence to the
D. willistoni Y chromosome (
Table 1); the previously known cases span only a few kb
[15][39]. The first and second peculiarities of the
D. willistoni Y may be a consequence of the third: a higher rate of segmental duplications to the Y is expected to increase the gene gain rate and, if the duplications are recent, may generate a fairly large number of recent gene acquisitions by the Y. The literature offers mixed support for this hypothesis. If it is correct and segmental duplications also occur in other chromosomes, one might expect to find increased gene movement in general, but
D. willistoni does not seem to be an outlier in this respect
[23]. On the other hand, Vibranovski et al.
[28] reanalyzed the dataset of
[23], partitioning it into A→A, A→X, and X→A movements, and noted that
D. willistoni is an outlier to the general pattern of excess of gene movements out of the X: it has many A→A and A→X movements, which weakly supports the idea that gene gains are more frequent in the
D. willistoni lineage. At this point, perhaps the most solid conclusion we can draw is the one already outlined in the 12
Drosophila Genome Project paper: “
D. willistoni is an exceptional outlier by several criteria, including its unusually skewed codon usage, increased transposable element content and potential lack of seleno-proteins”
[40][41]. The list goes on: “Some clades, like the
willistoni group, seem to undergo many more [intron] losses per million years than others”
[42]. It will be interesting to investigate directly the occurrence of segmental duplications in the X and autosomes, to verify whether they are indeed more frequent in the
D. willistoni lineage and whether these duplications created new genes. In particular, one can look at the unexpected gene movements reported by Vibranovski et al.
[28] and check whether several genes came from adjacent chromosomal regions, which is a telltale sign of a segmental duplication. Finally, careful studies of more
Drosophila Y chromosomes might help us better understand the relationship between segmental duplications and gene traffic to the Y chromosome.
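
The adjacency check proposed above can be sketched as follows. This is a minimal illustration, not the procedure used in any of the cited studies: the `Movement` record, the gene names, the coordinates, and the 50 kb gap threshold are all hypothetical. The function groups relocated genes by source and destination chromosome and reports runs of genes whose parental copies lie close together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Movement:
    gene: str           # hypothetical gene identifier
    source_chrom: str   # chromosome of the parental copy
    source_start: int   # start coordinate (bp) of the parental copy
    dest_chrom: str     # chromosome that received the new copy


def adjacent_source_clusters(movements, max_gap=50_000, min_genes=2):
    """Return lists of genes whose parental copies are neighbours.

    Genes are chained together when consecutive parental copies on the
    same source chromosome (and with the same destination) start within
    max_gap bp of each other. max_gap is an arbitrary illustrative value.
    """
    groups = {}
    for m in movements:
        groups.setdefault((m.source_chrom, m.dest_chrom), []).append(m)

    clusters = []
    for members in groups.values():
        members.sort(key=lambda m: m.source_start)
        current = [members[0]]
        for m in members[1:]:
            if m.source_start - current[-1].source_start <= max_gap:
                current.append(m)
            else:
                if len(current) >= min_genes:
                    clusters.append([x.gene for x in current])
                current = [m]
        if len(current) >= min_genes:
            clusters.append([x.gene for x in current])
    return clusters


# Made-up example: three neighbouring genes moved together, two moved alone.
example = [
    Movement("g1", "2", 100_000, "X"),
    Movement("g2", "2", 120_000, "X"),
    Movement("g3", "2", 135_000, "X"),
    Movement("g4", "2", 900_000, "X"),
    Movement("g5", "3", 110_000, "X"),
]
clusters = adjacent_source_clusters(example)
print(clusters)
```

A cluster of several genes with neighbouring parental copies would be a candidate segmental duplication. Confirming it would still require checking that the copies are also contiguous at the destination and that they include the intervening non-coding sequence.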